
To help address concerns around commercial sensitivity, movenet includes a range of functions to make livestock movement data and/or holding data non-identifiable. This vignette shows you how to use these.

library(movenet)

# Load a combined movement and holding config file:
load_config(system.file("configurations", "fakeScotEID_combined_config.yml", package="movenet"))
#> Successfully loaded config file: C:/Users/cboga/OneDrive - University of Glasgow/Documents/R/win-library/4.1/movenet/configurations/fakeScotEID_combined_config.yml

# Load example movenet-format movement and holding tibbles into the global environment:
data(example_movement_data, package = "movenet")
data(example_holding_data, package = "movenet")

Pseudonymising holding identifiers

The function anonymise() pseudonymises holding identifiers in movenet-format movement or holding data tibbles, by replacing these identifiers with a number and an optional prefix (e.g. “FARM”).

It returns a pseudonymised data tibble and the applied pseudonymisation key. This key can optionally be saved to recover the original identifiers at a later date, or for application to an overlapping dataset.

# Pseudonymise movement_data by changing identifiers to FARM1-N:
pseudonymised <- anonymise(example_movement_data, prefix = "FARM")
pseudonymised_movement_data <- pseudonymised$data
pseudonymisation_key <- pseudonymised$key

head(pseudonymised_movement_data) # Inspect pseudonymised movement data
#> # A tibble: 6 x 5
#>   departure_cph dest_cph departure_date qty_pigs movement_reference
#>   <chr>         <chr>    <date>            <dbl>              <dbl>
#> 1 FARM152       FARM216  2019-02-08           97             304781
#> 2 FARM186       FARM466  2019-08-15          167             229759
#> 3 FARM435       FARM70   2019-09-15          115              36413
#> 4 FARM438       FARM337  2019-10-26          125             488616
#> 5 FARM292       FARM164  2019-10-17          109             581785
#> 6 FARM46        FARM373  2019-10-06           72             564911
head(pseudonymisation_key) # Inspect pseudonymisation key
#> 96/999/4677 79/642/5562 86/867/7476 75/345/2020 76/613/8076 67/158/5432 
#>     "FARM1"     "FARM2"     "FARM3"     "FARM4"     "FARM5"     "FARM6"
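Going by the printed output above, the key appears to be a named character vector, with the original identifiers as names and the pseudonyms as values. Under that assumption, the original identifiers can be recovered in base R by inverting the key. The toy key below is hand-written for illustration, and the names `toy_key`, `reverse_key` and `recovered` are hypothetical; this is a sketch, not a movenet function.

```r
# Toy key in the same shape as the printed key (assumed structure):
# names = original identifiers, values = pseudonyms.
toy_key <- c("96/999/4677" = "FARM1",
             "79/642/5562" = "FARM2",
             "86/867/7476" = "FARM3")

# Invert the key so that the pseudonyms become the names:
reverse_key <- setNames(names(toy_key), toy_key)

# Look up the original identifiers for a vector of pseudonyms:
pseudonyms <- c("FARM3", "FARM1", "FARM3")
recovered <- unname(reverse_key[pseudonyms])
recovered
```

Because the key is what links pseudonyms back to real holdings, it should be stored separately from any pseudonymised data that is shared.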

anonymise() also takes an optional key argument, with which you can apply an existing pseudonymisation key to the data tibble:

# Use the same key from above to substitute holding identifiers in holding_data:
pseudonymised_holding <- anonymise(example_holding_data, key = pseudonymisation_key) 
pseudonymised_holding_data <- pseudonymised_holding$data
# Update saved key, in case additional identifiers were added from the holding datafile:
pseudonymisation_key <- pseudonymised_holding$key 

head(pseudonymised_holding_data) # Inspect pseudonymised holding data
#> # A tibble: 6 x 4
#>   cph     holding_type herd_size           coordinates
#>   <chr>   <chr>            <dbl>           <POINT [°]>
#> 1 FARM202 GXFSR             2111   (3.718568 52.69096)
#> 2 FARM213 SCHZQ             2134  (-4.959035 51.88195)
#> 3 FARM460 HEJDE             2140   (5.709143 51.97547)
#> 4 FARM70  IQALL             2141  (-2.983365 57.55851)
#> 5 FARM238 YUFUC             2148   (5.586477 50.50902)
#> 6 FARM330 IATKP             2151 (-0.3323066 58.84295)

This allows multiple datasets to be pseudonymised in a consistent way, so that it is possible to subsequently merge the datasets by pseudonymised identifier.
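As a minimal sketch of such a merge, the example below joins two hand-written toy tables on their pseudonymised identifiers using base R's `merge()`. The column names mirror the example data shown above, but the values and the object names (`toy_movements`, `toy_holdings`, `merged`) are made up for illustration.

```r
# Toy pseudonymised tables (made-up values) that share one key:
toy_movements <- data.frame(
  departure_cph = c("FARM1", "FARM2", "FARM1"),
  dest_cph      = c("FARM2", "FARM3", "FARM3"),
  qty_pigs      = c(97, 167, 115),
  stringsAsFactors = FALSE
)
toy_holdings <- data.frame(
  cph       = c("FARM1", "FARM2", "FARM3"),
  herd_size = c(2111, 2134, 2140),
  stringsAsFactors = FALSE
)

# Attach the herd size of the departure holding to each movement.
# all.x = TRUE keeps movements even if a holding record is missing.
merged <- merge(toy_movements, toy_holdings,
                by.x = "departure_cph", by.y = "cph", all.x = TRUE)
merged
```

The join only lines up because both tables were pseudonymised with the same key; with independently generated keys, "FARM1" would refer to different holdings in each table.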

Modifying dates, weights, and optional numeric data columns

movenet also has functions to modify movement dates or weights by applying a small amount of noise (jittering) or by rounding:

  • jitter_dates(data, range) adds random noise of up to range days to movement dates.

  • jitter_weights(data, range, column) adds random noise of up to range to a numeric column in the movement data, by default the “weight” column.

  • round_dates(data, unit, week_start, sum_weight, ...) rounds movement dates down to the first day of the specified time unit. For rounding down to weeks, set the starting day of the week with week_start. By default, weights are aggregated for all movements between the same holdings over the indicated time unit (sum_weight = TRUE); to keep movements separate, set sum_weight = FALSE. Alternative or additional summary functions can be applied through ..., using tidy evaluation rules.

  • round_weights(data, unit, column) rounds data in a numeric column, by default the “weight” column, to multiples of unit.
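To make the four operations concrete, here is a base R sketch of what jittering and rounding amount to on plain vectors. It is an illustration only and not movenet's implementation: the noise distributions (whole-day shifts drawn with `sample()`, uniform noise drawn with `runif()`) are assumptions, and the package functions may differ in detail.

```r
set.seed(1)  # only to make the sketch reproducible

dates   <- as.Date(c("2019-02-08", "2019-08-15", "2019-10-06"))
weights <- c(97, 167, 72)

# Jitter dates: shift each date by a whole number of days in [-5, 5].
jittered_dates <- dates + sample(-5:5, length(dates), replace = TRUE)

# Jitter weights: add uniform noise in [-10, 10].
jittered_weights <- weights + runif(length(weights), min = -10, max = 10)

# Round dates down to the first day of their month.
month_start <- as.Date(format(dates, "%Y-%m-01"))

# Round weights to the nearest multiple of `unit`.
unit <- 10
rounded_weights <- round(weights / unit) * unit
```

Jittering keeps each record distinct but blurs its exact value, whereas rounding maps many values onto the same one, which is why rounded dates can also be used to aggregate movements.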

# Add jitter of up to ±5 days to movement dates:
movedata_datesj5 <- jitter_dates(example_movement_data, range = 5) 
head(example_movement_data) # Inspect original
#> # A tibble: 6 x 5
#>   departure_cph dest_cph    departure_date qty_pigs movement_reference
#>   <chr>         <chr>       <date>            <dbl>              <dbl>
#> 1 95/216/1100   19/818/9098 2019-02-08           97             304781
#> 2 69/196/5890   71/939/3228 2019-08-15          167             229759
#> 3 52/577/5349   82/501/8178 2019-09-15          115              36413
#> 4 39/103/5541   13/282/1763 2019-10-26          125             488616
#> 5 41/788/6464   57/418/6011 2019-10-17          109             581785
#> 6 69/393/9398   39/947/2201 2019-10-06           72             564911
head(movedata_datesj5) # Inspect jittered dates
#> # A tibble: 6 x 5
#>   departure_cph dest_cph    departure_date qty_pigs movement_reference
#>   <chr>         <chr>       <date>            <dbl>              <dbl>
#> 1 95/216/1100   19/818/9098 2019-02-06           97             304781
#> 2 69/196/5890   71/939/3228 2019-08-10          167             229759
#> 3 52/577/5349   82/501/8178 2019-09-14          115              36413
#> 4 39/103/5541   13/282/1763 2019-10-31          125             488616
#> 5 41/788/6464   57/418/6011 2019-10-19          109             581785
#> 6 69/393/9398   39/947/2201 2019-10-05           72             564911

# Add jitter of up to ±10 to movement weights:
movedata_weightsj10 <- jitter_weights(example_movement_data, range = 10)
head(movedata_weightsj10) # Inspect jittered weights
#> # A tibble: 6 x 5
#>   departure_cph dest_cph    departure_date qty_pigs movement_reference
#>   <chr>         <chr>       <date>            <dbl>              <dbl>
#> 1 95/216/1100   19/818/9098 2019-02-08        106.              304781
#> 2 69/196/5890   71/939/3228 2019-08-15        174.              229759
#> 3 52/577/5349   82/501/8178 2019-09-15        122.               36413
#> 4 39/103/5541   13/282/1763 2019-10-26        116.              488616
#> 5 41/788/6464   57/418/6011 2019-10-17        103.              581785
#> 6 69/393/9398   39/947/2201 2019-10-06         74.8             564911

# Round movement dates down to the first day of the month, but do not aggregate:
movedata_months <- round_dates(example_movement_data, unit = "month", sum_weight = FALSE) 
head(movedata_months) # Inspect rounded dates
#> # A tibble: 6 x 5
#>   departure_cph dest_cph    departure_date qty_pigs movement_reference
#>   <chr>         <chr>       <date>            <dbl>              <dbl>
#> 1 95/216/1100   19/818/9098 2019-02-01           97             304781
#> 2 69/196/5890   71/939/3228 2019-08-01          167             229759
#> 3 52/577/5349   82/501/8178 2019-09-01          115              36413
#> 4 39/103/5541   13/282/1763 2019-10-01          125             488616
#> 5 41/788/6464   57/418/6011 2019-10-01          109             581785
#> 6 69/393/9398   39/947/2201 2019-10-01           72             564911

# Round movement dates down to the first day of the month, aggregate weights, and list reference numbers:
movedata_months_aggr <- round_dates(example_movement_data, unit = "month", sum_weight = TRUE,
                                    movement_reference = list(movement_reference))
# Inspect aggregated record for holdings which have 2 movements in the same month:
movedata_months_aggr[lengths(movedata_months_aggr$movement_reference) == 2, ]
#> # A tibble: 1 x 5
#>   departure_cph dest_cph    departure_date qty_pigs movement_reference
#>   <chr>         <chr>       <date>            <dbl> <list>            
#> 1 57/427/5455   21/771/7140 2019-09-01          156 <dbl [2]>
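The aggregation step can be pictured with a base R sketch on a hand-written toy table: dates are first rounded down to the month, then quantities are summed and reference numbers collected for each combination of origin, destination and month. All names and values here (`toy`, `totals`, `refs`) are hypothetical, and the sketch uses `aggregate()` and `split()` rather than whatever round_dates() does internally.

```r
toy <- data.frame(
  from = c("A", "A", "B"),
  to   = c("B", "B", "C"),
  date = as.Date(c("2019-09-03", "2019-09-21", "2019-09-10")),
  qty  = c(60, 96, 40),
  ref  = c(101, 102, 103),
  stringsAsFactors = FALSE
)

# Round dates down to the first day of the month:
toy$date <- as.Date(format(toy$date, "%Y-%m-01"))

# Sum quantities per holding pair and month:
totals <- aggregate(qty ~ from + to + date, data = toy, FUN = sum)

# Collect the reference numbers that went into each aggregated record:
refs <- split(toy$ref, paste(toy$from, toy$to, toy$date))

totals
lengths(refs)  # number of original movements behind each record
```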

# Round movement reference numbers to the nearest multiple of 10:
movedata_ref10 <- round_weights(example_movement_data, unit = 10, column = "movement_reference")
head(movedata_ref10) # Inspect rounded movement reference numbers
#> # A tibble: 6 x 5
#>   departure_cph dest_cph    departure_date qty_pigs movement_reference
#>   <chr>         <chr>       <date>            <dbl>              <dbl>
#> 1 95/216/1100   19/818/9098 2019-02-08           97             304780
#> 2 69/196/5890   71/939/3228 2019-08-15          167             229760
#> 3 52/577/5349   82/501/8178 2019-09-15          115              36410
#> 4 39/103/5541   13/282/1763 2019-10-26          125             488620
#> 5 41/788/6464   57/418/6011 2019-10-17          109             581780
#> 6 69/393/9398   39/947/2201 2019-10-06           72             564910

Modifying holding coordinates

A function to resample holding coordinates in a density-dependent manner is under development within the hexscape package.