How do privacy-enhancing modifications affect network analyses?
To protect the privacy of the livestock industry, livestock movement data often need to be made less identifiable before being shared more widely, e.g. with researchers at universities. Modifications that enhance privacy can include jittering (addition of random noise) or rounding of numeric data, such as movement weights and dates. However, it is not always clear how these modifications affect the accuracy of epidemiologically relevant network analyses.
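To make these two modifications concrete, here is a minimal base-R sketch of jittering and rounding a handful of movement weights. It is an illustration only, not movenet's implementation: the weights, `jitter_range`, `round_unit`, and the choices to keep weights at 1 or above and to round up are all assumptions made for this example.

```r
# Illustrative only: base-R jittering and rounding of movement weights.
# All names and values below are made up for this sketch.
set.seed(42)
weights <- c(3, 17, 48, 120, 5)  # e.g. number of animals per movement

# Jittering: add uniform random noise within +/- jitter_range,
# then round to whole animals and keep every weight at 1 or above
jitter_range <- 5
noise <- runif(length(weights), -jitter_range, jitter_range)
jittered <- pmax(1, round(weights + noise))

# Rounding: round each weight up to the nearest multiple of round_unit
round_unit <- 10
rounded <- ceiling(weights / round_unit) * round_unit

data.frame(original = weights, jittered = jittered, rounded = rounded)
```

Larger values of `jitter_range` or `round_unit` hide more detail about individual movements, but also move the modified weights further from the originals.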
This vignette demonstrates how to use the create_anonymisation_effect_analysis_report() function to investigate this for your favourite movement dataset. The aim is to help you find a suitable balance between privacy and accuracy when you are considering privacy enhancement for your data.
N.B. This function is currently only implemented for investigating the effects of jittering/rounding movement weights on a selection of weighted network properties. An implementation for investigating the effects of jittering/rounding movement dates on temporal network properties is in the pipeline.
Setting up
To get started, first load the movenet package.
Then, load a configuration file that can be used with the example movement dataset that we will use in this vignette (example_movement_data). The configuration file tells movenet which columns in the dataset contain which data types, so that the correct columns get modified when using the privacy-enhancing functions. For more details on configuration files, see vignette("movenet") and vignette("configurations").
library(movenet)
# Load a movement config file:
load_config(system.file("configurations", "ScotEID.yml", package="movenet"))
#> Successfully loaded config file: C:/Users/cboga/AppData/Local/Temp/Rtmp25xhub/temp_libpath11484970381d/movenet/configurations/ScotEID.yml
Investigating the effects of jittering and rounding movement weights on network properties
The function create_anonymisation_effect_analysis_report() takes as input a livestock movement dataset, and analyses how network analyses of these data are affected by jittering or rounding of movement weights. It creates an HTML report with visualisations of the effects of different amounts of jittering and rounding on a selection of epidemiologically relevant global network properties, as well as on the ranking of holdings according to various centrality measures.
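As a rough illustration of what "effect on the ranking of holdings" can mean, the sketch below compares a weighted centrality ranking before and after rounding, using Spearman's rank correlation. The toy movements, the choice of weighted in-strength as the centrality measure, and the use of Spearman's correlation are assumptions for this example; the measures actually used in the report may differ.

```r
# Hypothetical toy movements between four holdings (A to D)
moves <- data.frame(
  from   = c("A", "A", "B", "C", "D", "D"),
  to     = c("B", "C", "C", "D", "A", "B"),
  weight = c(14, 7, 33, 6, 26, 14)
)

# Weighted in-strength: total weight of movements arriving at each holding
in_strength <- function(d) tapply(d$weight, d$to, sum)

# Example modification: round every weight to the nearest 10
rounded_moves <- transform(moves, weight = round(weight / 10) * 10)

original_strength <- in_strength(moves)          # A = 26, B = 28, C = 40, D = 6
modified_strength <- in_strength(rounded_moves)  # A = 30, B = 20, C = 40, D = 10

# Spearman rank correlation: 1 means the ranking of holdings is unchanged.
# Here holdings A and B swap places, giving a value of 0.8.
rank_agreement <- cor(original_strength, modified_strength, method = "spearman")
rank_agreement
```

In this toy case, rounding to the nearest 10 is enough to swap two holdings with similar in-strength, which is the kind of change in ranking that matters when holdings are prioritised for surveillance.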
The function uses the following arguments:
- movement_data: A movenet-format movement data tibble for which to analyse the effects of privacy-enhancing modifications. Here we use the example movement data provided with movenet.
- output_file: The path to the output file where the report will be saved. If the file name does not include a path, the report will be saved in your current working directory. Here we save the file to a temporary directory.
- modify_weights: A logical indicating whether to modify the weights in the dataset. This needs to be TRUE for the current implementation.
- (modify_dates: A logical indicating whether to modify the dates in the dataset. This is not yet implemented and is currently automatically set to FALSE.)
- n_jitter_sim: The number of times to run jitter_weights() on movement_data, to allow for random variation in the results. Results are averaged over these runs.
- time_unit: The time unit over which to analyse the data, e.g. "1 week", "28 days" or "1 month". Here we use "28 days", meaning that the overall movement data will be converted into a series of networks for all consecutive 28-day periods in the data, and network properties will be calculated for each of these 28-day networks.
- data_reference: An optional string that can be included as a subtitle in the report to reference (for example) the dataset used.
- verbose: A logical indicating whether to print progress messages to the console.
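To give an intuition for time_unit and n_jitter_sim, the following base-R sketch splits a few movements into consecutive 28-day periods, computes one summary value per period, and averages that value over several jittered copies of the data. It is a simplified stand-in, not movenet's code: the toy data, the jitter range of 2, and the use of "total weight per period" in place of a real network property are assumptions for illustration.

```r
# Hypothetical movement dates and weights (not movenet's example data)
moves <- data.frame(
  date   = as.Date("2019-01-01") + c(0, 10, 27, 28, 40, 60),
  weight = c(5, 12, 8, 20, 3, 15)
)

# Assign each movement to a consecutive 28-day period,
# counted from the earliest date in the data
moves$period <- cut(moves$date, breaks = "28 days")

# One summary value per period (a stand-in for a network property)
weight_per_period <- tapply(moves$weight, moves$period, sum)
weight_per_period

# Repeat a random jitter n_jitter_sim times and average the per-period
# results, so that a single unlucky draw does not dominate the outcome
set.seed(1)
n_jitter_sim <- 3
sim_results <- replicate(n_jitter_sim, {
  jittered <- pmax(1, moves$weight + round(runif(nrow(moves), -2, 2)))
  tapply(jittered, moves$period, sum)
})
mean_per_period <- rowMeans(sim_results)
mean_per_period
```

Each column of `sim_results` holds one jittered run and each row one 28-day period, so `rowMeans()` gives the per-period average across runs.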
create_anonymisation_effect_analysis_report(
  example_movement_data,
  output_file = file.path(tempdir(), "network_report.html"),
  modify_weights = TRUE,
  n_jitter_sim = 3,
  time_unit = "28 days",
  data_reference = "movenet's example_movement_data",
  verbose = TRUE
)
This creates an HTML document that can be opened in a browser and contains a full report, as shown below: