Anonymise data by replacing holding identifiers with prefix-integer combinations
anonymise() anonymises a holding or movement data frame by replacing holding identifiers with prefix-integer combinations. Both the anonymised data frame and the anonymisation key are returned. By default, a new anonymisation key is generated; alternatively, an existing key can be provided.
Arguments
- data
A holding or movement data frame.
- prefix
Character string, to form the basis of anonymised holding identifiers. An integer will be appended to form this new identifier.
- key
A named character vector to be used as anonymisation key, or NULL (default) to generate a new key. A provided key should have original holding identifiers as names, and new (anonymised) identifiers as values.
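The expected shape of key can be illustrated with a small base R sketch. The identifiers and the FARM prefix below are made up for illustration and do not come from the package.

```r
# Hypothetical key: names are original holding identifiers,
# values are the anonymised identifiers (prefix "FARM" plus an integer).
key <- c(holding_A = "FARM1", holding_B = "FARM2")

# Original identifiers are stored as the names of the vector.
original_ids <- names(key)

# Look up anonymised identifiers by original identifier.
anonymised <- unname(key[c("holding_B", "holding_A")])
```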
Value
A named list with two elements:
- data
containing the anonymised data frame.
- key
containing the applied anonymisation key. This has the form of a named character vector, with original holding identifiers as names, and new (anonymised) identifiers as values.
Details
Requires that the appropriate config file is loaded, to identify the column(s) in data that contain(s) holding identifiers: origin (from) and destination (to) columns for movement data, or the id column for holding data.
If key is NULL (default), a new anonymisation key is generated: each holding receives a new identifier consisting of prefix followed by an integer between 1 and the total number of holdings. Integers are assigned to holdings in a random order.
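The key generation described above can be sketched in base R roughly as follows. This is an illustration of the described behaviour, not the package's actual implementation; make_key() and the example identifiers are hypothetical.

```r
# Hypothetical helper: build a new key for a set of holding identifiers.
make_key <- function(ids, prefix) {
  ids <- unique(as.character(ids))
  # sample.int(n) returns the integers 1..n in random order.
  new_ids <- paste0(prefix, sample.int(length(ids)))
  # Original identifiers become names, anonymised identifiers the values.
  stats::setNames(new_ids, ids)
}

new_key <- make_key(c("A", "B", "C", "A"), prefix = "FARM")
```

Each distinct holding receives exactly one of FARM1, FARM2, FARM3; which holding receives which integer varies between runs unless a seed is set.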
If an existing key is provided, its coverage of the holding identifiers in data is checked. If all holding identifiers in data are present among the element names of key, the key is used for anonymisation as-is: holding identifiers in data are replaced with the values of the elements of the same name in key. Otherwise, if data contains holding identifiers that are not present in key, the key is expanded with additional prefix-integer combinations for those identifiers.
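A minimal base R sketch of applying and, where needed, expanding an existing key is given below. apply_key() is a hypothetical helper, and the numbering of new identifiers (continuing after the largest integer already used in the key) is an assumption; the documentation does not state how anonymise() numbers the additions.

```r
# Hypothetical helper: anonymise the given columns using an existing key,
# expanding the key for identifiers it does not yet cover.
apply_key <- function(data, cols, key, prefix) {
  ids <- unique(unlist(lapply(data[cols], as.character), use.names = FALSE))
  missing_ids <- setdiff(ids, names(key))
  if (length(missing_ids) > 0) {
    # Assumption: new integers continue after the largest one in the key.
    used <- suppressWarnings(as.integer(sub(prefix, "", key, fixed = TRUE)))
    start <- max(c(0L, used), na.rm = TRUE)
    new_ids <- paste0(prefix, start + sample.int(length(missing_ids)))
    key <- c(key, stats::setNames(new_ids, missing_ids))
  }
  # Replace each identifier with the value of the same-named key element.
  for (col in cols) {
    data[[col]] <- unname(key[as.character(data[[col]])])
  }
  list(data = data, key = key)
}

movements <- data.frame(from = c("A", "B", "C"), to = c("B", "C", "D"),
                        stringsAsFactors = FALSE)
old_key <- c(A = "FARM1", B = "FARM2")
res <- apply_key(movements, c("from", "to"), old_key, prefix = "FARM")
```

Here holdings A and B keep their existing anonymised identifiers, while C and D are added to the key. The result mirrors the documented return value: a list with the elements data and key.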