- Do not modify variable names in `ck_setup()` [#17]
- Performance fix: do not compute contributing indices (relevant only for the perturbation of numerical variables) if no such variables were specified in `ck_setup()`
- Prefix logging output with a timestamp
- Updated unit tests to be compatible with CRAN checks for macOS
- Update References in Package Vignette
- first version on CRAN
- updated due to changes in package `ptable`
- parallel computation is controlled using the environment variable `CK_RUN_PARALLEL`. If this is set to `TRUE`, parallel computation is enabled; otherwise it is disabled. By default, parallel computing is disabled.
- do not rename columns of `ptable` input object
- fix vignette and tests
- bump requirements
- performance improvement: do not compute max contributions if no numeric key variable was specified
- feature: new method `$supp_cells()` that allows specifying sensitive cells based on names
- bugfix: replace `sign()` with `ifelse` to enforce perturbation of cells that require additional protection
- bugfix: do not perturb cells with value 0 using flex-approach
- bugfix: use only actual number of contributors in flex-approach when perturbing numvars
- performance-improvement: speed-up computation of contributing units to cells
- bugfix: scrambling of cell keys that do not have enough digits
- bugfix: computation of the separation point in flex-approach
- set default value of argument `w = NULL` in `ck_setup()`
- improve some error messages/outputs
- bugfix when looking up perturbation values; small_cells and others must not intersect
- fixing issues with empty cells for magnitude tables
- fixed computation of weighted spreads, thx @staudtlex
- do not perturb cells with 0 contributors in the flex-approach
- update tests due to updates in `digest`-pkg
- correctly compute weighted spread
- do not convert variable names to lowercase
- document R6 methods/classes via `roxygen2`
- new method `hierarchy_info()` containing some important information for each dimension
- reference new methods `create_cnt_ptable()` and `create_num_ptable()` from ptable-pkg
- update vignette
- improve documentation of `ck_params_nums()`
- update tests due to updates in ptable-pkg
- feature: allow objects from `ptable::pt_create_pParams` as input in `ck_params_cnts()`
- feature: allow objects from `ptable::pt_create_pParams` as input in `ck_params_nums()`
- bugfix: fall back to using a single core on Windows machines, fixing issue #131
- updating dependencies and required versions
- fix vignette due to updates in `ptable`-pkg
- simplify examples by using exemplary ptables from `ptable`-pkg via `ptable::pt_ex_cnts()` and `ptable::pt_ex_nums()`
- feature: allow tabulation of non-perturbed variables in `freqtab()`
- feature: allow tabulation of non-perturbed variables in `numtab()`
- code linting
- bugfix: fixing issues with "simple" approach; harmonizing and code-cleanup
- allow saving perturbation schemes for different variables in `params_cnts_set()` and `params_nums_set()`
- allow returning the currently active perturbation parameters for variables with `params_cnts_get()` and `params_nums_get()`
- added new convenience methods `allvars()`, `numvars()` and `cntvars()` returning variable names eligible for perturbation
- implemented the perturbation of numerical variables
  - new method `ck_params_nums()` to define perturbation parameters for continuous variables along with helper functions `ck_flexparams()` and `ck_simpleparams()`
  - new method `numtab()` to extract numerical tables
  - new method `mod_nums()` returning modifications for numerical variables
- updated methods `print()` and `summary()` to include information about perturbed continuous variables
- new methods `reset_cntvars()`, `reset_numvars()` and `reset_allvars()` to remove perturbation results and provided perturbation parameters
- new methods to identify sensitive cells
  - `$supp_freq(v, max_n)`
  - `$supp_nk(v, max_n)`
  - `$supp_p(v, max_n)`
  - `$supp_pq(v, max_n)`
- added test-cases and improved coverage
- make use of ptables from `ptable`-pkg
- Reproducibility:
  - allow to write perturbation parameters as yaml in `ck_params_nums()` and `ck_params_cnts()`
  - allow to import such parameters with `ck_read_yaml()`
- updated and extended package vignette
- correctly compute perturbed weighted counts
- fixed issue when looking up values for frequency tables
- adding parameter `exclude_zero_cells` to `ck_cnt_measures()`
- updated documentation
- force usage of functionality from `sdcHierarchies` to define hierarchies
- removed features to perturb magnitude tables for now as parametrisation from `ptable`-pkg is not yet defined
- removed possibility to specify parameters for count variables using the `ABS` definition
- removed `by`-argument in `$perturb()` method
method - rewrite frequency table perturbation using
R6
classes- new function
ck_setup()
to define a table - allow multiple variables in method
$perturb()
-method - removed
ck_export_table()
and added arguments to methodfreqtab()
- new method
$print()
for R6 objects - new method
$summary()
for R6 objects - new method
$mod_cnts()
returning modifications for count variables - new method
$params_cnts()
that allow to query and set count parameters
- new function
- updated `ck_cnt_measures()`
  - renamed `false_positives` to `false_nonzero`
  - improved documentation
  - add table of to `ck_cnt_measures` showing exact perturbations
  - harmonized output with tau-argus
- new method `$measures()` uses `ck_cnt_measures()` internally for count variables
- updated unit tests for counts based on hashes
- updated package vignette
- fix tests due to changes in R 3.6.0
- install `ptable` from personal fork until `sdcTools/ptable` is updated
- removed placeholder for `pThreshold` in `perturbTable()`
- make use of new package `sdcHierarchies` to generate and update hierarchies
- new function `ck_rename_nodes()`
- `perturbTable()` got a new argument `pThreshold` that allows specifying a threshold above which no perturbation is applied, independent of the perturbation table. Currently only a placeholder and not used.
- new convenience function `ck_vignette()` that displays the package vignette in a browser
- `ck_generate_rkeys()` got a new argument `seed` that allows overwriting the default seed computed from a hash of the input dataset.
- improvements in code-styling and readability of examples
- improvements in vignette
- feature: new method `ck_export_table()` that allows saving results in a simple format
- improvement: better error message if too large values for `bigN` are specified
- improvement: better error message if parameter `smallN` is too large with respect to the specified pTable
- improvement: display message about ignored parameters in `ck_generate_rkeys()` only if non-required parameters have actually been specified
- improvement: no warning messages that parameters are ignored in case they are irrelevant
- feature: new function `ck_cnt_measures_basic()` that computes infoloss/utility measures based on two input vectors referring to original and perturbed values
- bugfix: check that record-keys in destatis-format are >= 0
- feature: perturbation parameters for magnitude tables can be left empty
- new function `ck_cnt_measures()` that computes some (distance-based) information loss measures for count variables
- updated vignette and examples
- feature: new method `print()` for objects returned from `perturbTable()`
- feature: new method `summary()` for objects returned from `perturbTable()`
- small updates to reflect changes in ptable
- depend on package ptable to generate the perturbation tables by rewriting `ck_create_pTable()`; thus the package must be installed, e.g. using `devtools::install_github("sdcTools/ptable", build_vignette=FALSE)`
- feature: use package ptable to generate pTables in destatis format
- feature: use functions to create hierarchies directly from `sdcTable` and bump version requirement of this package to `>=0.23`
- feature: if a (valid) variable is specified in argument `by` in `perturbTable()`, it is automatically added to `countVars` even if not explicitly specified.
- feature: `perturbTable()` gained an optional new argument `by`. In this argument one can use a variable that must also be listed in `countVars`. This variable is then used to compute the magnitude tables by the given 0/1 binary variable. For an example see `?perturbTable`.
- feature: new argument `countVars` in `perturbTable()` which allows additionally tabulating any number of 0/1 variables. For such variables, the record keys of non-contributing units are set to 0 prior to the lookup in the perturbation table
- removed method `results()` and replaced it with new methods `ck_freq_table()` and `ck_cont_table()` that should be used to query specific tables from the output of `perturbTable()`
- updated examples showing new features in `perturbTable()`, `ck_freq_table()` and `ck_cont_table()`
- updated introduction vignette
- bugfix for continuous tables using pre-specified record keys
- feature: new dynamic way to specify hierarchies for tables, for an example see `?ck_manage_hierarchies`. This functionality will eventually also find its way to sdcTable
- bugfix: rkeys need not be integers if the "destatis"-method is used
- small fixes and some exported functions gained `verbose` arguments
- feature: perturbation tables (`pTable`) can now be specified in two different formats. The (default) way is to specify it as described in the original ABS-paper Methodology for the Automatic Confidentialisation of Statistical Outputs from Remote Servers at the Australian Bureau of Statistics (Thompson, Broadfoot, Elazar). An alternative way is to provide the perturbation tables for count tables in the "destatis"-format. `ck_create_pTable(type="destatis")` returns an exemplary pTable in this format. In the future, such pTables will likely be generated from another package. As the requirements regarding record keys are different in the following lookup-approach, we have already implemented some (basic) checks for validity of record keys when they are already available in the microdata used in `ck_create_input()`.