Improve ROMM within addNoise by RegSDC methodology #271

@olangsrud

It seems that ROMM within addNoise is implemented in a way that does not preserve sample means. Below I suggest how to fix this and speed up the calculations considerably by using the methodology from a recent paper. See https://github.com/olangsrud/RegSDC (hopefully soon on CRAN). In what follows, I refer to the functions in that package.
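As a general illustration of why sample means can be lost, here is a minimal base-R sketch. It uses toy data and a generic random orthogonal matrix; it is not sdcMicro's actual ROMM code, and the names `x`, `A` and `x_rot` are made up for the example. Multiplying the data by an orthogonal matrix keeps the uncentred cross-products, but the column means survive only if the constant vector is handled separately.

```r
set.seed(1)
n <- 100
x <- matrix(rnorm(n * 3, mean = 10), n, 3)   # toy data, not testdata

# Generic random orthogonal n x n matrix (Q factor of a Gaussian matrix).
A <- qr.Q(qr(matrix(rnorm(n * n), n, n)))

# Rotate the raw (uncentred) data.
x_rot <- A %*% x

# The uncentred cross-products t(x) %*% x are preserved ...
crossprod_kept <- isTRUE(all.equal(crossprod(x_rot), crossprod(x)))

# ... but the column means generally move.
max_mean_shift <- max(abs(colMeans(x_rot) - colMeans(x)))

print(crossprod_kept)
print(max_mean_shift)
```

Centring before the decomposition and adding the means back afterwards, as in the suggestion below, avoids this.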

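library(sdcMicro)  # provides addNoise and testdata
library(RegSDC)    # provides RegSDCromm and RegSDCgen
data(testdata)

y <- testdata[sample(NROW(testdata), 100), c("expend", "income", "savings")]
addNoise(y, method = "ROMM")$xm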

# An almost identical method (see the paper's discussion of the sequential phenomenon for the minor differences) is

RegSDCromm(y, lambda = 0.001, ensureIntercept = FALSE)

# This can be viewed as a high-speed version of the current implementation in addNoise.
# Sample means are preserved by the default method, where ensureIntercept = TRUE.
# Other values of lambda may be used.

RegSDCromm(y, lambda = 0.001)

# This is equivalent to calling a more general function 

RegSDCgen(y, lambda = 0.001, makeunique = TRUE)

# The makeunique parameter is of minor importance, but it must be TRUE if exact distributional behaviour
# matters (for example, when sampling from RegSDCromm several times). Otherwise, setting makeunique to FALSE is fine.

# Feel free to import/wrap functions from RegSDC within sdcMicro.
# However, this line

RegSDCgen(y, lambda = 0.001, makeunique = FALSE)

# can be implemented without using RegSDC by 

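lambda <- 0.001
y <- as.matrix(y)
# Matrix with the same dimensions as x, each row holding the column means of x
Mean <- function(x) t(matrix(colMeans(x), ncol(x), nrow(x)))
# QR decomposition of the centred data
qr1 <- qr(y - Mean(y))
qr1Q <- qr.Q(qr1)
# Perturb the orthonormal factor with Gaussian noise scaled by lambda
z <- qr1Q + lambda * matrix(rnorm(length(qr1Q)), nrow(y))
# Re-centre and re-orthogonalise the perturbed factor
qr2 <- qr(z - Mean(z))
# Combine the new Q with the original R and add the original means back
Mean(y) + qr.Q(qr2) %*% qr.R(qr1)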

# The Mean helper can be replaced in several ways (for example with sweep or scale). The result differs
# from that of RegSDCgen only at the level of numerical precision (call set.seed with the same seed before each to check).
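The preservation of sample means and covariances by the base-R steps above can be checked with a small sketch. The wrapper name `romm_sketch` and the toy data are hypothetical, introduced only for this illustration, and `sweep` is used in place of the `Mean` helper as one of the possible replacements. It assumes the centred data have full column rank, so that `qr` does not pivot columns.

```r
# Hypothetical helper (name is illustrative) wrapping the base-R steps from the issue.
romm_sketch <- function(y, lambda = 0.001) {
  y <- as.matrix(y)
  centre <- function(x) sweep(x, 2, colMeans(x))   # subtract column means
  qr1 <- qr(centre(y))                             # QR of the centred data
  q1 <- qr.Q(qr1)
  # Perturb the orthonormal factor with Gaussian noise scaled by lambda.
  z <- q1 + lambda * matrix(rnorm(length(q1)), nrow(y))
  q2 <- qr.Q(qr(centre(z)))                        # re-centre, re-orthogonalise
  # New Q times original R, with the original column means added back.
  sweep(q2 %*% qr.R(qr1), 2, colMeans(y), "+")
}

# Toy data with correlated columns and non-zero means (not testdata).
set.seed(123)
mix <- matrix(c(2, 1, 0, 0, 1, 1, 0, 0, 3), 3)
toy <- matrix(rnorm(300), 100, 3) %*% mix + 50
masked <- romm_sketch(toy)

# Both checks should hold up to numerical precision.
mean_ok <- isTRUE(all.equal(colMeans(masked), colMeans(toy)))
cov_ok  <- isTRUE(all.equal(cov(masked), cov(toy)))
print(c(mean_ok = mean_ok, cov_ok = cov_ok))
```

The means are kept because the columns of the second Q factor are built from centred columns and therefore sum to zero; the covariances are kept because that Q factor has orthonormal columns and is multiplied by the original R.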
