Nested Hierarchies? #66

alexrand-gprl · 2025-05-14T18:15:31Z

alexrand-gprl
May 14, 2025

I'm working on a meta-analysis of studies in which each study has the same single continuous outcome measurement of interest, measured over time for each participant.

I've gotten my ctsem model fitting separately to each study-specific dataset, with partial pooling across participants in that study. But I'd like to fit one big model that also does partial pooling of study-specific parameters across studies.

I'm new to ctsem but I haven't found any info on how/whether this might be possible in the documentation. One idea I had was to treat each participant's measurement as its own outcome variable, and fit the model as a multi-measurement factor model. But this seems tricky because studies vary greatly in their number of participants, with some having a few dozen, and others having several thousand.

Any guidance appreciated!

cdriveraus · 2025-05-14T20:08:44Z

cdriveraus
May 14, 2025
Maintainer

Yes, extra levels are a bit tricky. Handling one of the levels as fixed effects makes it much easier, but I assume that's not desirable. What exactly do you want to pool across? Do you want to allow every parameter to vary across individuals, with pooling based on both the study-specific population and the overall (many studies) population?

It might be possible to make each parameter a sum of two parameters, one N(popmean, popsd) and the other N(studymeandeviation, studysd), and have the study-specific distribution parameters moderated by study dummy variables. This would be interesting and nice to see. I'm happy to provide some support, but you would also be diving into the deeper waters of both ctsem and random-effects thinking! A sim study to verify the approach would be a good start.

I can't really envisage the measurement model version, but I might be misunderstanding the problem.
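To make the two-component idea concrete, here is a minimal simulation sketch in base R. It is only one reading of the suggestion above: every name and numeric value (pop_mean, study_dev, the study sizes) is an illustrative assumption, and nothing here uses ctsem itself. Each subject's parameter is the sum of an overall component and a study-specific component, so the implied study-level mean and sd can be compared with what the simulation produces.

```r
set.seed(1)

# Hypothetical design: 5 studies with very unequal numbers of participants
n_study <- 5
n_subj  <- c(30, 50, 200, 40, 1000)
study   <- rep(seq_len(n_study), times = n_subj)

# Component 1: shared across all studies, N(popmean, popsd)
pop_mean <- -0.5
pop_sd   <- 0.3

# Component 2: study specific, N(studymeandeviation, studysd)
study_dev <- rnorm(n_study, mean = 0, sd = 0.4)   # assumed study mean deviations
study_sd  <- runif(n_study, min = 0.1, max = 0.3) # assumed study-specific sds

# Each subject's parameter is the sum of the two components
overall_part <- rnorm(length(study), pop_mean, pop_sd)
study_part   <- rnorm(length(study), study_dev[study], study_sd[study])
theta        <- overall_part + study_part

# Moments implied by summing two independent normal components
implied_mean <- pop_mean + study_dev
implied_sd   <- sqrt(pop_sd^2 + study_sd^2)

comparison <- data.frame(
  study        = seq_len(n_study),
  n            = n_subj,
  implied_mean = implied_mean,
  sample_mean  = as.numeric(tapply(theta, study, mean)),
  implied_sd   = implied_sd,
  sample_sd    = as.numeric(tapply(theta, study, sd))
)
print(round(comparison, 3))
```

The smaller studies give noisier sample means and sds, which is where pooling towards the overall distribution would be expected to matter most.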



cdriveraus · 2025-05-24T08:02:10Z

cdriveraus
May 24, 2025
Maintainer

Yes, if you change the subject variable to a study variable then you've mashed all the individual time series together. Some kind of hack along those lines, where you make sure each subject starts a long enough time after the previous subject that they begin from the stationary distribution, may well work, though it is a bit on the ugly side!

A nicer approach might be to fit the individual study models without priors, estimate the population distribution of the raw parameters using the distribution from each study (you'll find the samples under myfit$stanfit$rawposterior), then modify the priors to reflect this. Unfortunately this will only shrink estimates based on the variance of each parameter. It would be possible to modify the internals of ctsem to accommodate a multivariate prior, but this is not the standard setup.
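As a rough illustration of the second suggestion, the sketch below estimates a per-parameter population mean and sd from a list of per-study posterior draws, using a simple method-of-moments correction. The function name pooled_prior_from_draws and the simulated draws are hypothetical; in practice each matrix would come from myfit$stanfit$rawposterior for one study. How the resulting values are then written into the ctsem priors is not shown here.

```r
# draws: list with one matrix per study (rows = posterior draws, columns = raw parameters).
# Returns one row per parameter with a suggested prior mean and sd (hypothetical helper).
pooled_prior_from_draws <- function(draws) {
  stopifnot(length(draws) >= 2)
  study_means <- do.call(rbind, lapply(draws, colMeans))                       # M x p
  within_var  <- do.call(rbind, lapply(draws, function(d) apply(d, 2, var)))   # M x p

  prior_mean <- colMeans(study_means)
  # Between-study variance of the means, minus the average within-study
  # (posterior) variance, floored at zero because the difference can go negative.
  between_var <- apply(study_means, 2, var) - colMeans(within_var)
  prior_sd    <- sqrt(pmax(between_var, 0))

  data.frame(parameter = colnames(draws[[1]]),
             prior_mean = prior_mean,
             prior_sd   = prior_sd,
             row.names  = NULL)
}

# Simulated stand-in for 20 study fits: only "drift" truly varies across studies
set.seed(1)
true_drift <- rnorm(20, mean = -0.5, sd = 0.3)
draws <- lapply(true_drift, function(m) {
  est_drift <- m + rnorm(1, 0, 0.1)      # noisy study-level estimate
  est_diff  <- 0.2 + rnorm(1, 0, 0.1)    # same true value in every study
  cbind(drift     = rnorm(2000, est_drift, 0.1),
        diffusion = rnorm(2000, est_diff, 0.1))
})

prior <- pooled_prior_from_draws(draws)
print(prior)
```

A prior sd that is floored at zero signals that the between-study variation is indistinguishable from estimation noise, so some small positive lower bound would probably be needed before using it as a prior.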

Perhaps there is a principled way to avoid the re-estimation and just shrink the obtained estimates towards the empirical Bayes estimates. I suspect there is, but I don't know it off the top of my head.

I'll have a little more time to think in a week or two.


cdriveraus · 2025-05-24T11:05:42Z

cdriveraus
May 24, 2025
Maintainer

Well, since I needed this for another project and it seemed much more interesting than what I otherwise had to do... here's an empirical Bayes approach / test simulation. The blue estimates are without shrinkage across multiple studies; the EB (red) estimates consider the population of studies. I ran it with 20 studies here, only the drift parameter had variation, and the dashed black line represents the approximate true value / mean.

library(ctsem)
library(ggplot2)
library(data.table)
# x  : list of length M; x[[i]] is an n_draws x p matrix of posterior samples
# out: list of length M; each element is a p-vector of EB/shrinkage estimates
get_empirical_bayes <- function(x, eps = 1e-8) {
  M <- length(x)
  if (M < 2) stop("Need at least two studies.")

  # --- Study-level posterior summaries --------------------------------------
  means <- t(vapply(x, colMeans, numeric(ncol(x[[1]]))))   # M x p
  Vlist <- lapply(x, cov)                                  # list of p x p

  # --- Hyper-parameter (population) estimates --------------------------------
  mu_hat <- colMeans(means)                                # pooled mean

  S_between   <- cov(means)                                # raw between-study cov
  W_withinbar <- Reduce("+", Vlist) / M                    # average within-study cov
  Sigma_hat   <- S_between - W_withinbar                   # method-of-moments Σ

  # regularise to ensure Σ̂ is positive-semi-definite
  eig <- eigen(Sigma_hat, symmetric = TRUE)
  eig$values[eig$values < 0] <- 0                          # truncate negatives
  Sigma_hat <- eig$vectors %*% diag(eig$values) %*% t(eig$vectors) +
    eps * diag(ncol(means))

  # --- Empirical-Bayes (shrinkage) estimates ---------------------------------
  EB <- vector("list", M)
  for (i in seq_len(M)) {
    Vi  <- Vlist[[i]]
    Ki  <- Sigma_hat %*% solve(Sigma_hat + Vi)              # gain matrix
    EB[[i]] <- drop(mu_hat + Ki %*% (means[i, ] - mu_hat))  # p-vector
    names(EB[[i]]) <- colnames(x[[i]])
  }
  EB
}

## Example usage (p = 3 parameters, 20 studies, 2000 posterior draws each):
# set.seed(1)
# x <- replicate(20, matrix(rnorm(2000*3), 2000, 3), simplify = FALSE)
# eb_est <- get_empirical_bayes(x)


nstudy <- 20

drift = -log1p_exp(rnorm(nstudy, 0, 0.5))

parsposterior <- list()
for(studyi in 1:nstudy){
  gm <- ctModel(
    DRIFT=drift[studyi],
    Tpoints=20,
    type='omx',
    LAMBDA = diag(1),
    CINT = 0,
    T0MEANS = 5,
    T0VAR = diag(1),
    DIFFUSION=1,MANIFESTVAR=.2)

  dat <- ctGenerate(n.subjects = 20,ctmodelobj = gm)

  m <- ctModel(type='stanct',LAMBDA=diag(1), DRIFT='drift|-log1p_exp(param)')
  f <- ctStanFit(datalong = dat,ctstanmodel = m,cores=1,priors=T)

  parnames=ctsem:::getparnames(f)
  parsposterior[[studyi]] <- f$stanfit$rawposterior

  if(studyi ==1) parestimates <- matrix(NA, nrow=nstudy, ncol=length(f$stanfit$rawest))
  parestimates[studyi,] <- f$stanfit$rawest

  if(f$standata$nindvarying > 0){
    ms=f$setup$matsetup
    ms=ms[!duplicated(ms$param) & ms$param > 0,]
    nsdpars <- f$standata$nindvarying
    sdparnames <- paste0('sd_',parnames[ms$param[f$standata$indvaryingindex]])
    parnames = c(parnames, sdparnames)
  }
  if(f$standata$nindvarying > 1){
    corparnames <- paste0('cor_',matrix(
      paste0(rep(parnames[ms$param[f$standata$indvaryingindex]],times=length(sdparnames)), '_',
        rep(parnames[ms$param[f$standata$indvaryingindex]], each=length(sdparnames))),
      nrow=f$standata$nindvarying, ncol=f$standata$nindvarying)[lower.tri(diag(f$standata$nindvarying))])
    parnames = c(parnames, corparnames)
  }

  colnames(parsposterior[[studyi]]) <- parnames

}

#compute approx true estimates (rough approach!, using large data)
gm$DRIFT[1]= -log1p_exp(0)
gm$Tpoints=200
bigdat <- ctGenerate(n.subjects=500, ctmodelobj=gm)
bigf <- ctStanFit(datalong = bigdat, ctstanmodel = m, cores=10)
truepars <- bigf$stanfit$rawest
names(truepars) = parnames


#get par posterior before shrinkage from estimated distribution
dtparspost <- data.table::rbindlist(lapply(parsposterior,as.data.table),idcol='study')
dtparspost <- melt(dtparspost, id.vars='study', variable.name='parname', value.name='value')

dtparspostmeans=copy(dtparspost)[, .(value=mean(value)), by=.(study,parname)] #per-study posterior means (labelled 'Max a posteriori' in the plots below)


#get empirical bayes estimates
ebpars <- get_empirical_bayes(parsposterior)
ebparsm <- do.call(rbind, ebpars)           # each row = EB for one study
colnames(ebparsm) <- names(ebpars[[1]])
ebdtpars <- melt(data.table(study=1:length(ebpars),ebparsm),variable.name='parname', value.name='value',id.vars='study')


#rbind data
dt <- rbind(
  data.table(EstimateType = 'Max a posteriori', dtparspostmeans),
  data.table(EstimateType = 'EmpiricalBayes', ebdtpars)
)
dt[,true:=truepars[as.character(parname)]] #index by name; a factor would index by integer code

# install.packages("gghalves")   # run once if you don’t have it
library(gghalves)

ggplot(dt,
  aes(y      = value,colour=factor(EstimateType))) +
  ## 1 ── cloud: half-violin (left)
  geom_half_violin(
    side     = "l",        # draw on the left of the y position
    trim     = FALSE,
    scale    = "width",    # all clouds same width
    width    = 0.8,
    alpha    = 0.50
  ) +
  ## 2 ── box: boxplot (centre)
  # geom_boxplot(
  #   width          = 0.10,             # thin boxes
  #   outlier.shape  = NA,
  #   alpha          = 0.55,
  #   orientation    = "x"               # horizontal boxes
  # ) +
  ## 3 ── rain: jittered points (right)
  geom_half_point(
    side        = "r",                 # draw on the right
    range_scale = 0.6,                 # how far from the centre
    size        = 1.8,
    alpha       = 0.75
  ) +
  geom_hline(
    aes(yintercept = true),            # horizontal line at true value
    linetype = "dashed",               # dashed line
    color    = "black",                # black line
    linewidth = 0.5                    # 'size' is deprecated for lines in ggplot2 >= 3.4
  ) +
  facet_wrap(~ parname, scales = "free") +   # one column per parameter
  theme_bw() +
  theme(
    strip.background = element_blank(),
    strip.placement  = "outside",
    axis.title.y     = element_blank(),
    axis.text.y      = element_text(size = 7),
    legend.position  = "bottom"
  ) +
  labs(
    y     = "Raw estimate",
    title = "Raincloud plot: density, box & points for each study"
  )



#now transform raw parameters into system parameters

ctTransformRawPars <- function(x, f){ #include this in ctsem!
  ms <- f$setup$matsetup
  mv <- f$setup$matvalues
  mv <- mv[!duplicated(ms$param) & ms$param > 0,]
  ms <- ms[!duplicated(ms$param) & ms$param > 0,]
  m <- cbind(data.table(ms),data.table(mv))
  for(pari in sort(m$param)){
  x[,pari]=ctsem:::tform(parin=x[,pari,with=F],transform = m$transform[pari], offset=m$offset[pari],
    multiplier = m$multiplier[pari],inneroffset = m$inneroffset[pari], meanscale = m$meanscale[pari])
  }
  return(x)
}

#reshape dt into study x parameter matrix
tfdtwide <- dcast(dt, study + EstimateType ~ parname, value.var = 'value')
#transform raw parameters
tfdtwide[,c(3:ncol(tfdtwide))] <- ctTransformRawPars(tfdtwide[,c(3:ncol(tfdtwide)),with=F], f)

tfdt <- melt(tfdtwide, id.vars=c('study','EstimateType'), variable.name='parname', value.name='value')
tftruepars <- as.numeric(ctTransformRawPars(data.table(t(truepars)), f))
tfdt[,true:=tftruepars[parname]]

ggplot(tfdt,
  aes(y      = value,colour=factor(EstimateType))) +
  ## 1 ── cloud: half-violin (left)
  geom_half_violin(
    side     = "l",        # draw on the left of the y position
    trim     = FALSE,
    scale    = "width",    # all clouds same width
    width    = 0.8,
    alpha    = 0.50
  ) +
  ## 2 ── box: boxplot (centre)
  # geom_boxplot(
  #   width          = 0.10,             # thin boxes
  #   outlier.shape  = NA,
  #   alpha          = 0.55,
  #   orientation    = "x"               # horizontal boxes
  # ) +
  ## 3 ── rain: jittered points (right)
  geom_half_point(
    side        = "r",                 # draw on the right
    range_scale = 0.6,                 # how far from the centre
    size        = 1.8,
    alpha       = 0.75
  ) +
  geom_hline(
    aes(yintercept = true),            # horizontal line at true value
    linetype = "dashed",               # dashed line
    color    = "black",                # black line
    linewidth = 0.5                    # 'size' is deprecated for lines in ggplot2 >= 3.4
  ) +
  facet_wrap(~ parname, scales = "free") +   # one column per parameter
  theme_bw() +
  theme(
    strip.background = element_blank(),
    strip.placement  = "outside",
    axis.title.y     = element_blank(),
    axis.text.y      = element_text(size = 7),
    legend.position  = "bottom"
  ) +
  labs(
    y     = "Transformed estimate",
    title = "Raincloud plot: density, box & points for each study"
  )
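
Written out, the quantities computed by get_empirical_bayes() above are as follows, where \bar{\theta}_i and V_i are the posterior mean vector and posterior covariance matrix of the raw parameters for study i, and M is the number of studies. This is a transcription of the code into notation rather than an additional derivation; [.]_+ is shorthand (introduced here) for setting negative eigenvalues to zero, and \varepsilon is the eps argument.

```latex
\begin{aligned}
\hat{\mu} &= \frac{1}{M}\sum_{i=1}^{M}\bar{\theta}_i, \\
S_B &= \frac{1}{M-1}\sum_{i=1}^{M}(\bar{\theta}_i-\hat{\mu})(\bar{\theta}_i-\hat{\mu})^{\top},
\qquad
\bar{W} = \frac{1}{M}\sum_{i=1}^{M}V_i, \\
\hat{\Sigma} &= \left[\,S_B-\bar{W}\,\right]_{+} + \varepsilon I, \\
K_i &= \hat{\Sigma}\,\bigl(\hat{\Sigma}+V_i\bigr)^{-1}, \\
\hat{\theta}_i^{\mathrm{EB}} &= \hat{\mu} + K_i\,(\bar{\theta}_i-\hat{\mu}).
\end{aligned}
```

A study with a wide posterior (large V_i) gets a gain K_i close to zero and is pulled strongly towards \hat{\mu}, while a precisely estimated study keeps an estimate close to its own posterior mean.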


alexrand-gprl · 2025-05-28T17:57:26Z

alexrand-gprl
May 28, 2025
Author

Wow thanks Charles, this is super clear and helpful!

I still have my heart set on full Bayes, and I'm curious about the possibility of modifying the ctsem-generated Stan code for the non-hierarchical version of the model to accommodate pooling across studies. In an ideal world this could even evolve into a PR that would allow for this functionality in ctsem.

It is a daunting idea but I have time and energy to devote to it. Do you have any advice on:

- How feasible this seems to you on the face of it?
- Where you might begin in the modification of the Stan code?


cdriveraus · 2025-08-05T07:51:27Z

cdriveraus
Aug 5, 2025
Maintainer

I'm sorry I ignored / missed this, Alex! I came back here because someone else (José Ángel Martínez-Huertas) was asking me about the related issue of crossed random effects. Perhaps you two would like to discuss it; I'm really happy to support in general.

Basically, I just don't find full Bayes via Stan very feasible at present with ctsem. Making substantial changes to the code so that it's parallelisable within Stan would be one place to start. Other things I can think of:

- Using the empirical Bayes estimates to set some kind of more sensible priors.
- Using / improving the optimization then importance sampling approach already available within ctsem (this would give true Bayesian outcomes but with a linearized / integrated version of individual differences in parameters, which is not perfect for e.g. diffusion variance).
- Discussing with another colleague the potential for some kind of autoencoder neural net setup for optimisation / sampling.
- Waiting to see whether one of my students' attempts to make it all work using Julia in the backend rather than Stan pays off; that might offer more flexibility for sampling setups etc.

With the Stan code, I would honestly start by stripping out all the stuff (at least within the core Kalman filter section) that is not needed for a basic example. It will be frustrating in the beginning but will save you pain in the long run. From there you would need to restructure things so that the subject / id dependency is not univariate, and forward predictions are based off the correct prior state estimates. I guess I would love to see an approach using the Kalman filter logic so it could be integrated easily in an optimisation framework too, because I think Bayes is just too slow for trying out many models etc.
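
For orientation, a stripped-down Kalman filter for a single subject and a single latent process fits in a few lines of R. This is a sketch of the general technique, not the ctsem or Stan implementation: the function name ou_kalman_loglik and its arguments are made up for illustration, the process is a univariate Ornstein-Uhlenbeck process with no intercept, and diffusion_sd is assumed to be on the standard deviation scale.

```r
# Log-likelihood of irregularly spaced observations y under
#   dx = drift * x dt + diffusion_sd dW,   y_k = x(t_k) + e_k,  e_k ~ N(0, meas_sd^2)
# computed by the Kalman filter prediction error decomposition.
ou_kalman_loglik <- function(y, times, drift, diffusion_sd, meas_sd,
                             t0_mean, t0_var) {
  stopifnot(drift < 0, length(y) == length(times))
  x  <- t0_mean   # predicted state mean at the first observation
  P  <- t0_var    # predicted state variance at the first observation
  ll <- 0
  for (k in seq_along(y)) {
    if (k > 1) {
      dt <- times[k] - times[k - 1]
      A  <- exp(drift * dt)                                        # discrete-time autoregression
      Q  <- diffusion_sd^2 * (exp(2 * drift * dt) - 1) / (2 * drift) # process noise over dt
      x  <- A * x          # forward prediction from the previous updated state
      P  <- A^2 * P + Q
    }
    S   <- P + meas_sd^2   # prediction error variance
    err <- y[k] - x        # prediction error
    ll  <- ll + dnorm(err, mean = 0, sd = sqrt(S), log = TRUE)
    K   <- P / S           # Kalman gain
    x   <- x + K * err     # measurement update
    P   <- (1 - K) * P
  }
  ll
}

# Example call on made-up data with irregular observation times
set.seed(1)
times <- cumsum(c(0, runif(19, 0.5, 1.5)))
y     <- rnorm(20)
ll <- ou_kalman_loglik(y, times, drift = -0.7, diffusion_sd = 1,
                       meas_sd = 0.45, t0_mean = 0, t0_var = 1)
print(ll)
```

Because the function returns a plain log-likelihood, it could be passed to optim() as well as used inside a sampler. Extending it to nested levels would mean deciding which parameters are shared within a study and which vary by subject, and making sure each subject's forward prediction starts from that subject's own prior state.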
