Hi Rose,
The recent commit changed x_tp1=x_tp to x_tp=x_tp1 (cb3d345), but are your results based on this new commit or the old one? I also got different numbers on Wikisection compared to your paper (using the version before that commit), as another person brought up in the latest post of #7. (Just to make sure: the results in the paper are based on GPT2-small, which is called gpt2 on Hugging Face, and not GPT2-large/xl, right?)
Besides, I have a question about simulate_brownian_bridge: in this function, x_tp1 = x_t * (1 - dt/(1-t)) + (dt/(1-t)) * B_T + noise, but why is there a fixed dt = 0.05? According to the Brownian bridge process, shouldn't this be either x_tp1 = x_0 * (1 - t) + t * B_T + noise (if you always interpolate between x_0 and B_T, as in the older version), or x_tp1 = x_t * (1 - 1/(T - num_samples)) + 1/(T - num_samples) * B_T + noise (if you interpolate between x_t and B_T, as in the newer version)? And why is the noise term fixed rather than depending on t and T as in Equation 1 of the paper?
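To make the two alternatives I have in mind concrete, here is a minimal NumPy sketch (the function names, the closed-form/Euler split, and the sqrt(dt) noise scaling are my own reading of Equation 1 and the bridge SDE, not code from your repo):

```python
import numpy as np

def bridge_closed_form(x_0, B_T, T, num_samples, sigma=1.0):
    # Sample the bridge marginals directly: interpolate between x_0 and B_T,
    # with variance t * (T - t) / T, which is how I read Equation 1.
    x_0, B_T = np.asarray(x_0, dtype=float), np.asarray(B_T, dtype=float)
    xs = [x_0]
    for step in range(1, num_samples):
        t = step / (num_samples - 1) * T
        mean = (1 - t / T) * x_0 + (t / T) * B_T
        var = sigma**2 * t * (T - t) / T
        xs.append(mean + np.sqrt(var) * np.random.randn(*x_0.shape))
    return xs

def bridge_euler(x_0, B_T, T=1.0, dt=0.05, sigma=1.0):
    # Euler discretization of the bridge SDE dX_t = (B_T - X_t)/(T - t) dt + sigma dW_t,
    # which is the form the dt/(1-t) coefficient in simulate_brownian_bridge suggests,
    # except that here the noise scales with sqrt(dt) instead of being fixed.
    x_t, B_T = np.asarray(x_0, dtype=float), np.asarray(B_T, dtype=float)
    t, xs = 0.0, [x_t]
    while t + dt < T:
        drift = dt / (T - t)
        x_t = x_t * (1 - drift) + drift * B_T + sigma * np.sqrt(dt) * np.random.randn(*x_t.shape)
        t += dt
        xs.append(x_t)
    xs.append(B_T)
    return xs
```

In the closed-form version the noise variance t(T-t)/T shrinks to zero at both endpoints, which is where my question about the fixed noise term in simulate_brownian_bridge comes from.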
Lastly, I wonder if it would be possible for you to share a trained model for one setting (Wikisection, TC-32), since that would address the issue in #7 as well. I understand it's hard to share big files, but Google Drive allows large uploads, and you could remove optimizer.pt to make the checkpoint smaller.
Thanks,
Yuntian