Thank you for your contribution to NLP. I have a few questions. I encountered some issues while trying to train a model: I used the OpenWebText dataset as the training set and WikiText-103 as the validation set, but after training, the lowest validation perplexity was over 100, which does not match the results reported in your paper. Here are the parameters in my configuration file:
```yaml
defaults:
  - _self_
  - model: small
  - override hydra/launcher: submitit_slurm

ngpus: 1
tokens: 50257

training:
  batch_size: 16
  accum: 1
  n_iters: 400000
  snapshot_freq: 50000
  log_freq: 50
  eval_freq: 100
  snapshot_freq_for_preemption: 10000
  weight: standard
  snapshot_sampling: True
  ema: 0.9999

data:
  train: /home/m125656330/input/Score-Entropy-Discrete-Diffusion-main/date/openwebtext
  valid: /home/m125656330/input/Score-Entropy-Discrete-Diffusion-main/date/wikitext-103
  cache_dir: /home/m125656330/input/catch

graph:
  type: absorb
  file: data
  report_all: False

noise:
  type: loglinear
  sigma_min: 1e-4
  sigma_max: 20

sampling:
  predictor: euler
  steps: 128
  noise_removal: True

eval:
  batch_size: 2
  perplexity: True
  perplexity_batch_size: 1

optim:
  weight_decay: 0
  optimizer: AdamW
  lr: 3e-4
  beta1: 0.9
  beta2: 0.999
  eps: 1e-8
  warmup: 2500
  grad_clip: 1.

hydra:
  run:
    dir: exp_local/${data.train}/${now:%Y.%m.%d}/${now:%H%M%S}
  sweep:
    dir: exp/${data.train}/${now:%Y.%m.%d}/${now:%H%M%S}
    subdir: ${hydra.job.num}
  launcher:
    max_num_timeout: 100000
    # timeout_min: 10079
    partition: g40x
    account: stanford
    mem_gb: 96
    cpus_per_task: 40
    gpus_per_node: ${ngpus}
    constraint: null
```
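For reference, the perplexity I report above is the usual exponentiated average per-token negative log-likelihood, which is how I interpret the `eval.perplexity` option. A minimal sketch of the arithmetic (the numbers below are made up purely for illustration):

```python
import math

def perplexity(total_nll: float, total_tokens: int) -> float:
    """Exponentiated average per-token negative log-likelihood."""
    return math.exp(total_nll / total_tokens)

# Illustrative only: an average NLL of ~4.6 nats/token already
# corresponds to a perplexity of ~100, which is roughly what I see.
print(perplexity(total_nll=4.6e6, total_tokens=1.0e6))  # exp(4.6) ≈ 99.5
```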
Additionally, I tried fine-tuning the pre-trained small model you provided and got a perplexity of over ten thousand. I am not sure where the problem lies; any help you can provide would be greatly appreciated.
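For the fine-tuning run, I loaded the released checkpoint roughly as follows. This is a sketch based on my reading of the repository's README; `load_model` and the `louaaron/sedd-small` checkpoint name are what I believe the repo provides, so please correct me if I am using the interface wrong:

```python
import torch
from load_model import load_model  # helper shipped with this repository, as I understand it

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# "louaaron/sedd-small" is the Hugging Face checkpoint I fine-tuned from;
# load_model returns the score network together with the graph and noise schedule.
model, graph, noise = load_model("louaaron/sedd-small", device)
model.train()  # switch to training mode before fine-tuning
```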