Description
You're welcome! I'm glad to assist you with this question!
- The error message is as follows:
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
(the same warning is printed three more times)
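The warning itself is harmless here; as it says, it can be silenced by disabling tokenizer parallelism before the dataloader workers are forked. A minimal sketch (placing this at the top of the training script is my assumption, not something the log prescribes):

import os

# Must run before any Hugging Face tokenizer is used; otherwise the forked
# dataloader workers will still trigger the warning above.
os.environ["TOKENIZERS_PARALLELISM"] = "false"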
{'loss': 0.0185, 'learning_rate': 7.973102785782901e-06, 'epoch': 5.04}
{'loss': 0.0185, 'learning_rate': 6.9724623759205894e-06, 'epoch': 5.16}
{'loss': 0.0188, 'learning_rate': 5.971821966058277e-06, 'epoch': 5.28}
{'loss': 0.0178, 'learning_rate': 4.971181556195966e-06, 'epoch': 5.4}
wandb: Network error (ReadTimeout), entering retry loop.
{'loss': 0.0183, 'learning_rate': 3.970541146333654e-06, 'epoch': 5.52}
{'loss': 0.018, 'learning_rate': 2.9699007364713415e-06, 'epoch': 5.64}
{'loss': 0.0179, 'learning_rate': 1.96926032660903e-06, 'epoch': 5.76}
{'loss': 0.0174, 'learning_rate': 9.68619916746718e-07, 'epoch': 5.88}
[INFO|trainer.py:1901] 2023-04-19 17:43:28,689 >>
Training completed. Do not forget to share your model on huggingface.co/models =)
{'train_runtime': 1749.401, 'train_samples_per_second': 142.815, 'train_steps_per_second': 14.281, 'train_loss': 0.023130248267032992, 'epoch': 6.0}
[INFO|trainer.py:2709] 2023-04-19 17:43:28,693 >> Saving model checkpoint to classifier_models/e2e-tgt-tree_e=6_b=10_m=bert-base-uncased_wikitext-103-raw-v1_101_wp_None
[INFO|configuration_utils.py:453] 2023-04-19 17:43:28,694 >> Configuration saved in classifier_models/e2e-tgt-tree_e=6_b=10_m=bert-base-uncased_wikitext-103-raw-v1_101_wp_None/config.json
[INFO|modeling_utils.py:1704] 2023-04-19 17:43:29,841 >> Model weights saved in classifier_models/e2e-tgt-tree_e=6_b=10_m=bert-base-uncased_wikitext-103-raw-v1_101_wp_None/pytorch_model.bin
***** train metrics *****
epoch = 6.0
train_loss = 0.0231
train_runtime = 0:29:09.40
train_samples = 41640
train_samples_per_second = 142.815
train_steps_per_second = 14.281
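(As a sanity check, these numbers are internally consistent: 41640 train samples × 6 epochs ÷ batch size 10, the b=10 tag in the checkpoint name, gives 24984 optimizer steps, and 24984 steps ÷ 1749.4 s ≈ 14.28 steps per second, matching train_steps_per_second above.)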
04/19/2023 17:43:29 - INFO - main - *** Evaluate ***
[INFO|trainer.py:710] 2023-04-19 17:43:29,848 >> The following columns in the evaluation set don't have a corresponding argument in Classifier_Tree.forward and have been ignored: chart_lst. If chart_lst are not expected by Classifier_Tree.forward, you can safely ignore this message.
[INFO|trainer.py:2964] 2023-04-19 17:43:29,850 >> ***** Running Evaluation *****
[INFO|trainer.py:2966] 2023-04-19 17:43:29,850 >> Num examples = 421
[INFO|trainer.py:2969] 2023-04-19 17:43:29,851 >> Batch size = 10
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
(the same warning is printed three more times)
{'eval_runtime': 1.4868, 'eval_samples_per_second': 283.16, 'eval_steps_per_second': 28.921, 'epoch': 6.0}
Traceback (most recent call last):
File "/home/name/diffusion-LM/transformers/examples/pytorch/language-modeling/run_clm.py", line 1704, in
main()
File "/home/name/diffusion-LM/transformers/examples/pytorch/language-modeling/run_clm.py", line 1675, in main
perplexity = math.exp(metrics["eval_loss"])
KeyError: 'eval_loss'
wandb: Waiting for W&B process to finish... (failed 1). Press Control-C to abort syncing.
Exception ignored in atexit callback: <function _Manager._atexit_setup.<locals>.<lambda> at 0x7f2f280f1fc0>
Traceback (most recent call last):
File "/home/name/anaconda3/lib/python3.10/site-packages/wandb/sdk/wandb_manager.py", line 166, in
self._atexit_lambda = lambda: self._atexit_teardown()
File "/home/name/anaconda3/lib/python3.10/site-packages/wandb/sdk/wandb_manager.py", line 175, in _atexit_teardown
self._teardown(exit_code)
File "/home/name/anaconda3/lib/python3.10/site-packages/wandb/sdk/wandb_manager.py", line 186, in _teardown
result = self._service.join()
File "/home/name/anaconda3/lib/python3.10/site-packages/wandb/sdk/service/service.py", line 216, in join
ret = self._internal_proc.wait()
File "/home/name/anaconda3/lib/python3.10/subprocess.py", line 1204, in wait
return self._wait(timeout=timeout)
File "/home/name/anaconda3/lib/python3.10/subprocess.py", line 1938, in _wait
(pid, sts) = self._try_wait(0)
File "/home/name/anaconda3/lib/python3.10/subprocess.py", line 1896, in _try_wait
(pid, sts) = os.waitpid(self.pid, wait_flags)
KeyboardInterrupt:
(diffusion-LM) name@taizun-SYS-4029GP-TRT:/diffusion-LM$ wandb: 0.010 MB of 0.010 MB uploaded (0.000 MB deduped)
-
The relevant code that caused the error is as follows:
Training
if training_args.do_train:
    checkpoint = None
    if training_args.resume_from_checkpoint is not None:
        checkpoint = training_args.resume_from_checkpoint
    elif last_checkpoint is not None:
        checkpoint = last_checkpoint
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
    trainer.save_model()  # Saves the tokenizer too for easy upload

    metrics = train_result.metrics
    max_train_samples = (
        data_args.max_train_samples if data_args.max_train_samples is not None else len(train_dataset)
    )
    metrics["train_samples"] = min(max_train_samples, len(train_dataset))

    trainer.log_metrics("train", metrics)
    trainer.save_metrics("train", metrics)
    trainer.save_state()
Evaluation
if training_args.do_eval:
    logger.info("*** Evaluate ***")

    metrics = trainer.evaluate()

    max_eval_samples = data_args.max_eval_samples if data_args.max_eval_samples is not None else len(eval_dataset)
    metrics["eval_samples"] = min(max_eval_samples, len(eval_dataset))
    try:
        perplexity = math.exp(metrics["eval_loss"])
    except OverflowError:
        perplexity = float("inf")
    metrics["perplexity"] = perplexity

    trainer.log_metrics("eval", metrics)
    trainer.save_metrics("eval", metrics)

kwargs = {"finetuned_from": model_args.model_name_or_path, "tasks": "text-generation"}
if data_args.dataset_name is not None:
    kwargs["dataset_tags"] = data_args.dataset_name
    if data_args.dataset_config_name is not None:
        kwargs["dataset_args"] = data_args.dataset_config_name
        kwargs["dataset"] = f"{data_args.dataset_name} {data_args.dataset_config_name}"
    else:
        kwargs["dataset"] = data_args.dataset_name

if training_args.push_to_hub:
    trainer.push_to_hub(**kwargs)
else:
    trainer.create_model_card(**kwargs)
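For context on the KeyError: in the stock Trainer, eval_loss is only added to the evaluation metrics when the model's forward pass actually returns a loss on the eval batches, so math.exp(metrics["eval_loss"]) fails exactly as in the traceback above whenever that key is absent. A minimal sketch of the failure mode plus a guarded variant (the metric values are copied from the eval log above; the guard is illustrative and not part of run_clm.py):

import math

# Evaluation metrics as they come back when no loss was computed
# (note: no "eval_loss" key), mirroring the eval log above.
metrics = {"eval_runtime": 1.4868, "eval_samples_per_second": 283.16}

# metrics["eval_loss"] would raise KeyError here, as in the traceback.
# A guarded version only computes perplexity when the loss is present:
if "eval_loss" in metrics:
    try:
        perplexity = math.exp(metrics["eval_loss"])
    except OverflowError:
        perplexity = float("inf")
    metrics["perplexity"] = perplexity
else:
    print("eval_loss missing from eval metrics; available keys:", sorted(metrics))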