1 file changed, +11 -0 lines changed
@@ -313,6 +313,17 @@ rm -rf $OUTPUT_DIR && \
  --gin_bindings=train_eval.warmstart_policy_dir=\"$WARMSTART_OUTPUT_DIR/saved_policy\"
```

You can also start TensorBoard to monitor the training process:

```shell
tensorboard --logdir=$OUTPUT_DIR
```

To judge model performance, look mainly at the `reward_distribution` section.
It shows the average reward and the percentiles of the reward distribution
during training. A positive reward means an improvement over the heuristic,
and a negative reward means a regression.
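
The sign convention can be illustrated with a small sketch. It assumes, purely for illustration, that the reward is the relative cost reduction against the heuristic, `1 - policy_cost / heuristic_cost`; the exact reward definition used by the trainer is not stated here and may differ. `relative_reward` is a hypothetical helper and the numbers are made up.

```shell
# Illustrative only: assumes reward = 1 - policy_cost / heuristic_cost,
# where "cost" is whatever the trainer optimizes (for example, binary size).
# relative_reward is a hypothetical helper, not part of the project.
relative_reward() {
  # $1: cost under the heuristic, $2: cost under the trained policy
  awk -v heuristic="$1" -v policy="$2" \
    'BEGIN { printf "%.3f\n", 1 - policy / heuristic }'
}

relative_reward 1000 950    # policy beats the heuristic: prints 0.050
relative_reward 1000 1100   # policy is worse than the heuristic: prints -0.100
```

Under this assumed definition, a curve that stays above zero in TensorBoard would correspond to a policy that consistently outperforms the heuristic.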

### Evaluate trained policy on a corpus (Optional)

Optionally, if you are interested in seeing how the trained policy (`$OUTPUT_DIR/saved_policy`)