Question about retraining/fine-tuning EncoderModel with new words in t5.get_tokenizer() #358

Open
@Kevin7720

Description

I have added some new words to t5.get_tokenizer(), as shown below:

from transformers import T5Tokenizer

def get_tokenizer(name):
    # MAX_LENGTH is defined elsewhere at module level
    tokenizer = T5Tokenizer.from_pretrained(name, model_max_length=MAX_LENGTH)
    new_words = ['XXX', 'OOO', ......]
    tokenizer.add_tokens(new_words)
    return tokenizer

I would like to understand whether I need to retrain or fine-tune the EncoderModel after adding these new words to the tokenizer, and how this modification will affect the model's performance or behavior.
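For context on why this matters: `tokenizer.add_tokens()` assigns the new words ids beyond the pretrained vocabulary, so the encoder's embedding matrix no longer covers every token id. In Hugging Face `transformers` the model side is usually updated with `model.resize_token_embeddings(len(tokenizer))`, and the newly added rows start out randomly initialized, so they carry no learned meaning until they are trained. The following is a minimal NumPy sketch of that mechanism (the vocabulary size 32100 and `d_model=512` are assumptions for illustration, not values read from any model):

```python
import numpy as np

# Assumed sizes for illustration only.
old_vocab, d_model = 32100, 512
new_words = ['XXX', 'OOO']  # placeholder tokens from the question

# The pretrained embedding matrix covers only the original vocabulary.
embeddings = np.random.randn(old_vocab, d_model).astype(np.float32)

# add_tokens() hands the new words ids old_vocab, old_vocab + 1, ...
# Looking those ids up in the old matrix would be out of range, so the
# matrix must grow (transformers: model.resize_token_embeddings).
new_rows = np.random.randn(len(new_words), d_model).astype(np.float32)
embeddings = np.vstack([embeddings, new_rows])

# The matrix now covers the enlarged vocabulary, but the appended rows
# are random: until they are fine-tuned, the encoder output for 'XXX'
# and 'OOO' encodes nothing about what those words mean.
assert embeddings.shape == (old_vocab + len(new_words), d_model)
```

This is why fine-tuning (at least of the new embedding rows) is generally expected after adding tokens; a frozen pretrained encoder will otherwise map the new words to arbitrary vectors.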

This question is related to the Imagen project, and I want to ensure that I am following the correct approach when incorporating new words into the tokenizer.
