- Regex (part 1)
- Regex (part 2), DF, stemming, lemmatization, displaying collection content, Topic Modelling
- Language models (n-grams), Markov chains, and PoS tagging
- Text classifier (Bag-of-words, Naive Bayes, Logistic Regression)
- TEST 1
- Text classifier (MLP network with PyTorch using bag-of-words), dropout
- Embeddings, position embeddings, residual connections
- Sequence models (RNN, CNN, LSTM)
- Language models with embeddings (GPT model), layer normalization
- Tokenizers (SentencePiece tokenizer vs. word tokenizer)
- Datasets, Dataloaders
- Sequence models (Multi-head attention / transformer model)
- Models with GloVe embeddings
- BERT and fine-tuning BERT for classification
- TEST 2
- Information extraction with templating (zero-shot prompting)
- Question answering with RAG (few-shot prompting)
- TEST 3
[Time to make your project]