Course syllabus

Part 1: Classic NLP

  1. Regex (part 1)
  2. Regex (part 2), DF, stemming, lemmatization, displaying collection content, topic modelling
  3. Language models (n-grams), Markov chains, and POS tagging
  4. Text classifier (bag-of-words, Naive Bayes, logistic regression)
  5. TEST 1

Part 2: Modern NLP

  1. Text classifier (MLP network with PyTorch using bag-of-words), dropout
  2. Embeddings, position embeddings, residual connections
  3. Sequence models (RNN, CNN, LSTM)
  4. Language models with embeddings (GPT model), layer normalization
  5. Tokenizers (SentencePiece tokenizer vs. word tokenizer)
  6. Datasets, DataLoaders
  7. Sequence models (Multi-head attention / transformer model)
  8. Models with GloVe embeddings
  9. BERT and fine-tuning BERT for classification
  10. TEST 2

Part 3: LLMs and their applications

  1. Information extraction with templating (zero-shot prompting)
  2. Question answering with RAG (few-shot prompting)
  3. TEST 3

[Time to make your project]