Evaluation of Large Language Models
Course curriculum
Chapter 01
Getting Started
Chapter 02
LLM Evaluation Fundamentals
Chapter 03
Traditional NLP Evaluation Metrics
- Bridging Video: Setting Context
- BLEU: N-Gram Precision Scoring
- BLEU Simulation
- N-Gram Precision Evaluation with BLEU
- ROUGE: Recall-Based Scoring
- ROUGE Simulation
- Recall-Based Summary Evaluation with ROUGE
- BERTScore: Embedding-Based Similarity
- BERTScore Simulation
- Semantic Scoring with BERTScore
- Choosing the Right Metric
Chapter 05
Course Assessment
Chapter 06
Wrap Up
Chapter 07