Overview
Introduction
Context
Why This Matters
Learning Objectives
Rationale for LLM and Agent Evaluation
Components of LLM Evaluation
Tasks and Benchmark Datasets for Evaluation
Challenges in LLM Evaluation
LLM-as-a-Judge Evaluation
Classic and Contextual Embedding Approaches
Evaluation Using BLEU
Evaluation Using ROUGE
Evaluation Using METEOR
Evaluation Using BERTScore
Evaluating RAG-Based Applications
Practice - Evaluation Using RAGAs
Faithfulness
Answer Relevancy
Context Precision
Context Recall
Course Summary