DSBA 6010: Applications of Large Language Models (LLMs)

Instructor: Archit Parnami, PhD
Semester: Spring 2026
Time: Tuesdays, 5:30 pm - 8:15 pm

Introduction

This 14-week course introduces students to the practical applications of large language models. Topics include foundational ML concepts, the transformer architecture, prompting, tool use, fine-tuning, retrieval-augmented generation, multimodality, agents, evaluation, safety, deployment, scaling, and building full-stack applications.

Lecture 1: Machine Learning Basics

  • Supervised vs unsupervised learning, classification vs regression
  • Neural networks: MLPs, CNNs, activation functions, loss functions, optimization
  • NLP fundamentals: word embeddings, tokenization, RNNs
  • Working knowledge of NumPy, Pandas, PyTorch, TensorFlow
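To make the "neural networks" bullet concrete, here is a minimal sketch of a two-layer MLP forward pass in plain Python; the weights are arbitrary toy values, not trained parameters:

```python
def relu(x):
    """ReLU activation: max(0, v) applied elementwise."""
    return [max(0.0, v) for v in x]

def linear(x, weights, bias):
    """Dense layer y = Wx + b, with W given as a list of rows."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def mlp_forward(x):
    """Tiny MLP: 2 inputs -> 3 hidden units (ReLU) -> 1 output (toy weights)."""
    w1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
    b1 = [0.0, 0.1, -0.1]
    w2 = [[1.0, -1.0, 0.5]]
    b2 = [0.2]
    h = relu(linear(x, w1, b1))
    return linear(h, w2, b2)
```

Frameworks such as PyTorch implement the same computation with learned weights, batching, and autograd.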

Lecture 2: Attention & Transformers

  • Core transformer components: embeddings, multi-head self-attention, positional encoding
  • Feed-forward layers and layer normalization
  • Architectural variants: BERT, GPT
  • Hands-on with HuggingFace transformers library
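The heart of the transformer is scaled dot-product attention, softmax(QK^T / sqrt(d)) V. A minimal single-head sketch in plain Python (lists of vectors, no batching or masking):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: each query attends over all keys,
    producing a weighted average of the value vectors."""
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Multi-head attention runs several such heads in parallel on learned projections of Q, K, and V and concatenates the results.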

Lecture 3: Introduction to Large Language Models

  • High-level LLM architecture and tokenization
  • Text generation pipeline: inference and decoding strategies
  • Sampling techniques: greedy, top-k, nucleus (top-p), temperature scaling
  • End-to-end generation with HuggingFace
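The decoding strategies above can be sketched directly: temperature rescales the logits, top-k keeps only the k most probable tokens, and nucleus (top-p) sampling keeps the smallest set of tokens whose cumulative mass reaches p. A toy implementation over a probability list:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def apply_temperature(logits, temperature):
    """Temperature < 1 sharpens the distribution; > 1 flattens it."""
    return [l / temperature for l in logits]

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize; zero out the rest."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return [probs[i] / total if i in top else 0.0 for i in range(len(probs))]

def top_p_filter(probs, p):
    """Nucleus sampling: smallest set of tokens whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += probs[i]
        if mass >= p:
            break
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]
```

Greedy decoding is simply the argmax of the (possibly filtered) distribution; sampling draws a token from it instead.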

Lecture 4: Prompt Engineering

  • Foundational formats: instruction, few-shot, chain-of-thought prompting
  • Prompt structure and constraints affecting output quality
  • Programmatic templates and prompt libraries
  • Building prompts for summarization, translation, classification, extraction
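Programmatic templating can be as simple as assembling an instruction, a handful of labeled examples, and the query into one string. A minimal sketch (the Input/Output format is one common convention, not a requirement):

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction + few-shot prompt as plain text.
    `examples` is a list of (input, output) pairs."""
    lines = [instruction, ""]
    for x, y in examples:
        lines.append(f"Input: {x}")
        lines.append(f"Output: {y}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)
```

For example, `build_few_shot_prompt("Classify the sentiment.", [("great movie", "positive")], "terrible acting")` yields a prompt that ends with an open `Output:` line for the model to complete.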

Lecture 5: Tool Use with LLMs

  • Function calling: tool schemas, parameter passing, result handling
  • Integrating external tools: search, calculators, APIs, code interpreters
  • Single-step and multi-step tool calling with OpenAI, Anthropic, open-source models
  • Error handling and retry logic
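The flow above (the model emits a structured tool call, the application dispatches it and retries on failure) can be sketched without any provider SDK. The JSON shape and the `calculator` tool here are illustrative assumptions, not a specific vendor's schema:

```python
import json

def calculator(expression):
    """Toy tool: evaluate a basic arithmetic expression only."""
    if not set(expression) <= set("0123456789+-*/(). "):
        raise ValueError("unsupported expression")
    return eval(expression)

TOOLS = {"calculator": calculator}

def handle_tool_call(message, max_retries=2):
    """Parse a model-emitted tool call (JSON with `name` and `arguments`)
    and dispatch it, retrying in case the tool fails transiently."""
    call = json.loads(message)
    tool = TOOLS[call["name"]]
    last_error = None
    for _ in range(max_retries + 1):
        try:
            return tool(**call["arguments"])
        except Exception as e:
            last_error = e
    raise last_error
```

In a real system the tool result is sent back to the model as a follow-up message so it can compose the final answer (multi-step tool calling).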

Lecture 6: Fine-Tuning Large Language Models

  • Task-specific fine-tuning for classification (BERT)
  • Instruction fine-tuning (SFT) for decoder-only models
  • Parameter-efficient methods: LoRA, adapters
  • Chat templates and HuggingFace Trainer
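The idea behind LoRA fits in a few lines: the frozen weight matrix W is augmented with a low-rank update, y = xW + (alpha/r) * xAB, where only the small matrices A (d x r) and B (r x k) are trained. A pure-Python sketch of the forward pass (toy matrices, no training loop):

```python
def matmul(A, B):
    """Naive matrix multiply over lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_forward(x, W, A, B, alpha, r):
    """LoRA forward pass: frozen base projection plus a scaled
    low-rank correction computed from the trainable A and B."""
    base = matmul(x, W)
    delta = matmul(matmul(x, A), B)
    scale = alpha / r
    return [[b + scale * d for b, d in zip(brow, drow)]
            for brow, drow in zip(base, delta)]
```

Because A and B together have far fewer parameters than W, fine-tuning touches only a small fraction of the model; in practice this is what libraries like HuggingFace PEFT implement.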

Lecture 7: Retrieval-Augmented Generation (RAG)

  • RAG architecture: retrieval + generation for grounded responses
  • Document processing: chunking, metadata extraction, embeddings
  • Vector databases: FAISS, Chroma, Pinecone
  • Building end-to-end RAG pipelines with LangChain or LlamaIndex

Lecture 8: Large Multimodal Models

  • Vision-language architectures: CLIP, GPT-4V, LLaVA, Gemini
  • Cross-modal embedding alignment and contrastive learning
  • Applications: image captioning, VQA, document understanding, multimodal RAG
  • Evaluation for relevance, consistency, and hallucination
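Contrastive alignment, as in CLIP, trains image and text encoders so matching pairs land close together in one shared embedding space; at inference, retrieval is just cosine similarity in that space. A toy sketch with hand-made embeddings standing in for encoder outputs:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def match_images_to_captions(image_embs, text_embs):
    """CLIP-style retrieval: match each image embedding to the caption
    embedding with the highest cosine similarity in the shared space."""
    return [max(range(len(text_embs)), key=lambda j: cosine(img, text_embs[j]))
            for img in image_embs]
```

During training, a contrastive loss pushes each image-caption pair's similarity above all mismatched pairs in the batch; this sketch shows only the resulting inference-time matching.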

Lecture 9: Agents & Planning with LLMs

  • Agent building blocks: planning, acting, observing, memory
  • Architectures: ReAct, planner-executor, CoT + tool use
  • Frameworks: LangChain, LlamaIndex, AutoGen, LangGraph, SmolAgents
  • Evaluating agent reliability and safety
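The ReAct pattern alternates thought, action, and observation until the model commits to a final answer. A minimal loop with a scripted `mock_llm` standing in for a real model (the `Action: tool[args]` syntax follows the ReAct paper's convention):

```python
def mock_llm(prompt):
    """Stand-in for a real model: emits a tool action, then a final answer."""
    if "Observation:" not in prompt:
        return "Thought: I need to compute this.\nAction: calculator[6 * 7]"
    return "Thought: I have the result.\nFinal Answer: 42"

def calculator(expression):
    return str(eval(expression))

def react_loop(question, max_steps=5):
    """Minimal ReAct loop: think, act, observe, repeat until done."""
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        response = mock_llm(prompt)
        prompt += response + "\n"
        if "Final Answer:" in response:
            return response.split("Final Answer:")[1].strip()
        if "Action:" in response:
            action = response.split("Action:")[1].strip()
            tool, arg = action.split("[", 1)
            observation = calculator(arg.rstrip("]"))
            prompt += f"Observation: {observation}\n"
    return None
```

Frameworks like LangGraph or AutoGen generalize this loop with real models, tool registries, memory, and step limits for safety.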

Lecture 10: Evaluation of LLMs

  • Evaluation dimensions: correctness, coherence, faithfulness, toxicity, bias
  • Task-specific metrics: BLEU, ROUGE, METEOR, EM/F1
  • Model-based evaluation, human preference ranking, A/B testing
  • Tools: HELM, TruLens, OpenAI Evals, RAGAS
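Exact match and token-level F1, the standard extractive-QA metrics, are short enough to implement directly. This follows the usual SQuAD-style normalization (lowercase, strip punctuation and extra whitespace):

```python
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())

def exact_match(prediction, reference):
    return normalize(prediction) == normalize(reference)

def f1_score(prediction, reference):
    """Token-overlap F1 between prediction and reference."""
    pred = normalize(prediction).split()
    ref = normalize(reference).split()
    common = Counter(pred) & Counter(ref)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Metrics like BLEU and ROUGE extend this idea to n-gram overlap; model-based evaluation replaces string matching with a judge model entirely.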

Lecture 11: Safety, Bias and Ethics in LLMs

  • Types of bias: gender, racial, cultural, linguistic, geographic
  • Safety risks: prompt injection, jailbreaks, hallucination, toxic generation
  • Alignment techniques: RLHF, Constitutional AI, red teaming
  • Ethical deployment considerations
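As one tiny illustration of a safety control, a keyword/pattern filter can flag common prompt-injection phrasings. The patterns below are toy assumptions; real defenses layer classifiers, sandboxed tool use, and privilege separation, since simple filters are easy to evade:

```python
import re

# Illustrative patterns only; a real deployment would not rely on this alone.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"disregard .*system prompt",
    r"you are now",
]

def flag_prompt_injection(user_input):
    """Heuristic check: does the input match a known injection phrasing?"""
    text = user_input.lower()
    return any(re.search(p, text) for p in INJECTION_PATTERNS)
```

Red teaming is essentially the adversarial side of this exercise: systematically searching for inputs that slip past whatever defenses are in place.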

Lecture 12: LLM APIs & Deployment

  • Integrating LLM APIs with LangChain (OpenAI, Anthropic, Cohere, HuggingFace)
  • Web frameworks: Streamlit, FastAPI, LangServe
  • Self-hosting models: Ollama, vLLM, Text Generation Inference
  • Rate limiting, monitoring, and observability with LangSmith
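Client-side rate limiting for LLM APIs is commonly implemented as a token bucket: requests spend tokens, tokens refill at a fixed rate, and bursts are capped by the bucket's capacity. A minimal sketch:

```python
import time

class TokenBucket:
    """Allow up to `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        """Refill based on elapsed time, then try to spend one token."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A denied request would typically sleep and retry (often with exponential backoff); observability tools such as LangSmith then trace which calls were throttled and why.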

Lecture 13: Scaling & Cost-Efficiency

  • Scaling laws: compute-performance relationships
  • Inference optimization: batching, KV caching, continuous batching
  • Quantization techniques: GPTQ, AWQ, QLoRA
  • Cost reduction strategies: FrugalGPT, model cascades
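The core idea shared by GPTQ, AWQ, and friends is mapping float weights to low-bit integers plus a scale factor. A sketch of symmetric per-tensor int8 quantization, far simpler than the calibrated, per-group schemes those methods actually use:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats to [-127, 127] with one scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard all-zero input
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [v * scale for v in q]
```

Storing each weight in 1 byte instead of 4 (fp32) or 2 (fp16) shrinks memory and bandwidth at the cost of small rounding error, which is where the quality/cost trade-off in this lecture comes from.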

Lecture 14: Final Project – Full-Stack LLM Application

  • Build and deploy a complete LLM-powered application
  • Apply techniques: prompting, APIs, RAG, tools, evaluation
  • Deliverables: code, demo, technical documentation

Course Deliverables

  • Weekly Project Milestones
  • Production-Ready LLM Application
  • Peer Evaluation
