DSBA 6010: Applications of Large Language Models (LLMs)

Instructor: Archit Parnami, PhD
Semester: Spring 2025
Time: Tuesdays, 5:30 pm - 8:15 pm

Introduction

This 14-week course introduces students to the practical applications of large language models, covering key concepts such as transformers, LLM architecture, prompting, fine-tuning, retrieval-augmented generation, evaluation, deployment, and building full-stack applications.
Week 1: Introduction to Generative AI and LLMs

  • Overview of generative AI and NLP evolution
  • LLM capabilities and risks
  • Model families and usage (OpenAI, Hugging Face, Anthropic, Cohere)

Week 2: Transformers

  • Transformer architecture: encoder, decoder, attention
  • Self-attention, positional encoding, multi-head attention
  • Pretraining tasks: MLM, CLM
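The core of the transformer, scaled dot-product attention, can be sketched in a few lines of plain Python. This is a toy single-head illustration over lists of vectors, not an optimized implementation; real models use batched tensor operations:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention.
    Q, K, V are lists of row vectors (queries, keys, values)."""
    d = len(K[0])
    out = []
    for q in Q:
        # Score each key against the query, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Multi-head attention simply runs several such heads on learned projections of Q, K, and V and concatenates the results.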

Week 3: Introduction to Large Language Models

  • High-level LLM architecture and tokenization
  • Text generation, sampling techniques (top-k, nucleus, temperature)
  • From pretraining to instruction tuning
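The sampling techniques listed above can be illustrated with a short sketch over a toy logit vector (token indices and logits here are made up; production libraries apply these filters to full vocabularies):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature < 1 sharpens the distribution; > 1 flattens it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def top_k_filter(probs, k):
    # Keep only the k most probable tokens, then renormalize.
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def nucleus_filter(probs, p):
    # Top-p: keep the smallest set of tokens whose cumulative mass >= p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cum = [], 0.0
    for i in order:
        keep.append(i)
        cum += probs[i]
        if cum >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}
```

Sampling then draws a token from the filtered, renormalized distribution instead of the full one.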

Week 4: Prompt Engineering

  • Zero-, one-, few-shot prompting
  • Instruction prompting, role prompting
  • Prompt templates and libraries (PromptTools, LangChain)
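A few-shot prompt template is ultimately just structured string assembly. A minimal sketch (the `Input:`/`Output:` labels are an illustrative convention, not a requirement of any particular library):

```python
def few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: an instruction, worked
    examples as (input, output) pairs, then the new query."""
    parts = [instruction, ""]
    for inp, out in examples:
        parts.append(f"Input: {inp}")
        parts.append(f"Output: {out}")
        parts.append("")
    # Leave the final Output: open for the model to complete.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)
```

Libraries such as LangChain wrap the same idea in reusable template objects with variable substitution.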

Week 5: Retrieval-Augmented Generation (RAG)

  • Limitations of LLM memory
  • Dense vector retrieval, embeddings
  • Building a simple RAG pipeline
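The retrieve-then-generate loop can be sketched end to end in pure Python. For clarity this toy uses bag-of-words count vectors as stand-in "embeddings"; a real RAG pipeline uses dense neural embeddings and a vector store:

```python
import math
from collections import Counter

def embed(text):
    # Stand-in embedding: bag-of-words counts (real systems use dense vectors).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=1):
    # Rank documents by similarity to the query; return the top-k.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents):
    # Augment the prompt with retrieved context before calling the model.
    context = "\n".join(retrieve(query, documents, k=2))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

The augmented prompt is what gets sent to the LLM, letting it answer from retrieved context rather than parametric memory alone.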

Week 6: LLM Tool Use and LangChain

  • LLM tool APIs (search, calculator, code execution)
  • LangChain agents and tools
  • Chains and routing logic

Week 7: Evaluation of LLMs

  • Human evaluation, BLEU, ROUGE, METEOR
  • LLM-as-a-judge and preference-based evaluation
  • RAGAS, TruLens, Promptfoo
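To make the n-gram metrics concrete, here is a simplified unigram-overlap F-score in the spirit of ROUGE-1 (full ROUGE also covers longer n-grams, stemming, and longest-common-subsequence variants):

```python
from collections import Counter

def rouge1_f(candidate, reference):
    """Simplified ROUGE-1 F-score: harmonic mean of unigram
    precision and recall between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Such surface-overlap metrics are cheap but miss paraphrases, which is what motivates LLM-as-a-judge and preference-based evaluation.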

Week 8: Fine-Tuning LLMs

  • Non-generative (BERT) fine-tuning
  • SFT: instruction fine-tuning of decoder models
  • Tools: Hugging Face Trainer, ChatTemplates

Week 9: Multimodal LLMs

  • Visual Language Models (e.g., GPT-4V, LLaVA)
  • Architectures for combining image and text
  • Applications: captioning, VQA, visual chat

Week 10: Agents, Memory, and Planning

  • What are LLM agents?
  • Memory types (short/long-term)
  • Planning, state tracking, ReAct, AutoGPT

Week 11: LLM Applications and APIs

  • OpenAI, Anthropic, Hugging Face API access
  • Flask, FastAPI integration with LLMs
  • Streamlit frontend integration

Week 12: LLM Deployment

  • Hosting LLMs: vLLM, Ollama, TGI
  • Inference optimization: quantization, batching, streaming
  • Cloud deployment via Docker, Hugging Face Spaces

Week 13: Scaling and Cost Efficiency

  • Batching, KV caching, rate limiting
  • Quantization with bitsandbytes, 4/8-bit inference
  • FrugalGPT and model cascades
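The idea behind 8-bit quantization can be shown with a tiny symmetric scheme on a weight vector (a pedagogical sketch; bitsandbytes uses per-block scaling and other refinements):

```python
def quantize_8bit(weights):
    """Symmetric int8 quantization: one scale maps floats to [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats; error is bounded by half a quantization step.
    return [x * scale for x in q]
```

Storing int8 values plus one float scale cuts memory roughly 4x versus float32, at the cost of small rounding error per weight.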

Week 14: Final Project – Full LLM Application

  • Build and deploy a complete LLM-powered app
  • Use prompts, APIs, RAG, LangChain, Streamlit
  • Deliverables: code, demo, README, video or slide deck

Course Deliverables

  • Weekly exercises & notebooks
  • Final project (GitHub repo + demo)
  • Participation in peer feedback and weekly discussions
