Instructor: Archit Parnami, PhD
Semester: Spring 2025
Time: Tuesday, 5:30 pm - 8:15 pm
Introduction
This 14-week course introduces students to the practical applications of large language models, covering key concepts such as transformers, LLM architecture, prompting, fine-tuning, retrieval-augmented generation, evaluation, deployment, and building full-stack applications.
Week 1: Introduction to Generative AI and LLMs
- Overview of generative AI and NLP evolution
- LLM capabilities and risks
- Model families and usage (OpenAI, Hugging Face, Anthropic, Cohere)
Week 2: Transformers
- Transformer architecture: encoder, decoder, attention
- Self-attention, positional encoding, multi-head attention
- Pretraining tasks: MLM, CLM
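The scaled dot-product attention at the heart of the transformer can be sketched in plain Python. This is a dependency-free illustration for small lists-of-lists matrices; real implementations use batched tensor operations in a framework like PyTorch:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query against every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Each output row is a weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

With a single key, the softmax weight is 1 and the output is exactly that value vector; with identical scores, the output is the average of the values.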
Week 3: Introduction to Large Language Models
- High-level LLM architecture and tokenization
- Text generation, sampling techniques (top-k, nucleus, temperature)
- From pretraining to instruction tuning
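The three sampling controls above compose naturally in one function. A minimal, dependency-free sketch, assuming `logits` is a token-to-logit dict (real decoders work on tensor vocabularies):

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Sample a token from raw logits with temperature, top-k, and nucleus (top-p) filtering."""
    # Temperature scaling: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = {t: l / temperature for t, l in logits.items()}
    items = sorted(scaled.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        items = items[:top_k]  # keep only the k most likely tokens
    # Convert the surviving logits to probabilities.
    m = items[0][1]
    probs = [(t, math.exp(l - m)) for t, l in items]
    z = sum(p for _, p in probs)
    probs = [(t, p / z) for t, p in probs]
    if top_p is not None:
        # Nucleus sampling: keep the smallest prefix whose cumulative mass >= top_p.
        kept, mass = [], 0.0
        for t, p in probs:
            kept.append((t, p))
            mass += p
            if mass >= top_p:
                break
        z = sum(p for _, p in kept)
        probs = [(t, p / z) for t, p in kept]
    # Draw from the renormalized distribution.
    r = rng.random()
    acc = 0.0
    for t, p in probs:
        acc += p
        if r <= acc:
            return t
    return probs[-1][0]
```

Setting `top_k=1` recovers greedy decoding; a very small `top_p` similarly collapses onto the most likely token.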
Week 4: Prompt Engineering
- Zero-, one-, few-shot prompting
- Instruction prompting, role prompting
- Prompt templates and libraries (PromptTools, LangChain)
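A few-shot prompt is ultimately just structured text. A minimal template builder; the `Input:`/`Output:` framing is one common convention for illustration, not any particular library's API:

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: instruction, worked examples, then the new query."""
    parts = [instruction.strip(), ""]
    for inp, out in examples:
        parts.append(f"Input: {inp}")
        parts.append(f"Output: {out}")
        parts.append("")
    # End with an unanswered query so the model completes the pattern.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)
```

Leaving the final `Output:` blank is the key trick: the model continues the established pattern rather than answering free-form.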
Week 5: Retrieval-Augmented Generation (RAG)
- Limitations of LLM memory
- Dense vector retrieval, embeddings
- Building a simple RAG pipeline
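A toy end-to-end version of such a pipeline, using bag-of-words vectors in place of learned embeddings (real pipelines use a dense embedding model and a vector store; the structure, embed-retrieve-augment, is the same):

```python
import math
from collections import Counter

def embed(text):
    # Toy "embedding": a bag-of-words term-frequency vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def rag_prompt(query, docs, k=2):
    # Stuff the retrieved passages into the prompt as grounding context.
    context = "\n".join(retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The prompt returned by `rag_prompt` would then be sent to the LLM, which answers from the injected context rather than its parametric memory.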
Week 6: LLM Tool Use and LangChain
- LLM tool APIs (search, calculator, code execution)
- LangChain agents and tools
- Chains and routing logic
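The routing pattern can be illustrated without LangChain. This sketch hard-codes one calculator tool and assumes the model emits calls as plain `tool_name: argument` text, a simplification of real function-calling APIs:

```python
import ast
import operator

def calculator(expression):
    """Safely evaluate a basic arithmetic expression (+, -, *, /) via the AST."""
    ops = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp):
            return ops[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval"))

# Tool registry: the model chooses a name, the runtime dispatches to the function.
TOOLS = {"calculator": calculator}

def dispatch(tool_call):
    """Route a model-emitted call like 'calculator: 2 * 21' to the matching tool."""
    name, _, arg = tool_call.partition(":")
    return TOOLS[name.strip()](arg.strip())
```

Parsing through `ast` rather than `eval` is deliberate: tool arguments come from model output and must never be executed as arbitrary code.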
Week 7: Evaluation of LLMs
- Human evaluation, BLEU, ROUGE, METEOR
- LLM-as-a-judge and preference-based evaluation
- RAGAS, TruLens, Promptfoo
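As one concrete reference metric, ROUGE-1 F1 reduces to unigram overlap between a candidate and a reference; a minimal sketch (libraries add stemming and n-gram variants):

```python
from collections import Counter

def rouge1_f1(reference, candidate):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Such overlap metrics reward surface similarity only, which is exactly the gap that LLM-as-a-judge and preference-based evaluation try to close.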
Week 8: Fine-Tuning LLMs
- Fine-tuning non-generative encoder models (e.g., BERT)
- SFT: instruction fine-tuning of decoder models
- Tools: Hugging Face Trainer, ChatTemplates
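Chat-template formatting, a key preprocessing step for SFT, turns a message list into one training string. The `<|role|>` delimiters below are illustrative only, not Hugging Face's actual template for any particular model:

```python
def apply_chat_template(messages, add_generation_prompt=False):
    """Render a list of {'role', 'content'} messages into a single training string."""
    parts = []
    for msg in messages:
        # Each turn is wrapped in an explicit role marker so the model
        # learns where one speaker ends and the next begins.
        parts.append(f"<|{msg['role']}|>\n{msg['content']}\n")
    if add_generation_prompt:
        # At inference time, end with an open assistant turn for the model to fill.
        parts.append("<|assistant|>\n")
    return "".join(parts)
```

During SFT the loss is typically masked so only the assistant tokens contribute, which is why the role boundaries must be unambiguous in the rendered string.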
Week 9: Multimodal LLMs
- Visual Language Models (e.g., GPT-4V, LLaVA)
- Architectures for combining image and text
- Applications: captioning, VQA, visual chat
Week 10: Agents, Memory, and Planning
- What are LLM agents?
- Memory types (short/long-term)
- Planning, state tracking, ReAct, AutoGPT
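Short-term memory is often just a sliding window over recent turns; a minimal sketch (long-term memory would typically add a vector store queried by similarity):

```python
from collections import deque

class ConversationMemory:
    """Short-term memory: keep only the most recent turns in the prompt window."""

    def __init__(self, max_turns=4):
        # deque with maxlen silently evicts the oldest turn when full.
        self.turns = deque(maxlen=max_turns)

    def add(self, role, content):
        self.turns.append((role, content))

    def as_prompt(self):
        # Render the surviving turns for inclusion in the next prompt.
        return "\n".join(f"{role}: {content}" for role, content in self.turns)
```

The eviction policy is the design decision: a fixed window is cheap but forgetful, which motivates summarization and retrieval-backed long-term memory in real agents.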
Week 11: LLM Applications and APIs
- OpenAI, Anthropic, Hugging Face API access
- Flask, FastAPI integration with LLMs
- Streamlit frontend integration
Week 12: LLM Deployment
- Hosting LLMs: vLLM, Ollama, TGI
- Inference optimization: quantization, batching, streaming
- Cloud deployment via Docker, Hugging Face Spaces
Week 13: Scaling and Cost Efficiency
- Batching, KV caching, rate limiting
- Quantization with bitsandbytes, 4/8-bit inference
- FrugalGPT and model cascades
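Absmax 8-bit quantization, the basic idea behind bitsandbytes-style int8 inference, fits in a few lines. A scalar sketch assuming at least one nonzero weight; real kernels quantize per-block tensors with fused dequantization:

```python
def quantize_8bit(weights):
    """Absmax quantization: map floats to ints in [-127, 127] plus one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats; error is bounded by half the scale per weight.
    return [v * scale for v in q]
```

Storing one byte per weight plus a single scale cuts memory roughly 4x versus float32, at the cost of the small rounding error visible in the round trip.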
Week 14: Final Project – Full LLM Application
- Build and deploy a complete LLM-powered app
- Use prompts, APIs, RAG, LangChain, Streamlit
- Deliverables: code, demo, README, video/slides
Course Deliverables
- Weekly exercises & notebooks
- Final project (GitHub repo + demo)
- Participation in peer feedback and weekly discussions