Introduction
This week covers practical methods for turning LLMs into agents that plan and act by integrating reasoning, tool use, and memory. We focus on agent architectures (e.g., ReAct and planner–executor patterns), tool and API integration (function calling, browsing, code execution), and representative frameworks and libraries (LangChain, LlamaIndex, AutoGen, AutoGPT, SmolAgents). Hands-on exercises walk you through building agents and evaluating their reliability, safety, and reproducibility.
Goals for the Week
- Define “agent” and its core building blocks (planning, acting, observing, memory).
- Compare key architectures (ReAct, planner–executor, CoT + tool use).
- Get hands‑on with frameworks (LangChain, LlamaIndex, OpenAI Agents, AutoGen).
- Build basic agent workflows and evaluate reliability and safety.
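The building blocks named above (planning, acting, observing, memory) form a single loop that every architecture this week elaborates on. A minimal sketch, using a scripted stand-in function in place of a real LLM (the tool names and the `scripted_model` are illustrative, not from any framework):

```python
# Minimal sketch of the core agent loop: plan -> act -> observe -> remember.
# The "model" is a hypothetical stand-in (a plain function), not a real LLM.

def run_agent(goal, model, tools, max_steps=5):
    memory = []  # observations accumulated across steps
    for _ in range(max_steps):
        action, arg = model(goal, memory)          # plan: choose next action
        if action == "finish":
            return arg, memory                     # final answer + trace
        observation = tools[action](arg)           # act: call a tool
        memory.append((action, arg, observation))  # observe and store
    return None, memory

# Toy example: one calculator tool and a scripted "model".
# (eval is used for brevity; never eval untrusted input.)
tools = {"calc": lambda expr: eval(expr)}

def scripted_model(goal, memory):
    if not memory:
        return "calc", "2 + 3"            # first step: use the tool
    return "finish", str(memory[-1][2])   # then: report the observation

answer, trace = run_agent("add 2 and 3", scripted_model, tools)
```

Swapping `scripted_model` for an LLM call (and `tools` for real APIs) is essentially what the frameworks below automate.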
Learning Guide
Videos
- Agentic AI: Learn about the 4 Agentic AI design patterns from Andrew Ng
Other Short Courses
- AI Agents in LangGraph: A short course from DeepLearning.ai by Harrison Chase & Rotem Weiss
- Agents Course: A course by HuggingFace
Frameworks & APIs
Agent Frameworks:
- AI Agent Frameworks Comparison — features across different open-source AI agent frameworks.
- LangChain Agents — planner-executor, tool integrations, and agent templates.
- LlamaIndex Agents — retrieval-augmented agents with tooling.
- AutoGen (Microsoft) — multi-agent orchestration and planning.
- SmolAgents (HuggingFace) — lightweight agent framework.
- LangGraph — graph-based agent orchestration.
- AutoGPT — Platform to create, deploy, and manage agents.
APIs & Commercial Services:
- OpenAI Agents SDK — build, deploy, and optimize agent workflows with AgentKit.
- Anthropic Claude — build production AI agents with Claude Code as a library.
- Google Gemini API — Gemini models and multimodal APIs for agentic workflows.
- Microsoft Azure AI / Azure OpenAI Service — managed OpenAI/GPT deployments with enterprise integrations and monitoring.
- Amazon Bedrock — managed foundation-model APIs and integration with AWS services for agent tooling.
Foundational Papers
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models [1]
- ReAct: Synergizing Reasoning and Acting in Language Models [2]
- Toolformer: Language Models Can Teach Themselves to Use Tools [3]
- WebGPT: Improving Factual Accuracy through Web Browsing (OpenAI) [4]
- Measuring and Narrowing the Compositionality Gap in Language Models (introduces Self-Ask prompting) [5]
- PAL: Program-Aided Language Models for Reasoning [6]
Example Use Cases
- Autonomous web research & summarization (browsing + source citation)
- Code generation and debugging assistants that run and test code
- Multi-step task automation (calendar, email, travel planning)
- Robotics and embodied agents (navigation, manipulation)
- Multi-agent coordination (delegation, voting, ensemble planning)
Programming Practice
- Implement a ReAct agent that:
  - Uses tools (e.g., calculator, wiki search, code executor)
  - Thinks before it acts, explains steps, and provides a final answer
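The ReAct loop alternates "Thought"/"Action" lines from the model with "Observation" lines fed back after each tool call. A minimal sketch of that loop, with a scripted stand-in for the model (the transcript format and `scripted` function are illustrative, not the paper's exact prompt):

```python
import re

# Hedged sketch of the ReAct loop: the model emits "Thought: ...\nAction: tool[input]",
# we run the tool, append "Observation: ...", and repeat until "Action: finish[answer]".
TOOLS = {
    "calculator": lambda expr: str(eval(expr)),  # eval for brevity; unsafe for untrusted input
}

def react(question, model, max_turns=5):
    transcript = f"Question: {question}\n"
    for _ in range(max_turns):
        step = model(transcript)                    # model produces Thought + Action
        transcript += step + "\n"
        match = re.search(r"Action: (\w+)\[(.*)\]", step)
        tool, arg = match.group(1), match.group(2)
        if tool == "finish":
            return arg, transcript                  # final answer + full trace
        obs = TOOLS[tool](arg)                      # act, then observe
        transcript += f"Observation: {obs}\n"
    return None, transcript

# Scripted stand-in mimicking a two-step ReAct trace.
def scripted(transcript):
    if "Observation:" not in transcript:
        return "Thought: I need to compute 12 * 7.\nAction: calculator[12 * 7]"
    obs = transcript.rsplit("Observation: ", 1)[1].split("\n")[0]
    return f"Thought: The result is {obs}.\nAction: finish[{obs}]"

answer, trace = react("What is 12 * 7?", scripted)
```

The exercise is then to replace `scripted` with an LLM call prompted with few-shot ReAct examples, and to add wiki search and code execution tools alongside the calculator.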
- Create a LangChain agent with:
  - A planner that breaks a user goal into subtasks
  - Executors that solve subtasks using external tools or APIs
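Before reaching for LangChain's own abstractions, it helps to see the planner–executor split in plain Python. A framework-agnostic sketch (the fixed decomposition and toy executors are illustrative stand-ins for LLM calls and real APIs):

```python
# Hedged sketch of the planner-executor pattern: a planner splits the goal
# into (tool, subtask) pairs; executors solve each subtask with that tool.

def planner(goal):
    # A real planner would prompt an LLM to decompose the goal;
    # here we return a fixed, illustrative decomposition.
    return [("search", "capital of France"), ("calc", "40 + 2")]

EXECUTORS = {
    "search": lambda q: {"capital of France": "Paris"}.get(q, "unknown"),
    "calc": lambda expr: str(eval(expr)),  # eval for brevity only
}

def run(goal):
    results = []
    for tool, subtask in planner(goal):
        results.append((subtask, EXECUTORS[tool](subtask)))
    return results

results = run("demo goal")
```

In LangChain proper, the planner and executors become chains or agents with tool bindings, but the control flow stays the same.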
- Use OpenAI function calling to create a structured agent with access to multiple APIs (e.g., weather, calendar, file reader).
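The heart of a function-calling agent is the dispatch step: the model returns a tool name and JSON arguments, and your code routes that to a real function. A sketch of that routing, using OpenAI's tool-schema JSON shape but with a mocked model reply so it runs offline (`get_weather` and the schema fields are illustrative):

```python
import json

# Sketch of function-calling dispatch. The schema below follows OpenAI's
# tool-definition JSON format; the actual API call is replaced by a mocked
# reply so the routing logic can be tested without a network.
tools = [
    {"type": "function", "function": {
        "name": "get_weather",
        "description": "Get the weather for a city.",
        "parameters": {"type": "object",
                       "properties": {"city": {"type": "string"}},
                       "required": ["city"]}}},
]

def get_weather(city):
    return f"Sunny in {city}"  # stub in place of a real weather API

DISPATCH = {"get_weather": get_weather}

def handle_tool_call(tool_call):
    # tool_call mirrors the shape of a tool call in the model's response.
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return DISPATCH[name](**args)

# Mocked model output, shaped like the API would return it:
mock_call = {"function": {"name": "get_weather",
                          "arguments": '{"city": "Paris"}'}}
result = handle_tool_call(mock_call)
```

Adding calendar or file-reader tools means adding one schema entry and one `DISPATCH` mapping each; the routing code is unchanged.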
- Build a mini AutoGPT-style loop:
  - Define a goal
  - Let the agent iteratively refine plans and take steps toward completion
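The AutoGPT-style loop maintains a task queue seeded with the goal: execute a task, let the model propose follow-up tasks, and stop when the queue empties or a step budget runs out. A minimal sketch with toy stand-ins for the model (the `execute`/`refine` functions below are illustrative):

```python
from collections import deque

# Minimal sketch of an AutoGPT-style loop: pop a task, execute it,
# let a (stand-in) model append follow-up tasks, and stop when the
# queue is empty or the step budget is spent.
def autogpt_loop(goal, execute, refine, max_steps=10):
    tasks = deque([goal])
    log = []
    steps = 0
    while tasks and steps < max_steps:
        task = tasks.popleft()
        result = execute(task)              # take one step toward the goal
        log.append((task, result))
        tasks.extend(refine(task, result))  # model proposes follow-up tasks
        steps += 1
    return log

# Toy stand-ins: a "research" task spawns one "summarize" follow-up.
def execute(task):
    return f"done: {task}"

def refine(task, result):
    return ["summarize findings"] if task.startswith("research") else []

log = autogpt_loop("research agent frameworks", execute, refine)
```

The `max_steps` budget matters in practice: without it, a model that keeps proposing follow-ups loops forever, which is a common reliability failure to evaluate for.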
References
1. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., … & Le, Q. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv:2201.11903.
2. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. arXiv:2210.03629.
3. Schick, T., Dwivedi-Yu, J., et al. (2023). Toolformer: Language Models Can Teach Themselves to Use Tools. arXiv:2302.04761.
4. Nakano, R., Hilton, J., et al. (2021). WebGPT: Improving Factual Accuracy through Web Browsing. OpenAI Research. arXiv:2112.09332.
5. Press, O., Zhang, M., Min, S., Schmidt, L., Smith, N. A., & Lewis, M. (2022). Measuring and Narrowing the Compositionality Gap in Language Models (introduces Self-Ask prompting). arXiv:2210.03350.
6. Gao, L., Madaan, A., Zhou, S., Alon, U., Liu, P., Yang, Y., Callan, J., & Neubig, G. (2022). PAL: Program-Aided Language Models for Reasoning. arXiv:2211.10435.