Daniel Lyons' Notes

Building Agentic AI Workloads – Crash Course

Description

This course, from Rola Dali, PhD, provides a comprehensive overview of agentic AI, defining agents as software entities that use LLMs to perceive environment...

Notes

00:01 Introduction

  • Speaker: Rola Dali, Machine Learning Architect at Tech42
  • Background: PhD in neuroscience and bioinformatics from McGill University (2017)
  • Experience: 5 years in data, AI, ML, and cloud ecosystem; AWS hero and gold jacket ambassador
  • 00:49 Session overview
    • Generative AI overview
    • What agents are and when to use them
    • Agent design and implementation
    • Architectural patterns and evaluation
    • Career implications

01:15 History of Artificial Intelligence

  • 01:15 Early foundations (1940s-1950s)
    • 1943: McCulloch and Pitts proposed mathematical model of biological neurons
    • 1950: Alan Turing introduced the Turing Test as benchmark for machine intelligence
    • 1956: Dartmouth Workshop established AI as field of study; John McCarthy coined term "artificial intelligence"
  • 01:58 AI Winter and Machine Learning Renaissance
    • AI Winter: 1950s-1980s
    • 1990s: Machine learning renaissance; Geoffrey Hinton earned PhD despite AI being considered career suicide at the time
    • Hinton later became a godfather of AI, won Turing Award and Nobel Prize
  • 02:50 Gaming victories demonstrating AI potential
    • 1997: IBM's Deep Blue defeated Garry Kasparov in chess
    • 2011: IBM Watson won Jeopardy TV game show
  • 03:17 Deep Learning Boom (2010-2016)
    • 2012: AlexNet, from Geoffrey Hinton's lab, won the ImageNet image recognition competition
    • 2016: DeepMind's AlphaGo defeated Go champion Lee Sedol
  • 03:49 Generative AI Boom (Present)
    • 2017: Transformer paper released by Google and University of Toronto researchers; became blueprint for large language models
    • November 2022: OpenAI released ChatGPT for public access
    • 04:22 Democratization of AI
      • Previous booms required technical expertise; generative AI is accessible to general population
      • Everyone from students to professionals uses ChatGPT

06:12 Generative AI vs Traditional Machine Learning

  • 06:12 Three pillars of AI: Algorithms, Data, Compute
  • 07:04 Data differences
    • Traditional ML: Megabytes to gigabytes; ~1 million examples sufficient
    • Generative AI: Terabytes to petabytes; LLMs see ~15 trillion tokens
  • 08:05 Model size differences
    • Traditional ML: Thousands to millions of parameters
    • Generative AI: Billions to trillions (e.g., GPT-4 estimated at 1.8 trillion parameters)
  • 08:24 Compute advancements
    • Major breakthrough: Ability to parallelize sequential models
    • Transformer paper enabled parallelized training vs serial RNNs/LSTMs
  • 09:32 Combined effect of all three pillars
    • Creates foundational models that understand human language
    • Enables reading, writing, and information extraction
    • Surprising that scaling data and models alone achieves this capability
  • 10:49 Shift from specific to general tasks
    • Traditional ML: Task-specific models requiring custom data and training
    • Generative AI: General models capable of emerging tasks through language understanding
  • 12:13 Model as a Service
    • Traditional ML: Affordable but required technical skills
    • Generative AI: Cost of $100,000 to billions to train; prohibitive for individuals
    • Big labs (Anthropic, OpenAI, Amazon) offer models as pay-as-you-go service

14:42 Agency and Autonomy Spectrum

  • 14:42 Definition of agency
    • Ability to make choices, act intentionally, and have control
    • Spectrum of autonomy exists across different system types
  • 15:06 Evolution of agentic systems
    • LLM: Agency in output (probabilistic token generation with vast output variations)
    • Chatbot loop: LLM in conversational cycle
    • Workflows: Predefined steps in larger systems
    • 15:58 2025 as year of agents: Autonomous systems with control over application flow
    • Deep agents: Control over file systems, browser, ability to spawn other agents
  • 16:29 AGI (Artificial General Intelligence)
    • Pinnacle of the field but lacks clear definition
    • Experts disagree on definition, possibility, and specifications
  • 17:36 Terminology
    • Andrew Ng coined "agentic systems" as umbrella term acknowledging agency exists across spectrum
  • 18:03 Milestone timeline
    • 2017: Transformer paper (foundation algorithm)
    • 2022: ChatGPT released
    • 2023: Agents emerged with ReAct paper (merging reasoning and action)
    • January 2026: Current date (2-3 years into agent development)
  • 18:43 Field maturity considerations
    • Cutting-edge field evolving in real-time
    • No single authority deciding standards
    • Many companies experimenting with different approaches
    • Knowledge evolves rapidly; timestamps matter
    • Be cautious about information age when researching

20:32 What is an Agent?

  • 20:32 Definition
    • Software entity designed to perceive environment, make decisions, and take actions to achieve goals
    • Brain: Large Language Model (foundational model)
    • Capabilities:
      • Plan: Decompose tasks into subtasks using LLM
      • Act: Execute available tools
      • Observe: Monitor tool outputs
      • Loop: Plan-act-observe cycle until solution achieved
  • 21:32 Pseudocode workflow
    • While loop takes user input
    • Invokes LLM to create task plan
    • If response includes tool call, execute it
    • Return tool response to LLM
    • Loop continues until no more actions needed
    • Return final response to user
  • 22:20 Visual workflow
    • User invokes agent with request
    • Agent enters loop invoking LLM
    • LLM returns reasoning or tool call
    • Tool execution (if needed) returns results to LLM
    • Loop continues until resolution
    • Response sent back to user
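
The plan-act-observe loop described above can be sketched in a few lines of Python. The `fake_llm`, the `get_time` tool, and the message format here are illustrative stand-ins (not the speaker's code); a real implementation would call an actual model:

```python
# Minimal plan-act-observe agent loop; a canned fake_llm stands in for a real model.

def get_time():
    """Example tool the LLM can request."""
    return "14:05"

TOOLS = {"get_time": get_time}

def fake_llm(messages):
    """Stub LLM: requests the get_time tool once, then gives a final answer."""
    if any(m["role"] == "tool" for m in messages):
        return {"type": "answer", "text": "It is 14:05."}
    return {"type": "tool_call", "tool": "get_time"}

def run_agent(user_input, llm=fake_llm):
    messages = [{"role": "user", "content": user_input}]
    while True:
        response = llm(messages)                                  # plan
        if response["type"] == "tool_call":
            result = TOOLS[response["tool"]]()                    # act
            messages.append({"role": "tool", "content": result})  # observe
        else:
            return response["text"]                               # final answer

print(run_agent("What time is it?"))  # → It is 14:05.
```

The loop exits only when the model stops requesting tools, which is exactly the "loop continues until no more actions needed" condition in the pseudocode.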

23:04 Agents vs Workflows

  • 23:04 Example scenario: Booking activities in a new city
    • Available activities for specific dates
    • Check activity websites for availability
    • Check personal calendars for availability
    • Book activity if matches found
    • Add to calendar
  • 24:00 Workflow approach
    • Predetermined sequence of coded steps
    • Ask LLM for list of popular activities with websites
    • Call function to check activity availability
    • Check personal calendar availability
    • Loop until matching activities found
    • Call book activity function
    • Update calendar
  • 24:51 Agent approach
    • Provide agent with tools (same functions as workflow)
    • Specify role and goal but not how to solve
    • Example instruction: "Book activities in Montreal for these dates using available tools"
    • Agent autonomously decides execution path
  • 25:40 ⭐ Key difference
    • Agents: Dynamic control flow devised by LLM at runtime
    • Workflows: Static predefined coded graphs
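
The workflow side of the contrast can be sketched as a fixed sequence of calls. Every function here is a hypothetical stub for the booking scenario; the point is that the control flow is written in code, not devised by an LLM at runtime:

```python
# Static workflow: the control flow is fixed in code, not decided by an LLM.
# All functions are hypothetical stubs for illustration.

def list_popular_activities(city):
    return ["museum", "food tour"]      # in reality: an LLM call

def activity_available(activity, dates):
    return activity == "museum"         # in reality: check the activity's website

def calendar_free(dates):
    return True                         # in reality: check a personal calendar

def book(activity, dates):
    return f"booked {activity}"

def booking_workflow(city, dates):
    # Predetermined steps, executed in a fixed order.
    for activity in list_popular_activities(city):
        if activity_available(activity, dates) and calendar_free(dates):
            return book(activity, dates)   # would also add it to the calendar
    return "nothing booked"

print(booking_workflow("Montreal", ["2026-01-10"]))  # → booked museum
```

The agent version would receive the same functions as tools plus the instruction "Book activities in Montreal for these dates", and decide the ordering itself.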

26:21 Pros and Cons of Agents

  • 26:21 Advantages
    • Available 24/7/365; no breaks beyond maintenance
    • Multilingual support (200+ languages out of box)
    • Efficiency: Improved response times with well-configured systems
    • Consistency: Smaller execution variation than human-to-human differences
    • Convenience: Self-service options anytime
    • Scalability: Well-designed systems scale fast
    • Cost-effective: Generally cheaper than human labor
  • 27:56 Disadvantages
    • Not human: Lack of human touch; frustrating when systems break or are slow
    • Technology still maturing: LLM improvements ongoing, hallucinations still present, context window limitations
    • Application space maturing: Still learning optimal use cases, safe/ethical implementation, user experience optimization
    • Higher cost than traditional software: More expensive to run than conventional software systems (though still cheaper than human labor)

30:00 When to Use or Avoid Agents

  • 30:02 Workflow vs Agent decision factors
    • Use workflows for:
      • Mission-critical or error-sensitive applications
      • Regulated industries requiring deterministic outcomes
      • Latency-sensitive systems
      • Cost-sensitive situations (easier to estimate costs)
    • Use agents for:
      • Performance-critical tasks (agents outperform on average due to reasoning loop)
      • Situations requiring flexibility
      • Unknown solution paths
      • Situations comfortable with model-driven decisions
  • 31:57 Key questions to ask
    • Is application mission critical, error sensitive, or highly regulated?
    • Is the task path predictable, or can it be predefined?
    • Is value of task worth the cost?
    • Is latency critical?
  • 32:21 Use agents when:
    • Error is tolerable
    • Open to flexible approaches
    • Execution path is hard to code
    • Cost is not an issue
    • Latency can be tolerated

32:39 Components of an Agent

  • 32:39 Essential components (consensus across references)
    • Purpose/Goal: Agent has specific task to solve
    • Reasoning/Planning capability: Decompose tasks into subtasks
    • Memory: Enable longer discussions and context retention
    • Tools/Actions: Solve problems on user's behalf
  • 33:48 Optional additions
    • Guard rails
    • Communication (for multi-agent systems)
    • Learning from experience
  • 34:15 Implementation mapping
    • LLM: Reasoning/planning component
    • System prompt: Purpose, goal, mission, identity
    • Functions/API calls: Tools and actions
    • Short-term and long-term memory: Memory component

34:55 Choosing an LLM

  • 35:02 Selection criteria
    • Task complexity: Simpler tasks allow smaller models; complex tasks need larger models
    • Reasoning capabilities: Not all models trained to reason; important for agents
    • Context window size: Need capacity for more information
    • Tool calling capabilities: Models with native tool calling support simplify implementation
    • Latency: Affects overall application performance
    • Cost: Pay-per-token pricing considerations
    • Compliance and data privacy: Critical for regulated industries; may require self-hosting
    • Available benchmarks and leaderboards: Hugging Face agent leaderboard and others provide guidance

37:43 System Prompt (Agent Identity)

  • 37:43 Definition
    • Like instructions to a junior intern defining who they are, what they do, expected tasks
  • 38:09 Examples
    • Financial adviser (eloquent, professional)
    • Medical assistant (caring, empathetic)
    • Teen adviser (young, hip)
  • 38:32 Production considerations
    • Natural language statements
    • Includes tone of voice, identity, task instructions, prohibitions, situation-specific guidance
    • Longer than example prompts shown
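
A production-style system prompt along the lines described might look like the following. The wording and persona are an illustrative sketch, not the speaker's actual prompt:

```python
# Illustrative system prompt covering identity, tone, tasks, and prohibitions.
SYSTEM_PROMPT = """\
You are Fin, a financial adviser assistant for Example Bank.
Tone: eloquent and professional.
Tasks: answer questions about accounts, savings, and budgeting.
Prohibitions: never give legal or tax advice; never reveal internal data.
If unsure, say so and suggest contacting a human adviser."""

print(SYSTEM_PROMPT.splitlines()[0])  # → You are Fin, a financial adviser assistant for Example Bank.
```

Real production prompts are typically much longer, adding situation-specific guidance and examples for edge cases.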

39:02 Memory in Agents

  • 39:05 Why memory needed
    • LLMs are stateless; do not retain information between interactions
  • 39:20 Three types of memory
    • Intrinsic memory: Model parameters from training
      • Stable and doesn't change (unless retrained)
      • Varies between models
    • Short-term memory: Within-session context window
      • Conversation appended to context window for recall
      • Art and science of context management (what to keep, drop, summarize, compress)
    • Long-term memory: Across-session external storage
      • Stores customer preferences, complaints, information
      • Moved from short-term memory when context window fills
      • Important: storage structure, indexing, retrieval mechanisms
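
A minimal sketch of the short-term memory management described above: append each turn to the context, and drop the oldest turns once a budget is exceeded. The word-count budget here is a crude stand-in for a token limit, and real systems would summarize or compress rather than just drop:

```python
# Short-term memory: keep recent turns within a crude "token" budget.
# A word count stands in for a tokenizer; dropping stands in for summarizing.

MAX_WORDS = 16  # hypothetical budget

def trim_context(history):
    """Drop the oldest turns until the history fits the budget."""
    while sum(len(turn.split()) for turn in history) > MAX_WORDS:
        history.pop(0)
    return history

history = []
for turn in ["my name is Sam",
             "I like museums and food tours in Montreal",
             "book me something for Saturday afternoon please thanks"]:
    history.append(turn)
    history = trim_context(history)

print(history)  # → the two most recent turns; the oldest was dropped
```

In a full system, the dropped turn ("my name is Sam") is exactly the kind of fact that would be moved into long-term storage instead of being lost.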

41:27 Tools and Actions

  • 41:29 Purpose
    • Help overcome LLM limitations
    • LLMs are stateless (memory fixes this), have knowledge cutoff dates, lack real-time information
  • 42:18 Types of tool actions
    • Capability extensions: Function calls or API calls
    • Knowledge augmentation: Retrieve data/context from databases
    • Orchestration: Call other agents or communicate with systems
  • 42:38 Tool forms
    • Function calls in any language
    • API calls
    • Data retrieval from external databases
    • Browser actions
    • Code execution
    • File system control
    • Future: More autonomy and control capabilities
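
Tools are typically exposed to the LLM as a name plus a description, with a dispatcher executing whichever one the model selects. A hedged sketch (tool names and registry shape are illustrative, not a specific framework's API):

```python
# Tool registry: the descriptions are what the LLM "sees" when choosing a tool;
# the dispatcher runs the chosen function and returns its output.
import datetime

def calculator(expression: str) -> str:
    # Demo only: eval on untrusted input is unsafe in production.
    return str(eval(expression, {"__builtins__": {}}))

def get_date() -> str:
    return datetime.date.today().isoformat()  # fixes the knowledge-cutoff gap

REGISTRY = {
    "calculator": (calculator, "Evaluate an arithmetic expression."),
    "get_date": (get_date, "Return today's date."),
}

def dispatch(tool_name, *args):
    func, _description = REGISTRY[tool_name]
    return func(*args)

print(dispatch("calculator", "6 * 7"))  # → 42
```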

43:06 Implementing an Agent (Code Examples)

  • 43:06 Overview of implementation approaches
    • Single LLM call
    • LLM call in loop (chatbot simulation)
    • Simple agent from scratch (Python)
    • Agent with memory
    • Agent using frameworks (LangChain)
    • Open-source repository available with all code
  • 43:54 Single LLM call example
    • Uses boto3 with AWS Bedrock API
    • Model: Anthropic Claude 4.5
    • Converse API standardizes different model formats
    • Takes user input, sends to LLM, prints response
  • 45:02 Chatbot loop example
    • Same code as single call but wrapped in while loop
    • Runs until user enters "quit"
    • Demonstrates stateless nature: model doesn't retain name given earlier
    • Shows knowledge cutoff: can't answer questions about current date, time, weather
  • 48:17 Simple agent example
    • Tools created: calculator, get_weather, get_date, get_time
    • System prompt explains agent role and available tools
    • LLM function calls available tools when needed
    • Tool execution function handles tool invocation
    • Loop continues until agent provides final answer
    • Demonstrates agent using tools for real-time information
  • 52:05 Agent with memory example
    • Same code as simple agent plus update_memory function
    • Appends conversations to context window (JSON format)
    • Agent can now recall earlier information (e.g., user's name)
    • Can summarize interactions showing memory of all previous steps
    • Very basic memory implementation (simple appending)
  • 54:53 Agent with framework (LangChain)
    • Create_agent function from LangChain
    • Same Bedrock model and tools
    • Tools decorated with @tool
    • Simpler implementation: framework handles tool execution internally
    • Less boilerplate code needed
  • 57:01 Framework stability considerations
    • Frameworks can change dramatically (e.g., LangChain 1.0 overhaul ~month ago)
    • Models can be deprecated as new versions emerge
    • Unlike traditional software systems, generative AI systems have shorter "expiry dates"
    • Evolution happens not through architecture fault but through model and framework changes
    • Be mindful of deprecation and updates
  • 58:41 Agent with framework and memory
    • Add in-memory storage
    • Add checkpoint saver for short-term memory
    • Significantly simpler implementation of memory than manual approach
    • Same agent, tools, system prompt with memory capability
  • 59:36 Additional topics beyond basics
    • Model selection
    • Prompt engineering
    • Context engineering
    • Data management (storage, ranking, indexing, retrieval)
    • Memory types
    • Tooling and interfaces
    • Architectural choices
    • Deployment considerations
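
The single LLM call at the top of this section (boto3 with the Bedrock Converse API) might be sketched roughly as below. The model ID is a placeholder, and the network call itself is left commented out since it requires AWS credentials and the boto3 package:

```python
# Sketch of a single Bedrock Converse call. The model ID is a placeholder,
# and the actual network call is commented out (needs AWS credentials).
MODEL_ID = "anthropic.claude-example-id"  # placeholder, not a real model ID

def build_converse_request(user_text, system_text="You are a helpful assistant."):
    """Build the request dict in the shape the Converse API expects."""
    return {
        "modelId": MODEL_ID,
        "system": [{"text": system_text}],
        "messages": [{"role": "user", "content": [{"text": user_text}]}],
    }

request = build_converse_request("Hello!")

# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.converse(**request)
# print(response["output"]["message"]["content"][0]["text"])

print(request["messages"][0]["content"][0]["text"])  # → Hello!
```

The chatbot-loop variant from the talk just wraps the call in `while True:` with an exit on "quit", appending each turn to `messages`.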

60:03 Agentic Architectural Patterns

  • Key decision-making areas for agentic systems:
    • Architectural choices and deployment
    • Security and compliance
    • Orchestration approaches
  • Pattern selection depends on problem complexity:
    • Workflow: Use when you know the predefined sequence of events
    • Single Agent: Use when the solution space is too large
    • Multi-Agent System: Use when the problem is too complex

60:18 Emerging Multi-Agent Patterns

61:12 Hierarchical Supervisor Architecture

  • Supervisor agent manages specialized agents
  • Specialized agents cannot communicate with each other
  • All communication flows through the supervisor
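
The supervisor pattern can be sketched as a router that forwards each step to one specialist and relays results back, with specialists never talking to each other. All names are illustrative stubs; a real supervisor would use an LLM to choose and sequence specialists:

```python
# Supervisor pattern: every hop goes through the supervisor, which picks
# exactly one specialist per step. Specialists stay isolated from each other.

SPECIALISTS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}

def supervisor(steps):
    """Run decomposed steps in order, routing each to one specialist."""
    value = steps[0][1]                      # initial operand
    for op, operand in steps[1:]:
        value = SPECIALISTS[op](value, operand)  # transfer → result → relay
    return value

# (3 + 4) * 2, decomposed into two supervised hops:
print(supervisor([("start", 3), ("add", 4), ("multiply", 2)]))  # → 14
```

Every transfer in this structure is an extra hop through the supervisor, which is why the demo below reports more hops and tokens than the swarm version.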

61:26 Swarm Architecture

  • Multiple agents that can communicate with one another
  • No central coordinator required
  • More flexible agent interactions

62:04 Single Agent Implementation

  • Uses Python with LangChain
  • Claude model as the brain
  • Access to tools: add, multiply, divide
  • Can solve simple math expressions without tool usage
  • More complex expressions trigger tool use with breakdown

62:43 Supervisor Architecture Demonstration

  • Three specialized agents, each with one tool:
    • Addition agent (add tool only)
    • Multiplication agent (multiply tool only)
    • Division agent (divide tool only)
  • Supervisor agent routes tasks to appropriate specialists
  • Performance metrics 64:00:
    • 16 total hops
    • 10 agent actions and 6 transfers
    • ~8,000 input tokens
    • ~700 output tokens

64:38 Swarm Architecture Demonstration

  • All agents can speak to each other
  • Each agent has one primary tool but can request help from others
  • Default active agent: add agent
  • Performance metrics 65:51:
    • 8 total interactions
    • Only 2 transfers
    • ~5,000 input tokens
    • ~500 output tokens

66:13 When to Use Single vs Multi-Agent Systems

  • Recommendation 67:02: Start with single agent when possible
    • Avoids overhead of memory management
    • Reduces complexity of multi-agent coordination
  • Multi-agent benefits when 67:20:
    • Task complexity exceeds single agent capabilities
    • Prevents context window clogging with too many tools
  • Supervisor vs Swarm 67:48:
    • Swarm: Better for small teams and simple tasks (less overhead)
    • Supervisor: Better for complex tasks (easier to debug)
    • Choice depends on task size and complexity

69:02 Agent Interfaces and Standardization Protocols

  • Four main interfaces agents need 69:07:
    • Call tools and use their outputs
    • Interface with data sources and databases
    • Talk to users
    • Communicate with other agents in multi-agent systems
  • Standardization protocols emerging 69:45:
    • Model Context Protocol (MCP) 69:59: Anthropic standard for tools and data at agent interface
    • Agent-to-Agent (A2A) Protocol: Google's protocol for agent communication
    • Agent–User Interaction (AG-UI) 70:25: Collaboration between CopilotKit, CrewAI, and LangChain
  • MCP advantages 70:42:
    • Donated to Linux Foundation
    • Co-chaired by OpenAI and Anthropic
    • Enables plug-and-play tool integration
    • MCP hubs allow sharing and reusing tools

72:02 Evaluating Agents

  • Three layers of evaluation 72:10:
    • LLM layer: Following instructions, capability, accuracy, hallucination
    • Agentic system layer: Task decomposition, tool selection, information retrieval, task completion
    • Application layer: Performance, error rate, latency, scalability, cost efficiency, security, UX
  • Three evaluation methods 73:54:
    • Code-based evaluation: Most cost-effective and consistent
    • LLM as judge: More expensive but useful for complex outputs
    • Human evaluators: Most expensive but necessary for subjective assessments
  • Code-based evals recommended when 74:39:
    • Output is quantitative or discrete
    • Ground truth is known
    • Can compare against baseline
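
A minimal code-based eval along these lines: compare agent outputs against a known ground truth and report accuracy. The dataset and "agent" here are trivial illustrative stubs:

```python
# Code-based evaluation: cheap and consistent when the output is discrete
# and ground truth is known. The "agent" is a stub standing in for a real one.

def agent(question):
    return {"2 + 2": "4", "capital of France": "Paris"}.get(question, "unknown")

GROUND_TRUTH = [
    ("2 + 2", "4"),
    ("capital of France", "Paris"),
    ("capital of Australia", "Canberra"),
]

def evaluate(agent_fn, dataset):
    correct = sum(agent_fn(q) == expected for q, expected in dataset)
    return correct / len(dataset)

print(evaluate(agent, GROUND_TRUTH))  # → 0.666... (2 of 3 correct)
```

For open-ended outputs where exact matching fails, the same harness can swap the equality check for an LLM-as-judge call, at higher cost.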

75:25 Agent Challenges

75:32 Model-Related Challenges

  • Open-ended output evaluation difficulties
  • Model capability limitations
  • Context window constraints
  • Hallucination issues
  • Handling model cutoff limitations

76:29 System-Level Challenges

  • Path evaluation and decision-making complexity
  • Context management as active field of research
  • Convoluted debugging due to system layers
  • Price estimation difficulty due to agent loops
  • Compounding errors in long complex tasks
  • Getting stuck in loops despite framework safeguards
  • Tool integration issues

77:31 Infrastructure and Business Challenges

  • Framework stability concerns
  • Model deprecation risks
  • Library version changes
  • Uncertainty around business value delivery

78:01 Real-World Issues and Incidents

  • 78:08 Replit's coding agent deleted a production database despite a code freeze
  • 78:39 Lawsuit against OpenAI alleging ChatGPT contributed to a suicide (guardrails have since been added)
  • 79:13 Air Canada liable for chatbot's bad advice
  • 79:33 $40 billion in generative AI products not returning value
  • AI Incident Database 79:49: Tracks 4,323+ incidents ranging from minor (hallucinations) to major (ethics, legality, human welfare)

80:15 Current State and Future Outlook

  • Technology is still maturing
  • Progress is rarely linear
  • Recommendation: Use AI as a junior assistant, not autonomous system
  • Implementation best practices 81:14:
    • Start with read-only access to tools and systems
    • Add human approvals for critical steps
    • Enable comprehensive logging and trace visibility
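
The "human approvals for critical steps" practice can be sketched as a gate in front of write-capable tools: read-only tools run freely, anything else needs sign-off. Tool names and the approval hook are illustrative:

```python
# Human-in-the-loop gate: read-only tools run freely; write tools need approval.

READ_ONLY = {"search", "get_status"}

def execute_tool(name, approve, tools):
    """Run a tool; require human approval for anything that isn't read-only."""
    if name not in READ_ONLY and not approve(name):
        return f"blocked: {name} requires human approval"
    return tools[name]()

tools = {
    "get_status": lambda: "all systems nominal",
    "delete_records": lambda: "records deleted",
}

deny_all = lambda name: False  # stand-in for a real approval prompt/UI

print(execute_tool("get_status", deny_all, tools))      # → all systems nominal
print(execute_tool("delete_records", deny_all, tools))  # → blocked: delete_records requires human approval
```

Logging each decision (tool name, arguments, approval outcome) on top of this gate gives the trace visibility the talk recommends.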

81:28 Job Displacement and Career Implications

  • Microsoft Research Study (July 2025) 81:59:
    • Ranked occupations by AI replaceability
    • Highest risk: Proofreaders, editors, mathematicians, data scientists, web developers
    • Lowest risk: Nursing assistants, dishwashers, roofers, floor sanders
  • Moravec's Paradox 84:08:
    • What's hard for humans is easier for AI (intellectual tasks)
    • What's easy for humans is harder for AI (physical tasks)
    • AI excels at: Chess, gaming, cognitive abilities
    • AI struggles with: Walking, physical manipulation
  • Realistic assessment 83:00:
    • Change is coming but not yet
    • Pure cognitive replacement unlikely (mathematics is too complex)
    • Roles will transform rather than disappear (writers, analysts, developers)

86:06 Software Development Evolution

  • Three paradigms coexist 86:37:
    • Software 1.0 (1940s-2010s): Write explicit rules and conditionals
    • Software 2.0 (2010s): Machine learning - provide input/output, model learns rules
    • Software 3.0 (Today): Generative AI - use natural language to program systems
  • Modern developers should understand all three paradigms

89:01 Advice for Weathering the Storm

89:12 Learn AI

  • Don't fear AI—use it to amplify capabilities
  • AI is a tool whose impact depends on how we use it
  • Generative AI is accessible with free learning resources available

89:55 Fundamentals Don't Fade

  • Physics, math, biology, chemistry remain essential
  • System architecture and design principles are timeless
  • Databases, networking, identity management always matter
  • Use AI to boost productivity, not replace fundamental knowledge

90:45 Move Up the Abstraction Ladder

  • You must define the problem
  • You must design the solution
  • You must own the outcomes
  • AI is a junior assistant, good with syntax
  • AI tends to write convoluted, overly defensive code (creating tech debt)
  • Use clear instructions and review outputs for conciseness

92:03 Think in Systems

  • See the bigger picture
  • Understand components and integration points
  • Current context window can't handle full codebases
  • Poor context window management causes performance degradation
  • Manage AI as junior helper needing guidance, not autonomous agent

92:54 Be a Polymath

  • Broaden your knowledge across domains
  • Act as supervisor of AI helpers
  • Learn quickly to understand what AI is doing
  • Develop broader skill sets

93:19 Some Niches Are More Difficult for AI

  • AI only knows what's in training data
  • Struggles with cutting-edge and novel ideas
  • Token-by-token probabilistic generators (not planning)
  • Limitations 94:02:
    • No understanding of physics or world models
    • Primarily text-based, limited multimodal capability
    • Missing reasoning components needed for future progress
    • World models (like Dr. Yann LeCun's work) likely next evolution

95:37 Focus on the Human Element

  • Agents are not human and cannot replicate human connection
  • Humans' collective ability to work together is our species' greatest asset
  • Learning across generations and horizontal networks is uniquely human
  • Trust-building and team connections cannot be automated
  • AI is still an outsider to humanity

97:19 Recommended Learning Resources

  • 97:31 Andrew Ng's "Agentic AI" course on DeepLearning.AI (highly recommended)
  • Chip Huyen: Books on machine learning and AI engineering (comprehensive, down-to-earth writing)
  • Andrej Karpathy: YouTube channel and educational materials (excellent at explaining complex concepts)
  • Coursera: Multiple AI courses from industry leaders
  • Academy platforms: LangChain, Anthropic, Nvidia, DeepLearning.AI offer free courses
  • Stanford University: Computer science classes available on YouTube
  • Linux Foundation: MCP donation article and resources