Building AI Agents with the Vercel AI SDK
If a chatbot is like a helpful person at an information desk, an AI agent is like a personal assistant who can leave the desk, go out into the world, and get things done for you.
An agent is an autonomous system that can understand a high-level goal, break it down into steps, use tools to execute those steps, and adapt its plan based on the results. This is the leap from simply answering questions to actively solving problems.
How Do Agents Work?
At its core, an agent operates in a loop: it assesses a goal, chooses a tool, executes it, observes the result, and then decides what to do next. This process, often called a ReAct (Reason + Act) loop, allows the agent to tackle complex tasks that require multiple steps or access to external information.
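The loop described above can be sketched without any LLM at all. The following is a minimal illustration, not the AI SDK's implementation: `runAgent`, `Policy`, `demoTools`, and `scriptedPolicy` are hypothetical names, and the "model" is a scripted function standing in for a real one.

```typescript
// A minimal, synchronous sketch of a reason/act/observe loop.
// All names here are hypothetical; a real agent would ask an LLM to decide.

type ToolCall = { tool: string; input: string };
type Decision =
  | { kind: 'act'; call: ToolCall }
  | { kind: 'finish'; answer: string };

type Tools = Record<string, (input: string) => string>;
type Policy = (goal: string, observations: string[]) => Decision;

function runAgent(goal: string, tools: Tools, decide: Policy, maxSteps = 5): string {
  const observations: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    // Reason: pick the next action from the goal and everything observed so far.
    const decision = decide(goal, observations);
    if (decision.kind === 'finish') return decision.answer;

    // Act: run the chosen tool, or report that it does not exist.
    const tool = tools[decision.call.tool];
    const result = tool ? tool(decision.call.input) : `Unknown tool: ${decision.call.tool}`;

    // Observe: record the result so the next decision can use it.
    observations.push(result);
  }
  return 'Stopped: step limit reached';
}

// A toy tool that sums numbers written as "a+b+c".
const demoTools: Tools = {
  add: (input) => String(input.split('+').map(Number).reduce((a, b) => a + b, 0)),
};

// A scripted stand-in for the model: call the tool once, then finish.
const scriptedPolicy: Policy = (_goal, observations) =>
  observations.length === 0
    ? { kind: 'act', call: { tool: 'add', input: '2+3' } }
    : { kind: 'finish', answer: `The sum is ${observations[0]}` };

const demoResult = runAgent('Add 2 and 3', demoTools, scriptedPolicy);
console.log(demoResult); // The sum is 5
```

The step limit plays the same role as a stop condition in a real agent: it keeps a policy that never finishes from looping forever.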
The Vercel AI SDK gives you a straightforward way to build these agentic systems. Instead of writing every step of the loop by hand, you can let generateText run the tool-call loop and use stopWhen to control when it ends.
Building a Simple Multi-Step Agent
Let's build a simple math agent that can solve a multi-step word problem. The agent will have two tools: a calculator and an answer tool to deliver the final result.
Notice how we're not telling it when to use the calculator, just that it is available. The useful bit is that the model can reason about the problem and decide when a tool is necessary.
// lib/agents/math-agent.ts
import { generateText, hasToolCall, stepCountIs, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
import * as mathjs from 'mathjs';
// Ensure you have a .env file with your OPENAI_API_KEY
import 'dotenv/config';

async function main() {
  const { text, toolCalls, finishReason } = await generateText({
    model: openai('gpt-4o'),
    // Allow a short multi-step loop, or stop when the final answer is produced.
    stopWhen: [stepCountIs(5), hasToolCall('answer')],
    tools: {
      // A calculator tool that the model can use to evaluate math expressions.
      calculator: tool({
        description: 'A tool to evaluate mathematical expressions.',
        inputSchema: z.object({
          expression: z.string().describe('The mathematical expression to evaluate.'),
        }),
        execute: async ({ expression }) => mathjs.evaluate(expression),
      }),
      // An answer tool that the model uses to provide the final answer.
      answer: tool({
        description: 'A tool to provide the final answer to the user.',
        inputSchema: z.object({
          finalAnswer: z.string().describe('The final answer.'),
          explanation: z.string().describe('A step-by-step explanation of how the answer was derived.'),
        }),
        // No execute function: hasToolCall('answer') above ends the loop when the model calls this tool.
      }),
    },
    prompt: `You are a helpful math assistant.
A user is asking a question about a word problem.
Reason step-by-step and use the calculator tool when needed.
When you have the final answer, use the "answer" tool to provide it.
Question: A taxi driver earns $9461 in a 1-hour shift.
If they work 12 hours a day and use 14 liters of petrol per hour at a price of $134 per liter,
how much money does the driver earn in one day after deducting petrol costs?`,
  });

  console.log('---');
  if (finishReason === 'tool-calls') {
    // toolCalls holds the tool calls from the last step of the loop.
    const answerToolCall = toolCalls.find(call => call.toolName === 'answer');
    if (answerToolCall) {
      // With inputSchema-style tools (AI SDK 5), the arguments are on `input`, not `args`.
      const { finalAnswer, explanation } = answerToolCall.input as {
        finalAnswer: string;
        explanation: string;
      };
      console.log('Final Answer:', finalAnswer);
      console.log('\nExplanation:\n', explanation);
    } else {
      console.log('The agent did not provide a final answer.');
    }
  } else {
    console.log('Final Text Response:', text);
  }
  console.log('---');
}

main().catch(console.error);
In this example:
- We give the model a prompt and a set of tools.
- We use stopWhen to allow a short multi-step loop, then stop once the model calls the answer tool or reaches the step limit.
- The model first reasons that it needs to calculate the total earnings and total fuel cost. It can call the calculator tool one or more times to do this.
- After each tool call, the result is fed back into the model's context, allowing it to plan its next step.
- Once it has all the information it needs, it calls the answer tool to present the final, structured result. The hasToolCall('answer') condition in stopWhen then ends the loop, and because the answer tool has no execute function, there is no tool result to feed back to the model.
That is the practical value of agentic design. You provide the goal and the tools, and the agent can work through the intermediate steps.
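As a sanity check on the word problem in the prompt, the arithmetic the agent is expected to perform can be worked out directly. This is only a reference calculation using the figures from the question; the variable names are illustrative, and the model may order or phrase its calculator calls differently.

```typescript
// Reference arithmetic for the taxi-driver word problem (figures taken from the prompt).
const hourlyEarnings = 9461; // dollars earned per hour
const hoursPerDay = 12;
const litersPerHour = 14;
const pricePerLiter = 134; // dollars per liter

const grossEarnings = hourlyEarnings * hoursPerDay; // 113532
const fuelCost = litersPerHour * hoursPerDay * pricePerLiter; // 22512
const netEarnings = grossEarnings - fuelCost; // 91020

console.log({ grossEarnings, fuelCost, netEarnings });
```

If the agent's final answer differs from 91020, the intermediate calculator calls are the first place to look.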
Key Concepts for Building Agents
- Tool Definition: Tools are the actions your agent can take. They should be well-described so the LLM knows when and how to use them. Defining each tool's input schema with zod lets the SDK validate the model's arguments against that schema before your tool runs.
- State Management: For more complex agents, you'll need a way to persist information between interactions. This could be a simple object in memory, or a database-backed store.
- Planning: The LLM's ability to break down a large goal into smaller, manageable steps. This is often guided by a strong system prompt that encourages the model to "think step-by-step."
- Multi-Agent Systems: For very complex workflows, you can even have multiple specialized agents collaborate. For example, a "researcher" agent could find information, and a "writer" agent could use that information to compose a report.
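To make the state management point concrete, here is one possible shape for a simple in-memory store. This is a sketch under assumptions: `SessionStore`, `InMemorySessionStore`, and `Message` are hypothetical names and not part of the AI SDK. Hiding the storage behind a small interface is one way to swap in a database-backed store later without changing the agent code.

```typescript
// Hypothetical session store for persisting conversation history between interactions.
type Message = { role: 'user' | 'assistant'; content: string };

interface SessionStore {
  load(sessionId: string): Message[];
  append(sessionId: string, message: Message): void;
}

class InMemorySessionStore implements SessionStore {
  private sessions = new Map<string, Message[]>();

  load(sessionId: string): Message[] {
    // Return a copy so callers cannot mutate the stored history by accident.
    return [...(this.sessions.get(sessionId) ?? [])];
  }

  append(sessionId: string, message: Message): void {
    const existing = this.sessions.get(sessionId) ?? [];
    existing.push(message);
    this.sessions.set(sessionId, existing);
  }
}

// Usage: record one exchange, then reload it for the next interaction.
const store = new InMemorySessionStore();
store.append('session-1', { role: 'user', content: 'What is 12 * 14?' });
store.append('session-1', { role: 'assistant', content: '168' });

const sessionHistory = store.load('session-1');
console.log(sessionHistory.length); // 2
```

An in-memory Map is lost when the process restarts, so it suits prototypes and tests more than production use.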
AI agents are useful when a task needs planning, tools and multiple steps. With the Vercel AI SDK, you can move beyond simple Q&A bots and build systems that reason, call functions and adapt based on results.