We've been exploring the Vercel AI SDK v5 canary in this series, and today we're tackling a big one: Tools. If you're building any kind of agentic behavior, or just need your AI to interact with the outside world (or your own app's functions), this is where the rubber meets the road. v5 brings some serious structure and developer experience improvements to how tools are defined, called, and represented in the UI.
As a quick reminder, we're building on concepts from previous posts: UIMessage and UIMessagePart (the new message anatomy), v5's streaming capabilities, the V2 model interfaces (especially LanguageModelV2FunctionTool), and the overall client/server architecture. If those are new to you, you might want to glance back.
🖖🏿 A Note on Process & Curation: While I didn't personally write every word, this piece is a product of my dedicated curation. It's a new approach to content creation in which I guided powerful AI tools (such as Gemini Pro 2.5 for synthesis, working from a git diff of main vs. canary v5 and informed by extensive research, including OpenAI's Deep Research, with 10M+ tokens spent) to explore and articulate complex ideas. This method, combined with my own fact-checking and refinement, aims to deliver depth and accuracy efficiently. I encourage you to see it as a potent blend of human oversight and AI capability. I also use these tools for my own LLM chats on Thinkbuddy, and I do some of the editing and publishing there too.
Let's get into how v5 makes tool calls first-class citizens.
1. Tool Invocation Lifecycle Recap: From AI Request to UI Update
Before we dive into the v5 specifics, let's quickly refresh the general lifecycle of an AI tool call, as this sets the stage for understanding the structured approach v5 brings.
Why this matters?
Tool calling is a fundamental capability for making LLMs more than just text generators. Tools allow models to fetch real-time data, perform actions, or interact with other systems. Historically, managing this interaction – the request from the AI, executing the tool, getting the result back to the AI, and updating the UI – could be a bit loose, often involving custom logic and parsing.
How it generally works (Pre-v5 context):
- AI Decides to Use a Tool: LLM processes input, determines need for external help, decides to call a specific function/tool.
- AI Specifies Tool and Arguments: LLM generates a request specifying toolName and args (often JSON).
- Application Executes Tool: Your application (server-side or client-side) receives the request and executes the named tool with the provided arguments.
- Result Fed Back to AI: Tool output (data, confirmation, or error) is packaged and sent back to the LLM.
- AI Generates Final Response: LLM incorporates tool result and formulates its final response to the user.
v5 Enhancement Teaser:
This general flow remains, but Vercel AI SDK v5 provides a much more structured, typed, and integrated way to represent and manage this entire lifecycle, especially within the chat UI (using ToolInvocationUIPart) and the data flow.
Take-aways / Migration Checklist Bullets
- Tool calling involves the AI requesting, your app executing, and the AI using the result.
- v5 brings enhanced structure to this lifecycle, especially for UI and data flow.
2. Defining V2 LanguageModelV2FunctionTool with Zod (Server-Side)
To empower your LLM with tools in Vercel AI SDK v5, you first need to define them on the server using the LanguageModelV2FunctionTool interface, with Zod playing a crucial role in schema definition and argument validation.
Why this matters?
LLMs need clear, machine-readable definitions of tools: name, description, and argument schema. Without this, calls fail. v5 standardizes this with LanguageModelV2FunctionTool and strongly encourages Zod for schema validation.
How it’s solved in v5? (Step-by-step, Code, Diagrams)
Tools are defined using LanguageModelV2FunctionTool from @ai-sdk/provider.
// From packages/provider/src/language-model/v2/language-model-v2.ts
export interface LanguageModelV2FunctionTool<NAME extends string, ARGS, RESULT> {
readonly type: 'function';
readonly function: {
readonly name: NAME;
readonly description?: string;
readonly parameters: ZodSchema<ARGS>; // Typically a Zod schema
readonly execute?: (args: ARGS) => PromiseLike<RESULT>; // Optional server-side execution
};
}
+---------------------------------+
| LanguageModelV2FunctionTool |
|---------------------------------|
| .type: 'function' |
| .function: { |
| .name: string (ToolName) |
| .description?: string |
| .parameters: ZodSchema<ARGS> | <--- Zod for arg schema & validation
| .execute?: (ARGS)=>Promise<RESULT> | <--- Optional server-side logic
| } |
+---------------------------------+
[FIGURE 1: Diagram showing the structure of LanguageModelV2FunctionTool and its fields]
Key fields within function:
- name: NAME (string): The unique name the LLM uses to call the tool.
- description?: string: Natural language description for the LLM (what it does, when to use it). Good descriptions are crucial for correct tool usage.
- parameters: ZodSchema<ARGS>: Zod schema for the arguments.
  - Why Zod? It provides static typing and runtime validation, and the SDK typically converts it to JSON Schema for the LLM.
  - Automatic Validation: If the tool has a server-side execute, the SDK uses this Zod schema to validate LLM-provided arguments before execute is called. This is key for security and robustness.
- execute?: (args: ARGS) => PromiseLike<RESULT>: Optional server-side function.
  - If provided, the SDK can auto-run the tool; it receives validated args.
  - If not provided (or for client-side tools), the tool call info is streamed to the client.
Example with Zod:
import { z } from 'zod';
import { LanguageModelV2FunctionTool } from '@ai-sdk/provider';
const weatherParamsSchema = z.object({
city: z.string().describe("The city, e.g., 'San Francisco'"),
unit: z.enum(['celsius', 'fahrenheit']).optional().default('celsius'),
});
type WeatherParams = z.infer<typeof weatherParamsSchema>;
const getWeatherTool: LanguageModelV2FunctionTool<'getWeatherInformation', WeatherParams, string> = {
type: 'function',
function: {
name: 'getWeatherInformation',
description: 'Fetches current weather for a city.',
parameters: weatherParamsSchema,
execute: async ({ city, unit }) => {
// ... (call weather API or simulate) ...
return `Weather in ${city} is X°${unit === 'celsius' ? 'C' : 'F'}.`;
},
},
};
// Usage with streamText:
// await streamText({
// model: openai('gpt-4o-mini'),
// messages: modelMessages,
// tools: { getWeatherInformation: getWeatherTool },
// toolChoice: 'auto'
// });
Hints via .describe() on Zod properties help the LLM generate accurate arguments.
Take-aways / Migration Checklist Bullets
- Define server-side tools using LanguageModelV2FunctionTool.
- Use Zod schemas for function.parameters for argument definition and auto-validation.
- function.execute is optional, for server-side execution.
- Write clear function.description text. Use .describe() on Zod schema properties.
- Update V4 tool definitions to this V2 structure.
3. Server-side Tool Execution Flow (with an execute function)
When an LLM calls a server-defined tool that includes an execute function, Vercel AI SDK v5 orchestrates a seamless flow: validating arguments, running your function, and then automatically feeding the results back to the LLM for further processing, all while keeping the client UI informed via streamed updates.
Why this matters?
Automated server-side tool execution flow saves boilerplate for parsing args, calling tool code, formatting results, and constructing messages for the LLM.
How it’s solved in v5? (Step-by-step, Code, Diagrams)
Flow when streamText uses a tool with an execute method:
- LLM Decides to Call Tool: Generates a tool call request (toolName, args).
- V2 Provider Adapter Parses: Extracts the tool call info from the LLM response.
- AI SDK Core Logic (in streamText):
  - Identifies the tool.
  - Validates Arguments: Uses the tool's Zod schema from parameters to validate the LLM-provided args. If validation fails, an error (e.g., InvalidToolArgumentsError) is generated.
  - Calls execute(args): If valid, calls your tool's execute with typed, validated args.
  - Awaits Result: The SDK awaits the Promise returned by execute.
+---------+  Tool Call Req  +------------------+  Validated Args  +------------------+  Tool Result  +-----------------+
|   LLM   |---------------->|     SDK Core     |----------------->| Tool's .execute  |-------------->|    SDK Core     |
|         |                 | (Arg Validation) |                  |     Function     |               |  (Gets Result)  |
+---------+                 +------------------+                  +------------------+               +-----------------+
[FIGURE 2: Sequence diagram of server-side tool execution: LLM -> SDK -> Validate Args -> Execute Tool -> Get Result -> SDK]
- Result Sent Back to LLM (for multi-step):
  - The SDK constructs a ModelMessage (role: 'tool') with LanguageModelV2ToolResultPart(s) (containing toolCallId, toolName, result).
  - If streamText is configured for multi-step (e.g., maxSteps > 1), the SDK automatically sends this tool message back to the LLM within the same streamText operation.
  - The LLM processes the result and generates its next response (text or another tool call).
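To make this orchestration concrete, here's a minimal sketch of a Next.js route handler wiring the getWeatherTool from Section 2 into streamText with multi-step enabled. Treat it as a sketch: the './tools' module is hypothetical, and exact option names may differ in the canary.

// app/api/chat/route.ts (sketch)
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
import { getWeatherTool } from './tools'; // hypothetical module exporting the tool from Section 2

export async function POST(req: Request) {
  // Assumes `messages` arrive already in ModelMessage form (see the earlier post on message conversion).
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4o-mini'),
    messages,
    tools: { getWeatherInformation: getWeatherTool },
    toolChoice: 'auto',
    maxSteps: 5, // lets the SDK feed tool results back to the LLM automatically
  });

  // Streams 'tool-call' / 'tool-result' UIMessageStreamParts to the client.
  return result.toUIMessageStreamResponse();
}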
3.1 Auto-stream of tool-call and tool-result parts to the client
Simultaneously, result.toUIMessageStreamResponse() streams updates to the client UI:
- Streaming Tool Call Intention:
  - A 'tool-call-delta' UIMessageStreamPart (if toolCallStreaming is enabled), followed by a 'tool-call' part (with toolCallId, toolName, args).
  - The client useChat creates/updates a ToolInvocationUIPart (state: 'call').
  - The UI shows the AI using the tool, e.g., "AI is using getWeatherInformation...".
- Streaming Tool Result:
  - After the server-side execute completes, a 'tool-result' UIMessageStreamPart is streamed (with toolCallId, toolName, result).
  - The client ToolInvocationUIPart updates (state: 'result', populated with the result).
  - The UI shows the tool outcome.
Server (streamText & toUIMessageStreamResponse)           Client (useChat & processUIMessageStream)
-----------------------------------------------           -----------------------------------------
1. LLM emits tool_call(getWeather, {city:"L"})
2. SDK: 'tool-call-delta' sent to client ----------------> Client UI: Display "Tool: getWeather, Args: {city:\"L\"..." (state: 'partial-call')
3. SDK: 'tool-call' sent to client (full args) ----------> Client UI: Update ToolInvocationUIPart (state: 'call')
4. SDK: Server executes getWeatherTool.execute()
        (result = "Rainy...")
5. SDK: 'tool-result' sent to client (result) -----------> Client UI: Update ToolInvocationUIPart with result (state: 'result')
6. SDK: Result fed back to LLM (if maxSteps > 1)
7. LLM: Generates final text based on tool result
8. SDK: 'text' parts sent to client ---------------------> Client UI: Display final text message
[FIGURE 3: Diagram showing UIMessageStreamParts ('tool-call', 'tool-result') flowing to the client, updating a ToolInvocationUIPart]
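On the client, rendering these parts might look like the sketch below. The exact part shape is an assumption based on the fields discussed in this series (a 'tool-invocation' part carrying a toolInvocation object with state, toolName, args, result, errorMessage):

// Sketch: rendering a tool invocation part inside a message (field names assumed as described above)
function ToolPartView({ part }: { part: any }) {
  if (part.type !== 'tool-invocation') return null;
  const inv = part.toolInvocation; // { toolCallId, toolName, args?, result?, errorMessage?, state }

  switch (inv.state) {
    case 'partial-call':
    case 'call':
      return <p>AI is using {inv.toolName}…</p>;
    case 'result':
      return <pre>{JSON.stringify(inv.result, null, 2)}</pre>;
    case 'error':
      return <p role="alert">Tool {inv.toolName} failed: {inv.errorMessage}</p>;
    default:
      return null;
  }
}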
3.2 Injecting results for model follow-up (automatic if maxSteps > 1)
- If maxSteps (e.g., maxSteps: 5) allows, the SDK appends a tool-role ModelMessage (with the tool output) to its internal history and calls the LLM again.
- The loop (LLM -> tool call -> execute -> tool result -> LLM -> text/tool call) continues until final text or the maxSteps limit.
- All intermediate steps are streamed to the client.
Take-aways / Migration Checklist Bullets
- If a server tool has execute, streamText can auto-validate args and run it.
- Argument validation uses the Zod schema from LanguageModelV2FunctionTool.parameters.
- The SDK streams 'tool-call' and 'tool-result' UIMessageStreamParts to the client, updating the ToolInvocationUIPart.
- If maxSteps > 1, the SDK auto-sends tool results back to the LLM for continued processing.
4. Client-side onToolCall for Browser-Based Tools
Sometimes a tool needs to be executed directly in the user's browser: to access browser APIs like geolocation, interact with a browser extension, or simply ask the user for confirmation via window.confirm. Vercel AI SDK v5's useChat hook facilitates this through its onToolCall prop.
Why this matters?
Not all tools run on servers (e.g., the browser's navigator.geolocation or window.confirm). v5's onToolCall option in useChat handles these.
How it’s solved in v5? (Step-by-step, Code, Diagrams)
Use the onToolCall callback prop in useChat.
Scenario: LLM calls a tool.
- Case 1: The tool is defined on the server without execute.
- Case 2: The LLM calls a tool not defined on the server (client-only handler).

In both cases, the server streams a 'tool-call' UIMessageStreamPart. The client useChat processes it, creating a ToolInvocationUIPart (state: 'call'), and the client onToolCall then runs.
// Client-side React component
import { useChat, UIMessage, ToolCall } from '@ai-sdk/react';
function MyChatComponent() {
const { messages, addToolResult /* ...etc */ } = useChat({
onToolCall: async ({ toolCall }: { toolCall: ToolCall }) => {
// toolCall: { toolCallId: string, toolName: string, args: any }
if (toolCall.toolName === 'getClientGeolocation') {
try {
const position = await new Promise<GeolocationPosition>((res, rej) => navigator.geolocation.getCurrentPosition(res, rej));
return { toolCallId: toolCall.toolCallId, result: { lat: position.coords.latitude, lon: position.coords.longitude } };
} catch (error: any) {
return { toolCallId: toolCall.toolCallId, error: error.message };
}
}
// ... handle other client tools ...
return { toolCallId: toolCall.toolCallId, error: `Tool '${toolCall.toolName}' not handled.` };
}
});
// ... render UI ...
}
+-------------------+  Streams 'tool-call'   +-------------------+  Invokes callback   +-------------------+
|   Server (LLM)    |----------------------->| Client (useChat)  |-------------------->|    onToolCall(    |
| (decides to call  |                        | (receives part,   |                     |     toolCall      |
|   client tool)    |                        |  updates UI msg)  |                     |        )          |
+-------------------+                        +-------------------+                     +-------------------+
                                                       ^                                         |
                                                       | (updates ToolInvocationUIPart           | Returns {toolCallId, result/error}
                                                       |  state to 'result'/'error')             v
                                                       +-----------------------------------------+
                                  (Optional: if maxSteps allows, useChat auto-resubmits to the server with the tool result)
[FIGURE 4: Diagram showing client-side onToolCall flow: Server streams 'tool-call' -> Client useChat -> onToolCall executes -> Result updates ToolInvocationUIPart -> (Optional) Resubmit to server]
onToolCall receives: { toolCall: ToolCall } with toolCallId, toolName, and args.
onToolCall must return: Promise<{ toolCallId: string; result?: any; error?: string; }> (it must include the original toolCallId).
SDK action after onToolCall completes:
- The SDK takes the returned result/error.
- It updates the ToolInvocationUIPart in the UIMessage (state: 'result' or 'error').
- If maxSteps in useChat allows and all pending tools are resolved, useChat auto-resubmits the messages (with the client tool results) to the server, and the server/LLM continues.
4.1 Browser APIs (geolocation example)
onToolCall lets you await asynchronous browser APIs like navigator.geolocation.
4.2 UX patterns for confirmation dialogs (using addToolResult for manual submission)
For tools that need user interaction after the AI's request (e.g., "AI wants to book a flight. [Confirm] [Cancel]"):
- The LLM streams a 'tool-call' for, e.g., requestConfirmation.
- The UI renders the ToolInvocationUIPart (state: 'call') with Confirm/Cancel buttons. onToolCall for this tool might do nothing or just log.
- The user clicks a button. The handler calls addToolResult({ toolCallId, result: "User confirmed." }).
- addToolResult updates the ToolInvocationUIPart state. If maxSteps allows, useChat auto-resubmits (see the sketch below).
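A minimal sketch of that pattern, assuming a requestConfirmation tool and the part shape used above (names are illustrative):

// Sketch: Confirm/Cancel buttons for a 'call'-state tool invocation, resolved via addToolResult
import { useChat } from '@ai-sdk/react';

function ConfirmationChat() {
  const { messages, addToolResult } = useChat({ maxSteps: 5 });

  return (
    <div>
      {messages.map((m) =>
        (m.parts ?? []).map((part: any) => {
          if (part.type !== 'tool-invocation') return null;
          const inv = part.toolInvocation;
          if (inv.toolName !== 'requestConfirmation' || inv.state !== 'call') return null;
          return (
            <div key={inv.toolCallId}>
              <p>{inv.args?.message ?? 'Proceed?'}</p>
              <button onClick={() => addToolResult({ toolCallId: inv.toolCallId, result: 'User confirmed.' })}>
                Confirm
              </button>
              <button onClick={() => addToolResult({ toolCallId: inv.toolCallId, result: 'User cancelled.' })}>
                Cancel
              </button>
            </div>
          );
        })
      )}
    </div>
  );
}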
Take-aways / Migration Checklist Bullets
- Use useChat's onToolCall for browser-executed tools.
- onToolCall receives { toolCall: ToolCall } and returns Promise<{ toolCallId, result?, error? }>.
- Return the original toolCallId.
- The SDK updates the ToolInvocationUIPart. If maxSteps allows, it auto-resubmits to the server.
- For UI-driven confirmations, render the 'call' state and call addToolResult() from button handlers.
5. Error Propagation & Recovery in Tool Calls
Tool calls, like any external interaction, can fail. Vercel AI SDK v5 provides mechanisms for these errors to propagate through the system and offers strategies for recovery, ensuring your application can handle hiccups gracefully.
Why this matters?
Tool calls can fail (invalid LLM args, tool execution error, tool not found). Robust apps need to catch, display, and offer recovery.
How it’s solved in v5? (Step-by-step, Code, Diagrams)
v5 surfaces tool errors via specific error types and ToolInvocationUIPart.state.
- Schema Validation Errors (InvalidToolArgumentsError):
  - The LLM's args don't match the Zod schema in LanguageModelV2FunctionTool.parameters.
  - The SDK (server-side) auto-validates before execute. If validation fails, an error like InvalidToolArgumentsError is thrown.
  - The error is streamed to the client as a 'tool-error' UIMessageStreamPart (with toolCallId, toolName, errorMessage).
  - The ToolInvocationUIPart on the client updates to state: 'error' with errorMessage populated.
- Tool Execution Errors (ToolExecutionError):
  - A server-side execute throws an unhandled exception (e.g., an external API is down). The SDK catches it, often wrapping it as a ToolExecutionError, and streams it as 'tool-error'. The ToolInvocationUIPart updates to state: 'error'.
  - A client-side onToolCall throws or returns { toolCallId, error: "..." }. useChat updates the ToolInvocationUIPart to state: 'error'.
- Tool Not Found Errors (NoSuchToolError):
  - The LLM calls a tool that isn't defined in streamText's tools and isn't handled by the client onToolCall. The SDK recognizes this, resulting in a NoSuchToolError, streamed as 'tool-error' or a general stream error.
- SDK Error Handling & Repair (Experimental - ToolCallRepairError):
  - V4 had experimental_repairToolCall to fix invalid tool calls. If this persists or is enhanced in v5 and repair fails, a ToolCallRepairError might occur.
Error Points in Tool Call Lifecycle:
1. LLM -> Args -> [SDK: Zod Validation] --(FAIL)--> InvalidToolArgumentsError
        | (PASS)
        v
2. SDK -> Tool.execute() -> [Tool Logic] --(FAIL)--> ToolExecutionError (or a custom error thrown in execute)
        | (PASS)
        v
3. Tool -> Result
(If the LLM calls a non-existent tool --> NoSuchToolError)
All of these errors are typically streamed to the client as 'tool-error' UIMessageStreamParts, updating ToolInvocationUIPart.state to 'error'.
[FIGURE 5: Flowchart showing different points where tool errors can occur and how they propagate to the UI]
Recovery Strategies:
- Retry with reload(): useChat().reload() resends the last user message. The LLM might try the tool again (perhaps with corrected args), choose a different tool, or answer without a tool.
- AI Self-Correction (multi-step): If the ToolInvocationUIPart error (with its message) is sent back to the LLM, a sophisticated model may understand the error and retry the tool with corrected args, or try an alternative.
- Clear UI Feedback: Crucial. The UI must display ToolInvocationUIPart errors from errorMessage; this helps the user rephrase or retry.
Take-aways / Migration Checklist Bullets
- Anticipate errors: LLM arg errors, tool execution failures, tool not found.
- v5 surfaces these via specific error types and ToolInvocationUIPart.state set to 'error' with errorMessage.
- Errors are typically streamed via 'tool-error' UIMessageStreamParts.
- Implement UI rendering for the ToolInvocationUIPart error state.
- Use useChat().reload() for user-driven retries.
- Informative error messages sent back to the LLM can enable AI self-correction.
6. Security – Validating Args & Results (Crucial Emphasis)
When integrating tools with LLMs, security is paramount. Always validate arguments provided by the LLM before tool execution and sanitize results from tools before displaying them or feeding them back to the LLM. Vercel AI SDK v5's emphasis on Zod schemas for tool parameters is a key enabler for input validation.
Why this matters?
CRUCIAL EMPHASIS. LLMs generate text; unchecked arguments or unsanitized tool results can lead to vulnerabilities (SQL injection, XSS, DoS via large payloads). Prompt injection is a real risk.
How it’s solved in v5? (Practices & SDK Features)
v5's LanguageModelV2FunctionTool encourages practices that mitigate these risks.
6.1 Validating LLM-Generated Arguments (Input Validation for Tools)
- Golden Rule: Always validate LLM-generated arguments before tool execution.
- Zod Schemas:
  - For server tools with execute, the SDK auto-validates LLM args against the Zod schema in LanguageModelV2FunctionTool.parameters. If validation fails, an error is raised and execute is not called with bad data. Leverage this fully with strict schemas.
- Manual Validation for Client-Side Tools (onToolCall):
  - In the client onToolCall, if the tool is critical or the args are complex, manually validate toolCall.args against a Zod schema (see the sketch after this list).
- Preventing Injections: Strict schema validation is one defense against prompt injection leading to malicious args (e.g., SQL injection). Secure coding inside the tool (e.g., parameterized queries) is another.
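A sketch of that manual client-side check (the tool name and schema are illustrative):

import { z } from 'zod';

// Hypothetical schema for a client-side tool's arguments
const geoArgsSchema = z.object({ highAccuracy: z.boolean().optional() });

// Inside useChat({ onToolCall }): validate before acting on LLM-provided args
async function handleToolCall({ toolCall }: { toolCall: { toolCallId: string; toolName: string; args: unknown } }) {
  if (toolCall.toolName === 'getClientGeolocation') {
    const parsed = geoArgsSchema.safeParse(toolCall.args);
    if (!parsed.success) {
      // Reject invalid args instead of passing them to browser APIs or backend calls
      return { toolCallId: toolCall.toolCallId, error: 'Invalid arguments: ' + parsed.error.message };
    }
    // ...use parsed.data safely...
  }
  return { toolCallId: toolCall.toolCallId, error: `Tool '${toolCall.toolName}' not handled.` };
}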
6.2 Sanitizing/Validating Tool Results (Output Validation from Tools)
- Golden Rule: If tool results come from untrusted sources, validate/sanitize them before displaying them in the UI (to prevent XSS) or sending them back to the LLM (to prevent input poisoning/DoS).
- Examples:
  - XSS: A tool returns the title "<img src=x onerror=alert(1)>". Render it as plain text or use an HTML sanitizer (DOMPurify).
  - LLM Input Poisoning/DoS: A tool returns 10MB of random characters. Validate structure/size, truncate, and sanitize control characters before sending it to the LLM.
+----------+  args   +----------------------+  result  +---------------------+
|   LLM    |-------->|    Tool Execution    |<---------|  External API/Data  |
+----------+         |      (Your Code)     |          +---------------------+
     ^               |                      |
     |               |  - INPUT VALIDATION  |   (potentially unsafe data)
     | (result_clean)|    (Zod on args)     |
     |               |  - OUTPUT VALIDATION/|
     +---------------|    SANITIZATION      |
                     |    (on result)       |
                     +----------------------+
[FIGURE 6: Diagram illustrating the two-way validation: LLM args -> Tool (Input Validation) and Tool result -> UI/LLM (Output Validation/Sanitization)]
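A sketch covering both directions (DOMPurify handles the HTML case mentioned above; the truncation limit is an arbitrary example):

import DOMPurify from 'dompurify';

// Sanitize tool output before rendering it as HTML (blocks payloads like the <img onerror> example)
function toSafeHtml(untrusted: string): string {
  return DOMPurify.sanitize(untrusted);
}

// Bound and clean tool output before feeding it back to the LLM (guards against huge or garbage payloads)
function toSafeModelInput(untrusted: string, maxChars = 4000): string {
  return untrusted
    .replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g, '') // strip control characters
    .slice(0, maxChars);
}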
Take-aways / Migration Checklist Bullets
- SECURITY IS PARAMOUNT for tools.
- ALWAYS validate LLM-generated arguments. Use Zod for server tools; manually validate in the client onToolCall.
- Be careful with tools executing code, DB queries, file system access. Use least privilege.
7. Composing Multi-Step Chains (Server-Side and Client-Involved)
Vercel AI SDK v5 excels at facilitating multi-step conversational flows where the AI might call a tool, get a result, then call another tool or generate text, all within a single user turn. This is powered by the maxSteps option on both the server (streamText) and the client (useChat), along with the structured streaming of tool interactions.
Why this matters?
Conversational agents often need sequential actions/reasoning (e.g., "Weather in French capital? Find bistro there."). v5 simplifies orchestrating these chains.
How it’s solved in v5? (Step-by-step, Code, Diagrams)
7.1 maxSteps in streamText (Server-Side Multi-Step)
- Scenario: Call streamText with server tools (with execute) and maxSteps > 1 (e.g., maxSteps: 5).
- SDK Orchestration:
  - User message -> streamText.
  - LLM (Turn 1) -> calls ServerToolA.
  - SDK: Validates the args, executes ServerToolA, gets resultA.
  - SDK: Constructs a tool ModelMessage with resultA and auto-sends it back to the LLM (same streamText operation).
  - LLM (Turn 2): Processes resultA. It might generate final text, or call ServerToolB.
  - The loop continues until final text or the maxSteps limit.
- Streaming: All intermediate 'tool-call' and 'tool-result' parts are streamed to the client UI.
Server-Side Multi-Step (maxSteps > 1):
User Prompt --> [streamText Call] --> LLM
                                       | (decides ToolA)
                                       v
                        SDK: Executes ToolA --> ResultA
                                       | (ResultA fed back)
                                       v
                                      LLM
                                       | (decides ToolB or Final Text)
                                       v
                        SDK: Executes ToolB --> ResultB (if ToolB called)
                                       | (ResultB fed back)
                                       v
                                      LLM --> Final Text Response
(Each tool call and result is streamed to the client as UIMessageStreamParts)
[FIGURE 7: Diagram of server-side multi-step flow with maxSteps: LLM -> ToolA -> ResultA -> LLM -> ToolB -> ResultB -> LLM -> Final Text. All streamed to client.]
7.2 maxSteps in useChat (Client-Involved Multi-Step)
- A useChat client option. It controls how many rounds of (user message -> server/LLM -> client tool -> server/LLM -> client response/tool) can run.
- Flow with Client Tools (onToolCall):
  - User message -> useChat POSTs to the server.
  - Server (streamText): The LLM calls ClientToolX. The server streams a 'tool-call' for ClientToolX.
  - Client (useChat): onToolCall for ClientToolX runs and returns resultX. The ToolInvocationUIPart updates.
  - Client (useChat auto-resubmit): If maxSteps allows and all pending tools are resolved, useChat auto-POSTs the updated messages (with resultX as a tool-role message) back to the server.
  - Server (again): streamText runs with resultX in the history.
  - LLM: Processes resultX and generates final text or another tool call.
  - The loop continues until final text or the useChat maxSteps limit.
7.3 StepStartUIPart for UI Delineation
- streamText().toUIMessageStreamResponse() may auto-insert 'step-start' UIMessageStreamParts into the v5 UI Message Stream during multi-step tool calls.
- Purpose: Marker parts of the form { type: 'step-start' }.
- Function: They indicate a new logical "step" in the AI's process.
- UI Rendering: Render them as a visual separator (e.g., <hr>, "Step 2:") to help the user follow complex AI actions (see the sketch below).
Take-aways / Migration Checklist Bullets
- Use maxSteps in server-side streamText for automated multi-step server tool chains.
- Use maxSteps in client-side useChat for auto-resubmission of client tool results.
- The SDK streams intermediate tool calls/results as ToolInvocationUIParts.
- Look for StepStartUIPart to visually delineate steps.
8. Showcase: Calendar-booking Wizard (Conceptual Walkthrough)
To illustrate the power of v5's structured tool handling and multi-step chains, let's imagine building a conceptual calendar-booking assistant. This example will highlight how server-side tools, client-side interactions, and UI updates come together.
Why this matters?
Booking meetings involves multiple steps (check availability, present options, confirm). Hard with simple request-response LLM. v5 helps AI guide user through this.
The Scenario: User: "Book a 1-hour meeting with Jane for next Tuesday afternoon."
Conceptual Tools:
- checkAvailability(person, dateRange, durationHours) (server-side, with execute): Queries the calendar and returns available slots.
- displaySlotOptions(slots, prompt) (client-side, UI render + addToolResult): The AI calls this to tell the UI to show slots as clickable options; the user's click provides the result.
- confirmBooking(person, selectedSlot, durationHours) (server-side, with execute): Books the meeting.
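As a sketch, the client-handled tool can be declared on the server without an execute function, so the SDK streams the 'tool-call' to the client instead of running it (the schema here is illustrative):

import { z } from 'zod';
import { LanguageModelV2FunctionTool } from '@ai-sdk/provider';

const slotOptionsSchema = z.object({
  slots: z.array(z.string()).describe('Candidate time slots to offer'),
  prompt: z.string().describe('Short prompt to show the user'),
});
type SlotOptionsArgs = z.infer<typeof slotOptionsSchema>;

// No execute: the call is streamed to the client, where the UI + addToolResult resolve it.
const displaySlotOptionsTool: LanguageModelV2FunctionTool<'displaySlotOptions', SlotOptionsArgs, string> = {
  type: 'function',
  function: {
    name: 'displaySlotOptions',
    description: 'Shows time-slot options to the user and waits for their selection.',
    parameters: slotOptionsSchema,
  },
};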
Walkthrough (v5 features highlighted; maxSteps: 5 assumed):
- User: "Book..." -> Client
useChat
POSTs. - AI (Turn 1 - Server): LLM -> calls
checkAvailability
. SDK validates,execute
s. Client UI: ShowsToolInvocationUIPart
"Checking Jane's availability...". - Server (
checkAvailability
): Returnsresult: ["Tue 2PM", "Tue 3PM"]
. SDK auto-sends to LLM. Client UI:ToolInvocationUIPart
updates "Found 2 slots.". -
AI (Turn 2 - Server): LLM -> calls
displaySlotOptions(slots: ["Tue 2PM", "Tue 3PM"], prompt: "Select time:")
. No serverexecute
, so streams'tool-call'
to client. Client UI: NewToolInvocationUIPart
"Select time:" with buttons [Tue 2PM] [Tue 3PM].
Chat UI - AI Turn:
-------------------------------------------------
AI: Okay, I found these times for Jane next Tuesday. Which one works for you?
[Tool: displaySlotOptions - awaiting user input]
[ Button: Tuesday 2:00 PM ]
[ Button: Tuesday 3:00 PM ]
[ Button: Tuesday 4:00 PM ]
-------------------------------------------------
[FIGURE 8: Mockup of UI showing these clickable slot buttons within a ToolInvocationUIPart]
- User (Client): Clicks "Tue 3PM". The button handler calls addToolResult({ toolCallId_displaySlots, result: "Tue 3PM" }). The ToolInvocationUIPart updates, and useChat auto-POSTs back to the server.
- AI (Turn 3 - Server): The LLM gets "Tue 3PM" and calls confirmBooking. The SDK validates and executes. Client UI: A new ToolInvocationUIPart "Confirming meeting...".
- Server (confirmBooking): Returns result: "Booked!". The SDK auto-sends it to the LLM. Client UI: The ToolInvocationUIPart updates to "Booking confirmed!".
- AI (Turn 4 - Server): The LLM gets the success result and generates the final text: "Great! Booked for Tue 3PM." The server streams 'text' parts. Client UI: The final assistant message appears.
v5 Features Highlighted:
- Structured ToolInvocationUIPart for each tool step.
- A mix of server (execute) and client (onToolCall/addToolResult) tools.
- maxSteps (server & client) for automated chaining.
- addToolResult for UI-driven tool completion.
- Clear UI feedback via ToolInvocationUIPart states.
Take-aways / Migration Checklist Bullets
- v5 tool features enable complex multi-step agents.
- Combine server and client tools; maxSteps automates the chain.
- ToolInvocationUIPart is key for UI state. Design tools and prompts for the workflow.
9. Key Lessons Learned (Summary of Tool Usage in v5)
Wrapping up our deep dive into Vercel AI SDK v5's tool capabilities, let's consolidate the main advantages and best practices that emerge from this new, more structured approach.
Why this matters?
v5 brings organization and power to complex AI tool interactions. Understanding core principles helps build robust, maintainable, user-friendly tool-using AI apps.
Actionable Takeaways & Best Practices:
- Structured is Better: Shift to v5's ToolInvocationUIPart and the server-side LanguageModelV2FunctionTool for robust, typed, stateful tool handling.
- Schema is Your Friend (Embrace Zod): Always use Zod schemas for LanguageModelV2FunctionTool.parameters for clear argument definition, automatic server-side validation, and type safety.
- Clear Client vs. Server Execution Strategy: Decide where tools run. Server execute for backend resources/security; client onToolCall/addToolResult for browser APIs/UI interaction.
- Design for Multi-Step Interactions: Leverage maxSteps (server streamText, client useChat) for chained tool calls, result processing, and continued reasoning.
- Build Rich, Informative UIs for Tools: Use ToolInvocationUIPart states ('partial-call', 'call', 'result', 'error') for clear visual feedback.
- Security First, Always: Validate LLM args (Zod helps). Sanitize tool results before UI display (XSS) and before sending them to the LLM.
Teasing Post 9: Persisting Rich Chat Histories
With tool calls as first-class citizens in UIMessages, how do we reliably save and restore these intricate conversations?
Post 9: "Persisting Rich UIMessage Histories: The v5 'Persist Once, Render Anywhere' Model." We'll dive into database schema strategies, best practices for saving UIMessage arrays with parts/metadata, and v5's high-fidelity restoration.