Control tool choice
When you use a language model with Arcade AI, you can control how it selects and uses the available tools with the tool_choice parameter:
from openai import OpenAI

# Point an OpenAI client at the Arcade Engine, which exposes an
# OpenAI-compatible API (the URL and key below are placeholders).
client = OpenAI(
    base_url="https://api.arcade-ai.com/v1",
    api_key="<ARCADE_API_KEY>",
)

user_id = "user@example.com"  # a unique identifier for the end user

response = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "Star the ArcadeAI/arcade-ai repo on GitHub",
        },
    ],
    model="gpt-4o",
    user=user_id,
    tools=[
        "GitHub.SetStarred",
        "GitHub.CountStargazers",
    ],
    tool_choice="generate",
)
The tool_choice parameter accepts these options:
- none: Prevents the model from running any tools.
- execute: The model runs the tool and returns the tool's output directly.
- generate: The model runs the tool and generates a response based on the tool's output.

Additionally, these options from OpenAI's tool_choice parameter are supported, but are not commonly used:
- auto: Lets the model decide which tools to use, but does not run the tool.
- required: Ensures the model selects at least one tool, but does not run the tool.
For backwards compatibility, auto and required only predict the tool choice, but do not run the tool. These options behave the same with or without Arcade AI.
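For example, to get the raw tool output back instead of a model-written reply, make the same kind of call with tool_choice="execute". A minimal sketch, reusing the client setup from the example above (where the tool's result lands on the response is an assumption; inspect the response object in your own setup):

response = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "How many stars does ArcadeAI/arcade-ai have?"},
    ],
    model="gpt-4o",
    user=user_id,
    tools=["GitHub.CountStargazers"],
    tool_choice="execute",  # return the tool's output directly, no second LLM pass
)

# Assumption: the tool result is surfaced on the message content.
print(response.choices[0].message.content)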
Tool calling patterns with Arcade AI
Whether to use execute or generate depends on how you want to use the tool's output.
tool_choice: execute
The execute option empowers the LLM to run tools as if it were executing them directly. Arcade AI handles the tool execution behind the scenes and returns the results to the client.
Flow Overview:
- Client Request: The client calls the AI model via the Arcade Engine.
- Tool Definition: The Engine adds tool definitions to the request.
- Model Prediction: The LLM predicts which tool to use and its arguments.
- Tool Execution: The Engine sends the arguments to the appropriate Actor.
- Result Return: The Actor executes the tool and returns results to the Engine.
- Client Response: The Engine sends the results back to the client.
Example: Sending a Slack Message
Imagine a user wants to send a Slack message:
- User Input: "Send a Slack message to John saying 'Meeting at 3 PM'"
- LLM Prediction: Use the Slack.SendDmToUser tool with arguments:
  - user_name: "john"
  - message: "Meeting at 3 PM"
- Tool Execution: The Engine forwards these arguments to the Actor that hosts the Slack toolkit, which sends the message.
- Response: The client receives the return value from the tool.
This process happens seamlessly, with the client only seeing the initial request and final response.
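In code, this interaction might look like the following sketch. It reuses the client setup from the first example; the exact response shape is an assumption, so inspect the response object to see where the tool's return value lands.

response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Send a Slack message to John saying 'Meeting at 3 PM'",
        },
    ],
    model="gpt-4o",
    user=user_id,
    tools=["Slack.SendDmToUser"],  # the model predicts user_name and message
    tool_choice="execute",
)

# With execute, the client receives the tool's return value,
# not a model-written reply.
print(response.choices[0].message.content)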
tool_choice: generate
The generate option works like execute but adds a step in which the Engine asks the LLM to create a response based on the tool's results. This produces more refined output that incorporates the tool's data.
Flow Overview:
- Client Request: The client calls the AI model via the Arcade Engine.
- Tool Definition: The Engine adds tool definitions to the request.
- Model Prediction: The LLM predicts which tool to use and its arguments.
- Tool Execution: The Engine sends the arguments to the appropriate Actor.
- Intermediate Results: The Actor executes the tool and returns results to the Engine.
- Response Generation: The Engine sends a second request to the LLM with the tool's results.
- Final Response: The LLM generates a response incorporating the tool's output, and the Engine returns it to the client.
Example: Checking Calendar Availability
Suppose a user wants to know their availability for the next day:
- User Input: "What's my availability for tomorrow?"
- LLM Prediction: Use the Google.ListEvents tool for the specified date.
- Tool Execution: The Engine requests the Actor hosting the Calendar toolkit to retrieve events for tomorrow.
- Results: The Actor returns calendar data (e.g., three meetings scheduled).
- LLM Response Generation: The Engine provides the calendar data to the LLM.
- Response: The LLM generates: "You have 3 meetings tomorrow. You're free from 9-10 AM, 12-2 PM, and after 4 PM."
- Client Receives: The summarized availability information.
By leveraging generate, you receive responses that are both informative and contextually rich.
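A sketch of the same flow in code, again reusing the client setup from the first example (Google.ListEvents is the tool named above; resolving "tomorrow" to a date is left to the model):

response = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "What's my availability for tomorrow?"},
    ],
    model="gpt-4o",
    user=user_id,
    tools=["Google.ListEvents"],
    tool_choice="generate",  # run the tool, then have the LLM summarize its output
)

# With generate, the content is a natural-language answer built from the
# tool's results, e.g. "You have 3 meetings tomorrow. You're free from
# 9-10 AM, 12-2 PM, and after 4 PM."
print(response.choices[0].message.content)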