LLM Inference

Provides access to Somnia's on-chain deterministic AI models for text generation, analysis, and decision-making. This agent enables smart contracts to leverage large language models for intelligent automation — including tool use with both MCP servers and on-chain function calls.

Methods

inferString

Simple single-turn inference with an optional system prompt.

function inferString(string prompt, string system, bool chainOfThought, string[] allowedValues) returns (string response)

Parameters

| Input | Type | Description |
| --- | --- | --- |
| prompt | string | The user prompt to send to the model |
| system | string | The system prompt to configure model behavior (can be empty string) |
| chainOfThought | bool | Whether to enable chain-of-thought reasoning |
| allowedValues | string[] | Optional list of allowed response values; the model is constrained to return one of these. Pass an empty array for unconstrained output |

Returns

| Output | Type | Description |
| --- | --- | --- |
| response | string | The generated text response |
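When allowedValues is non-empty, a caller typically maps the returned string onto a fixed set of actions. The following is a minimal off-chain sketch in Python of that mapping, not the agent's implementation. It assumes the response matches one of the allowed values verbatim (exact, case-sensitive match); the labels and the helper name `decision_index` are hypothetical.

```python
def decision_index(response: str, allowed_values: list[str]) -> int:
    """Map a constrained response to its position in allowed_values."""
    if not allowed_values:
        # An empty list means unconstrained output, so there is nothing to map.
        raise ValueError("allowed_values is empty: output is unconstrained")
    # Assumption: the agent returns one of the values verbatim (exact match).
    if response not in allowed_values:
        raise ValueError(f"unexpected response: {response!r}")
    return allowed_values.index(response)


ALLOWED = ["APPROVE", "REJECT", "ESCALATE"]  # hypothetical moderation labels
action = decision_index("REJECT", ALLOWED)   # position of "REJECT" in ALLOWED
```

A contract could store the same index as an enum value instead of comparing strings repeatedly.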


inferNumber

Single-turn inference that extracts an integer from the model's response, clamped to a specified range.

function inferNumber(string prompt, string system, int256 minValue, int256 maxValue, bool chainOfThought) returns (int256 response)

Parameters

| Input | Type | Description |
| --- | --- | --- |
| prompt | string | The user prompt to send to the model |
| system | string | The system prompt to configure model behavior (can be empty string) |
| minValue | int256 | Minimum allowed value for the response |
| maxValue | int256 | Maximum allowed value for the response |
| chainOfThought | bool | Whether to enable chain-of-thought reasoning |

Returns

| Output | Type | Description |
| --- | --- | --- |
| response | int256 | The extracted integer, clamped to [minValue, maxValue] |
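To make the clamping behavior concrete, here is a small Python sketch of "extract an integer, then clamp it to [minValue, maxValue]". How the agent actually extracts the integer is not documented here; taking the first signed integer found by a regular expression is an assumption made purely for illustration.

```python
import re


def extract_and_clamp(text: str, min_value: int, max_value: int) -> int:
    """Take the first integer in text and clamp it to [min_value, max_value]."""
    if min_value > max_value:
        raise ValueError("min_value must not exceed max_value")
    # Assumption: the first (optionally negative) integer in the text is used.
    match = re.search(r"-?\d+", text)
    if match is None:
        raise ValueError("no integer found in model output")
    value = int(match.group())
    # Clamp: values below the range become min_value, values above become max_value.
    return max(min_value, min(value, max_value))


score = extract_and_clamp("I would rate this 87 out of 100", 0, 10)
```

Here the model's 87 falls outside the requested range, so the caller would receive the upper bound rather than an out-of-range value.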


inferChat

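Multi-turn conversational inference with full message history.

function inferChat(string[] roles, string[] messages, bool chainOfThought) returns (string response)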

Parameters

| Input | Type | Description |
| --- | --- | --- |
| roles | string[] | Array of message roles: "system", "user", or "assistant" |
| messages | string[] | Array of message contents (must match length of roles array) |
| chainOfThought | bool | Whether to enable chain-of-thought reasoning |

Returns

| Output | Type | Description |
| --- | --- | --- |
| response | string | The generated text response |
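Because roles and messages are parallel arrays, it helps to build them from a single list of turns so their lengths cannot drift apart. The Python sketch below shows one way to do that off-chain before submitting the arrays; the helper name `build_chat` and the sample conversation are hypothetical.

```python
VALID_ROLES = {"system", "user", "assistant"}  # roles accepted by inferChat


def build_chat(turns: list[tuple[str, str]]) -> tuple[list[str], list[str]]:
    """Split (role, content) turns into the parallel arrays inferChat expects."""
    roles: list[str] = []
    messages: list[str] = []
    for role, content in turns:
        if role not in VALID_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        # Appending to both lists together keeps them the same length.
        roles.append(role)
        messages.append(content)
    return roles, messages


roles, messages = build_chat([
    ("system", "You are a concise assistant."),
    ("user", "Summarize proposal 12."),
    ("assistant", "Proposal 12 raises the quorum."),
    ("user", "Should I vote yes or no?"),
])
```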


inferToolsChat

Inference with tool use. The LLM can call tools provided by MCP servers (executed in-situ by the agent) and on-chain tools (yielded back to the caller as ABI-encoded calldata). When on-chain tool calls are pending, the function returns the full conversation state so the caller can execute the calls and resume.

Tool Definitions

On-chain tools are defined using Solidity function signature strings, which makes them natural for on-chain callers.

Supported Solidity types in signatures: string, bool, address, uint256 (and other uint/int sizes), bytes, and arrays of these types.
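As an illustration of the supported types, the Python sketch below validates a signature string before it is used as a tool definition. The exact string format the agent accepts is not shown in this section; the sketch assumes the form `name(type [paramName], ...)`, handles only dynamic arrays written with `[]`, and uses a hypothetical helper name `parse_signature`.

```python
import re

SIMPLE_TYPES = {"string", "bool", "address", "bytes"}


def is_supported_base(base: str) -> bool:
    """Check a non-array type against the list of supported Solidity types."""
    if base in SIMPLE_TYPES:
        return True
    match = re.fullmatch(r"u?int(\d*)", base)
    if match is None:
        return False
    bits = match.group(1)
    # Solidity integer widths run from 8 to 256 bits in steps of 8.
    return bits == "" or (int(bits) % 8 == 0 and 8 <= int(bits) <= 256)


def parse_signature(signature: str) -> tuple[str, list[str]]:
    """Return (function name, parameter types) or raise ValueError."""
    match = re.fullmatch(r"\s*([A-Za-z_]\w*)\s*\((.*)\)\s*", signature)
    if match is None:
        raise ValueError(f"not a function signature: {signature!r}")
    name, params = match.group(1), match.group(2).strip()
    types: list[str] = []
    if params:
        for param in params.split(","):
            tokens = param.split()
            if not tokens:
                raise ValueError("empty parameter")
            type_name = tokens[0]  # "address to" -> "address"
            base = type_name
            while base.endswith("[]"):  # strip array suffixes
                base = base[:-2]
            if not is_supported_base(base):
                raise ValueError(f"unsupported type: {type_name!r}")
            types.append(type_name)
    return name, types


parsed = parse_signature("transfer(address to, uint256 amount)")
```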

MCP (Model Context Protocol) tools are discovered automatically — pass the server URLs and the agent fetches the available tools at runtime.

Parameters

| Input | Type | Description |
| --- | --- | --- |
| roles | string[] | Array of message roles (system/user/assistant/tool) |
| messages | string[] | Array of message contents (must match length of roles) |
| mcpServerUrls | string[] | URLs of MCP servers whose tools the LLM may call |
| onchainTools | OnchainTool[] | Tools yielded back to caller as calldata |
| maxIterations | uint256 | Maximum LLM↔tool round-trips before stopping |
| chainOfThought | bool | Whether to enable chain-of-thought reasoning |

Returns

| Output | Type | Description |
| --- | --- | --- |
| finishReason | string | "stop" (LLM done), "tool_calls" (on-chain calls pending), or "max_iterations" (limit reached) |
| response | string | Final text response (populated when finishReason == "stop") |
| updatedRoles | string[] | Full conversation state: roles (for resumption) |
| updatedMessages | string[] | Full conversation state: messages (for resumption) |
| pendingToolCallIds | string[] | IDs of pending on-chain tool calls |
| pendingToolCalls | bytes[] | ABI-encoded calldata for each pending call (selector + args) |

Return Semantics

  • finishReason == "stop": The LLM has finished. response contains the final text. All other outputs are empty arrays. Any MCP tool calls were already executed during processing.

  • finishReason == "tool_calls": The LLM wants to call on-chain tool(s). response is empty. updatedRoles/updatedMessages contain the full conversation history (including any MCP tool results). pendingToolCallIds and pendingToolCalls are parallel arrays — each pendingToolCalls[i] is calldata (4-byte function selector + ABI-encoded arguments).

  • finishReason == "max_iterations": The agent reached maxIterations tool round-trips without the LLM producing a final response.
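The three cases above amount to a small dispatch on finishReason. The Python sketch below mirrors the six outputs in a hypothetical `ToolsChatResult` record and decides what the caller should do next; the action names it returns are illustrative labels, not part of the API.

```python
from dataclasses import dataclass, field


@dataclass
class ToolsChatResult:
    """Hypothetical off-chain mirror of the six inferToolsChat outputs."""
    finish_reason: str
    response: str = ""
    updated_roles: list[str] = field(default_factory=list)
    updated_messages: list[str] = field(default_factory=list)
    pending_tool_call_ids: list[str] = field(default_factory=list)
    pending_tool_calls: list[bytes] = field(default_factory=list)


def next_action(result: ToolsChatResult) -> str:
    """Decide the caller's next step from finishReason."""
    if result.finish_reason == "stop":
        return "use_response"  # response holds the final text
    if result.finish_reason == "tool_calls":
        # IDs and calldata are parallel arrays, as are roles and messages.
        if len(result.pending_tool_call_ids) != len(result.pending_tool_calls):
            raise ValueError("pending IDs and calldata differ in length")
        if len(result.updated_roles) != len(result.updated_messages):
            raise ValueError("roles and messages differ in length")
        return "execute_and_resume"
    if result.finish_reason == "max_iterations":
        return "handle_limit"  # no final answer within maxIterations
    raise ValueError(f"unknown finishReason: {result.finish_reason!r}")


done = next_action(ToolsChatResult(finish_reason="stop", response="All clear."))
```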

Tool Use Flow

MCP Tools (Automatic)

MCP tools are executed automatically by the agent. The LLM calls the tool, the agent forwards the call to the MCP server, feeds the result back to the LLM, and continues until done.
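The loop described above can be modelled with a toy simulation. This Python sketch is not the agent's code: the scripted "LLM", the `get_price` tool, and its return value are all hypothetical stand-ins, and the way round-trips are counted against maxIterations is an assumption.

```python
def run_tool_loop(llm_step, tools, messages, max_iterations):
    """Toy model of the agent loop.

    llm_step(messages) returns ("final", text) or ("call", tool_name, argument).
    """
    for _ in range(max_iterations):
        step = llm_step(messages)
        if step[0] == "final":
            return "stop", step[1]
        _, name, argument = step
        # Forward the call to the tool and feed the result back to the LLM.
        result = tools[name](argument)
        messages.append(("tool", result))
    # Assumption: the budget is spent once max_iterations tool calls have run.
    return "max_iterations", ""


def scripted_llm(messages):
    """Stand-in LLM: asks for a price once, then answers using the result."""
    tool_results = [content for role, content in messages if role == "tool"]
    if not tool_results:
        return ("call", "get_price", "ETH")
    return ("final", f"ETH trades at {tool_results[-1]}")


tools = {"get_price": lambda symbol: "3000 USD"}  # hypothetical MCP tool

outcome = run_tool_loop(scripted_llm, tools, [("user", "ETH price?")], 5)
limited = run_tool_loop(scripted_llm, tools, [("user", "ETH price?")], 1)
```

With a budget of five iterations the scripted model finishes after one tool call; with a budget of one it runs out before producing a final answer.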


On-Chain Tools (Yield & Resume)

On-chain tools are yielded back to the caller as ABI-encoded calldata. The caller executes them, then resumes the conversation with the results.
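Each yielded call is a 4-byte function selector followed by ABI-encoded arguments. As a rough off-chain illustration in Python, the sketch below splits such calldata into its selector and 32-byte argument words. The sample payload is built by hand for the well-known ERC-20 `transfer(address,uint256)` selector `0xa9059cbb`; the recipient address and amount are made-up values.

```python
def split_calldata(calldata: bytes) -> tuple[bytes, list[bytes]]:
    """Split calldata into its 4-byte selector and 32-byte argument words."""
    if len(calldata) < 4:
        raise ValueError("calldata shorter than a function selector")
    selector, args = calldata[:4], calldata[4:]
    # Standard ABI encoding pads every argument slot to 32 bytes.
    if len(args) % 32 != 0:
        raise ValueError("argument data is not a multiple of 32 bytes")
    words = [args[i:i + 32] for i in range(0, len(args), 32)]
    return selector, words


# Hand-built example: transfer(address,uint256) with made-up arguments.
recipient = bytes.fromhex("11" * 20)
calldata = (
    bytes.fromhex("a9059cbb")        # selector of transfer(address,uint256)
    + recipient.rjust(32, b"\x00")   # address, left-padded to 32 bytes
    + (1000).to_bytes(32, "big")     # uint256 amount
)

selector, words = split_calldata(calldata)
decoded_recipient = words[0][12:]             # last 20 bytes of the first word
decoded_amount = int.from_bytes(words[1], "big")
```

On-chain, the caller would normally pass the calldata unchanged to a low-level call on the target contract rather than decode it.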


Resuming After On-Chain Tool Calls

When finishReason == "tool_calls", the caller should:

  1. Execute each on-chain call using the calldata in pendingToolCalls

  2. Append the results to the conversation state:

    • For each pending call, add role: "tool" with a JSON message {"tool_call_id": pendingToolCallIds[i], "content": "result string"}

  3. Call inferToolsChat again with the updated conversation
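Step 2 can be sketched in Python as follows: for each pending call, append a "tool" role and a JSON message carrying the call ID and the result string. The helper name `append_tool_results` and the sample IDs and results are hypothetical; only the JSON shape comes from the description above.

```python
import json


def append_tool_results(roles, messages, call_ids, results):
    """Return new roles/messages arrays with one "tool" entry per pending call."""
    if len(call_ids) != len(results):
        raise ValueError("need exactly one result per pending tool call")
    new_roles, new_messages = list(roles), list(messages)  # leave inputs untouched
    for call_id, result in zip(call_ids, results):
        new_roles.append("tool")
        new_messages.append(
            json.dumps({"tool_call_id": call_id, "content": result})
        )
    return new_roles, new_messages


resumed_roles, resumed_messages = append_tool_results(
    ["user", "assistant"],            # updatedRoles from the previous call
    ["Send 5 tokens to Bob", ""],     # updatedMessages from the previous call
    ["call_1"],                       # pendingToolCallIds
    ["true"],                         # result of executing pendingToolCalls[0]
)
```

The returned arrays are what the caller would pass as roles and messages in the next inferToolsChat call.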

Deterministic Execution

Because the models run deterministically across all validating nodes, consensus can be achieved on the output, making AI results trustworthy for on-chain use.


Example Use Cases

  • Analyzing text content for moderation decisions

  • Generating summaries of on-chain data

  • Making classification decisions (sentiment, category, etc.)

  • Creating dynamic NFT descriptions or game narratives

  • Extracting numeric scores or ratings from text

  • Agentic DeFi: LLM decides which swaps/transfers to execute and returns calldata

  • AI oracles with tool use: LLM fetches external data via MCP servers before responding

Usage Examples

Simple Inference

Numeric Inference

Conversational Example

MCP Tool Use (Solidity)

The LLM automatically discovers and calls tools from the MCP server, gets the result, and incorporates it into its response.

On-Chain Tool Use with Yield & Resume (Solidity)

The LLM returns calldata for on-chain tools. Your contract executes them and resumes the conversation.
