# LLM Monitoring
Sentry auto-generates LLM Monitoring data for common providers in Python, but you may need to manually annotate spans for other frameworks.
## Span conventions

### Span Operations
| Span OP | Description |
|---|---|
| `ai.pipeline.*` | The top-level span, corresponding to one or more AI operations and helper functions |
| `ai.run.*` | A unit of work: a tool call, LLM execution, or helper method |
| `ai.chat_completions.*` | An LLM chat operation |
| `ai.embeddings.*` | An LLM embedding creation operation |
### Span Data
| Attribute | Type | Description | Examples | Notes |
|---|---|---|---|---|
| `ai.input_messages` | string | The input messages sent to the model | `[{"role": "user", "content": "hello"}]` | |
| `ai.completion_tokens.used` | int | The number of tokens used to respond to the message | `10` | required for cost calculation |
| `ai.prompt_tokens.used` | int | The number of tokens used to process just the prompt | `20` | required for cost calculation |
| `ai.total_tokens.used` | int | The total number of tokens used (prompt plus completion) | `30` | required for charts and cost calculation |
| `ai.model_id` | string | The vendor-specific ID of the model used | `"gpt-4"` | required for cost calculation |
| `ai.streaming` | boolean | Whether the request was streamed back | `true` | |
| `ai.responses` | list | The response messages sent back by the AI model | `["hello", "world"]` | |
| `ai.pipeline.name` | string | The description of the parent `ai.pipeline` span | `"My AI pipeline"` | required for charts |
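
For providers and frameworks without automatic support, you can attach these attributes to a span yourself. Below is a minimal sketch using the Python SDK's `start_span` and `set_data`; the span `op` suffix, the model name, and the token counts are illustrative assumptions, not values from a real provider:

```python
import sentry_sdk

with sentry_sdk.start_span(
    op="ai.chat_completions.custom", description="Custom LLM Chat Completion"
) as span:
    # response = my_provider.chat(...)  # hypothetical, unsupported provider
    span.set_data("ai.input_messages", [{"role": "user", "content": "hello"}])
    span.set_data("ai.model_id", "my-model-v1")
    span.set_data("ai.streaming", False)
    span.set_data("ai.responses", ["hi there"])
    # Token counts are required for cost calculation and the token charts.
    span.set_data("ai.prompt_tokens.used", 20)
    span.set_data("ai.completion_tokens.used", 10)
    span.set_data("ai.total_tokens.used", 30)
```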
## Instrumentation
When you create a new AI pipeline, the SDK automatically creates spans that instrument both the pipeline and its AI operations.
### Example
```python
import sentry_sdk
from sentry_sdk.ai.monitoring import ai_track
from openai import OpenAI

sentry_sdk.init(...)

openai = OpenAI()

@ai_track(description="My AI pipeline")
def invoke_pipeline():
    result = openai.chat.completions.create(
        model="some-model", messages=[{"role": "system", "content": "hello"}]
    ).choices[0].message.content

    return openai.chat.completions.create(
        model="some-model", messages=[{"role": "system", "content": result}]
    ).choices[0].message.content
```
This should result in the following spans.
```
<span op:"ai.pipeline" description:"My AI pipeline">
  <span op:"ai.chat_completions.openai" description:"OpenAI Chat Completion" data[ai.total_tokens.used]:15 data[ai.pipeline.name]:"My AI pipeline" />
  <span op:"ai.chat_completions.openai" description:"OpenAI Chat Completion" data[ai.total_tokens.used]:20 data[ai.pipeline.name]:"My AI pipeline" />
</span>
```
Notice that the `ai.pipeline.name` data attribute of the child spans matches the description of the parent `ai.pipeline.*` span.
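
For a provider the SDK doesn't auto-instrument, the same decorator can be combined with the `record_token_usage` helper from `sentry_sdk.ai.monitoring` to populate the token attributes on the current span. A sketch, assuming a recent SDK where `sentry_sdk.get_current_span()` is available and a hypothetical `my_llm_client` whose response exposes OpenAI-style `usage` fields:

```python
import sentry_sdk
from sentry_sdk.ai.monitoring import ai_track, record_token_usage

@ai_track(description="My custom LLM call")
def call_my_llm(prompt):
    # Hypothetical client for a provider without automatic support.
    response = my_llm_client.generate(prompt)

    # Record token counts on the span created by @ai_track so that
    # cost calculation and token charts work.
    span = sentry_sdk.get_current_span()
    if span is not None:
        record_token_usage(
            span,
            prompt_tokens=response.usage.prompt_tokens,
            completion_tokens=response.usage.completion_tokens,
            total_tokens=response.usage.total_tokens,
        )
    return response.text
```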