Phoenix

Mar 2025

Trial

Phoenix is an open-source observability tool designed for experimentation, evaluation, and troubleshooting of AI and LLM applications.

Phoenix works with OpenTelemetry and OpenInference instrumentation. See Integrations: Tracing for details.
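To make the tracing model more concrete, here is a library-free sketch of the kind of data a trace carries: nested spans, each with a name, a parent link, timing, and attributes. It is a toy recorder, not the OpenTelemetry SDK or any Phoenix API. The `span` helper, the `answer` pipeline, the model name, and every attribute key are assumptions made for illustration. A real application would use OpenTelemetry and OpenInference instrumentation instead.

```python
import time
import uuid
from contextlib import contextmanager

# Toy stand-in for a tracer: NOT the OpenTelemetry or Phoenix API.
finished_spans = []   # spans are appended here when they end
_active = []          # stack of currently open spans


@contextmanager
def span(name, **attributes):
    """Record one unit of work, linked to whichever span is currently open."""
    record = {
        "name": name,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": _active[-1]["span_id"] if _active else None,
        "attributes": dict(attributes),
        "start": time.time(),
    }
    _active.append(record)
    try:
        yield record
    finally:
        _active.pop()
        record["end"] = time.time()
        finished_spans.append(record)


def answer(question):
    # Hypothetical two-step pipeline: retrieve context, then call a model.
    with span("answer", **{"input.value": question}) as root:
        with span("retrieve") as r:
            docs = ["Phoenix is an observability tool."]
            r["attributes"]["retrieval.count"] = len(docs)  # illustrative key
        with span("llm", **{"llm.model_name": "placeholder-model"}) as s:
            reply = f"Based on {len(docs)} document(s): {docs[0]}"
            s["attributes"]["output.value"] = reply
        root["attributes"]["output.value"] = reply
    return reply


answer("What is Phoenix?")
for s in finished_spans:
    print(s["name"], "parent:", s["parent_id"])
```

Child spans finish before their parent, so the root `answer` span is recorded last. The parent links are what let a trace viewer rebuild the call tree.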

Features

Tracing: Trace your LLM application's runtime using OpenTelemetry-based instrumentation.
Evaluation: Use LLMs to benchmark your application's performance with response and retrieval evaluations (evals).
Datasets: Create versioned datasets of examples for experimentation, evaluation, and fine-tuning.
Experiments: Track and evaluate changes to prompts, LLMs, and retrieval.
Playground: Optimize prompts, compare models, adjust parameters, and replay traced LLM calls.
Prompt Management: Manage and test prompt changes systematically using version control, tagging, and experimentation.
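
The dataset, experiment, and evaluation features listed above fit together roughly as follows. This is a minimal sketch using only the Python standard library, not Phoenix's API. The example records, the two `task_*` functions (stand-ins for two prompt or model variants), and the `exact_match` evaluator are all hypothetical.

```python
from statistics import mean

# Hypothetical dataset: each example pairs an input with a reference answer.
dataset = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "3 * 3", "expected": "9"},
    {"input": "10 - 7", "expected": "3"},
]


def task_v1(text):
    # Stand-in for an LLM call with the original prompt: only handles addition.
    a, op, b = text.split()
    return str(int(a) + int(b)) if op == "+" else "unknown"


def task_v2(text):
    # Stand-in for the revised prompt: handles three operators.
    a, op, b = text.split()
    ops = {"+": int(a) + int(b), "-": int(a) - int(b), "*": int(a) * int(b)}
    return str(ops.get(op, "unknown"))


def exact_match(output, expected):
    # Simple code-based evaluator; an LLM-based eval would slot in here instead.
    return 1.0 if output == expected else 0.0


def run_experiment(task, examples):
    # Run the task on every example and average the evaluator scores.
    scores = [exact_match(task(ex["input"]), ex["expected"]) for ex in examples]
    return mean(scores)


results = {name: run_experiment(task, dataset)
           for name, task in [("v1", task_v1), ("v2", task_v2)]}
print(results)
```

Holding the dataset fixed while swapping the task is what makes the two scores comparable. Versioning the dataset serves the same purpose across runs made at different times.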