
Query Data via SDKs

Langfuse is open-source and all data tracked with Langfuse is accessible programmatically. The SDKs wrap the Public API so you can query data without writing raw HTTP requests.

Common use cases:

  • Collect production examples for few-shot prompts.
  • Export traces and generations for fine-tuning.
  • Build custom dashboards or downstream pipelines.
  • Programmatically create datasets from production data.

If you are new to Langfuse, familiarize yourself with the data model first.

New data is typically available for querying within 15-30 seconds of ingestion, though processing times may vary. Check status.langfuse.com if you encounter any issues.

Langfuse provides high-performance endpoints optimized for querying large datasets with cursor-based pagination and flexible field selection. These are the recommended defaults.

Resource     | Recommended (high-performance) | Legacy
------------ | ------------------------------ | ------
Observations | api.observations               | api.legacy.observations_v1 / observationsV1
Scores       | api.scores                     | api.legacy.score_v1 / scoreV1
Metrics      | api.metrics                    | api.legacy.metrics_v1 / metricsV1
Traces       | api.trace (unchanged)          | —
⚠️ From Python SDK v4 and JS/TS SDK v5 onward, the high-performance endpoints are the defaults:

  • api.observations (formerly api.observations_v_2 / api.observationsV2)
  • api.scores (formerly api.score_v_2 / api.scoreV2)
  • api.metrics (formerly api.metrics_v_2 / api.metricsV2)

The old v2 aliases were removed in these major releases. Legacy v1 endpoints are now available under api.legacy.*.

If you need aggregated metrics (counts, costs, usage) rather than individual entities, consider the Metrics API which is optimized for aggregate queries and has higher rate limits.

Both SDKs authenticate via the LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_HOST environment variables. The api namespace is auto-generated from the Public API (OpenAPI spec). Method names mirror the REST resources and support filters and pagination.
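For example, the variables can be set in the shell or, for quick scripts, directly in Python before creating the client. The key values below are placeholders — substitute your project's keys from the Langfuse UI:

```python
import os

# Placeholder credentials — replace with your project's keys.
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com"  # or your self-hosted URL

# The client picks up these variables automatically:
# from langfuse import get_client
# langfuse = get_client()
```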


Observations (recommended)

The observations endpoint is the primary and recommended way to query LLM call data from Langfuse. It covers all observation types — generations, spans, events, and more — with cursor-based pagination and selectable field groups.

Field selection

By default, the core and basic field groups are returned. Request additional groups via the fields parameter to include cost, token usage, model info, inputs/outputs, and more.

Field group | Fields included
----------- | ---------------
core        | id, traceId, startTime, endTime, projectId, parentObservationId, type
basic       | name, level, statusMessage, version, environment, bookmarked, public, userId, sessionId
time        | completionStartTime, createdAt, updatedAt
io          | input, output
metadata    | metadata (truncated to 200 chars by default)
model       | providedModelName, internalModelId, modelParameters
usage       | usageDetails, costDetails, totalCost
prompt      | promptId, promptName, promptVersion
metrics     | latency, timeToFirstToken

Advanced filtering with the filter parameter

For complex queries, use the structured filter parameter. It accepts a JSON string with an array of filter conditions. When provided, it takes precedence over individual query parameter filters.

Each filter condition has the following structure:

{
  "type": "string",
  "column": "name",
  "operator": "=",
  "value": "chat-completion"
}

Supported filter types and operators:

Type          | Operators
------------- | ---------
string        | =, contains, does not contain, starts with, ends with
number        | =, >, <, >=, <=
datetime      | >, <, >=, <=
stringOptions | any of, none of
arrayOptions  | any of, none of, all of
stringObject  | =, contains, does not contain, starts with, ends with (requires key)
numberObject  | =, >, <, >=, <= (requires key)
boolean       | =, <>
null          | is null, is not null

Available columns:

  • Core fields: id, type, name, traceId, startTime, endTime, environment, level, statusMessage, version, userId, sessionId
  • Trace fields: traceName, traceTags / tags
  • Performance: latency, timeToFirstToken, tokensPerSecond
  • Token usage: inputTokens, outputTokens, totalTokens
  • Cost: inputCost, outputCost, totalCost
  • Model: model (alias: providedModelName), promptName, promptVersion
  • Metadata: metadata (use with stringObject, numberObject, or categoryOptions type and a key)

All conditions in the filter array are combined with AND logic.
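Because conditions are always combined with AND, a small helper can keep filter construction readable. The build_filter function below is purely illustrative — it is not part of the SDK:

```python
import json

def build_filter(*conditions: dict) -> str:
    """Serialize filter conditions into the JSON string the API expects.

    All conditions in the array are combined with AND logic server-side.
    """
    return json.dumps(list(conditions))

filter_param = build_filter(
    {"type": "string", "column": "type", "operator": "=", "value": "GENERATION"},
    {"type": "number", "column": "totalCost", "operator": ">=", "value": 0.10},
)
```

The resulting string can then be passed as the filter argument to get_many.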

Examples

pip install langfuse

from langfuse import get_client
 
langfuse = get_client()

Basic query

observations = langfuse.api.observations.get_many(limit=50)
 
for obs in observations.data:
    print(obs["id"], obs["type"], obs["name"])

Field selection

observations = langfuse.api.observations.get_many(
    fields="core,basic,usage,model,metrics",
    limit=50,
)
 
for obs in observations.data:
    print(obs["name"], obs.get("providedModelName"), obs.get("totalCost"), obs.get("latency"))

To retrieve full (non-truncated) metadata values for specific keys, use expand_metadata:

observations = langfuse.api.observations.get_many(
    fields="core,metadata",
    expand_metadata="system_prompt,config",
    limit=50,
)

Filter by observation type

Common types: GENERATION, SPAN, EVENT, AGENT, TOOL.

generations = langfuse.api.observations.get_many(
    type="GENERATION",
    fields="core,basic,usage,model",
    limit=100,
)

Filter by name

observations = langfuse.api.observations.get_many(
    name="chat-completion",
    type="GENERATION",
    limit=50,
)

Filter by trace

observations = langfuse.api.observations.get_many(
    trace_id="trace-abc-123",
    fields="core,basic,io,usage",
)

Filter by time range

from datetime import datetime, timedelta, timezone
 
observations = langfuse.api.observations.get_many(
    from_start_time=datetime.now(timezone.utc) - timedelta(hours=24),
    to_start_time=datetime.now(timezone.utc),
    type="GENERATION",
    fields="core,basic,usage",
    limit=100,
)

Filter by environment

observations = langfuse.api.observations.get_many(
    environment=["production", "staging"],
    type="GENERATION",
    limit=100,
)

Filter by trace tags

import json
 
observations = langfuse.api.observations.get_many(
    filter=json.dumps([
        {
            "type": "arrayOptions",
            "column": "traceTags",
            "operator": "any of",
            "value": ["production", "high-priority"]
        }
    ]),
    fields="core,basic,usage",
    limit=100,
)

Filter by metadata

Metadata filters use stringObject or numberObject types with a key to target specific metadata fields.

import json
 
observations = langfuse.api.observations.get_many(
    filter=json.dumps([
        {
            "type": "stringObject",
            "column": "metadata",
            "key": "customer_tier",
            "operator": "=",
            "value": "enterprise"
        }
    ]),
    fields="core,basic,metadata",
    expand_metadata="customer_tier",
    limit=100,
)

Filter by cost or latency

import json
 
expensive_slow_generations = langfuse.api.observations.get_many(
    filter=json.dumps([
        {"type": "string", "column": "type", "operator": "=", "value": "GENERATION"},
        {"type": "number", "column": "totalCost", "operator": ">=", "value": 0.10},
        {"type": "number", "column": "latency", "operator": ">=", "value": 5.0},
    ]),
    fields="core,basic,usage,model,metrics",
    limit=100,
)

Filter by model

import json
 
observations = langfuse.api.observations.get_many(
    filter=json.dumps([
        {"type": "string", "column": "model", "operator": "=", "value": "gpt-4o"},
    ]),
    fields="core,basic,usage,model",
    limit=100,
)

Combine multiple filters

import json
from datetime import datetime, timedelta, timezone
 
observations = langfuse.api.observations.get_many(
    filter=json.dumps([
        {"type": "string", "column": "type", "operator": "=", "value": "GENERATION"},
        {"type": "string", "column": "model", "operator": "starts with", "value": "gpt-4"},
        {"type": "arrayOptions", "column": "traceTags", "operator": "any of", "value": ["production"]},
        {"type": "stringObject", "column": "metadata", "key": "feature", "operator": "=", "value": "chat"},
        {
            "type": "datetime",
            "column": "startTime",
            "operator": ">=",
            "value": (datetime.now(timezone.utc) - timedelta(days=7)).isoformat()
        },
    ]),
    fields="core,basic,usage,model,metadata,metrics",
    limit=100,
)

Cursor-based pagination

The cursor is returned in response.meta.cursor and should be passed in subsequent requests to retrieve the next page.

all_observations = []
cursor = None
 
while True:
    response = langfuse.api.observations.get_many(
        type="GENERATION",
        fields="core,basic,usage",
        limit=500,
        cursor=cursor,
    )
    all_observations.extend(response.data)
 
    cursor = response.meta.cursor
    if not cursor:
        break
 
print(f"Fetched {len(all_observations)} observations")
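The same loop can be wrapped in a reusable generator so that callers iterate lazily instead of accumulating everything in memory. This sketch is not part of the SDK; it only assumes the response shape documented above (.data plus .meta.cursor):

```python
from typing import Any, Callable, Iterator, Optional

def iter_all(fetch: Callable[[Optional[str]], Any]) -> Iterator[Any]:
    """Yield items across all pages.

    `fetch` takes the current cursor (None for the first page) and returns a
    response exposing `.data` (the page's items) and `.meta.cursor` (the
    next-page token, falsy when exhausted).
    """
    cursor = None
    while True:
        response = fetch(cursor)
        yield from response.data
        cursor = response.meta.cursor
        if not cursor:
            break
```

For example: for obs in iter_all(lambda c: langfuse.api.observations.get_many(type="GENERATION", limit=500, cursor=c)): ...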

Async usage

All endpoints are also available as async via async_api:

observations = await langfuse.async_api.observations.get_many(
    type="GENERATION",
    fields="core,basic,usage",
    limit=100,
)

Other endpoints

Traces

traces = langfuse.api.trace.list(
    limit=100,
    user_id="user_123",
    tags=["production"],
)
 
trace = langfuse.api.trace.get("trace-id")

The trace list endpoint supports filtering by user_id, session_id, name, tags, version, release, environment, from_timestamp/to_timestamp, and the structured filter parameter.

Fetching a single trace via trace.get() returns the trace with all its observations included. For querying observations at scale, prefer the observations endpoint instead.

Scores

scores = langfuse.api.scores.get_many(
    trace_id="trace-id",
    name="relevance",
    limit=100,
)
 
score = langfuse.api.scores.get_by_id("score-id")

The scores endpoint supports filtering by user_id, name, source, data_type, trace_id, observation_id, session_id, config_id, queue_id, trace_tags, environment, from_timestamp/to_timestamp, and the structured filter parameter.

Sessions

sessions = langfuse.api.sessions.list(limit=50)
session = langfuse.api.sessions.get("session-id")

Datasets

datasets = langfuse.api.datasets.list()
dataset = langfuse.api.datasets.get("dataset-name")
runs = langfuse.api.datasets.get_runs("dataset-name")

Prompts

See the prompt management documentation for details on fetching prompts.

Metrics

import json
 
result = langfuse.api.metrics.get(
    query=json.dumps({
        "view": "observations",
        "metrics": [{"measure": "totalCost", "aggregation": "sum"}],
        "dimensions": [{"field": "providedModelName"}],
        "filters": [],
        "fromTimestamp": "2025-05-01T00:00:00Z",
        "toTimestamp": "2025-05-31T00:00:00Z",
    })
)

See the Metrics API documentation for full details.


Legacy endpoints

The legacy v1 endpoints use page-based pagination and return a fixed set of fields. They remain available under api.legacy.* but are not recommended for new integrations.

legacy_observations = langfuse.api.legacy.observations_v1.get_many(
    trace_id="trace-id",
    type="GENERATION",
    page=1,
    limit=100,
)
legacy_observation = langfuse.api.legacy.observations_v1.get("observation-id")
 
legacy_scores = langfuse.api.legacy.score_v1.get(score_ids="score-id")

  • Public API reference — full OpenAPI spec for all endpoints.
  • Metrics API — optimized endpoint for aggregate queries (counts, costs, usage).
  • Blob Storage Export — scheduled data sync to S3, GCS, or Azure for large-scale exports.
  • Export from UI — download filtered data directly from the Langfuse dashboard.
  • Data Model — understand traces, observations, and scores.