Context compaction is currently behind the ai-context-compaction feature flag and applies to web-app threads only. Slack threads are not compacted yet.
What it does
When a new user message would push the thread over the model’s safe context window, Lightdash:
- Picks the earlier messages that have not been compacted yet.
- Sends them, along with any previous compaction summary, to a fast model that produces a structured markdown summary covering goals, constraints, progress, decisions, next steps, and critical context.
- Stores the compaction against the thread and uses the summary in place of the original messages for future responses in that thread.
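The three steps above can be sketched as follows. This is a minimal illustration, not Lightdash's implementation: the types `ThreadMessage`, `Compaction`, and `Summarizer` and the function `compactThread` are hypothetical names, the summarizer is a synchronous stand-in for the call to the fast model, and the sketch assumes the last message in the array is the new prompt that stays verbatim.

```typescript
// Hypothetical types for illustration; none of these names come from Lightdash.
type ThreadMessage = { id: string; text: string; compacted: boolean };
type Compaction = { summary: string; compactedMessageIds: string[] };
type Summarizer = (input: {
    previousSummary: string | null;
    messages: ThreadMessage[];
}) => string;

function compactThread(
    messages: ThreadMessage[],
    previous: Compaction | null,
    summarize: Summarizer, // stand-in for the fast-model call
): { compaction: Compaction; promptContext: string[] } {
    // Assumption: the last message is the new prompt and is kept verbatim.
    const earlier = messages.slice(0, -1);
    const latest = messages.slice(-1);

    // Step 1: pick the earlier messages that have not been compacted yet.
    const toCompact = earlier.filter((m) => !m.compacted);

    // Step 2: summarize them together with any previous compaction summary.
    const summary = summarize({
        previousSummary: previous?.summary ?? null,
        messages: toCompact,
    });

    // Step 3: record the compaction and use the summary in place of the
    // original messages when building context for future responses.
    const compaction: Compaction = {
        summary,
        compactedMessageIds: [
            ...(previous?.compactedMessageIds ?? []),
            ...toCompact.map((m) => m.id),
        ],
    };
    return { compaction, promptContext: [summary, ...latest.map((m) => m.text)] };
}
```

Passing the previous summary into the summarizer is what lets repeated compactions carry context forward without re-reading messages that were already summarized.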
When it runs
Compaction is triggered when all of the following are true:
- The ai-context-compaction feature flag is enabled for the user’s organization.
- The thread is a web-app thread (Slack threads are skipped).
- The previous reply’s total token usage is greater than the context window of the model in use minus a reserve of 16,384 tokens.
- The active model exposes a known context window. Azure and OpenRouter providers are skipped because their context windows are not declared in Lightdash.
- The triggering prompt has not already been compacted.
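As a rough sketch of how the conditions above combine, the check below returns true only when every condition holds. The `CompactionCheck` type and `shouldCompact` function are hypothetical names for illustration, and the 200,000-token context window in the usage lines is an arbitrary example value, not a figure from Lightdash.

```typescript
// Reserve subtracted from the model's context window, as described above.
const RESERVE_TOKENS = 16_384;

// Hypothetical input shape; field names are assumptions for illustration.
type CompactionCheck = {
    flagEnabled: boolean; // ai-context-compaction enabled for the organization
    isWebAppThread: boolean; // false for Slack threads
    previousReplyTotalTokens: number; // total token usage of the previous reply
    contextWindow: number | undefined; // undefined when not declared (Azure, OpenRouter)
    promptAlreadyCompacted: boolean;
};

function shouldCompact(c: CompactionCheck): boolean {
    if (!c.flagEnabled || !c.isWebAppThread || c.promptAlreadyCompacted) {
        return false;
    }
    // Without a known context window there is no threshold to compare against.
    if (c.contextWindow === undefined) {
        return false;
    }
    // Strictly greater than: usage exactly at the threshold does not trigger.
    return c.previousReplyTotalTokens > c.contextWindow - RESERVE_TOKENS;
}

// Illustrative usage with an example 200,000-token window:
// the threshold is 200,000 - 16,384 = 183,616 tokens.
const base: CompactionCheck = {
    flagEnabled: true,
    isWebAppThread: true,
    previousReplyTotalTokens: 183_617,
    contextWindow: 200_000,
    promptAlreadyCompacted: false,
};
console.log(shouldCompact(base)); // true
console.log(shouldCompact({ ...base, previousReplyTotalTokens: 183_616 })); // false
```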
Enabling compaction
Compaction is gated by a feature flag. To turn it on across a self-hosted instance, add ai-context-compaction to the LIGHTDASH_ENABLE_FEATURE_FLAGS environment variable.
Supported models
Compaction relies on the model’s declared context window, so it is only available for the OpenAI, Anthropic, and Bedrock provider presets shipped with Lightdash. Azure deployments and OpenRouter custom models are skipped; those threads keep working but are not compacted. The summary itself is always generated with the provider’s fast model preset (for example, gpt-5-mini for OpenAI or claude-haiku-4-5 for Anthropic and Bedrock) to keep latency and cost low.
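One way to picture the provider support described above is a lookup from provider to summary model. This is only a sketch: the `Provider` type, the `FAST_SUMMARY_MODEL` table, and `summaryModelFor` are hypothetical names, and the model names are the examples given above rather than a definitive list of presets.

```typescript
// Hypothetical provider identifiers for illustration.
type Provider = "openai" | "anthropic" | "bedrock" | "azure" | "openrouter";

// Example fast models per supported provider, taken from the text above.
// Azure and OpenRouter are deliberately absent because they are skipped.
const FAST_SUMMARY_MODEL: Partial<Record<Provider, string>> = {
    openai: "gpt-5-mini",
    anthropic: "claude-haiku-4-5",
    bedrock: "claude-haiku-4-5",
};

// Returns the model used for the summary, or null when the provider is skipped.
function summaryModelFor(provider: Provider): string | null {
    return FAST_SUMMARY_MODEL[provider] ?? null;
}
```

A null result here corresponds to a thread that keeps working without compaction.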
What you’ll notice as a user
- Conversations can keep going on long-running threads without hitting context limits.
- The agent’s recall of very early messages becomes a structured summary rather than the full text, so highly specific phrasing from early turns may be condensed.
- Pinned context, decisions, and explicit user preferences are preserved across compactions.