New features, improvements, and fixes for Recall MCP.
Track tasks across sessions with automatic context injection
Recall now includes a built-in to-do list that persists across sessions and integrates directly into your workflow. Claude can create, track, defer, and complete tasks — and pending to-dos are automatically injected into every session start so nothing falls through the cracks.
Persists across sessions
To-dos are stored in Redis alongside your memories. They survive session restarts and context compaction.
Auto-injected at session start
Pending to-dos appear in auto_session_start context with priority indicators, so Claude always knows what's outstanding.
Priority & defer tracking
Four priority levels (low, medium, high, urgent) and defer counting to identify repeatedly postponed items.
Token-budget aware
Context injection is hard-capped at 300 tokens. To-dos are formatted compactly to minimize per-session cost.
The todo_list tool uses action dispatch (like workflow) to keep the tool count minimal:
create: Create a new to-do with title, priority, and tags
list: List all to-dos, optionally filtered by status
get: Get details of a specific to-do
update: Update title, priority, status, or tags
complete: Mark a to-do as completed
defer: Defer a to-do (tracks defer count)
delete: Remove a to-do permanently
context: Get a token-budgeted summary of pending to-dos

Just tell Claude:
"Create a to-do to refactor the auth middleware — high priority"
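The action-dispatch pattern can be sketched as a single entry point that fans out to per-action handlers. This is a minimal illustration, not Recall's implementation: the handler bodies and in-memory store are hypothetical, and the real tool persists to Redis.

```javascript
// Minimal sketch of action dispatch: one tool, many actions.
// Handlers and the in-memory store are hypothetical illustrations.
const todos = new Map();
let nextId = 1;

const handlers = {
  create({ title, priority = "medium", tags = [] }) {
    const todo = { id: nextId++, title, priority, tags, status: "pending", deferCount: 0 };
    todos.set(todo.id, todo);
    return todo;
  },
  list({ status } = {}) {
    const all = [...todos.values()];
    return status ? all.filter((t) => t.status === status) : all;
  },
  complete({ id }) {
    const todo = todos.get(id);
    todo.status = "completed";
    return todo;
  },
  defer({ id }) {
    const todo = todos.get(id);
    todo.deferCount += 1; // repeatedly postponed items surface via this count
    return todo;
  },
};

// Single entry point, in the spirit of todo_list({ action, ...args })
function todoList({ action, ...args }) {
  const handler = handlers[action];
  if (!handler) throw new Error(`Unknown action: ${action}`);
  return handler(args);
}
```

A call like `todoList({ action: "create", title: "Refactor auth middleware", priority: "high" })` then maps to the `create` handler, which is how one tool keeps the tool count minimal while covering eight operations.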
Zero-config installation via the official plugin directory
Recall is now available as an official Claude Code plugin. Instead of manually editing .mcp.json and configuring skills, you can install with a single command and get everything pre-configured.
Install
/install recall
~200-300 fewer tokens per session start
Tool definitions have been streamlined to reduce per-session token cost. Redundant property descriptions were removed from self-documenting fields, and Claude Code's Tool Search (lazy-loading) is now enabled so tools are loaded on-demand instead of all at once.
Before: ~1,200 tokens per ListTools
After: ~900 tokens per ListTools
Run Recall on your own infrastructure with a single command
Recall is now available as a self-hosted Docker image. Teams with strict data residency requirements, air-gapped environments, or simply a preference for running their own stack can deploy a fully-featured Recall instance in under two minutes.
Unlimited everything
No memory caps, no workspace limits, no webhook quotas. Your server, your rules.
License-key activation
Activate with a license key from recallmcp.com. Works air-gapped with a 7-day grace window for offline environments.
Full data sovereignty
Memories never leave your infrastructure. No telemetry, no cloud dependency — just Redis + Recall running inside Docker.
One-command Docker install
curl -fsSL https://install.recallmcp.com | bash detects Docker, pulls the image, configures Redis, and prints your MCP config.
Install
curl -fsSL https://install.recallmcp.com | bash
7 days of Pro, no card required
New accounts can now activate a 7-day Pro trial from the billing dashboard — no payment method required. Trial accounts get the full Pro feature set: 5,000 memories, 3 workspaces, cross-session workflows, webhooks, and event history.
Simpler tiers, lower entry price
Pro dropped from $9.99 to $5/mo, and Team dropped from $19.99 to $15/mo. Existing subscribers are grandfathered on their current rate — no action needed.
Free: $0/mo, 500 memories
Pro: $5/mo, 5,000 memories
Team: $15/mo, 25,000 memories
Push memory events to any HTTP endpoint in real time
Recall can now call your own HTTP endpoint whenever a memory is created, updated, deleted, or when a session summary is written. Webhooks unlock integrations that were previously impossible — trigger CI runs, sync to Notion, fire Slack alerts, or feed a second AI agent whenever context changes.
HMAC signature verification
Every webhook request is signed with your secret. Verify the X-Recall-Signature header on your server to reject spoofed calls.
Event filtering
Subscribe only to the event types you care about. Pro gets 25 filters; Team gets 100. Filter by workspace, memory type, or action.
Event history & replay
Every fired event is stored for 30 days. Replay missed events after downtime or roll back a bad deploy by replaying the event stream.
Session-targeted routing
Route webhook payloads directly into a running Claude Code session using the stop hook. Claude processes the event as a task — no polling needed.
New REST endpoints
Zero-effort memory capture with installable Claude Code hooks
Until now, Recall required Claude to proactively call memory tools — meaning it only captured what you explicitly asked it to remember. Auto-Memory Hooks change this: install four lightweight bash scripts once and Recall captures context automatically on every session start, file edit, and key command.
curl -fsSL https://recallmcp.com/install-hooks | bash

Installs hooks to ~/.claude/recall/hooks/ and patches your Claude Code settings.json automatically. Backs up your settings first. Idempotent — safe to re-run.
session-start.sh
SessionStart
Fetches your recent memories and injects them as context at the start of every Claude Code session.
observe.sh
PostToolUse
Silently captures Write, Edit, Task, and key Bash events (git commit, npm install, deploy) as low-importance observations.
pre-compact.sh
PreCompact
Saves a state marker before context compaction so session continuity is preserved.
session-end.sh
Stop
Records a session-end marker when Claude Code closes.
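The installer wires these four scripts into Claude Code's settings.json for you. For reference, a hand-edited equivalent might look roughly like this; the paths, the Write|Edit|Task|Bash matcher, and the async flag are taken from the descriptions above, but treat the exact schema as an assumption and consult Claude Code's hooks documentation:

```json
{
  "hooks": {
    "SessionStart": [
      { "hooks": [{ "type": "command", "command": "~/.claude/recall/hooks/session-start.sh", "async": true }] }
    ],
    "PostToolUse": [
      { "matcher": "Write|Edit|Task|Bash",
        "hooks": [{ "type": "command", "command": "~/.claude/recall/hooks/observe.sh", "async": true }] }
    ],
    "PreCompact": [
      { "hooks": [{ "type": "command", "command": "~/.claude/recall/hooks/pre-compact.sh", "async": true }] }
    ],
    "Stop": [
      { "hooks": [{ "type": "command", "command": "~/.claude/recall/hooks/session-end.sh", "async": true }] }
    ]
  }
}
```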
All four hooks run with async: true, so Claude never waits for them. The installer appends to your settings.json hooks array without overwriting Pilot, sx, or other tools. A new GET /api/context endpoint returns your recent memories as a formatted markdown block — grouped by type (decisions, errors, patterns, recent work) and ready for direct stdout injection into Claude's session context.
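The grouping that GET /api/context performs can be approximated as below. Only the endpoint path and the group names come from the release notes; the field names (context_type, content) and type keys are assumptions for illustration:

```javascript
// Hypothetical sketch of formatting memories into the kind of markdown
// block GET /api/context returns. Field names are assumptions.
function formatContext(memories) {
  const groups = { decision: "Decisions", error: "Errors", pattern: "Patterns", work: "Recent work" };
  const lines = [];
  for (const [type, heading] of Object.entries(groups)) {
    const items = memories.filter((m) => m.context_type === type);
    if (items.length === 0) continue; // empty groups are omitted entirely
    lines.push(`## ${heading}`);
    for (const m of items) lines.push(`- ${m.content}`);
  }
  return lines.join("\n");
}
```

A session-start hook can then write this block to stdout, which is how the markdown lands in Claude's context.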
Named workflows that span multiple Claude sessions
The #1 pain point with AI assistants is context loss across sessions. Workflow threads solve this by creating named workflows that automatically link memories across multiple sessions, giving Claude persistent context about what you're working on.
Step 1: Start a workflow at the beginning of a multi-session task
start_workflow({ name: "Implementing auth system", description: "Adding OAuth 2.0 with refresh tokens" })

Step 2: Work normally — memories are auto-tagged
store_memory({ content: "Decided to use PKCE flow for public clients", context_type: "decision" })

Memories created during an active workflow are automatically linked to it. No extra steps needed.
Step 3: In a new session, context is automatic
auto_session_start({ task_hint: "Continue auth implementation" })

Active workflow context (name, description, recent memories) is included automatically.
Step 4: Pause and resume workflows
pause_workflow() // Pauses the current workflow
resume_workflow({ workflow_id: "..." }) // Resume later

Step 5: Complete the workflow when done
complete_workflow({ summary: "Auth system implemented with OAuth 2.0 PKCE" })

start_workflow: Start a named multi-session workflow
complete_workflow: Complete a workflow with a summary
pause_workflow: Pause without losing progress
resume_workflow: Resume a paused workflow
get_active_workflow: Check the current active workflow
list_workflows: List all workflows by status
get_workflow_context: Get linked memories for a workflow
Tell Claude: “Start a workflow called '[your task]' so we can track context across sessions.” Claude will call start_workflow automatically. Only one workflow can be active at a time — pause or complete the current one before starting another.
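The auto-tagging in Step 2 and the single-active-workflow rule can be sketched as one active-workflow pointer that stamps every new memory. The function and field names below are hypothetical illustrations; Recall actually persists this state in Redis:

```javascript
// Hypothetical sketch of workflow auto-tagging: while a workflow is
// active, every stored memory is stamped with its id.
let activeWorkflow = null;
const memories = [];

function startWorkflow({ name, description }) {
  // only one workflow may be active at a time
  if (activeWorkflow) throw new Error("Pause or complete the current workflow first");
  activeWorkflow = { id: `wf_${nextWorkflowId++}`, name, description, status: "active" };
  return activeWorkflow;
}
let nextWorkflowId = 1;

function storeMemory({ content, context_type }) {
  const memory = { content, context_type, workflow_id: activeWorkflow?.id ?? null };
  memories.push(memory);
  return memory;
}

function completeWorkflow({ summary }) {
  activeWorkflow.status = "completed";
  activeWorkflow.summary = summary;
  const done = activeWorkflow;
  activeWorkflow = null; // frees the single active slot
  return done;
}
```

Because the stamp happens at store time, later sessions can retrieve "all memories for workflow X" without any extra tagging effort during the work itself.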
Automatically merge similar memories to keep your store efficient
As your memory store grows, similar and overlapping memories accumulate. The auto-consolidation pipeline clusters similar memories using cosine similarity, creates consolidated summaries, and keeps originals as version history via supersedes relationships.
Check if consolidation is needed
consolidation_status()

Returns memory count, threshold, last run date, and recommendation.
Run automatic consolidation (safe to call proactively)
auto_consolidate()

Checks memory count threshold and 24h cooldown. Returns early if not needed — safe to call at any time.
Force consolidation (after large imports or manual trigger)
force_consolidate({ similarity_threshold: 0.8, min_cluster_size: 3 })

Runs regardless of thresholds. Configurable similarity (default 0.75), minimum cluster size (default 2), and max memories to process (default 1000).
auto_consolidate: Smart consolidation that only runs when needed
force_consolidate: Manual trigger with custom thresholds
consolidation_status: Check status and get recommendations
Consolidation quality depends on your embedding provider. For best results, use Voyage AI, Cohere, or OpenAI embeddings. The Anthropic keyword-based fallback works but may require lowering the similarity threshold to 0.6. Run consolidation_status() to see your current provider.
Fetches recent memories (capped at 1,000 for performance)
Clusters by cosine similarity with cross-scope guard (global memories only cluster with global)
Creates consolidated summary combining content from all cluster members
Preserves originals with 'consolidated' tag and supersedes relationships
Takes max importance and merges tags from all cluster members
24-hour cooldown prevents redundant runs
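The clustering step above can be sketched with cosine similarity over embedding vectors. This is a greedy single-pass grouping under stated assumptions (seed-based clusters, default 0.75 threshold, minimum cluster size 2, cross-scope guard); Recall's actual algorithm may differ:

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Greedy sketch: each memory joins the first cluster whose seed
// embedding is within the threshold. Illustrative only.
function cluster(memories, { threshold = 0.75, minClusterSize = 2 } = {}) {
  const clusters = [];
  for (const m of memories) {
    // cross-scope guard: global memories only cluster with global
    const home = clusters.find(
      (c) => c.scope === m.scope && cosine(c.seed.embedding, m.embedding) >= threshold
    );
    if (home) home.members.push(m);
    else clusters.push({ seed: m, scope: m.scope, members: [m] });
  }
  // singleton clusters are not worth consolidating
  return clusters.filter((c) => c.members.length >= minClusterSize);
}
```

Each surviving cluster would then be summarized into one consolidated memory, with the originals tagged 'consolidated' and linked via supersedes relationships.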
Added clearWorkspace() method to MemoryStore
Fixed stale-session handling under the MCP SDK 1.25.x architecture
Gracefully handle missing Firebase API key
Automatic hooks — auto_session_start, quick_store_decision, should_use_rlm
RLM (Recursive Language Model) tools for handling large contexts exceeding context window limits
7 AI Embedding Providers — Voyage AI, Cohere, OpenAI, Deepseek, Grok, Anthropic, Ollama