New features, improvements, and fixes for Recall MCP.
Six new features inspired by benchmarking research: self-teaching session protocol, pre-write deduplication, temporal validity on relationships, bulk conversation import, heuristic content classification, and a retrieval benchmark harness.
Every auto_session_start response now teaches Claude the workflow - no CLAUDE.md configuration required
Previously, getting Claude to use Recall correctly required adding rules to ~/.claude/rules/recall.md. Now the expected workflow is baked into the auto_session_start response itself - a ~130-token protocol string that travels with the tool call. New users get the correct behavior on first contact.
New check_duplicate tool prevents noise from prolific auto-save patterns
Returns similar memories above a threshold (default 0.85) so Claude can skip, merge, or explicitly override instead of writing yet another duplicate. Especially useful before bulk imports or when stop hooks fire on every interaction.
Just tell Claude:
"Check if we already have this stored before saving"
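Under the hood, the duplicate check is a similarity comparison against stored embeddings. A minimal sketch of the idea (hypothetical helper with plain-Python cosine; the real tool operates on Recall's own embedding store):

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def check_duplicate(candidate, stored, threshold=0.85):
    # Return stored memories whose similarity to the candidate meets or
    # exceeds the threshold (0.85 is the tool's default).
    return [
        (mem_id, sim)
        for mem_id, vec in stored.items()
        if (sim := cosine(candidate, vec)) >= threshold
    ]
```

Anything returned is a candidate for skip, merge, or explicit override rather than a fresh write.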
Facts can end without being deleted - historical queries still work
The memory_graph tool now supports temporal validity via three additions:
valid_from / valid_to
Links now have optional validity windows. Empty valid_to means still active.
invalidate action
Mark a relationship as ended without deleting it. Preserves history.
as_of parameter
Point-in-time queries: 'What was Kai working on in January 2026?'
Just tell Claude:
"Kai stopped working on Orion as of March 1st"
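The visibility rule is simple: a link applies at time t if it started on or before t and had not yet been invalidated. A sketch of point-in-time filtering (illustrative data and field names, not the actual memory_graph internals):

```python
from datetime import date

def active_as_of(links, as_of):
    # A link is visible at `as_of` if it had started by then and had not
    # yet ended (valid_to of None means still active).
    return [
        link for link in links
        if link["valid_from"] <= as_of
        and (link["valid_to"] is None or as_of < link["valid_to"])
    ]

links = [
    {"src": "Kai", "rel": "works_on", "dst": "Orion",
     "valid_from": date(2025, 6, 1), "valid_to": date(2026, 3, 1)},
]
active_as_of(links, date(2026, 1, 15))  # Kai was still on Orion in January 2026
```

Note that the link is excluded for any as_of on or after its valid_to date, matching "stopped working on Orion as of March 1st".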
Backfill Claude Code, ChatGPT, and generic transcripts into Recall
The new import_conversations tool parses three formats and chunks by exchange pair (user turn + AI response = one memory). Each chunk is auto-classified and stored verbatim - no summarization.
claude_code - JSONL files from ~/.claude/projects/
chatgpt - OpenAI conversation export JSON
generic - Blockquote or Human:/Assistant: labeled text
Supports dry_run mode to preview what would be imported without writing. Six months of conversation history? Import it in one call.
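Exchange-pair chunking for the generic format can be sketched like this (a simplified stand-in for the real parser):

```python
import re

def chunk_exchanges(transcript):
    # Split a Human:/Assistant: labeled transcript into exchange pairs
    # (user turn + AI response = one memory chunk).
    parts = re.split(r"^(Human|Assistant):", transcript, flags=re.MULTILINE)
    # re.split with a capture group yields [pre, label, text, label, text, ...]
    turns = [
        (parts[i], parts[i + 1].strip())
        for i in range(1, len(parts) - 1, 2)
    ]
    chunks = []
    for i in range(0, len(turns) - 1, 2):
        if turns[i][0] == "Human" and turns[i + 1][0] == "Assistant":
            chunks.append({"user": turns[i][1], "assistant": turns[i + 1][1]})
    return chunks
```

Each returned chunk would then be classified and stored verbatim, as described above.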
Regex-based auto-classification for imported content
Powers the conversation importer. Classifies content into Recall's ContextType enum (decision, preference, directive, error, insight, code_pattern, todo, requirement) using regex marker sets. Zero LLM calls, zero API cost. Also suggests importance (1-10) from keyword signals like "critical", "breaking", "security", "minor", "typo".
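A sketch of how such a classifier can work (the marker patterns and scores below are illustrative, not Recall's actual rule set):

```python
import re

# Illustrative marker sets keyed by ContextType; first hit wins.
MARKERS = {
    "decision": [r"\bdecided to\b", r"\bwe chose\b", r"\bgoing with\b"],
    "error":    [r"\bfailed\b", r"\btraceback\b", r"\bexception\b"],
    "todo":     [r"\btodo\b", r"\bneed to\b", r"\bfollow[- ]up\b"],
}
BOOST = {"critical": 9, "breaking": 8, "security": 8}   # raise importance
DAMPEN = {"minor": 3, "typo": 2}                        # lower importance

def classify(text):
    lowered = text.lower()
    for context_type, patterns in MARKERS.items():
        if any(re.search(p, lowered) for p in patterns):
            return context_type
    return "insight"  # fallback type

def suggest_importance(text, default=5):
    lowered = text.lower()
    for word, score in BOOST.items():
        if word in lowered:
            return score
    for word, score in DAMPEN.items():
        if word in lowered:
            return score
    return default
```

Because it is pure regex and keyword matching, classification stays free and fast even across thousands of imported chunks.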
Standardized retrieval quality measurement
New benchmarks/ directory with a runner that measures R@1, R@5, and R@10 on the LongMemEval dataset. Ships with a synthetic 15-question dataset for quick validation. Accepts the full LongMemEval corpus via --dataset path/to/longmemeval.json.
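R@k here is the fraction of questions whose relevant memory appears in the top-k retrieved results. A minimal scorer (hypothetical input shape, one relevant memory per question):

```python
def recall_at_k(results, relevant, ks=(1, 5, 10)):
    # results:  question id -> ranked list of retrieved memory ids
    # relevant: question id -> the memory id that answers it
    scores = {}
    for k in ks:
        hits = sum(
            1 for qid, ranked in results.items()
            if relevant[qid] in ranked[:k]
        )
        scores[f"R@{k}"] = hits / len(results)
    return scores
```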
The v1.15 series landed always-on memory directives, billing enforcement, memory add-on packs, the pick_up tool, and a long tail of reliability fixes across hooks, API key healing, and version propagation.
v1.15.0 - rules file injected into every session automatically
Recall now ships a recall.md rules file that injects directly into ~/.claude/rules/ on every session start. No CLAUDE.md edits, no manual configuration - Claude just knows to call set_workspace and auto_session_start at the start, store decisions mid-session, and call summarize_session on exit.
pick_up tool (v1.15.18) - structured work resumption brief
New tool for returning to a project after time away; it returns a narrative brief.
When to use vs auto_session_start:
auto_session_start for every session (fast, ~2000 tokens). pick_up when returning after hours or days, switching projects, or when explicitly asked "catch me up" (~3000 tokens).
v1.15.18 - clear upgrade messages instead of silent failure
store_memory, analyze_and_remember, and quick_store_decision now block with a helpful upgrade message when you hit your plan's memory limit. Previously the store would silently fail or truncate - now you know exactly when and why to upgrade.
v1.15.18 - top up without upgrading your plan
Need more memories without changing plans? Purchase add-on packs directly from your dashboard at $4.99 per 5,000 memories (one-time, no subscription). Admins can grant and adjust add-ons per user.
Hooks, version propagation, and API key self-healing
Fixes cover silent config updates to ~/.claude.json and CLAUDE_PLUGIN_ROOT resolution.
Session continuity
Recall now persists work across session stops and context compactions automatically, so nothing is lost between conversations.
Async Stop hook snapshots your work when a session ends
The new stop-summarize.sh hook fires when Claude Code stops and stores a summary memory at importance 7. It tracks which files were touched, what decisions were made, and where the session ended - so your next session picks up where this one left off.
State re-injected after context window shrinks
When Claude Code compacts its context window mid-session, the new compact-restore.sh hook fires on SessionStart(compact) and re-injects the pre-compaction state snapshot (session name, project, CWD, memory count). Compactions used to wipe context - now they preserve it.
File edits and command failures now captured automatically
The observe hook now captures a wider signal, including file edits and failed commands (tagged error).
Track tasks across sessions with automatic context injection
Recall now includes a built-in to-do list that persists across sessions and integrates directly into your workflow. Claude can create, track, defer, and complete tasks — and pending to-dos are automatically injected into every session start so nothing falls through the cracks.
Persists across sessions
To-dos are stored in Redis alongside your memories. They survive session restarts and context compaction.
Auto-injected at session start
Pending to-dos appear in auto_session_start context with priority indicators, so Claude always knows what's outstanding.
Priority & defer tracking
Four priority levels (low, medium, high, urgent) and defer counting to identify repeatedly postponed items.
Token-budget aware
Context injection is hard-capped at 300 tokens. To-dos are formatted compactly to minimize per-session cost.
The todo_list tool uses action dispatch (like workflow) to keep the tool count minimal:
create - Create a new to-do with title, priority, and tags
list - List all to-dos, optionally filtered by status
get - Get details of a specific to-do
update - Update title, priority, status, or tags
complete - Mark a to-do as completed
defer - Defer a to-do (tracks defer count)
delete - Remove a to-do permanently
context - Get a token-budgeted summary of pending to-dos
Just tell Claude:
"Create a to-do to refactor the auth middleware — high priority"
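The 300-token cap can be approximated client-side. A rough sketch using a ~4-characters-per-token estimate (hypothetical formatting, not the server's exact output):

```python
def format_todos(todos, budget_tokens=300, chars_per_token=4):
    # Emit compact "[priority] title" lines until the token budget is
    # exhausted; remaining items collapse into a single "(+N more)" line.
    budget = budget_tokens * chars_per_token
    lines, used = [], 0
    for todo in todos:
        line = f"[{todo['priority']}] {todo['title']}"
        if used + len(line) + 1 > budget:
            lines.append(f"(+{len(todos) - len(lines)} more)")
            break
        lines.append(line)
        used += len(line) + 1  # +1 for the newline
    return "\n".join(lines)
```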
Zero-config installation via the official plugin directory
Recall is now available as an official Claude Code plugin. Instead of manually editing .mcp.json and configuring skills, you can install with a single command and get everything pre-configured.
Install
/install recall
~200-300 fewer tokens per session start
Tool definitions have been streamlined to reduce per-session token cost. Redundant property descriptions were removed from self-documenting fields, and Claude Code's Tool Search (lazy-loading) is now enabled so tools are loaded on-demand instead of all at once.
Before
~1,200
tokens per ListTools
After
~900
tokens per ListTools
Run Recall on your own infrastructure with a single command
Recall is now available as a self-hosted Docker image. Teams with strict data residency requirements, air-gapped environments, or simply a preference for running their own stack can deploy a fully-featured Recall instance in under two minutes.
Unlimited everything
No memory caps, no workspace limits, no webhook quotas. Your server, your rules.
License-key activation
Activate with a license key from recallmcp.com. Works air-gapped with a 7-day grace window for offline environments.
Full data sovereignty
Memories never leave your infrastructure. No telemetry, no cloud dependency — just Redis + Recall running inside Docker.
One-command Docker install
curl -fsSL https://install.recallmcp.com | bash detects Docker, pulls the image, configures Redis, and prints your MCP config.
Install
curl -fsSL https://install.recallmcp.com | bash
7 days of Pro, no card required
New accounts can now activate a 7-day Pro trial from the billing dashboard — no payment method required. Trial accounts get the full Pro feature set: 5,000 memories, 3 workspaces, cross-session workflows, webhooks, and event history.
Simpler tiers, lower entry price
Pro dropped from $9.99 to $5/mo, and Team dropped from $19.99 to $15/mo. Existing subscribers are grandfathered on their current rate — no action needed.
Free
$0/mo
500 memories
Pro
$5/mo
5,000 memories
Team
$15/mo
25,000 memories
Push memory events to any HTTP endpoint in real time
Recall can now call your own HTTP endpoint whenever a memory is created, updated, deleted, or when a session summary is written. Webhooks unlock integrations that were previously impossible — trigger CI runs, sync to Notion, fire Slack alerts, or feed a second AI agent whenever context changes.
HMAC signature verification
Every webhook request is signed with your secret. Verify the X-Recall-Signature header on your server to reject spoofed calls.
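Server-side, verification is a recompute-and-compare. A sketch assuming a hex-encoded HMAC-SHA256 over the raw request body (the exact signing scheme is an assumption here; confirm against the webhook docs):

```python
import hashlib
import hmac

def verify_signature(secret: str, body: bytes, signature: str) -> bool:
    # Recompute the HMAC over the raw body and compare with
    # constant-time equality to defeat timing attacks.
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Always verify against the raw bytes of the request body, before any JSON parsing or re-serialization, since whitespace changes would alter the digest.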
Event filtering
Subscribe only to the event types you care about. Pro gets 25 filters; Team gets 100. Filter by workspace, memory type, or action.
Event history & replay
Every fired event is stored for 30 days. Replay missed events after downtime or roll back a bad deploy by replaying the event stream.
Session-targeted routing
Route webhook payloads directly into a running Claude Code session using the stop hook. Claude processes the event as a task — no polling needed.
New REST endpoints
Zero-effort memory capture with installable Claude Code hooks
Until now, Recall required Claude to proactively call memory tools — meaning it only captured what you explicitly asked it to remember. Auto-Memory Hooks change this: install four lightweight bash scripts once and Recall captures context automatically on every session start, file edit, and key command.
curl -fsSL https://recallmcp.com/install-hooks | bash
Installs hooks to ~/.claude/recall/hooks/ and patches your Claude Code settings.json automatically. Backs up your settings first. Idempotent - safe to re-run.
session-start.sh
SessionStart
Fetches your recent memories and injects them as context at the start of every Claude Code session.
observe.sh
PostToolUse
Silently captures Write, Edit, Task, and key Bash events (git commit, npm install, deploy) as low-importance observations.
pre-compact.sh
PreCompact
Saves a state marker before context compaction so session continuity is preserved.
session-end.sh
Stop
Records a session-end marker when Claude Code closes.
Hooks run with async: true - Claude never waits for them.
The installer merges into your settings.json hooks array without overwriting Pilot, sx, or other tools.
A new GET /api/context endpoint returns your recent memories as a formatted markdown block — grouped by type (decisions, errors, patterns, recent work) and ready for direct stdout injection into Claude's session context.
Named workflows that span multiple Claude sessions
The #1 pain point with AI assistants is context loss across sessions. Workflow threads solve this by creating named workflows that automatically link memories across multiple sessions, giving Claude persistent context about what you're working on.
Step 1: Start a workflow at the beginning of a multi-session task
start_workflow({ name: "Implementing auth system", description: "Adding OAuth 2.0 with refresh tokens" })
Step 2: Work normally — memories are auto-tagged
store_memory({ content: "Decided to use PKCE flow for public clients", context_type: "decision" })
Memories created during an active workflow are automatically linked to it. No extra steps needed.
Step 3: In a new session, context is automatic
auto_session_start({ task_hint: "Continue auth implementation" })
Active workflow context (name, description, recent memories) is included automatically.
Step 4: Pause and resume workflows
pause_workflow() // Pauses the current workflow
resume_workflow({ workflow_id: "..." }) // Resume later
Step 5: Complete the workflow when done
complete_workflow({ summary: "Auth system implemented with OAuth 2.0 PKCE" })
start_workflow
Start a named multi-session workflow
complete_workflow
Complete with summary
pause_workflow
Pause without losing progress
resume_workflow
Resume a paused workflow
get_active_workflow
Check current active workflow
list_workflows
List all workflows by status
get_workflow_context
Get linked memories for a workflow
Tell Claude: “Start a workflow called '[your task]' so we can track context across sessions.” Claude will call start_workflow automatically. Only one workflow can be active at a time — pause or complete the current one before starting another.
Automatically merge similar memories to keep your store efficient
As your memory store grows, similar and overlapping memories accumulate. The auto-consolidation pipeline clusters similar memories using cosine similarity, creates consolidated summaries, and keeps originals as version history via supersedes relationships.
Check if consolidation is needed
consolidation_status()
Returns memory count, threshold, last run date, and recommendation.
Run automatic consolidation (safe to call proactively)
auto_consolidate()
Checks memory count threshold and 24h cooldown. Returns early if not needed — safe to call at any time.
Force consolidation (after large imports or manual trigger)
force_consolidate({ similarity_threshold: 0.8, min_cluster_size: 3 })
Runs regardless of thresholds. Configurable similarity (default 0.75), minimum cluster size (default 2), and max memories to process (default 1000).
auto_consolidate
Smart consolidation — only runs when needed
force_consolidate
Manual trigger with custom thresholds
consolidation_status
Check status and get recommendations
Consolidation quality depends on your embedding provider. For best results, use Voyage AI, Cohere, or OpenAI embeddings. The Anthropic keyword-based fallback works but may require lowering the similarity threshold to 0.6. Run consolidation_status() to see your current provider.
Fetches recent memories (capped at 1,000 for performance)
Clusters by cosine similarity with cross-scope guard (global memories only cluster with global)
Creates consolidated summary combining content from all cluster members
Preserves originals with 'consolidated' tag and supersedes relationships
Takes max importance and merges tags from all cluster members
24-hour cooldown prevents redundant runs
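The clustering step above can be sketched as a greedy single pass (illustrative only; field names are hypothetical and the real pipeline's defaults are described above):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def cluster(memories, threshold=0.75, min_cluster_size=2):
    # Greedy single-pass clustering: each memory joins the first cluster
    # whose seed is similar enough AND shares its scope - the cross-scope
    # guard, so global memories only cluster with global.
    clusters = []
    for mem in memories:
        for c in clusters:
            seed = c[0]
            if (seed["scope"] == mem["scope"]
                    and cosine(seed["vec"], mem["vec"]) >= threshold):
                c.append(mem)
                break
        else:
            clusters.append([mem])
    return [c for c in clusters if len(c) >= min_cluster_size]
```

Each surviving cluster would then be summarized into one consolidated memory, with the originals kept as version history via supersedes relationships.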
Added clearWorkspace() method to MemoryStore
Fixed stale-session handling in the MCP SDK 1.25.x architecture
Gracefully handle missing Firebase API key
Automatic hooks — auto_session_start, quick_store_decision, should_use_rlm
RLM (Recursive Language Model) tools for handling large contexts exceeding context window limits
7 AI Embedding Providers — Voyage AI, Cohere, OpenAI, Deepseek, Grok, Anthropic, Ollama