MCP unreviewed Apache-2.0 Self-run

Telemem


mcp-teleai-uagi-telemem · by TeleAI-UAGI

Long-term and multimodal memory for AI agents - character-aware, mem0-compatible, fully-local option


Install

$ agentstack add mcp-teleai-uagi-telemem

Open-source listing — not yet scanned by AgentStack. Follow the source repository for install instructions.


About

TeleMem: Building Long-Term and Multimodal Memory for Agentic AI

If you find this project helpful, please give us a ⭐️ on GitHub for the latest update.

🤝 Contributions welcome! Feel free to open an issue or submit a pull request.


TeleMem is an agent memory management layer that can be used as a high-performance drop-in replacement for Mem0 with one line of code (import telemem as mem0), deeply optimized for complex scenarios involving multi-turn dialogues, character modeling, long-term information storage, and semantic retrieval.

Through its unique context-aware enhancement mechanism, TeleMem provides conversational AI with core infrastructure offering higher accuracy, faster performance, and stronger character memory capabilities.

Building upon this foundation, TeleMem implements video understanding, multimodal reasoning, and visual question answering capabilities. Through a complete pipeline of video frame extraction, caption generation, and vector database construction, AI Agents can effortlessly store, retrieve, and reason over video content just like handling text memories.
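The frame extraction → caption generation → storage → retrieval flow described above can be illustrated with a toy sketch. Everything here is hypothetical: `sample_timestamps`, `fake_caption`, and the keyword-based `search` are stand-ins invented for illustration, not TeleMem's API. A real pipeline would decode actual frames, call a vision-language model for captions, and query a vector index.

```python
from dataclasses import dataclass


@dataclass
class CaptionedFrame:
    timestamp_s: float  # position of the sampled frame in the video
    caption: str        # text description of that frame


def sample_timestamps(duration_s: float, every_s: float) -> list[float]:
    """Pick one frame every `every_s` seconds, starting at t=0."""
    count = int(duration_s // every_s) + 1
    return [i * every_s for i in range(count)]


def fake_caption(timestamp_s: float) -> str:
    """Stand-in for a vision-language model call (hypothetical)."""
    if timestamp_s < 20:
        return "a person boards the subway"
    return "a person walks into an office"


def build_caption_store(duration_s: float, every_s: float) -> list[CaptionedFrame]:
    """Frame sampling -> captioning -> store, with everything kept in a list."""
    return [CaptionedFrame(t, fake_caption(t)) for t in sample_timestamps(duration_s, every_s)]


def search(store: list[CaptionedFrame], query: str) -> list[CaptionedFrame]:
    """Toy retrieval: keep captions sharing at least one word with the query.
    A real system would embed captions and search a vector index instead."""
    words = set(query.lower().split())
    return [f for f in store if words & set(f.caption.lower().split())]


store = build_caption_store(duration_s=40, every_s=10)
hits = search(store, "office")
print([f.timestamp_s for f in hits])  # [20, 30, 40]
```

Keeping the timestamp next to each caption is what lets a later question-answering step point back to a specific moment in the video.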

The ultimate goal of the TeleMem project is to use an agent's hindsight to improve its foresight.

TeleMem, where memory lives on and intelligence grows strong.

Why TeleMem?

🎭 Character memory done right — the only open-source memory layer that automatically builds isolated, per-character memory profiles, built for role-play, companion AI, NPCs, and multi-persona assistants.
🎬 Memory for video, not just text — a full video → frames → captions → vector DB pipeline with ReAct-style multi-step video QA.
🏠 Fully local option — can run end-to-end on your hardware (Qwen + FAISS, or Ollama); no cloud service, no paid tier, no data leaving your machine.
🔌 mem0-compatible API — add() / search() accept the same arguments and return the same {"results": [...]} shapes, so existing Mem0 code keeps working.

📢 Latest Updates

[2026-06-12] 🎉 TeleMem v1.7.1 is live on the official MCP registry — run the memory server with zero install: uvx telemem! Also new: evaluation principles and a LongMemEval harness with built-in baselines.
[2026-06-12] 🎉 TeleMem is now on PyPI: pip install telemem! v1.6.0 adds Ollama/DeepSeek/Kimi configs, LangChain & LlamaIndex examples, and a documentation site.
[2026-06-12] 🎉 TeleMem v1.5.0 has been released: true mem0 drop-in API, lightweight core install, and CI!
[2026-06-11] 🎉 TeleMem v1.4.0 has been released with [MCP support](docs/MCP.md)!
[2026-01-28] 🎉 TeleMem v1.3.0 has been released!
[2026-01-22] 🎉 TeleMem Tech Report has been updated to its 4th version!
[2026-01-13] 🎉 TeleMem Tech Report has been released on arXiv!
[2026-01-09] 🎉 TeleMem v1.2.0 has been released!
[2025-12-31] 🎉 TeleMem v1.1.0 has been released!
[2025-12-05] 🎉 TeleMem v1.0.0 has been released!

🔥 Research Highlights

Significantly improved memory accuracy: Achieved 86.33% accuracy on the ZH-4O Chinese multi-character long-dialogue benchmark, 16.13 percentage points higher than Mem0 (70.20%).
Doubled speed: Efficient buffering and batch writing accelerate writes, alongside millisecond-level semantic retrieval.
Greatly reduced token cost: Optimized token usage delivers the same performance with significantly lower LLM overhead.
Precise character memory preservation: Automatically builds independent memory profiles for each character, eliminating confusion.
Automated video processing pipeline: From raw video → frame extraction → caption generation → vector database, fully automated.
ReAct-style video QA: Multi-step reasoning + tool calling for precise video content understanding.

📌 Table of Contents

[Project Introduction](#project-introduction)
[TeleMem vs Mem0: Core Advantages](#telemem-vs-mem0-core-advantages)
[Experimental Results](#experimental-results)
[Quick Start](#quick-start)
[Project Structure](#project-structure)
[Core Functions](#core-functions)
[Multimodal Extensions](#multimodal-extensions)
[MCP Server](#mcp-server)
[Framework Integrations](#framework-integrations)
[Data Storage Explanation](#data-storage)
[Development and Contribution](#development-and-contribution)
[Acknowledgements](#acknowledgements)

Project Introduction

TeleMem enables conversational AI to maintain stable, natural, and continuous worldviews and character settings during long-term interactions through a deeply optimized pipeline of character-aware summarization → semantic clustering deduplication → efficient storage → precise retrieval.

flowchart LR
    A["Dialogue messages"] --> B["Character-aware summarization (global + per-character)"]
    B --> C["Embedding + similar-memory retrieval"]
    C --> D["Write buffer (batch flush)"]
    D --> E["LLM semantic clustering & fusion"]
    E --> F[("FAISS index + JSON metadata")]
    Q["Query"] --> S["Vector search + rerank"]
    F --> S
    S --> R["results"]
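To make the "semantic clustering & fusion" stage of the flowchart concrete, here is a minimal sketch under stated assumptions. It groups memories by cosine similarity of toy two-dimensional vectors and uses a placeholder `fuse` function where TeleMem calls an LLM. The function names, the greedy strategy, and the 0.9 threshold are illustrative choices, not the project's implementation.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_memories(
    items: list[tuple[str, list[float]]], threshold: float = 0.9
) -> list[list[str]]:
    """Greedy clustering: attach each memory to the first cluster whose
    representative vector is similar enough, else start a new cluster."""
    clusters: list[tuple[list[float], list[str]]] = []
    for text, vec in items:
        for rep, members in clusters:
            if cosine(rep, vec) >= threshold:
                members.append(text)
                break
        else:
            clusters.append((vec, [text]))
    return [members for _, members in clusters]


def fuse(members: list[str]) -> str:
    """Placeholder for the LLM fusion step: keep the longest statement."""
    return max(members, key=len)


# Toy embeddings (hypothetical values chosen by hand).
items = [
    ("Jordan takes the subway to work", [1.0, 0.0]),
    ("Jordan commutes by subway, leaving at 7", [0.98, 0.2]),
    ("James wants to try the subway", [0.0, 1.0]),
]
clusters = cluster_memories(items)
fused = [fuse(c) for c in clusters]
print(fused)
```

The two Jordan statements land in one cluster and collapse into a single memory, while the James statement stays separate.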

Features

Automatic memory extraction: Extracts and structures key facts from dialogues.
Semantic clustering & deduplication: Uses LLMs to semantically merge similar memories, reducing conflicts and improving consistency.
Character-profiled memory management: Builds independent memory archives for each character in a dialogue, ensuring precise isolation and personalized management.
Efficient asynchronous writing: Employs a buffer + batch-flush mechanism for high-performance, stable persistence.
Precise semantic retrieval: Combines FAISS + JSON dual storage for fast recall and human-readable auditability.
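The per-character isolation and buffer + batch-flush ideas in the list above can be sketched in a few lines. This is an illustration only: `BufferedMemoryStore`, its `flush_size` parameter, and the one-JSON-file-per-character layout are assumptions made for the example, and the FAISS index that TeleMem writes alongside the JSON metadata is left out.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


class BufferedMemoryStore:
    """Toy per-character store: writes are buffered and flushed in batches."""

    def __init__(self, root: Path, flush_size: int = 2) -> None:
        self.root = root
        self.flush_size = flush_size
        self.buffers: dict[str, list[str]] = defaultdict(list)
        self.flush_count = 0  # number of disk writes actually performed

    def add(self, character: str, memory: str) -> None:
        """Buffer a memory; write to disk only when the buffer is full."""
        self.buffers[character].append(memory)
        if len(self.buffers[character]) >= self.flush_size:
            self.flush(character)

    def flush(self, character: str) -> None:
        """Append all pending memories for one character to its JSON file."""
        pending = self.buffers.pop(character, [])
        if not pending:
            return
        path = self.root / f"{character}.json"  # one file per character
        stored = json.loads(path.read_text()) if path.exists() else []
        path.write_text(json.dumps(stored + pending, indent=2))
        self.flush_count += 1

    def load(self, character: str) -> list[str]:
        """Read back the persisted memories for one character."""
        path = self.root / f"{character}.json"
        return json.loads(path.read_text()) if path.exists() else []


root = Path(tempfile.mkdtemp())
store = BufferedMemoryStore(root, flush_size=2)
store.add("Jordan", "Takes the subway to work")
store.add("James", "Wants to try the subway")
store.add("Jordan", "Leaves home at 7 o'clock")  # second Jordan write triggers a flush
store.flush("James")                              # flush the remainder explicitly
print(store.load("Jordan"), store.load("James"), store.flush_count)
```

Three memories reach disk in two writes, and each character's file contains only that character's memories.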

Applicable Scenarios

Multi-character virtual agent systems
Long-memory AI assistants (e.g., customer service, companionship, creative co-pilots)
Complex narrative/world-building in virtual environments
Dialogue scenarios with strong contextual dependencies
Video content QA and reasoning
Multimodal agent memory management
Long video understanding and information retrieval

TeleMem vs Mem0: Core Advantages

TeleMem is a deep refactoring of Mem0 aimed at character modeling, long-term memory, and performance. Key differences:

| Capability Dimension | Mem0 | TeleMem |
| --- | --- | --- |
| Multi-character separation | ❌ Not supported | ✅ Automatically creates independent memory profiles per character |
| Summary quality | Basic summarization | ✅ Context-aware + character-focused prompts covering key entities, actions, and timestamps |
| Deduplication mechanism | Vector similarity filtering | ✅ LLM-based semantic clustering: merges similar memories via LLM |
| Write performance | Streaming, single writes | ✅ Batch flush + concurrency: 2–3× faster writes |
| Storage format | SQLite / vector DB | ✅ FAISS + JSON metadata dual-write: fast retrieval + human-readable |
| Multimodal Capability | Single image to text only | ✅ Video Multimodal Memory: Full video processing pipeline + ReAct multi-step reasoning QA |

Experimental Results

Dataset

We evaluate on the ZH-4O Chinese multi-character long-dialogue dataset, constructed in the paper MOOM: Maintenance, Organization and Optimization of Memory in Ultra-Long Role-Playing Dialogues:

Average dialogue length: 600 turns per conversation
Scenarios: daily interactions, plot progression, evolving character relationships

Memory capability was assessed via QA benchmarks, e.g.:

[
  {
    "question": "What is Zhao Qi's nickname for Bai Yulan? A Xiaobai B Xiaoyu C Lanlan D Yuyu",
    "answer": "A"
  },
  {
    "question": "What is the relationship between Zhao Qi and Bai Yulan? A Classmates B Teacher and student C Enemies D Neighbors",
    "answer": "B"
  }
]

Experimental Configuration

LLM: Qwen3-8B (thinking mode disabled)
Embedding model: Qwen3-Embedding-8B
Metric: QA accuracy

| Method | Overall (%) |
|:--- |:--- |
| RAG | 62.45 |
| Mem0 | 70.20 |
| MOOM | 72.60 |
| A-mem | 73.78 |
| Memobase | 76.78 |
| TeleMem | 86.33 |
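For readers who want to see how a QA-accuracy figure like those in the table is computed, the sketch below scores multiple-choice answers by exact match on the option letter. The `qa_accuracy` helper and the sample predictions are hypothetical; the actual evaluation harness may normalize answers differently. The last lines separate the absolute gap between TeleMem and Mem0 (in percentage points) from the relative gap.

```python
def qa_accuracy(predictions: list[str], gold: list[str]) -> float:
    """Percentage of questions where the predicted option letter matches."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(
        p.strip().upper() == g.strip().upper() for p, g in zip(predictions, gold)
    )
    return 100.0 * correct / len(gold)


gold = ["A", "B"]         # answers to the two sample questions
predictions = ["a", "C"]  # hypothetical model outputs
print(qa_accuracy(predictions, gold))  # 50.0

# Gap between two rows of the results table.
telemem, mem0 = 86.33, 70.20
absolute_gap = round(telemem - mem0, 2)                  # percentage points
relative_gap = round(100 * (telemem - mem0) / mem0, 2)   # percent of Mem0's score
print(absolute_gap, relative_gap)  # 16.13 22.98
```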

Quick Start

Installation

    pip install telemem            # core (text memory)
    pip install "telemem[mcp]"     # + MCP server
    pip install "telemem[video]"   # + video/multimodal pipeline
    pip install "telemem[all]"     # everything

Development Environment

Using uv (recommended — creates .venv from the committed uv.lock for a reproducible environment):

    uv sync --all-extras    # install TeleMem (editable) + all extras, incl. MCP
    uv run python examples/quickstart.py

Or with conda + pip:

    # Create and activate virtual environment
    conda create -n telemem python=3.10
    conda activate telemem

    # Install from source (editable), with the extras you need
    pip install -e ".[all]"

Example

Set your OpenAI API key:

export OPENAI_API_KEY="your-openai-api-key"

    # python examples/quickstart.py
    import telemem as mem0

    memory = mem0.Memory()

    messages = [
        {"role": "user", "content": "Jordan, did you take the subway to work again today?"},
        {"role": "assistant", "content": "Yes, James. The subway is much faster than driving. I leave at 7 o'clock and it's just not crowded."},
        {"role": "user", "content": "Jordan, I want to try taking the subway too. Can you tell me which station is closest?"},
        {"role": "assistant", "content": "Of course, James. You take Line 2 to Civic Center Station, exit from Exit A, and walk 5 minutes to the company."}
    ]

    memory.add(messages=messages, user_id="Jordan")

    results = memory.search("What transportation did Jordan use to go to work today?", user_id="Jordan")
    for hit in results["results"]:  # same result shape as mem0
        print(hit["memory"])

Memory() uses the default provider settings inherited from mem0ai. To use the repository's local Qwen + FAISS configuration, load config/config.yaml explicitly:

    from telemem.utils import load_config
    import telemem as mem0

    config = load_config("config/config.yaml")
    memory = mem0.Memory(config=config)

The runnable examples also honor the same configuration through TELEMEM_CONFIG:

TELEMEM_CONFIG=config/config.yaml python examples/quickstart.py
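As a rough illustration of how a script could honor `TELEMEM_CONFIG`, the sketch below reads the variable and falls back to `None` when it is unset or blank. The helper name `resolve_config_path` is an assumption made for this example; the shipped examples may resolve the path differently.

```python
import os
from typing import Mapping, Optional


def resolve_config_path(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the path named by TELEMEM_CONFIG, or None to use the defaults."""
    path = env.get("TELEMEM_CONFIG", "").strip()
    return path or None


print(resolve_config_path({"TELEMEM_CONFIG": "config/config.yaml"}))  # config/config.yaml
print(resolve_config_path({}))                                        # None
```

A caller would pass a non-`None` result to `load_config` and otherwise construct `Memory()` with its default settings.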

Using MiniMax as the LLM Provider

TeleMem supports MiniMax as an LLM backend via its OpenAI-compatible API. A ready-to-use example config is provided at config/config.minimax.yaml.

    export MINIMAX_API_KEY="your-minimax-api-key"
    export OPENAI_API_KEY="your-openai-api-key"   # still needed for embeddings

    from telemem.utils import load_config
    import telemem as mem0

    config = load_config("config/config.minimax.yaml")
    memory = mem0.Memory(config=config)

Key points for MiniMax usage:

LLM: MiniMax M3 (512K context, default) via https://api.minimax.io/v1; MiniMax M2.7 / M2.7-highspeed (204K context) remain available as alternatives

Temperature: must be in (0.0, 1.0] — set explicitly (e.g. 0.7) to avoid out-of-range errors

Embeddings: MiniMax does not provide a public embedding API; configure a separate embedder (e.g. text-embedding-3-small) in the embedder section
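The temperature constraint above is a half-open interval: 0.0 is rejected and 1.0 is accepted. A small guard like the following (a hypothetical helper, not part of TeleMem) can catch a bad value before any request is sent.

```python
def validate_temperature(value: float) -> float:
    """Accept values in the half-open interval (0.0, 1.0]; reject the rest."""
    if not (0.0 < value <= 1.0):
        raise ValueError(f"temperature must be in (0.0, 1.0], got {value}")
    return value


print(validate_temperature(0.7))  # 0.7
print(validate_temperature(1.0))  # 1.0

try:
    validate_temperature(0.0)
except ValueError as err:
    print(err)  # temperature must be in (0.0, 1.0], got 0.0
```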

More LLM Providers

TeleMem works with any OpenAI-compatible endpoint. Ready-to-use config examples ship in config/:

| Provider | Config file | LLM | Embeddings | Notes |
| -------- | ----------- | --- | ---------- | ----- |
| Ollama (fully local) | [config.ollama.yaml](config/config.ollama.yaml) | any local model (e.g. qwen3:8b) | nomic-embed-text, local | No API key, no cloud — everything runs on your machine |
| DeepSeek | [config.deepseek.yaml](config/config.deepseek.yaml) | deepseek-chat / deepseek-reasoner | external (e.g. OpenAI) | export DEEPSEEK_API_KEY=... |
| Moonshot (Kimi) | [config.moonshot.yaml](config/config.moonshot.yaml) | kimi-k2-0905-preview | external (e.g. OpenAI) | .cn and .ai endpoints supported |
| MiniMax | [config.minimax.yaml](config/config.minimax.yaml) | MiniMax-M3 | external (e.g. OpenAI) | see section above |

TELEMEM_CONFIG=config/config.ollama.yaml python examples/quickstart.py # 100% local memory

Project Structure

    telemem/
    ├── assets/                     # Documentation assets and figures
    ├── baselines/                  # Baseline implementations for comparative evaluation
    │   ├── RAG                     # Retrieval-Augmented Generation baseline
    │   ├── MemoBase                # MemoBase memory management system
    │   ├── MOOM                    # MOOM dual-branch narrative memory framework
    │   ├── A-mem                   # A-mem agent memory baseline
    │   └── Mem0                    # Mem0 baseline implementation
    ├── config/
    │   ├── config.yaml             # TeleMem default configuration
    │   └── config.minimax.yaml     # MiniMax provider example configuration
    ├── data/                       # Small sample datasets for evaluation or demonstration
    ├── examples/                   # Code examples and tutorial demos
    │   ├── quickstart.py           # Quick start
    │   ├── quickstart_mm.py        # Quick start (multimodal)
    │   ├── mcp_client.py           # Quick start over MCP (stdio client)
    │   └── mcp_config.json         # MCP config snippet for Claude Desktop / Cursor
    ├── docs/
    │   ├── MCP.md                  # MCP server reference
    │   └── TeleMem_Tech_Report.pdf
    ├── telemem/                    # Telemem code
    │   └── mcp/                    # Model Context Protocol server
    ├── tests/                      # Telemem test
    ├── README.md                   # English README
    ├── README-ZH.md                # Chinese README
    └── pyproject.toml              # Python environment

Core Functions

Add Memory (add)

The add() method injects one or more dialogue turns into the memory system.

    def add(
        self,
        messages,
        *,
        user_id: Optional[str] = None,
        agent_id: Optional[str] = None,
        run_id: Optional[str] = None,
        metadata: Optional[Dict[str, Any]] = None,
        infer: bool = True,
        memory_type: Optional[str] = None,
        prompt: Optional[str] = None,
        batch: bool = False,
    )

🔎 Parameter Description

| Parameter | Type |

…

Source & license

This open-source MCP server is cataloged on AgentStack and links to its original source — we do not rehost the code.

Author: TeleAI-UAGI

Source: TeleAI-UAGI/telemem

License: Apache-2.0

Homepage: https://teleai-uagi.github.io/telemem/

Install and usage instructions live in the source repository linked above.


Versions

v1.7.1 Imported from the upstream source.

