# GitHub Twin

> Personal RAG over your GitHub history (commits, code, reviews), served to Claude Code over MCP.

- **Type:** MCP server
- **Install:** `agentstack add mcp-christopherdavenport-github-twin`
- **Verified:** Pending review
- **Seller:** [ChristopherDavenport](https://agentstack.voostack.com/s/christopherdavenport)
- **Installs:** 0
- **Latest version:** 0.0.10
- **License:** MIT
- **Upstream author:** [ChristopherDavenport](https://github.com/ChristopherDavenport)
- **Source:** https://github.com/ChristopherDavenport/github-twin

## Install

```sh
agentstack add mcp-christopherdavenport-github-twin
```

Requires the [AgentStack CLI](https://agentstack.voostack.com/docs/cli). Works with Claude Code, Cursor, and any MCP-compatible agent.

## About

# github-twin

> You reviewed a permission check six months ago. Claude doesn't remember
> it. **github-twin does** — it indexes your commits and review comments,
> and surfaces them as retrieval hits whenever an agent writes or reviews
> new code in your style.

Try it now from Claude Code — drop this into `~/.claude.json` and reload:

```json
{
  "mcpServers": {
    "github-twin": { "command": "uvx", "args": ["github-twin", "serve"] }
  }
}
```

> **Your code stays on your box.** Embeddings are computed locally
> (Ollama or sentence-transformers); only the LLM seam (`gt summarize`,
> `gt distill`, `gt eval`) optionally calls a hosted provider, and even
> that's swappable to local Ollama. The `gemini` embedder is the one
> exception — opt-in only.

---

A personal RAG over your GitHub history, served to Claude Code (or any MCP
client) as a stdio server. Two scopes, one codebase:

- **User mode** — index your own commits + review comments. Surfaces your
  past code as style examples and your past comments as review hints when
  an agent is writing or reviewing new code.
- **Org mode** — index a whole GitHub org's files-at-HEAD, commits, and PR
  reviews across every member. Queries scope by repo, language, or
  reviewer login.

Retrieval is hybrid (BM25 + vector via RRF), AST-aware via tree-sitter for
python/scala/javascript/typescript/go/rust, and contextually enriched at
embed time with per-chunk headers + optional LLM-generated summaries.

## Install

The fastest path is [uvx](https://docs.astral.sh/uv/) — no virtualenv to
manage, isolated per-tool:

```sh
# One-shot
uvx github-twin --help

# Pinned version
uvx github-twin@0.1.0 --help

# With sentence-transformers for the alt embedder
uvx --with 'github-twin[st]' github-twin --help
```

If you prefer a project-local install:

```sh
uv add github-twin            # or: pip install github-twin
gt --help
```

`gt` and `github-twin` are the same Typer app — use whichever fits your
muscle memory.

## Authenticate

Pick whichever has the least friction; github-twin tries them in this order:

1. **OAuth device flow (no `gh` install needed):**
   ```sh
   uvx github-twin auth login        # opens browser, persists token
   uvx github-twin auth status       # show which source is active
   ```
   Token persists in the OS keyring (macOS Keychain / Linux Secret
   Service / Windows Credential Manager) or, when unavailable, a 0600
   file under your data dir.
2. **Existing `gh` CLI**: if you've already run `gh auth login`,
   `gt` picks up the token via `gh auth token` — nothing to do.
3. **`GITHUB_TOKEN` env var**: a classic PAT works too; useful for CI /
   headless / docker. Required scopes: `repo`, `read:org`, `user:email`.
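
The lookup order above can be sketched as a small resolver. This is an illustrative sketch, not github-twin's actual code: `resolve_token`, `token_from_store`, and the source labels are hypothetical names, and the keyring / 0600-file lookup is stubbed out. Only `gh auth token` and the `GITHUB_TOKEN` variable come from the text above.

```python
import os
import subprocess
from typing import Callable, List, Optional, Tuple

TokenSource = Tuple[str, Callable[[], Optional[str]]]


def token_from_gh_cli() -> Optional[str]:
    """Return the token held by an installed `gh` CLI, or None if unavailable."""
    try:
        result = subprocess.run(
            ["gh", "auth", "token"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None  # gh is not installed, or not logged in
    return result.stdout.strip() or None


def token_from_env() -> Optional[str]:
    """Return GITHUB_TOKEN from the environment, or None if unset or empty."""
    return os.environ.get("GITHUB_TOKEN") or None


def resolve_token(sources: List[TokenSource]) -> Optional[Tuple[str, str]]:
    """Try each source in order; return (source_name, token) for the first hit."""
    for name, load in sources:
        token = load()
        if token:
            return name, token
    return None


# Hypothetical stand-in for the keyring / 0600-file lookup, which is not shown here.
def token_from_store() -> Optional[str]:
    return None


if __name__ == "__main__":
    found = resolve_token([
        ("stored", token_from_store),
        ("gh", token_from_gh_cli),
        ("env", token_from_env),
    ])
    print(found[0] if found else "no token found")
```

Passing the sources as an ordered list keeps the precedence explicit and makes each source easy to swap for a stub in tests.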

## Wire into Claude Code

The MCP server runs over stdio via `github-twin serve` (or `gt serve`).
Run `uvx github-twin auth login` once on the box that will host the
server.

**Option A — via the Claude Code plugin marketplace** (lowest-friction):

```
/plugin marketplace add ChristopherDavenport/christopherdavenport-marketplace
/plugin install github-twin@christopherdavenport
```

This registers the MCP server entry automatically; set
`GT_PATHS__DATA_DIR` in your environment (or in `~/.claude.json`'s
`env` block for this server) to point at the DB directory.

**Option B — manual wiring**: add an entry to `~/.claude.json` (or
your `mcp_servers.json`):

```json
{
  "mcpServers": {
    "github-twin": {
      "command": "uvx",
      "args": ["github-twin", "serve"],
      "env": {
        "GT_PATHS__DATA_DIR": "/path/to/your/github-twin-data"
      }
    }
  }
}
```

If you'd rather not persist a token and instead supply it inline (CI,
ephemeral container), add `"GITHUB_TOKEN": "ghp_..."` to that `env`
block; it acts as the lowest-priority fallback.

Restart Claude Code; the `find_*`, `predict_review_outcome`,
`summarize_review_patterns`, and `sync` tools will be available.

## Quickstart

Pick a directory to hold the SQLite DB, config, and ingested cache —
everything per-data-dir lives under this one root:

```sh
export GT_PATHS__DATA_DIR=~/github-twin-data
uvx github-twin auth login                 # one-time OAuth (or set GITHUB_TOKEN)

# user mode (your own GitHub history)
uvx github-twin init                       # discover identity via /user
uvx github-twin sync                       # ingest + summarize + embed
uvx github-twin serve                      # MCP server over stdio

# layer an org into the SAME DB
uvx github-twin init --kind org --org http4s
uvx github-twin sync

# OR keep the org in its own DB by switching data dirs
GT_PATHS__DATA_DIR=~/twin-http4s \
  uvx github-twin init --kind org --org http4s
GT_PATHS__DATA_DIR=~/twin-http4s uvx github-twin sync
```

`gt sync` is incremental on subsequent runs.

`config.toml` lives next to the DB at `<data-dir>/config.toml` and is
created on the first `gt init --embed-backend ...` call. The default
`<data-dir>` is `$XDG_DATA_HOME/github-twin` (or `~/.local/share/github-twin`)
when `GT_PATHS__DATA_DIR` is unset.

## LLM provider matrix

The retrieval surface (`find_*`, `predict_review_outcome`) always runs locally
on the SQLite index, with no API call. **LLM calls only happen** in three
places:

- `gt distill` — clusters review comments / commits into rules.
- `gt summarize` — generates per-chunk NL summaries used by the embed-time
  prefix.
- `gt eval reviews` / `eval predictions` — held-out RAG-vs-baseline scoring.

Each picks a backend by precedence **Claude → Gemini → Ollama**: the first
provider with credentials set wins, with local Ollama as the fallback. You
can also force one explicitly.

| Provider | Env var | What it covers |
|---|---|---|
| Anthropic (Claude) | `ANTHROPIC_API_KEY` | Distill / summarize / eval LLM. Best quality. |
| Google (Gemini, API key) | `GEMINI_API_KEY` or `GOOGLE_API_KEY` | Distill / summarize / eval LLM. Free tier is generous. |
| Google (Gemini, Vertex / ADC) | `GT_GEMINI_PROJECT` (+ optional `GT_GEMINI_LOCATION`, default `us-central1`) | Same coverage as the API-key row, but auth via `gcloud auth application-default login`, so no key lives in your shell. API key wins if both are set. |
| Ollama (local) | `OLLAMA_HOST` (default `http://127.0.0.1:11434`) | Distill / summarize / eval LLM. Fully offline. |

The Vertex / ADC path needs the `aiplatform.googleapis.com` API enabled
on your project, and **billing applies even for "free" Gemini models** —
the AI Studio free tier does not extend to Vertex. Project IDs are not
secrets; the credential itself lives at
`~/.config/gcloud/application_default_credentials.json` and is refreshed
by gcloud.
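
The provider precedence from the table can be sketched as a small selection function. This is a hedged illustration, not the project's actual resolver: `pick_llm_backend` is a hypothetical name, and treating "no cloud credentials" as "use Ollama" is an assumption based on the precedence order. The env var names and the `claude` / `gemini` / `ollama` backend labels are taken from this README.

```python
from typing import Mapping, Optional


def pick_llm_backend(env: Mapping[str, str], forced: Optional[str] = None) -> str:
    """Pick an LLM backend by precedence Claude -> Gemini -> Ollama."""
    if forced:
        return forced  # an explicit choice bypasses auto-detection
    if env.get("ANTHROPIC_API_KEY"):
        return "claude"
    # Either Gemini auth path (API key, or Vertex project + ADC) selects gemini.
    if (
        env.get("GEMINI_API_KEY")
        or env.get("GOOGLE_API_KEY")
        or env.get("GT_GEMINI_PROJECT")
    ):
        return "gemini"
    return "ollama"  # assumed local fallback when no cloud credentials are set


if __name__ == "__main__":
    import os

    print(pick_llm_backend(os.environ))
```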

### Embedder backends

We keep the embedder backend separate from the LLM backend. Choose one:

- **Default — Ollama** (`nomic-embed-text`, 768-dim, ~50ms/chunk).
  Requires a running Ollama daemon. Zero cost, fully local.
- **Alternative — sentence-transformers** (`uv add 'github-twin[st]'`,
  pulls `torch`). Useful when an Ollama daemon isn't available or you
  want a specific HuggingFace model. Local.
- **Alternative — Gemini** (`gemini-embedding-001` at 3072-dim by
  default). Uses the `google-genai` dep that's already installed; auth
  via `GEMINI_API_KEY` / `GOOGLE_API_KEY`, or via `GT_GEMINI_PROJECT`
  + ADC (`gcloud auth application-default login`) to route through
  Vertex AI without managing a key. **Remote** — this is the only
  embedder that sends chunk text off-box. Pick it when you have Gemini
  auth but no Ollama / `[st]` install, and your corpus is okay to
  share with Google.

The embedder is a per-DB commitment: `sqlite-vec` bakes the vector
dimension into the table at first creation. Stamp the choice into
`<data-dir>/config.toml` at init time so every subsequent command
picks it up:

```sh
gt init --embed-backend gemini                              # gemini-embedding-001, 3072
gt init --embed-backend gemini --embed-dim 1536            # request shorter output
gt init --embed-backend sentence_transformers \
        --embed-model BAAI/bge-small-en-v1.5 --embed-dim 384
```

Re-running with the same values is a no-op; running with different
values against an existing `config.toml` fails loudly rather than
silently changing the corpus. `GT_EMBED__*` env vars still work for
one-off overrides and CI.
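
The "no-op on match, fail on mismatch" behaviour can be pictured with a short sketch. Everything here is hypothetical (`EmbedChoice`, `EmbedderMismatch`, `stamp_or_verify`); it only illustrates the rule described above and says nothing about how github-twin stores or compares the values internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EmbedChoice:
    backend: str
    model: str
    dim: int


class EmbedderMismatch(Exception):
    """Raised when a DB was created with a different embedder configuration."""


def stamp_or_verify(existing: Optional[EmbedChoice], requested: EmbedChoice) -> EmbedChoice:
    """Stamp the choice on first init; afterwards accept only identical values."""
    if existing is None:
        return requested  # first init: record the choice
    if existing != requested:
        # Changing the dimension would invalidate vectors already in the table.
        raise EmbedderMismatch(
            f"DB was initialised with {existing}, refusing to switch to {requested}"
        )
    return existing  # same values: no-op


if __name__ == "__main__":
    first = stamp_or_verify(None, EmbedChoice("gemini", "gemini-embedding-001", 3072))
    print(stamp_or_verify(first, first))
```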

A "cloud-LLM only" setup either needs an embedder process (Ollama /
`[st]`) or has to opt into the remote Gemini embedder.

## Required GitHub token scopes

When you `gt init`, the GH client needs:

- `repo` — private repos and PR comments on them
- `user:email` — verified email addresses for the user-mode identity sweep
- `read:org` — org member listing and private org repo discovery

A fine-grained PAT works; classic tokens too.

## Retrieval

Hybrid search by default: BM25 (SQLite FTS5) and vector similarity run in
parallel, then fuse via Reciprocal Rank Fusion (k=60). The vector leg
matches semantic intent; the BM25 leg catches exact identifiers
(`getUserById`, `SQLITE_OPEN_READWRITE`) that vector search routinely
misses. Design reference: [Anthropic — Contextual
Retrieval](https://www.anthropic.com/engineering/contextual-retrieval).
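
As an illustration of the fusion step, here is a minimal Reciprocal Rank Fusion sketch with `k=60`. The function name `rrf_fuse` and the sample IDs are hypothetical; the formula is the standard RRF score, `1 / (k + rank)` summed over every ranking a document appears in, and is not copied from the project's source.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Sequence


def rrf_fuse(rankings: Iterable[Sequence[Hashable]], k: int = 60) -> List[Hashable]:
    """Fuse several best-first rankings into one list, highest fused score first."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


if __name__ == "__main__":
    bm25_hits = ["a", "b", "c"]    # exact-identifier matches, best first
    vector_hits = ["c", "a", "d"]  # semantic matches, best first
    print(rrf_fuse([bm25_hits, vector_hits]))
```

Because RRF uses only ranks, the BM25 scores and vector distances never need to be normalised against each other.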

At embed time, each chunk gets a deterministic header prepended:
`# path :: symbol_name (node_kind)`, plus the function's leading
docstring/comment when present, plus an optional LLM-generated summary
(see `gt summarize`). The header lets vector queries land on chunks
whose bodies only contain identifiers (e.g. natural-language queries
against a `VaultSecretEq` function).
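
A rough sketch of how such an embed-time text could be assembled follows. Only the header format `# path :: symbol_name (node_kind)` comes from the description above; the function name `build_embed_text`, the part ordering after the header, the newline separators, and the sample path are assumptions made for illustration.

```python
from typing import Optional


def build_embed_text(
    path: str,
    symbol_name: str,
    node_kind: str,
    body: str,
    docstring: Optional[str] = None,
    summary: Optional[str] = None,
) -> str:
    """Prepend a deterministic header (plus optional context) to a chunk body."""
    parts = [f"# {path} :: {symbol_name} ({node_kind})"]
    if docstring:
        parts.append(docstring.strip())
    if summary:
        parts.append(summary.strip())  # optional LLM-generated summary
    parts.append(body)
    return "\n".join(parts)


if __name__ == "__main__":
    print(
        build_embed_text(
            path="src/vault/eq.py",  # hypothetical path
            symbol_name="VaultSecretEq",
            node_kind="function_definition",
            body="def VaultSecretEq(a, b): return a.id == b.id",
            docstring="Compare two vault secrets by id.",
        )
    )
```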

BM25 query expansion is on by default (`cfg.retrieval.query_expansion =
"rule"`), with rule-based code-shaped synonyms applied **only** to the
BM25 leg — embeddings already capture synonymy, so expansion never
touches the vector query. Switch to `"ollama"` to add LLM-generated
alternates on top, cached on disk per-token.
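
To make the rule-based leg concrete, here is a small sketch that expands a query into an FTS5 `MATCH` expression. The synonym table and the name `expand_bm25_query` are hypothetical, as the project's real rules are not shown here. The output relies only on standard FTS5 syntax: quoted strings, `OR`, parentheses, and implicit `AND` between terms.

```python
import re
from typing import Dict, List

# Hypothetical code-shaped synonym rules, for illustration only.
SYNONYMS: Dict[str, List[str]] = {
    "delete": ["remove", "drop"],
    "auth": ["authenticate", "authorize"],
}


def expand_bm25_query(query: str, synonyms: Dict[str, List[str]] = SYNONYMS) -> str:
    """Turn a free-text query into an FTS5 expression with OR-ed synonyms."""
    groups = []
    for token in re.findall(r"\w+", query):
        alternates = [token] + synonyms.get(token.lower(), [])
        quoted = " OR ".join('"{}"'.format(alt) for alt in alternates)
        # Parenthesise only when there is more than one alternate.
        groups.append("({})".format(quoted) if len(alternates) > 1 else quoted)
    return " ".join(groups)  # a space between groups is an implicit AND in FTS5


if __name__ == "__main__":
    print(expand_bm25_query("delete user"))
```

The vector query would be embedded from the original, unexpanded text, in line with the BM25-only rule above.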

`predict_review_outcome` stays on pure vector retrieval because its
inverse-distance vote weighting depends on calibrated L2 distance.
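
Inverse-distance voting can be sketched in a few lines. This is an assumed formulation (weight `1 / (distance + eps)`, normalised to sum to 1), not necessarily the exact weighting github-twin uses; the name `predict_outcome` and the `eps` guard are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def predict_outcome(
    neighbors: Iterable[Tuple[str, float]], eps: float = 1e-6
) -> Dict[str, float]:
    """Weight each neighbour's outcome by 1 / distance and normalise the votes."""
    weights: Dict[str, float] = defaultdict(float)
    for outcome, distance in neighbors:
        weights[outcome] += 1.0 / (distance + eps)  # eps avoids division by zero
    total = sum(weights.values())
    if total == 0:
        return {}
    return {outcome: weight / total for outcome, weight in weights.items()}


if __name__ == "__main__":
    # (outcome of a past PR, L2 distance from the query) pairs
    nearest = [("approved", 0.5), ("changes_requested", 1.0), ("approved", 1.0)]
    print(predict_outcome(nearest))
```

This also shows why fused RRF ranks would not work here: the vote needs actual distances, not just an ordering.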

## MCP tools

Most retrieval tools accept optional `repo=` and `author_login=` filters; the signatures below list each tool's exact parameters.

| Tool | Returns |
|---|---|
| `find_review_comments(diff_hunk, language?, repo?, author_login?, k=5)` | Past review comments on diffs similar to the input. |
| `find_style_examples(query, language?, repo?, author_login?, k=5)` | Past code chunks matching a description. |
| `find_code(query, language?, repo?, path_glob?, node_kind?, k=5)` | Source snippets from files at HEAD (org mode). |
| `find_applicable_rules(query, language?, repo?, author_login?, k=5)` | Distilled code-pattern rules relevant to a coding task. |
| `predict_review_outcome(diff_or_summary, language?, repo?, author_login?, k=20)` | Weighted prediction over nearest past PRs: `{approved, changes_requested, commented}`. |
| `summarize_review_patterns(language?, limit=20)` | Distilled rules from clustered review comments (run `gt distill` first). |
| `sync(since?)` | Incremental ingest + summarize + embed. |

## CLI

```
gt init [--kind user|org|repo] [--org N] [--repo owner/name]
gt repos                                       # list discovered org repos
gt ingest                                      # backfill
gt summarize [--limit N] [--backend ...]       # LLM NL summaries per chunk
gt embed                                       # embed pending chunks
gt sync [--skip-summarize]                     # incremental: ingest → summarize → embed
gt stats                                       # corpus counts
gt distill [--backend ...] [--author ...]      # rule extraction
gt clones prune [--older-than-days N]          # GC the persistent clone cache
gt eval reviews     --since DATE [...]         # held-out RAG-vs-baseline eval
gt eval predictions --since DATE [...]
gt eval search evals/queries/default.yaml      # retrieval-quality dogfood
gt serve                                       # MCP stdio server
```

Use `github-twin <command>` interchangeably with `gt <command>`.

## Pluggable backends

| Surface | Env / config key | Default | Alt |
|---|---|---|---|
| LLM (`cfg.distill.backend`, `cfg.summarize.backend`) | `ANTHROPIC_API_KEY` / `GEMINI_API_KEY` (or `GT_GEMINI_PROJECT` + ADC) / Ollama | `auto` (cloud > local) | force `claude` / `gemini` / `ollama` |
| Embedder (`cfg.embed.backend`) | — / `GEMINI_API_KEY` (or `GT_GEMINI_PROJECT` + ADC) | `ollama` (`nomic-embed-text`) | `sentence_transformers` via `[st]` extra, or `gemini` (`gemini-embedding-001`, remote) |
| Vector store (`cfg.vector_store.backend`) | — | `sqlite-vec` (brute-force KNN) | `faiss` via `[faiss]` extra |
| BM25 query expansion (`cfg.retrieval.query_expansion`) | — | `rule` (deterministic) | `ollama` (LLM, cached) or `off` |

All settings are layered: defaults → `<data-dir>/config.toml` (or the
explicit `--config PATH`) → env vars prefixed `GT_` (nested via `__`,
e.g. `GT_EMBED__BACKEND=sentence_transformers`).
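
The layering can be illustrated with a short sketch that maps `GT_`-prefixed variables onto nested keys and merges the layers in order. The helper names (`env_overrides`, `deep_merge`) are hypothetical, values are kept as plain strings, and the real settings loader may well behave differently in the details.

```python
from typing import Any, Dict, Mapping


def env_overrides(env: Mapping[str, str], prefix: str = "GT_") -> Dict[str, Any]:
    """Map GT_SECTION__KEY=value onto {"section": {"key": value}}."""
    out: Dict[str, Any] = {}
    for key, value in env.items():
        if not key.startswith(prefix):
            continue
        path = key[len(prefix):].lower().split("__")
        node = out
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return out


def deep_merge(base: Mapping[str, Any], override: Mapping[str, Any]) -> Dict[str, Any]:
    """Return base updated with override, merging nested dicts key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


if __name__ == "__main__":
    defaults = {"embed": {"backend": "ollama", "model": "nomic-embed-text"}}
    file_cfg = {"retrieval": {"query_expansion": "rule"}}  # as if read from config.toml
    env = {"GT_EMBED__BACKEND": "sentence_transformers", "HOME": "/home/me"}
    print(deep_merge(deep_merge(defaults, file_cfg), env_overrides(env)))
```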

## Held-out evaluation

`gt eval` runs the same prompt with and without retrieval and measures
RAG's accuracy lift on real held-out data:

```sh
# Review-comment voice match (cosine distance to ground truth)
uvx github-twin eval reviews --since 2025-01-01 --limit 100

# Org-mode: scope to one reviewer (and optionally one repo)
uvx github-twin eval reviews     --since 2025-01-01 --author alice --repo http4s/http4s
uvx github-twin eval predictions --since 2025-01-01 --author alice

# Retrieval-quality dogfood (per-tier, per-backend pass rates)
uvx github-twin eval search evals/queries/default.yaml --mode all
```

The harness pre-flights eligibility counts so a typo'd `--author` or
`--repo` value fails fast without burning LLM calls. The judge embedder
defaults to a different model than the retriever
(sentence-transformers BGE-small with the `[st]` extra installed) to
avoid measuring how well retrieval clusters its own outputs.

## Observability (OpenTelemetry)

Spans are emitted for every MCP tool call, every embedder call, and every
retrieval leg, and exported via OTLP. Telemetry is **auto-detected**:
nothing fires unless the environment is configured. Specifically:

1. Install the `[otel]` extra (carries the SDK + HTTP OTLP exporter):

   ```sh
   uvx --with 'github-twin[otel]' github-twin serve
   ```

2. Point at an OTLP HTTP collector via env vars:

   ```sh
   export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
   export OTEL_SERVICE_NAME=github-twin            # optional
   ```

`OTEL_SDK_DISABLED=true` forces it off even when an endpoint is set.
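
The gating rules above amount to a small predicate. The sketch below is an assumption about how the checks combine, not the project's implementation: `telemetry_enabled` is a hypothetical name, the `[otel]` check is passed in as a flag, and only the `OTEL_EXPORTER_OTLP_ENDPOINT` variable named in this section is consulted.

```python
from typing import Mapping


def telemetry_enabled(env: Mapping[str, str], sdk_installed: bool) -> bool:
    """Return True only when the SDK is present, an endpoint is set, and OTel is not disabled."""
    if env.get("OTEL_SDK_DISABLED", "").strip().lower() == "true":
        return False  # the explicit kill switch wins over everything else
    if not sdk_installed:
        return False  # without the [otel] extra, spans stay no-ops
    return bool(env.get("OTEL_EXPORTER_OTLP_ENDPOINT"))


if __name__ == "__main__":
    import os

    print(telemetry_enabled(os.environ, sdk_installed=False))
```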

Without `[otel]` *or* without an endpoint env var, the code paths
still run but every span is a free no-op from `opentelemetry-api`'s
built-in tracer. **stdout is never used** — even with telemetry on —
because MCP speaks JSON over stdin/stdout and a stray console exporter
would corrupt the channel. The OTLP HTTP exporter posts to your
collector; SDK warnings route through Python `logging` (stderr).

Wired into Claude Code:

```json
{
  "mcpServers": {
    "github-twin": {
      "command": "uvx",
      "args": ["--with", "github-twin[otel]", "github-twin", "serve"],
      "env": {
        "GITHUB_TOKEN": "ghp_...",
        "GT_PATHS__DATA_DIR": "/path/to/twin-data",
        "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4318",
        "OTEL_SERVICE_NAME": "github-twin"
      }
    }
  }
}
```

Span names + key attributes you can pivot on:

| Span | Useful attributes |
|---|---|
| `mcp.tool.{find_review_comments,find_style_examples,find_code,find_applicable_rules,predict_review_outcome,summarize_review_patterns,sync}` | `gh_twin.tool.k`, `gh_twin.filter.*`, `gh_twin.result.count` (or `.prediction`/`.confidence` for predict) |
| `embedder.embed` | `gh_twin.embed.input_chars`, `gh_twin.embed.model` |
| `retrieval.hybrid_search` | `gh_twin.retrieval.{chunk_kind,k,expander,hits,top_distance}` |
| `retrieval.vector_search` (predict_review_outcome) | same shape, sans `expander` |

A broken or unreachable collector emits a single `Failed to export
span batch` log line per flush attempt and never propagates into the
tool handler — pinned by `tests/test_observability.py`.

gRPC us

…

## Source & license

This open-source MCP server is cataloged on AgentStack and links to its original source — we do not rehost the code.

- **Author:** [ChristopherDavenport](https://github.com/ChristopherDavenport)
- **Source:** [ChristopherDavenport/github-twin](https://github.com/ChristopherDavenport/github-twin)
- **License:** MIT

Install and usage instructions live in the source repository linked above.

## Pricing

- **Free** — Free

## Versions

- **0.0.10** — security scan: pending review — Imported from the upstream source.

## Links

- Listing page: https://agentstack.voostack.com/l/mcp-christopherdavenport-github-twin
- Seller: https://agentstack.voostack.com/s/christopherdavenport
- Browse the marketplace: https://agentstack.voostack.com/browse

---
Listed on AgentStack — the marketplace for AI agent skills and MCP servers. Every listing is security-reviewed. Creators keep 70%.
