# Research Hub

> Research workspace for Zotero + Obsidian + NotebookLM. Search, ingest, sync, verify briefs.

- **Type:** MCP server
- **Install:** `agentstack add mcp-wenyuchiou-research-hub`
- **Verified:** Pending review
- **Seller:** [WenyuChiou](https://agentstack.voostack.com/s/wenyuchiou)
- **Installs:** 0
- **Latest version:** 1.0.0
- **License:** MIT
- **Upstream author:** [WenyuChiou](https://github.com/WenyuChiou)
- **Source:** https://github.com/WenyuChiou/research-hub
- **Website:** https://pypi.org/project/research-hub-pipeline/

## Install

```sh
agentstack add mcp-wenyuchiou-research-hub
```

Requires the [AgentStack CLI](https://agentstack.voostack.com/docs/cli). Works with Claude Code, Cursor, and any MCP-compatible agent.

## About

# research-hub

> **Turn your research stack into an AI-operable workspace.**
> Use Zotero, Obsidian, and NotebookLM together, or start with any two. research-hub gives your AI assistant a real CLI, MCP server, REST API, and dashboard for repeatable literature workflows.


Traditional Chinese: [README.zh-TW.md](README.zh-TW.md) | [Watch the full-res mp4](docs/demo/dashboard-walkthrough.mp4)

> 📚 Part of the [**agentic AI learning roadmap**](https://github.com/WenyuChiou/awesome-agentic-ai-zh) — a 7-stage curated path for building agentic AI, multilingual (zh-TW · zh-Hans · English). This workspace is referenced in §13 (research workflow skills).

> 🧪 **Real-use signal:** in daily use by 1 PhD researcher (Lehigh CEE) tracking 7+ research clusters across Zotero + Obsidian + NotebookLM. Shipping since Apr 2026, docs updated for v0.95.0.

---

## Quick start

```bash
pip install research-hub-pipeline
research-hub dashboard --sample   # preview with sample data, no accounts needed
```

For a real research-hub vault with Zotero / Obsidian / NotebookLM integration,
pick the install path matching your stack in [§ Start Here](#start-here).

---

## Contents

1. [Quick start](#quick-start)
2. [Real Screenshots](#real-screenshots)
3. [Is this for me?](#is-this-for-me--vs-alternatives)
4. [Start Here](#start-here)
5. [First-Run Checklist](#first-run-checklist)
6. [Credential Reference](#credential-reference)
7. [Connect your AI host](#connect-your-ai-host)
8. [Why this exists](#why-this-exists)
9. [What it does](#what-it-does)
10. [Operator Modes](#operator-modes)
11. [Dashboard tour](#dashboard-tour)
12. [Inside Zotero](#inside-zotero)
13. [Feature matrix](#feature-matrix)
14. [Troubleshooting](#troubleshooting)
15. [Known limitations](#known-limitations)
16. [Docs + Status + Dev](#docs--status--dev)
17. [License](#license)

---

## Real Screenshots

These are generated by a real research-hub vault, not mockups.

**Obsidian paper note**: Markdown note with title, authors, DOI, Zotero key,
tags, cluster, status, and verification metadata.

**Obsidian Bases dashboard**: generated `.base` file with sortable paper
metadata and reading status.

**Obsidian graph view**: managed topic folders and labels can be colored with
`research-hub vault graph-colors --refresh`.

Generated crystals are also plain Markdown notes under
`hub//crystals/*.md`, so they can be linked, searched, and read
by MCP tools at low token cost.

---

## Is this for me? — vs alternatives

research-hub does not replace Zotero, Obsidian, or NotebookLM. It connects them so an AI agent can operate the workflow.

| What you can do | Zotero alone | NotebookLM alone | Generic RAG | Obsidian-Zotero plugin | research-hub |
|---|---:|---:|---:|---:|---:|
| Search arXiv + Semantic Scholar in one command | No | No | DIY | No | Yes |
| Ingest into Zotero and Obsidian and NotebookLM | No | No | DIY | Partial | Yes |
| AI brief from your collection | No | Manual | DIY | No | Yes |
| Cached canonical answers | No | No | Re-fetches | No | Yes |
| Structured memory layer | No | No | Usually chunks | No | Yes |
| Direct AI-agent control via MCP | No | No | DIY | No | Yes |
| Live dashboard with action buttons | No | No | No | No | Yes |
| Per-cluster Obsidian Bases dashboard | No | No | No | No | Yes |
| No OpenAI/Anthropic API key required | n/a | Yes | Usually no | n/a | Yes |
| Local-first vault you own | Partial | No | Depends | Yes | Yes |

The practical fit: research-hub is most useful if you already use at least two of Zotero, Obsidian, and NotebookLM and want your AI assistant to run the repetitive steps.

---

## Start Here

Pick the path with the fewest moving parts. You can add Zotero,
NotebookLM, MCP, or AI-host skills later.

| Goal | Accounts needed | Commands |
|---|---|---|
| Preview the dashboard only | None | `pip install research-hub-pipeline` then `research-hub dashboard --sample` |
| Try a demo vault | None | `pip install research-hub-pipeline` then `research-hub init --sample` |
| Work from local PDFs/DOCX/Markdown | Obsidian optional | `pip install "research-hub-pipeline[import,secrets]"` then `research-hub setup --persona analyst` |
| Zotero + Obsidian, no browser automation | Zotero | `pip install "research-hub-pipeline[secrets]"` then `research-hub setup --skip-login` |
| Full Zotero + Obsidian + NotebookLM loop | Zotero + Google | `pip install "research-hub-pipeline[playwright,secrets]"` then `research-hub setup` |
| Autonomous agent bootstrap | Existing vault or target folder | `python -m research_hub setup --autonomous --vault ./vault --persona agent` |

After setup, run:

```bash
research-hub doctor
research-hub serve --dashboard
```

For the first real ingestion, keep NotebookLM out of the path until
Zotero and Obsidian are healthy:

```bash
research-hub auto "agent-based modeling" --max-papers 3 --no-nlm
```

Then enable NotebookLM after the browser login works:

```bash
research-hub notebooklm login --auto-detect
research-hub notebooklm bundle --cluster 
research-hub notebooklm upload --cluster 
research-hub notebooklm generate --cluster  --type brief
research-hub notebooklm download --cluster 
```

`research-hub setup` also prints these next steps when it finishes.

## First-Run Checklist

| Item | Needed when | How to handle it |
|---|---|---|
| Python 3.10+ | Always | Use the same Python that runs `pip install research-hub-pipeline` |
| Zotero API key + library ID | Zotero-backed paper ingestion | Set `ZOTERO_API_KEY` and `ZOTERO_LIBRARY_ID`, then run `research-hub doctor` |
| Obsidian vault | Markdown note workflow | Point `setup` at a folder you can open in Obsidian; it is still plain Markdown |
| NotebookLM browser login | NotebookLM upload/generate/download | Run `research-hub notebooklm login --auto-detect`; Google OAuth still requires a visible human sign-in |
| LLM CLI for relevance judging | `research-hub auto` default path | Install `claude`, `codex`, `gemini`, `opencode`, `aichat`, `cursor`, configure a custom adapter, or pass `--no-fit-check` |
| AI-host integration | Claude/Codex/Cursor/Gemini/OpenClaw/etc. | Use MCP/REST for tool-calling hosts; use `research-hub install --platform ...` only for verified skill installer targets |
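
The first two checklist rows can be verified before calling `research-hub doctor`. The following is a minimal sketch, not part of research-hub: the `preflight` helper is hypothetical and only checks the Python version and the two Zotero variables named above.

```python
import os
import sys

REQUIRED_ZOTERO_VARS = ("ZOTERO_API_KEY", "ZOTERO_LIBRARY_ID")


def preflight(env=None, version=None):
    """Return a list of problems; an empty list means the basics are in place."""
    env = os.environ if env is None else env
    version = tuple(sys.version_info[:2]) if version is None else version
    problems = []
    if version < (3, 10):
        problems.append(f"Python 3.10+ required, found {version[0]}.{version[1]}")
    for name in REQUIRED_ZOTERO_VARS:
        # An empty string counts as unset.
        if not env.get(name):
            problems.append(f"{name} is not set")
    return problems


for problem in preflight():
    print("preflight:", problem)
```

An empty result only means the prerequisites exist; `research-hub doctor` remains the actual health check.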

## Credential Reference

Only `ZOTERO_API_KEY` and `ZOTERO_LIBRARY_ID` are required, and only for
Zotero-backed workflows; the other variables are optional. Local file
import, sample dashboards, MCP server startup, and REST API inspection
can run without any of them.

| Name | Required | Purpose |
|---|---|---|
| `ZOTERO_API_KEY` | yes | Zotero web API auth, required for paper ingestion |
| `ZOTERO_LIBRARY_ID` | yes | Zotero library identifier |
| `SEMANTIC_SCHOLAR_API_KEY` | no | Semantic Scholar (S2) API key; keyed requests default to a conservative ~1 request/sec throttle |
| `SEMANTIC_SCHOLAR_RPS` | no | Optional S2 request-rate override; leave unset unless your key has a different quota |
| `TAVILY_API_KEY` | no | Web search backend (alternative to DDG) |
| `BRAVE_API_KEY` | no | Web search backend (alternative to DDG) |

Semantic Scholar searches are deliberately paced. Without
`SEMANTIC_SCHOLAR_API_KEY`, research-hub uses a slower anonymous delay
because public traffic shares capacity. With a key, the default is
approximately one request per second and 429 responses are retried with
`Retry-After` / exponential backoff. If Semantic Scholar grants your key
a different quota, set `SEMANTIC_SCHOLAR_RPS` instead of editing code.
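
To illustrate the `Retry-After` / exponential backoff behavior described above, here is a minimal sketch of how a wait time could be chosen after a 429. It is not research-hub's implementation: the function name and the `base`, `cap`, and `jitter` defaults are assumptions for illustration.

```python
import random


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0, jitter=0.0):
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    if retry_after is not None:
        try:
            # Prefer the server's hint when it is given in seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not handled in this sketch
    # Otherwise double the wait on each attempt, up to a ceiling.
    delay = min(cap, base * (2 ** attempt))
    return delay + random.uniform(0.0, jitter)


for attempt in range(4):
    print(attempt, retry_delay(attempt))
```

Honoring the server's hint first and falling back to doubling keeps a client polite when the quota is shared, which is the concern the paragraph above raises for anonymous traffic.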

## Connect your AI host

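research-hub has two AI-facing integration layers, MCP / REST and `SKILL.md` files; the skill files are either installed by the CLI or loaded manually: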

| Layer | Best for | Current status |
|---|---|---|
| MCP / REST | Claude Desktop, Claude Code, Cursor, Continue.dev, Cline, Roo Code, VS Code Copilot, OpenClaw, and other tool-calling hosts | Host-agnostic; configure the MCP server or call the REST API |
| Installed `SKILL.md` files | Claude Code, Codex, Cursor, Gemini | Built-in installer targets via `research-hub install --platform ...` |
| Manual `SKILL.md` loading | Hermes, OpenClaw, other agents with skill/rules directories | Copy or reference the bundled skill directories manually; not release-verified as installer targets |

For Claude Desktop, Cursor, Continue.dev, Cline, VS Code Copilot, OpenClaw, or another MCP host, configure the MCP server:

```json
{ "mcpServers": { "research-hub": { "command": "research-hub", "args": ["serve"] } } }
```
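
If the host already has an MCP config with other servers, the entry has to be merged rather than pasted over the file. A minimal sketch of that merge follows; `other-tool` is a hypothetical existing server, and the config file location is host-specific and not covered here.

```python
import copy
import json

RESEARCH_HUB_ENTRY = {"command": "research-hub", "args": ["serve"]}


def add_research_hub(config):
    """Return a copy of a host config dict with the research-hub server added."""
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["research-hub"] = copy.deepcopy(RESEARCH_HUB_ENTRY)
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-tool": {"command": "other-tool", "args": []}}}
print(json.dumps(add_research_hub(existing), indent=2))
```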

Restart the host. Then ask naturally:

> Find me 5 papers on agent-based modeling and put them in a notebook.

The AI can call `auto_research_topic(topic="agent-based modeling", max_papers=5)` to ingest papers, generate a NotebookLM brief, and update the vault.

Install host-specific skill files for the platforms with known default skill directories:

```bash
research-hub install --platform claude-code
research-hub install --platform cursor
research-hub install --platform codex
research-hub install --platform gemini
```

OpenClaw, Hermes, and other agents can still use research-hub through MCP/REST. If the host supports `SKILL.md`-style directories or rules files, copy the bundled directories from `skills/` or inline the relevant `SKILL.md` into the host's instructions. `research-hub install --platform` does not currently verify those hosts.

Browser-only or HTTP-capable AIs can use the REST API after starting the local server with `research-hub serve --dashboard`:

```bash
curl -X POST http://127.0.0.1:8765/api/v1/plan \
     -H "Content-Type: application/json" \
     -d "{\"intent\":\"research harness engineering\"}"
```
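
The same request can be issued from Python with only the standard library. This is a sketch mirroring the `curl` call above; the helper names are hypothetical, and because the response schema is not documented in this README, the body is returned as raw text.

```python
import json
import urllib.request


def build_plan_request(intent, base_url="http://127.0.0.1:8765"):
    """Build the POST /api/v1/plan request without sending it."""
    body = json.dumps({"intent": intent}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/plan",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_plan(intent, timeout=30):
    """Send the request to a running local server and return the raw response text."""
    with urllib.request.urlopen(build_plan_request(intent), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

`post_plan("research harness engineering")` needs the local server started with `research-hub serve --dashboard`; `build_plan_request` can be inspected without it.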

Full reference: [MCP tools](docs/mcp-tools.md), [AI integrations](docs/ai-integrations.md), [AI host support matrix](docs/ai-host-support.md), and [live smoke checklist](docs/live-smoke.md).

---

## Why this exists

Most research tools are good at one part of the workflow:

- Zotero stores citations, metadata, and PDFs.
- Obsidian stores notes, links, and synthesis.
- NotebookLM turns source bundles into AI-readable briefs.

The painful part is the handoff. research-hub connects those handoffs so an AI agent can search, ingest, tag, summarize, repair, brief, and inspect your workspace without turning your library into an opaque RAG box.

You do **not** need all three tools on day one.

| Your current stack | What research-hub gives you first |
|---|---|
| Zotero + Obsidian | Paper search, Zotero metadata, Markdown notes, tags, Obsidian Bases dashboards |
| Obsidian + NotebookLM | Local PDF/DOCX/MD/TXT ingest, cluster dashboards, NotebookLM bundles and briefs |
| Zotero + NotebookLM | Zotero-backed paper selection, namespaced tags, NotebookLM upload/generate/download |
| Zotero + Obsidian + NotebookLM | Full loop: discover -> ingest -> organize -> brief -> answer -> maintain |
| No accounts yet | Sample dashboard and local smoke tests before connecting anything |

---

## What it does

research-hub is a local-first orchestration layer for research workflows:

- **CLI:** `research-hub auto`, `import-folder`, `ask`, `doctor`, `tidy`, `clusters`, `zotero`, `notebooklm`, `crystal`, and more.
- **MCP server:** lets Claude Desktop, Claude Code, Cursor, Continue.dev, Cline, Roo Code, OpenClaw, and other MCP hosts operate the same workflow.
- **REST API:** exposes `/api/v1/*` for browser-only or HTTP-capable assistants.
- **Portable skill pack:** `SKILL.md` workflow instructions can be installed directly for Claude Code, Codex, Cursor, and Gemini, or copied manually into hosts that support skill/rules directories.
- **Dashboard:** gives humans a live view of clusters, papers, diagnostics, briefs, writing support, and management actions.
- **Vault format:** writes normal Markdown, frontmatter, `.base` dashboards, cache files, and logs that you can inspect directly.
- **Authenticity gate (v0.95+):** every discovered paper must resolve to a real identifier (DOI / arXiv / PMID) and pass integrity and relevance checks; otherwise it is **quarantined with a recorded reason** and never written to the vault. No fabricated references reach the vault; inspect rejects with `research-hub quarantine list`.

The core loop:

```text
topic or source folder
  -> discover or import sources
  -> verify authenticity (resolve + integrity + relevance) or quarantine
  -> enrich metadata
  -> write Zotero tags/notes when enabled
  -> write Obsidian Markdown notes and cluster dashboards
  -> bundle/upload/generate with NotebookLM when enabled
  -> cache answers as crystals and structured memory
```
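
The "verify or quarantine" step can be pictured with a deliberately simplified sketch. It only checks whether an identifier has the shape of a DOI, arXiv ID, or PMID, whereas the real gate also resolves the identifier and runs integrity and relevance checks. The regular expressions, field names (`identifier`, `reason`, `id_kind`), and function names are assumptions for illustration, not research-hub's code.

```python
import re

# Syntactic shapes only; matching a pattern does not prove the record exists.
ID_PATTERNS = {
    "doi": re.compile(r"^10\.\d{4,9}/\S+$"),
    "arxiv": re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$"),
    "pmid": re.compile(r"^\d{1,8}$"),
}


def classify_identifier(raw):
    """Return 'doi', 'arxiv', or 'pmid', or None if no shape matches."""
    value = raw.strip()
    for kind, pattern in ID_PATTERNS.items():
        if pattern.match(value):
            return kind
    return None


def gate(candidates):
    """Split candidates into accepted and quarantined-with-reason lists."""
    accepted, quarantined = [], []
    for paper in candidates:
        kind = classify_identifier(paper.get("identifier") or "")
        if kind is None:
            quarantined.append({**paper, "reason": "no recognizable DOI/arXiv/PMID"})
        else:
            accepted.append({**paper, "id_kind": kind})
    return accepted, quarantined
```

The point of the shape is that a rejected candidate is never dropped silently: it lands in a separate list with a reason attached, which is what `research-hub quarantine list` exposes.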

---

## Operator Modes

research-hub supports both human-first and agent-first setup.

For a human researcher, `research-hub setup` runs the onboarding wizard,
installs host-specific skills when it can detect the host, optionally
launches NotebookLM login, and offers a small sample run.

For an autonomous agent or Cowork-style host:

```bash
pip install research-hub-pipeline
python -m research_hub describe > capabilities.json
python -m research_hub setup --autonomous --vault ./vault --persona agent
# emits BootstrapReport JSON; exit code 0 if ready, 1 otherwise
```

Then drive operations via CLI `--json` mode or the bundled MCP server
(`research-hub-mcp`). All report-shaped commands accept `--json`;
capability introspection lives in `research-hub describe`.
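
An orchestrating script can combine the exit code and the JSON report as sketched below. This assumes the report is printed to stdout as a single JSON document, which this README does not state explicitly; the report is treated as an opaque dict because its fields are not documented here, and `run_bootstrap` is a hypothetical helper.

```python
import json
import subprocess
import sys


def run_bootstrap(command):
    """Run a bootstrap command; return (ready, report) from exit code and stdout."""
    result = subprocess.run(command, capture_output=True, text=True)
    try:
        report = json.loads(result.stdout)
    except json.JSONDecodeError:
        report = None  # stdout was not a single JSON document
    return result.returncode == 0, report


# The autonomous setup command from above, using the current interpreter.
BOOTSTRAP = [sys.executable, "-m", "research_hub", "setup",
             "--autonomous", "--vault", "./vault", "--persona", "agent"]
```

Calling `run_bootstrap(BOOTSTRAP)` returns `ready` as `True` only on exit code 0, so an agent can branch on readiness even if the report cannot be parsed.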

**NotebookLM boundary.** NotebookLM upload still requires one-time
human-driven browser-based Google OAuth. Headless agents can prepare
bundles and read downloaded briefs, but they cannot complete Google's
first sign-in or phone challenge by themselves.

**Relevance judge boundary.** `auto_research_topic` and `research-hub
auto` run a fail-closed relevance check by default. With no supported
LLM CLI and no `--no-fit-check`, `auto` stops before search and prints
the fix instead of silently producing an empty vault.
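
Before launching `auto` from a script, you can check whether any of the LLM CLIs listed in the First-Run Checklist is on `PATH`. This is a sketch only: the lookup order is an assumption, custom adapters are not covered, and research-hub's own detection may differ.

```python
import shutil

# CLI names taken from the First-Run Checklist; the order here is arbitrary.
SUPPORTED_LLM_CLIS = ("claude", "codex", "gemini", "opencode", "aichat", "cursor")


def find_llm_cli(which=shutil.which):
    """Return the first supported CLI name found on PATH, or None."""
    for name in SUPPORTED_LLM_CLIS:
        if which(name):
            return name
    return None


found = find_llm_cli()
print(found or "no supported LLM CLI found; install one or pass --no-fit-check")
```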

| Persona | Best for | Install extra |
|---|---|---|
| Researcher | STEM papers, DOI/arXiv, Zotero-first workflows | `[playwright,secrets]` |
| Humanities | books, quotes, URL-only sources, Zotero + Obsidian | `[playwright,secrets]` |
| Analyst | industry research, local PDFs/reports, no Zotero required | `[import,secrets]` |
| Internal KM | lab/company knowledge bases, mixed file types | `[import,secrets]` |

Field presets for `discover new`, `search`, and related planning flows
are `cs`, `bio`, `med`, `physics`, `math`, `social`, `econ`, `chem`,
`astro`, `edu`, and `general`. There is no `hydrology` preset; for fields
without one, pass `general` explicitly.
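
A wrapper script can make the fallback explicit instead of passing an unknown field name through. A minimal sketch, assuming a hypothetical `resolve_field` helper; the case-insensitive matching is a convenience of this sketch, not documented CLI behavior.

```python
FIELD_PRESETS = ("cs", "bio", "med", "physics", "math", "social",
                 "econ", "chem", "astro", "edu", "general")


def resolve_field(name):
    """Return (preset, matched); names without a preset map to 'general'."""
    key = name.strip().lower()
    if key in FIELD_PRESETS:
        return key, True
    return "general", False


print(resolve_field("hydrology"))
```

Returning the `matched` flag lets the caller log that `general` was chosen on purpose rather than by accident.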

---

## Dashboard tour

`research-hub serve --dashboard` opens `http://127.0.0.1:8765/`.

**Overview**: treemap over clusters, storage map, and health summary.

**Library**: per-cluster drill-down with papers, sub-topics, and per-paper actions.

**Diagnostics**: grouped drift alerts and readiness checks.

**Manage**: CLI actions as buttons, inline result drawer, confirmation modal, and per-paper row actions.

Briefings and Writing tabs are also available. See the [dashboard walkthrough](docs/dashboard-walkthrough.md) and [persona variants](docs/personas.md).

---

## Inside Zotero

Every ingested paper gets a namespaced tag set so you can filter your library by research-hub context:

| Tag | Meaning |
|---|---|
| `research-hub` | Ingested through this pipeline |
| `cluster/` | Which research cluster the paper belongs to |
| `category/` | arXiv category like `cs.AI` or `econ.GN` |
| `type/` | `Review`, `JournalArticle`, etc. from Semantic Scholar |
| `src/` | Search backend that discovered it: `arxiv`, `semantic_scholar`, `crossref`, `zotero` |
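
Because the tags share a `namespace/value` shape, a script that reads a Zotero item's tags can group them by prefix. A minimal sketch; `example-cluster` is a hypothetical cluster name and `group_tags` is not a research-hub function.

```python
from collections import defaultdict


def group_tags(tags):
    """Group tags by namespace prefix; tags without a '/' go under ''."""
    grouped = defaultdict(list)
    for tag in tags:
        namespace, sep, value = tag.partition("/")
        if sep:
            grouped[namespace].append(value)
        else:
            grouped[""].append(tag)
    return dict(grouped)


tags = ["research-hub", "cluster/example-cluster", "category/cs.AI",
        "type/Review", "src/arxiv"]
print(group_tags(tags))
```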

Every paper can also get a child note with `Summary / Key Findings / Methodology / Relevance`, derived from the Obsidian frontmatter. Papers that were in Zotero before research-hub existed can be backfilled with:

```bash
research-hub zotero backfill --tags --notes --apply
```

---

## Feature matrix

| Capability | Command or MCP tool | Notes |
|---|---|---|
| One-shot setup | `research-hub setup` | init + install + optional NotebookLM login + guided sample r

…

## Source & license

This open-source MCP server is cataloged on AgentStack and links to its original source — we do not rehost the code.

- **Author:** [WenyuChiou](https://github.com/WenyuChiou)
- **Source:** [WenyuChiou/research-hub](https://github.com/WenyuChiou/research-hub)
- **License:** MIT
- **Homepage:** https://pypi.org/project/research-hub-pipeline/

Install and usage instructions live in the source repository linked above.

## Pricing

- **Free** — Free

## Versions

- **1.0.0** — security scan: pending review — Imported from the upstream source.

## Links

- Listing page: https://agentstack.voostack.com/l/mcp-wenyuchiou-research-hub
- Seller: https://agentstack.voostack.com/s/wenyuchiou
- Browse the marketplace: https://agentstack.voostack.com/browse

