
My Second Brain: One Memory Layer Feeding Every AI Tool I Use

Dan Crane

June 19, 2026

I wrote previously about why the Cloudflare developer stack is my default choice for building AI applications. This post is the personal version of that argument: the system I actually run, every day, that gives every AI tool I use, and my own notes, a single shared memory.

The problem this solves is one most people building seriously with AI run into eventually, and it gets worse the more tools you use. Every conversation with most AI assistants starts from zero. You re-explain your context, your projects, your preferences, your decisions, over and over, in every new chat. If you only use one tool, that's already frustrating. I don't use one tool. Through a working day I move between Claude, ChatGPT, Codex, and Hermes depending on what each is best at, a pattern I wrote about in a previous post on chaining specialist AI tools rather than relying on one model to do everything. Without a shared memory, switching tools means losing context every single time.

The setup I run solves this with a single central memory layer, self-hosted on Cloudflare's free tier, that everything else feeds into and reads from. Not a memory per tool. Not a separate system for my notes versus my AI conversations. One brain, several mouths.

Here's how it actually works.

The Core: One Worker, One Memory

The foundation is an open-source project called Second Brain, built by developer Rahil Pirani. It's a single Cloudflare Worker that exposes a small set of memory operations, backed by D1 for structured storage and Vectorize for semantic search, all sitting comfortably inside Cloudflare's free tier.

The architecture is deliberately simple. Every piece of content stored gets converted into a 384-dimensional vector using Workers AI's bge-small-en-v1.5 embedding model, which means recall works on meaning rather than requiring an exact keyword match. Store a note about "users dropping off at the payment step" and later ask about "onboarding problems," and the system finds the connection even though the two phrasings share no words.
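To make "recall works on meaning" concrete, here is a minimal sketch of the ranking idea behind vector search: score each stored vector against the query vector and return the closest. The three-dimensional vectors are made-up toy values, not real bge-small-en-v1.5 output (which has 384 dimensions), and rankByMeaning is a hypothetical helper rather than anything from the Second Brain codebase. In the real setup Vectorize does this lookup, using whichever distance metric the index was created with; cosine similarity is used here as a common choice.

```typescript
// Toy illustration of semantic recall: rank stored notes by cosine similarity
// to a query vector. Real embeddings have 384 dimensions; these have 3.

type StoredNote = { text: string; vector: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: return the topK notes most similar to the query.
function rankByMeaning(query: number[], notes: StoredNote[], topK: number): StoredNote[] {
  return [...notes]
    .sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector))
    .slice(0, topK);
}

// Made-up vectors: the payment note points in a similar direction to the query.
const notes: StoredNote[] = [
  { text: "users dropping off at the payment step", vector: [0.9, 0.4, 0.1] },
  { text: "switching to React Query", vector: [0.1, 0.2, 0.95] },
];
const onboardingQuery = [0.8, 0.5, 0.2]; // stands in for "onboarding problems"

const best = rankByMeaning(onboardingQuery, notes, 1);
console.log(best[0].text); // prints "users dropping off at the payment step"
```

The query and the payment note share no words, but their vectors point in nearly the same direction, which is all the ranking looks at.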

What makes this genuinely useful rather than just technically interesting is that the project has grown well beyond a single integration. There's an official client for almost every place context actually originates: a browser extension, an Obsidian plugin, a CLI tool, iOS Shortcuts, a plain REST API, and full MCP support for any AI client that speaks the protocol. All of them write to and read from the exact same Worker. One memory, fed and queried from everywhere.

Deployment took about ten minutes using the one-click Cloudflare deploy button, which forks the repository and provisions the D1 database and Vectorize index automatically. After that it's a matter of running the schema migration, setting an auth token, and connecting whichever clients you actually use.

Obsidian as a Capture Surface, Not a Separate System

I'd originally thought about my Obsidian vault as a deliberately separate layer from the AI memory: my notes over here, the AI's running memory over there. Once I installed the official Second Brain Sync plugin for Obsidian, that separation stopped making sense, and I dropped it.

The plugin syncs your vault, or a scoped subset of it tagged however you choose, directly into the same Worker that every other client reads from. Auto-sync on save means a note I write in Obsidian is searchable by Claude, ChatGPT, or Codex within moments, with no export step and no manual copying. There's also a sidebar built into Obsidian itself for searching the brain without leaving the app, which is useful when I want to check what's already been captured before writing something new.

The scoping matters. I don't sync my entire vault wholesale. I tag the notes that represent decisions, project context, and reference material I actually want available to every AI tool, and leave the more exploratory or half-formed writing out of the shared pool until it's settled into something worth surfacing. That's a five-second decision at the point of writing, not a separate maintenance task.
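As a rough illustration of what tag scoping amounts to, the sketch below selects only the notes that carry a chosen inline tag. It is a hypothetical stand-in, not the plugin's actual implementation: hasTag and selectNotesToSync are names invented here, and the check only looks at inline tags in the note body, not frontmatter.

```typescript
// Hypothetical sketch of tag-scoped sync: only notes carrying the chosen tag
// are eligible for the shared memory. Not the plugin's actual implementation.

function hasTag(noteBody: string, tag: string): boolean {
  // Escape regex metacharacters in the tag, then require that the tag starts
  // a word and is not followed by more tag characters, so "#brain" does not
  // match "#brainstorm" or "#brain/archive".
  const escaped = tag.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`(^|\\s)${escaped}(?![\\w/-])`);
  return pattern.test(noteBody);
}

// Takes a map of note path -> note body and returns the paths worth syncing.
function selectNotesToSync(vaultNotes: Record<string, string>, tag: string): string[] {
  return Object.entries(vaultNotes)
    .filter(([, body]) => hasTag(body, tag))
    .map(([path]) => path);
}

// Made-up vault contents for illustration.
const vault: Record<string, string> = {
  "decisions/state-library.md": "Switching to React Query. #brain",
  "scratch/ideas.md": "Half-formed thoughts for a #brainstorm session",
  "projects/onboarding.md": "#brain\nUsers drop off at the payment step.",
};

console.log(selectNotesToSync(vault, "#brain"));
// prints [ 'decisions/state-library.md', 'projects/onboarding.md' ]
```

The scratch note stays out of the shared pool until it gets the tag, which mirrors the decision made at the point of writing.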

The result is that Obsidian, which used to be the place I wrote things that AI tools would never see unless I pasted them in manually, is now simply one of the inputs into the same central memory everything else uses.

Every Other Way Context Gets In

Obsidian covers deliberate, structured writing. Most of what's worth remembering doesn't happen at a desk in front of a vault, so the rest of the capture surfaces matter just as much.

The browser extension lets me highlight text on any page, right-click, and save it straight into the brain: articles, documentation, research, a tweet worth remembering. No tab-switching, no separate read-it-later tool that then needs its own export step.

iOS Shortcuts cover the moments away from a keyboard entirely: a voice-dictated brain dump while walking, or a quick text capture from the share sheet in any app, both of which post straight to the same Worker. I have the voice one wired to a Siri phrase, so capturing a half-formed idea while driving doesn't require touching the phone at all.

The CLI is the one I reach for most during actual work. Running brain remember "switching to React Query" from a terminal, or piping a file straight in with cat notes.md | brain remember, means I never have to leave the terminal to log a decision while I'm mid-task. Running brain recall "state library decision" pulls it back the same way.

The REST API sits underneath all of the official clients and is there for anything custom: a webhook, a script, an automation that needs to write to or read from the brain without going through a packaged integration.
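For a sense of what a custom integration looks like, here is a minimal sketch that builds an authenticated write request. The bearer-token header follows the auth scheme described in the setup section; the /remember path, the { content, tags } body shape, the Worker URL, and the buildRememberRequest helper are all assumptions made for illustration, so check the project's API documentation for the real routes before using it.

```typescript
// Sketch of a custom integration writing to the brain over HTTP.
// ASSUMPTIONS: the "/remember" path and the { content, tags } body shape are
// hypothetical; the project's API documentation defines the real routes.

type RememberPayload = { content: string; tags: string[] };

function buildRememberRequest(workerUrl: string, token: string, payload: RememberPayload): Request {
  const endpoint = new URL("/remember", workerUrl); // hypothetical route
  return new Request(endpoint.toString(), {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  });
}

const request = buildRememberRequest(
  "https://second-brain.example.workers.dev", // placeholder Worker URL
  "YOUR_TOKEN", // placeholder token
  { content: "Switching to React Query", tags: ["work", "frontend"] },
);

console.log(request.method, request.url);
// Sending it is then a single call: await fetch(request)
```

Building the Request separately from sending it keeps the sketch runnable without a deployed Worker, and makes the headers easy to inspect when a webhook or script is misbehaving.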

Every one of these writes to the same store. A decision logged from the CLI during a coding session is just as available to a conversation in Claude an hour later as a note synced from Obsidian or a highlight saved from the browser extension.

Connecting Every AI Tool I Actually Use

This is where the central memory pays for itself. Because the server speaks MCP, the open standard that's become the common language for AI tools calling external tools and data sources, it connects natively to every serious AI tool currently on my machine rather than being locked to one vendor's ecosystem.

I have it wired into Claude Desktop, Claude Code, and claude.ai. I also have it connected to ChatGPT and Codex, both of which added proper MCP client support over the past several months, and to Hermes, which is part of my regular agentic toolchain. The setup differs slightly by tool: Claude Desktop takes a small config file addition, ChatGPT and Codex add it through their own connector settings, and Hermes connects the same way any MCP-aware client does. The pattern is identical throughout, though: point the client at the Worker's MCP endpoint, and it gains the ability to call remember and recall mid-conversation on its own.
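Under the hood, "calling recall mid-conversation" means the client sends the server a small JSON-RPC message. MCP is built on JSON-RPC 2.0 and tool invocations use the tools/call method, so the sketch below shows roughly what goes over the wire. The argument name query and the buildToolCall helper are assumptions for illustration; the server's own tool schema defines the real parameter names.

```typescript
// Shape of the message an MCP client sends when the model decides to use a tool.
// MCP is built on JSON-RPC 2.0, and tool invocations use the "tools/call" method.
// ASSUMPTION: the argument name "query" is illustrative; the server's tool
// schema defines the real parameter names.

type ToolCallRequest = {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
};

let nextId = 1; // JSON-RPC requests carry an id so responses can be matched up

// Hypothetical helper that wraps a tool name and arguments in a tools/call request.
function buildToolCall(toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const recallCall = buildToolCall("recall", {
  query: "User wants to fix a bug in the capture flow, what have we tried before?",
});

console.log(JSON.stringify(recallCall, null, 2));
```

You never write this by hand in normal use; the client generates it. Seeing the shape is mostly useful when debugging a connection that lists the tools but never seems to call them.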

The part that took genuine thought wasn't any of the technical connections. It was writing the custom instructions that make each tool actually use the memory consistently, rather than leaving it unused or quietly preferring its own built-in memory. The exact wording I run in every client's custom instructions field is below, in the setup section, so you can copy it rather than reconstruct it from scratch.

The single most important line in it is rule 5: never fall back to your own built-in memory. Without that instruction explicitly stated, most tools will use the shared memory inconsistently, alongside whatever native memory they already have, and you end up with the exact problem this whole setup is meant to solve.

Setting This Up Yourself, Start to Finish

If you want to actually build this rather than just read about it, here's the sequence, start to finish. None of it requires infrastructure experience.

1. Deploy the Worker. Go to the Second Brain GitHub repo and click the "Deploy to Cloudflare Workers" button, or visit thesecondbrain.dev and use the deploy button there. You'll need a free Cloudflare account if you don't already have one. This forks the repository into your own GitHub account and provisions the Worker, the D1 database, and the Vectorize index automatically. This step on its own takes about two minutes.

2. Run the schema migration. Following the deploy, the setup wizard or the repo's README will walk you through running the D1 schema migration via Wrangler. This is a single command that sets up the tables the Worker needs. If you've never used Wrangler before, the repo's wiki has the exact commands.

3. Set your auth token. You'll generate a bearer token that authenticates every client talking to your Worker. Store it somewhere safe, because you'll need it for every integration below. Treat it like a password, because functionally that's what it is.

4. Grab your Worker URL and MCP endpoint. Your dashboard will show your Worker's base URL and its MCP endpoint specifically. You'll reuse this URL for every connection in the steps below.

5. Connect your AI clients.

Claude Desktop / Claude Code: add the MCP endpoint to your config. For Claude Code, use claude mcp add; for Claude Desktop, edit the config file directly. Exact syntax is in the connection guide.
claude.ai (web/mobile): add it as a custom connector under Settings → Connectors, pasting in the MCP URL and your token.
ChatGPT: add it as a connector under Settings → Connectors (developer mode), same URL and token.
Codex: add it through Codex's own MCP server configuration, same pattern.
Cursor / Windsurf / Hermes / any other MCP-aware tool: same pattern again, point it at the URL, add the token.

6. Install the Obsidian plugin (optional, but worth it). Search "Second Brain Sync" in Obsidian's community plugins, install it, and add your Worker URL and token in the plugin settings. Choose full-vault sync or scope it to a tag like #brain. I'd strongly recommend scoping it rather than syncing everything from day one.

7. Set up the other capture surfaces (optional, add as you go). The browser extension, the CLI (npm install -g second-brain-cf-cli), and the iOS Shortcuts templates all just need the same Worker URL and token dropped into their settings on first run. None of these are required to get started; add them as the gaps in your capture become obvious.

8. Add the custom instructions to every AI client. This is the step that actually makes it work day to day, and it's the one most people skip because the technical setup feels like the finish line. It isn't. Paste something close to the following into the custom instructions or system prompt field of every AI client you've connected:

You have access to a personal second brain via MCP tools: remember, recall, list_recent, append, update, forget.

Rules:

1. At the start of every conversation, call recall with a natural-language query describing both the topic and what I'm trying to do. Don't skip this even for simple topics.
2. Store important information automatically whenever I mention:
   - personal goals, preferences, habits
   - work context, decisions, strategy, progress
   - ideas or plans, even half-formed ones
   - tasks or commitments ("I need to", "I'm going to", "we decided")
3. Store useful content from your own responses too: recommendations, decisions, plans, or technical solutions worth remembering later. Store a condensed summary, not the full response.
4. Never ask permission before storing. Store silently and continue.
5. Never use your own built-in memory system instead of these tools. If you would normally save a memory, call remember instead. Always.
6. Tag every entry with the relevant project or topic, plus a general category: personal, work, task, idea, or context.
7. Before making any recommendation or suggestion, recall first to check whether you've already made it or whether I've already done it. Never repeat a recommendation without checking.
8. Always frame recall queries with intent, not bare keywords.
   Good: "User wants to fix a bug in the capture flow, what have we tried before?"
   Bad: "capture bug".

Adjust the wording to suit how you actually work, but keep instructions 1, 4, and 5 close to verbatim. Those three are doing most of the work: they're what stops the memory from being a capability that exists but goes unused, and what stops each tool from quietly defaulting back to its own separate memory instead of the shared one.

That's the whole setup. None of it took me more than an afternoon, spread across a couple of sittings, and most of that time was spent on step eight rather than anything technical.
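One detail the steps leave open is what the bearer token in step 3 should look like. If the setup wizard doesn't hand you one, a long random string is a reasonable choice. The sketch below is a generic approach using Node's built-in crypto module, not something taken from the Second Brain project, and generateToken is a name invented here.

```typescript
import { randomBytes } from "node:crypto";

// One way to produce a hard-to-guess bearer token: 32 random bytes, hex-encoded.
// ASSUMPTION: this is a generic approach, not the project's own token generator.
function generateToken(byteLength = 32): string {
  return randomBytes(byteLength).toString("hex");
}

const token = generateToken();
console.log(token.length); // 64, because each byte becomes two hex characters
```

Generate it once, store it in a password manager, and paste the same value into every client rather than creating a new one per integration.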

Why a Central Brain Beats a Memory Per Tool

Built-in memory features from individual AI vendors are improving, but they're necessarily scoped to that one product. Claude remembers what you've told Claude. ChatGPT remembers what you've told ChatGPT. If you use one tool for everything, that's adequate. If you don't, and most people doing serious work with AI right now don't, you end up maintaining several partial, disconnected memories instead of one complete one, and re-establishing context every time you switch.

The setup described here inverts that. The memory doesn't belong to any vendor. It sits underneath all of them, as infrastructure rather than as a feature bolted onto one product. A decision made in conversation with Claude on Monday is available to Codex on Tuesday without me carrying it over manually. A note written in Obsidian during a planning session is available to whichever AI tool I happen to be using when the relevant question comes up later, with no extra step at the point of writing beyond a tag I was probably going to add anyway.

None of the five AI tools I connect to it, or the four capture surfaces feeding it, know or care that the others exist. They all just read from and write to the same source of truth underneath them.

What This Actually Changes Day to Day

The honest answer is that it removes a specific category of friction rather than transforming how I work in some dramatic sense. I don't re-explain context that's already been established, regardless of which tool I'm in or where the original context came from. A fresh conversation in any of the five AI tools already knows the shape of the projects I'm working on, the preferences I've expressed, and the decisions I've made, because it checks before responding rather than starting from nothing, drawing on a memory that Obsidian, the browser extension, the CLI, and every AI client have all been quietly contributing to.

The cost of running this is, as with the rest of the Cloudflare stack, close to zero. Workers, D1, and Vectorize at this scale sit comfortably inside the free tier. The ongoing cost is the discipline of writing good custom instructions for each client and occasionally pruning entries that are no longer relevant.

If you're already comfortable with the Cloudflare stack from the previous post, this is one of the more immediately useful things you can build on top of it, and Rahil's project, including the full set of official clients at thesecondbrain.dev, is a clean, well-documented starting point rather than something you need to build from scratch. The core is MIT-licensed and on GitHub for anyone who wants to deploy their own version or adapt it further.

If you're setting up your own AI memory layer or want to talk through how the pieces fit together for your own workflow, I'm happy to compare notes.
