add vault link

Andras Schmelczer 2026-05-28 13:16:47 +01:00
parent dc5b49c373
commit d83691323f
4 changed files with 234 additions and 0 deletions


@@ -0,0 +1,47 @@
<svg width="200" height="200" viewBox="0 0 200 200" xmlns="http://www.w3.org/2000/svg">
<defs>
<linearGradient id="grad1" x1="0%" y1="0%" x2="100%" y2="100%">
<stop offset="0%" style="stop-color:#4A90E2;stop-opacity:1" />
<stop offset="100%" style="stop-color:#357ABD;stop-opacity:1" />
</linearGradient>
</defs>
<!-- Background circle -->
<circle cx="100" cy="100" r="90" fill="url(#grad1)" opacity="0.15"/>
<!-- Main vault icon -->
<g transform="translate(100, 100)">
<!-- Vault body -->
<rect x="-45" y="-50" width="90" height="80" rx="8" fill="none" stroke="url(#grad1)" stroke-width="6"/>
<!-- Vault door circle -->
<circle cx="0" cy="-10" r="22" fill="none" stroke="url(#grad1)" stroke-width="5"/>
<circle cx="0" cy="-10" r="14" fill="none" stroke="url(#grad1)" stroke-width="3"/>
<circle cx="0" cy="-10" r="6" fill="url(#grad1)"/>
<!-- Vault handle -->
<line x1="0" y1="-4" x2="18" y2="-4" stroke="url(#grad1)" stroke-width="3" stroke-linecap="round"/>
<circle cx="18" cy="-4" r="4" fill="url(#grad1)"/>
<!-- Link chain -->
<g opacity="0.9">
<!-- Left link -->
<ellipse cx="-30" cy="40" rx="12" ry="8" fill="none" stroke="url(#grad1)" stroke-width="4"/>
<!-- Right link -->
<ellipse cx="30" cy="40" rx="12" ry="8" fill="none" stroke="url(#grad1)" stroke-width="4"/>
<!-- Center link connecting them -->
<ellipse cx="0" cy="40" rx="12" ry="8" fill="none" stroke="url(#grad1)" stroke-width="4"/>
</g>
<!-- Sync arrows (subtle) -->
<g opacity="0.5">
<!-- Clockwise arrow top-right -->
<path d="M 35 -35 Q 50 -35 50 -20 L 50 -15" fill="none" stroke="url(#grad1)" stroke-width="2.5" stroke-linecap="round"/>
<polygon points="50,-15 47,-22 53,-22" fill="url(#grad1)"/>
<!-- Counter-clockwise arrow bottom-left -->
<path d="M -35 25 Q -50 25 -50 10 L -50 5" fill="none" stroke="url(#grad1)" stroke-width="2.5" stroke-linecap="round"/>
<polygon points="-50,5 -47,12 -53,12" fill="url(#grad1)"/>
</g>
</g>
</svg>


@@ -0,0 +1,111 @@
---
title: An Obsidian Sync Built Around the Merger I Already Had
description: 'VaultLink: self-hosted Obsidian sync. Edit in any editor, online or off, then come back to a converged vault. The application that justified reconcile-text.'
date: 2026-05-30
projectPeriod: '2025-2026'
thumbnail:
  src: ./_assets/vault-link.svg
  alt: 'The VaultLink logo: a chain-link mark in a soft gradient.'
tags: ['systems', 'web', 'tools']
role: Sync engine and server author
stack:
  [
    'Rust',
    'axum',
    'sqlx',
    'SQLite',
    'WebSockets',
    'TypeScript',
    'Obsidian plugin',
    'ts-rs',
    'wasm-bindgen',
    'reconcile-text',
  ]
scale: One Rust server, one TypeScript sync engine, four consumers (Obsidian plugin, CLI, fuzz harness, deterministic test harness)
outcome: A self-hosted Obsidian sync I trust enough to use as my primary vault transport
audience: technical
links:
  - label: Source
    url: https://github.com/schmelczer/vault-link
  - label: Docs
    url: https://vault-link.schmelczer.dev
---
**The two-bullet pitch:**
- Self-hosted Obsidian sync. One Rust server (axum + sqlx + SQLite), one TypeScript sync engine, four consumers: an Obsidian plugin, a standalone CLI, and two test harnesses. The point is to let me edit notes in Vim, VS Code, Obsidian desktop, and Obsidian mobile without the system caring which one I'm in.
- The merge primitive is [reconcile-text](/articles/reconcile-text-3-way-merge/), which I wrote first. VaultLink is the question that made it worth writing, finally asked in earnest.
## The constraint that picks the algorithm
The whole shape of VaultLink is downstream of one decision: I refuse to give up the editor. Obsidian on the phone, Vim on the laptop, VS Code at work, the occasional headless `sed` across the whole vault. None of those know about each other; none of them are going to learn to.
The consequence is that the server never sees keystrokes. It sees end states: a file as it stood when sync caught it. That kills CRDTs (which need every operation) and OT-as-it's-usually-implemented (same). It leaves you with one primitive: 3-way merge given a parent, a left, and a right. Which is reconcile-text. Which I'd written exactly because no existing tool took three independently-edited file states and gave one back.
The other consequence is that _path placement_ is its own problem. Two clients might both move the same file. A file might land on a slot another file already occupies. A rename and a content edit might race. That's the part I underestimated.
## Two loops, separate invariants
The sync engine is two loops, deliberately disentangled:
- **Wire loop** (`syncer.ts`). Drains the single-consumer FIFO of pending HTTP and WebSocket ops. Updates a document's record fields (`remoteRelativePath`, `parentVersionId`, `remoteHash`) and writes content to whatever path the record currently holds. _Never moves files for path placement._
- **Path reconciler** (`reconciler.ts`). Runs after every drained event. Best-effort pass that moves files on disk so `localPath === remoteRelativePath`. The move graph is topologically sorted. Records with pending local events are skipped; the reconciler only operates on settled ones. Failures (slot occupied by something untracked) are silent skips; the next pass retries.
The split is the load-bearing decision. It used to be one loop with both responsibilities, and the bug catalogue was a parade of slot-collision stashes, "conflict-uuid" hacks, and `MoveOnConflict.NEW`/`EXISTING` policy choices. Separating wire transport from path placement made most of that vanish: the wire loop can freely write `remoteRelativePath` to whatever the server returned, even if it disagrees with the file on disk, because the reconciler won't move anything out from under a queued user rename.
Cycles in the move graph (A→B, B→C, C→A) are resolved by reading every file in the cycle into memory and writing each back to its new slot, with no temporary files. A write-ahead marker at `.vaultlink/swap-<uuid>.json` lists each leg. On startup the reconciler reads the marker, hashes each `from` to determine which legs ran, and replays the rest. `.vaultlink/**` is hardcoded into the internal ignore pattern so the swap markers themselves never get synced.
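A minimal sketch of the ordering problem described above, not VaultLink's actual reconciler: the `Move` and `MovePlan` shapes and the `planMoves` name are hypothetical, and the sketch assumes every source and every destination appears at most once. Under that assumption the move graph is a set of simple chains and simple cycles, so chains can be emitted dependencies-first and cycles handed off as groups to be swapped together.

```typescript
// Hypothetical shape: one pending move of a tracked file on disk.
interface Move {
  from: string;
  to: string;
}

interface MovePlan {
  ordered: Move[]; // safe to run in this order: each destination is free at its turn
  cycles: Move[][]; // each group has to be swapped as a unit
}

function planMoves(moves: Move[]): MovePlan {
  const pending = moves.filter((m) => m.from !== m.to);
  const byFrom = new Map<string, Move>();
  const targets = new Set<string>();
  for (const m of pending) {
    // Assumption of this sketch: no two moves share a source or a destination.
    if (byFrom.has(m.from) || targets.has(m.to)) {
      throw new Error(`ambiguous move set at ${m.from} -> ${m.to}`);
    }
    byFrom.set(m.from, m);
    targets.add(m.to);
  }

  const planned = new Set<string>(); // `from` paths already placed in the plan
  const ordered: Move[] = [];
  const cycles: Move[][] = [];

  for (const start of pending) {
    if (planned.has(start.from)) continue;
    // Walk start -> the move that vacates start.to -> ... until the chain
    // ends, reaches an already planned move, or loops back to `start`.
    const chain: Move[] = [];
    let cur: Move | undefined = start;
    while (cur !== undefined && !planned.has(cur.from)) {
      chain.push(cur);
      planned.add(cur.from);
      cur = byFrom.get(cur.to);
    }
    if (cur === start) {
      cycles.push(chain);
    } else {
      // The last move in the chain has a free destination, so it runs first.
      ordered.push(...chain.reverse());
    }
  }
  return { ordered, cycles };
}
```

A destination occupied by an untracked file is outside this sketch; in the design described above that case is a silent skip that the next pass retries.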
## Pending creates are Promises, not strings
When the user creates a file locally and _then_ immediately edits or renames it before the create has been acknowledged, the engine doesn't know the document's id yet; the server assigns it. So queued events for that doc carry a `Promise<DocumentId>` in their `documentId` slot, threaded back to the still-in-flight `LocalCreate`. When the server acks the create, `resolveCreate` fulfils the promise and `replacePendingDocumentId` walks the queue swapping the resolved string into every dependent event.
If you're walking `events[]` and comparing docIds with `===`, you'll silently fail to match until the swap happens. There's a comment in `sync-event-queue.ts` that warns about exactly that, in slightly more alarmed punctuation. The shape is unusual but the alternative (synchronously waiting for the create ack before letting the user type more) is the kind of thing that makes a notes app feel like a 1998 webform.
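To make the hazard concrete, here is a small sketch under stated assumptions: the `QueuedEvent` shape and the `deferredId` helper are hypothetical, and `replacePendingDocumentId` borrows the name from the text but not the real implementation. It shows why a string comparison matches nothing until the swap has run.

```typescript
type DocumentId = string;

// Hypothetical event shape: the id is either known or still being assigned.
interface QueuedEvent {
  kind: "update" | "rename";
  documentId: DocumentId | Promise<DocumentId>;
}

// A promise plus the function that fulfils it, held by the in-flight create.
function deferredId(): { promise: Promise<DocumentId>; resolve: (id: DocumentId) => void } {
  let resolve!: (id: DocumentId) => void;
  const promise = new Promise<DocumentId>((res) => {
    resolve = res; // the executor runs synchronously, so this is set on return
  });
  return { promise, resolve };
}

// Swap the resolved id into every event that still carries the placeholder promise.
function replacePendingDocumentId(
  queue: QueuedEvent[],
  pending: Promise<DocumentId>,
  id: DocumentId,
): number {
  let swapped = 0;
  for (const queued of queue) {
    if (queued.documentId === pending) {
      queued.documentId = id;
      swapped += 1;
    }
  }
  return swapped;
}

const pendingCreate = deferredId();
const queue: QueuedEvent[] = [
  { kind: "update", documentId: pendingCreate.promise },
  { kind: "rename", documentId: pendingCreate.promise },
  { kind: "update", documentId: "doc-7" },
];

// Before the ack, comparing against the eventual id matches nothing.
const matchedBefore = queue.filter((e) => e.documentId === "doc-42").length;

// The server acks the create with id "doc-42"; dependents are rewritten in place.
pendingCreate.resolve("doc-42");
const swappedCount = replacePendingDocumentId(queue, pendingCreate.promise, "doc-42");
const matchedAfter = queue.filter((e) => e.documentId === "doc-42").length;
```

The promise doubles as an identity token: dependents of the same create share one promise object, so the swap can find them by reference without knowing the id in advance.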
## MinCovered: the watermark that doesn't lie
The catch-up handshake says "give me everything newer than `lastSeenUpdateId`." If the client advances that id as RemoteChange ids arrive out of order, it publishes a cursor that is too high, and the next reconnect requests from a point past events it never applied. Those events are never requested again. Permanent gap: a data-loss bug, with extra steps.
The fix is a small data structure called `MinCovered`: a contiguous-prefix tracker over a stream of integers. It advances the public min only when the next consecutive id has been processed. Out-of-order arrivals stash without bumping the cursor. Five files of tests, one screen of implementation, and an entire category of confusing data-loss bugs disappears.
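A contiguous-prefix tracker of this kind fits in a few lines. The sketch below is an illustration rather than the real `MinCovered`: the class name, constructor argument, and method names are assumptions, and it assumes ids are consecutive integers.

```typescript
// Sketch of a contiguous-prefix tracker. Assumes ids are consecutive integers.
class MinCoveredSketch {
  // Highest id such that every id up to and including it has been processed.
  private covered: number;
  // Processed ids that sit above the contiguous prefix.
  private readonly stashed = new Set<number>();

  constructor(lastSeenUpdateId: number) {
    this.covered = lastSeenUpdateId;
  }

  // The cursor that is safe to publish in the catch-up handshake.
  get min(): number {
    return this.covered;
  }

  // Record that `id` has been processed (applied or deliberately discarded).
  add(id: number): void {
    if (id <= this.covered) return; // already inside the prefix
    this.stashed.add(id);
    // Absorb stashed ids for as long as the next consecutive one is present.
    while (this.stashed.delete(this.covered + 1)) {
      this.covered += 1;
    }
  }
}
```

Starting from 10, processing 12 and 13 leaves `min` at 10; processing 11 then moves it straight to 13, because the stash already holds the ids that follow.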
## reconcile-text on the server
The merge sits on the server. When two clients submit edits against the same `parent_version_id`, the second submission triggers a 3-way merge against the parent and the freshly-committed first edit. Three strings in, one out. No conflict markers. The server commits the merged result, increments the version, and broadcasts the new state to every connected client.
Two restrictions, both honest:
- **Only `.md` and `.txt`.** Markdown that fails UTF-8 validation gets treated as binary, same as PNGs and PDFs.
- **Last-write-wins for everything else.** Concurrent edits to a `.docx` lose one of the writes. The right fix is "don't edit binaries concurrently," which is unsatisfying but true.
Merge quality is exactly what reconcile-text gives me. Word-level tokenisation turns most prose conflicts into two adjacent edits that coexist. If the merge looks slightly clumsy now and then, the alternative is a `<<<<<<< HEAD` block in my notes, and I'd take the clumsy sentence every time.
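The commit path for mergeable text can be sketched as follows. The real server is Rust; this TypeScript version only shows the control flow, and every name in it (`DocumentHistory`, `submit`, `Merge3`) is hypothetical. The merge function is passed in as a parameter, so nothing here assumes the reconcile-text API, and broadcasting is left out.

```typescript
// Any 3-way text merge: parent, left, right in; merged text out.
type Merge3 = (parent: string, left: string, right: string) => string;

interface Version {
  id: number;
  content: string;
}

// Hypothetical in-memory stand-in for one document's version history.
class DocumentHistory {
  private readonly versions: Version[];

  constructor(initialContent: string) {
    this.versions = [{ id: 1, content: initialContent }];
  }

  get head(): Version {
    return this.versions[this.versions.length - 1]!;
  }

  // Commit `content`, which the client edited on top of `parentVersionId`.
  submit(parentVersionId: number, content: string, merge: Merge3): Version {
    const parent = this.versions.find((v) => v.id === parentVersionId);
    if (parent === undefined) {
      throw new Error(`unknown parent version ${parentVersionId}`);
    }
    const head = this.head;
    // If nobody committed since the parent, take the edit as-is. Otherwise
    // merge it with whatever was committed in the meantime.
    const merged =
      head.id === parent.id ? content : merge(parent.content, head.content, content);
    const next: Version = { id: head.id + 1, content: merged };
    this.versions.push(next);
    return next;
  }
}
```

The first submission against a version fast-forwards; a second submission against the same parent goes through `merge` with the parent, the current head, and the incoming text, in that order.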
## Two test harnesses, one workflow
Distributed-sync bugs are confusing the first time and impossible the second. The fix is two harnesses:
- **`test-client` (fuzz).** N parallel processes hammering random ops against a shared server for minutes at a time. Catches bugs nobody thought to write a test for. Reproductions are noisy.
- **`deterministic-tests`.** Scripted multi-client scenarios with a step grammar (`pause-server`, `pause-websocket`, `barrier`, `assert-consistent`) using an in-memory filesystem against a real server binary. Used to capture a fuzz-found bug as a minimal repro before fixing it.
The workflow: fuzz finds something, I sift logs for a root cause, write the minimal deterministic test that fails on it, fix until both that test and the fuzz run pass. Without the deterministic harness, every bug fix would be vibes-based.
## Smaller calls
- **TS types are generated from Rust via `ts-rs`.** The HTTP/WS API has one source of truth: the Serde types in the server. `scripts/update-api-types.sh` re-emits `frontend/sync-client/src/services/types/`. Hand-edits to those files are explicitly banned.
- **`sqlx::query!` macros over a checked-in `.sqlx` cache.** SQL is verified against the schema at compile time. Touching SQL means re-running `cargo sqlx prepare --workspace`; if you forget, CI catches it.
- **One sync engine, four consumers.** `sync-client` is the engine. Obsidian plugin, standalone CLI, fuzz harness, and deterministic harness all depend on it via `file:../sync-client`. Bugs are fixed once and inherited everywhere.
- **`record.localPath` mutates in place across awaits.** The watcher can rename a doc while a wire-loop handler is mid-HTTP. Snapshotting `localPath` into a local at function entry and reading it after the await reads a vacated slot. Read it live; only snapshot when you deliberately want to compare _before_ and _after_ the await.
- **Watermark advancement is load-bearing both ways.** Branches that skip a remote event without advancing `lastSeenUpdateId` create permanent gaps that re-deliver forever. Branches that advance without applying the content lose data. The rule that survives review is: advance only if you applied the event or deliberately discarded it.
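The `record.localPath` hazard from the list above is easy to reproduce in isolation. This is a sketch with hypothetical names (`DocRecord`, `uploadWithRename`, and both handler functions); the "upload" is a stand-in that renames the record while the handler is suspended, the way the watcher might during a real HTTP call.

```typescript
// Hypothetical record shape: the field is mutated in place by the watcher.
interface DocRecord {
  localPath: string;
}

// Buggy pattern: the path is captured before the await and used after it.
async function reportPathStale(
  record: DocRecord,
  upload: () => Promise<void>,
): Promise<string> {
  const path = record.localPath;
  await upload();
  return path; // may name a slot the file has already left
}

// Safer pattern: read the field live once the await has completed.
async function reportPathLive(
  record: DocRecord,
  upload: () => Promise<void>,
): Promise<string> {
  await upload();
  return record.localPath;
}

// Stand-in for an HTTP call during which the watcher renames the document.
function uploadWithRename(record: DocRecord, newPath: string): () => Promise<void> {
  return async () => {
    record.localPath = newPath;
  };
}
```

With a record that starts at `old.md` and is renamed to `new.md` mid-upload, the first function still reports `old.md` and the second reports `new.md`. The snapshot is only right when the point is to compare the path before and after the await.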
## The race I haven't structurally fixed
Pause-or-disable-sync mid-flight is the one left. An HTTP request that committed server-side but whose response was dropped leaves the server holding a doc the client never recorded. On resume, the offline scan finds the file again, uploads it as a new create, and server-side dedupe merges the duplicate into the existing doc. If the merge produces a deconflict file (two real divergences), the user picks up an extra file in their vault. Not data loss, but a small ugliness.
The two-loop split doesn't fix this and probably shouldn't. The honest path is something like a persisted client-side "have I acked this op?" log, sitting in the same SQLite the engine already uses. It's on my list, below several things I want more.
## What I'd change
- **Move the merge to the client.** Right now reconcile-text runs on the server. Running its WASM build on each client instead, and letting the server be a dumb commit log, would let the merge benefit from device-specific tokenisers (Markdown-aware on the desktop, word-level on mobile). It would also stop the server from needing to understand the file format at all.
- **Property tests for the move graph.** The cycle resolver is the part I trust least under crash. Snapshot tests can't go where proptest can; I should be generating arbitrary move-graph + interruption combinations.
- **A first-class "pause" with a write-ahead op log.** See above.
- **More than `.md` and `.txt`.** A canvas-aware merge for Obsidian's `.canvas` files is one reconcile-text tokeniser away. Not because anyone asked, but because the asymmetry annoys me.
The way I think about VaultLink now: reconcile-text was the bet. VaultLink is what I built once the bet looked like it might pay off. The interesting part of the bet was always that three independently-edited files can become one without anyone telling the system about the keystrokes that produced them. The interesting part of the application is everything you have to do _around_ that merge to stop the rest of the system from undoing it.


@@ -0,0 +1,47 @@
<svg width="200" height="200" viewBox="0 0 200 200" xmlns="http://www.w3.org/2000/svg">
<defs>
<linearGradient id="grad1" x1="0%" y1="0%" x2="100%" y2="100%">
<stop offset="0%" style="stop-color:#4A90E2;stop-opacity:1" />
<stop offset="100%" style="stop-color:#357ABD;stop-opacity:1" />
</linearGradient>
</defs>
<!-- Background circle -->
<circle cx="100" cy="100" r="90" fill="url(#grad1)" opacity="0.15"/>
<!-- Main vault icon -->
<g transform="translate(100, 100)">
<!-- Vault body -->
<rect x="-45" y="-50" width="90" height="80" rx="8" fill="none" stroke="url(#grad1)" stroke-width="6"/>
<!-- Vault door circle -->
<circle cx="0" cy="-10" r="22" fill="none" stroke="url(#grad1)" stroke-width="5"/>
<circle cx="0" cy="-10" r="14" fill="none" stroke="url(#grad1)" stroke-width="3"/>
<circle cx="0" cy="-10" r="6" fill="url(#grad1)"/>
<!-- Vault handle -->
<line x1="0" y1="-4" x2="18" y2="-4" stroke="url(#grad1)" stroke-width="3" stroke-linecap="round"/>
<circle cx="18" cy="-4" r="4" fill="url(#grad1)"/>
<!-- Link chain -->
<g opacity="0.9">
<!-- Left link -->
<ellipse cx="-30" cy="40" rx="12" ry="8" fill="none" stroke="url(#grad1)" stroke-width="4"/>
<!-- Right link -->
<ellipse cx="30" cy="40" rx="12" ry="8" fill="none" stroke="url(#grad1)" stroke-width="4"/>
<!-- Center link connecting them -->
<ellipse cx="0" cy="40" rx="12" ry="8" fill="none" stroke="url(#grad1)" stroke-width="4"/>
</g>
<!-- Sync arrows (subtle) -->
<g opacity="0.5">
<!-- Clockwise arrow top-right -->
<path d="M 35 -35 Q 50 -35 50 -20 L 50 -15" fill="none" stroke="url(#grad1)" stroke-width="2.5" stroke-linecap="round"/>
<polygon points="50,-15 47,-22 53,-22" fill="url(#grad1)"/>
<!-- Counter-clockwise arrow bottom-left -->
<path d="M -35 25 Q -50 25 -50 10 L -50 5" fill="none" stroke="url(#grad1)" stroke-width="2.5" stroke-linecap="round"/>
<polygon points="-50,5 -47,12 -53,12" fill="url(#grad1)"/>
</g>
</g>
</svg>


@@ -0,0 +1,29 @@
---
title: VaultLink
description: Self-hosted Obsidian sync. Two-loop engine (wire and path reconciler), reconcile-text for 3-way merges, ts-rs single-source-of-truth API types.
thumbnail:
  src: ./_assets/vault-link.svg
  alt: 'The VaultLink logo: a chain-link mark in a soft gradient.'
period: '2025-2026'
sortDate: 2025-12-01
technologies:
  [
    'Rust',
    'axum',
    'sqlx',
    'SQLite',
    'WebSockets',
    'TypeScript',
    'Obsidian plugin',
    'ts-rs',
    'wasm-bindgen',
    'reconcile-text',
  ]
selected: true
essay: vault-link-obsidian-sync
links:
  - label: Source
    url: https://github.com/schmelczer/vault-link
  - label: Docs
    url: https://vault-link.schmelczer.dev
---