parent
bb5b4c4cf3
commit
018cbf68cf
5 changed files with 121 additions and 0 deletions
@@ -66,6 +66,14 @@ export default defineConfig({
  image: {
    service: { entrypoint: 'astro/assets/services/sharp' },
  },
  vite: {
    server: {
      watch: {
        // Avoid inotify instance limits in dev containers and mounted volumes.
        usePolling: true,
      },
    },
  },
  markdown: {
    shikiConfig: {
      themes: {
BIN
src/content/posts/_assets/reconcile.png
Normal file
Binary file not shown.
After: Size 47 KiB
89
src/content/posts/reconcile-text-3-way-merge.md
Normal file
@@ -0,0 +1,89 @@
---
title: A 3-Way Text Merger That Never Shows Conflict Markers
description: How reconcile-text borrows the idea of operational transformation and applies it to consolidated diffs to auto-resolve conflicting edits.
date: 2026-05-21
projectPeriod: '2025'
thumbnail:
  src: ./_assets/reconcile.png
  alt: The reconcile-text logo and tagline "Conflict-free 3-way text merging".
tags: ['systems', 'tools', 'web']
selected: true
featuredOrder: 2
project: reconcile
role: Author
stack: ['Rust', 'WebAssembly', 'Python', 'pyo3', 'wasm-bindgen']
scale: One Rust core, three published packages (crates.io, npm, PyPI), driving an Obsidian sync plugin
outcome: A small, well-tested library that fills a gap between git, CRDTs, and patch-based merging
audience: recruiter-relevant
links:
  - label: Demo
    type: demo
    url: https://schmelczer.dev/reconcile
  - label: Source
    type: source
    url: https://github.com/schmelczer/reconcile
  - label: crates.io
    type: package
    url: https://crates.io/crates/reconcile-text
  - label: npm
    type: package
    url: https://www.npmjs.com/package/reconcile-text
  - label: PyPI
    type: package
    url: https://pypi.org/project/reconcile-text/
media:
  - type: image
    src: ./_assets/reconcile.png
    alt: The reconcile-text logo, a stylised merge arrow, with the tagline "Conflict-free 3-way text merging".
    caption: reconcile-text resolves conflicting edits to prose by weaving them together instead of asking a human to choose.
---

`reconcile-text` started from a concrete need. I wanted to synchronise Markdown notes across devices where the editor was not under my control, and where the only thing I could observe was the final text on each side. Vim on one machine, VS Code on another, Obsidian on a third. No keystroke stream, no operation log, just the documents and a shared common ancestor from the last successful sync.

That setting is awkward for almost every existing tool. Git is the closest fit, but `git merge-file` answers conflicts with markers, which is exactly what a sync tool cannot ship to a user's note. CRDTs and operational transformation assume you control the editing infrastructure all the way down to the keystroke. `diff-match-patch` produces patches without a common ancestor, and on adjacent edits it silently corrupts the output. None of these matched the shape of the problem I had.

So I wrote a library that does one specific thing: given a parent and two edited versions, return a single merged text that contains both sets of changes, without conflict markers and without dropping edits on the floor.

## The Problem

The hard part is not detecting a conflict. The hard part is resolving it well enough that a human is happy to read the result without thinking about merge mechanics.

Source code has hard correctness requirements, so refusing to choose and emitting markers is the right default. Human prose is more forgiving. A merged paragraph that is slightly clumsy is almost always preferable to one that interrupts the reader with `<<<<<<< HEAD`. That observation is the entire reason this library exists in the form it does.

The challenge was to commit to that asymmetry honestly. The library should always produce a result. It should never silently lose an edit. It should preserve cursors so a collaborative editor can rely on it. And it should do all of this from end states alone, with no operation history available.

## Constraints

The library had to live in three places: a Rust crate, a JavaScript package built through WebAssembly, and a Python package built through `pyo3`. The cross-language story was a constraint, not a stretch goal. The Obsidian plugin I was writing alongside it consumed the npm build, but I also wanted a clean Rust crate for sync engines and a Python package for scripting.

That ruled out anything that depended on language-specific runtime tricks. Generics, closures, and trait objects could live freely inside the Rust core, but the public surface had to be flat enough to cross both `wasm-bindgen` and `pyo3` without per-binding glue.

It also had to be predictable. There is no async story, no networking, no concurrency. A merge is a pure function from three strings to one string with some metadata. Everything that is not the merge itself was deliberately kept out.

## Design

The pipeline is short. The library tokenises the parent and the two edited versions, runs Myers' diff to compare each edited version against the parent, optimises the resulting edit sequences so that adjacent changes group together cleanly, and then weaves the two diffs into a single ordered sequence of operations that produces the merged text.
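
To make the first two stages concrete, here is a minimal sketch that tokenises at word level and diffs each edited version against the parent. It is an illustration, not the library's code: Python's standard `difflib.SequenceMatcher` stands in for Myers' diff, and the names `tokenise_words` and `edits_against_parent` are invented for this example.

```python
import difflib
import re

Edit = tuple[str, int, int, list[str]]


def tokenise_words(text: str) -> list[str]:
    """Split into words and whitespace runs; joining the tokens restores the text."""
    return re.findall(r"\s+|\S+", text)


def edits_against_parent(parent: str, edited: str) -> list[Edit]:
    """Return (tag, start, end, new_tokens) for every changed range of parent tokens."""
    a, b = tokenise_words(parent), tokenise_words(edited)
    # Stand-in for Myers' diff: difflib uses a different matching heuristic.
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, i1, i2, b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


parent = "The quick brown fox"
left_edits = edits_against_parent(parent, "The very quick brown fox")
right_edits = edits_against_parent(parent, "The quick brown cat")
print(left_edits)   # one insertion near the start
print(right_edits)  # one replacement at the end
```

Each side ends up as a list of edits expressed in parent token positions, which is the shape the later weaving step needs.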

The weaving step borrows the concept of operational transformation, but applies it to a different problem. Classic OT transforms individual keystrokes against each other in real time. Here, OT is applied to the consolidated diff output of two complete edits. The structure is similar, but the inputs are batched and the algorithm only needs to run once per merge point. It became the simplest way I could find to describe how two sets of changes should be interleaved.
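
The idea of interleaving two batched diffs can be sketched in a few lines. This is a deliberately simplified, position-based weave and not the library's transformation: it again uses `difflib` in place of Myers' diff, the helper names are hypothetical, and when both sides rewrite the same tokens it simply places the left text before the right text.

```python
import difflib
import re


def tokenise(text: str) -> list[str]:
    """Words and whitespace runs, so joining tokens restores the text."""
    return re.findall(r"\s+|\S+", text)


def side_changes(parent: list[str], edited: list[str]):
    """Summarise one side: which parent tokens survive, and what is inserted before each index."""
    keep = [True] * len(parent)
    inserted: dict[int, list[str]] = {}
    ops = difflib.SequenceMatcher(a=parent, b=edited, autojunk=False).get_opcodes()
    for tag, i1, i2, j1, j2 in ops:
        if tag in ("replace", "delete"):
            for i in range(i1, i2):
                keep[i] = False
        if tag in ("replace", "insert"):
            inserted[i1] = edited[j1:j2]
    return keep, inserted


def weave(parent: str, left: str, right: str) -> str:
    """Merge both sides' edits into one text, never emitting conflict markers."""
    base = tokenise(parent)
    keep_l, ins_l = side_changes(base, tokenise(left))
    keep_r, ins_r = side_changes(base, tokenise(right))
    out: list[str] = []
    for i in range(len(base) + 1):
        from_left, from_right = ins_l.get(i, []), ins_r.get(i, [])
        out += from_left
        if from_right != from_left:  # identical insertions are emitted once
            out += from_right
        # A parent token survives only if neither side removed it.
        if i < len(base) and keep_l[i] and keep_r[i]:
            out.append(base[i])
    return "".join(out)


print(weave("The quick brown fox", "The very quick brown fox", "The quick brown cat"))
# The very quick brown cat
```

Deletions are unioned and insertions from both sides are kept, so no inserted text is dropped; the price is that overlapping rewrites can read clumsily, which is the trade-off the post argues for.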

The tokeniser turned out to be more important than I initially expected. It is what decides whether a conflict exists in the first place. Word-level tokenisation, the default for prose, often turns a "conflict" into two adjacent independent edits that can coexist. Line-level tokenisation makes the library behave more like `git merge-file`. Markdown-level tokenisation merges on headings and list items rather than characters. Exposing this as a user-facing knob meant the library could be shaped to the document, not the other way around.
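
A small experiment shows how granularity alone decides whether two edits collide. The sketch below is an assumption-laden illustration (hypothetical helper names, `difflib` instead of Myers' diff): it reports whether both sides touch the same parent token under a given tokeniser.

```python
import difflib
import re
from typing import Callable

Tokeniser = Callable[[str], list[str]]


def by_word(text: str) -> list[str]:
    return re.findall(r"\s+|\S+", text)


def by_line(text: str) -> list[str]:
    return text.splitlines(keepends=True)


def touched(parent: str, edited: str, tokenise: Tokeniser) -> set[int]:
    """Indices of parent tokens an edit changes; an insertion counts as touching the token it lands before."""
    a, b = tokenise(parent), tokenise(edited)
    hit: set[int] = set()
    ops = difflib.SequenceMatcher(a=a, b=b, autojunk=False).get_opcodes()
    for tag, i1, i2, _j1, _j2 in ops:
        if tag != "equal":
            hit.update(range(i1, max(i2, i1 + 1)))
    return hit


def overlaps(parent: str, left: str, right: str, tokenise: Tokeniser) -> bool:
    """True when both sides change at least one common parent token."""
    return bool(touched(parent, left, tokenise) & touched(parent, right, tokenise))


parent = "The quick brown fox\n"
left = "The fast brown fox\n"
right = "The quick brown cat\n"
print(overlaps(parent, left, right, by_word))  # False
print(overlaps(parent, left, right, by_line))  # True
```

The same pair of edits is independent at word level and a collision at line level, which is why the choice of unit matters more than the choice of algorithm.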

Cursors and selections were added as first-class merge inputs rather than something users reconstruct after the fact. Each cursor carries a stable ID and rides through the merge, ending up at a sensible position even when both sides edited the surrounding text. This is what made the library useful to anything resembling a collaborative editor.
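
One way to picture a cursor riding through a merge is to map its offset from the text it was placed in to the merged text. The sketch below does that with a character-level diff; the `Cursor` type and `carry_cursor` function are hypothetical, and the post does not say the library works this way.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class Cursor:
    id: str
    position: int  # character offset into the text


def carry_cursor(old: str, new: str, cursor: Cursor) -> Cursor:
    """Map a cursor from `old` to `new` using a character-level diff."""
    ops = difflib.SequenceMatcher(a=old, b=new, autojunk=False).get_opcodes()
    for tag, i1, i2, j1, j2 in ops:
        if i1 <= cursor.position < i2:
            if tag == "equal":
                offset = cursor.position - i1
            else:
                # The text under the cursor was rewritten: stay inside the replacement.
                offset = min(cursor.position - i1, j2 - j1)
            return Cursor(cursor.id, j1 + offset)
    return Cursor(cursor.id, len(new))  # cursor sat at the very end of `old`


before = "Hello world"
after = "Hello, dear world"
moved = carry_cursor(before, after, Cursor("alice", 6))
print(moved.position, after[moved.position:])  # 12 world
```

The cursor keeps its ID and still sits in front of "world", even though text was inserted before it.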

The cross-language surface needed extra care. The tokeniser inside Rust is a `dyn Fn(&str) -> Vec<Token<T>>`, which is convenient in Rust and impossible to pass through `wasm-bindgen` or `pyo3`. The fix was to expose a closed enum of built-in tokenisers to non-Rust callers and reserve the generic version for Rust users. WebAssembly users also paid a real binary-size cost, so the release profile is tuned aggressively, and the JS package ships a small leak detector to remind callers that wasm-bindgen objects must be freed explicitly.
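
The closed-enum idea can be illustrated from the caller's side. This is a sketch with invented names (`BuiltinTokeniser`, `tokenise`), not the published package's API, and it leaves out the Markdown tokeniser for brevity: the point is that a plain enum value crosses a binding boundary where an arbitrary closure cannot.

```python
import re
from enum import Enum


class BuiltinTokeniser(Enum):
    """A closed set of tokenisers; each member is just a string tag."""
    CHARACTER = "character"
    WORD = "word"
    LINE = "line"


def tokenise(text: str, kind: BuiltinTokeniser) -> list[str]:
    """Dispatch on the enum instead of accepting a caller-supplied function."""
    if kind is BuiltinTokeniser.CHARACTER:
        return list(text)
    if kind is BuiltinTokeniser.WORD:
        return re.findall(r"\s+|\S+", text)
    if kind is BuiltinTokeniser.LINE:
        return text.splitlines(keepends=True)
    raise ValueError(f"unknown tokeniser: {kind!r}")


# A binding layer only needs to pass the enum's plain value across the boundary.
kind = BuiltinTokeniser("word")
print(tokenise("merge me\nplease", kind))
# ['merge', ' ', 'me', '\n', 'please']
```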

## What Worked

The strongest part of the project is that the result never has conflict markers and never silently drops an edit. That sounds modest, but it is exactly the property that makes the library usable inside a sync engine without an escape hatch.

Choosing the tokeniser as the main user-facing knob also held up well. Most of the "tuning" people want when merging prose is not a different algorithm but a different idea of what counts as a unit. Letting users choose between character, word, line, and Markdown granularity covered the realistic cases without inventing new merge strategies.

The comparison example against `diff-match-patch` was probably the most useful piece of writing in the repository. It is a runnable program, not a benchmark table, showing concrete cases where a popular alternative quietly produces wrong output. Having that as a falsifiable claim in the source tree made the value proposition much clearer than any prose description would have.

## What I Would Change

If I revisited this now, I would invest more in formal property tests around the merge. Three-way merging is exactly the kind of problem where generated inputs find behaviours that hand-written tests do not, and the snapshot tests I have are good at catching regressions but not at finding unknown edge cases.
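
As an illustration of what such property tests could check, the sketch below generates random inputs with the standard library and tests three identities any three-way merge should satisfy. The harness, its names, and the two toy merge functions are assumptions made for this example; a dedicated property-testing library would normally replace the hand-rolled generator.

```python
import random
from typing import Callable, Optional

Merge = Callable[[str, str, str], str]  # (parent, left, right) -> merged
WORDS = ["alpha", "beta", "gamma", "delta", "epsilon"]


def random_text(rng: random.Random, max_words: int = 6) -> str:
    return " ".join(rng.choice(WORDS) for _ in range(rng.randint(0, max_words)))


def find_counterexample(merge: Merge, trials: int = 200, seed: int = 0) -> Optional[tuple[str, str, str]]:
    """Return (property, parent, edited) for the first violated property, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        parent, edited = random_text(rng), random_text(rng)
        checks = {
            "only right changed": (merge(parent, parent, edited), edited),
            "only left changed": (merge(parent, edited, parent), edited),
            "both made the same edit": (merge(parent, edited, edited), edited),
        }
        for name, (got, want) in checks.items():
            if got != want:
                return name, parent, edited
    return None


def take_changed_side(parent: str, left: str, right: str) -> str:
    """Toy merge that satisfies the three identities above."""
    return right if left == parent else left


def always_left(parent: str, left: str, right: str) -> str:
    """Deliberately broken merge: ignores the right side entirely."""
    return left


print(find_counterexample(take_changed_side))  # None
print(find_counterexample(always_left) is not None)
```

Pointing the same harness at a real merge function would exercise it on inputs nobody thought to write by hand.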

I would also be more explicit about the boundary the library does not cross. It is a merge-point primitive, not a live collaboration engine. CRDTs and OT remain the right tools when you actually have a keystroke stream and a real-time channel. `reconcile-text` is for the part of the problem space where you do not.

The part I would keep is the asymmetry the project rests on. Human text deserves a merger that prefers a slightly imperfect sentence over a conflict marker, and that decision is what shaped every other choice in the design.
BIN
src/content/projects/_assets/reconcile.png
Normal file
Binary file not shown.
After: Size 47 KiB
24
src/content/projects/reconcile.md
Normal file
@@ -0,0 +1,24 @@
---
sourceProjectId: reconcile
title: reconcile-text
description: A Rust library that auto-resolves conflicting text edits without conflict markers, with WebAssembly and Python bindings.
thumbnail:
  src: ./_assets/reconcile.png
  alt: The reconcile-text logo and tagline "Conflict-free 3-way text merging".
period: '2025'
sortDate: 2025-05-01
technologies: ['Rust', 'WebAssembly', 'Python', 'pyo3', 'Operational Transformation']
selected: true
essay: reconcile-text-3-way-merge
links:
  - label: Demo
    url: https://schmelczer.dev/reconcile
  - label: Source
    url: https://github.com/schmelczer/reconcile
  - label: crates.io
    url: https://crates.io/crates/reconcile-text
  - label: npm
    url: https://www.npmjs.com/package/reconcile-text
  - label: PyPI
    url: https://pypi.org/project/reconcile-text/
---