diff --git a/.github/workflows/check.yml b/.github/workflows/check.yml
index c977dc4..f2c84d7 100644
--- a/.github/workflows/check.yml
+++ b/.github/workflows/check.yml
@@ -43,3 +43,44 @@ jobs:
           cargo test --features serde
           cargo test --features wasm
           wasm-pack test --node --features wasm
+
+  publish:
+    needs: build
+    runs-on: ubuntu-latest
+    if: github.ref == 'refs/heads/main'
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup Node.js environment
+        uses: actions/setup-node@v4.2.0
+        with:
+          node-version: "22.x"
+          check-latest: true
+          registry-url: 'https://registry.npmjs.org'
+
+      - name: Setup rust
+        run: |
+          cargo install wasm-pack
+
+      - name: Build wasm
+        run: |
+          wasm-pack build --target web --features wasm
+
+      - name: Publish to crates.io
+        run: cargo publish --token ${{ secrets.CRATES_TOKEN }}
+        continue-on-error: true
+
+      - name: Build reconcile-js
+        run: |
+          cd reconcile-js
+          npm ci
+          npm run build
+
+      - name: Publish reconcile-js to NPM
+        run: |
+          cd reconcile-js
+          npm publish
+        env:
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+        continue-on-error: true
diff --git a/.github/workflows/gh-pages.yml b/.github/workflows/gh-pages.yml
index b6eafaa..612bb8f 100644
--- a/.github/workflows/gh-pages.yml
+++ b/.github/workflows/gh-pages.yml
@@ -15,7 +15,7 @@ permissions:
 # Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
 # However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
 concurrency:
-  group: "pages"
+  group: 'pages'
   cancel-in-progress: false
 
 jobs:
@@ -31,13 +31,17 @@ jobs:
         run: |
           cargo install wasm-pack
           wasm-pack build --target web --features wasm
-          cp -R pkg/reconcile.js examples/website/
-          cp -R pkg/reconcile_bg.wasm examples/website/
+          cd reconcile-js
+          npm ci
+          npm run build
+          cd ../examples/website
+          npm ci
+          npm run build
 
       - name: Upload artifact
         uses: actions/upload-pages-artifact@v3
         with:
-          path: examples/website
+          path: examples/website/dist
 
   deploy:
     environment:
diff --git a/.prettierrc b/.prettierrc
index 3885ed5..a07289d 100644
--- a/.prettierrc
+++ b/.prettierrc
@@ -3,8 +3,5 @@
   "printWidth": 90,
   "tabWidth": 2,
   "singleQuote": true,
-  "endOfLine": "lf",
-  "importOrder": ["^[./]", ".*", ".scss$"],
-  "importOrderSeparation": true,
-  "importOrderSortSpecifiers": true
-}
+  "endOfLine": "lf"
+}
\ No newline at end of file
diff --git a/README.md b/README.md
index f54d845..b73c375 100644
--- a/README.md
+++ b/README.md
@@ -3,148 +3,172 @@
 [![Check](https://github.com/schmelczer/reconcile/actions/workflows/check.yml/badge.svg)](https://github.com/schmelczer/reconcile/actions/workflows/check.yml)
 [![Publish to GitHub Pages](https://github.com/schmelczer/reconcile/actions/workflows/gh-pages.yml/badge.svg)](https://github.com/schmelczer/reconcile/actions/workflows/gh-pages.yml)
 
-> [`diff3`](https://www.gnu.org/software/diffutils/manual/html_node/Invoking-diff3.html) (or `git merge`) but with automatic conflict resolution.
+> Think [`diff3`](https://www.gnu.org/software/diffutils/manual/html_node/Invoking-diff3.html) or `git merge`, but with intelligent conflict resolution that just works.
 
-Reconcile is a Rust and JavaScript (through WebAssembly) library for merging text without user intervention. It automatically resolves conflicts that would typically require user action in traditional 3-way merge tools.
+Reconcile is a Rust and JavaScript (via WebAssembly) library that merges conflicting text edits without requiring manual intervention. Where traditional 3-way merge tools would leave you with conflict markers to resolve by hand, Reconcile automatically weaves changes together using sophisticated algorithms inspired by Operational Transformation.
 
-Try out the [interactive demo](https://schmelczer.dev/reconcile)!
+✨ **[Try the interactive demo](https://schmelczer.dev/reconcile)** to see it in action!
 
 TODO: add links for crates and npm
 
-## Features
+## What makes Reconcile special?
 
-- **Conflict-free output** - No more git conflict markers in the result
-- **Cursor/selection position tracking** - Automatically updates cursor positions during merging
-- **Pluggable tokenizer** - Choose between word-level, character-level, or custom tokenization
-- **Full UTF-8 support** - Handles Unicode text correctly
-- **WebAssembly support** - Use from JavaScript/TypeScript applications
+- **🚫 No conflict markers** — Clean, merged output without Git's `<<<<<<<` noise
+- **📍 Cursor tracking** — Automatically repositions cursors and selections during merging
+- **🔧 Flexible tokenisation** — Word-level (default), character-level, or custom strategies
+- **🌍 Unicode-first** — Full UTF-8 support
+- **🕸️ Cross-platform** — Native Rust performance with WebAssembly for JavaScript
 
-## Quick Start
+## Quick start
 
 ### Rust
 
-Add to your `Cargo.toml`:
+Add `reconcile` to your `Cargo.toml`:
 
 ```toml
 [dependencies]
 reconcile = "0.4"
 ```
 
+Then merge away:
+
 ```rust
 use reconcile::{reconcile, BuiltinTokenizer};
 
+// Start with original text
 let parent = "Hello world";
-let left = "Hello beautiful world";
-let right = "Hi world";
+// Two people edit simultaneously
+let left = "Hello beautiful world"; // Added "beautiful"
+let right = "Hi world"; // Changed "Hello" to "Hi"
 
+// Reconcile combines both changes intelligently
 let result = reconcile(parent, &left.into(), &right.into(), &*BuiltinTokenizer::Word);
 assert_eq!(result.apply().text(), "Hi beautiful world");
 ```
 
 ### JavaScript/TypeScript
 
+Install via npm:
+
 ```bash
 npm install reconcile
 ```
 
-```javascript
-import { init, reconcile } from "reconcile";
+Then use in your application:
 
-// Initialize the WASM module (required before first use)
+```javascript
+import { init, reconcile } from 'reconcile';
+
+// One-time setup: initialise the WASM module
 await init();
 
-const parent = "Hello world";
-const left = "Hello beautiful world";
-const right = "Hi world";
+// Same example as above
+const parent = 'Hello world';
+const left = 'Hello beautiful world';
+const right = 'Hi world';
 
 const result = reconcile(parent, left, right);
 console.log(result.text); // "Hi beautiful world"
 ```
 
-## API
+## Advanced usage
 
-### Tokenizers
+### Edit provenance
 
-Reconcile supports different tokenization strategies:
-
-- **Word tokenizer** (`BuiltinTokenizer::Word`): Splits text into words (default, recommended for most use cases)
-- **Character tokenizer** (`BuiltinTokenizer::Character`): Splits text into individual characters (fine-grained merging)
-- **Custom tokenizer**: Implement your own tokenization logic
-
-### Cursor Tracking
-
-Reconcile can automatically update cursor and selection positions during merging:
-
-```javascript
-const result = reconcile(
-  "Hello world",
-  {
-    text: "Hello beautiful world",
-    cursors: [{ id: 1, position: 6 }], // After "Hello "
-  },
-  {
-    text: "Hi world",
-    cursors: [{ id: 2, position: 0 }], // At beginning
-  }
-);
-
-// Result includes updated cursor positions
-console.log(result.cursors); // [{ id: 1, position: 3 }, { id: 2, position: 0 }]
-```
-
-### History Tracking
-
-Use `reconcileWithHistory` to get detailed information about the merge process:
+Track which changes came from where using `reconcileWithHistory`:
 
 ```javascript
 const result = reconcileWithHistory(parent, left, right);
-console.log(result.history); // Array of spans with their origins
+console.log(result.history); // Detailed breakdown of each text span's origin
 ```
 
-## Algorithm
+### Tokenisation strategies
 
-The algorithm starts similarly to `diff3`. Its inputs are a **parent** document and two conflicting versions: `left` and `right` which have been created from the parent through any series of concurrent edits.
+Reconcile offers different ways to split text for merging:
 
-1. **Diff calculation**: First, 2-way diffs of (parent & left) and (parent & right) are computed using Myers' algorithm
-2. **Tokenization**: The text is split into tokens (words, characters, etc.) for granular merging
-3. **Diff cleaning**: The tokens of the same diff are reordered and merged to end up to maximise patch sizes
-4. **Operation transformation (OT)**: The resulting edits are weaved together using operational transformation principles, ensuring no changes are lost
+- **Word tokeniser** (`BuiltinTokenizer::Word`) — Splits on word boundaries (recommended for prose)
+- **Character tokeniser** (`BuiltinTokenizer::Character`) — Individual characters (fine-grained control)
+- **Line tokeniser** (`BuiltinTokenizer::Line`) — Line-by-line (similar to `git merge`)
+- **Custom tokeniser** — Roll your own for specialised use cases
 
-`EditedText` (at least in the Rust library) exposes an implementation of OT. The primary purpose of this library isn't to implement OT but to provide automated text merging, howver, OT happens to provide an easy way of merging the output of Myers' diff. The same result could be achieved through many CRDT implementations as well. However, the merging quality is only as good as the 2-way diffs are. For instance, `reconcile` doesn't support `move` semantics as these are decomposed into an `insert` and `delete` operation by Myers'.
+### Cursor tracking
 
-## Motivation
+Ideal for collaborative editors — Reconcile tracks cursor positions through merges:
 
-Sometimes documents get edited concurrently by multiple users (or the same user from multiple devices) resulting in divergent changes.
+```javascript
+const result = reconcile(
+  'Hello world',
+  {
+    text: 'Hello beautiful world',
+    cursors: [{ id: 1, position: 6 }], // After "Hello "
+  },
+  {
+    text: 'Hi world',
+    cursors: [{ id: 2, position: 0 }], // At the beginning
+  }
+);
 
-To allow for offline editing, we could use CRDTs or Operational Transformation (OT) to come to a consistent resolution of the competing version. However, this requires capturing all user actions: insertions, deletes, move, copies, and pastes. In some applications, this is trivial if the document can only be edited through an editor that's in our control. But this isn't always the case. Users enjoy composable systems that don't lock them in. For example, one of the unique selling points of Obsidian is to provide an editor experience over a folder of Markdown files leaving the user free to change their technology of choice on a whim.
+// Cursors are automatically repositioned in the merged text
+console.log(result.cursors); // [{ id: 1, position: 3 }, { id: 2, position: 0 }]
+```
 
-This means that files can be edited out-of-channel and the only information a text synchronization system can know is the current content of each tracked file. This is described as Differential Synchronization [1]. This is the same problem as what Git and similar version control systems solve but in a manual way. Although the problem is similar, there's a relevant difference between syncing source code and personal notes: in the case of the former, a semantically incorrect conflict resolution can wreak havoc in a code base, or worse, introduce a correctness bug unnoticed. Text notes are different though, humans are well-equipped to finding the signal in a noisy environment and "bad merges" might result in a clumsy sentence but the reader will likely still understand the gist and can fix it if necessary.
+## How it works
 
-> There are domains of human text which are less tolerant of mis-merges: for instance, two conflicting changes to a contract could result in a term getting negated in different ways from both sides, resulting in a double-negation, thus unknowingly changing the meaning.
+Reconcile builds upon the foundation of `diff3` but adds intelligent conflict resolution. Given a **parent** document and two modified versions (`left` and `right`), here's what happens:
+
+1. **Diff computation** — Myers' algorithm calculates differences between (parent ↔ left) and (parent ↔ right)
+2. **Tokenisation** — Text splits into meaningful units (words, characters, etc.) for granular merging
+3. **Diff optimisation** — Operations are reordered and consolidated to maximise coherent changes
+4. **Operational Transformation** — Edits are woven together using OT principles, preserving all modifications
+
+Whilst Reconcile's primary goal isn't implementing Operational Transformation, OT provides an elegant way to merge Myers' diff output. The same could be achieved with CRDTs, though the quality depends entirely on the underlying 2-way diffs. Note that `move` operations aren't supported, as Myers' algorithm decomposes them into separate `insert` and `delete` operations.
+
+## Why Reconcile exists
+
+Collaborative editing is everywhere — multiple users editing documents simultaneously, or the same person working across devices. This creates the inevitable challenge of conflicting changes.
+
+Traditional solutions like CRDTs or Operational Transformation work brilliantly when you control the entire editing environment. They capture every keystroke, cursor movement, and operation. But real-world workflows are messier: users love tools that don't lock them in. Take Obsidian's approach with plain Markdown files — users can edit with any tool they fancy, from Vim to Word.
+
+This creates what's known as **Differential Synchronisation** [¹]: you only know the final state of each document, not how it got there. It's the same challenge Git tackles, but Git expects humans to resolve conflicts manually.
+
+Here's the key insight: whilst incorrect merges in source code can introduce devastating bugs, human text is more forgiving. People excel at extracting meaning from imperfect text — a slightly clumsy sentence is preferable to conflict markers interrupting the flow.
+
+> **Caveat**: Some text domains are less tolerant of imperfect merges. Legal contracts, for instance, could have unintended meaning changes from double-negations created by conflicting edits.
 
 ## Development
 
 ### Prerequisites
 
-#### Install Node.js
+#### Node.js setup
 
-- Install [nvm](https://github.com/nvm-sh/nvm): `curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash`
-- `nvm install 22`
-- `nvm use 22`
-- Optionally set the system-wide default: `nvm alias default 22`
+1. Install [nvm](https://github.com/nvm-sh/nvm):
+   ```bash
+   curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
+   ```
+2. Install and use Node 22:
+   ```bash
+   nvm install 22 && nvm use 22
+   ```
+3. Optionally set as default: `nvm alias default 22`
 
-#### Set up Rust
+#### Rust toolchain
 
-- Install [`rustup`](https://rustup.rs): `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`
-- `cargo install wasm-pack cargo-insta cargo-edit`
+1. Install [rustup](https://rustup.rs):
+   ```bash
+   curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
+   ```
+2. Install additional tools:
+   ```bash
+   cargo install wasm-pack cargo-insta cargo-edit
+   ```
 
-### Scripts
+### Development scripts
 
-- **Running tests**: `scripts/test.sh`
-- **Formatting**: `scripts/lint.sh`
-- **Building website**: `scripts/dev-website.sh`
-- **Publishing new version**: `scripts/bump-version.sh patch`
+- **Run tests**: `scripts/test.sh`
+- **Lint and format**: `scripts/lint.sh`
+- **Build demo website**: `scripts/dev-website.sh`
+- **Publish new version**: `scripts/bump-version.sh patch`
 
 TODO: license
 
-[1]: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35605.pdf
+[¹]: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35605.pdf
diff --git a/examples/website/index.html b/examples/website/index.html
index 9a2c5f1..c570dfd 100644
--- a/examples/website/index.html
+++ b/examples/website/index.html
@@ -8,18 +8,18 @@
 />
-
+
- 3-Way Text Merge
+ Reconcile: conflict-free text merging
@@ -28,49 +28,58 @@
-

Reconcile: automated 3-way text merge

+

Reconcile: conflict-free 3-way text merging

- The - reconcile - library solves a fundamental challenge in collaborative editing: what happens - when multiple users edit the same text simultaneously but we can only capture - the end result, not the intermediary edits? Essentially, it's + Think diff3 - (or git merge) but with automatic conflict resolution. -

-

- The - reconcile(parent: str, left: str, right: str) -> str - takes conflicting concurrent edits and intelligently merges them into a - unified result. Beyond basic conflict resolution, it offers sophisticated - merging heuristics, flexible tokenization options, and cursor position - tracking. -

-

- The algorithm begins with your chosen tokenizer, then applies Myers' diff - algorithm to compare the original text with both conflicting versions. These - diffs undergo transformation to preserve meaningful change sequences, before a - final merge strategy—inspired by Operational Transformation reconciles all - conflicting modifications without losing any edits. -

-

- For more details, see the - README. + or git merge, but with intelligent conflict resolution that + requires no user intervention. The + Reconcile + library tackles a fundamental challenge in collaborative editing: what happens + when multiple users edit the same text simultaneously, but the conflict + resolver only has access to the final results, not the intermediate steps?

- Use the tokenization options below to experiment with different strategies. - The library supports user-defined tokenizers as well. + Where traditional merge tools leave you with conflict markers to resolve + manually, Reconcile automatically weaves changes together. The + reconcile(parent, left, right) function takes conflicting edits + and produces clean, unified results using an algorithm inspired by Operational + Transformation. No more <<<<<<< markers + cluttering your text. +

+ +

+ The process starts with your chosen tokenisation strategy, then applies Myers' + diff algorithm to compare the original with both modified versions. These + diffs are optimised and transformed to preserve meaningful changes, before a + final merge strategy combines all modifications without losing any edits. +

+ +

+ Ready to dive deeper? Check out the + documentation + or try editing the text boxes below to see Reconcile in action. +

+ +

+ Use the tokenisation options below to experiment with different approaches — + the library also supports custom tokenisers.

@@ -87,7 +96,9 @@
Character - Split by individual characters + Fine-grained character-level merging
@@ -120,7 +131,7 @@
@@ -129,9 +140,9 @@
@@ -140,9 +151,9 @@
@@ -151,9 +162,9 @@