Update docs
parent
3b32d56915
commit
564dfe4418
2 changed files with 149 additions and 119 deletions
180
README.md
|
|
@@ -3,148 +3,172 @@
|
|||
[](https://github.com/schmelczer/reconcile/actions/workflows/check.yml)
|
||||
[](https://github.com/schmelczer/reconcile/actions/workflows/gh-pages.yml)
|
||||
|
||||
> [`diff3`](https://www.gnu.org/software/diffutils/manual/html_node/Invoking-diff3.html) (or `git merge`) but with automatic conflict resolution.
|
||||
> Think [`diff3`](https://www.gnu.org/software/diffutils/manual/html_node/Invoking-diff3.html) or `git merge`, but with intelligent conflict resolution that just works.
|
||||
|
||||
Reconcile is a Rust and JavaScript (through WebAssembly) library for merging text without user intervention. It automatically resolves conflicts that would typically require user action in traditional 3-way merge tools.
|
||||
Reconcile is a Rust and JavaScript (via WebAssembly) library that merges conflicting text edits without requiring manual intervention. Where traditional 3-way merge tools would leave you with conflict markers to resolve by hand, Reconcile automatically weaves changes together using sophisticated algorithms inspired by Operational Transformation.
|
||||
|
||||
Try out the [interactive demo](https://schmelczer.dev/reconcile)!
|
||||
✨ **[Try the interactive demo](https://schmelczer.dev/reconcile)** to see it in action!
|
||||
|
||||
TODO: add links for crates and npm
|
||||
|
||||
## Features
|
||||
## What makes Reconcile special?
|
||||
|
||||
- **Conflict-free output** - No more git conflict markers in the result
|
||||
- **Cursor/selection position tracking** - Automatically updates cursor positions during merging
|
||||
- **Pluggable tokenizer** - Choose between word-level, character-level, or custom tokenization
|
||||
- **Full UTF-8 support** - Handles Unicode text correctly
|
||||
- **WebAssembly support** - Use from JavaScript/TypeScript applications
|
||||
- **🚫 No conflict markers** — Clean, merged output without Git's `<<<<<<<` noise
|
||||
- **📍 Cursor tracking** — Automatically repositions cursors and selections during merging
|
||||
- **🔧 Flexible tokenisation** — Word-level (default), character-level, or custom strategies
|
||||
- **🌍 Unicode-first** — Full UTF-8 support
|
||||
- **🕸️ Cross-platform** — Native Rust performance with WebAssembly for JavaScript
|
||||
|
||||
## Quick Start
|
||||
## Quick start
|
||||
|
||||
### Rust
|
||||
|
||||
Add to your `Cargo.toml`:
|
||||
Add `reconcile` to your `Cargo.toml`:
|
||||
|
||||
```toml
|
||||
[dependencies]
|
||||
reconcile = "0.4"
|
||||
```
|
||||
|
||||
Then merge away:
|
||||
|
||||
```rust
|
||||
use reconcile::{reconcile, BuiltinTokenizer};
|
||||
|
||||
// Start with original text
|
||||
let parent = "Hello world";
|
||||
let left = "Hello beautiful world";
|
||||
let right = "Hi world";
|
||||
// Two people edit simultaneously
|
||||
let left = "Hello beautiful world"; // Added "beautiful"
|
||||
let right = "Hi world"; // Changed "Hello" to "Hi"
|
||||
|
||||
// Reconcile combines both changes intelligently
|
||||
let result = reconcile(parent, &left.into(), &right.into(), &*BuiltinTokenizer::Word);
|
||||
assert_eq!(result.apply().text(), "Hi beautiful world");
|
||||
```
|
||||
|
||||
### JavaScript/TypeScript
|
||||
|
||||
Install via npm:
|
||||
|
||||
```bash
|
||||
npm install reconcile
|
||||
```
|
||||
|
||||
```javascript
|
||||
import { init, reconcile } from "reconcile";
|
||||
Then use in your application:
|
||||
|
||||
// Initialize the WASM module (required before first use)
|
||||
```javascript
|
||||
import { init, reconcile } from 'reconcile';
|
||||
|
||||
// One-time setup: initialise the WASM module
|
||||
await init();
|
||||
|
||||
const parent = "Hello world";
|
||||
const left = "Hello beautiful world";
|
||||
const right = "Hi world";
|
||||
// Same example as above
|
||||
const parent = 'Hello world';
|
||||
const left = 'Hello beautiful world';
|
||||
const right = 'Hi world';
|
||||
|
||||
const result = reconcile(parent, left, right);
|
||||
console.log(result.text); // "Hi beautiful world"
|
||||
```
|
||||
|
||||
## API
|
||||
## Advanced usage
|
||||
|
||||
### Tokenizers
|
||||
### Edit provenance
|
||||
|
||||
Reconcile supports different tokenization strategies:
|
||||
|
||||
- **Word tokenizer** (`BuiltinTokenizer::Word`): Splits text into words (default, recommended for most use cases)
|
||||
- **Character tokenizer** (`BuiltinTokenizer::Character`): Splits text into individual characters (fine-grained merging)
|
||||
- **Custom tokenizer**: Implement your own tokenization logic
|
||||
|
||||
### Cursor Tracking
|
||||
|
||||
Reconcile can automatically update cursor and selection positions during merging:
|
||||
|
||||
```javascript
|
||||
const result = reconcile(
|
||||
"Hello world",
|
||||
{
|
||||
text: "Hello beautiful world",
|
||||
cursors: [{ id: 1, position: 6 }], // After "Hello "
|
||||
},
|
||||
{
|
||||
text: "Hi world",
|
||||
cursors: [{ id: 2, position: 0 }], // At beginning
|
||||
}
|
||||
);
|
||||
|
||||
// Result includes updated cursor positions
|
||||
console.log(result.cursors); // [{ id: 1, position: 3 }, { id: 2, position: 0 }]
|
||||
```
|
||||
|
||||
### History Tracking
|
||||
|
||||
Use `reconcileWithHistory` to get detailed information about the merge process:
|
||||
Track which changes came from where using `reconcileWithHistory`:
|
||||
|
||||
```javascript
|
||||
const result = reconcileWithHistory(parent, left, right);
|
||||
console.log(result.history); // Array of spans with their origins
|
||||
console.log(result.history); // Detailed breakdown of each text span's origin
|
||||
```
|
||||
|
||||
## Algorithm
|
||||
### Tokenisation strategies
|
||||
|
||||
The algorithm starts similarly to `diff3`. Its inputs are a **parent** document and two conflicting versions: `left` and `right` which have been created from the parent through any series of concurrent edits.
|
||||
Reconcile offers different ways to split text for merging:
|
||||
|
||||
1. **Diff calculation**: First, 2-way diffs of (parent & left) and (parent & right) are computed using Myers' algorithm
|
||||
2. **Tokenization**: The text is split into tokens (words, characters, etc.) for granular merging
|
||||
3. **Diff cleaning**: The tokens of the same diff are reordered and merged to maximise patch sizes
|
||||
4. **Operation transformation (OT)**: The resulting edits are weaved together using operational transformation principles, ensuring no changes are lost
|
||||
- **Word tokeniser** (`BuiltinTokenizer::Word`) — Splits on word boundaries (recommended for prose)
|
||||
- **Character tokeniser** (`BuiltinTokenizer::Character`) — Individual characters (fine-grained control)
|
||||
- **Line tokeniser** (`BuiltinTokenizer::Line`) — Line-by-line (similar to `git merge`)
|
||||
- **Custom tokeniser** — Roll your own for specialised use cases
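
To make the difference in granularity concrete, here is a minimal standalone sketch of word-, character-, and line-level splitting. It uses only the Rust standard library; the function names `split_words`, `split_chars`, and `split_lines` are hypothetical and are not the library's actual tokenisers, which may treat whitespace and punctuation differently.

```rust
/// Word-level: each word keeps its trailing whitespace character, so
/// concatenating the tokens reproduces the input exactly.
/// (Hypothetical helper, not part of the `reconcile` API.)
fn split_words(text: &str) -> Vec<String> {
    text.split_inclusive(char::is_whitespace)
        .map(str::to_string)
        .collect()
}

/// Character-level: one token per Unicode scalar value, which gives the
/// finest-grained merge but lets edits land inside a word.
fn split_chars(text: &str) -> Vec<String> {
    text.chars().map(String::from).collect()
}

/// Line-level: one token per line, keeping the newline with its line.
fn split_lines(text: &str) -> Vec<String> {
    text.split_inclusive('\n').map(str::to_string).collect()
}
```

With this sketch, `split_words("Hi beautiful world")` yields three tokens while `split_chars` on the same input yields eighteen, so two concurrent edits to the same word count as one conflicting token at word level but may touch disjoint tokens at character level.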
|
||||
|
||||
`EditedText` (at least in the Rust library) exposes an implementation of OT. The primary purpose of this library isn't to implement OT but to provide automated text merging; however, OT happens to provide an easy way of merging the output of Myers' diff. The same result could be achieved through many CRDT implementations as well. Either way, the merging quality is only as good as the underlying 2-way diffs. For instance, `reconcile` doesn't support `move` semantics as these are decomposed into an `insert` and a `delete` operation by Myers' algorithm.
|
||||
### Cursor tracking
|
||||
|
||||
## Motivation
|
||||
Ideal for collaborative editors — Reconcile tracks cursor positions through merges:
|
||||
|
||||
Sometimes documents get edited concurrently by multiple users (or the same user from multiple devices) resulting in divergent changes.
|
||||
```javascript
|
||||
const result = reconcile(
|
||||
'Hello world',
|
||||
{
|
||||
text: 'Hello beautiful world',
|
||||
cursors: [{ id: 1, position: 6 }], // After "Hello "
|
||||
},
|
||||
{
|
||||
text: 'Hi world',
|
||||
cursors: [{ id: 2, position: 0 }], // At the beginning
|
||||
}
|
||||
);
|
||||
|
||||
To allow for offline editing, we could use CRDTs or Operational Transformation (OT) to come to a consistent resolution of the competing version. However, this requires capturing all user actions: insertions, deletes, move, copies, and pastes. In some applications, this is trivial if the document can only be edited through an editor that's in our control. But this isn't always the case. Users enjoy composable systems that don't lock them in. For example, one of the unique selling points of Obsidian is to provide an editor experience over a folder of Markdown files leaving the user free to change their technology of choice on a whim.
|
||||
// Cursors are automatically repositioned in the merged text
|
||||
console.log(result.cursors); // [{ id: 1, position: 3 }, { id: 2, position: 0 }]
|
||||
```
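
The repositioning in the example above can be reasoned about with a small standalone sketch. This is not the library's implementation: the `Edit` type and `shift_cursor` function are hypothetical, and the sketch assumes the other side's change is a single replacement whose offsets are expressed in the same text the cursor lives in.

```rust
/// A single replacement made by the other side: `deleted` characters starting
/// at `start` are replaced by `inserted` characters.
/// (Hypothetical type for illustration only.)
struct Edit {
    start: usize,
    deleted: usize,
    inserted: usize,
}

/// Map a cursor (character index) through an edit made by the other side.
fn shift_cursor(cursor: usize, edit: &Edit) -> usize {
    let end = edit.start + edit.deleted;
    if cursor <= edit.start {
        // Cursor sits at or before the edit: nothing in front of it changed.
        cursor
    } else if cursor >= end {
        // Cursor sits after the edit: shift by the change in length.
        cursor - edit.deleted + edit.inserted
    } else {
        // Cursor was inside the replaced range: clamp to the end of the new text.
        edit.start + edit.inserted
    }
}
```

Under these assumptions, cursor 1 (position 6 in `Hello beautiful world`) passes through the other side's replacement of `Hello` with `Hi`, that is `Edit { start: 0, deleted: 5, inserted: 2 }`, and lands on position 3. Cursor 2 (position 0 in `Hi world`) sits before the insertion of `beautiful ` and stays at 0.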
|
||||
|
||||
This means that files can be edited out-of-channel and the only information a text synchronization system can know is the current content of each tracked file. This is described as Differential Synchronization [1]. This is the same problem as what Git and similar version control systems solve but in a manual way. Although the problem is similar, there's a relevant difference between syncing source code and personal notes: in the case of the former, a semantically incorrect conflict resolution can wreak havoc in a code base, or worse, introduce a correctness bug unnoticed. Text notes are different, though: humans are well equipped to find the signal in a noisy environment, and "bad merges" might result in a clumsy sentence, but the reader will likely still understand the gist and can fix it if necessary.
|
||||
## How it works
|
||||
|
||||
> There are domains of human text which are less tolerant of mis-merges: for instance, two conflicting changes to a contract could result in a term getting negated in different ways from both sides, resulting in a double-negation, thus unknowingly changing the meaning.
|
||||
Reconcile builds upon the foundation of `diff3` but adds intelligent conflict resolution. Given a **parent** document and two modified versions (`left` and `right`), here's what happens:
|
||||
|
||||
1. **Diff computation** — Myers' algorithm calculates differences between (parent ↔ left) and (parent ↔ right)
|
||||
2. **Tokenisation** — Text splits into meaningful units (words, characters, etc.) for granular merging
|
||||
3. **Diff optimisation** — Operations are reordered and consolidated to maximise coherent changes
|
||||
4. **Operational Transformation** — Edits are woven together using OT principles, preserving all modifications
|
||||
|
||||
Whilst Reconcile's primary goal isn't implementing Operational Transformation, OT provides an elegant way to merge Myers' diff output. The same could be achieved with CRDTs, though the quality depends entirely on the underlying 2-way diffs. Note that `move` operations aren't supported, as Myers' algorithm decomposes them into separate `insert` and `delete` operations.
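
The first two steps can be sketched in a few lines of standalone Rust. This is an illustration under stated assumptions, not the library's code: it swaps Myers' algorithm for a simpler longest-common-subsequence table, the names `Op`, `tokenize_words`, and `diff` are hypothetical, and the diff optimisation and Operational Transformation steps are not shown.

```rust
/// One step of a token-level edit script (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Equal(String),
    Delete(String),
    Insert(String),
}

/// Split text into word tokens; each word keeps its trailing whitespace run.
fn tokenize_words(text: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut in_whitespace = false;
    for ch in text.chars() {
        // A new token starts when a non-whitespace char follows whitespace.
        if in_whitespace && !ch.is_whitespace() {
            tokens.push(std::mem::take(&mut current));
        }
        in_whitespace = ch.is_whitespace();
        current.push(ch);
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}

/// 2-way diff via an LCS table (a stand-in for Myers' algorithm).
fn diff(old: &[String], new: &[String]) -> Vec<Op> {
    let (n, m) = (old.len(), new.len());
    // lcs[i][j] = length of the longest common subsequence of old[i..] and new[j..]
    let mut lcs = vec![vec![0usize; m + 1]; n + 1];
    for i in (0..n).rev() {
        for j in (0..m).rev() {
            lcs[i][j] = if old[i] == new[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }
    // Walk the table from the start, emitting deletes before inserts on ties.
    let (mut i, mut j) = (0, 0);
    let mut ops = Vec::new();
    while i < n && j < m {
        if old[i] == new[j] {
            ops.push(Op::Equal(old[i].clone()));
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            ops.push(Op::Delete(old[i].clone()));
            i += 1;
        } else {
            ops.push(Op::Insert(new[j].clone()));
            j += 1;
        }
    }
    ops.extend(old[i..].iter().cloned().map(Op::Delete));
    ops.extend(new[j..].iter().cloned().map(Op::Insert));
    ops
}
```

Running `diff` on the tokens of `Hello world` against `Hello beautiful world` gives an insert of `beautiful ` between two equal tokens, and against `Hi world` it gives a delete of `Hello ` followed by an insert of `Hi `. Those two edit scripts are the inputs that the later steps would weave together.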
|
||||
|
||||
## Why Reconcile exists
|
||||
|
||||
Collaborative editing is everywhere — multiple users editing documents simultaneously, or the same person working across devices. This creates the inevitable challenge of conflicting changes.
|
||||
|
||||
Traditional solutions like CRDTs or Operational Transformation work brilliantly when you control the entire editing environment. They capture every keystroke, cursor movement, and operation. But real-world workflows are messier: users love tools that don't lock them in. Take Obsidian's approach with plain Markdown files — users can edit with any tool they fancy, from Vim to Word.
|
||||
|
||||
This creates what's known as **Differential Synchronisation** [¹]: you only know the final state of each document, not how it got there. It's the same challenge Git tackles, but Git expects humans to resolve conflicts manually.
|
||||
|
||||
Here's the key insight: whilst incorrect merges in source code can introduce devastating bugs, human text is more forgiving. People excel at extracting meaning from imperfect text — a slightly clumsy sentence is preferable to conflict markers interrupting the flow.
|
||||
|
||||
> **Caveat**: Some text domains are less tolerant of imperfect merges. Legal contracts, for instance, could have unintended meaning changes from double-negations created by conflicting edits.
|
||||
|
||||
## Development
|
||||
|
||||
### Prerequisites
|
||||
|
||||
#### Install Node.js
|
||||
#### Node.js setup
|
||||
|
||||
- Install [nvm](https://github.com/nvm-sh/nvm): `curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash`
|
||||
- `nvm install 22`
|
||||
- `nvm use 22`
|
||||
- Optionally set the system-wide default: `nvm alias default 22`
|
||||
1. Install [nvm](https://github.com/nvm-sh/nvm):
|
||||
```bash
|
||||
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
|
||||
```
|
||||
2. Install and use Node 22:
|
||||
```bash
|
||||
nvm install 22 && nvm use 22
|
||||
```
|
||||
3. Optionally set as default: `nvm alias default 22`
|
||||
|
||||
#### Set up Rust
|
||||
#### Rust toolchain
|
||||
|
||||
- Install [`rustup`](https://rustup.rs): `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`
|
||||
- `cargo install wasm-pack cargo-insta cargo-edit`
|
||||
1. Install [rustup](https://rustup.rs):
|
||||
```bash
|
||||
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
|
||||
```
|
||||
2. Install additional tools:
|
||||
```bash
|
||||
cargo install wasm-pack cargo-insta cargo-edit
|
||||
```
|
||||
|
||||
### Scripts
|
||||
### Development scripts
|
||||
|
||||
- **Running tests**: `scripts/test.sh`
|
||||
- **Formatting**: `scripts/lint.sh`
|
||||
- **Building website**: `scripts/dev-website.sh`
|
||||
- **Publishing new version**: `scripts/bump-version.sh patch`
|
||||
- **Run tests**: `scripts/test.sh`
|
||||
- **Lint and format**: `scripts/lint.sh`
|
||||
- **Build demo website**: `scripts/dev-website.sh`
|
||||
- **Publish new version**: `scripts/bump-version.sh patch`
|
||||
|
||||
TODO: license
|
||||
|
||||
[1]: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35605.pdf
|
||||
[¹]: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35605.pdf
|
||||
|
|
|
|||
88
src/lib.rs
|
|
@@ -1,80 +1,86 @@
|
|||
//! # Reconcile
|
||||
//! # Reconcile: conflict-free 3-way text merging
|
||||
//!
|
||||
//! [`diff3`](https://www.gnu.org/software/diffutils/manual/html_node/Invoking-diff3.html) (or `git merge`)
|
||||
//! but with automatic conflict resolution.
|
||||
//! Think [`diff3`](https://www.gnu.org/software/diffutils/manual/html_node/Invoking-diff3.html) or `git merge`,
|
||||
//! but with intelligent conflict resolution.
|
||||
//!
|
||||
//! Reconcile is a Rust and JavaScript (through WebAssembly) library for merging
|
||||
//! text without user intervention. It automatically resolves conflicts that
|
||||
//! would typically require user action in traditional 3-way merge tools.
|
||||
//! Reconcile is a Rust and JavaScript (via WebAssembly) library that merges
|
||||
//! conflicting text edits without requiring manual intervention. Where
|
||||
//! traditional 3-way merge tools would leave you with conflict markers to
|
||||
//! resolve by hand, Reconcile automatically weaves changes together using
|
||||
//! sophisticated algorithms inspired by Operational Transformation.
|
||||
//!
|
||||
//! Try out the [interactive demo](https://schmelczer.dev/reconcile)!
|
||||
//! ✨ **[Try the interactive demo](https://schmelczer.dev/reconcile)** to see it in action!
|
||||
//!
|
||||
//! ```
|
||||
//! use reconcile::{reconcile, BuiltinTokenizer};
|
||||
//!
|
||||
//! // Start with original text
|
||||
//! let parent = "Merging text is hard!";
|
||||
//! let left = "Merging text is easy!";
|
||||
//! let right = "With reconcile, merging documents is hard!";
|
||||
//! // Two people edit simultaneously
|
||||
//! let left = "Merging text is easy!"; // Changed "hard" to "easy"
|
||||
//! let right = "With reconcile, merging documents is hard!"; // Added prefix and changed word
|
||||
//!
|
||||
//! let deconflicted = reconcile(parent, &left.into(), &right.into(), &*BuiltinTokenizer::Word);
|
||||
//! assert_eq!(deconflicted.apply().text(), "With reconcile, merging documents is easy!");
|
||||
//! // Reconcile combines both changes intelligently
|
||||
//! let result = reconcile(parent, &left.into(), &right.into(), &*BuiltinTokenizer::Word);
|
||||
//! assert_eq!(result.apply().text(), "With reconcile, merging documents is easy!");
|
||||
//! ```
|
||||
//! > You can also try out an interactive demo at [schmelczer.dev/reconcile](https://schmelczer.dev/reconcile).
|
||||
//!
|
||||
//! ## Tokenizing
|
||||
//! ## Tokenisation strategies
|
||||
//!
|
||||
//! Merging is done on the token level, the granularity of which is
|
||||
//! configurable. By default, words are the atoms for merging and thus words
|
||||
//! can't get jumbled up at the end of reconciling.
|
||||
//! Merging happens at the token level, where you control the granularity.
|
||||
//! By default, words serve as the atomic units for merging, ensuring words
|
||||
//! remain intact during the reconciliation process.
|
||||
//!
|
||||
//! ### Built-in tokenizers
|
||||
//! ### Built-in tokenisers
|
||||
//!
|
||||
//! ```
|
||||
//! use reconcile::{reconcile, BuiltinTokenizer};
|
||||
//!
|
||||
//! let parent = "The quick brown fox\n";
|
||||
//! let left = "The very quick brown fox\n";
|
||||
//! let right = "The quick red fox\n";
|
||||
//! let left = "The very quick brown fox\n"; // Added "very"
|
||||
//! let right = "The quick red fox\n"; // Changed "brown" to "red"
|
||||
//!
|
||||
//! // Using line-based tokenisation
|
||||
//! let result = reconcile(parent, &left.into(), &right.into(), &*BuiltinTokenizer::Line);
|
||||
//! assert_eq!(result.apply().text(), "The quick red foxThe very quick brown fox\n");
|
||||
//! ```
|
||||
//!
|
||||
//! ### Custom tokenization
|
||||
//! ### Custom tokenisation
|
||||
//!
|
||||
//! If something custom is needed, for instance, to better support structured
|
||||
//! text such as Markdown or HTML, a custom tokenizer can be implemented:
|
||||
//! For specialised use cases—such as structured text like Markdown or HTML—
|
||||
//! you can implement custom tokenisation logic:
|
||||
//!
|
||||
//! ```
|
||||
//! use reconcile::{reconcile, Token, BuiltinTokenizer};
|
||||
//!
|
||||
//! // Example with custom tokenizer - split by sentences
|
||||
//! let sentence_tokenizer = |text: &str| {
|
||||
//! // Example: custom sentence-based tokeniser
|
||||
//! let sentence_tokeniser = |text: &str| {
|
||||
//! text.split(". ")
|
||||
//! .map(|sentence| Token::new(
|
||||
//! sentence.to_string(),
|
||||
//! sentence.to_string(),
|
||||
//! false, // don't allow joining token with the preceding one
|
||||
//! false, // don't allow joining token with the following one
|
||||
//! false, // don't allow joining with the preceding token
|
||||
//! false, // don't allow joining with the following token
|
||||
//! ))
|
||||
//! .collect::<Vec<_>>()
|
||||
//! };
|
||||
//!
|
||||
//! let parent = "Hello world. This is a test.";
|
||||
//! let left = "Hello beautiful world. This is a test.";
|
||||
//! let right = "Hello world. This is a great test.";
|
||||
//! let left = "Hello beautiful world. This is a test."; // Added "beautiful"
|
||||
//! let right = "Hello world. This is a great test."; // Added "great"
|
||||
//!
|
||||
//! // Using built-in tokenizer is usually sufficient
|
||||
//! // For most cases, the built-in word tokeniser works perfectly
|
||||
//! let result = reconcile(parent, &left.into(), &right.into(), &*BuiltinTokenizer::Word);
|
||||
//! assert_eq!(result.apply().text(), "Hello beautiful world. This is a great test.");
|
||||
//! ```
|
||||
//! > By setting the joinability to `false`, longer runs of inserts will be
|
||||
//! > interleaved like LRLRLR instead of LLLRRR.
|
||||
//! > **Tip**: Setting joinability to `false` causes longer runs of insertions
|
||||
//! > to interleave (LRLRLR) rather than group together (LLLRRR), which can
|
||||
//! > produce more natural-looking merged text.
|
||||
//!
|
||||
//! ## Cursors and selection ranges
|
||||
//! ## Cursor tracking
|
||||
//!
|
||||
//! The library supports updating cursor and selection ranges during the merging
|
||||
//! for interactive workflows:
|
||||
//! Perfect for collaborative editors—the library automatically repositions
|
||||
//! cursors and selection ranges during merging:
|
||||
//!
|
||||
//! ```
|
||||
//! use reconcile::{reconcile, BuiltinTokenizer, TextWithCursors, CursorPosition};
|
||||
|
|
@@ -86,21 +92,21 @@
|
|||
//! );
|
||||
//! let right = TextWithCursors::new(
|
||||
//! "Hi world".to_string(),
|
||||
//! vec![CursorPosition { id: 2, char_index: 0 }] // At beginning
|
||||
//! vec![CursorPosition { id: 2, char_index: 0 }] // At the beginning
|
||||
//! );
|
||||
//!
|
||||
//! let result = reconcile(parent, &left, &right, &*BuiltinTokenizer::Word);
|
||||
//! let merged = result.apply();
|
||||
//!
|
||||
//! assert_eq!(merged.text(), "Hi beautiful world");
|
||||
//! // Cursors are automatically repositioned
|
||||
//! // Cursors are automatically repositioned in the merged text
|
||||
//! assert_eq!(merged.cursors().len(), 2);
|
||||
//! ```
|
||||
//!
|
||||
//! ## The algorithm
|
||||
//! ## How it works
|
||||
//!
|
||||
//! For a discussion of the algorithm and architecture, see the
|
||||
//! [README](README.md#algorithm) page.
|
||||
//! For a detailed explanation of the algorithm and architecture, see the
|
||||
//! [README](README.md#how-it-works).
|
||||
|
||||
mod operation_transformation;
|
||||
mod raw_operation;
|
||||
|
|
@@ -108,8 +114,8 @@ mod tokenizer;
|
|||
mod types;
|
||||
mod utils;
|
||||
|
||||
pub use operation_transformation::{EditedText, reconcile};
|
||||
pub use tokenizer::{BuiltinTokenizer, Tokenizer, token::Token};
|
||||
pub use operation_transformation::{reconcile, EditedText};
|
||||
pub use tokenizer::{token::Token, BuiltinTokenizer, Tokenizer};
|
||||
pub use types::{
|
||||
cursor_position::CursorPosition, history::History, side::Side,
|
||||
span_with_history::SpanWithHistory, text_with_cursors::TextWithCursors,
|
||||
|
|
|
|||