Improve docs

Andras Schmelczer 2025-07-06 12:28:46 +01:00
parent 2a4b5dd496
commit 077ba9416a
GPG key ID: FC8F2C3D3D1A718C
6 changed files with 23 additions and 162 deletions


@ -1,22 +1,13 @@
# Reconcile: conflict-free 3-way text merging
> `diff3` but with automatic conflict resolution.
> [`diff3`](https://www.gnu.org/software/diffutils/manual/html_node/Invoking-diff3.html) but with automatic conflict resolution.
[![Check](https://github.com/schmelczer/reconcile/actions/workflows/check.yml/badge.svg)](https://github.com/schmelczer/reconcile/actions/workflows/check.yml)
[![Publish to GitHub Pages](https://github.com/schmelczer/reconcile/actions/workflows/gh-pages.yml/badge.svg)](https://github.com/schmelczer/reconcile/actions/workflows/gh-pages.yml)
Reconcile is a Rust and JavaScript (through WebAssembly) library for merging text without user intervention. It automatically resolves conflicts that would typically require manual intervention in traditional 3-way merge tools.
TODO: add links for crates and npm
```rust
use reconcile::{reconcile, BuiltinTokenizer};
let parent = "Merging text is hard!";
let left = "Merging text is easy!";
let right = "With reconcile, merging documents is hard!";
let deconflicted = reconcile(parent, &left.into(), &right.into(), &*BuiltinTokenizer::Word);
assert_eq!(deconflicted.apply().text(), "With reconcile, merging documents is easy!");
```
Reconcile is a Rust and JavaScript (through WebAssembly) library for merging text without user intervention. It automatically resolves conflicts that would typically require user action in traditional 3-way merge tools.
## Features
@ -31,6 +22,7 @@ assert_eq!(deconflicted.apply().text(), "With reconcile, merging documents is ea
### Rust
Add to your `Cargo.toml`:
```toml
[dependencies]
reconcile = "0.4"
@ -54,7 +46,7 @@ npm install reconcile
```
```javascript
import { init, reconcile } from 'reconcile';
import { init, reconcile } from "reconcile";
// Initialize the WASM module (required before first use)
await init();
@ -73,9 +65,9 @@ console.log(result.text); // "Hi beautiful world"
Reconcile supports different tokenization strategies:
- **Word tokenizer** (`BuiltinTokenizer::Word`): Splits text into words (default, recommended for most use cases)
- **Character tokenizer** (`BuiltinTokenizer::Character`): Splits text into individual characters (fine-grained merging)
- **Custom tokenizer**: Implement your own tokenization logic
- **Word tokenizer** (`BuiltinTokenizer::Word`): Splits text into words (default, recommended for most use cases)
- **Character tokenizer** (`BuiltinTokenizer::Character`): Splits text into individual characters (fine-grained merging)
- **Custom tokenizer**: Implement your own tokenization logic
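To make the difference in granularity concrete, here is a minimal, self-contained sketch of the two built-in strategies. It does not use the library's `BuiltinTokenizer`; the helper names `word_tokens` and `char_tokens` are hypothetical, and splitting on whitespace is an assumption about what "words" means here rather than the library's actual rule.

```rust
/// Word-level tokens. Assumption for this sketch: a "word" is any
/// whitespace-separated chunk, and the whitespace itself is dropped.
fn word_tokens(text: &str) -> Vec<&str> {
    text.split_whitespace().collect()
}

/// Character-level tokens: one token per Unicode scalar value,
/// including spaces and punctuation.
fn char_tokens(text: &str) -> Vec<String> {
    text.chars().map(|c| c.to_string()).collect()
}
```

With word tokens, two edits inside the same word touch the same token; with character tokens they can land on different tokens and be merged separately, at the cost of noisier diffs.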
### Cursor Tracking
@ -86,11 +78,11 @@ const result = reconcile(
"Hello world",
{
text: "Hello beautiful world",
cursors: [{ id: 1, position: 6 }] // After "Hello "
cursors: [{ id: 1, position: 6 }], // After "Hello "
},
{
text: "Hi world",
cursors: [{ id: 2, position: 0 }] // At beginning
cursors: [{ id: 2, position: 0 }], // At beginning
}
);
@ -113,10 +105,10 @@ The algorithm starts similarly to `diff3`. Its inputs are a **parent** document
1. **Diff calculation**: First, 2-way diffs of (parent & left) and (parent & right) are computed using Myers' algorithm
2. **Tokenization**: The text is split into tokens (words, characters, etc.) for granular merging
3. **Operation transformation**: The resulting edits are weaved together using operational transformation principles, ensuring no changes are lost
4. **Conflict resolution**: Unlike traditional 3-way merge tools, Reconcile automatically resolves conflicts without producing conflict markers
3. **Diff cleaning**: Tokens within the same diff are reordered and merged to maximise patch sizes
4. **Operation transformation (OT)**: The resulting edits are woven together using operational transformation principles, ensuring no changes are lost
The key insight is that both insertions and deletions are preserved: if either side inserted text, it appears in the result; if either side deleted text, the deletion is applied, but insertions into deleted regions are still preserved.
`EditedText` (at least in the Rust library) exposes an implementation of OT. The primary purpose of this library isn't to implement OT but to provide automated text merging; however, OT happens to provide an easy way of merging the output of Myers' diff. The same result could be achieved through many CRDT implementations as well. Either way, the merging quality is only as good as the 2-way diffs are. For instance, `reconcile` doesn't support `move` semantics, as a move is decomposed into an `insert` and a `delete` operation by Myers' algorithm.
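The rule that both insertions and deletions are preserved can be illustrated with a toy token-level merge. This is a sketch under stated assumptions, not the library's implementation: it uses a plain LCS diff instead of Myers' algorithm, skips diff cleaning, ignores cursors, and the names `Edits`, `diff`, and `merge` are hypothetical.

```rust
/// One side's edits, expressed in parent-token coordinates.
struct Edits {
    /// deleted[i] is true if parent token i was removed.
    deleted: Vec<bool>,
    /// inserted[i] holds tokens inserted before parent token i
    /// (index parent.len() means "appended at the end").
    inserted: Vec<Vec<String>>,
}

/// 2-way diff via a longest-common-subsequence table
/// (a stand-in for Myers' algorithm in this sketch).
fn diff(parent: &[&str], child: &[&str]) -> Edits {
    let (n, m) = (parent.len(), child.len());
    let mut lcs = vec![vec![0usize; m + 1]; n + 1];
    for i in (0..n).rev() {
        for j in (0..m).rev() {
            lcs[i][j] = if parent[i] == child[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }
    let mut edits = Edits {
        deleted: vec![false; n],
        inserted: vec![Vec::new(); n + 1],
    };
    let (mut i, mut j) = (0, 0);
    while i < n || j < m {
        if i < n && j < m && parent[i] == child[j] {
            // Token kept unchanged.
            i += 1;
            j += 1;
        } else if j < m && (i == n || lcs[i][j + 1] >= lcs[i + 1][j]) {
            // Child has a token the parent lacks: record an insertion.
            edits.inserted[i].push(child[j].to_string());
            j += 1;
        } else {
            // Parent token is missing from the child: record a deletion.
            edits.deleted[i] = true;
            i += 1;
        }
    }
    edits
}

/// Merge without conflict markers: every insertion from either side is
/// kept, and a parent token survives only if neither side deleted it.
fn merge(parent: &[&str], left: &[&str], right: &[&str]) -> Vec<String> {
    let (l, r) = (diff(parent, left), diff(parent, right));
    let mut out = Vec::new();
    for i in 0..=parent.len() {
        out.extend(l.inserted[i].iter().cloned());
        // Identical insertions at the same spot are emitted only once.
        if r.inserted[i] != l.inserted[i] {
            out.extend(r.inserted[i].iter().cloned());
        }
        if i < parent.len() && !l.deleted[i] && !r.deleted[i] {
            out.push(parent[i].to_string());
        }
    }
    out
}
```

Calling `merge(&["Hello", "world"], &["Hello", "beautiful", "world"], &["Hi", "world"])` and joining the tokens with spaces gives `Hi beautiful world`: the left side's insertion and the right side's replacement are both kept. Placing left insertions before right ones at the same position is an arbitrary choice made for this sketch.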
## Motivation
@ -124,7 +116,7 @@ Sometimes documents get edited concurrently by multiple users (or the same user
To allow for offline editing, we could use CRDTs or Operational Transformation (OT) to come to a consistent resolution of the competing versions. However, this requires capturing all user actions: insertions, deletions, moves, copies, and pastes. In some applications, this is trivial if the document can only be edited through an editor that's in our control. But this isn't always the case. Users enjoy composable systems that don't lock them in. For example, one of the unique selling points of Obsidian is to provide an editor experience over a folder of Markdown files, leaving the user free to change their technology of choice on a whim.
This means that files can be edited out-of-channel and the only information a text synchronization system has is the current content of each tracked file. This is the same problem that Git and similar version control systems solve. Although the problem is similar, there's a relevant difference between syncing source code and personal notes: in the case of the former, a semantically incorrect conflict resolution can wreak havoc in a code base, or worse, introduce a correctness bug unnoticed. Text notes are different, though: humans are well equipped to find the signal in a noisy environment, and "bad merges" might result in a clumsy sentence, but the reader will likely still understand the gist and can fix it if necessary.
This means that files can be edited out-of-channel and the only information a text synchronization system has is the current content of each tracked file. This is described as Differential Synchronization [1]. This is the same problem that Git and similar version control systems solve, but in a manual way. Although the problem is similar, there's a relevant difference between syncing source code and personal notes: in the case of the former, a semantically incorrect conflict resolution can wreak havoc in a code base, or worse, introduce a correctness bug unnoticed. Text notes are different, though: humans are well equipped to find the signal in a noisy environment, and "bad merges" might result in a clumsy sentence, but the reader will likely still understand the gist and can fix it if necessary.
> There are domains of human text which are less tolerant of mis-merges: for instance, two conflicting changes to a contract could result in a term getting negated in different ways from both sides, resulting in a double-negation, thus unknowingly changing the meaning.
@ -133,49 +125,24 @@ This means that files can be edited out-of-channel and the only information a te
### Prerequisites
#### Install Node.js
- Install [nvm](https://github.com/nvm-sh/nvm): `curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash`
- `nvm install 22`
- `nvm use 22`
- Optionally set the system-wide default: `nvm alias default 22`
#### Set up Rust
- Install [`rustup`](https://rustup.rs): `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`
- Install [`wasm-pack`](https://rustwasm.github.io/wasm-pack/installer): `curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh`
- `cargo install cargo-insta cargo-edit`
### Building
```bash
# Build Rust library
cargo build
# Build WASM bindings
wasm-pack build --target web
# Build JavaScript package
cd reconcile-js
npm install
npm run build
```
### Testing
```bash
# Test Rust library
cargo test
# Test JavaScript bindings
cd reconcile-js
npm test
```
- `cargo install wasm-pack cargo-insta cargo-edit`
### Scripts
#### Publish new version
```sh
scripts/bump-version.sh patch
```
- **Running tests**: `scripts/test.sh`
- **Formatting**: `scripts/lint.sh`
- **Building website**: `scripts/dev-website.sh`
- **Publishing new version**: `scripts/bump-version.sh patch`
## License
TODO: license
MIT
[1]: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35605.pdf

1
a.md

@ -1 +0,0 @@
`EditedText` (at least in the Rust library) exposes an implementation of OT. The primary purpose of this library isn't to implement OT but to provide automated text merging; however, OT happens to provide an easy way of merging the output of Myers' diff. The same result could be achieved through many CRDT implementations as well. Either way, the merging quality is only as good as the 2-way diffs are. For instance, `reconcile` doesn't handle `move` operations well, as a move is decomposed into an `insert` and a `delete` operation by Myers' algorithm.


@ -1,54 +0,0 @@
# Reconcile: Interactive Demo
This is the interactive demo website for the Reconcile library. Visit [schmelczer.dev/reconcile](https://schmelczer.dev/reconcile) to try it out.
## About the Demo
The demo allows you to:
- Enter three text versions (parent, left, right)
- See the reconciled result in real-time
- Experiment with different tokenization strategies
- Observe how cursor positions are updated during merging
- View the history of operations that led to the result
## Features Demonstrated
- **Conflict-free merging**: No conflict markers in the output
- **Cursor tracking**: See how cursor positions are automatically updated
- **Different tokenizers**: Compare word-level vs. character-level tokenization
- **Operation history**: Understand the merge process step-by-step
## Running Locally
```bash
# Build the WASM module first
cd ../..
wasm-pack build --target web
# Install dependencies and run the demo
cd examples/website
npm install
npm run dev
```
## Usage Examples
Try these examples in the demo:
### Basic merge
- **Parent**: "Hello world"
- **Left**: "Hello beautiful world"
- **Right**: "Hi world"
- **Result**: "Hi beautiful world"
### Cursor tracking
- **Parent**: "The quick brown fox"
- **Left**: "The very quick brown fox" (cursor at position 4)
- **Right**: "The quick red fox" (cursor at position 10)
- **Result**: Cursors automatically repositioned
### Character-level merging
Switch to character tokenizer for fine-grained merging of individual characters rather than whole words.
For more examples and detailed documentation, see the [main README](../../README.md).


@ -16,9 +16,6 @@ export interface TextWithCursors {
cursors: null | undefined | CursorPosition[];
}
/**
* Represents a cursor position with a unique identifier.
*/
export interface CursorPosition {
/** Unique identifier for the cursor */
id: number;
@ -42,9 +39,6 @@ export interface SpanWithHistory {
history: History;
}
/**
* Supported tokenizer types for text processing.
*/
export type Tokenizer = "word" | "character";
let isInitialised = false;


@ -1,7 +0,0 @@
#!/bin/bash
set -e
rm -rf pkg
wasm-pack build --target web --features wasm,wee_alloc


@ -2,44 +2,6 @@
This directory contains YAML test cases that demonstrate various reconcile scenarios.
## Format
Each YAML file contains test documents with the following structure:
```yaml
parent: "Original text"
left:
text: "Left version"
cursors:
- id: 1
char_index: 5
right:
text: "Right version"
cursors:
- id: 2
char_index: 10
expected:
text: "Expected result"
cursors:
- id: 1
char_index: 8
- id: 2
char_index: 12
```
## Cursor Position Notation
In some test cases, the `|` character is used to denote cursor positions within the text. These characters are stripped before the actual reconcile logic is run, making it easier to visualize where cursors should be positioned.
## Running Tests
These examples are automatically tested as part of the test suite:
```bash
cargo test
```
The tests verify that:
1. Text is merged correctly without conflicts
2. Cursor positions are updated accurately
3. The merge result is consistent regardless of argument order (left/right swap)