Update docs

Andras Schmelczer 2025-11-22 12:13:22 +00:00
parent 38810579ec
commit 00d2061627
20 changed files with 1149 additions and 569 deletions

@ -9,11 +9,13 @@ Operational transformation is a technique for managing concurrent edits to the s
### Why OT?
Traditional conflict resolution approaches:
- **Last write wins**: Loses data, frustrating for users
- **Manual merging**: Interrupts workflow, requires user intervention
- **Version branching**: Complex, not suitable for real-time sync
Operational transformation:
- **Automatic**: No user intervention required
- **Preserves all edits**: No data loss
- **Real-time**: Changes appear immediately
@ -23,6 +25,39 @@ Operational transformation:
VaultLink uses the [`reconcile-text`](https://crates.io/crates/reconcile-text) Rust library for operational transformation on text documents.
### Why reconcile-text over CRDTs?
VaultLink faces a **differential synchronization** challenge: users edit Obsidian vaults with various editors (Obsidian desktop, Obsidian mobile, Vim, VS Code, or any text editor), often while offline. This means we only observe the **final state** of each document after editing, not the individual keystrokes or operations that produced it.
**The fundamental problem**:
- **Text CRDTs and traditional OT** generally need every editing operation (each character insertion and deletion) captured as it happens
- **VaultLink's reality**: Users edit files with arbitrary tools, sync happens after the fact
- **What we know**: Parent version and two modified versions
- **What we don't know**: The sequence of operations that created those modifications
**Why reconcile-text wins for this use case**:
1. **Works with end states only**: reconcile-text performs conflict-free 3-way merging given just parent, left, and right versions—no operation history needed
2. **Editor-agnostic**: Users can edit with any tool without requiring VaultLink-specific plugins or operation tracking
3. **Offline-first**: Edits made while disconnected are merged cleanly when sync resumes, because we're diffing final states rather than replaying operations
4. **No conflict markers**: Unlike Git merge, produces clean merged output without `<<<<<<<` markers that interrupt note-taking flow
5. **Tolerant of imperfect merges**: For knowledge bases and documentation, a slightly imperfect merge (e.g., minor word order issues) is far preferable to manual conflict resolution
6. **Simpler infrastructure**: No need for complex operation capture, transformation logs, or tombstone management that CRDTs require
**The tradeoff**:
CRDTs excel when you control the entire editing infrastructure and can capture every operation. reconcile-text excels when you're synchronizing independently-edited files—exactly VaultLink's scenario. The merge quality depends on Myers' diff algorithm rather than operation history, which is the correct tradeoff for differential sync.
For note-taking workflows where users value editor freedom and offline editing, this approach provides a better user experience than either CRDTs (which would require operation tracking) or Git-style merging (which requires manual conflict resolution).
[Learn more about reconcile-text →](https://schmelczer.dev/reconcile)
### How It Works
Given a base document and two sets of changes, OT produces a merged result that includes both changes.
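
A minimal sketch of merging from end states only, under simplifying assumptions: each side made one contiguous change, and overlapping changes are reported rather than resolved. The names `Edit`, `infer_edit`, and `merge` are hypothetical and are not part of the `reconcile-text` API, which relies on a real diff algorithm rather than prefix/suffix trimming.

```rust
/// One contiguous change: replace `base[start..end]` with `text`.
/// Positions are character indices, not byte offsets.
#[derive(Debug, Clone, PartialEq)]
pub struct Edit {
    pub start: usize,
    pub end: usize,
    pub text: String,
}

/// Recover a single edit from two end states by trimming the common
/// prefix and suffix. No keystroke history is needed.
pub fn infer_edit(base: &str, modified: &str) -> Edit {
    let b: Vec<char> = base.chars().collect();
    let m: Vec<char> = modified.chars().collect();
    let prefix = b.iter().zip(m.iter()).take_while(|(x, y)| x == y).count();
    // The suffix must not reach back into the prefix on either side.
    let max_suffix = (b.len() - prefix).min(m.len() - prefix);
    let suffix = b
        .iter()
        .rev()
        .zip(m.iter().rev())
        .take(max_suffix)
        .take_while(|(x, y)| x == y)
        .count();
    Edit {
        start: prefix,
        end: b.len() - suffix,
        text: m[prefix..m.len() - suffix].iter().collect(),
    }
}

/// Merge two modified versions of `base`. Returns `None` when the two
/// edits touch overlapping regions, which this sketch does not resolve.
pub fn merge(base: &str, left: &str, right: &str) -> Option<String> {
    if left == right {
        return Some(left.to_string());
    }
    let mut edits = [infer_edit(base, left), infer_edit(base, right)];
    edits.sort_by_key(|e| (e.start, e.end));
    if edits[0].end > edits[1].start {
        return None;
    }
    let mut chars: Vec<char> = base.chars().collect();
    // Apply the rightmost edit first so the other edit's positions stay valid.
    for edit in edits.iter().rev() {
        drop(chars.splice(edit.start..edit.end, edit.text.chars()));
    }
    Some(chars.into_iter().collect())
}
```

Calling `merge` with a base and two independently edited copies returns the combined text when the edits are disjoint. A production merge would diff at a finer granularity and handle several changes per side.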
@ -41,6 +76,7 @@ OT result: "Hello beautiful world!" (both changes applied)
### Operation Types
The algorithm handles these operations:
- **Insert**: Add text at position
- **Delete**: Remove text from position
- **Retain**: Keep existing text unchanged
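
The three operation types can be modelled as a small enum that is applied left to right over the document. This is an illustrative sketch; the `Op` type and `apply` function are assumptions for this example, not the library's actual representation.

```rust
/// The three operation kinds listed above. Counts are in characters.
#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Retain(usize),
    Insert(String),
    Delete(usize),
}

/// Apply `ops` to `doc` from left to right. Text after the last
/// operation is kept unchanged (an assumption of this sketch).
pub fn apply(doc: &str, ops: &[Op]) -> Result<String, String> {
    let chars: Vec<char> = doc.chars().collect();
    let mut cursor = 0;
    let mut out = String::new();
    for op in ops {
        match op {
            Op::Retain(n) => {
                let end = cursor + n;
                if end > chars.len() {
                    return Err(format!("retain runs past the end at {}", cursor));
                }
                out.extend(&chars[cursor..end]);
                cursor = end;
            }
            Op::Insert(text) => out.push_str(text),
            Op::Delete(n) => {
                let end = cursor + n;
                if end > chars.len() {
                    return Err(format!("delete runs past the end at {}", cursor));
                }
                // Skip the deleted characters without copying them.
                cursor = end;
            }
        }
    }
    out.extend(&chars[cursor..]);
    Ok(out)
}
```

For instance, `[Retain(6), Insert("beautiful "), Retain(6)]` applied to `"Hello world!"` keeps `"Hello "`, adds the new word, and keeps `"world!"`.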
@ -62,10 +98,12 @@ VaultLink maintains sync state to track which changes have been applied.
### Version Vectors
Each document has a version tracked by:
- **Server version**: Incremented on each change
- **Client cursors**: Track which version each client has seen
This enables:
- Efficient syncing (only send changes since last sync)
- Conflict detection (concurrent edits to same version)
- Ordering of operations
@ -84,6 +122,7 @@ struct Cursor {
```
On sync:
1. Client sends cursor (last seen version)
2. Server returns all changes since that version
3. Client applies changes and updates cursor
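
The sync steps above can be sketched as a change log queried by version. All names here (`Change`, `ChangeLog`, `ClientState`) are hypothetical and do not reflect VaultLink's actual types or wire protocol.

```rust
/// One recorded change; `summary` stands in for the real payload.
#[derive(Debug, Clone, PartialEq)]
pub struct Change {
    pub version: u64,
    pub summary: String,
}

/// Server-side log whose version increases by one per change.
#[derive(Default)]
pub struct ChangeLog {
    changes: Vec<Change>,
}

impl ChangeLog {
    pub fn current_version(&self) -> u64 {
        self.changes.last().map_or(0, |c| c.version)
    }

    pub fn record(&mut self, summary: &str) -> u64 {
        let version = self.current_version() + 1;
        self.changes.push(Change { version, summary: summary.to_string() });
        version
    }

    /// Step 2: everything newer than the client's cursor.
    pub fn changes_since(&self, last_seen: u64) -> &[Change] {
        let first_new = self.changes.partition_point(|c| c.version <= last_seen);
        &self.changes[first_new..]
    }
}

/// Client-side cursor plus the changes applied so far.
pub struct ClientState {
    pub last_seen: u64,
    pub applied: Vec<String>,
}

impl ClientState {
    /// Steps 1 and 3: send the cursor, apply the reply, advance the cursor.
    pub fn sync(&mut self, log: &ChangeLog) -> usize {
        let pending = log.changes_since(self.last_seen);
        for change in pending {
            self.applied.push(change.summary.clone());
            self.last_seen = change.version;
        }
        pending.len()
    }
}
```

Because the cursor only advances after a change is applied, repeating a sync is harmless: a second call with nothing new returns zero changes.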
@ -95,42 +134,47 @@ On sync:
Two users edit the same paragraph simultaneously.
**Initial state**:
```
Version 10: "The quick brown fox jumps over the lazy dog."
```
**User A's edit** (version 11):
```
"The quick brown fox jumps over the very lazy dog."
```
_Inserts "very " at position 35_
**User B's edit** (also from version 10):
```
"The quick red fox jumps over the lazy dog."
```
_Replaces "brown" with "red" at position 10_
### Server Processing
1. **Receive User A's operation**:
   - Base: version 10
   - Operation: Insert("very ", position=35)
   - Apply to database → version 11
2. **Receive User B's operation**:
   - Base: version 10
   - Operation: Replace("brown"→"red", position=10)
   - **Conflict detected**: Base is version 10, but current is version 11
3. **Transform User B's operation**:
   - Transform against User A's operation
   - Adjust positions/content as needed
   - Apply transformed operation → version 12
4. **Broadcast updates**:
   - Send User A's operation to User B
   - Send transformed User B's operation to User A
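
The processing steps above can be sketched as a server that keeps a history of applied edits and rebases each incoming edit over anything committed after its base version. `Edit`, `transform`, and `Server` are hypothetical names for illustration, and overlapping edits are simply rejected here rather than merged.

```rust
/// Replace `text[start..end]` with `text`; positions are character indices.
#[derive(Debug, Clone, PartialEq)]
pub struct Edit {
    pub start: usize,
    pub end: usize,
    pub text: String,
}

/// Rewrite `edit` so it can be applied after `applied`.
/// Returns `None` when the two edits overlap.
pub fn transform(edit: &Edit, applied: &Edit) -> Option<Edit> {
    let inserted = applied.text.chars().count();
    let removed = applied.end - applied.start;
    if applied.end <= edit.start {
        // `applied` lies entirely before `edit`: shift by its net length change.
        let shift = |p: usize| p - removed + inserted;
        Some(Edit { start: shift(edit.start), end: shift(edit.end), text: edit.text.clone() })
    } else if edit.end <= applied.start {
        // `applied` lies entirely after `edit`: positions are unaffected.
        Some(edit.clone())
    } else {
        None
    }
}

pub struct Server {
    pub text: String,
    pub version: u64,
    history: Vec<(u64, Edit)>,
}

impl Server {
    pub fn new(text: &str, version: u64) -> Self {
        Server { text: text.to_string(), version, history: Vec::new() }
    }

    /// Accept an edit made against `base_version` and return the new version.
    pub fn submit(&mut self, base_version: u64, edit: Edit) -> Option<u64> {
        let mut edit = edit;
        // A base older than the current version means concurrent edits exist.
        for (version, applied) in &self.history {
            if *version > base_version {
                edit = transform(&edit, applied)?;
            }
        }
        let mut chars: Vec<char> = self.text.chars().collect();
        if edit.start > edit.end || edit.end > chars.len() {
            return None;
        }
        drop(chars.splice(edit.start..edit.end, edit.text.chars()));
        self.text = chars.into_iter().collect();
        self.version += 1;
        self.history.push((self.version, edit));
        Some(self.version)
    }
}
```

In the scenario above, User B's replacement sits before User A's insertion, so its positions need no adjustment. If the two edits arrive in the opposite order, the insertion is shifted left by two characters, and both orders produce the same document.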
### Final Result
@ -147,11 +191,13 @@ Both edits are preserved in the final document.
**Scenario**: User A deletes a paragraph while User B edits it.
**Resolution**:
- OT algorithm prioritizes preservation of content
- Insert operation is transformed to account for deletion
- Typically results in inserted content appearing nearby
**Example**:
```
Base: "Line 1\nLine 2\nLine 3"
@ -160,6 +206,7 @@ User B: Edit Line 2 → "Line 1\nLine 2 modified\nLine 3"
Result: "Line 1\nLine 2 modified\nLine 3"
```
(The edit takes precedence over the deletion, preserving user content)
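
One way to express this content-preserving rule is a per-line three-way decision. This is a hypothetical sketch: the `Resolution` type and `resolve_line` function are invented for illustration, and whether the real merge operates line by line is not stated here.

```rust
#[derive(Debug, PartialEq)]
pub enum Resolution {
    Keep(String),
    Delete,
    /// Both sides changed the line differently; a finer merge is needed.
    NeedsFinerMerge,
}

/// Decide the fate of one base line. `None` means that side deleted it.
pub fn resolve_line(base: &str, a: Option<&str>, b: Option<&str>) -> Resolution {
    match (a, b) {
        (None, None) => Resolution::Delete,
        (None, Some(other)) | (Some(other), None) => {
            if other == base {
                // Only one side acted, and it deleted the line.
                Resolution::Delete
            } else {
                // Deleted on one side, modified on the other: keep the content.
                Resolution::Keep(other.to_string())
            }
        }
        (Some(x), Some(y)) if x == y => Resolution::Keep(x.to_string()),
        (Some(x), Some(y)) if x == base => Resolution::Keep(y.to_string()),
        (Some(x), Some(y)) if y == base => Resolution::Keep(x.to_string()),
        _ => Resolution::NeedsFinerMerge,
    }
}
```

With the example above, `resolve_line("Line 2", None, Some("Line 2 modified"))` keeps the modified line, while a deletion against an untouched line still deletes it.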
### 2. Overlapping Edits
@ -167,6 +214,7 @@ Result: "Line 1\nLine 2 modified\nLine 3"
**Scenario**: Two users edit overlapping regions.
**Resolution**:
- OT splits operations into non-overlapping segments
- Applies each segment independently
- Merges results
@ -176,6 +224,7 @@ Result: "Line 1\nLine 2 modified\nLine 3"
**Scenario**: Two users delete overlapping text.
**Resolution**:
- Deletes are merged
- Final result has the union of deleted ranges removed
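
Merging deletes amounts to taking the union of the deleted character ranges and removing it once. A minimal sketch, with hypothetical helper names and half-open `(start, end)` ranges expressed in base-document positions:

```rust
/// Merge overlapping or touching half-open ranges into a sorted, disjoint list.
pub fn union_ranges(mut ranges: Vec<(usize, usize)>) -> Vec<(usize, usize)> {
    ranges.sort();
    let mut merged: Vec<(usize, usize)> = Vec::new();
    for (start, end) in ranges {
        if let Some(last) = merged.last_mut() {
            if start <= last.1 {
                // Extends (or sits inside) the previous range.
                last.1 = last.1.max(end);
                continue;
            }
        }
        merged.push((start, end));
    }
    merged
}

/// Remove every character whose index falls inside one of `ranges`.
pub fn delete_ranges(text: &str, ranges: &[(usize, usize)]) -> String {
    text.chars()
        .enumerate()
        .filter(|(i, _)| !ranges.iter().any(|&(s, e)| s <= *i && *i < e))
        .map(|(_, c)| c)
        .collect()
}

/// Apply two concurrent deletes made against the same base text.
pub fn merge_deletes(text: &str, a: (usize, usize), b: (usize, usize)) -> String {
    delete_ranges(text, &union_ranges(vec![a, b]))
}
```

Since both ranges refer to the same base text, the result is the same whichever user's delete arrives first.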
@ -184,6 +233,7 @@ Result: "Line 1\nLine 2 modified\nLine 3"
**Scenario**: Client loses connection, makes edits offline, reconnects.
**Resolution**:
1. Client queues edits locally
2. On reconnect, sends all queued operations
3. Server applies OT against all operations that happened during partition
@ -206,6 +256,7 @@ Result: "Line 1\nLine 2 modified\nLine 3"
### Optimization
VaultLink optimizes for:
- Small, frequent edits (typical typing patterns)
- Text documents (not binary files)
- Real-time processing (no batching delay)
@ -215,6 +266,7 @@ VaultLink optimizes for:
### Binary Files
OT works best for text files. Binary files:
- Cannot be meaningfully merged
- Use last-write-wins strategy
- May cause data loss on concurrent edits
@ -224,6 +276,7 @@ OT works best for text files. Binary files:
### Large Documents
Very large documents (> 1 MB) may have:
- Higher transformation costs
- Slower sync times
- Increased memory usage
@ -233,6 +286,7 @@ Very large documents (> 1MB) may have:
### Complex Formatting
Markdown with complex structures may occasionally produce unexpected results:
- Nested lists
- Tables
- Code blocks
@ -244,6 +298,7 @@ Markdown with complex structures may occasionally produce unexpected results:
### Strong Eventual Consistency
VaultLink provides **strong eventual consistency**:
- All clients eventually converge to the same state
- Operations applied in causal order
- No data loss under normal operation
@ -264,32 +319,36 @@ VaultLink provides **strong eventual consistency**:
### Git-style Merging
| Aspect | Git Merge | VaultLink OT |
| -------------------------- | ------------ | ----------------------- |
| Real-time | No | Yes |
| Manual conflict resolution | Yes | No |
| Branching | Yes | No |
| Automatic merge | Limited | Always |
| Use case | Code changes | Collaborative documents |
### CRDTs (Conflict-free Replicated Data Types)
| Aspect | CRDTs | VaultLink OT |
|--------|-------|--------------|
| Server required | No | Yes |
| Memory overhead | Higher | Lower |
| Complexity | Higher | Lower |
| Deletion handling | Complex (tombstones) | Simple |
| Best for | Distributed systems | Centralized sync |
| Aspect | CRDTs | VaultLink (reconcile-text) |
| ----------------------------- | ------------------------------------ | ------------------------------------------------- |
| **Operation tracking** | Required (every keystroke) | Not required (end states only) |
| **Editor freedom** | Limited (must use CRDT-aware editor) | Unlimited (any text editor works) |
| **Offline editing** | Requires operation log | Works with file comparison |
| **Server required** | No | Yes |
| **Memory overhead** | Higher (tombstones, metadata) | Lower (versions only) |
| **Infrastructure complexity** | Higher | Lower |
| **Best for** | Controlled editing environments | Independent file editing (Obsidian, Vim, VS Code) |
**Key insight**: CRDTs are superior when you can capture every operation. reconcile-text is superior when users edit files independently with arbitrary tools—exactly VaultLink's scenario.
### Last Write Wins
| Aspect | LWW | VaultLink OT |
| --------------- | ---- | ------------ |
| Data loss | Yes | No |
| Simplicity | High | Medium |
| User experience | Poor | Excellent |
| Performance | Best | Good |
## Algorithm Details
@ -298,20 +357,20 @@ VaultLink provides **strong eventual consistency**:
When transforming operation `A` against operation `B`:
1. **Insert vs Insert**:
   - If positions equal: Order by client ID
   - If different positions: Adjust positions
2. **Insert vs Delete**:
   - If insert in deleted range: Shift insert position
   - If insert after delete: Adjust position by deleted length
3. **Delete vs Delete**:
   - If ranges overlap: Merge delete ranges
   - If ranges disjoint: Adjust positions
4. **Retain vs Any**:
   - Retain operations don't conflict
   - Simply adjust positions
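
Rules 1 to 3 can be sketched as position adjustments. The types and function names below are hypothetical; each function rewrites operation `a` so that it can be applied after `b`. Client IDs are assumed to be distinct, and the Delete-vs-Insert and Retain cases are left out for brevity.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Insert {
    pub pos: usize,
    pub text: String,
    pub client: u32,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Delete {
    pub pos: usize,
    pub len: usize,
}

/// Rule 1: at equal positions the lower client ID goes first.
pub fn insert_vs_insert(a: &Insert, b: &Insert) -> Insert {
    let a_first = a.pos < b.pos || (a.pos == b.pos && a.client < b.client);
    let pos = if a_first { a.pos } else { a.pos + b.text.chars().count() };
    Insert { pos, ..a.clone() }
}

/// Rule 2: an insert inside the deleted range moves to the range start;
/// an insert after it shifts left by the deleted length.
pub fn insert_vs_delete(a: &Insert, b: &Delete) -> Insert {
    let pos = if a.pos <= b.pos {
        a.pos
    } else if a.pos >= b.pos + b.len {
        a.pos - b.len
    } else {
        b.pos
    };
    Insert { pos, ..a.clone() }
}

/// Rule 3: drop the part of `a` that `b` already deleted, so that the
/// combined effect is the union of both ranges.
pub fn delete_vs_delete(a: &Delete, b: &Delete) -> Delete {
    let (a_end, b_end) = (a.pos + a.len, b.pos + b.len);
    if a_end <= b.pos {
        return a.clone();
    }
    if a.pos >= b_end {
        return Delete { pos: a.pos - b.len, len: a.len };
    }
    let overlap = a_end.min(b_end) - a.pos.max(b.pos);
    Delete { pos: a.pos.min(b.pos), len: a.len - overlap }
}
```

The client-ID tie-break is what makes two inserts at the same position converge: both sides agree on which text comes first regardless of arrival order.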
### Transformation Example