This commit is contained in:
Andras Schmelczer 2025-11-22 11:19:08 +00:00
parent 56c1f4d58b
commit 50a95b114d
19 changed files with 4663 additions and 1 deletions

View file

@ -0,0 +1,532 @@
# Data Flow
This document provides a detailed look at how data flows through the VaultLink system, from client to server and back.
## Connection Lifecycle
### 1. Initial Connection
```mermaid
sequenceDiagram
participant C as Client
participant S as Server
participant DB as Database
C->>S: WebSocket connect
S->>S: Accept connection
C->>S: Auth message (token + vault)
S->>S: Validate token
S->>S: Check vault access
S-->>C: Auth success
Note over C,S: Connection established
```
**Steps**:
1. Client initiates WebSocket connection to server
2. Server accepts connection
3. Client sends authentication message with token and vault name
4. Server validates token against `config.yml`
5. Server checks if user has access to requested vault
6. Server responds with success or error
7. Connection is ready for syncing
### 2. Initial Sync
After authentication, the client performs initial synchronization:
```mermaid
sequenceDiagram
participant C as Client
participant S as Server
participant DB as SQLite
C->>C: Scan local filesystem
C->>S: Request file list
S->>DB: Query all files
DB-->>S: File metadata
S-->>C: File list with versions
loop For each local file
C->>C: Check if file on server
alt File not on server
C->>S: Upload file
S->>DB: Store file + metadata
else File on server (different version)
C->>C: Compare versions
C->>S: Upload newer or merge
end
end
loop For each server file
C->>C: Check if file local
alt File not local
C->>S: Download file
S->>DB: Retrieve file
DB-->>S: File content
S-->>C: File content
C->>C: Write to disk
end
end
S-->>C: Sync complete message
```
**Process**:
1. Client scans local filesystem
2. Client requests file list from server
3. Server queries database and returns metadata
4. Client uploads missing or changed local files
5. Client downloads missing files from server
6. Server sends sync complete notification
### 3. Real-Time Synchronization
After initial sync, changes are pushed in real-time:
```mermaid
sequenceDiagram
participant FS as Filesystem
participant C1 as Client 1
participant S as Server
participant DB as Database
participant C2 as Client 2
FS->>C1: File changed (fs.watch)
C1->>C1: Read file content
C1->>S: Upload file
S->>DB: Store new version
S->>S: Apply OT if needed
S-->>C1: Upload ACK
S->>C2: File update notification
C2->>S: Download file
S->>DB: Retrieve file
DB-->>S: File content
S-->>C2: File content
C2->>FS: Write to disk
```
**Flow**:
1. Filesystem watcher detects local change
2. Client reads file content
3. Client uploads file via WebSocket
4. Server stores in database
5. Server applies operational transformation if concurrent edits
6. Server acknowledges upload to sender
7. Server broadcasts update to other clients
8. Other clients download and apply changes
## File Operations
### Upload
```
┌─────────┐
│ Client │
└────┬────┘
│ 1. Detect file change
├─► 2. Read file content
├─► 3. Create upload message
│ {
│ type: "upload_file",
│ path: "notes/daily.md",
│ content: "...",
│ version: 42,
│ timestamp: "2024-01-01T12:00:00Z"
│ }
┌─────────┐
│ Server │
└────┬────┘
│ 4. Validate message
├─► 5. Check permissions
├─► 6. Apply OT (if conflicts)
├─► 7. Store in database
├─► 8. Update version
├─► 9. Broadcast to clients
└─► 10. Send ACK to uploader
```
### Download
```
┌─────────┐
│ Server │
└────┬────┘
│ 1. File updated by another client
├─► 2. Broadcast notification
│ {
│ type: "file_updated",
│ path: "notes/daily.md",
│ version: 43
│ }
┌─────────┐
│ Client │
└────┬────┘
│ 3. Receive notification
├─► 4. Request file download
│ {
│ type: "download_file",
│ path: "notes/daily.md",
│ version: 43
│ }
┌─────────┐
│ Server │
└────┬────┘
│ 5. Retrieve from database
└─► 6. Send file content
{
type: "file_content",
path: "notes/daily.md",
content: "...",
version: 43
}
┌─────────┐
│ Client │
└────┬────┘
│ 7. Write to filesystem
└─► 8. Update local metadata
```
### Delete
```
┌─────────┐
│ Client │
└────┬────┘
│ 1. File deleted locally
├─► 2. Send delete message
│ {
│ type: "delete_file",
│ path: "notes/old.md"
│ }
┌─────────┐
│ Server │
└────┬────┘
│ 3. Mark as deleted in DB
│ (soft delete for history)
├─► 4. Broadcast deletion
└─► 5. ACK to sender
┌─────────┐
│ Other │
│ Clients │
└────┬────┘
│ 6. Delete local file
└─► 7. Update metadata
```
## Conflict Resolution Flow
### Concurrent Edits Scenario
```
Time →
Client A Server Client B
│ │ │
│ Edit file v10 │ │
│ "Add line A" │ │ Edit file v10
│ │ │ "Add line B"
│ │ │
├─── Upload @ t1 ─────────►│ │
│ │◄────── Upload @ t2 ────────┤
│ │ │
│ │ 1. Receive both edits │
│ │ (based on v10) │
│ │ │
│ │ 2. Apply first edit │
│ │ → v11 (line A added) │
│ │ │
│ │ 3. Transform second edit │
│ │ against first │
│ │ │
│ │ 4. Apply transformed edit │
│ │ → v12 (both lines) │
│ │ │
│◄──── v12 content ────────┤ │
│ ├───── v12 content ─────────►│
│ │ │
│ Apply v12 │ │ Apply v12
│ (has both lines) │ │ (has both lines)
│ │ │
```
### Conflict Resolution Steps
1. **Detection**: Server receives two edits based on the same version
2. **Ordering**: Determine which edit to apply first (by timestamp or client ID)
3. **First edit**: Apply directly to database
4. **Transformation**: Transform second edit against first using OT
5. **Second edit**: Apply transformed edit to database
6. **Broadcast**: Send merged result to all clients
7. **Application**: Clients apply merged version locally
## Database Schema
### Core Tables
```sql
-- Document metadata
CREATE TABLE documents (
id INTEGER PRIMARY KEY,
path TEXT NOT NULL,
version INTEGER NOT NULL,
content_hash TEXT,
size INTEGER,
created_at TIMESTAMP,
updated_at TIMESTAMP,
deleted BOOLEAN DEFAULT FALSE
);
-- Version history
CREATE TABLE versions (
id INTEGER PRIMARY KEY,
document_id INTEGER,
version INTEGER,
content BLOB,
created_at TIMESTAMP,
FOREIGN KEY (document_id) REFERENCES documents(id)
);
-- Client sync cursors
CREATE TABLE cursors (
client_id TEXT PRIMARY KEY,
last_version INTEGER,
last_updated TIMESTAMP
);
```
### Queries
**Get files since version**:
```sql
SELECT * FROM documents
WHERE version > ? AND deleted = FALSE
ORDER BY version ASC;
```
**Store new version**:
```sql
INSERT INTO versions (document_id, version, content, created_at)
VALUES (?, ?, ?, ?);
UPDATE documents
SET version = ?, updated_at = ?
WHERE id = ?;
```
**Update cursor**:
```sql
INSERT OR REPLACE INTO cursors (client_id, last_version, last_updated)
VALUES (?, ?, ?);
```
## Message Protocol
### Client → Server Messages
**Upload File**:
```json
{
"type": "upload_file",
"path": "notes/example.md",
"content": "File content here...",
"base_version": 10,
"timestamp": "2024-01-01T12:00:00Z"
}
```
**Download File**:
```json
{
"type": "download_file",
"path": "notes/example.md"
}
```
**Delete File**:
```json
{
"type": "delete_file",
"path": "notes/old.md"
}
```
**List Files**:
```json
{
"type": "list_files",
"since_version": 0
}
```
### Server → Client Messages
**File Updated**:
```json
{
"type": "file_updated",
"path": "notes/example.md",
"version": 11,
"size": 1024,
"hash": "abc123..."
}
```
**File Content**:
```json
{
"type": "file_content",
"path": "notes/example.md",
"content": "Updated content...",
"version": 11
}
```
**File Deleted**:
```json
{
"type": "file_deleted",
"path": "notes/old.md",
"version": 12
}
```
**Sync Complete**:
```json
{
"type": "sync_complete",
"total_files": 150,
"current_version": 200
}
```
**Error**:
```json
{
"type": "error",
"message": "File too large",
"code": "FILE_TOO_LARGE"
}
```
## Error Handling
### Client-Side Errors
**Network failure**:
1. Detect WebSocket disconnect
2. Queue pending operations
3. Retry connection with exponential backoff
4. Replay queued operations on reconnect
**File read error**:
1. Log error
2. Skip file
3. Continue with other files
4. Report to user
**Write conflict**:
1. Receive updated version from server
2. Apply OT merge locally
3. Overwrite local file
4. Continue syncing
### Server-Side Errors
**Database error**:
1. Log error
2. Return error to client
3. Client retries operation
**Invalid operation**:
1. Validate message format
2. Return specific error code
3. Client handles error appropriately
**Authentication failure**:
1. Reject connection
2. Send auth error
3. Client prompts for new credentials
## Performance Optimizations
### Batching
- Small, rapid changes are batched together
- Reduces message overhead
- Applied as single atomic update
### Compression
- Large files compressed before transmission
- Reduces bandwidth usage
- Transparent to application layer
### Incremental Sync
- Only changed portions of files sent
- Uses content-based diffing
- Significantly reduces data transfer
### Caching
- Server caches recent file versions
- Reduces database queries
- Improves response time
## Monitoring Data Flow
### Server Logs
```
2024-01-01 12:00:00 INFO WebSocket connection from 192.168.1.100
2024-01-01 12:00:01 INFO User 'alice' authenticated for vault 'personal'
2024-01-01 12:00:05 INFO Upload: notes/daily.md (v10 -> v11)
2024-01-01 12:00:06 INFO Broadcast to 3 clients
2024-01-01 12:00:10 INFO Conflict resolved: notes/shared.md (v12)
```
### Client Logs
```
2024-01-01 12:00:00 INFO Connecting to ws://sync.example.com
2024-01-01 12:00:01 INFO Connected, authenticating...
2024-01-01 12:00:01 INFO Authentication successful
2024-01-01 12:00:02 INFO Starting initial sync
2024-01-01 12:00:10 INFO Sync complete: 150 files, 200 MB
2024-01-01 12:00:15 INFO Uploaded: notes/daily.md
2024-01-01 12:00:20 INFO Downloaded: notes/shared.md (merged)
```
## Next Steps
- [Understand the sync algorithm →](/architecture/sync-algorithm)
- [Configure the server →](/config/server)
- [Deploy VaultLink →](/guide/getting-started)

344
docs/architecture/index.md Normal file
View file

@ -0,0 +1,344 @@
# Architecture Overview
VaultLink is built as a distributed system with a central sync server and multiple clients. This document explains the high-level architecture and design decisions.
## System Components
```
┌─────────────────────────────────────────────────────────────┐
│ Clients │
├─────────────────────┬───────────────────┬───────────────────┤
│ Obsidian Plugin │ Obsidian Plugin │ CLI Client │
│ (User A - Device1) │ (User A - Device2│ (Server/Backup) │
└──────────┬──────────┴─────────┬─────────┴──────────┬────────┘
│ │ │
│ WebSocket │ WebSocket │ WebSocket
│ │ │
└────────────────────┼────────────────────┘
┌───────────▼───────────┐
│ Sync Server │
│ (Rust + Axum) │
│ │
│ ┌─────────────────┐ │
│ │ WebSocket Hub │ │
│ └────────┬────────┘ │
│ │ │
│ ┌────────▼────────┐ │
│ │ Sync Engine │ │
│ │ (OT Algorithm) │ │
│ └────────┬────────┘ │
│ │ │
│ ┌────────▼────────┐ │
│ │ SQLite Database │ │
│ │ (Per Vault) │ │
│ └─────────────────┘ │
└───────────────────────┘
```
## Core Components
### Sync Server
The central authority for synchronization, written in Rust using Axum framework.
**Responsibilities**:
- Accept WebSocket connections from clients
- Authenticate users via token-based auth
- Store document versions in SQLite
- Coordinate real-time updates between clients
- Apply operational transformation for conflict resolution
- Manage vault access control
**Technology**:
- **Language**: Rust 1.89+
- **Framework**: Axum (async web framework)
- **Database**: SQLite with SQLx
- **Protocol**: WebSockets for real-time communication
- **Sync Algorithm**: reconcile-text (operational transformation)
### Sync Client Library
TypeScript library providing core synchronization logic, used by both the Obsidian plugin and CLI client.
**Responsibilities**:
- Manage WebSocket connection to server
- Watch local filesystem for changes
- Upload and download files
- Apply remote changes locally
- Handle conflict resolution
- Maintain sync metadata
**Technology**:
- **Language**: TypeScript
- **Build**: Webpack
- **Protocol**: WebSocket client
- **File System**: Node.js `fs` API / Obsidian API
### Obsidian Plugin
Integration layer between sync client and Obsidian.
**Responsibilities**:
- Provide UI for configuration
- Bridge sync client with Obsidian's file system API
- Handle Obsidian lifecycle events
- Display sync status to users
**Technology**:
- **Platform**: Obsidian Plugin API
- **Core**: sync-client library
- **UI**: Obsidian settings UI
### CLI Client
Standalone executable for syncing vaults without Obsidian.
**Responsibilities**:
- Command-line interface
- File system access via Node.js
- Daemon mode for continuous sync
- Health check endpoint for monitoring
**Technology**:
- **Language**: TypeScript
- **Runtime**: Node.js
- **CLI**: Commander.js
- **Core**: sync-client library
## Data Flow
### Initial Connection
1. Client connects via WebSocket to server
2. Server authenticates using provided token
3. Server verifies user has access to requested vault
4. Connection established, sync begins
### File Upload Flow
```
Client Server
│ │
│ 1. File changed locally │
│ │
│ 2. Read file content │
│ │
│ 3. WebSocket: Upload file │
├──────────────────────────────►│
│ │ 4. Store in SQLite
│ │
│ │ 5. Broadcast to other clients
│ ├───────────────────────►
│ 6. Ack upload │
│◄──────────────────────────────┤
```
### File Download Flow
```
Client A Server Client B
│ │ │
│ │ 1. File uploaded │
│ │◄────────────────────────┤
│ │ │
│ │ 2. Store in DB │
│ │ │
│ 3. Push notification │ │
│◄────────────────────────┤ │
│ │ │
│ 4. Download file │ │
├────────────────────────►│ │
│ │ │
│ 5. Write locally │ │
│ │ │
```
### Conflict Resolution
When two clients edit the same file simultaneously:
```
Client A Server Client B
│ │ │
│ 1. Edit file │ │ 1. Edit same file
│ │ │
│ 2. Upload changes │ │ 2. Upload changes
├────────────────────────►│◄────────────────────────┤
│ │ │
│ │ 3. Apply OT algorithm │
│ │ - Merge both edits │
│ │ - Preserve all changes│
│ │ │
│ 4. Receive merged ver. │ 5. Receive merged ver. │
│◄────────────────────────┤────────────────────────►│
│ │ │
│ 6. Apply locally │ │ 6. Apply locally
```
## Storage Architecture
### Server Storage
Each vault has its own SQLite database:
```
databases/
├── vault-1.db
├── vault-2.db
└── shared-team.db
```
**Database Schema** (simplified):
- **documents**: File metadata (path, size, modified time)
- **versions**: Document content with version history
- **cursors**: Client sync state
### Client Storage
Clients maintain sync metadata:
```
.vaultlink/
├── metadata.json # Sync state
└── cache/ # Optional local cache
```
The `.vaultlink` directory tracks which files have been synced and their versions to enable efficient synchronization.
## Communication Protocol
### WebSocket Messages
Client-server communication uses JSON messages over WebSocket.
**Message Types**:
- `upload_file`: Client → Server (file upload)
- `download_file`: Client → Server (request file)
- `file_updated`: Server → Client (file changed notification)
- `file_deleted`: Server → Client (file deleted notification)
- `sync_complete`: Server → Client (initial sync finished)
### Authentication
Token-based authentication on connection:
```typescript
// Client sends token on connect
{
type: "auth",
token: "user-auth-token",
vault: "vault-name"
}
// Server responds
{
type: "auth_success"
}
// or
{
type: "auth_error",
message: "Invalid token"
}
```
## Scalability Considerations
### Current Architecture
- **SQLite per vault**: Simple, performant, limited to single server
- **WebSocket connections**: Stateful, requires sticky sessions for load balancing
- **Operational transformation**: Centralized on server
### Scaling Approaches
**Vertical Scaling**:
- Increase server resources (CPU, RAM, storage)
- Optimize database queries and indexing
- Tune connection limits
**Horizontal Scaling** (future):
- Separate vault servers (vault sharding)
- Load balancer with sticky sessions
- Shared storage layer for SQLite databases
- Consider alternative databases (PostgreSQL) for multi-server setups
### Performance Characteristics
- **Small vaults** (< 1000 files): Excellent performance
- **Medium vaults** (1000-10000 files): Good performance with tuning
- **Large vaults** (> 10000 files): May require optimization
- **Concurrent users**: Tested with dozens of simultaneous clients per vault
## Security Model
### Authentication
- Token-based authentication
- Tokens configured in server `config.yml`
- No password hashing (tokens are secrets)
### Authorization
- Per-user vault access control
- Allow-list or deny-list patterns
- Global access or vault-specific access
### Network Security
- WebSocket over TLS (WSS) for encrypted transport
- No built-in SSL (use reverse proxy)
- CORS configured for web clients
### Data Security
- No encryption at rest (use encrypted filesystems if needed)
- No end-to-end encryption (server sees all content)
- Self-hosted model: you control the data
## Technology Choices
### Why Rust for Server?
- **Performance**: Low latency for real-time sync
- **Memory safety**: No crashes from memory bugs
- **Concurrency**: Excellent async support with Tokio
- **Type safety**: Catch bugs at compile time
- **SQLx**: Compile-time SQL verification
### Why SQLite?
- **Simplicity**: No separate database server required
- **Performance**: Fast for read-heavy workloads
- **Reliability**: Battle-tested, ACID compliant
- **Portability**: Single file per vault
- **Backups**: Simple file copy
### Why WebSocket?
- **Real-time**: Bidirectional push for instant updates
- **Efficiency**: Persistent connection, no polling overhead
- **Simplicity**: Built-in browser/Node.js support
- **Standards**: Well-supported protocol
### Why Operational Transformation?
- **Automatic conflict resolution**: No manual merging required
- **Preserves intent**: All edits are kept
- **Real-time collaboration**: Users see changes as they happen
- **Proven algorithm**: Used by Google Docs, etc.
## Design Principles
1. **Self-hosted first**: Users control their data and infrastructure
2. **Simplicity**: Easy to deploy and operate
3. **Real-time**: Changes appear immediately
4. **Reliability**: Handle network failures gracefully
5. **Performance**: Fast sync for typical vault sizes
6. **Privacy**: No third-party services or telemetry
## Next Steps
- [Learn about the sync algorithm →](/architecture/sync-algorithm)
- [Understand data flow in detail →](/architecture/data-flow)
- [Deploy the server →](/guide/server-setup)

View file

@ -0,0 +1,361 @@
# Sync Algorithm
VaultLink uses operational transformation (OT) to handle concurrent edits and maintain consistency across clients. This document explains how the algorithm works.
## Operational Transformation
Operational transformation is a technique for managing concurrent edits to the same document. It transforms operations (edits) so they can be applied in different orders while preserving user intent.
### Why OT?
Traditional conflict resolution approaches:
- **Last write wins**: Loses data, frustrating for users
- **Manual merging**: Interrupts workflow, requires user intervention
- **Version branching**: Complex, not suitable for real-time sync
Operational transformation:
- **Automatic**: No user intervention required
- **Preserves all edits**: No data loss
- **Real-time**: Changes appear immediately
- **Intuitive**: Behavior matches user expectations
## The reconcile-text Library
VaultLink uses the [`reconcile-text`](https://crates.io/crates/reconcile-text) Rust library for operational transformation on text documents.
### How It Works
Given a base document and two sets of changes, OT produces a merged result that includes both changes.
**Example**:
```
Base document: "Hello world"
User A: "Hello beautiful world" (inserts "beautiful ")
User B: "Hello world!" (inserts "!")
OT result: "Hello beautiful world!" (both changes applied)
```
### Operation Types
The algorithm handles these operations:
- **Insert**: Add text at position
- **Delete**: Remove text from position
- **Retain**: Keep existing text unchanged
### Transformation Process
1. **Client A** makes edit and sends to server
2. **Client B** makes concurrent edit and sends to server
3. **Server** receives both edits
4. **Server** transforms operations to account for concurrent changes
5. **Server** applies merged result to database
6. **Server** sends transformed operations to both clients
7. **Clients** apply transformed operations locally
## Sync State Management
VaultLink maintains sync state to track which changes have been applied.
### Version Vectors
Each document has a version tracked by:
- **Server version**: Incremented on each change
- **Client cursors**: Track which version each client has seen
This enables:
- Efficient syncing (only send changes since last sync)
- Conflict detection (concurrent edits to same version)
- Ordering of operations
### Cursor Management
Clients maintain a cursor position:
```rust
struct Cursor {
vault_id: String,
client_id: String,
last_version: u64,
last_updated: DateTime,
}
```
On sync:
1. Client sends cursor (last seen version)
2. Server returns all changes since that version
3. Client applies changes and updates cursor
## Conflict Resolution Flow
### Scenario: Concurrent Edits
Two users edit the same paragraph simultaneously.
**Initial state**:
```
Version 10: "The quick brown fox jumps over the lazy dog."
```
**User A's edit** (version 11):
```
"The quick brown fox jumps over the very lazy dog."
```
*Inserts "very " at position 40*
**User B's edit** (also from version 10):
```
"The quick red fox jumps over the lazy dog."
```
*Replaces "brown" with "red" at position 10*
### Server Processing
1. **Receive User A's operation**:
- Base: version 10
- Operation: Insert("very ", position=40)
- Apply to database → version 11
2. **Receive User B's operation**:
- Base: version 10
- Operation: Replace("brown"→"red", position=10)
- **Conflict detected**: Base is version 10, but current is version 11
3. **Transform User B's operation**:
- Transform against User A's operation
- Adjust positions/content as needed
- Apply transformed operation → version 12
4. **Broadcast updates**:
- Send User A's operation to User B
- Send transformed User B's operation to User A
### Final Result
```
Version 12: "The quick red fox jumps over the very lazy dog."
```
Both edits are preserved in the final document.
## Edge Cases
### 1. Delete vs Insert Conflict
**Scenario**: User A deletes a paragraph while User B edits it.
**Resolution**:
- OT algorithm prioritizes preservation of content
- Insert operation is transformed to account for deletion
- Typically results in inserted content appearing nearby
**Example**:
```
Base: "Line 1\nLine 2\nLine 3"
User A: Delete Line 2 → "Line 1\nLine 3"
User B: Edit Line 2 → "Line 1\nLine 2 modified\nLine 3"
Result: "Line 1\nLine 2 modified\nLine 3"
```
(Insert takes precedence, preserving user content)
### 2. Overlapping Edits
**Scenario**: Two users edit overlapping regions.
**Resolution**:
- OT splits operations into non-overlapping segments
- Applies each segment independently
- Merges results
### 3. Delete vs Delete
**Scenario**: Two users delete overlapping text.
**Resolution**:
- Deletes are merged
- Final result has the union of deleted ranges removed
### 4. Network Partitions
**Scenario**: Client loses connection, makes edits offline, reconnects.
**Resolution**:
1. Client queues edits locally
2. On reconnect, sends all queued operations
3. Server applies OT against all operations that happened during partition
4. Client receives transformed operations and applies
## Performance Characteristics
### Time Complexity
- **Single operation**: O(1) for most operations
- **Transformation**: O(n) where n is operation size
- **Conflict resolution**: O(m × n) where m is number of concurrent operations
### Space Complexity
- **Version history**: Grows with number of changes
- **Cursors**: O(clients × vaults)
- **Active operations**: Minimal (processed in real-time)
### Optimization
VaultLink optimizes for:
- Small, frequent edits (typical typing patterns)
- Text documents (not binary files)
- Real-time processing (no batching delay)
## Limitations
### Binary Files
OT works best for text files. Binary files:
- Cannot be meaningfully merged
- Use last-write-wins strategy
- May cause data loss on concurrent edits
**Workaround**: Avoid concurrent edits to binary files, or use versioning.
### Large Documents
Very large documents (> 1MB) may have:
- Higher transformation costs
- Slower sync times
- Increased memory usage
**Workaround**: Split large documents or increase timeout settings.
### Complex Formatting
Markdown with complex structures may occasionally produce unexpected results:
- Nested lists
- Tables
- Code blocks
**Workaround**: Manual cleanup if needed, or minimize concurrent edits to complex structures.
## Consistency Guarantees
### Strong Consistency
VaultLink provides **strong eventual consistency**:
- All clients eventually converge to the same state
- Operations applied in causal order
- No data loss under normal operation
### Ordering Guarantees
- Operations from the same client are applied in order
- Concurrent operations may be applied in any order
- Final result is independent of operation order (commutative)
### Durability
- Operations are written to SQLite before acknowledgment
- SQLite ACID guarantees protect against data loss
- Clients retry failed uploads
## Comparison with Other Approaches
### Git-style Merging
| Aspect | Git Merge | VaultLink OT |
|--------|-----------|--------------|
| Real-time | No | Yes |
| Manual conflict resolution | Yes | No |
| Branching | Yes | No |
| Automatic merge | Limited | Always |
| Use case | Code changes | Collaborative documents |
### CRDTs (Conflict-free Replicated Data Types)
| Aspect | CRDTs | VaultLink OT |
|--------|-------|--------------|
| Server required | No | Yes |
| Memory overhead | Higher | Lower |
| Complexity | Higher | Lower |
| Deletion handling | Complex (tombstones) | Simple |
| Best for | Distributed systems | Centralized sync |
### Last Write Wins
| Aspect | LWW | VaultLink OT |
|--------|-----|--------------|
| Data loss | Yes | No |
| Simplicity | High | Medium |
| User experience | Poor | Excellent |
| Performance | Best | Good |
## Algorithm Details
### Transformation Rules
When transforming operation `A` against operation `B`:
1. **Insert vs Insert**:
- If positions equal: Order by client ID
- If different positions: Adjust positions
2. **Insert vs Delete**:
- If insert in deleted range: Shift insert position
- If insert after delete: Adjust position by deleted length
3. **Delete vs Delete**:
- If ranges overlap: Merge delete ranges
- If ranges disjoint: Adjust positions
4. **Retain vs Any**:
- Retain operations don't conflict
- Simply adjust positions
### Transformation Example
```rust
// Pseudo-code for transformation
fn transform(op_a: Operation, op_b: Operation) -> (Operation, Operation) {
match (op_a, op_b) {
(Insert(pos_a, text_a), Insert(pos_b, text_b)) => {
if pos_a < pos_b {
(op_a, Insert(pos_b + text_a.len(), text_b))
} else if pos_a > pos_b {
(Insert(pos_a + text_b.len(), text_a), op_b)
} else {
// Same position, use client ID to break tie
if client_id_a < client_id_b {
(op_a, Insert(pos_b + text_a.len(), text_b))
} else {
(Insert(pos_a + text_b.len(), text_a), op_b)
}
}
}
// ... other cases
}
}
```
## Best Practices
### For Smooth Collaboration
1. **Small edits**: Make small, focused changes for easier merging
2. **Coordinate major changes**: Discuss large refactors with team
3. **Monitor sync status**: Ensure changes are uploaded before signing off
4. **Test conflict resolution**: Verify behavior matches expectations
### For Developers
1. **Text files preferred**: OT works best on text
2. **Limit file sizes**: Keep documents reasonably sized
3. **Binary files**: Use versioning or avoid concurrent edits
4. **Testing**: Test concurrent edit scenarios thoroughly
## Further Reading
- [reconcile-text library](https://crates.io/crates/reconcile-text)
- [Operational Transformation FAQ](https://en.wikipedia.org/wiki/Operational_transformation)
- [Data flow architecture →](/architecture/data-flow)