Add docs
parent 56c1f4d58b
commit 50a95b114d
19 changed files with 4663 additions and 1 deletions
532
docs/architecture/data-flow.md
Normal file
|
|
@ -0,0 +1,532 @@
|
|||
# Data Flow
|
||||
|
||||
This document provides a detailed look at how data flows through the VaultLink system, from client to server and back.
|
||||
|
||||
## Connection Lifecycle
|
||||
|
||||
### 1. Initial Connection
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant C as Client
|
||||
participant S as Server
|
||||
participant DB as Database
|
||||
|
||||
C->>S: WebSocket connect
|
||||
S->>S: Accept connection
|
||||
C->>S: Auth message (token + vault)
|
||||
S->>S: Validate token
|
||||
S->>S: Check vault access
|
||||
S-->>C: Auth success
|
||||
Note over C,S: Connection established
|
||||
```
|
||||
|
||||
**Steps**:
|
||||
1. Client initiates WebSocket connection to server
|
||||
2. Server accepts connection
|
||||
3. Client sends authentication message with token and vault name
|
||||
4. Server validates token against `config.yml`
|
||||
5. Server checks if user has access to requested vault
|
||||
6. Server responds with success or error
|
||||
7. Connection is ready for syncing
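
Steps 3 to 6 can be sketched as a pure function. This is a minimal sketch, not the server's implementation: the `ServerConfig` shape, the `"*"` wildcard, and the `authenticate` helper are assumptions made for illustration, since the layout of `config.yml` is not shown here. The message shapes follow the `auth`, `auth_success`, and `auth_error` examples in the architecture overview.

```typescript
// Hypothetical config shape: each token maps to a user and the vaults that user may open.
interface UserEntry {
  name: string;
  vaults: string[] | "*"; // "*" grants access to every vault (assumed convention)
}

interface ServerConfig {
  tokens: Map<string, UserEntry>;
}

interface AuthMessage {
  type: "auth";
  token: string;
  vault: string;
}

type AuthResponse =
  | { type: "auth_success" }
  | { type: "auth_error"; message: string };

// Steps 4 to 6: validate the token, check vault access, build the response.
function authenticate(config: ServerConfig, msg: AuthMessage): AuthResponse {
  const user = config.tokens.get(msg.token);
  if (user === undefined) {
    return { type: "auth_error", message: "Invalid token" };
  }
  const allowed = user.vaults === "*" || user.vaults.indexOf(msg.vault) !== -1;
  if (!allowed) {
    return { type: "auth_error", message: "No access to requested vault" };
  }
  return { type: "auth_success" };
}
```

Keeping the check free of I/O makes it easy to unit-test separately from the WebSocket layer.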
|
||||
|
||||
### 2. Initial Sync
|
||||
|
||||
After authentication, the client performs an initial synchronization:
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant C as Client
|
||||
participant S as Server
|
||||
participant DB as SQLite
|
||||
|
||||
C->>C: Scan local filesystem
|
||||
C->>S: Request file list
|
||||
S->>DB: Query all files
|
||||
DB-->>S: File metadata
|
||||
S-->>C: File list with versions
|
||||
|
||||
loop For each local file
|
||||
C->>C: Check if file on server
|
||||
alt File not on server
|
||||
C->>S: Upload file
|
||||
S->>DB: Store file + metadata
|
||||
else File on server (different version)
|
||||
C->>C: Compare versions
|
||||
C->>S: Upload newer or merge
|
||||
end
|
||||
end
|
||||
|
||||
loop For each server file
|
||||
C->>C: Check if file local
|
||||
alt File not local
|
||||
C->>S: Download file
|
||||
S->>DB: Retrieve file
|
||||
DB-->>S: File content
|
||||
S-->>C: File content
|
||||
C->>C: Write to disk
|
||||
end
|
||||
end
|
||||
|
||||
S-->>C: Sync complete message
|
||||
```
|
||||
|
||||
**Process**:
|
||||
1. Client scans local filesystem
|
||||
2. Client requests file list from server
|
||||
3. Server queries database and returns metadata
|
||||
4. Client uploads missing or changed local files
|
||||
5. Client downloads missing files from server
|
||||
6. Server sends sync complete notification
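
The decision made in steps 4 and 5 can be sketched as a planning function that compares the local scan with the server's file list. The names `FileMeta`, `SyncPlan`, and `planInitialSync` are illustrative, and comparing by content hash is an assumption. The sketch only decides what to transfer; choosing between "upload newer" and "merge" for changed files is left to the server.

```typescript
// Metadata known about one file. Field names are assumptions for this sketch.
interface FileMeta {
  path: string;
  version: number;
  hash: string;
}

interface SyncPlan {
  upload: string[];   // local-only files, or local files whose content differs from the server copy
  download: string[]; // files that exist only on the server
}

function planInitialSync(local: FileMeta[], server: FileMeta[]): SyncPlan {
  const serverByPath = new Map<string, FileMeta>();
  for (const f of server) {
    serverByPath.set(f.path, f);
  }

  const localPaths = new Set<string>();
  const plan: SyncPlan = { upload: [], download: [] };

  // Step 4: upload files the server lacks or holds with different content.
  for (const f of local) {
    localPaths.add(f.path);
    const remote = serverByPath.get(f.path);
    if (remote === undefined || remote.hash !== f.hash) {
      plan.upload.push(f.path);
    }
  }

  // Step 5: download files that are missing locally.
  for (const f of server) {
    if (!localPaths.has(f.path)) {
      plan.download.push(f.path);
    }
  }
  return plan;
}
```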
|
||||
|
||||
### 3. Real-Time Synchronization
|
||||
|
||||
After initial sync, changes are pushed in real time:
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant FS as Filesystem
|
||||
participant C1 as Client 1
|
||||
participant S as Server
|
||||
participant DB as Database
|
||||
participant C2 as Client 2
|
||||
|
||||
FS->>C1: File changed (fs.watch)
|
||||
C1->>C1: Read file content
|
||||
C1->>S: Upload file
|
||||
S->>DB: Store new version
|
||||
S->>S: Apply OT if needed
|
||||
S-->>C1: Upload ACK
|
||||
S->>C2: File update notification
|
||||
C2->>S: Download file
|
||||
S->>DB: Retrieve file
|
||||
DB-->>S: File content
|
||||
S-->>C2: File content
|
||||
C2->>FS: Write to disk
|
||||
```
|
||||
|
||||
**Flow**:
|
||||
1. Filesystem watcher detects local change
|
||||
2. Client reads file content
|
||||
3. Client uploads file via WebSocket
|
||||
4. Server stores in database
|
||||
5. Server applies operational transformation if concurrent edits
|
||||
6. Server acknowledges upload to sender
|
||||
7. Server broadcasts update to other clients
|
||||
8. Other clients download and apply changes
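
Step 7 (broadcast to every client except the uploader) can be sketched with a small in-memory hub. The `Peer` and `Hub` types are hypothetical stand-ins for the server's connection registry, and `send` abstracts over the real WebSocket write.

```typescript
// A connected client, reduced to an id and a send callback (assumed shape).
interface Peer {
  id: string;
  send: (message: string) => void;
}

interface FileUpdated {
  type: "file_updated";
  path: string;
  version: number;
}

class Hub {
  private peers = new Map<string, Peer>();

  join(peer: Peer): void {
    this.peers.set(peer.id, peer);
  }

  leave(id: string): void {
    this.peers.delete(id);
  }

  // Notifies every connected client except the sender; returns how many were notified.
  broadcastExcept(senderId: string, update: FileUpdated): number {
    const payload = JSON.stringify(update);
    let delivered = 0;
    this.peers.forEach((peer) => {
      if (peer.id !== senderId) {
        peer.send(payload);
        delivered++;
      }
    });
    return delivered;
  }
}
```

The sender is skipped because it already holds the new content and receives an ACK instead.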
|
||||
|
||||
## File Operations
|
||||
|
||||
### Upload
|
||||
|
||||
```
|
||||
┌─────────┐
|
||||
│ Client │
|
||||
└────┬────┘
|
||||
│ 1. Detect file change
|
||||
│
|
||||
├─► 2. Read file content
|
||||
│
|
||||
├─► 3. Create upload message
|
||||
│ {
|
||||
│ type: "upload_file",
|
||||
│ path: "notes/daily.md",
|
||||
│ content: "...",
|
||||
│ base_version: 42,
|
||||
│ timestamp: "2024-01-01T12:00:00Z"
|
||||
│ }
|
||||
│
|
||||
▼
|
||||
┌─────────┐
|
||||
│ Server │
|
||||
└────┬────┘
|
||||
│ 4. Validate message
|
||||
│
|
||||
├─► 5. Check permissions
|
||||
│
|
||||
├─► 6. Apply OT (if conflicts)
|
||||
│
|
||||
├─► 7. Store in database
|
||||
│
|
||||
├─► 8. Update version
|
||||
│
|
||||
├─► 9. Broadcast to clients
|
||||
│
|
||||
└─► 10. Send ACK to uploader
|
||||
```
|
||||
|
||||
### Download
|
||||
|
||||
```
|
||||
┌─────────┐
|
||||
│ Server │
|
||||
└────┬────┘
|
||||
│ 1. File updated by another client
|
||||
│
|
||||
├─► 2. Broadcast notification
|
||||
│ {
|
||||
│ type: "file_updated",
|
||||
│ path: "notes/daily.md",
|
||||
│ version: 43
|
||||
│ }
|
||||
│
|
||||
▼
|
||||
┌─────────┐
|
||||
│ Client │
|
||||
└────┬────┘
|
||||
│ 3. Receive notification
|
||||
│
|
||||
├─► 4. Request file download
|
||||
│ {
|
||||
│ type: "download_file",
|
||||
│ path: "notes/daily.md",
|
||||
│ version: 43
|
||||
│ }
|
||||
│
|
||||
▼
|
||||
┌─────────┐
|
||||
│ Server │
|
||||
└────┬────┘
|
||||
│ 5. Retrieve from database
|
||||
│
|
||||
└─► 6. Send file content
|
||||
{
|
||||
type: "file_content",
|
||||
path: "notes/daily.md",
|
||||
content: "...",
|
||||
version: 43
|
||||
}
|
||||
│
|
||||
▼
|
||||
┌─────────┐
|
||||
│ Client │
|
||||
└────┬────┘
|
||||
│ 7. Write to filesystem
|
||||
│
|
||||
└─► 8. Update local metadata
|
||||
```
|
||||
|
||||
### Delete
|
||||
|
||||
```
|
||||
┌─────────┐
|
||||
│ Client │
|
||||
└────┬────┘
|
||||
│ 1. File deleted locally
|
||||
│
|
||||
├─► 2. Send delete message
|
||||
│ {
|
||||
│ type: "delete_file",
|
||||
│ path: "notes/old.md"
|
||||
│ }
|
||||
│
|
||||
▼
|
||||
┌─────────┐
|
||||
│ Server │
|
||||
└────┬────┘
|
||||
│ 3. Mark as deleted in DB
|
||||
│ (soft delete for history)
|
||||
│
|
||||
├─► 4. Broadcast deletion
|
||||
│
|
||||
└─► 5. ACK to sender
|
||||
│
|
||||
▼
|
||||
┌─────────┐
|
||||
│ Other │
|
||||
│ Clients │
|
||||
└────┬────┘
|
||||
│ 6. Delete local file
|
||||
│
|
||||
└─► 7. Update metadata
|
||||
```
|
||||
|
||||
## Conflict Resolution Flow
|
||||
|
||||
### Concurrent Edits Scenario
|
||||
|
||||
```
|
||||
Time →
|
||||
|
||||
Client A Server Client B
|
||||
│ │ │
|
||||
│ Edit file v10 │ │
|
||||
│ "Add line A" │ │ Edit file v10
|
||||
│ │ │ "Add line B"
|
||||
│ │ │
|
||||
├─── Upload @ t1 ─────────►│ │
|
||||
│ │◄────── Upload @ t2 ────────┤
|
||||
│ │ │
|
||||
│ │ 1. Receive both edits │
|
||||
│ │ (based on v10) │
|
||||
│ │ │
|
||||
│ │ 2. Apply first edit │
|
||||
│ │ → v11 (line A added) │
|
||||
│ │ │
|
||||
│ │ 3. Transform second edit │
|
||||
│ │ against first │
|
||||
│ │ │
|
||||
│ │ 4. Apply transformed edit │
|
||||
│ │ → v12 (both lines) │
|
||||
│ │ │
|
||||
│◄──── v12 content ────────┤ │
|
||||
│ ├───── v12 content ─────────►│
|
||||
│ │ │
|
||||
│ Apply v12 │ │ Apply v12
|
||||
│ (has both lines) │ │ (has both lines)
|
||||
│ │ │
|
||||
```
|
||||
|
||||
### Conflict Resolution Steps
|
||||
|
||||
1. **Detection**: Server receives two edits based on the same version
|
||||
2. **Ordering**: Determine which edit to apply first (by timestamp or client ID)
|
||||
3. **First edit**: Apply directly to database
|
||||
4. **Transformation**: Transform second edit against first using OT
|
||||
5. **Second edit**: Apply transformed edit to database
|
||||
6. **Broadcast**: Send merged result to all clients
|
||||
7. **Application**: Clients apply merged version locally
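
The detection and merge steps above can be sketched as follows. This is a simplified model under stated assumptions: `StoredDoc`, `Upload`, and `applyUpload` are hypothetical names, uploads are handled in arrival order rather than by timestamp or client ID, and the merge itself is passed in as a function because the actual three-way merge is delegated to the OT library.

```typescript
interface StoredDoc {
  version: number;
  content: string;
  history: Map<number, string>; // version -> content, so the base of a stale upload can be looked up
}

interface Upload {
  content: string;
  baseVersion: number;
}

// Three-way merge hook: (base, current, incoming) -> merged text.
type Merge = (base: string, current: string, incoming: string) => string;

function applyUpload(doc: StoredDoc, upload: Upload, merge: Merge): StoredDoc {
  let next: string;
  if (upload.baseVersion === doc.version) {
    // No concurrent edit: the upload was made against the current version.
    next = upload.content;
  } else {
    // Conflict detected: the upload is based on an older version.
    const base = doc.history.get(upload.baseVersion);
    if (base === undefined) {
      throw new Error("unknown base version " + upload.baseVersion);
    }
    next = merge(base, doc.content, upload.content);
  }
  const version = doc.version + 1;
  const history = new Map(doc.history);
  history.set(version, next);
  return { version, content: next, history };
}
```

With two uploads based on v10, the first produces v11 directly and the second is merged into v12, matching the timeline in the diagram.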
|
||||
|
||||
## Database Schema
|
||||
|
||||
### Core Tables
|
||||
|
||||
```sql
-- Document metadata
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    version INTEGER NOT NULL,
    content_hash TEXT,
    size INTEGER,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    deleted BOOLEAN DEFAULT FALSE
);

-- Version history
CREATE TABLE versions (
    id INTEGER PRIMARY KEY,
    document_id INTEGER,
    version INTEGER,
    content BLOB,
    created_at TIMESTAMP,
    FOREIGN KEY (document_id) REFERENCES documents(id)
);

-- Client sync cursors
CREATE TABLE cursors (
    client_id TEXT PRIMARY KEY,
    last_version INTEGER,
    last_updated TIMESTAMP
);
```
|
||||
|
||||
### Queries
|
||||
|
||||
**Get files since version**:
|
||||
```sql
|
||||
SELECT * FROM documents
|
||||
WHERE version > ? AND deleted = FALSE
|
||||
ORDER BY version ASC;
|
||||
```
|
||||
|
||||
**Store new version**:
|
||||
```sql
-- Run both statements in one transaction so the history row and the
-- document's current version cannot drift apart.
BEGIN;

INSERT INTO versions (document_id, version, content, created_at)
VALUES (?, ?, ?, ?);

UPDATE documents
SET version = ?, updated_at = ?
WHERE id = ?;

COMMIT;
```
|
||||
|
||||
**Update cursor**:
|
||||
```sql
|
||||
INSERT OR REPLACE INTO cursors (client_id, last_version, last_updated)
|
||||
VALUES (?, ?, ?);
|
||||
```
|
||||
|
||||
## Message Protocol
|
||||
|
||||
### Client → Server Messages
|
||||
|
||||
**Upload File**:
|
||||
```json
|
||||
{
|
||||
"type": "upload_file",
|
||||
"path": "notes/example.md",
|
||||
"content": "File content here...",
|
||||
"base_version": 10,
|
||||
"timestamp": "2024-01-01T12:00:00Z"
|
||||
}
|
||||
```
|
||||
|
||||
**Download File**:
|
||||
```json
|
||||
{
|
||||
"type": "download_file",
|
||||
"path": "notes/example.md"
|
||||
}
|
||||
```
|
||||
|
||||
**Delete File**:
|
||||
```json
|
||||
{
|
||||
"type": "delete_file",
|
||||
"path": "notes/old.md"
|
||||
}
|
||||
```
|
||||
|
||||
**List Files**:
|
||||
```json
|
||||
{
|
||||
"type": "list_files",
|
||||
"since_version": 0
|
||||
}
|
||||
```
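
The four client messages above map naturally onto a TypeScript discriminated union. The following is a sketch of how incoming JSON could be validated before dispatch. The `parseClientMessage` helper is hypothetical, and `INVALID_MESSAGE` is an assumed error code.

```typescript
type ClientMessage =
  | { type: "upload_file"; path: string; content: string; base_version: number; timestamp: string }
  | { type: "download_file"; path: string }
  | { type: "delete_file"; path: string }
  | { type: "list_files"; since_version: number };

interface ErrorMessage {
  type: "error";
  message: string;
  code: string;
}

// "INVALID_MESSAGE" is an assumed code, not one defined by the protocol above.
function invalid(message: string): ErrorMessage {
  return { type: "error", message, code: "INVALID_MESSAGE" };
}

function parseClientMessage(raw: string): ClientMessage | ErrorMessage {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return invalid("Malformed JSON");
  }
  if (typeof data !== "object" || data === null) {
    return invalid("Expected a JSON object");
  }
  const { type, path, content, base_version, timestamp, since_version } =
    data as Record<string, unknown>;

  switch (type) {
    case "upload_file":
      if (typeof path === "string" && typeof content === "string" &&
          typeof base_version === "number" && typeof timestamp === "string") {
        return { type: "upload_file", path, content, base_version, timestamp };
      }
      return invalid("Invalid upload_file message");
    case "download_file":
      return typeof path === "string"
        ? { type: "download_file", path }
        : invalid("Invalid download_file message");
    case "delete_file":
      return typeof path === "string"
        ? { type: "delete_file", path }
        : invalid("Invalid delete_file message");
    case "list_files":
      return typeof since_version === "number"
        ? { type: "list_files", since_version }
        : invalid("Invalid list_files message");
    default:
      return invalid("Unknown message type");
  }
}
```

Because the result is a union keyed on `type`, a `switch` in the caller is checked for exhaustiveness by the compiler.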
|
||||
|
||||
### Server → Client Messages
|
||||
|
||||
**File Updated**:
|
||||
```json
|
||||
{
|
||||
"type": "file_updated",
|
||||
"path": "notes/example.md",
|
||||
"version": 11,
|
||||
"size": 1024,
|
||||
"hash": "abc123..."
|
||||
}
|
||||
```
|
||||
|
||||
**File Content**:
|
||||
```json
|
||||
{
|
||||
"type": "file_content",
|
||||
"path": "notes/example.md",
|
||||
"content": "Updated content...",
|
||||
"version": 11
|
||||
}
|
||||
```
|
||||
|
||||
**File Deleted**:
|
||||
```json
|
||||
{
|
||||
"type": "file_deleted",
|
||||
"path": "notes/old.md",
|
||||
"version": 12
|
||||
}
|
||||
```
|
||||
|
||||
**Sync Complete**:
|
||||
```json
|
||||
{
|
||||
"type": "sync_complete",
|
||||
"total_files": 150,
|
||||
"current_version": 200
|
||||
}
|
||||
```
|
||||
|
||||
**Error**:
|
||||
```json
|
||||
{
|
||||
"type": "error",
|
||||
"message": "File too large",
|
||||
"code": "FILE_TOO_LARGE"
|
||||
}
|
||||
```
|
||||
|
||||
## Error Handling
|
||||
|
||||
### Client-Side Errors
|
||||
|
||||
**Network failure**:
|
||||
1. Detect WebSocket disconnect
|
||||
2. Queue pending operations
|
||||
3. Retry connection with exponential backoff
|
||||
4. Replay queued operations on reconnect
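
A minimal sketch of steps 2 to 4, assuming a doubling delay with a cap and a simple FIFO queue. The default delays and the `OfflineQueue` class are illustrative choices, not VaultLink's actual settings or types. Jitter is omitted to keep the example deterministic.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Holds operations made while disconnected and replays them in order.
class OfflineQueue<T> {
  private pending: T[] = [];

  enqueue(op: T): void {
    this.pending.push(op);
  }

  size(): number {
    return this.pending.length;
  }

  // Sends queued operations oldest-first. Stops at the first failed send
  // so the remaining operations stay queued for the next reconnect.
  replay(send: (op: T) => boolean): number {
    let sent = 0;
    while (this.pending.length > 0) {
      const op = this.pending[0] as T;
      if (!send(op)) {
        break;
      }
      this.pending.shift();
      sent++;
    }
    return sent;
  }
}
```

An operation is removed from the queue only after `send` reports success, so a connection drop during replay does not lose edits.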
|
||||
|
||||
**File read error**:
|
||||
1. Log error
|
||||
2. Skip file
|
||||
3. Continue with other files
|
||||
4. Report to user
|
||||
|
||||
**Write conflict**:
|
||||
1. Receive updated version from server
|
||||
2. Apply OT merge locally
|
||||
3. Overwrite local file
|
||||
4. Continue syncing
|
||||
|
||||
### Server-Side Errors
|
||||
|
||||
**Database error**:
|
||||
1. Log error
|
||||
2. Return error to client
|
||||
3. Client retries operation
|
||||
|
||||
**Invalid operation**:
|
||||
1. Validate message format
|
||||
2. Return specific error code
|
||||
3. Client handles error appropriately
|
||||
|
||||
**Authentication failure**:
|
||||
1. Reject connection
|
||||
2. Send auth error
|
||||
3. Client prompts for new credentials
|
||||
|
||||
## Performance Optimizations
|
||||
|
||||
### Batching
|
||||
|
||||
- Small, rapid changes are batched together
|
||||
- Reduces message overhead
|
||||
- Applied as single atomic update
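
One way to batch rapid changes is to keep only the most recent content per path and flush the set as a unit. The `ChangeBatcher` class below is a hypothetical sketch of that idea. How and when VaultLink actually flushes a batch is not specified here, so the timing is left to the caller.

```typescript
interface Change {
  path: string;
  content: string;
}

class ChangeBatcher {
  private latest = new Map<string, string>();

  // Later changes to the same path overwrite earlier ones in the batch.
  add(change: Change): void {
    this.latest.set(change.path, change.content);
  }

  // Returns one change per path (the most recent) and clears the batch.
  flush(): Change[] {
    const batch: Change[] = [];
    this.latest.forEach((content, path) => {
      batch.push({ path, content });
    });
    this.latest.clear();
    return batch;
  }
}
```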
|
||||
|
||||
### Compression
|
||||
|
||||
- Large files compressed before transmission
|
||||
- Reduces bandwidth usage
|
||||
- Transparent to application layer
|
||||
|
||||
### Incremental Sync
|
||||
|
||||
- Only changed portions of files sent
|
||||
- Uses content-based diffing
|
||||
- Significantly reduces data transfer
|
||||
|
||||
### Caching
|
||||
|
||||
- Server caches recent file versions
|
||||
- Reduces database queries
|
||||
- Improves response time
|
||||
|
||||
## Monitoring Data Flow
|
||||
|
||||
### Server Logs
|
||||
|
||||
```
|
||||
2024-01-01 12:00:00 INFO WebSocket connection from 192.168.1.100
|
||||
2024-01-01 12:00:01 INFO User 'alice' authenticated for vault 'personal'
|
||||
2024-01-01 12:00:05 INFO Upload: notes/daily.md (v10 -> v11)
|
||||
2024-01-01 12:00:06 INFO Broadcast to 3 clients
|
||||
2024-01-01 12:00:10 INFO Conflict resolved: notes/shared.md (v12)
|
||||
```
|
||||
|
||||
### Client Logs
|
||||
|
||||
```
|
||||
2024-01-01 12:00:00 INFO Connecting to ws://sync.example.com
|
||||
2024-01-01 12:00:01 INFO Connected, authenticating...
|
||||
2024-01-01 12:00:01 INFO Authentication successful
|
||||
2024-01-01 12:00:02 INFO Starting initial sync
|
||||
2024-01-01 12:00:10 INFO Sync complete: 150 files, 200 MB
|
||||
2024-01-01 12:00:15 INFO Uploaded: notes/daily.md
|
||||
2024-01-01 12:00:20 INFO Downloaded: notes/shared.md (merged)
|
||||
```
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Understand the sync algorithm →](/architecture/sync-algorithm)
|
||||
- [Configure the server →](/config/server)
|
||||
- [Deploy VaultLink →](/guide/getting-started)
|
||||
344
docs/architecture/index.md
Normal file
|
|
@ -0,0 +1,344 @@
|
|||
# Architecture Overview
|
||||
|
||||
VaultLink is built as a distributed system with a central sync server and multiple clients. This document explains the high-level architecture and design decisions.
|
||||
|
||||
## System Components
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ Clients │
|
||||
├─────────────────────┬───────────────────┬───────────────────┤
|
||||
│ Obsidian Plugin │ Obsidian Plugin │ CLI Client │
|
||||
│ (User A - Device1) │ (User A - Device2) │ (Server/Backup) │
|
||||
└──────────┬──────────┴─────────┬─────────┴──────────┬────────┘
|
||||
│ │ │
|
||||
│ WebSocket │ WebSocket │ WebSocket
|
||||
│ │ │
|
||||
└────────────────────┼────────────────────┘
|
||||
│
|
||||
┌───────────▼───────────┐
|
||||
│ Sync Server │
|
||||
│ (Rust + Axum) │
|
||||
│ │
|
||||
│ ┌─────────────────┐ │
|
||||
│ │ WebSocket Hub │ │
|
||||
│ └────────┬────────┘ │
|
||||
│ │ │
|
||||
│ ┌────────▼────────┐ │
|
||||
│ │ Sync Engine │ │
|
||||
│ │ (OT Algorithm) │ │
|
||||
│ └────────┬────────┘ │
|
||||
│ │ │
|
||||
│ ┌────────▼────────┐ │
|
||||
│ │ SQLite Database │ │
|
||||
│ │ (Per Vault) │ │
|
||||
│ └─────────────────┘ │
|
||||
└───────────────────────┘
|
||||
```
|
||||
|
||||
## Core Components
|
||||
|
||||
### Sync Server
|
||||
|
||||
The central authority for synchronization, written in Rust using the Axum framework.
|
||||
|
||||
**Responsibilities**:
|
||||
- Accept WebSocket connections from clients
|
||||
- Authenticate users via token-based auth
|
||||
- Store document versions in SQLite
|
||||
- Coordinate real-time updates between clients
|
||||
- Apply operational transformation for conflict resolution
|
||||
- Manage vault access control
|
||||
|
||||
**Technology**:
|
||||
- **Language**: Rust 1.89+
|
||||
- **Framework**: Axum (async web framework)
|
||||
- **Database**: SQLite with SQLx
|
||||
- **Protocol**: WebSockets for real-time communication
|
||||
- **Sync Algorithm**: reconcile-text (operational transformation)
|
||||
|
||||
### Sync Client Library
|
||||
|
||||
TypeScript library providing core synchronization logic, used by both the Obsidian plugin and CLI client.
|
||||
|
||||
**Responsibilities**:
|
||||
- Manage WebSocket connection to server
|
||||
- Watch local filesystem for changes
|
||||
- Upload and download files
|
||||
- Apply remote changes locally
|
||||
- Handle conflict resolution
|
||||
- Maintain sync metadata
|
||||
|
||||
**Technology**:
|
||||
- **Language**: TypeScript
|
||||
- **Build**: Webpack
|
||||
- **Protocol**: WebSocket client
|
||||
- **File System**: Node.js `fs` API / Obsidian API
|
||||
|
||||
### Obsidian Plugin
|
||||
|
||||
Integration layer between sync client and Obsidian.
|
||||
|
||||
**Responsibilities**:
|
||||
- Provide UI for configuration
|
||||
- Bridge sync client with Obsidian's file system API
|
||||
- Handle Obsidian lifecycle events
|
||||
- Display sync status to users
|
||||
|
||||
**Technology**:
|
||||
- **Platform**: Obsidian Plugin API
|
||||
- **Core**: sync-client library
|
||||
- **UI**: Obsidian settings UI
|
||||
|
||||
### CLI Client
|
||||
|
||||
Standalone executable for syncing vaults without Obsidian.
|
||||
|
||||
**Responsibilities**:
|
||||
- Command-line interface
|
||||
- File system access via Node.js
|
||||
- Daemon mode for continuous sync
|
||||
- Health check endpoint for monitoring
|
||||
|
||||
**Technology**:
|
||||
- **Language**: TypeScript
|
||||
- **Runtime**: Node.js
|
||||
- **CLI**: Commander.js
|
||||
- **Core**: sync-client library
|
||||
|
||||
## Data Flow
|
||||
|
||||
### Initial Connection
|
||||
|
||||
1. Client connects via WebSocket to server
|
||||
2. Server authenticates using provided token
|
||||
3. Server verifies user has access to requested vault
|
||||
4. Connection established, sync begins
|
||||
|
||||
### File Upload Flow
|
||||
|
||||
```
|
||||
Client Server
|
||||
│ │
|
||||
│ 1. File changed locally │
|
||||
│ │
|
||||
│ 2. Read file content │
|
||||
│ │
|
||||
│ 3. WebSocket: Upload file │
|
||||
├──────────────────────────────►│
|
||||
│ │ 4. Store in SQLite
|
||||
│ │
|
||||
│ │ 5. Broadcast to other clients
|
||||
│ ├───────────────────────►
|
||||
│ 6. Ack upload │
|
||||
│◄──────────────────────────────┤
|
||||
```
|
||||
|
||||
### File Download Flow
|
||||
|
||||
```
|
||||
Client A Server Client B
|
||||
│ │ │
|
||||
│ │ 1. File uploaded │
|
||||
│ │◄────────────────────────┤
|
||||
│ │ │
|
||||
│ │ 2. Store in DB │
|
||||
│ │ │
|
||||
│ 3. Push notification │ │
|
||||
│◄────────────────────────┤ │
|
||||
│ │ │
|
||||
│ 4. Download file │ │
|
||||
├────────────────────────►│ │
|
||||
│ │ │
|
||||
│ 5. Write locally │ │
|
||||
│ │ │
|
||||
```
|
||||
|
||||
### Conflict Resolution
|
||||
|
||||
When two clients edit the same file simultaneously:
|
||||
|
||||
```
|
||||
Client A Server Client B
|
||||
│ │ │
|
||||
│ 1. Edit file │ │ 1. Edit same file
|
||||
│ │ │
|
||||
│ 2. Upload changes │ │ 2. Upload changes
|
||||
├────────────────────────►│◄────────────────────────┤
|
||||
│ │ │
|
||||
│ │ 3. Apply OT algorithm │
|
||||
│ │ - Merge both edits │
|
||||
│ │ - Preserve all changes│
|
||||
│ │ │
|
||||
│ 4. Receive merged ver. │ 5. Receive merged ver. │
|
||||
│◄────────────────────────┤────────────────────────►│
|
||||
│ │ │
|
||||
│ 6. Apply locally │ │ 6. Apply locally
|
||||
```
|
||||
|
||||
## Storage Architecture
|
||||
|
||||
### Server Storage
|
||||
|
||||
Each vault has its own SQLite database:
|
||||
|
||||
```
|
||||
databases/
|
||||
├── vault-1.db
|
||||
├── vault-2.db
|
||||
└── shared-team.db
|
||||
```
|
||||
|
||||
**Database Schema** (simplified):
|
||||
- **documents**: File metadata (path, size, modified time)
|
||||
- **versions**: Document content with version history
|
||||
- **cursors**: Client sync state
|
||||
|
||||
### Client Storage
|
||||
|
||||
Clients maintain sync metadata:
|
||||
|
||||
```
|
||||
.vaultlink/
|
||||
├── metadata.json # Sync state
|
||||
└── cache/ # Optional local cache
|
||||
```
|
||||
|
||||
The `.vaultlink` directory tracks which files have been synced and their versions to enable efficient synchronization.
|
||||
|
||||
## Communication Protocol
|
||||
|
||||
### WebSocket Messages
|
||||
|
||||
Client-server communication uses JSON messages over WebSocket.
|
||||
|
||||
**Message Types**:
|
||||
- `upload_file`: Client → Server (file upload)
|
||||
- `download_file`: Client → Server (request file)
|
||||
- `file_updated`: Server → Client (file changed notification)
|
||||
- `file_deleted`: Server → Client (file deleted notification)
|
||||
- `sync_complete`: Server → Client (initial sync finished)
|
||||
|
||||
### Authentication
|
||||
|
||||
Token-based authentication on connection:
|
||||
|
||||
```typescript
|
||||
// Client sends token on connect
|
||||
{
|
||||
type: "auth",
|
||||
token: "user-auth-token",
|
||||
vault: "vault-name"
|
||||
}
|
||||
|
||||
// Server responds
|
||||
{
|
||||
type: "auth_success"
|
||||
}
|
||||
// or
|
||||
{
|
||||
type: "auth_error",
|
||||
message: "Invalid token"
|
||||
}
|
||||
```
|
||||
|
||||
## Scalability Considerations
|
||||
|
||||
### Current Architecture
|
||||
|
||||
- **SQLite per vault**: Simple, performant, limited to single server
|
||||
- **WebSocket connections**: Stateful, requires sticky sessions for load balancing
|
||||
- **Operational transformation**: Centralized on server
|
||||
|
||||
### Scaling Approaches
|
||||
|
||||
**Vertical Scaling**:
|
||||
- Increase server resources (CPU, RAM, storage)
|
||||
- Optimize database queries and indexing
|
||||
- Tune connection limits
|
||||
|
||||
**Horizontal Scaling** (future):
|
||||
- Separate vault servers (vault sharding)
|
||||
- Load balancer with sticky sessions
|
||||
- Shared storage layer for SQLite databases
|
||||
- Consider alternative databases (PostgreSQL) for multi-server setups
|
||||
|
||||
### Performance Characteristics
|
||||
|
||||
- **Small vaults** (< 1000 files): Excellent performance
|
||||
- **Medium vaults** (1000-10000 files): Good performance with tuning
|
||||
- **Large vaults** (> 10000 files): May require optimization
|
||||
- **Concurrent users**: Tested with dozens of simultaneous clients per vault
|
||||
|
||||
## Security Model
|
||||
|
||||
### Authentication
|
||||
|
||||
- Token-based authentication
|
||||
- Tokens configured in server `config.yml`
|
||||
- No password hashing (tokens are secrets)
|
||||
|
||||
### Authorization
|
||||
|
||||
- Per-user vault access control
|
||||
- Allow-list or deny-list patterns
|
||||
- Global access or vault-specific access
|
||||
|
||||
### Network Security
|
||||
|
||||
- WebSocket over TLS (WSS) for encrypted transport
|
||||
- No built-in TLS termination (run the server behind a reverse proxy)
|
||||
- CORS configured for web clients
|
||||
|
||||
### Data Security
|
||||
|
||||
- No encryption at rest (use encrypted filesystems if needed)
|
||||
- No end-to-end encryption (server sees all content)
|
||||
- Self-hosted model: you control the data
|
||||
|
||||
## Technology Choices
|
||||
|
||||
### Why Rust for Server?
|
||||
|
||||
- **Performance**: Low latency for real-time sync
|
||||
- **Memory safety**: Safe Rust rules out whole classes of memory bugs, such as use-after-free and data races
|
||||
- **Concurrency**: Excellent async support with Tokio
|
||||
- **Type safety**: Catch bugs at compile time
|
||||
- **SQLx**: Compile-time SQL verification
|
||||
|
||||
### Why SQLite?
|
||||
|
||||
- **Simplicity**: No separate database server required
|
||||
- **Performance**: Fast for read-heavy workloads
|
||||
- **Reliability**: Battle-tested, ACID compliant
|
||||
- **Portability**: Single file per vault
|
||||
- **Backups**: Simple file copy
|
||||
|
||||
### Why WebSocket?
|
||||
|
||||
- **Real-time**: Bidirectional push for instant updates
|
||||
- **Efficiency**: Persistent connection, no polling overhead
|
||||
- **Simplicity**: Built-in browser/Node.js support
|
||||
- **Standards**: Well-supported protocol
|
||||
|
||||
### Why Operational Transformation?
|
||||
|
||||
- **Automatic conflict resolution**: No manual merging required
|
||||
- **Preserves intent**: All edits are kept
|
||||
- **Real-time collaboration**: Users see changes as they happen
|
||||
- **Proven algorithm**: Used by Google Docs, etc.
|
||||
|
||||
## Design Principles
|
||||
|
||||
1. **Self-hosted first**: Users control their data and infrastructure
|
||||
2. **Simplicity**: Easy to deploy and operate
|
||||
3. **Real-time**: Changes appear immediately
|
||||
4. **Reliability**: Handle network failures gracefully
|
||||
5. **Performance**: Fast sync for typical vault sizes
|
||||
6. **Privacy**: No third-party services or telemetry
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Learn about the sync algorithm →](/architecture/sync-algorithm)
|
||||
- [Understand data flow in detail →](/architecture/data-flow)
|
||||
- [Deploy the server →](/guide/server-setup)
|
||||
361
docs/architecture/sync-algorithm.md
Normal file
|
|
@ -0,0 +1,361 @@
|
|||
# Sync Algorithm
|
||||
|
||||
VaultLink uses operational transformation (OT) to handle concurrent edits and maintain consistency across clients. This document explains how the algorithm works.
|
||||
|
||||
## Operational Transformation
|
||||
|
||||
Operational transformation is a technique for managing concurrent edits to the same document. It transforms operations (edits) so they can be applied in different orders while preserving user intent.
|
||||
|
||||
### Why OT?
|
||||
|
||||
Traditional conflict resolution approaches:
|
||||
- **Last write wins**: Loses data, frustrating for users
|
||||
- **Manual merging**: Interrupts workflow, requires user intervention
|
||||
- **Version branching**: Complex, not suitable for real-time sync
|
||||
|
||||
Operational transformation:
|
||||
- **Automatic**: No user intervention required
|
||||
- **Preserves all edits**: No data loss
|
||||
- **Real-time**: Changes appear immediately
|
||||
- **Intuitive**: Behavior matches user expectations
|
||||
|
||||
## The reconcile-text Library
|
||||
|
||||
VaultLink uses the [`reconcile-text`](https://crates.io/crates/reconcile-text) Rust library for operational transformation on text documents.
|
||||
|
||||
### How It Works
|
||||
|
||||
Given a base document and two sets of changes, OT produces a merged result that includes both changes.
|
||||
|
||||
**Example**:
|
||||
|
||||
```
|
||||
Base document: "Hello world"
|
||||
|
||||
User A: "Hello beautiful world" (inserts "beautiful ")
|
||||
User B: "Hello world!" (inserts "!")
|
||||
|
||||
OT result: "Hello beautiful world!" (both changes applied)
|
||||
```
|
||||
|
||||
### Operation Types
|
||||
|
||||
The algorithm handles these operations:
|
||||
- **Insert**: Add text at position
|
||||
- **Delete**: Remove text from position
|
||||
- **Retain**: Keep existing text unchanged
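
To make the three operation types concrete, here is a sketch that applies a sequence of them to a string. The `Op` representation and `applyOps` are illustrative. The internal format used by `reconcile-text` is not shown in this document and may differ.

```typescript
type Op =
  | { kind: "retain"; count: number }
  | { kind: "insert"; text: string }
  | { kind: "delete"; count: number };

// Walks the document left to right, copying, skipping, or adding text.
function applyOps(doc: string, ops: Op[]): string {
  let pos = 0; // read position in the original document
  let out = "";
  for (const op of ops) {
    switch (op.kind) {
      case "retain":
        out += doc.slice(pos, pos + op.count);
        pos += op.count;
        break;
      case "insert":
        out += op.text;
        break;
      case "delete":
        pos += op.count;
        break;
    }
    if (pos > doc.length) {
      throw new Error("operation runs past the end of the document");
    }
  }
  // Any text not covered by the operations is retained implicitly.
  return out + doc.slice(pos);
}
```

For example, User A's edit from the example above is `retain 6` followed by `insert "beautiful "` applied to `"Hello world"`.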
|
||||
|
||||
### Transformation Process
|
||||
|
||||
1. **Client A** makes edit and sends to server
|
||||
2. **Client B** makes concurrent edit and sends to server
|
||||
3. **Server** receives both edits
|
||||
4. **Server** transforms operations to account for concurrent changes
|
||||
5. **Server** applies merged result to database
|
||||
6. **Server** sends transformed operations to both clients
|
||||
7. **Clients** apply transformed operations locally
|
||||
|
||||
## Sync State Management
|
||||
|
||||
VaultLink maintains sync state to track which changes have been applied.
|
||||
|
||||
### Version Vectors
|
||||
|
||||
Each document has a version tracked by:
|
||||
- **Server version**: Incremented on each change
|
||||
- **Client cursors**: Track which version each client has seen
|
||||
|
||||
This enables:
|
||||
- Efficient syncing (only send changes since last sync)
|
||||
- Conflict detection (concurrent edits to same version)
|
||||
- Ordering of operations
|
||||
|
||||
### Cursor Management
|
||||
|
||||
Clients maintain a cursor position:
|
||||
|
||||
```rust
|
||||
struct Cursor {
|
||||
vault_id: String,
|
||||
client_id: String,
|
||||
last_version: u64,
|
||||
last_updated: DateTime,
|
||||
}
|
||||
```
|
||||
|
||||
On sync:
|
||||
1. Client sends cursor (last seen version)
|
||||
2. Server returns all changes since that version
|
||||
3. Client applies changes and updates cursor
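
The three sync steps can be sketched with an in-memory change log. `ChangeRecord`, `SyncCursor`, `changesSince`, and `advance` are hypothetical names used only for illustration. On the server the lookup corresponds to a query filtered by version.

```typescript
interface ChangeRecord {
  path: string;
  version: number;
}

interface SyncCursor {
  clientId: string;
  lastVersion: number;
}

// Step 2: everything newer than the cursor, oldest first.
function changesSince(log: ChangeRecord[], cursor: SyncCursor): ChangeRecord[] {
  return log
    .filter((c) => c.version > cursor.lastVersion)
    .sort((a, b) => a.version - b.version);
}

// Step 3: after applying the changes, move the cursor to the highest version seen.
function advance(cursor: SyncCursor, applied: ChangeRecord[]): SyncCursor {
  const last = applied.reduce((max, c) => Math.max(max, c.version), cursor.lastVersion);
  return { clientId: cursor.clientId, lastVersion: last };
}
```

Returning a new cursor instead of mutating the old one keeps the previous state available if applying a change fails partway through.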
|
||||
|
||||
## Conflict Resolution Flow
|
||||
|
||||
### Scenario: Concurrent Edits
|
||||
|
||||
Two users edit the same paragraph simultaneously.
|
||||
|
||||
**Initial state**:
|
||||
```
|
||||
Version 10: "The quick brown fox jumps over the lazy dog."
|
||||
```
|
||||
|
||||
**User A's edit** (version 11):
|
||||
```
|
||||
"The quick brown fox jumps over the very lazy dog."
|
||||
```
|
||||
*Inserts "very " at position 35*
|
||||
|
||||
**User B's edit** (also from version 10):
|
||||
```
|
||||
"The quick red fox jumps over the lazy dog."
|
||||
```
|
||||
*Replaces "brown" with "red" at position 10*
|
||||
|
||||
### Server Processing
|
||||
|
||||
1. **Receive User A's operation**:
|
||||
- Base: version 10
|
||||
- Operation: Insert("very ", position=35)
|
||||
- Apply to database → version 11
|
||||
|
||||
2. **Receive User B's operation**:
|
||||
- Base: version 10
|
||||
- Operation: Replace("brown"→"red", position=10)
|
||||
- **Conflict detected**: Base is version 10, but current is version 11
|
||||
|
||||
3. **Transform User B's operation**:
|
||||
- Transform against User A's operation
|
||||
- Adjust positions/content as needed
|
||||
- Apply transformed operation → version 12
|
||||
|
||||
4. **Broadcast updates**:
|
||||
- Send User A's operation to User B
|
||||
- Send transformed User B's operation to User A
|
||||
|
||||
### Final Result
|
||||
|
||||
```
|
||||
Version 12: "The quick red fox jumps over the very lazy dog."
|
||||
```
|
||||
|
||||
Both edits are preserved in the final document.
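
The position adjustment behind this result can be sketched with a simplified transform over single edits. This is an illustration of the idea, not the `reconcile-text` algorithm. The `Edit` shape and `transform` function are hypothetical, offsets are 0-based (`brown` starts at offset 10 and `lazy` at offset 35), and overlapping edits are deliberately out of scope.

```typescript
// One edit: delete `del` characters at `pos`, then insert `ins` there.
interface Edit {
  pos: number;
  del: number;
  ins: string;
}

function applyEdit(doc: string, e: Edit): string {
  return doc.slice(0, e.pos) + e.ins + doc.slice(e.pos + e.del);
}

// Rebases `op` so it can be applied after `applied`.
// Both edits must have been made against the same base text.
function transform(op: Edit, applied: Edit): Edit {
  if (op.del === 0 && applied.del === 0 && op.pos === applied.pos) {
    throw new Error("same-position inserts need a tie-break; out of scope for this sketch");
  }
  if (applied.pos + applied.del <= op.pos) {
    // `applied` lies entirely before `op`: shift by its net length change.
    return { pos: op.pos + applied.ins.length - applied.del, del: op.del, ins: op.ins };
  }
  if (op.pos + op.del <= applied.pos) {
    // `applied` lies entirely after `op`: nothing to adjust.
    return op;
  }
  throw new Error("overlapping edits are out of scope for this sketch");
}

const base = "The quick brown fox jumps over the lazy dog.";
const editA: Edit = { pos: 35, del: 0, ins: "very " }; // User A
const editB: Edit = { pos: 10, del: 5, ins: "red" };   // User B

// Server order: A first, then B transformed against A.
const viaA = applyEdit(applyEdit(base, editA), transform(editB, editA));
// Opposite order: B first, then A transformed against B (its position moves from 35 to 33).
const viaB = applyEdit(applyEdit(base, editB), transform(editA, editB));
```

Both `viaA` and `viaB` evaluate to the version 12 text, which is the convergence property the server relies on.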
|
||||
|
||||
## Edge Cases
|
||||
|
||||
### 1. Delete vs Insert Conflict
|
||||
|
||||
**Scenario**: User A deletes a paragraph while User B edits it.
|
||||
|
||||
**Resolution**:
|
||||
- OT algorithm prioritizes preservation of content
|
||||
- Insert operation is transformed to account for deletion
|
||||
- Typically results in inserted content appearing nearby
|
||||
|
||||
**Example**:
|
||||
```
|
||||
Base: "Line 1\nLine 2\nLine 3"
|
||||
|
||||
User A: Delete Line 2 → "Line 1\nLine 3"
|
||||
User B: Edit Line 2 → "Line 1\nLine 2 modified\nLine 3"
|
||||
|
||||
Result: "Line 1\nLine 2 modified\nLine 3"
|
||||
```
|
||||
(Insert takes precedence, preserving user content)
|
||||
|
||||
### 2. Overlapping Edits
|
||||
|
||||
**Scenario**: Two users edit overlapping regions.
|
||||
|
||||
**Resolution**:
|
||||
- OT splits operations into non-overlapping segments
|
||||
- Applies each segment independently
|
||||
- Merges results
|
||||
|
||||
### 3. Delete vs Delete
|
||||
|
||||
**Scenario**: Two users delete overlapping text.
|
||||
|
||||
**Resolution**:
|
||||
- Deletes are merged
|
||||
- Final result has the union of deleted ranges removed
|
||||
|
||||
### 4. Network Partitions
|
||||
|
||||
**Scenario**: Client loses connection, makes edits offline, reconnects.
|
||||
|
||||
**Resolution**:
|
||||
1. Client queues edits locally
|
||||
2. On reconnect, sends all queued operations
|
||||
3. Server applies OT against all operations that happened during partition
|
||||
4. Client receives transformed operations and applies
|
||||
|
||||
## Performance Characteristics
|
||||
|
||||
### Time Complexity
|
||||
|
||||
- **Single operation**: O(1) for most operations
|
||||
- **Transformation**: O(n) where n is operation size
|
||||
- **Conflict resolution**: O(m × n) where m is the number of concurrent operations and n is the operation size
|
||||
|
||||
### Space Complexity
|
||||
|
||||
- **Version history**: Grows with number of changes
|
||||
- **Cursors**: O(clients × vaults)
|
||||
- **Active operations**: Minimal (processed in real-time)
|
||||
|
||||
### Optimization
|
||||
|
||||
VaultLink optimizes for:
|
||||
- Small, frequent edits (typical typing patterns)
|
||||
- Text documents (not binary files)
|
||||
- Real-time processing (no batching delay)
|
||||
|
||||
## Limitations
|
||||
|
||||
### Binary Files
|
||||
|
||||
OT works best for text files. Binary files:
|
||||
- Cannot be meaningfully merged
|
||||
- Use last-write-wins strategy
|
||||
- May cause data loss on concurrent edits
|
||||
|
||||
**Workaround**: Avoid concurrent edits to binary files, or use versioning.
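
To show why last-write-wins can lose data, here is a minimal sketch. The `BinaryVersion` type, its fields, and the tie-break rule are assumptions for illustration and are not taken from VaultLink's implementation.

```rust
/// Hypothetical snapshot of one client's version of a binary file.
#[derive(Debug, Clone, PartialEq)]
struct BinaryVersion {
    modified_ms: u64, // last-modified time in milliseconds (assumed field)
    client_id: u32,   // used only to break timestamp ties (assumed field)
    bytes: Vec<u8>,
}

/// Last-write-wins: keeps the version with the later timestamp.
/// The other version is discarded entirely, which is where data loss occurs.
/// Ties break on the higher client ID so every replica picks the same winner.
fn last_write_wins(a: BinaryVersion, b: BinaryVersion) -> BinaryVersion {
    if (a.modified_ms, a.client_id) >= (b.modified_ms, b.client_id) {
        a
    } else {
        b
    }
}
```

Unlike the text merge paths, nothing from the losing version survives, so two concurrent edits to the same image or PDF leave only one of them.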
|
||||
|
||||
### Large Documents
|
||||
|
||||
Very large documents (> 1 MB) may have:
|
||||
- Higher transformation costs
|
||||
- Slower sync times
|
||||
- Increased memory usage
|
||||
|
||||
**Workaround**: Split large documents or increase timeout settings.
|
||||
|
||||
### Complex Formatting
|
||||
|
||||
Markdown with complex structures may occasionally produce unexpected results:
|
||||
- Nested lists
|
||||
- Tables
|
||||
- Code blocks
|
||||
|
||||
**Workaround**: Manual cleanup if needed, or minimize concurrent edits to complex structures.
|
||||
|
||||
## Consistency Guarantees
|
||||
|
||||
### Strong Eventual Consistency
|
||||
|
||||
VaultLink provides **strong eventual consistency**:
|
||||
- All clients eventually converge to the same state
- Operations are applied in causal order
- No data loss for text files under normal operation (binary files use last-write-wins; see Limitations)
|
||||
|
||||
### Ordering Guarantees
|
||||
|
||||
- Operations from the same client are applied in order
- Concurrent operations from different clients may be applied in any order
- After transformation, the final result is independent of the order in which concurrent operations were applied (convergence)
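
The order-independence property can be checked with a small sketch. It covers concurrent inserts only and uses hypothetical names (`Insert`, `apply`, `transform`, `converges`); positions are byte offsets and are assumed to fall on character boundaries.

```rust
/// Hypothetical insert operation tagged with the client that produced it.
#[derive(Debug, Clone)]
struct Insert {
    pos: usize,
    text: String,
    client_id: u32,
}

/// Applies an insert to a document and returns the new document.
fn apply(doc: &str, op: &Insert) -> String {
    let mut out = String::from(doc);
    out.insert_str(op.pos, &op.text);
    out
}

/// Returns `op` rewritten so that it can be applied after `other`.
/// `other` goes first if it is earlier, or on a tie if its client ID is lower.
fn transform(op: &Insert, other: &Insert) -> Insert {
    let other_first = other.pos < op.pos
        || (other.pos == op.pos && other.client_id < op.client_id);
    let pos = if other_first { op.pos + other.text.len() } else { op.pos };
    Insert { pos, ..op.clone() }
}

/// Checks that applying the two operations in either order gives the same text.
fn converges(doc: &str, a: &Insert, b: &Insert) -> bool {
    let a_then_b = apply(&apply(doc, a), &transform(b, a));
    let b_then_a = apply(&apply(doc, b), &transform(a, b));
    a_then_b == b_then_a
}
```

With `"abc"` as the base and two clients inserting `"X"` and `"Y"` at position 1, both application orders produce `"aXYbc"` because the tie is broken by client ID.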
|
||||
|
||||
### Durability
|
||||
|
||||
- Operations are written to SQLite before acknowledgment
|
||||
- SQLite ACID guarantees protect against data loss
|
||||
- Clients retry failed uploads
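
The client-side retry could be sketched as below. The function name, the error type, and the attempt limit are assumptions; the source does not describe VaultLink's actual retry policy.

```rust
/// Calls `upload` until it succeeds or `max_attempts` is reached.
/// Returns the number of attempts used on success, or the last error.
fn upload_with_retry<F>(mut upload: F, max_attempts: u32) -> Result<u32, String>
where
    F: FnMut() -> Result<(), String>,
{
    let mut last_err = String::from("no attempts made");
    for attempt in 1..=max_attempts {
        match upload() {
            // The server acknowledged, so the operation is assumed durable.
            Ok(()) => return Ok(attempt),
            Err(e) => last_err = e,
        }
        // A real client would wait here, for example with exponential backoff.
    }
    Err(last_err)
}
```

Because the server acknowledges only after the write is persisted, a client that retries until it sees an acknowledgment can treat the operation as stored.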
|
||||
|
||||
## Comparison with Other Approaches
|
||||
|
||||
### Git-style Merging
|
||||
|
||||
| Aspect | Git Merge | VaultLink OT |
|--------|-----------|--------------|
| Real-time | No | Yes |
| Manual conflict resolution | Yes | No |
| Branching | Yes | No |
| Automatic merge | Limited | Always |
| Use case | Code changes | Collaborative documents |
|
||||
|
||||
### CRDTs (Conflict-free Replicated Data Types)
|
||||
|
||||
| Aspect | CRDTs | VaultLink OT |
|--------|-------|--------------|
| Server required | No | Yes |
| Memory overhead | Higher | Lower |
| Complexity | Higher | Lower |
| Deletion handling | Complex (tombstones) | Simple |
| Best for | Distributed systems | Centralized sync |
|
||||
|
||||
### Last Write Wins
|
||||
|
||||
| Aspect | LWW | VaultLink OT |
|--------|-----|--------------|
| Data loss | Yes | No |
| Simplicity | High | Medium |
| User experience | Poor | Excellent |
| Performance | Best | Good |
|
||||
|
||||
## Algorithm Details
|
||||
|
||||
### Transformation Rules
|
||||
|
||||
When transforming operation `A` against operation `B`:
|
||||
|
||||
1. **Insert vs Insert**:
   - If positions equal: Order by client ID
   - If different positions: Adjust positions

2. **Insert vs Delete**:
   - If insert in deleted range: Shift insert position
   - If insert after delete: Adjust position by deleted length

3. **Delete vs Delete**:
   - If ranges overlap: Merge delete ranges
   - If ranges disjoint: Adjust positions

4. **Retain vs Any**:
   - Retain operations don't conflict
   - Simply adjust positions
|
||||
|
||||
### Transformation Example
|
||||
|
||||
```rust
// Simplified sketch of the transformation (Insert vs Insert case only).
// Client IDs are passed explicitly here for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Operation {
    Insert(usize, String), // (position, text)
    // ... other variants
}
use Operation::Insert;

fn transform(
    op_a: Operation,
    op_b: Operation,
    client_id_a: u32,
    client_id_b: u32,
) -> (Operation, Operation) {
    match (op_a, op_b) {
        (Insert(pos_a, text_a), Insert(pos_b, text_b)) => {
            // A keeps its position if it comes first; on a tie, the lower client ID wins
            let a_first = pos_a < pos_b || (pos_a == pos_b && client_id_a < client_id_b);
            if a_first {
                // Shift B right by the length of A's text
                let shifted = pos_b + text_a.len();
                (Insert(pos_a, text_a), Insert(shifted, text_b))
            } else {
                // Shift A right by the length of B's text
                let shifted = pos_a + text_b.len();
                (Insert(shifted, text_a), Insert(pos_b, text_b))
            }
        }
        // ... other cases
    }
}
```
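
The Insert vs Delete rule can be sketched in the same spirit. This is an illustration, not VaultLink's code: the function name is hypothetical, and collapsing an insert inside the deleted range to the start of that range is an assumption, since the rule above only says the position is shifted.

```rust
/// Returns the insert position after a concurrent delete of `del_len` bytes
/// starting at `del_start` has been applied.
fn transform_insert_against_delete(ins_pos: usize, del_start: usize, del_len: usize) -> usize {
    let del_end = del_start + del_len;
    if ins_pos <= del_start {
        // Insert is before the deleted range: position is unchanged.
        ins_pos
    } else if ins_pos >= del_end {
        // Insert is after the deleted range: shift left by the deleted length.
        ins_pos - del_len
    } else {
        // Insert is inside the deleted range: move it to the range start
        // (assumed behavior for this sketch).
        del_start
    }
}
```

For a delete of 3 bytes starting at position 5, an insert at 2 stays at 2, an insert at 10 moves to 7, and an insert at 6 moves to 5.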
|
||||
|
||||
## Best Practices
|
||||
|
||||
### For Smooth Collaboration
|
||||
|
||||
1. **Small edits**: Make small, focused changes for easier merging
|
||||
2. **Coordinate major changes**: Discuss large refactors with team
|
||||
3. **Monitor sync status**: Ensure changes are uploaded before signing off
|
||||
4. **Test conflict resolution**: Verify behavior matches expectations
|
||||
|
||||
### For Developers
|
||||
|
||||
1. **Text files preferred**: OT works best on text
|
||||
2. **Limit file sizes**: Keep documents reasonably sized
|
||||
3. **Binary files**: Use versioning or avoid concurrent edits
|
||||
4. **Testing**: Test concurrent edit scenarios thoroughly
|
||||
|
||||
## Further Reading
|
||||
|
||||
- [reconcile-text library](https://crates.io/crates/reconcile-text)
|
||||
- [Operational transformation (Wikipedia)](https://en.wikipedia.org/wiki/Operational_transformation)
|
||||
- [Data flow architecture →](/architecture/data-flow)
|
||||