Andras Schmelczer 2025-11-22 19:41:24 +00:00
parent 511ac78e6d
commit d590a2c9c8
8 changed files with 261 additions and 30 deletions


@@ -18,6 +18,7 @@ export default defineConfig({
         items: [
           { text: "What is VaultLink?", link: "/guide/what-is-vaultlink" },
           { text: "Getting Started", link: "/guide/getting-started" },
+          { text: "Limitations", link: "/guide/limitations" },
           { text: "Comparison with Alternatives", link: "/guide/alternatives" }
         ]
       },


@@ -60,19 +60,27 @@ For note-taking workflows where users value editor freedom and offline editing,
 ### How It Works
-Given a base document and two sets of changes, OT produces a merged result that includes both changes.
+Given three versions (parent, left, right), reconcile-text produces a merged result.
+**How reconcile-text works**:
+1. **Tokenisation**: Split text into words (using `BuiltinTokenizer::Word`)
+2. **Three-way diff**: Compare parent→left and parent→right changes
+3. **Merge**: Combine non-conflicting changes, prefer content preservation for conflicts
+4. **Result**: Merged text with both edits applied
 **Example**:
 ```
-Base document: "Hello world"
-User A: "Hello beautiful world" (inserts "beautiful ")
-User B: "Hello world!" (inserts "!")
-OT result: "Hello beautiful world!" (both changes applied)
+Parent: "The quick brown fox"
+User A: "The quick red fox" (changes "brown" → "red")
+User B: "The very quick brown fox" (inserts "very ")
+Merged: "The very quick red fox" (both changes applied)
 ```
+**Merge conditions**: Only `.md` and `.txt` files with valid UTF-8 get merged. Binary files or other extensions use last-write-wins.
 ### Operation Types
 The algorithm handles these operations:
@@ -263,15 +271,25 @@ VaultLink optimises for:
 ## Limitations
-### Binary Files
-OT works best for text files. Binary files:
-- Cannot be meaningfully merged
-- Use last-write-wins strategy
-- May cause data loss on concurrent edits
-**Workaround**: Avoid concurrent edits to binary files, or use versioning.
+### Binary and Non-Mergeable Files
+Only **`.md`** and **`.txt`** files get automatic merging. Everything else uses last-write-wins.
+**Binary detection**:
+- Files with NUL bytes (`0x00`)
+- Files failing UTF-8 validation
+Even `.md` files are treated as binary if they fail UTF-8 checks.
+**Last-write-wins behaviour**:
+```
+User A uploads image.png → Server version 1
+User B uploads image.png → Server version 2 (A's upload lost)
+```
+**Workaround**: Avoid concurrent edits to non-text files. [See all limitations →](/guide/limitations)
 ### Large Documents


@@ -55,26 +55,37 @@ done
 ### Version History Cleanup
-To limit database growth, implement version history pruning (requires custom script):
+VaultLink stores **all versions indefinitely** by default. Database grows with every change.
+**Database schema**: Each version stored in `documents` table with `vault_update_id` (sequential).
+Manual cleanup (keep last 100 versions per document):
 ```bash
 #!/bin/bash
 # prune-old-versions.sh
-# Keep only last 100 versions per document
 for db in databases/*.db; do
   sqlite3 "$db" <<EOF
-DELETE FROM versions
-WHERE id NOT IN (
-  SELECT id FROM versions
-  WHERE document_id = versions.document_id
-  ORDER BY version DESC
+DELETE FROM documents
+WHERE vault_update_id NOT IN (
+  SELECT vault_update_id FROM documents d2
+  WHERE d2.document_id = documents.document_id
+  ORDER BY vault_update_id DESC
   LIMIT 100
 );
 EOF
 done
 ```
+**Warning**: This deletes old versions permanently. No undo.
+Run monthly via cron:
+```bash
+0 3 1 * * /opt/vaultlink/prune-old-versions.sh
+```
 ## Performance Tuning
 ### Connection Pool Sizing
@@ -186,9 +197,9 @@ server {
         proxy_pass http://vaultlink;
     }
-    # Health check endpoint
+    # Health check endpoint (use any vault name)
     location /health {
-        proxy_pass http://vaultlink/vaults/health/ping;
+        proxy_pass http://vaultlink/vaults/test/ping;
         access_log off;
     }
 }
@@ -320,7 +331,7 @@ services:
   vaultlink-server:
     image: ghcr.io/schmelczer/vault-link-server:latest
     healthcheck:
-      test: ["CMD-SHELL", "curl -f http://localhost:3000/vaults/health/ping || exit 1"]
+      test: ["CMD-SHELL", "curl -f http://localhost:3000/vaults/test/ping || exit 1"]
       interval: 10s
       timeout: 5s
       retries: 3
@@ -334,7 +345,7 @@ Monitor health in production:
 # health-monitor.sh
 while true; do
-  if ! curl -sf http://localhost:3000/vaults/health/ping > /dev/null; then
+  if ! curl -sf http://localhost:3000/vaults/test/ping > /dev/null; then
     echo "Health check failed at $(date)" | mail -s "VaultLink Down" admin@example.com
     # Optionally restart
     # docker restart vaultlink-server


@@ -49,7 +49,7 @@ docker run -d \
   /app/sync_server /data/config.yml
 ```
-Verify: `curl http://localhost:3000/vaults/test/ping` should return `pong`
+Verify: `curl http://localhost:3000/vaults/test/ping` should return server version and auth status
 ## Step 2: Connect Client
@@ -114,10 +114,12 @@ users:
 **Client can't connect**:
-1. Verify: `curl http://your-server:3000/vaults/test/ping`
+1. Verify server: `curl http://your-server:3000/vaults/test/ping`
 2. Check URL: `ws://` for HTTP, `wss://` for HTTPS
 3. Verify token matches config.yml
+**Understanding limitations**: [See what VaultLink can and can't do →](/guide/limitations)
 **Files not syncing**: Check client logs, verify vault name matches
 [Server setup →](/guide/server-setup) | [Architecture →](/architecture/)

docs/guide/limitations.md (new file, 192 lines)

@@ -0,0 +1,192 @@
# Limitations
VaultLink works well for most Obsidian vaults, but has some constraints you should know about.
## File Type Limitations
### Mergeable Files
Only **`.md`** and **`.txt`** files get automatic conflict-free merging.
Other file types (images, PDFs, etc.) use last-write-wins:
```
User A updates diagram.png → Server stores version 1
User B updates diagram.png → Server stores version 2 (overwrites A's changes)
```
**Workaround**: Avoid editing the same non-text file simultaneously.
### Binary Detection
Files are treated as binary if they:
- Contain NUL bytes (`0x00`)
- Fail UTF-8 validation
Files with a `.md` or `.txt` extension that are detected as binary still use last-write-wins (no merge).
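
The two checks above, combined with the extension rule, can be sketched as follows. This is a minimal illustration in Python, not VaultLink's actual implementation; the function name `is_mergeable` and the `MERGEABLE_SUFFIXES` constant are hypothetical, and how the server treats upper-case extensions is not specified here, so the sketch matches them exactly.

```python
from pathlib import PurePosixPath

# Assumption for illustration: only these extensions are eligible for merging.
MERGEABLE_SUFFIXES = {".md", ".txt"}


def is_mergeable(path: str, data: bytes) -> bool:
    """Return True if the file would be merged, False if last-write-wins applies."""
    if PurePosixPath(path).suffix not in MERGEABLE_SUFFIXES:
        return False
    if b"\x00" in data:  # a NUL byte marks the content as binary
        return False
    try:
        data.decode("utf-8")  # strict decoding doubles as UTF-8 validation
    except UnicodeDecodeError:
        return False
    return True


print(is_mergeable("notes/todo.md", "héllo".encode("utf-8")))  # True
print(is_mergeable("broken.md", b"abc\x00def"))                # False
```

The NUL check is separate from UTF-8 validation because a NUL byte is itself valid UTF-8.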
## Performance Constraints
### Server Limits (Configurable)
| Resource | Default | Maximum Tested |
| ------------------------ | ------- | -------------- |
| Clients per vault | 256 | ~256 |
| Database connections | 12 | 20 |
| Max file size | 512 MB | 4096 MB |
| Request timeout | 60s | 180s |
| WebSocket cursor timeout | 60s | 300s |
| Database busy timeout | 3600s | - |
### Vault Size
- **Small vaults** (< 1000 files): Excellent performance
- **Medium vaults** (1000-10000 files): Good performance
- **Large vaults** (> 10000 files): Works, but initial sync slower
No hard file count limit—constrained by disk space and sync time.
### Resource Usage
Rough estimates (varies by vault size and activity):
- **RAM**: ~50-200 MB base + ~1-5 MB per active client
- **CPU**: Low (< 5%) for typical usage, spikes during merges
- **Disk**: Vault size + version history (grows over time)
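
As a rough worked example of the RAM figures above, the range can be computed from the number of active clients. The function name `ram_estimate_mb` is hypothetical, and the numbers are only the rough estimates quoted in the list, not measurements.

```python
def ram_estimate_mb(active_clients: int) -> tuple[int, int]:
    """Return a (low, high) RAM estimate in MB from the rough figures above."""
    low = 50 + 1 * active_clients    # ~50 MB base, ~1 MB per active client
    high = 200 + 5 * active_clients  # ~200 MB base, ~5 MB per active client
    return low, high


print(ram_estimate_mb(20))   # (70, 300)
print(ram_estimate_mb(256))  # (306, 1480)
```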
## Version History
### Storage
- All versions stored indefinitely (no automatic cleanup)
- Each vault is a separate SQLite database
- Deleted files marked as deleted (not purged)
**Growth**: Version history grows with every change. A 10 MB vault with frequent edits might grow to 100+ MB over months.
**Cleanup**: Manual only (see [Advanced Configuration](/config/advanced#version-history-cleanup)).
### Implications
- Disk usage grows over time
- Database size affects backup time
- No built-in retention policy
## Merge Quality
### Text Merging
VaultLink uses word-level tokenisation for merging:
```markdown
Parent: "The quick brown fox"
User A: "The quick red fox"
User B: "The very quick brown fox"
Result: "The very quick red fox" ← Both changes preserved
```
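
The example above can be reproduced with a small word-level three-way merge. This is a simplified sketch in Python built on the standard library's `difflib`, not the reconcile-text algorithm itself; the helper names `tokenize`, `side_edits` and `merge` are hypothetical. It assumes whitespace runs are kept as separate tokens, and when both sides change the same token it keeps both versions, left first.

```python
import re
from difflib import SequenceMatcher


def tokenize(text: str) -> list[str]:
    # Words and whitespace runs become separate tokens, so "".join() restores the text.
    return re.findall(r"\s+|\S+", text)


def side_edits(parent: list[str], other: list[str]):
    """Describe `other` as edits against `parent`.

    inserts[i]  = tokens inserted before parent[i]
    replaced[i] = tokens that replace parent[i] (an empty list means deleted)
    """
    inserts: dict[int, list[str]] = {}
    replaced: dict[int, list[str]] = {}
    matcher = SequenceMatcher(a=parent, b=other, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            inserts[i1] = other[j1:j2]
        elif tag in ("replace", "delete"):
            replaced[i1] = other[j1:j2]  # new tokens attach to the first old token
            for i in range(i1 + 1, i2):
                replaced[i] = []
    return inserts, replaced


def merge(parent: str, left: str, right: str) -> str:
    p = tokenize(parent)
    l_ins, l_rep = side_edits(p, tokenize(left))
    r_ins, r_rep = side_edits(p, tokenize(right))
    out: list[str] = []
    for i in range(len(p) + 1):
        li, ri = l_ins.get(i, []), r_ins.get(i, [])
        out += li
        if ri != li:  # identical insertions are emitted once
            out += ri
        if i == len(p):
            break
        if i in l_rep and i in r_rep:
            out += l_rep[i]
            if r_rep[i] != l_rep[i]:  # conflict: keep both sides
                out += r_rep[i]
        elif i in l_rep:
            out += l_rep[i]
        elif i in r_rep:
            out += r_rep[i]
        else:
            out.append(p[i])
    return "".join(out)


print(merge("The quick brown fox", "The quick red fox", "The very quick brown fox"))
# The very quick red fox
```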
**Imperfect scenarios**:
- Complex nested Markdown (tables, code blocks)
- Simultaneous edits to the same sentence
- Large structural changes (moving sections around)
**Result**: Merged file might need manual cleanup in ~1-5% of concurrent edits.
## Scalability
### SQLite Limitations
- One SQLite database per vault
- Single-server architecture (no built-in clustering)
- Write serialisation through database
**For high concurrency**: Consider multiple vaults instead of one massive shared vault.
### Horizontal Scaling
Not currently supported. Running multiple servers requires manual vault partitioning.
## Network Requirements
### Latency
- Real-time sync typically < 500ms on good connections
- Mobile/slow networks: 1-5s latency possible
- Requests that exceed the 60s default timeout fail on very slow connections
### Offline Behaviour
- Clients queue changes locally
- On reconnect, sync all changes since last connection
- Conflicts resolved automatically (for mergeable files)
**Limitation**: No offline conflict preview—merged result appears after reconnect.
## Security
### No End-to-End Encryption
- Server sees all file contents
- Transport encryption only (WSS/TLS)
- Trust your server
**Workaround**: Self-host on infrastructure you control.
### Authentication
- Token-based only (no OAuth, SAML, etc.)
- Tokens configured in server config file
- No runtime user management
## Known Edge Cases
### Simultaneous Deletes and Edits
```
User A deletes note.md
User B edits note.md
Result: Edit wins (file recreated with B's content)
```
Operational transformation prioritises content preservation.
### Large File Uploads
Files > 100 MB may time out on slow connections. Increase `response_timeout_seconds` or split large files.
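
A quick estimate shows why the timeout matters. The sketch below computes an ideal transfer time; the function name `transfer_seconds` is hypothetical, sizes are taken as megabytes of 10^6 bytes, and it assumes the 60s default request timeout from the server limits table is the value `response_timeout_seconds` controls. Real transfers are slower because of protocol overhead.

```python
def transfer_seconds(size_mb: float, uplink_mbit_per_s: float) -> float:
    """Ideal transfer time for size_mb megabytes over a link in megabits per second."""
    return size_mb * 8 / uplink_mbit_per_s


DEFAULT_TIMEOUT_S = 60  # assumed default, taken from the server limits table

print(transfer_seconds(100, 10))  # 80.0, longer than the 60 s default
print(transfer_seconds(100, 50))  # 16.0, well within the default
```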
### Mobile Sync
- Mobile networks may drop WebSocket connections frequently
- Client auto-reconnects, but causes sync delays
- Battery impact from constant reconnections
## What VaultLink is NOT
- **Not a backup solution**: Version history helps but isn't a backup (make backups!)
- **Not Git**: No branching, no commit messages, no diffs to review before merge
- **Not encrypted storage**: Server sees everything
- **Not multi-master**: One server, multiple clients (not peer-to-peer)
## Recommendations
### Good Use Cases
- Personal multi-device sync (< 10 devices)
- Small team collaboration (< 20 people)
- Primarily text/Markdown content
- Trusted server environment
### Poor Use Cases
- Large teams (> 50 concurrent users per vault)
- Primarily binary files (images, videos, large PDFs)
- Untrusted server (need E2E encryption)
- Highly regulated environments (HIPAA, etc.)
## Next Steps
- [Server configuration limits →](/config/server)
- [Advanced tuning →](/config/advanced)
- [Architecture details →](/architecture/)


@@ -280,10 +280,15 @@ Run daily via cron:
 The server exposes a ping endpoint:
 ```bash
-curl http://localhost:3000/vaults/fake/ping
-# Returns: pong
+curl http://localhost:3000/vaults/test/ping
+# Returns: {"server_version":"0.10.1","is_authenticated":false}
 ```
+Replace `test` with any vault name. The endpoint returns:
+- `server_version`: Current server version
+- `is_authenticated`: Whether the request included a valid token
 Docker health check is built-in and checks this endpoint every 30 seconds.
 #### Prometheus Metrics


@@ -13,9 +13,11 @@ Syncing Obsidian vaults across devices or sharing with teammates sucks:
 ## VaultLink's Solution
-Differential synchronisation with operational transformation.
-Edit files with Obsidian, Vim, VS Code, or any editor. VaultLink compares versions and automatically merges all changes. No operation tracking required, no conflict markers, no data loss.
+Differential synchronisation with operational transformation for Markdown and text files.
+Edit `.md` and `.txt` files with Obsidian, Vim, VS Code, or any editor. VaultLink compares versions and automatically merges all changes. No operation tracking required, no conflict markers.
+**Note**: Binary files (images, PDFs, etc.) use last-write-wins. [See limitations →](/guide/limitations)
 ## How It Works


@@ -37,7 +37,7 @@ features:
 **Edit with any tool.** Other solutions require CRDT-aware editors or break when you edit outside Obsidian. VaultLink uses differential sync: edit files however you want, sync handles the rest.
-**No conflict markers.** Git forces manual merging. Other tools use last-write-wins. VaultLink's operational transformation automatically merges all changes without data loss or workflow interruption.
+**No conflict markers.** Git forces manual merging. Other tools use last-write-wins. VaultLink's operational transformation automatically merges Markdown and text files without conflict markers or workflow interruption. [See what's supported →](/guide/limitations)
 [See how VaultLink compares to alternatives →](/guide/alternatives)