# Performance
VecLabs achieves a 4.7ms p99 at 100K vectors, 1536 dimensions (OpenAI ada-002 size). This page explains what makes that number possible and how it compares to alternatives.

## Benchmark results
Measured on Apple M3, 100K vectors, 1536 dimensions, cosine similarity, top-10 ANN, 1,000 samples:

| Percentile | VecLabs | Pinecone s1 | Qdrant |
|---|---|---|---|
| p50 | 2.995ms | ~10ms | ~6ms |
| p95 | 3.854ms | ~20ms | ~12ms |
| p99 | 4.688ms | ~30ms | ~18ms |
| p99.9 | 5.674ms | ~50ms | ~30ms |
Reproduce with:

```shell
cargo run --release --example percentile_bench -p solvec-core
```
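The percentiles in the table above can be computed from raw latency samples with a simple nearest-rank calculation. The sketch below is illustrative only — `percentile` and the synthetic samples are not part of the VecLabs codebase:

```rust
// Nearest-rank percentile over an already-sorted sample set (latencies in ms).
// Hypothetical helper for illustration; not the percentile_bench source.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.saturating_sub(1).min(sorted.len() - 1)]
}

fn main() {
    // 1,000 synthetic latency samples, then sorted ascending.
    let mut samples: Vec<f64> = (1..=1000).map(|i| i as f64 / 200.0).collect();
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    println!("p50 = {:.3}ms", percentile(&samples, 50.0));
    println!("p99 = {:.3}ms", percentile(&samples, 99.0));
}
```

Note that p99 over 1,000 samples rests on only the 10 slowest queries, which is why tail percentiles need large sample counts to be stable.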
## Why VecLabs is faster
1. No network round-trip

Pinecone, Qdrant Cloud, and Weaviate Cloud all require a network request for every query. That round-trip adds 5-50ms depending on your network and region. VecLabs runs the HNSW index in-process: the query never leaves your application’s memory.

2. No garbage collector

Python (hnswlib, Chroma) and Go (Weaviate) have garbage collectors that cause unpredictable pauses. These pauses show up at p99 and p99.9 as latency spikes. Rust has no GC, so there are no pauses.

3. Zero-copy query path

Vectors are stored in memory as native `f32` arrays. There is no serialization, deserialization, or copying on the query hot path. The distance computation accesses memory directly.
4. Cache-optimized data layout
The HNSW graph structure and its associated vectors are laid out in memory to maximize CPU cache hits during graph traversal. The inner loop of nearest-neighbor search hits L1/L2 cache rather than main memory.
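The zero-copy point can be sketched in a few lines: the distance kernel borrows the query and the stored vector as `&[f32]` slices, so nothing is serialized or copied on the hot path. This is a minimal illustration of the idea, not the VecLabs source:

```rust
// Cosine similarity over borrowed f32 slices. Both arguments are views into
// existing memory; the loop reads them directly with no intermediate copies.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let (mut dot, mut na, mut nb) = (0.0f32, 0.0f32, 0.0f32);
    for (&x, &y) in a.iter().zip(b) {
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    dot / (na.sqrt() * nb.sqrt())
}

fn main() {
    // Stored vectors live in contiguous memory; a query only borrows them.
    let stored: Vec<f32> = vec![1.0, 0.0, 0.0];
    let query: Vec<f32> = vec![1.0, 1.0, 0.0];
    println!("{:.4}", cosine_similarity(&stored, &query)); // ~0.7071
}
```

Contiguous `f32` storage is also what makes the cache-friendly layout possible: graph traversal touches vectors that are adjacent in memory, so the inner loop stays in L1/L2 cache.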
## Write vs query latency
These are independent and should not be confused:

| Operation | Latency | Blocking? |
|---|---|---|
| HNSW insert (upsert) | ~2ms | ✅ Blocks until done |
| Shadow Drive upload | ~500-2000ms | ❌ Async background |
| Solana Merkle root | ~400ms | ❌ Async background |
| HNSW query | 3-5ms p99 | ✅ Blocks until done |
| Verify (Solana RPC) | ~400ms | ✅ Blocks until done |
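The split in the table above — a fast blocking insert with slow durability work pushed off the write path — can be sketched with a background thread. `upsert` and `upload_snapshot` are hypothetical names for illustration, not the VecLabs API:

```rust
use std::thread;
use std::time::Duration;

// Synchronous insert: the caller waits until the in-memory index is updated
// (~2ms for a real HNSW insert). Returns the id of the new vector.
fn upsert(index: &mut Vec<Vec<f32>>, vector: Vec<f32>) -> usize {
    index.push(vector);
    index.len() - 1
}

// Stand-in for a slow durability step (e.g. a ~500-2000ms storage upload);
// it runs on a background thread and never blocks the write path.
fn upload_snapshot() {
    thread::sleep(Duration::from_millis(20));
}

fn main() {
    let mut index: Vec<Vec<f32>> = Vec::new();
    let id = upsert(&mut index, vec![0.1, 0.2, 0.3]); // blocks briefly
    let bg = thread::spawn(upload_snapshot); // async background work
    println!("inserted id {id} while upload runs in background");
    bg.join().unwrap(); // join only at shutdown or when durability is required
}
```

The consequence is that query latency stays in the 3-5ms range even while uploads and Merkle-root submissions are in flight, because those never hold a lock on the index.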