
Semantic Search

Keyword search finds documents containing the exact words you searched for. Semantic search finds documents that mean what you’re looking for - even if they use completely different words.
| Query | Keyword search finds | Semantic search finds |
|---|---|---|
| “payment failed” | Documents containing “payment” and “failed” | Documents about card declined, transaction error, billing issue, checkout problem |
| “dog” | Documents containing “dog” | Documents about dogs, puppies, canines, pets |
| “fix the bug” | Documents containing “fix” and “bug” | Documents about debugging, resolving issues, patching, troubleshooting |
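
One way to picture the difference in the table above: semantic search compares embedding vectors rather than words. The sketch below ranks three documents against the query “dog” using cosine similarity. The three-dimensional vectors are hand-made toy values chosen for illustration, not output from a real embedding model, and the names (`cosineSimilarity`, `ToyDoc`, `docs`) are hypothetical.

```typescript
// Toy illustration: keyword matching vs. ranking by cosine similarity.
// All vectors are invented for this example; real embeddings come from an
// embedding model and have hundreds or thousands of dimensions.

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat a zero vector as "no similarity"
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface ToyDoc { text: string; vector: number[] }

const docs: ToyDoc[] = [
  { text: 'Puppy training tips', vector: [0.9, 0.1, 0.0] },
  { text: 'Choosing a family pet', vector: [0.7, 0.3, 0.1] },
  { text: 'Quarterly tax filing guide', vector: [0.0, 0.1, 0.9] },
];

const queryText = 'dog';
const queryVector = [1.0, 0.0, 0.0]; // stands in for the embedding of "dog"

// Keyword search: only documents that literally contain the query word.
const keywordHits = docs.filter(doc => doc.text.toLowerCase().includes(queryText));

// Semantic search: every document gets a score, highest first.
const ranked = docs
  .map(doc => ({ text: doc.text, score: cosineSimilarity(queryVector, doc.vector) }))
  .sort((a, b) => b.score - a.score);

console.log('keyword hits:', keywordHits.length);
console.log(ranked);
```

None of the toy documents contains the word “dog”, so the keyword filter returns nothing, while the similarity ranking still places the puppy and pet documents ahead of the unrelated one.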

Common use cases:

  • Internal knowledge base search - employees searching documentation, policies, wikis
  • Customer support - matching customer questions to relevant help articles
  • E-commerce - matching queries like “comfortable shoes for standing all day” to relevant products
  • Legal/compliance - finding relevant precedents or clauses across large document sets
  • Code search - matching queries like “function that sorts a list” to relevant code snippets

Complete implementation

import { SolVec } from '@veclabs/solvec';

// embed() and batchEmbed() are not defined in this snippet. Supply your own
// helpers that call your embedding model and return 1536-dimensional vectors,
// matching the collection's `dimensions` setting. The declarations below only
// document the expected signatures.
declare function embed(text: string): Promise<number[]>;
declare function batchEmbed(texts: string[]): Promise<number[][]>;

const sv = new SolVec({ network: 'devnet' });
const collection = sv.collection('search-index', {
  dimensions: 1536,
  metric: 'cosine'
});

// Index your content
async function indexContent(items: Array<{ id: string; title: string; body: string; url: string }>) {
  // Embed title and body together so both contribute to the vector
  const texts = items.map(item => `${item.title}\n${item.body}`);
  const embeddings = await batchEmbed(texts);

  await collection.upsert(
    items.map((item, i) => ({
      id: item.id,
      values: embeddings[i],
      metadata: {
        title: item.title,
        body: item.body.slice(0, 500), // store excerpt
        url: item.url,
      }
    }))
  );
}

// Search
async function search(query: string, topK = 10) {
  const queryEmbedding = await embed(query);

  const results = await collection.query({
    vector: queryEmbedding,
    topK,
    minScore: 0.65, // tune for your use case
  });

  // Map raw matches to the fields a search UI needs
  return results.map(r => ({
    id: r.id,
    title: r.metadata.title,
    excerpt: r.metadata.body,
    url: r.metadata.url,
    relevanceScore: r.score,
  }));
}

// Usage
const results = await search('how to reset my password');
// Returns results about password reset, account recovery, and forgotten passwords,
// even when they don't share the query's exact wording
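
The `minScore` value in the search function above is marked as something to tune. One way to choose it is to measure precision and recall at a few candidate thresholds over a small set of labeled queries. The sketch below is an illustration under assumptions: `LabeledQuery`, `evaluateThreshold`, and the sample scores are hypothetical, and it expects that you have already collected unfiltered results (`id` and `score`) for each test query, for example by running the search with a very low threshold.

```typescript
// Sketch: compare candidate minScore values on a labeled test set.
// The data below is made up; replace it with real queries and scores.

interface ScoredResult { id: string; score: number }

interface LabeledQuery {
  query: string;
  relevantIds: string[];   // IDs a human judged relevant for this query
  results: ScoredResult[]; // unfiltered results, one entry per document ID
}

interface ThresholdMetrics { minScore: number; precision: number; recall: number }

// Micro-averaged precision and recall across all queries at one threshold.
function evaluateThreshold(cases: LabeledQuery[], minScore: number): ThresholdMetrics {
  let returned = 0;
  let returnedRelevant = 0;
  let totalRelevant = 0;

  for (const c of cases) {
    const relevant = new Set(c.relevantIds);
    totalRelevant += relevant.size;
    for (const r of c.results) {
      if (r.score >= minScore) {
        returned++;
        if (relevant.has(r.id)) returnedRelevant++;
      }
    }
  }

  return {
    minScore,
    // Convention used here: with nothing returned (or nothing relevant), report 1.
    precision: returned === 0 ? 1 : returnedRelevant / returned,
    recall: totalRelevant === 0 ? 1 : returnedRelevant / totalRelevant,
  };
}

const testSet: LabeledQuery[] = [
  {
    query: 'how to reset my password',
    relevantIds: ['a', 'b'],
    results: [
      { id: 'a', score: 0.82 },
      { id: 'b', score: 0.70 },
      { id: 'c', score: 0.66 },
      { id: 'd', score: 0.40 },
    ],
  },
  {
    query: 'refund policy',
    relevantIds: ['e'],
    results: [
      { id: 'e', score: 0.78 },
      { id: 'f', score: 0.68 },
    ],
  },
];

const report = [0.65, 0.75].map(t => evaluateThreshold(testSet, t));
console.table(report);
```

On this made-up data the lower threshold returns every relevant document along with some irrelevant ones, while the higher threshold returns only relevant documents but misses one. That is the trade-off to weigh when picking a value for your application.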


Improving search quality

  • Embed title + body together - concatenate the title and body before embedding. The title often contains keywords that improve retrieval without losing the semantic content of the body.
  • Tune minScore - start at 0.65 for broad search, increase to 0.75+ for more precise applications. Evaluate on a test set of real queries.
  • Index at the right granularity - for long documents, chunk and index each chunk separately. For short items like product descriptions, index the whole thing.
  • Store enough metadata - store the original text (or a useful excerpt), title, URL, and any fields you’ll want to display in search results. You can’t retrieve fields you didn’t store.
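
The granularity tip can be sketched as a small chunker that splits a long body into overlapping word windows and gives each chunk its own ID, so it can be indexed as a separate item. This is one simple strategy among several (splitting on headings or sentences is also common). `chunkText`, `toChunkItems`, the `#chunk-N` ID scheme, and the default sizes are assumptions for illustration, not part of the SolVec SDK.

```typescript
// Sketch: split long documents into overlapping word windows before indexing.

interface Item { id: string; title: string; body: string; url: string }

function chunkText(text: string, maxWords = 200, overlap = 40): string[] {
  if (maxWords <= 0 || overlap < 0 || overlap >= maxWords) {
    throw new Error('Require maxWords > 0 and 0 <= overlap < maxWords');
  }
  const words = text.split(/\s+/).filter(w => w.length > 0);
  if (words.length === 0) return [];

  const chunks: string[] = [];
  const step = maxWords - overlap; // how far each window advances
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + maxWords).join(' '));
    if (start + maxWords >= words.length) break; // this window reached the end
  }
  return chunks;
}

// Short items are returned unchanged; long ones become one item per chunk.
function toChunkItems(item: Item, maxWords = 200, overlap = 40): Item[] {
  const chunks = chunkText(item.body, maxWords, overlap);
  if (chunks.length <= 1) return [item];
  return chunks.map((chunk, i) => ({
    id: `${item.id}#chunk-${i}`, // hypothetical ID scheme
    title: item.title,
    body: chunk,
    url: item.url,
  }));
}

// Example with tiny window sizes so the behavior is easy to see.
const sample: Item = {
  id: 'doc-1',
  title: 'Example',
  body: 'w0 w1 w2 w3 w4 w5 w6 w7 w8 w9',
  url: 'https://example.com/doc-1',
};
const chunked = toChunkItems(sample, 4, 1);
console.log(chunked.map(c => `${c.id}: ${c.body}`));
```

The resulting items have the same shape as the input to `indexContent`, so they could be passed to it directly. Because every chunk keeps the parent's `url`, results can be grouped back by document at display time.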