Vector Database
Definition: A vector database is a specialized database designed to store, index, and search high-dimensional numeric vectors — typically embeddings produced by a neural network. Where a relational database answers "give me rows where status = 'active'," a vector database answers "give me the 10 items whose meaning is most similar to this one." That single primitive — approximate nearest-neighbor (ANN) search over dense vectors — powers semantic search, RAG, agentic RAG, recommendation systems, and every AI feature where "similar" matters more than "exact match."

The category exploded from a handful of research prototypes in 2018 to a crowded $2B+ infrastructure market by 2026. Pinecone, Weaviate, Qdrant, Milvus, Chroma, pgvector, MongoDB Atlas Vector Search, Redis Vector, and OpenSearch all compete for the same job: turn embeddings into retrievable memory for AI systems.

Why Vector Databases Exist

Embeddings represent meaning as a list of numbers. A 1,536-dimensional OpenAI embedding places the sentence "cat sat on the mat" at a specific coordinate in a 1,536-dimensional space where semantically similar sentences are nearby. The trouble is finding the nearest neighbors in 1,536 dimensions when you have 100 million vectors. A naive linear scan costs 100 million dot products per query — slow, expensive, and impossible in real time.
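To make the cost concrete, here is the naive linear scan in NumPy, at a toy scale (10,000 vectors rather than 100 million). The corpus and query are random stand-ins for real embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 1536, 10_000                     # dimensions, corpus size (tiny next to 100M)
corpus = rng.standard_normal((n, d)).astype(np.float32)
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)   # unit-normalize once

query = rng.standard_normal(d).astype(np.float32)
query /= np.linalg.norm(query)

# The naive scan described above: one dot product per stored vector,
# O(n * d) work for every single query.
scores = corpus @ query
top10 = np.argsort(-scores)[:10]
```

At n = 100 million this becomes hundreds of gigaflops per query, which is exactly the work an ANN index exists to avoid.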

Vector databases solve this with approximate nearest neighbor indexes. Instead of checking every vector, they build a structure that eliminates most candidates cheaply and only computes distances on a small shortlist. The tradeoff is recall (you might miss the absolute best match) for speed (queries land in milliseconds instead of seconds).

This tradeoff is the entire reason vector databases are a category. A relational database cannot do it. A graph database cannot do it. Even a full-text index (BM25) cannot, because semantic similarity is not keyword overlap. You need an index that respects geometry in continuous space.

How a Vector DB Fits in the Stack

[Diagram] Ingest path: Document → Chunk → Embed → Vector DB (HNSW / IVF). Query path: Question → Embed query → Top-k → Metadata filter → Rerank → LLM generates.

The vector DB is the memory layer. Embeddings go in, nearest-neighbor results come out. Everything else is plumbing around that single primitive.

Core Algorithms

Three index families dominate:

HNSW (Hierarchical Navigable Small World). A layered graph where each node is connected to its nearest neighbors. Search starts at the top layer (few nodes, long edges) and descends to the bottom (many nodes, short edges). Fast, high recall, memory-heavy. The default in Qdrant, Weaviate, pgvector, and most managed offerings. Taskade's semantic search uses HNSW with 1,536-dim embeddings for workspace content.
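HNSW's full layered structure does not fit in a few lines, but its core move, greedy descent on a proximity graph, does. The sketch below uses a single flat k-NN graph as a stand-in for HNSW's bottom layer, with brute-force graph construction purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 32, 10
pts = rng.standard_normal((n, d)).astype(np.float32)

# Stand-in for HNSW's layer 0: link each node to its k nearest neighbors
# (built by brute force here, which a real index would never do).
dmat = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
neighbors = np.argsort(dmat, axis=1)[:, 1:k + 1]        # column 0 is self

def greedy_search(query, entry=0):
    """Hop to whichever neighbor is closest to the query; stop at a local minimum."""
    cur = entry
    cur_d = float(np.linalg.norm(pts[cur] - query))
    while True:
        cand = neighbors[cur]
        cand_d = np.linalg.norm(pts[cand] - query, axis=1)
        best = int(np.argmin(cand_d))
        if cand_d[best] >= cur_d:
            return cur, cur_d           # no neighbor improves: approximate NN
        cur, cur_d = int(cand[best]), float(cand_d[best])

q = rng.standard_normal(d).astype(np.float32)
approx_id, approx_d = greedy_search(q)
exact_id = int(np.argmin(np.linalg.norm(pts - q, axis=1)))
```

HNSW's upper layers exist to give this greedy walk good entry points, and its `ef` beam width widens the walk to avoid exactly the local minima a single greedy path can fall into.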

IVF (Inverted File Index). Clusters vectors into partitions using k-means. At query time, search only the nearest k partitions. Memory-light, recall tunable by nprobe. The workhorse in FAISS and Milvus for billion-scale corpora.
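The IVF idea is compact enough to sketch in full: cluster, build inverted lists, then probe only a few partitions at query time. This is an illustrative NumPy sketch, not FAISS's implementation; the crude k-means loop stands in for proper index training:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, nlist, nprobe = 2000, 32, 16, 4
xs = rng.standard_normal((n, d)).astype(np.float32)

# Train: crude k-means (a few Lloyd iterations) to learn partition centroids.
centroids = xs[rng.choice(n, nlist, replace=False)].copy()
for _ in range(10):
    assign = np.argmin(((xs[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
    for c in range(nlist):
        if np.any(assign == c):
            centroids[c] = xs[assign == c].mean(axis=0)
assign = np.argmin(((xs[:, None] - centroids[None]) ** 2).sum(-1), axis=1)

# Inverted lists: partition id -> ids of member vectors.
lists = {c: np.where(assign == c)[0] for c in range(nlist)}

def ivf_search(q, k=5):
    """Probe only the nprobe nearest partitions instead of scanning all n vectors."""
    order = np.argsort(((centroids - q) ** 2).sum(-1))[:nprobe]
    cand = np.concatenate([lists[c] for c in order])
    dist = ((xs[cand] - q) ** 2).sum(-1)
    return cand[np.argsort(dist)[:k]]
```

With nprobe = 4 of 16 partitions, each query inspects roughly a quarter of the corpus; raising nprobe trades latency back for recall, which is exactly the tuning knob the article describes.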

Product Quantization (PQ). Compresses each vector into a compact code by splitting dimensions into sub-vectors and quantizing each. Combined with IVF (IVF-PQ) for disk-backed deployments where memory is the bottleneck.

Most production systems blend these: IVF-HNSW, IVF-PQ, HNSW + PQ compression. The knobs trade recall, latency, memory, and index build time against each other.
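The compression half of these blends can also be sketched. Below is a minimal product quantizer in NumPy (again illustrative, not FAISS's code): each 32-dim vector is split into 8 sub-vectors, each quantized against a 16-entry codebook, shrinking 128 bytes of floats to 8 one-byte codes:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m, ks = 1000, 32, 8, 16        # m sub-vectors, ks centroids per codebook
sub = d // m
xs = rng.standard_normal((n, d)).astype(np.float32)

# Train one small codebook per sub-space (crude k-means, a few iterations).
books = []
for j in range(m):
    chunk = xs[:, j * sub:(j + 1) * sub]
    cb = chunk[rng.choice(n, ks, replace=False)].copy()
    for _ in range(10):
        a = np.argmin(((chunk[:, None] - cb[None]) ** 2).sum(-1), axis=1)
        for c in range(ks):
            if np.any(a == c):
                cb[c] = chunk[a == c].mean(axis=0)
    books.append(cb)

# Encode: each vector becomes m one-byte codes instead of d floats.
codes = np.stack([
    np.argmin(((xs[:, j * sub:(j + 1) * sub][:, None] - books[j][None]) ** 2).sum(-1), axis=1)
    for j in range(m)
], axis=1).astype(np.uint8)

# Decode: approximate reconstruction from codebook entries.
decoded = np.concatenate([books[j][codes[:, j]] for j in range(m)], axis=1)
```

Here the compression is 16x at the cost of reconstruction error; production IVF-PQ computes distances directly on the codes via lookup tables rather than decoding.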

HNSW graph (simplified)

 Layer 2:   o─────────o──────────o    coarse, long edges
             \       / \        /
 Layer 1:   o─o───o─o───o────o─o      mid-grain
             \| \ / | \ / \  | |
 Layer 0:   o─o─o─o─o─o─o─o──o─o─o    dense, short edges
                     ^
                     query enters top, descends, lands near match

What a Vector Database Stores

A vector DB typically holds four things per record:

Field      Purpose
--------   ----------------------------------------------------------
ID         Primary key to look the record up
Vector     The dense embedding (e.g. 1,536 floats)
Metadata   Structured filters (date, tenant, document ID, category)
Payload    Raw text or a document pointer for generation

The metadata is critical. A pure similarity search returns "things most like this." Production systems almost always need "things most like this AND in this tenant AND more recent than Tuesday." Metadata filtering turns a vector DB from a toy into infrastructure.
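A minimal sketch of that record shape and a filtered query, using an in-memory list instead of a real index and a random-vector stand-in for the embedding model (all names here are hypothetical):

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Record:
    id: str
    vector: np.ndarray          # the dense embedding
    metadata: dict              # structured filters (tenant, year, ...)
    payload: str                # raw text for generation

rng = np.random.default_rng(0)
def embed(_text):               # stand-in for a real embedding model
    v = rng.standard_normal(8).astype(np.float32)
    return v / np.linalg.norm(v)

records = [
    Record("a1", embed("..."), {"tenant": "acme", "year": 2025}, "Q3 roadmap"),
    Record("a2", embed("..."), {"tenant": "acme", "year": 2023}, "old notes"),
    Record("b1", embed("..."), {"tenant": "globex", "year": 2025}, "other tenant"),
]

def search(query_vec, where, k=2):
    """Pre-filter on metadata, then rank the survivors by cosine similarity."""
    pool = [r for r in records if all(r.metadata.get(f) == v for f, v in where.items())]
    pool.sort(key=lambda r: float(r.vector @ query_vec), reverse=True)
    return pool[:k]

hits = search(embed("query"), where={"tenant": "acme"})
```

Real engines must decide whether to filter before, during, or after the ANN traversal; pre-filtering with a tiny candidate set can starve the index, which is why filtered ANN is a hard problem in its own right.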

Distance Metrics

Three distances cover 95% of use cases:

  • Cosine similarity — Angle between vectors. Scale-invariant. Default for text embeddings.
  • Dot product — Cosine with magnitude baked in. Cheaper to compute, and equivalent to cosine when embeddings are already normalized.
  • Euclidean (L2) — Straight-line distance. Used for image embeddings and some clustering tasks.

Match the metric to the embedding model. OpenAI's text-embedding-3-large is normalized; use cosine or dot product. Mismatching the metric silently degrades recall.
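The relationships between the three metrics are easy to verify directly. For unit-normalized vectors, dot product equals cosine and squared L2 is a monotone function of both, which is why the choice matters most for unnormalized embeddings:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])        # same direction, twice the magnitude

cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))  # 1.0: scale-invariant
dot = float(a @ b)                                               # 28.0: magnitude counts
l2 = float(np.linalg.norm(a - b))                                # straight-line distance

# After unit-normalization the metrics agree up to monotone transforms:
# dot == cosine, and ||a - b||^2 == 2 - 2 * (a . b).
an, bn = a / np.linalg.norm(a), b / np.linalg.norm(b)
```

Here cosine calls `a` and `b` identical while dot product and L2 see them as very different, which is exactly the mismatch that silently degrades recall when the metric and the model disagree.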

Typical Architecture

A production vector DB sits inside a larger retrieval pipeline:

  1. Ingest — Content is chunked (paragraphs, sentences), embedded by a model, and upserted with metadata.
  2. Index — The database builds or updates an ANN index in the background.
  3. Query — The user's question is embedded by the same model family, then used as the query vector.
  4. Filter — Metadata filters narrow to tenant, date range, document type.
  5. Search — ANN returns top-k candidates with scores.
  6. Rerank — A smaller model re-scores candidates for semantic fit.
  7. Generate — An LLM produces the final answer, grounded in the reranked chunks.

The same architecture underlies every RAG system, every agentic RAG pipeline, and the retrieval half of every AI agent that answers from a private corpus.
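The numbered steps above can be sketched end to end with toy stand-ins. The `embed` function here is a random-vector-per-token hack, not a real model, and a plain list stands in for the ANN index:

```python
import numpy as np

rng = np.random.default_rng(0)
_vocab = {}

def embed(text):
    """Toy stand-in for an embedding model: a fixed random vector per token, averaged."""
    vecs = []
    for tok in text.lower().split():
        if tok not in _vocab:
            _vocab[tok] = rng.standard_normal(64).astype(np.float32)
        vecs.append(_vocab[tok])
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)

# Steps 1-2: ingest + index (a list stands in for the ANN index).
docs = ["the cat sat on the mat", "quarterly revenue grew fast", "dogs and cats are pets"]
index = [(i, embed(c), c) for i, c in enumerate(docs)]

# Steps 3-5: embed the question with the same model, rank by similarity, take top-k
# (the metadata-filter step is omitted in this sketch).
q = embed("cats and dogs")
topk = sorted(index, key=lambda rec: float(rec[1] @ q), reverse=True)[:2]

# Steps 6-7 (rerank + generate) would operate on `topk` before the LLM answers.
```

The one production-critical detail the sketch does encode: ingest and query must go through the same embedding model, or the query vector lands in the wrong region of the space.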

Hybrid Search

Pure vector search misses things. Proper nouns, product codes, rare typos — these are exactly what BM25 nails and what embeddings blur. Production systems run hybrid search: vector and keyword queries in parallel, scores combined by a weighted sum or reciprocal rank fusion, then reranked.
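Reciprocal rank fusion is simple enough to show whole. Each result list contributes 1 / (k + rank) per document, so items ranked well in either list float to the top without any score normalization (the document ids below are hypothetical):

```python
def rrf(rank_lists, k=60):
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank(d))."""
    scores = {}
    for ranking in rank_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["d3", "d1", "d7"]     # semantic ranking
bm25_hits = ["d7", "d3", "d9"]       # keyword ranking
fused = rrf([vector_hits, bm25_hits])   # d3 and d7 appear in both, so they lead
```

The constant k = 60 is the conventional default from the original RRF paper; it damps the advantage of rank-1 results so a single list cannot dominate the fusion.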

Taskade's workspace search is hybrid by default: HNSW for semantic similarity, full-text for precision, OCR for scanned documents. Agents doing agentic RAG inside Taskade call hybrid search as one tool and get the best of both worlds.

Scaling Considerations

Corpus Size     Practical Stack
-----------     ------------------------------------------------------
< 100K          pgvector + cosine, single Postgres node
100K–10M        Qdrant, Weaviate, or pgvector with HNSW, single node
10M–1B          Qdrant or Milvus cluster, IVF-HNSW with sharding
> 1B            FAISS or Milvus with IVF-PQ, disk-backed, GPU optional

The right pick is a function of corpus size, query QPS, recall target, latency budget, and ops capacity. Over-engineering is the common failure mode: standing up a Kubernetes cluster for 10M vectors when pgvector with HNSW handles that scale comfortably on a single node.
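Back-of-envelope capacity math drives most of these tier boundaries. The estimator below is a rough sketch under stated assumptions (float32 vectors, roughly 2×M four-byte graph links per node for an HNSW-style index), not any engine's exact accounting:

```python
def hnsw_memory_gb(n_vectors, dim, m=16, bytes_per_float=4):
    """Rough RAM estimate for an in-memory HNSW-style index:
    raw float32 vectors plus ~2*M graph links of 4 bytes per node.
    Real engines add metadata, deletions, and allocator overhead on top."""
    raw = n_vectors * dim * bytes_per_float
    links = n_vectors * m * 2 * 4
    return (raw + links) / 1e9

# 10M x 1536-dim vectors: the raw floats alone are ~61 GB, which is why
# quantization (PQ) or dimensionality reduction enters the picture well
# before sharding does.
estimate = hnsw_memory_gb(10_000_000, 1536)
```

Running the same estimate at 100K vectors gives well under 1 GB, which is why a single Postgres node with pgvector is the sensible floor of the table.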

Vector Databases in Taskade Genesis

Every Taskade workspace has a vector index behind it. When you search your projects, ask an agent a question, or build a Genesis app that references your knowledge base, the underlying retrieval runs through semantic HNSW over 1,536-dimensional embeddings — fused with full-text and OCR for hybrid precision.

You do not manage the index. There is no vector DB to provision, no ANN tuning, no reranker configuration. Upload a document, write a project, drop a file — Taskade embeds and indexes it in the background, and your agents can retrieve it the moment it is indexed. The vector database is part of your workspace DNA, not a separate piece of infrastructure.

Frequently Asked Questions About Vector Databases

What is a vector database?

A vector database is a specialized database for storing and searching high-dimensional numeric vectors — usually embeddings from a neural network. Instead of exact matches, it returns items whose vectors are geometrically similar, which approximates semantic similarity.

Why not use a regular database?

A regular database can store vectors but cannot search them efficiently. Finding the 10 nearest vectors out of 100 million in 1,536 dimensions requires an approximate nearest-neighbor index (HNSW, IVF, PQ) that only vector databases implement natively.

What is HNSW?

HNSW (Hierarchical Navigable Small World) is a graph-based approximate nearest-neighbor algorithm that builds a layered graph of vectors. Queries traverse from coarse to fine layers, finding near-matches in logarithmic time. It is the default algorithm in most modern vector databases, including Taskade's workspace search.

Do I need a vector database to use Taskade Genesis?

No. Vector search is built into every Taskade workspace automatically. Upload a document, write a project, or connect an integration, and Taskade embeds and indexes it in the background. Your agents can search your workspace semantically from day one with no setup.

What is hybrid search?

Hybrid search combines vector (semantic) and keyword (BM25) search, then merges the scores. It catches both paraphrased queries and exact-match proper nouns or codes. Taskade's workspace search is hybrid by default, adding OCR for scanned documents.
