
Vector Databases: Pinecone, Weaviate, Qdrant, pgvector

Empirium Team · 11 min read

Every RAG system needs a vector database. The choice of which one determines your query latency, operational complexity, cost trajectory, and how painful your life will be six months from now.

The market has consolidated around six serious options: Pinecone, Weaviate, Qdrant, Chroma, Milvus, and pgvector. Each occupies a different point on the managed-vs-self-hosted and simplicity-vs-power spectrums. Here is the comparison we wish existed when we started building RAG systems at Empirium.

What Vector Databases Do

Text is not searchable by meaning. The sentence "Our office closes at 6 PM" and the query "What time do you shut down?" share no keywords but are semantically identical. Vector databases solve this.

The pipeline:

  1. Embedding: Convert text to a high-dimensional vector (1536 dimensions for OpenAI's text-embedding-3-small, 1024 for Cohere's embed-v4).
  2. Indexing: Store vectors with efficient index structures (HNSW, IVF, or flat) that enable fast approximate nearest-neighbor search.
  3. Querying: Convert the search query to a vector, find the K most similar stored vectors, return the associated text chunks.
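The three steps can be sketched with a toy in-memory index: brute-force cosine similarity over random "embeddings". This is our illustration, not any database's API; a real system calls an embedding model and an approximate index instead of NumPy.

```python
import numpy as np

def cosine_top_k(query: np.ndarray, index: np.ndarray, k: int = 10) -> np.ndarray:
    """Return indices of the k stored vectors most similar to the query."""
    # Normalize so a dot product equals cosine similarity
    index_norm = index / np.linalg.norm(index, axis=1, keepdims=True)
    query_norm = query / np.linalg.norm(query)
    similarities = index_norm @ query_norm
    return np.argsort(-similarities)[:k]

# Toy "embeddings": 1,000 stored vectors at 1536 dims (text-embedding-3-small size)
rng = np.random.default_rng(0)
stored = rng.standard_normal((1000, 1536))
query = stored[42] + 0.01 * rng.standard_normal(1536)  # near-duplicate of item 42

top = cosine_top_k(query, stored, k=10)
print(top[0])  # item 42 ranks first
```

Brute force like this is exact but scans every vector; HNSW and IVF indexes trade a small amount of recall for sub-linear search time.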

The quality of your RAG system depends more on your embedding model and chunking strategy than on your vector database choice. But the database choice determines cost, latency, and operational burden.

The Major Players Compared

Feature Matrix

| Feature | Pinecone | Weaviate | Qdrant | Chroma | Milvus | pgvector |
|---|---|---|---|---|---|---|
| Hosting | Managed only | Managed + self-hosted | Managed + self-hosted | Self-hosted (cloud beta) | Managed (Zilliz) + self-hosted | Self-hosted (Postgres extension) |
| Max vectors | Billions | Billions | Billions | Millions | Billions | Millions |
| Metadata filtering | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (SQL WHERE) |
| Hybrid search (vector + keyword) | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (with tsvector) |
| Multi-tenancy | Namespaces | Classes | Collections | Collections | Partitions | Schemas/tables |
| Quantization | ❌ | ✅ | ✅ (scalar, product, binary) | ❌ | ✅ | ✅ (halfvec) |
| On-disk index | N/A (managed) | ✅ | ✅ | ❌ (in-memory) | ✅ | ✅ |
| ACID transactions | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |

Performance Benchmarks

Tested on 1M vectors, 1536 dimensions, top-10 retrieval, single-node setup:

| Database | p50 Latency | p99 Latency | Recall@10 | Memory Usage |
|---|---|---|---|---|
| Pinecone (s1) | 8 ms | 25 ms | 0.95 | Managed |
| Weaviate (HNSW) | 5 ms | 18 ms | 0.97 | 4.2 GB |
| Qdrant (HNSW) | 4 ms | 15 ms | 0.97 | 3.8 GB |
| Milvus (IVF_FLAT) | 6 ms | 22 ms | 0.96 | 3.5 GB |
| pgvector (HNSW) | 12 ms | 45 ms | 0.95 | 5.1 GB |
| Chroma (HNSW) | 7 ms | 30 ms | 0.96 | 4.0 GB |

At 1M vectors, the latency differences are negligible for most applications. The differences become meaningful at 10M+ vectors or under high concurrency.
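Recall@10 in the table above is measured by comparing the approximate index's top-10 against an exact brute-force top-10. The metric itself is trivial to compute (the function name here is ours):

```python
def recall_at_k(approx_ids: list[int], exact_ids: list[int], k: int = 10) -> float:
    """Fraction of the exact top-k that the approximate index also returned."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

# Example: the ANN index missed one of the ten true nearest neighbors
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 99]
print(recall_at_k(approx, exact))  # 0.9
```

Running this against a held-out query set is the standard way to tune HNSW parameters like `ef_search`: raise them until recall plateaus, then stop paying the latency cost.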

Pricing Comparison (1M vectors, 1536 dims)

| Database | Managed Monthly Cost | Self-Hosted Monthly Cost |
|---|---|---|
| Pinecone (s1 pod) | $70 | N/A |
| Pinecone (serverless) | $30-$100 (usage-based) | N/A |
| Weaviate Cloud | $75 | $25-$50 (VPS) |
| Qdrant Cloud | $65 | $20-$40 (VPS) |
| Zilliz (Milvus) | $65 | $25-$50 (VPS) |
| pgvector | N/A (use existing Postgres) | $0 (if you already have Postgres) |
| Chroma | N/A | $15-$30 (lightweight) |

Self-Hosted vs Managed

When Managed Wins

  • Team under 5 engineers: No one to manage infrastructure
  • Rapid prototyping: Need vector search in production within a week
  • Unpredictable scale: Serverless pricing handles traffic spikes without capacity planning
  • Compliance requirements: Some managed providers offer SOC2, HIPAA-eligible deployments

When Self-Hosted Wins

  • Cost at scale: At 10M+ vectors, self-hosted costs 3-5x less than managed
  • Data sovereignty: Data never leaves your infrastructure
  • Latency requirements: Co-locate the database with your application server for sub-5ms queries
  • Custom configuration: Tune index parameters, memory allocation, and caching for your specific workload

The pgvector Special Case

If you already run PostgreSQL — and most applications do — pgvector is the zero-overhead option. No new database to deploy, monitor, or pay for. Your vectors live alongside your relational data in the same transactions.

pgvector's performance is adequate for up to 2-5M vectors. Beyond that, purpose-built vector databases pull ahead significantly. But for most business applications, 2M vectors covers the entire knowledge base with room to spare.

```sql
-- pgvector is just Postgres
CREATE EXTENSION vector;

CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  content TEXT,
  metadata JSONB,
  embedding vector(1536)
);

CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

-- Vector search combined with plain SQL filtering
-- (:query_embedding is a parameter supplied by the application)
SELECT content, metadata,
       1 - (embedding <=> :query_embedding) AS similarity
FROM documents
WHERE metadata->>'category' = 'pricing'
  AND metadata->>'updated_at' > '2026-01-01'
ORDER BY embedding <=> :query_embedding
LIMIT 5;
```

The ability to combine vector search with SQL WHERE clauses, JOINs, and transactions is pgvector's unique advantage. No other vector database offers this without a separate metadata store.

Choosing Based on Your Use Case

Small Knowledge Base (< 100K documents)

Recommendation: pgvector or Chroma

At this scale, every option works well. pgvector avoids adding a new database to your stack. Chroma is the simplest standalone option for prototyping.

Medium Knowledge Base (100K–5M documents)

Recommendation: Qdrant (self-hosted) or Pinecone (managed)

Qdrant offers the best performance-to-cost ratio for self-hosted deployments. Pinecone is the most polished managed option with the lowest operational overhead.

Large Knowledge Base (5M+ documents)

Recommendation: Qdrant or Milvus (self-hosted with dedicated infrastructure)

At this scale, managed pricing becomes expensive and self-hosting pays off. Both Qdrant and Milvus handle billion-scale deployments with proper hardware.

High Update Frequency

Recommendation: Weaviate or Qdrant

If your documents change frequently (product catalogs, news feeds, real-time data), you need a database that handles concurrent reads and writes efficiently. Weaviate and Qdrant handle real-time updates without query degradation.

Existing Postgres Infrastructure

Recommendation: pgvector

Unless your vector count exceeds 5M or you need sub-5ms p99 latency, pgvector in your existing Postgres saves you from running a separate database. The operational simplicity is worth the performance tradeoff.

Migration Considerations

Vector databases have no standardized format. Moving from one to another means:

  1. Re-exporting your original text chunks (not the vectors — embedding models may differ)
  2. Re-generating embeddings with your current embedding model
  3. Re-indexing in the new database with appropriate settings
  4. Updating your application code for the new query API

Budget 2-4 weeks for a migration, including testing. The data itself moves quickly; testing and validation are what take time.
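The loop behind steps 1-3 is conceptually simple. In this sketch, `fetch_chunks`, `embed`, and `target.upsert` are hypothetical placeholders for your source export, your embedding API, and your destination database's client; none of them is a real library call.

```python
def migrate(fetch_chunks, embed, target, batch_size: int = 100) -> int:
    """Re-embed source chunks and upsert them into the target database.

    fetch_chunks: yields dicts with "id", "content", "metadata" (placeholder)
    embed: maps a list of texts to a list of vectors (placeholder)
    target: destination client exposing upsert(ids, vectors, metadata) (placeholder)
    """
    def flush(batch):
        # Step 2: re-embed the original text, never the old vectors
        vectors = embed([c["content"] for c in batch])
        # Step 3: re-index in the new database
        target.upsert(ids=[c["id"] for c in batch],
                      vectors=vectors,
                      metadata=[c["metadata"] for c in batch])
        return len(batch)

    migrated, batch = 0, []
    for chunk in fetch_chunks():  # Step 1: export original text chunks
        batch.append(chunk)
        if len(batch) == batch_size:
            migrated += flush(batch)
            batch = []
    if batch:
        migrated += flush(batch)
    return migrated
```

Batching matters twice here: embedding APIs charge and rate-limit per request, and bulk upserts are far faster than row-at-a-time inserts in every vector database.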

FAQ

Which embedding model should I use? For most applications: OpenAI text-embedding-3-small (1536 dims, $0.02/1M tokens). It offers the best cost-to-quality ratio. For maximum quality: Cohere embed-v4 or OpenAI text-embedding-3-large. For self-hosted: all-MiniLM-L6-v2 is free and surprisingly good for English text.
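At $0.02 per million tokens, embedding cost is easy to estimate up front (the function and the corpus numbers below are our illustration; real token counts come from a tokenizer):

```python
def embedding_cost_usd(total_tokens: int, price_per_million: float = 0.02) -> float:
    """Estimated embedding cost at a given price per million tokens."""
    return total_tokens / 1_000_000 * price_per_million

# e.g. 10,000 documents averaging 500 tokens each = 5M tokens
print(embedding_cost_usd(10_000 * 500))  # 0.1 -> about ten cents
```

The takeaway: initial embedding is almost never the cost driver. Re-embedding on every migration or model upgrade is cheap too; storage and query infrastructure dominate the bill.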

Should I use hybrid search (vector + keyword)? Yes, for any production RAG system. Hybrid search combines semantic understanding (vector) with exact matching (keyword). A query for "error code E-4021" benefits from keyword matching that vector search alone might miss. Weaviate, Qdrant, and pgvector all support hybrid search natively.
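A common way to merge the keyword and vector result lists is Reciprocal Rank Fusion (RRF), which several of these databases use internally. This standalone sketch is ours, not any database's built-in; the constant 60 is the conventional default from the RRF literature.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked result lists with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Each list contributes 1/(k + rank); documents found by
            # both searches accumulate score from both lists
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]    # semantic matches
keyword_hits = ["doc_d", "doc_a", "doc_e"]   # exact match on "E-4021"
print(rrf_fuse([vector_hits, keyword_hits])[0])  # doc_a: found by both searches
```

RRF needs only ranks, not scores, so it sidesteps the problem that cosine similarities and BM25 scores live on incomparable scales.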

How do I handle multi-tenant data? Use separate collections (Qdrant, Chroma), namespaces (Pinecone), or schemas (pgvector) per tenant. Never rely on metadata filtering alone for tenant isolation — a bug in your filter logic could leak data between tenants. Physical separation is safer.

When should I scale horizontally? When single-node performance degrades under your query load — typically above 5-10M vectors or 500+ queries per second. Qdrant and Milvus support distributed deployments. pgvector relies on Postgres replication patterns. Pinecone handles this automatically.

Choosing the right vector database is a foundational decision for your AI infrastructure. If you need guidance on vector database selection or RAG architecture, let us help.
