Vector DBs Compared: When to Choose pgvector, Pinecone, or Milvus
- Chenwei Zhang
- Data, MLOps
- 20 Oct 2025
TL;DR
Pick pgvector for simplicity and SQL governance at moderate scale, Pinecone for managed performance and SLAs, and Milvus for open-source control at high scale. Decide by SLOs, compliance needs, team skills, and 12‑month scale forecasts.
Vector search is the backbone of modern retrieval systems. The database you choose shapes latency, cost, and how fast you can ship. Most teams evaluating enterprise RAG converge on three options: pgvector (PostgreSQL extension), Pinecone (managed vector DB), and Milvus (open-source vector DB with managed variants). Each option can be the right one — in different contexts.
This article compares them through an executive lens: performance, reliability, operations, compliance, and total cost. The goal is not a winner-take-all verdict but a decision framework you can defend to security, finance, and engineering.
The Short Version
Choose pgvector when you already run Postgres, want simplicity, and expect moderate scale with strong SQL governance. Pick Pinecone for turnkey performance, namespaces, and SLAs when speed‑to‑market matters more than infra control. Choose Milvus for open‑source control and high performance at large scale if your platform team is comfortable operating distributed systems.
Comparison Criteria
- Performance and scale: Query latency, recall, and index build speed at millions to billions of vectors.
- Features: Hybrid search (dense + sparse), filters, HNSW/IVF variants, reranking hooks.
- Reliability: Durability, replication, backups, and operational maturity.
- Security and compliance: Encryption, RBAC, network isolation, audit logging, regional controls.
- Ecosystem: SDKs, integrations, and community support.
- Operations: Upgrades, observability, cost predictability, and vendor support.
pgvector (PostgreSQL + pgvector)
What it is: A Postgres extension that adds vector types and ANN indexes. Query with SQL and combine vector search with rich relational filters.
How it works: Store embeddings and metadata in the same database. Use row‑level security and SQL joins to enforce ACLs, jurisdiction, and version filters inline with retrieval; a minimal sketch follows this section.
Use when: You have strong Postgres skills, need governance, and expect ≤ low tens of millions of vectors with rich filters.
Strengths: Simplicity (one DB), mature governance/audit, predictable cost, and powerful hybrid queries.
Constraints: Scale ceiling without sharding, fewer index choices, and you own ops (unless on a managed Postgres).
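To make the pgvector workflow concrete, here is a minimal sketch assuming psycopg 3 and the pgvector-python package; the DSN, table name, and tenant field are illustrative, not a prescribed schema.

```python
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

conn = psycopg.connect("dbname=rag", autocommit=True)  # DSN is illustrative
conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
register_vector(conn)  # adapt numpy arrays to the Postgres vector type

conn.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id bigserial PRIMARY KEY,
        tenant_id text NOT NULL,
        body text,
        embedding vector(768)  -- must match your embedding model's dimension
    )
""")
# HNSW index for approximate nearest-neighbor search under cosine distance
conn.execute(
    "CREATE INDEX IF NOT EXISTS documents_embedding_idx "
    "ON documents USING hnsw (embedding vector_cosine_ops)"
)

query_embedding = np.random.rand(768).astype(np.float32)  # stand-in for a model call
rows = conn.execute(
    """
    SELECT id, body
    FROM documents
    WHERE tenant_id = %s            -- relational filter inline with ANN search
    ORDER BY embedding <=> %s       -- <=> is cosine distance in pgvector
    LIMIT 10
    """,
    ("acme", query_embedding),
).fetchall()
```

The point of the sketch is the WHERE clause: governance filters live in the same query as the vector search, with no second system to reconcile.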
Pinecone (Managed Vector DB)
What it is: A fully managed vector database with APIs for fast similarity search, isolation via namespaces/projects, and production‑ready features.
How it works: You provision an index per use case/namespace, stream upserts, and query with filters. Pinecone handles replication, scaling, and availability behind the scenes; a client sketch follows this section.
Use when: You need turnkey performance, clear SLAs, and global reach, and you’re comfortable with a managed vendor.
Strengths: High performance at scale, operational ease, multi‑tenant patterns, and rich production tooling.
Constraints: More difficult metadata joins, less portability, and cost planning is key (dimension, replicas, traffic).
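A minimal sketch of that flow, assuming the current Pinecone Python SDK; the index name, namespace, and metadata fields are placeholders.

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")   # key from the Pinecone console
index = pc.Index("enterprise-rag")       # assumes the index already exists

# Stream upserts: each record is (id, vector, metadata);
# namespaces give per-tenant isolation within one index
index.upsert(
    vectors=[
        ("doc-1", [0.1] * 768, {"tenant": "acme", "lang": "en"}),
    ],
    namespace="acme",
)

# Query with a metadata filter evaluated alongside the ANN search
results = index.query(
    vector=[0.1] * 768,
    top_k=10,
    filter={"lang": {"$eq": "en"}},
    namespace="acme",
    include_metadata=True,
)
for match in results.matches:
    print(match.id, match.score)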
Milvus (Open‑Source + Managed)
What it is: A high‑performance open‑source vector database (HNSW/IVF/DiskANN), typically run on Kubernetes; available as a managed service via Zilliz.
How it works: You deploy Milvus as a distributed system, choose index types per collection, and scale horizontally as vectors grow; a client sketch follows this section.
Use when: You want open‑source control, very large collections, or specialized tuning, and your platform team can operate distributed systems.
Strengths: Excellent performance/flexibility, horizontal scale, and an active open ecosystem.
Constraints: Higher ops complexity, you assemble governance (backups, audit, RBAC), and complex relational filters often need an external store.
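A minimal sketch using pymilvus's MilvusClient quick-start API; the URI, collection name, and filter expression are illustrative.

```python
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")

# Quick-start collection: auto schema with an "id" key and a "vector" field
client.create_collection(collection_name="docs", dimension=768)

client.insert(
    collection_name="docs",
    data=[{"id": 1, "vector": [0.1] * 768, "tenant": "acme"}],
)

# ANN search with a scalar filter expression evaluated alongside the index
hits = client.search(
    collection_name="docs",
    data=[[0.1] * 768],
    limit=10,
    filter='tenant == "acme"',
    output_fields=["tenant"],
)
print(hits[0])  # ranked matches for the first query vector
```

For production you would define an explicit schema and pick index types (HNSW, IVF, DiskANN) per collection; the quick-start client hides those choices.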
Performance Notes That Matter
Balance recall vs. latency by measuring end‑to‑end quality on your corpus, not synthetic benchmarks. An improved embedding model often beats index tweaks. Add a reranker for precision‑critical domains; the latency cost is usually modest. Rich metadata filters reduce context waste — ensure your DB supports efficient filtered ANN.
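One way to operationalize "measure on your corpus" is recall@k against a golden set, computed identically across every candidate database and index configuration. A minimal sketch, where `search` stands in for whatever retrieval call your stack exposes:

```python
def recall_at_k(golden: dict[str, set[str]], search, k: int = 10) -> float:
    """golden maps query text -> set of relevant document IDs."""
    hits = 0
    total = 0
    for query, relevant in golden.items():
        retrieved = set(search(query, k))   # top-k IDs from the DB under test
        hits += len(retrieved & relevant)    # relevant docs actually found
        total += len(relevant)               # relevant docs that exist
    return hits / total if total else 0.0
```

Run it before and after every index or embedding-model change; a number that moves with your corpus beats any vendor benchmark.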
Security, Compliance, and Residency
Use encryption at rest and in transit, and verify key rotation. Postgres offers mature row‑level security; Pinecone/Milvus rely on app‑layer ACLs or their own RBAC — test thoroughly. Ensure audit trails for queries and admin actions. Respect data residency constraints (EU/UK‑only where needed).
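For backends without row‑level security, a common pattern is to inject the tenant scope server‑side so callers can never omit it. A sketch, using Pinecone‑style filter syntax purely as an illustrative example:

```python
def scoped_filter(tenant_id: str, user_filter: dict | None = None) -> dict:
    """Build a metadata filter with a non-negotiable tenant scope."""
    base = {"tenant": {"$eq": tenant_id}}        # enforced by the service layer
    if user_filter:
        return {"$and": [base, user_filter]}     # combine with the caller's filter
    return base
```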
Cost and TCO
pgvector has the lowest barrier if you have Postgres skills — you pay compute, storage, and ops time, and get strong early ROI. Pinecone trades higher unit costs for lower ops and faster delivery. Milvus can be infra‑efficient at large scale but demands SRE maturity; managed Milvus moderates ops at a premium.
Migration Paths (Future‑Proofing)
Keep embeddings, IDs, and metadata portable. Abstract retrieval behind a service so pgvector → Pinecone/Milvus is a swap, not a rebuild. Maintain export/import scripts and test recall parity against your golden set. Consider a thin shim that supports multiple backends for A/B migration.
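A sketch of such a shim using a Python Protocol; the names and signatures are illustrative, not a prescribed interface:

```python
from typing import Protocol

class Retriever(Protocol):
    def search(self, embedding: list[float], k: int, filters: dict) -> list[str]:
        """Return ranked document IDs."""
        ...

def retrieve(backend: Retriever, embedding: list[float],
             k: int = 10, filters: dict | None = None) -> list[str]:
    # Callers depend only on this function. To A/B a migration, pass two
    # backends, compare recall parity on the golden set, then cut over.
    return backend.search(embedding, k, filters or {})
```

Each backend (pgvector, Pinecone, Milvus) gets a thin adapter implementing `search`; application code never imports a vendor SDK directly.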
Decision Framework
Ask four sets of questions:
- Scale and SLOs: Expected QPS and p95 latency? Peak scenarios? Vectors today vs. 12 months? Dimension size? (A quick memory estimate follows this list.)
- Governance and Compliance: Need row‑level security and mature audit today? Are we already a Postgres shop? Residency or strict separation by client/BU?
- Team and Time‑to‑Market: Do we have SRE capacity for Kubernetes and distributed indexing, or do we want managed? What's the opportunity cost of building infra?
- Cost Posture: Prefer predictable, incremental spend with lower ops (pgvector)? Will we pay more for managed reliability (Pinecone)? Can we afford SRE investment for open‑source control (Milvus)?
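For the scale question, a back‑of‑envelope memory estimate helps anchor the discussion. A sketch with illustrative numbers; this covers raw float32 storage only, and index overhead (often 1.5-2x for HNSW) comes on top:

```python
vectors = 50_000_000     # 12-month forecast (illustrative)
dim = 768                # embedding dimension
bytes_per_float = 4      # float32

raw_gb = vectors * dim * bytes_per_float / 1e9
print(f"~{raw_gb:.0f} GB raw vector data")  # ~154 GB before index overhead
```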
Example Selections by Scenario
- Department pilot or internal knowledge base (≤10M vectors, rich filters): pgvector.
- Customer-facing assistant at global scale (strict latency, namespaces, 24/7): Pinecone.
- Platform team powering multiple apps (50M+ vectors, tuning freedom): Milvus or managed Milvus.
Implementation Tips Regardless of DB
Combine BM25 + vectors for hybrid retrieval. Invest in clean metadata — it’s the cheapest recall boost. Rebuild/compact indexes periodically and track drift as embeddings/models change. Record per‑query latency, filter selectivity, and recall against a golden set. Test restores; snapshots aren’t backups until you’ve restored from them.
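For hybrid retrieval, reciprocal rank fusion (RRF) is a common way to merge BM25 and vector rankings without tuning score scales; the constant k=60 is conventional. A minimal sketch:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists, e.g., [bm25_ids, vector_ids], into one ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf([["d1", "d2", "d3"], ["d2", "d4", "d1"]])
print(fused)  # d1 and d2 surface first: they rank high in both lists
```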
Executive Checklist
- Do we have a clear 12-month scale forecast and SLOs?
- Which compliance controls are required on day one, and which later?
- Who will operate the system and on what platform?
- Is our retrieval layer portable to avoid lock-in?
- Have we tested end-to-end quality with a golden set, not just index benchmarks?
The right vector database is the one that lets your team ship reliable retrieval today and still sleep at night a year from now. Make the choice explicit, document your assumptions, and keep the door open to migrate when the data proves you should.