Many companies use Elastic or OpenSearch and use 10% of the capacity.
They have to build ETL pipelines.
Get data Normalized.
Worry about race conditions.
All in all. At the moment, when you want to do search on top of your transactional data, you are forced to build a distributed systems.
Not anymore.
ParadeDB is building an open-source PostgreSQL extension to enable search within your database.
Today, I am talking to Philippe Noël, the founder and CEO of ParadeDB.
We talk about how they build it, how they integrate into the Postgres Query engines, and how you can build search on top of Postgres.
Key Insights:
Search is changing. We're moving from separate search clusters to search inside databases. Simpler architecture, stronger guarantees, lower costs up to a certain scale.
Most search engines force you to duplicate data. ParadeDB doesn't. You keep data normalized and join at query time. It hooks deep into Postgres's query planner. It doesn't just bolt on search - it lets Postgres optimize search queries alongside SQL ones.
Search indices can work with ACID. ParadeDB's BM25 index keeps Lucene-style components (term frequency, normalization) but adds Postgres metadata for transactions. Search + ACID is possible.
Two storage types matter: inverted indices for text, columnar "fast fields" for analytics. Pick the right one or queries get slow. Integers now default to columnar to prevent common mistakes.
Mixing query engines looks tempting but fails. The team tried using DuckDB and DataFusion inside Postgres. Both were fast but broke ACID compliance. They had to rebuild features natively.
Philippe Noël:
Nicolay Gerold:
00:00 Introduction to ParadeDB
00:53 Building ParadeDB with Rust
01:43 Integrating Search in Postgres
03:04 ParadeDB vs. Elastic
05:48 Technical Deep Dive: Postgres Integration
07:27 Challenges and Solutions
09:35 Transactional Safety and Performance
11:06 Composable Data Systems
15:26 Columnar Storage and Analytics
20:54 Case Study: Alibaba Cloud
21:57 Data Warehouse Context
23:24 Custom Indexing with BM25
24:01 Postgres Indexing Overview
24:17 Fast Fields and Columnar Format
24:52 Lucene Inspiration and Data Storage
26:06 Setting Up and Managing Indexes
27:43 Query Building and Complex Searches
30:21 Scaling and Sharding Strategies
35:27 Query Optimization and Common Mistakes
38:39 Future Developments and Integrations
39:24 Building a Full-Fledged Search Application
42:53 Challenges and Advantages of Using ParadeDB
46:43 Final Thoughts and Recommendations