Episodes

Latest Episode
ETL for LLMs, Integrating and Normalizing Unstructured Data | ep 13

S1E13 · 36:48

In our latest episode, we sit down with Derek Tu, Founder and CEO of Carbon, a cutting-edge ETL tool designed specifically for large language models (LLMs). Carbon is streamlining AI...

Serverless Data Orchestration, AI in the Data Stack, AI Pipelines | ep 12

S1E12 · 28:06

In this episode, Nicolay sits down with Hugo Lu, founder and CEO of Orchestra, a modern data orchestration platform. As data pipelines and analytics workflows become increasingly com...

Mastering Vector Databases: Product & Binary Quantization, Multi-Vector Search

S1E11 · 40:06

Ever wondered how AI systems handle images and videos, or how they make lightning-fast recommendations? Tune in as Nicolay chats with Zain Hassan, an expert in vector databases from ...

Building Robust AI and Data Systems, Data Architecture, Data Quality, Data Storage | ep 10

S1E10 · 45:33

In this episode of "How AI is Built", data architect Anjan Banerjee provides an in-depth look at the world of data architecture and building complex AI and data systems. Anjan breaks...

Modern Data Infrastructure for Analytics and AI, Lakehouses, Open Source Data Stack | ep 9

S1E9 · 27:53

Jorrit Sandbrink, a data engineer specializing in open table formats, discusses the advantages of decoupling storage and compute, the importance of choosing the right table format, a...

Knowledge Graphs for Better RAG, Virtual Entities, Hybrid Data Models | ep 8

S1E8 · 36:40

Kirk Marple, CEO and founder of Graphlit, discusses the evolution of his company from a data cataloging tool to a platform designed for ETL (Extract, Transform, Load) and knowledge ...

Navigating the Modern Data Stack, Choosing the Right OSS Tools, From Problem to Requirements to Architecture | ep 7

S1E7 · 38:12

From Problem to Requirements to Architecture. In this episode, Nicolay Gerold and Jon Erich Kemi Warghed discuss the landscape of data engineering, sharing insights on selecting the ...

Data Orchestration Tools: Choosing the right one for your needs | ep 6

S1E6 · 32:37

In this episode, Nicolay Gerold interviews John Wessel, the founder of Agreeable Data, about data orchestration. They discuss the evolution of data orchestration tools, the popularit...

Building Reliable LLM Applications, Production-Ready RAG, Data-Driven Evals | ep 5

S1E4 · 29:40

In this episode of "How AI is Built", we learn how to build and evaluate real-world language model applications with Shahul and Jithin, creators of Ragas. Ragas is a powerful open-so...

Lance v2: Rethinking Columnar Storage for Faster Lookups, Nulls, and Flexible Encodings | changelog 2

S1E5 · 21:33

In this episode of Changelog, Weston Pace dives into the latest updates to LanceDB, an open-source vector database and file format. Lance's new V2 file format redefines the tradition...

Unlocking AI with Supabase: Postgres Configuration, Real-Time Processing, and Extensions | ep 4

S1E4 · 31:57

Had a fantastic conversation with Christopher Williams, Solutions Architect at Supabase, about setting up Postgres the right way for AI. We dug deep into Supabase, exploring: Core ...

AI Inside Your Database, Real-Time AI, Declarative ML/AI | ep 3

S1E3 · 36:04

If you've ever wanted a simpler way to integrate AI directly into your database, SuperDuperDB might be the answer. SuperDuperDB lets you easily apply AI processes to your data while ...

Supabase acquires OrioleDB, A New Database Engine for PostgreSQL | changelog 1

S1E3 · 13:37

Supabase just acquired OrioleDB, a storage engine for PostgreSQL. Oriole gets creative with MVCC! It uses an UNDO log rather than keeping multiple versions of an entire data row (tu...

AI Powered Data Transformation, Combining gen & trad AI, Semantic Validation | ep 2

S1E2 · 37:09

Today’s guest is Antonio Bustamante, a serial entrepreneur who previously built Kite and Silo and is now working to fix bad data. He is building bem, the data tool to transform any d...

Multimodal AI, Storing 1 Billion Vectors, Building Data Infrastructure | ep 1

S1E1 · 34:04

Imagine a world where data bottlenecks, slow data loaders, or memory issues on the VM don't hold back machine learning. Machine learning and AI success depends on the speed you can i...