Episodes

Latest Episode
Season 2 Trailer: Mastering Search

S2Trailer · · 04:16

Today we are launching season 2 of How AI Is Built. Over the last few weeks, we spoke to a lot of regular listeners and past guests, collected feedback, and analyzed our episode data. A...

#17 Jonathan Yarkoni on Unlocking Value from Unstructured Data, Real-World Applications of Generative AI

S1E17 · · 36:28

In this episode of "How AI is Built," host Nicolay Gerold interviews Jonathan Yarkoni, founder of Reach Latent. Jonathan shares his expertise in extracting value from unstructured da...

#16 Abhishek Choudhary on Data Processing for AI, Integrating AI into Data Pipelines, Spark

S1E16 · · 46:26

This episode of "How AI Is Built" is all about data processing for AI. Abhishek Choudhary and Nicolay discuss Spark and alternatives to process data so it is AI-ready. Spark is a dist...

#15 Rahul Parundekar on Building AI Agents for the Enterprise, Agent Cost Controls, Seamless UX

S1E15 · · 35:12

In this episode, Nicolay talks with Rahul Parundekar, founder of AI Hero, about the current state and future of AI agents. Drawing from over a decade of experience working on agent t...

#14 Richmond Alake on Building Predictable Agents through Prompting, Compression, and Memory Strategies

S1E14 · · 32:14

In this conversation, Nicolay and Richmond Alake discuss various topics related to building AI agents and using MongoDB in the AI space. They cover the use of agents and multi-agents...

Data Integration and Ingestion for AI & LLMs, Architecting Data Flows | changelog 3

S1E14 · · 14:53

In this episode, Kirk Marple, CEO and founder of Graphlit, shares his expertise on building efficient data integrations. Kirk breaks down his approach using relatable concepts: The...

#13 Derek Tu on ETL for LLMs, Integrating and Normalizing Unstructured Data

S1E13 · · 36:48

In our latest episode, we sit down with Derek Tu, Founder and CEO of Carbon, a cutting-edge ETL tool designed specifically for large language models (LLMs). Carbon is streamlining A...

#12 Hugo Lu on Serverless Data Orchestration, AI in the Data Stack, AI Pipelines

S1E12 · · 28:06

In this episode, Nicolay sits down with Hugo Lu, founder and CEO of Orchestra, a modern data orchestration platform. As data pipelines and analytics workflows become increasingly com...

#11 Zain Hasan on Mastering Vector Databases, Product & Binary Quantization, Multi-Vector Search

S1E11 · · 40:06

Ever wondered how AI systems handle images and videos, or how they make lightning-fast recommendations? Tune in as Nicolay chats with Zain Hasan, an expert in vector databases from ...

#10 Anjan Banerjee on Building Robust AI and Data Systems, Data Architecture, Data Quality, Data Storage

S1E10 · · 45:33

In this episode of "How AI is Built", data architect Anjan Banerjee provides an in-depth look at the world of data architecture and building complex AI and data systems. Anjan breaks...

#9 Jorrit Sandbrink on Modern Data Infrastructure for Analytics and AI, Lakehouses, Open Source Data Stack

S1E9 · · 27:53

Jorrit Sandbrink, a data engineer specializing in open table formats, discusses the advantages of decoupling storage and compute, the importance of choosing the right table format, a...

#8 Kirk Marple on Knowledge Graphs for Better RAG, Virtual Entities, Hybrid Data Models

S1E8 · · 36:40

Kirk Marple, CEO and founder of Graphlit, discusses the evolution of his company from a data cataloging tool to a platform designed for ETL (Extract, Transform, Load) and knowledge ...

#7 Jon Warghed on Navigating the Modern Data Stack, Choosing the Right OSS Tools, From Problem to Requirements to Architecture

S1E7 · · 38:12

From Problem to Requirements to Architecture. In this episode, Nicolay Gerold and Jon Erich Kemi Warghed discuss the landscape of data engineering, sharing insights on selecting the...

#6 John Wessel on Data Orchestration Tools, Choosing the right one for your needs

S1E6 · · 32:37

In this episode, Nicolay Gerold interviews John Wessel, the founder of Agreeable Data, about data orchestration. They discuss the evolution of data orchestration tools, the popularit...

#5 Shahul Es and Jithin James on Building Reliable LLM Applications, Production-Ready RAG, Data-Driven Evals

S1E4 · · 29:40

In this episode of "How AI is Built", we learn how to build and evaluate real-world language model applications with Shahul and Jithin, creators of Ragas. Ragas is a powerful open-so...

Lance v2: Rethinking Columnar Storage for Faster Lookups, Nulls, and Flexible Encodings | changelog 2

S1E5 · · 21:33

In this episode of Changelog, Weston Pace dives into the latest updates to LanceDB, an open-source vector database and file format. Lance's new V2 file format redefines the tradition...

#4 Christopher Gwilliams on AI with Supabase, Postgres Configuration, Real-Time Processing, and more

S1E4 · · 31:57

Had a fantastic conversation with Christopher Gwilliams, Solutions Architect at Supabase, about setting up Postgres the right way for AI. We dug deep into Supabase, exploring: Cor...

#3 Duncan Blythe on AI Inside Your Database, Real-Time AI, Declarative ML/AI

S1E3 · · 36:04

If you've ever wanted a simpler way to integrate AI directly into your database, SuperDuperDB might be the answer. SuperDuperDB lets you easily apply AI processes to your data while ...

Supabase acquires OrioleDB, A New Database Engine for PostgreSQL | changelog 1

S1E3 · · 13:37

Supabase just acquired OrioleDB, a storage engine for PostgreSQL. Oriole gets creative with MVCC! It uses an UNDO log rather than keeping multiple versions of an entire data row (tu...

#2 Antonio Bustamante on AI Powered Data Transformation, Combining gen & trad AI, Semantic Validation

S1E2 · · 37:09

Today’s guest is Antonio Bustamante, a serial entrepreneur who previously built Kite and Silo and is now working to fix bad data. He is building bem, the data tool to transform any d...

#1 Chang She on Multimodal AI, Storing 1 Billion Vectors, Building Data Infrastructure at LanceDB

S1E1 · · 34:04

Imagine a world where data bottlenecks, slow data loaders, or memory issues on the VM don't hold back machine learning. Machine learning and AI success depends on the speed you can ...