S2E18

#035 A Search System That Learns As You Use It (Agentic RAG)

December 13, 2024 · 45:30

Modern RAG systems build on flexibility.

At their core, they match each query with the best tool for the job.

They know which tool fits each task. When you ask about sales numbers, they reach for SQL. When you need to company policies, they use vector search or BM25. The key is switching tools smoothly.

A question about sales figures might need SQL, while a search through policy documents works better with vector search. The key is building systems that can switch between these tools smoothly.

But all types of retrieval start with metadata. By tagging documents with key details during processing, we narrow the search space before diving in.

The best systems use a mix of approaches: they might keep full documents for context, summaries for quick scanning, and metadata for filtering. They cast a wide net at first, then use neural ranking to zero in on the most relevant results.

The quality of embeddings can make or break a system. General-purpose models often fall short in specialized fields. Testing different embedding models on your specific data pays off - what works for general text might fail for legal documents or technical manuals. Sometimes, fine-tuning a model for your domain is worth the effort.

When building search systems, think modular. Start with pieces that can be swapped out as needs change or better tools emerge. Add metadata processing early - it's harder to add later. Break the retrieval process into steps: first find possible matches quickly, then rank them carefully. For complex documents with tables or images, add tools that can handle different types of content.

The best systems also check their work. They ask: "Did I actually answer the question?" If not, they try a different approach. But they also know when to stop - endless loops help no one. In the end, RAG isn't just about finding information. It's about finding the right information, in the right way, at the right time.

Stephen Batifol:

Nicolay Gerold:

00:00 Introduction to Agentic RAG 00:04 Understanding Control Flow in Agentic RAG 00:33 Decision Making with LLMs 01:11 Exploring Agentic RAG with Stephen Batifol 03:35 Comparing RAG and GAR 06:31 Implementing Agentic RAG Workflows 22:36 Filtering with Prefix, Suffix, and Midfix 24:15 Breaking Mechanisms in Workflows 28:00 Evaluating Agentic Workflows 30:31 Multimodal and VLLMs in Document Processing 33:51 Challenges and Innovations in Parsing 34:51 Overrated and Underrated Aspects in LLMs 39:52 Building Effective Search Applications

View episode transcript

Listen to How AI Is Built using one of many popular podcasting apps or directories.

← Previous · All Episodes · Next →

#035 A Search System That Learns As You Use It (Agentic RAG)

Subscribe