RAG from Scratch

Written by Nikos Vaggalis

Tuesday, 10 December 2024

The "RAG from Scratch" tutorial by LangChain and the "RAG Playground" are two great educational resources that will help you kickstart your journey with RAG.

LLMs are trained on data made available by their trainers. If you want to feed them your own data and run queries on it, you can do so in two ways.

The older way is to fine-tune the foundation model. Fine-tuning, while a perfectly valid technique, has a few downsides: it is resource-intensive in both the computing power and the data volume it requires, and it has to be repeated whenever new data arrives.

The other option, and the more modern and lightweight approach, is RAG, or retrieval-augmented generation.
RAG lets LLMs augment the user's query by connecting to external data in real time when generating their output.
This approach is lighter on resources and doesn't need constant retraining since it consumes the data at run time, and of course the big boon is that it can retrieve up-to-date answers.
In essence, RAG has, with a few exceptions, rendered fine-tuning LLMs obsolete.

However, to utilize RAG you have to stick to a well-defined pipeline:

Collect and preprocess your documents
Create the vector embeddings
Set up the retrieval system
Integrate the LLM
Generate the response
Post-process the output
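
To make the six steps concrete, here is a minimal sketch of the whole pipeline using only the Python standard library. It is not the course's code: the word-count "embedding", the `fake_llm` placeholder and the three sample documents are all assumptions made for illustration, standing in for a real embedding model, a real LLM call and your own data.

```python
import math
import re
from collections import Counter


def preprocess(documents, chunk_size=80):
    """1. Collect and preprocess: normalise whitespace, cut into fixed-size chunks."""
    chunks = []
    for doc in documents:
        text = " ".join(doc.split())
        chunks += [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return chunks


def embed(text):
    """2. Create embeddings. Toy stand-in: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, index, k=2):
    """3. Retrieval: rank the stored chunks by similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, context):
    """4. Integrate the LLM: place the retrieved context in front of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def fake_llm(prompt):
    """5. Generate. Hypothetical placeholder for a model call: echoes the top context line."""
    first_context_line = prompt.splitlines()[1]
    return f"  {first_context_line[2:]}  "


def postprocess(answer):
    """6. Post-process: here, only trim stray whitespace."""
    return answer.strip()


# Sample data invented for this sketch.
documents = [
    "Chroma is a vector store used to index document embeddings.",
    "Fine-tuning updates model weights and needs retraining on new data.",
    "RAG retrieves external data at query time.",
]
index = [(chunk, embed(chunk)) for chunk in preprocess(documents)]

query = "Which vector store indexes the embeddings?"
context = retrieve(query, index)
answer = postprocess(fake_llm(build_prompt(query, context)))
print(answer)
```

In a real system each function would be swapped for the corresponding component, but the order of the calls on the last few lines stays the same.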

If that sounds too complicated, fear not: this new course by LangChain will show you how to build a RAG system from scratch.

Assembled as a YouTube playlist of 14 short videos, it starts with the absolute basics and moves along the pipeline to completion, describing all the intermediate steps. It does so using LangChain's own framework, Python, the ChromaDB vector store, the ChatOpenAI interface and OpenAI's LLMs.

Here follows the complete list of the tutorials:

1. Overview
2. Indexing
3. Retrieval
4. Generation
5. Query Translation -- Multi Query
6. Query Translation -- RAG Fusion
7. Query Translation -- Decomposition
8. Query Translation -- Step Back
9. Query Translation -- HyDE
10. Routing
11. Query Structuring
12. Multi-Representation Indexing
13. RAPTOR
14. ColBERT

Part 13, on RAPTOR, is about how RAG systems can handle both "lower-level" questions that reference specific facts found in a single document and "higher-level" questions that distill ideas spanning many documents. Part 14, on ColBERT, addresses an issue that arises because embedding models compress text into fixed-length vector representations that capture the semantic content of the document.

While this compression is very useful for efficient search and retrieval, it puts a heavy burden on that single vector to capture all the semantic nuance, and in some cases irrelevant content can dilute the semantic usefulness of the embedding.
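
The dilution effect can be illustrated with a toy comparison between a single pooled vector and ColBERT-style scoring, which keeps one vector per token and, for each query token, takes its best match among the document tokens (often called MaxSim). This is only a sketch: the 2-D "token embeddings" are invented by hand, and mean pooling is assumed as the stand-in for a single-vector embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_pool(vectors):
    """Collapse a list of token vectors into one fixed-length vector."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def single_vector_score(query_vecs, doc_vecs):
    """Compare one pooled query vector against one pooled document vector."""
    return cosine(mean_pool(query_vecs), mean_pool(doc_vecs))


def maxsim_score(query_vecs, doc_vecs):
    """Late interaction: sum, over query tokens, of the best-matching document token."""
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)


# Hand-made toy embeddings: the first axis is "on topic", the second "off topic".
query = [[1.0, 0.0]]
focused_doc = [[1.0, 0.0], [0.9, 0.1]]
diluted_doc = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]

for name, doc in (("focused", focused_doc), ("diluted", diluted_doc)):
    pooled = single_vector_score(query, doc)
    late = maxsim_score(query, doc)
    print(f"{name}: single vector = {pooled:.3f}, MaxSim = {late:.3f}")
```

Both documents contain the same on-topic token, yet the pooled score of the diluted one drops sharply because the off-topic tokens drag its average away from the query, while the per-token score is unaffected.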

All the code for the course is hosted on the project's GitHub repo as Jupyter notebooks that you can download and run on your own machine.

That's not all, however. As a complementary resource to the course, there's the interactive RAG Playground (unrelated to LangChain). This playground lets you explore each step of the RAG pipeline through interactive visualizations.

Therefore you can, practically at a glance and without writing any code, take a look at what's going on behind the scenes:

Text Splitting

Visualize how documents are split into meaningful chunks while preserving semantic coherence
Character strategy: Simple splitting with fixed chunk size. Best for straightforward text processing.
Recursive character strategy: Intelligent splitting that preserves natural language boundaries and semantic meaning. Recommended for production use.
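
The difference between the two strategies is easy to see in code. The sketch below is a simplified approximation written for this article, not the playground's or LangChain's implementation; the separator order (paragraph, line, word) and the sample text are assumptions.

```python
def split_fixed(text, chunk_size):
    """Character strategy: cut every chunk_size characters, ignoring boundaries."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def split_recursive(text, chunk_size, separators=("\n\n", "\n", " ")):
    """Recursive strategy: prefer paragraph, then line, then word boundaries."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No boundary left to try, so fall back to a hard cut.
        return split_fixed(text, chunk_size)
    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = current + sep + piece if current else piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry it with finer separators.
            chunks.extend(split_recursive(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


text = "RAG retrieves context.\n\nThen the LLM answers using it."
print(split_fixed(text, 30))
print(split_recursive(text, 30))
```

With a chunk size of 30, the fixed splitter cuts in the middle of the second sentence, whereas the recursive one breaks at the blank line between the two paragraphs.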

Vector Embedding & Similarity

View text blocks and their vector embeddings side by side. Ask questions to find similar content through semantic search.

Response Generation

Observe how LLMs combine retrieved context with user queries to generate accurate, contextual responses

All free and inside your browser. RAG demystified!

More Information

RAG from Scratch on GitHub

YouTube Playlist

RAG Playground

Learn To Chat with Your Data For Free

Last Updated ( Tuesday, 10 December 2024 )
