RAG from Scratch
Written by Nikos Vaggalis   
Tuesday, 10 December 2024

The "RAG from Scratch" tutorial by LangChain and the "RAG Playground" are two great educational resources that will help you kickstart your journey with RAG.

LLMs are trained on data made available by their trainers. If you want to feed them your own data and run queries against it, you can go about it in two ways.

The old way was to fine-tune the foundation model. Fine-tuning, while a perfectly valid technique, has a few downsides: it is resource intensive in both the computing power and the volume of data required, and it has to be repeated whenever new data arrives.

The other option, and the more modern and lightweight approach, is RAG, or retrieval-augmented generation.
RAG allows LLMs to augment the user's query by connecting to external data in real time when generating their output.
This approach is lighter on resources and doesn't need constant updating, since it consumes the data at run time. The big boon, of course, is that it retrieves up-to-date answers.
In essence, RAG has, with a few exceptions, rendered fine-tuning LLMs obsolete.

However, to utilize RAG you have to stick to a well-defined pipeline:

  • Collect and preprocess your documents
  • Create the vector embeddings
  • Set up the retrieval system
  • Integrate the LLM
  • Generate the response
  • Post process
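
As a rough illustration of how the steps above fit together, here is a minimal sketch in plain Python. It is not the course's code: the word-count "embedding" stands in for a real embedding model, fake_llm is a hypothetical placeholder for a real LLM call, and every function name is made up for the example.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Collect and preprocess: lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy 'embedding': a sparse word-count vector (assumption, not a real model)."""
    return Counter(preprocess(text))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, index, k=2):
    """Retrieval: return the k stored documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query, context_docs):
    """Integrate the LLM: put the retrieved context in front of the question."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


def fake_llm(prompt):
    """Hypothetical stand-in for a real LLM: returns the first context line."""
    first = next(line for line in prompt.splitlines() if line.startswith("- "))
    return first[2:]


def postprocess(answer):
    """Post-process: trim surrounding whitespace."""
    return answer.strip()


documents = [
    "Chroma is a vector store used to index embeddings.",
    "Fine tuning updates the weights of a foundation model.",
    "RAG retrieves external documents at query time.",
]
# Build the index once: each document is stored next to its vector.
index = [(doc, embed(doc)) for doc in documents]

query = "What does RAG retrieve at query time?"
top_docs = retrieve(query, index, k=2)
answer = postprocess(fake_llm(build_prompt(query, top_docs)))
print(answer)
```

In a real system the toy pieces would be swapped for an embedding model, a vector store and an LLM client, while the overall flow of index, retrieve, prompt and generate stays the same.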

If that sounds too complicated, fear not: this new course by LangChain will show you how to build a RAG system from scratch.

Assembled as a YouTube playlist of 14 short videos, it starts with the absolute basics and moves along the pipeline to completion, describing all the intermediate steps. It does so using LangChain's own framework, Python, the ChromaDB vector store, the ChatOpenAI interface and OpenAI's LLM.

Here follows the complete list of the tutorials:

1. Overview
2. Indexing
3. Retrieval
4. Generation
5. Query Translation -- Multi Query
6. Query Translation -- RAG Fusion
7. Query Translation -- Decomposition
8. Query Translation -- Step Back
9. Query Translation -- HyDE
10. Routing
11. Query Structuring
12. Multi-Representation Indexing
13. RAPTOR
14. ColBERT

Part 13, on RAPTOR, is about how RAG systems can handle both "lower-level" questions that reference specific facts found in a single document and "higher-level" questions that distill ideas spanning many documents. Part 14, on the ColBERT approach, addresses an issue that arises because embedding models compress text into fixed-length vector representations that capture the semantic content of the document.

While this compression is very useful for efficient search and retrieval, it puts a heavy burden on that single vector representation to capture all the semantic nuance, and in some cases irrelevant content can dilute the semantic usefulness of the embedding.
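
The dilution problem can be shown with a toy example. The sketch below compares a single pooled vector per document against ColBERT-style late interaction, where every token keeps its own vector and each query token is scored against its best-matching document token (the "MaxSim" operator). The 2-D vectors are hand-made for illustration and the function names are assumptions; a real ColBERT model learns its token embeddings.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else v


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_pool(vectors):
    """Compress many token vectors into one fixed-length vector by averaging."""
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def single_vector_score(query_vecs, doc_vecs):
    """Cosine similarity between the pooled query and the pooled document."""
    return dot(normalize(mean_pool(query_vecs)), normalize(mean_pool(doc_vecs)))


def maxsim_score(query_vecs, doc_vecs):
    """Late interaction: sum, over query tokens, of the best document-token match."""
    return sum(
        max(dot(normalize(q), normalize(d)) for d in doc_vecs)
        for q in query_vecs
    )


# Toy data: one query token pointing along the first axis.
query = [[1.0, 0.0]]
# doc_a contains one exact match buried among three unrelated tokens.
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
# doc_b contains a single, only partially related token.
doc_b = [[0.6, 0.8]]

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(name, single_vector_score(query, doc), maxsim_score(query, doc))
```

With these made-up vectors the pooled representation ranks doc_b above doc_a, because the unrelated tokens drag doc_a's average away from the query, whereas the per-token scoring still finds the exact match in doc_a.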

All the code is hosted on the project's GitHub repo as Jupyter notebooks that you can download and run on your own machine.

That's not all, however. As a complementary element to the course, there's the interactive RAG Playground (unrelated to LangChain). This playground lets you explore each step of the RAG pipeline through interactive visualizations.

Therefore you can, practically at a glance and without writing any code, take a look at what's going on behind the scenes:

Text Splitting

  • Visualize how documents are split into meaningful chunks while preserving semantic coherence

  •  Character strategy: Simple splitting with fixed chunk size. Best for straightforward text processing.

  •  Recursive character strategy: Intelligent splitting that preserves natural language boundaries and semantic meaning. Recommended for production use.
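
To make the difference between the two strategies concrete, here is a simplified sketch in plain Python. It is not LangChain's implementation: split_fixed and split_recursive are hypothetical names, and the separator order (paragraph break, line break, space) is an assumption about what a "natural boundary" means.

```python
def split_fixed(text, chunk_size):
    """Character strategy: cut every chunk_size characters, wherever that falls."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def split_recursive(text, chunk_size, separators=("\n\n", "\n", " ")):
    """Recursive strategy: split on the coarsest separator first and only fall
    back to finer separators for pieces that are still too long."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No natural boundary left: fall back to fixed-size cuts.
        return split_fixed(text, chunk_size)
    sep, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            # The piece still fits in the chunk being built, so merge it.
            current = candidate
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry with the finer separators.
            chunks.extend(split_recursive(piece, chunk_size, rest))
            current = ""
    if current:
        chunks.append(current)
    return chunks


text = (
    "RAG retrieves documents.\n\n"
    "It then augments the prompt.\n\n"
    "Finally the LLM answers."
)
fixed_chunks = split_fixed(text, 30)
recursive_chunks = split_recursive(text, 30)
print(fixed_chunks)
print(recursive_chunks)
```

On this sample the fixed-size splitter cuts in the middle of a word, while the recursive one returns the three paragraphs intact because each fits within the 30-character limit.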

Vector Embedding & Similarity

  • View text blocks and their vector embeddings side by side. Ask questions to find similar content through semantic search.

Response Generation

  • Observe how LLMs combine retrieved context with user queries to generate accurate, contextual responses

All free and inside your browser. RAG demystified! 

 

More Information

RAG from Scratch on GitHub

YouTube Playlist

RAG Playground

Related Articles

Learn To Chat with Your Data For Free

 

 

 


Last Updated ( Tuesday, 10 December 2024 )