Welcome to the RAG System Building Exercise!
Hello Students! Are you ready to dive into the exciting world of Retrieval-Augmented Generation (RAG) systems? This exercise will guide you through constructing your very own RAG system, combining the power of information retrieval with state-of-the-art language models. Let's embark on this learning adventure together!
What is a RAG System?
A Retrieval-Augmented Generation (RAG) system is a powerful AI architecture that combines the strengths of large language models with external knowledge retrieval. Here's how it works:
- Retrieval: When given a query, the system searches a knowledge base to find relevant information.
- Augmentation: The retrieved information is then used to supplement the input to a language model.
- Generation: The language model generates a response based on both the original query and the retrieved information.
This approach allows the system to access and utilize vast amounts of up-to-date information without the need to retrain the entire model, leading to more accurate and informed responses.
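The three stages can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the knowledge base sentences are placeholders, word counts stand in for real embeddings, and `generate` is a stub rather than a call to an actual language model.

```python
import math
from collections import Counter

# Placeholder knowledge base; these sentences are not the project's data.
KNOWLEDGE_BASE = [
    "FAISS is a library for similarity search over dense vectors.",
    "LangChain helps connect language models to external data.",
    "Text chunking splits long documents into smaller passages.",
]


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts: a crude stand-in for a real embedding."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, k: int = 1) -> list[str]:
    """Retrieval: rank the knowledge base by similarity to the query."""
    q = bag_of_words(query)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: cosine(q, bag_of_words(doc)),
        reverse=True,
    )
    return ranked[:k]


def augment(query: str, passages: list[str]) -> str:
    """Augmentation: put the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


def generate(prompt: str) -> str:
    """Generation: stub that only reports the prompt size (no real model)."""
    return f"[model response to a {len(prompt)}-character prompt]"


question = "What is FAISS used for?"
print(generate(augment(question, retrieve(question))))
```

In the exercise itself, the word-count similarity would be replaced by embeddings and a vector index, and the stub by a real model call.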
Getting Started
This code was developed in a Python 3.11 environment, so we recommend using Python 3.11 for this exercise to avoid compatibility issues.
To begin your RAG system journey, you have two options for setting up your environment:
Option 1: Using venv (if you have Python 3.11 installed)
- Create a new Python environment: ` python -m venv rag_env `
- Activate the environment:
  - On Windows: ` rag_env\Scripts\activate `
  - On macOS and Linux: ` source rag_env/bin/activate `
- Install the required packages: ` pip install -r requirements.txt `
Option 2: Using Conda (recommended for flexibility)
- If you don't have Conda installed, download and install Miniconda or Anaconda.
- Create a new Conda environment with Python 3.11: ` conda create -n rag_env python=3.11 `
- Activate the Conda environment: ` conda activate rag_env `
- Install the required packages: ` pip install -r requirements.txt `
After setting up your environment using either method:
- Open the provided Python files and look for comments indicating where code needs to be completed.
- Fill in the missing code to make the RAG system functional.
- Run the main script to test your implementation: ` python main.py `
Adding your .env file
Please add a .env file in the root directory of the project with the following content:
OPENAI_KEY=<your_openai_key>
LOCAL=True
QUICK_DEMO=True
LOCAL controls where the embeddings are computed: locally if True, on OpenAI's servers if False. QUICK_DEMO controls the dataset size: a smaller dataset for faster results if True, the full dataset if False.
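Values in a .env file are read as strings, so a flag like `LOCAL=False` needs explicit parsing: in Python, `bool("False")` is `True`. The sketch below shows one way to read the two flags. It assumes the .env file has already been loaded into the process environment (for example with a loader such as python-dotenv), and the helper name `env_flag` is hypothetical; the project's own loading code may differ.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable such as LOCAL=True as a boolean.

    Hypothetical helper, not part of the provided project files.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Accept a few common spellings of "true"; everything else is False.
    return raw.strip().lower() in {"1", "true", "yes"}


LOCAL = env_flag("LOCAL")
QUICK_DEMO = env_flag("QUICK_DEMO")
print(f"LOCAL={LOCAL}, QUICK_DEMO={QUICK_DEMO}")
```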
Provided Data
In the data folder of this project, you'll find the PDF files of all EPFL legal documents.
Feel free to explore these files to understand the kind of data your system will be working with.
Customizing Your Data
While we've provided sample data to get you started, we encourage you to experiment with your own documents and questions! This will allow you to test your RAG system on a wider range of topics and scenarios. To use your own data, replace the provided files with your own text documents and relevant questions.
Your Task
Your mission, should you choose to accept it (and we hope you do!), is to complete the missing parts of the RAG system. You'll be working on:
- Implementing efficient text chunking
- Creating and managing embeddings
- Building a retrieval system using FAISS
- Integrating the retrieval system with a language model using LangChain
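To make the first task concrete, here is a minimal sketch of fixed-size chunking with overlap. It is one possible approach and not necessarily the one the provided files expect: the function name `chunk_text` and the default sizes are assumptions, and it splits on characters rather than on sentences or tokens.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows that share `overlap` characters.

    Hypothetical helper for illustration; sizes are arbitrary defaults.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.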
Remember, the journey is just as important as the destination. Don't hesitate to experiment, ask questions, and learn from both successes and failures.
Tips for Success
- Take time to understand each component of the RAG system.
- Test your code frequently as you implement each part.
- Collaborate with your peers and share insights.
- Don't be afraid to consult documentation or ask for help when needed.
We're excited to see what you'll create! Happy coding, and may your RAG system retrieve and generate with excellence!
Going Further
Congratulations on building your basic RAG system! However, the journey doesn't end here. There are several ways to improve and extend your system for better performance and user experience. Here are some areas to consider:
2. Streaming Responses
Our current implementation displays the entire message at once, which can lead to long wait times for users. Implementing a chunk-by-chunk response system using LangChain's streaming capabilities can significantly improve the user experience. This allows users to start reading the response while the rest is still being generated.
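The display side of streaming can be sketched without any model at all. In the example below, `fake_token_stream` is a stand-in that yields an answer word by word; with LangChain, the pieces would instead come from iterating over a runnable's `stream` method, and the exact chunk type depends on the component used, so treat that wiring as an assumption.

```python
import sys
from typing import Iterator, TextIO


def fake_token_stream(answer: str) -> Iterator[str]:
    """Stand-in for a model that yields its answer piece by piece."""
    for word in answer.split(" "):
        yield word + " "


def display_streaming(chunks: Iterator[str], out: TextIO = sys.stdout) -> str:
    """Write each piece as soon as it arrives and return the full text."""
    parts = []
    for piece in chunks:
        out.write(piece)
        out.flush()  # make the piece visible immediately, not at the end
        parts.append(piece)
    return "".join(parts)


full_answer = display_streaming(fake_token_stream("Streaming shows text early"))
```

Returning the joined text lets the caller still store or log the complete answer once the stream is finished.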
3. Improving the Retrieval Process
The retrieval part of our RAG system can be enhanced in several ways:
a) Adding Context to Chunks: Our current chunks might lack important context. For example, a chunk discussing "last quarter benefits" doesn't specify which quarter it's referring to. You can use an LLM to read the full document and append missing context to each chunk.
b) Hybrid Search: Pure embedding-based search can sometimes miss important keywords. Implementing a hybrid approach that combines keyword search (like BM25) with semantic search can lead to more accurate retrievals.
c) Reranking: Implementing a reranking step after the initial retrieval can improve the relevance of the results. While this might introduce some latency, especially on limited hardware, it can significantly enhance the quality of the retrieved information.
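Hybrid search and reranking can be sketched together. The example assumes the keyword ranking (for example from BM25) and the semantic ranking have already been computed, and merges them with reciprocal rank fusion, which is one common fusion method and not something the exercise prescribes. The reranker takes any scoring function; `word_overlap` is a toy scorer used so the example runs on its own, where a real system would typically plug in a stronger relevance model.

```python
from typing import Callable


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists; each item scores 1 / (k + rank) per list."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


def word_overlap(query: str, passage: str) -> float:
    """Toy scorer: fraction of query words that appear in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


def rerank(
    query: str,
    candidates: list[str],
    score_fn: Callable[[str, str], float],
    top_n: int = 3,
) -> list[str]:
    """Re-score the retrieved candidates and keep the top_n best."""
    scored = [(score_fn(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_n]]


# Hypothetical rankings of three passages from the two retrievers.
keyword_ranking = ["holiday schedule", "exam rules for students", "rules of conduct"]
semantic_ranking = ["exam rules for students", "rules of conduct", "holiday schedule"]

fused = reciprocal_rank_fusion([keyword_ranking, semantic_ranking])
print(rerank("exam rules", fused, word_overlap, top_n=2))
```

Fusing on ranks rather than raw scores avoids having to normalise BM25 scores and embedding similarities onto a common scale.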
4. Fine-tuning and Domain Adaptation
Consider fine-tuning your language model on domain-specific data to improve its performance on your particular use case. This can lead to more accurate and relevant responses.
5. Implementing Feedback Mechanisms
Add a way for users to provide feedback on the system's responses. This can help you identify areas for improvement and potentially implement a learning mechanism to enhance the system over time.
6. Exploring Multi-modal RAG
If your use case involves not just text but also images or other types of data, consider extending your RAG system to handle multiple modalities.
Remember, building a RAG system is an iterative process. Each of these improvements opens up new possibilities and challenges. Don't hesitate to experiment and push the boundaries of what your system can do!
We're excited to see how you'll take your RAG system to the next level. Keep exploring, keep innovating, and most importantly, keep learning!