Meet Verba: An Open-Source Tool to Build Your Own RAG (Retrieval-Augmented Generation) Pipeline and Use LLMs for Answers Grounded in Internal Data

Verba is an open-source project that gives RAG applications a simplified, user-friendly interface. Users can load their data and start having relevant conversations with it quickly.

Verba is more of a companion than a mere tool for querying and manipulating data. Paperwork, comparing and contrasting several sets of numbers, data analysis: Verba makes all of this achievable through Weaviate and Large Language Models (LLMs).

Built on Weaviate’s Generative Search engine, Verba automatically pulls the necessary background information from the documents whenever a search is performed, then uses the processing power of LLMs to provide thorough, context-aware answers. Verba’s simple layout makes all of this information easy to retrieve. Its data import features support file formats such as .txt and .md, among others. Verba automatically chunks and vectorizes the data before loading it into Weaviate, which makes the data more suitable for search and retrieval.
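The chunking step can be illustrated with a minimal sketch. This is not Verba's actual implementation; the function name `chunk_words`, the word-based splitting, and the `chunk_size` and `overlap` parameters are all assumptions chosen for illustration. The vectorization step, which needs an embedding model, is not shown.

```python
def chunk_words(text: str, chunk_size: int = 100, overlap: int = 20) -> list:
    """Split text into word-based chunks; neighbors share `overlap` words.

    Hypothetical helper: Verba's real chunking strategy may differ.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Overlapping windows are a common choice because a sentence that straddles a chunk boundary still appears whole in at least one chunk.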

Verba takes advantage of the generate module and hybrid search options available in Weaviate. These search methods scan the documents for important pieces of context, which Large Language Models then use to provide in-depth responses to the queries.
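To make the idea of hybrid search concrete, here is a toy sketch that blends a keyword score with a vector-similarity score. It is only an illustration of the general technique, not Weaviate's actual ranking or fusion algorithm; the function names and the `alpha` weight are assumptions, and the vectors would come from an embedding model in practice.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query, doc):
    """Fraction of query terms that also appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """Rank (text, vector) pairs by a weighted blend of both scores.

    alpha=1.0 uses only vector similarity; alpha=0.0 uses only keywords.
    """
    scored = []
    for text, vec in docs:
        score = alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]
```

Blending the two signals lets exact term matches and semantically similar passages both surface in the results.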

To improve the speed of future searches, Verba embeds both the generated results and the queries in Weaviate’s Semantic Cache. Before answering the question, Verba will look in its Semantic Cache to determine if a similar one has already been answered.
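A semantic cache of this kind can be sketched in a few lines. The class below is a simplified in-memory stand-in, not Weaviate's Semantic Cache; the class name, the `threshold` value, and the use of cosine similarity over precomputed query vectors are assumptions for illustration.

```python
import math
from typing import List, Optional, Tuple


def _cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical in-memory cache keyed by query embeddings."""

    def __init__(self, threshold: float = 0.9) -> None:
        self.threshold = threshold  # minimum similarity that counts as a hit
        self._entries: List[Tuple[List[float], str]] = []

    def add(self, query_vec: List[float], answer: str) -> None:
        self._entries.append((query_vec, answer))

    def lookup(self, query_vec: List[float]) -> Optional[str]:
        """Return the answer of the most similar cached query, if close enough."""
        best_score, best_answer = 0.0, None
        for vec, answer in self._entries:
            score = _cosine(query_vec, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None
```

On a cache hit the stored answer can be returned directly, which skips both retrieval and the LLM call for that query.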

An OpenAI API key is required regardless of the deployment method to enable data input and querying capabilities. Add the API key to the system environment variables or create a .env file when cloning the project.
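As a rough sketch of how a key could be picked up from either place, the snippet below checks the environment first and falls back to the contents of a .env file. The variable name `OPENAI_API_KEY` is the one conventionally used by OpenAI's tooling and is assumed here, since the article does not name it; both helper functions are hypothetical and not part of Verba.

```python
import os


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_openai_key(dotenv_text: str = "", environ=None) -> str:
    """Prefer the process environment, then fall back to .env contents."""
    environ = os.environ if environ is None else environ
    key = environ.get("OPENAI_API_KEY") or parse_dotenv(dotenv_text).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```

In practice the .env text would be read from a file in the cloned project directory; it is passed as a string here to keep the example self-contained.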

Verba can connect to Weaviate instances in various ways, depending on the specific use case. If the VERBA_URL and VERBA_API_KEY environment variables are not present, Verba will use Weaviate Embedded instead. This local deployment is the simplest way to launch the Weaviate database for prototyping and testing.
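The fallback logic described above might look like the following sketch. Only the variable names VERBA_URL and VERBA_API_KEY come from the article; the function name, the returned dictionary shape, and the choice to fall back when either variable is missing are assumptions.

```python
def choose_weaviate_mode(environ: dict) -> dict:
    """Decide between a remote Weaviate instance and Weaviate Embedded.

    Assumption: both variables must be set to use a remote instance.
    """
    url = environ.get("VERBA_URL")
    api_key = environ.get("VERBA_API_KEY")
    if url and api_key:
        return {"mode": "remote", "url": url, "api_key": api_key}
    # No remote configuration found: run the local embedded database
    return {"mode": "embedded"}
```

Calling it with `os.environ` would apply the same decision to the real process environment.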

Verba provides simple instructions for importing data for further processing. Before continuing, be aware that importing data costs money: Verba uses only OpenAI models, and the cost of using them is charged to the configured OpenAI API key. Data embedding and answer generation are the primary cost drivers.

You can give it a shot.
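A back-of-the-envelope estimate of these two cost drivers can be written as a small function. The per-1,000-token prices are parameters, not real OpenAI prices; any numbers passed in are placeholders that should be replaced with the current rates for the models in use.

```python
def estimate_cost(embedding_tokens: int, generation_tokens: int,
                  embed_price_per_1k: float, gen_price_per_1k: float) -> float:
    """Estimate spend as tokens times a per-1,000-token price for each driver."""
    embedding_cost = (embedding_tokens / 1000) * embed_price_per_1k
    generation_cost = (generation_tokens / 1000) * gen_price_per_1k
    return embedding_cost + generation_cost
```

Embedding cost is paid once per imported chunk, while generation cost recurs with every answered query, so the second term tends to grow with usage.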

There are three main parts to Verba:

  • The Weaviate database can be hosted on Weaviate Cloud Service (WCS) or on one's own server.
  • The FastAPI endpoint mediates between the Large Language Model provider and the Weaviate data store.
  • The React frontend (served statically via FastAPI) provides a dynamic user interface for data exploration and manipulation.
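The mediating role of the middle layer can be sketched as a plain function that retrieves context from the data store and passes it to the LLM. This is not Verba's FastAPI code; `answer_query`, the `retrieve` and `generate` callables, and the prompt layout are hypothetical stand-ins for the Weaviate client and the LLM provider.

```python
from typing import Callable, List


def answer_query(query: str,
                 retrieve: Callable[[str], List[str]],
                 generate: Callable[[str], str]) -> dict:
    """Fetch context chunks for the query, then ask the LLM to answer with them."""
    chunks = retrieve(query)  # stand-in for a Weaviate search
    prompt = "Context:\n" + "\n".join(chunks) + "\n\nQuestion: " + query
    answer = generate(prompt)  # stand-in for an LLM call
    return {"query": query, "context": chunks, "answer": answer}
```

A web endpoint would wrap a function like this and return the dictionary as JSON, which the frontend could then render alongside the source chunks.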

Check out the GitHub and try it. All credit for this research goes to the researchers on this project.


Dhanshree Shenwai is a Computer Science Engineer with experience in FinTech companies covering the Financial, Cards & Payments, and Banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier in today’s evolving world.
