With 22 million daily queries, Brave Search is rapidly becoming one of the most popular search engines. Brave delivers impartial search results based on its own index of the Web. Now Brave has gone a step further, using artificial intelligence to increase the accuracy of its Summarizer. There is a strong emphasis on users’ right to privacy, with no tracking of their searches or other actions.
The Brave team explained that it created the Summarizer by combining existing technologies in response to the growing prevalence of AI in search, including the release of ChatGPT, an AI language model, and Microsoft’s announcement that it would incorporate OpenAI’s model into its Bing search engine.
The underlying large language models (LLMs) are trained to handle various information sources on the Internet, making them more reliable than a purely generative AI model. When a user types a question into Brave Search, the Summarizer provides a short, informative answer at the top of the page based only on the user’s Web search results. It also includes source credit for transparency and accountability, in contrast to AI chat tools that can deliver fabricated responses.
In addition, there are always active links to the primary sources from which the data was compiled. The authority biases of large language models can be reduced if proper attribution is maintained and users are given tools to evaluate the credibility of information sources.
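Brave has not published the data format it uses, but the idea of a summary that always carries links back to its primary sources can be sketched with a simple structure. The `Source` and `Summary` types below are hypothetical, for illustration only:

```python
from dataclasses import dataclass, field

@dataclass
class Source:
    """A primary source the summary draws from (hypothetical schema)."""
    title: str
    url: str

@dataclass
class Summary:
    """A generated answer that keeps attribution attached to the text."""
    text: str
    sources: list[Source] = field(default_factory=list)

    def is_attributed(self) -> bool:
        # A summary without at least one source would lack the
        # transparency the article describes.
        return len(self.sources) > 0

answer = Summary(
    text="Brave Search serves results from its own index of the Web.",
    sources=[Source(title="Brave Search", url="https://search.brave.com")],
)
print(answer.is_attributed())  # True: every summary links to its sources
```

Keeping sources as structured data, rather than appending them as free text, lets the UI render each citation as an active link.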
The Brave Search team built the Summarizer from the ground up, so users can rest assured that it adheres to the same high standards of transparency and privacy. The Brave Summarizer uses its own privately held and operated models, fine-tuned for maximum inference efficiency. Rather than relying on ChatGPT or its underlying infrastructure, the Summarizer is made up of three LLMs, each trained for a specific task:
- The first is a question answering (QA) model, which determines whether a text fragment contains an answer. This is an extension of what was already in place to power Brave Search’s knowledge graph and featured snippets; Brave has used LLMs for some time to improve search relevance. The length and quantity of evaluated text segments make all the difference.
- A group of zero-shot classifiers then further filters the candidates remaining after the QA extraction step according to a wide range of criteria (hate speech, vulgar writing, spam, etc.).
- Finally, the summarizer/paraphrasing model processes the candidate texts, rewriting the input to remove redundancies and standardize the language for better readability.
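The three-stage pipeline above can be sketched in miniature. Brave’s actual models are not public, so the scoring, filtering, and paraphrasing functions below are toy stand-ins (keyword overlap, a blocklist, and deduplication) that only illustrate how the stages chain together:

```python
def qa_score(fragment: str, question: str) -> float:
    """Stand-in for the QA model: scores whether a fragment answers
    the question. Approximated here by keyword overlap."""
    q_terms = set(question.lower().split())
    f_terms = set(fragment.lower().split())
    return len(q_terms & f_terms) / max(len(q_terms), 1)

def passes_filters(fragment: str) -> bool:
    """Stand-in for the zero-shot classifiers screening for spam,
    hate speech, etc. Here just a toy spam blocklist."""
    blocklist = {"buy now", "click here"}
    return not any(term in fragment.lower() for term in blocklist)

def paraphrase(fragments: list[str]) -> str:
    """Stand-in for the paraphrasing model: removes redundant
    candidates and joins the rest into one readable answer."""
    seen, unique = set(), []
    for f in fragments:
        if f not in seen:
            seen.add(f)
            unique.append(f)
    return " ".join(unique)

def summarize(question: str, results: list[str], threshold: float = 0.3) -> str:
    # Stage 1: keep fragments the QA model thinks answer the question.
    candidates = [r for r in results if qa_score(r, question) >= threshold]
    # Stage 2: drop candidates flagged by the classifiers.
    candidates = [c for c in candidates if passes_filters(c)]
    # Stage 3: rewrite the survivors into a single summary.
    return paraphrase(candidates)

results = [
    "Brave Search answers queries from its own index of the Web.",
    "Click here to buy now!",
    "Brave Search answers queries from its own index of the Web.",
]
print(summarize("How does Brave Search answer queries?", results))
```

The duplicate result appears only once in the output, and the spam fragment is dropped, mirroring the extraction, filtering, and redundancy-removal roles the three models play.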
Users of Brave Search, both on desktop and mobile, can now access the Brave Summarizer. The AI model is tested against the peak of 600 requests per second that Brave Search currently processes. Only roughly 17% of queries currently result in a summary being generated; the team hopes to increase this as they scale, applying the Summarizer to all searches, while Bing and Google have yet to open up their systems.
Much work has gone into making sure the generated summaries are high-quality, in addition to being scalable. Yet, given that the model is still in its infancy, “hallucinations,” in which seemingly unconnected pieces are combined into a single conclusion, are possible. The team plans to fix these issues soon and refine the model so that people can start using it.
Check out the Reference Article. All credit for this research goes to the researchers on this project.
Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Bhubaneswar. She is a Data Science enthusiast and has a keen interest in the scope of application of artificial intelligence in various fields. She is passionate about exploring new advancements in technologies and their real-life applications.