Google AI Research Proposes a General Approach for Personalized Text Generation Using Large Language Models (LLMs)

With the rise of AI-based tools for content production, personalized text generation has attracted considerable attention. To serve specific audiences, creation contexts, and information needs, generative systems must be able to produce personalized responses that take additional context into account, such as documents the user has already written.

Researchers have studied personalized text generation in several settings, such as reviews, chatbots, and social media. Most existing work proposes task-specific models that rely on domain-specific features or information. The question of how to build a generic approach that works across settings has received less attention. With the rise of generative AI, large language models (LLMs) have become prominent in many text generation tasks, especially through chatbots like ChatGPT and Bard. However, few studies have looked into how to give LLMs such personalization capabilities.

Recent Google research offers a generic approach to personalized text generation using large language models. The study is motivated by a common method of writing instruction that breaks the process of writing from outside sources into smaller steps: research, source evaluation, summarization, synthesis, and integration.

To train LLMs for personalized text generation, the team takes a similar approach, adopting a multistage, multitask framework that includes retrieval, ranking, summarization, synthesis, and generation. In particular, they use the current document’s title and first line to form a query and retrieve relevant information from an auxiliary repository of personal contexts, such as previous documents the user has written.

Next, they rank the retrieved results for relevance and importance, then summarize the ranked results. In addition to retrieval and summarization, they synthesize the retrieved information into key elements, which are then fed into the large language model to generate the new document.
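The stages described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' implementation: the token-overlap scoring, the first-sentence "summary", and the frequent-word "synthesis" are simple stand-ins chosen for this sketch, all function names are hypothetical, and the final step only assembles the text that would be handed to a language model rather than calling one.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption of this sketch).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for",
             "on", "with", "it", "this", "that", "i", "my", "was"}

def tokenize(text):
    """Lowercase word tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9']+", text.lower())
            if t not in STOPWORDS]

def build_query(title, first_line):
    """Form the retrieval query from the document's title and first line."""
    return f"{title} {first_line}"

def retrieve_and_rank(query, past_docs, top_k=2):
    """Retrieval and ranking: score past documents by token overlap
    with the query, drop non-matching ones, keep the top_k best."""
    q = set(tokenize(query))
    scored = [(len(q & set(tokenize(d))), i, d) for i, d in enumerate(past_docs)]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, then original order
    return [d for _, _, d in scored[:top_k]]

def summarize(docs):
    """Summarization stand-in: the first sentence of each ranked document."""
    return [re.split(r"(?<=[.!?])\s+", d.strip())[0] for d in docs]

def synthesize(docs, n=5):
    """Synthesis stand-in: the n most frequent content words as key elements."""
    counts = Counter(t for d in docs for t in tokenize(d))
    return [w for w, _ in counts.most_common(n)]

def build_prompt(title, first_line, summaries, key_elements):
    """Generation input: assemble the text that would be passed to an LLM."""
    return "\n".join([
        f"Title: {title}",
        f"Opening: {first_line}",
        "Summary of past writing: " + " ".join(summaries),
        "Key elements: " + ", ".join(key_elements),
        "Continue the document in the author's style.",
    ])

# Hypothetical personal context: documents the user wrote earlier.
past_docs = [
    "The battery on this phone lasts two days. I charge it rarely.",
    "Our team meeting moved to Friday. Please update your calendar.",
    "This phone camera is sharp. The battery drains fast when recording video.",
]
title = "Phone battery review"
first_line = "I have used this phone for a month."

ranked = retrieve_and_rank(build_query(title, first_line), past_docs)
summaries = summarize(ranked)
key_elements = synthesize(ranked)
prompt = build_prompt(title, first_line, summaries, key_elements)
print(prompt)
```

In a real system each stand-in would be replaced by a learned or more capable component. The point of the sketch is only the data flow: query, retrieved and ranked context, summary, key elements, and then the generation prompt.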

It is a common observation in the field of language teaching that reading and writing skills develop hand in hand. Moreover, research shows that author recognition tasks can measure an individual’s reading level and amount, and that these measures correlate with reading proficiency. These two findings led the researchers to create a multitask setting in which an auxiliary task asks the large language model to identify the authorship of a given text, with the aim of improving its reading ability. They hope that this challenge will help the model interpret the provided text more accurately and produce more compelling and tailored writing.
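One way to picture the multitask setup is as a single training set that mixes generation examples with authorship examples. The sketch below is an assumption-laden illustration: it frames the auxiliary task as a yes/no question about whether two texts share an author, which is only one possible formulation, and the task prefixes, field names, and function names are all hypothetical.

```python
import random
from itertools import combinations

def generation_examples(docs_by_author):
    """Main task: one (input -> target) pair per document."""
    examples = []
    for author, docs in docs_by_author.items():
        for title, body in docs:
            examples.append({
                "task": "generate",
                "input": f"[generate] title: {title}",
                "target": body,
            })
    return examples

def author_id_examples(docs_by_author):
    """Auxiliary task (assumed formulation): do two texts share an author?"""
    flat = [(author, body)
            for author, docs in docs_by_author.items()
            for _, body in docs]
    examples = []
    for (a1, t1), (a2, t2) in combinations(flat, 2):
        examples.append({
            "task": "author_id",
            "input": f"[same author?] text A: {t1} text B: {t2}",
            "target": "yes" if a1 == a2 else "no",
        })
    return examples

def build_multitask_set(docs_by_author, seed=0):
    """Mix both tasks into one shuffled training list (seeded for repeatability)."""
    mixed = generation_examples(docs_by_author) + author_id_examples(docs_by_author)
    random.Random(seed).shuffle(mixed)
    return mixed

# Hypothetical toy corpus: author -> list of (title, body) pairs.
corpus = {
    "alice": [("Trip notes", "We hiked at dawn."), ("Recipe", "Simmer the sauce slowly.")],
    "bob": [("Bug report", "The app crashes on login."), ("Review", "Solid laptop, weak speakers.")],
}
train_set = build_multitask_set(corpus)
print(len(train_set), "examples")
```

Training on the mixed list would expose the model to both objectives at once. How the two tasks are actually weighted and phrased is not stated in this article.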

To assess the proposed models, the team used three publicly available datasets consisting of email correspondence, social media debates, and product reviews. The multistage, multitask framework shows substantial gains over several baselines across all three datasets.

Check out the Paper. All credit for this research goes to the researchers on this project.

Dhanshree Shenwai is a Computer Science Engineer with experience at FinTech companies covering the Financial, Cards & Payments, and Banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier in today’s evolving world.