Another Large Language Model! Meet IGEL: An Instruction-Tuned German LLM Family

IGEL is an instruction-tuned German large language model family for text. IGEL version 001 (Instruct-igel-001) is an early proof of concept, built to test whether it is feasible to construct a German instruction-tuned model from a combination of existing open-source models and an instruction dataset machine-translated into German.

The first version of IGEL is based on BigScience's BLOOM, which Malte Ostendorff adapted to German. IGEL is designed to perform various natural language understanding tasks, including sentiment analysis, language translation, and question answering, with high accuracy and reliability in each area.


The team wanted to test how well LLMs perform instruction-following tasks in German. To do so, they took a pre-trained, German-adapted BLOOM model (6B) and fine-tuned it on a dataset of translated instructions. The dataset was built by automatically translating English instructions into German. Although this strategy raises the risk of translation errors, the goal was to determine whether the model could still learn to produce useful instruction-following replies.
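To make the automatic-translation step concrete, here is a minimal sketch of how an English instruction dataset could be machine-translated into German. The article does not say which translation system the team used, so the MT model (Helsinki-NLP/opus-mt-en-de), the dataset fields, and the example records below are all illustrative assumptions.

```python
# Sketch: machine-translating an English instruction dataset into German.
# Model choice and data schema are assumptions, not the IGEL team's setup.
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")

english_examples = [
    {"instruction": "What is the capital of France?", "output": "Paris is the capital of France."},
    {"instruction": "Name three primary colors.", "output": "Red, blue, and yellow."},
]

german_examples = [
    {
        "instruction": translator(ex["instruction"])[0]["translation_text"],
        "output": translator(ex["output"])[0]["translation_text"],
    }
    for ex in english_examples
]

# e.g. "Was ist die Hauptstadt von Frankreich?"
print(german_examples[0]["instruction"])
```

As the article notes, translating field by field like this can introduce errors (dropped formatting, mistranslated named entities), which is exactly the noise the experiment accepted in exchange for cheap data.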

Instruct-igel-001 is a LoRA-tuned BLOOM-CLP Deutsch model (6.4B parameters) whose adapter weights have been merged so it can be used directly with Hugging Face Transformers. It was trained on naively translated instruction data, with little attention paid to data cleaning, filtering, or post-processing.
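Because the LoRA weights are merged, the model loads like any ordinary causal LM in Transformers. The sketch below assumes the Hugging Face repository id "philschmid/instruct-igel-001" and a German "Anweisung"/"Antwort" prompt template; both are assumptions drawn from the public model card rather than details stated here, so consult the model card for the exact format.

```python
# Sketch: loading the merged-weight model with Hugging Face Transformers.
# Repository id and prompt template are assumptions; verify on the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "philschmid/instruct-igel-001"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Assumed instruction format with German markers.
prompt = "### Anweisung:\nErkläre kurz, was ein Sprachmodell ist.\n\n### Antwort:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, top_p=0.9, temperature=0.8
)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```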

The team mentioned that instruct-igel-001 exhibits problems common to language models, including hallucination, toxicity, and stereotyping. They plan to finish developing a chat model with a conversational interface, which will let them improve data quality in ways that go beyond the traditional request-and-response methodology.


Check out the Blog and try the model here. All Credit For This Research Goes To the Researchers on This Project. Also, don't forget to join our 18k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.