Large Language Model

Researchers from China Introduce Video-LLaVA: A Simple but Powerful Large Visual-Language Baseline Model

Researchers from Peking University, Peng Cheng Laboratory, Peking University Shenzhen Graduate School, and Sun Yat-sen University introduce the Large Vision-Language Model (LVLM) approach, Video-LLaVA,...

Meet LQ-LoRA: A Variant of LoRA that Allows Low-Rank Quantized Matrix Decomposition for Efficient Language Model Finetuning

In the rapidly advancing era of Artificial Intelligence, the introduction of Large Language Models (LLMs) has transformed the way machines and humans interact with...

Redefining Transformers: How Simple Feed-Forward Neural Networks Can Mimic Attention Mechanisms for Efficient Sequence-to-Sequence Tasks

Researchers from ETH Zurich analyze the efficacy of utilizing standard shallow feed-forward networks to emulate the attention mechanism in the Transformer model, a leading...

How To Train Your LLM Efficiently? Best Practices for Small-Scale Implementation

Among the daily deluge of news about new advancements in Large Language Models (LLMs), you might be asking, "how do I train my own?"....

Inflection Introduces Inflection-2: The Best AI Model in the World for Its Compute Class and the Second Most Capable LLM in the World Today

Inflection AI has developed a Large Language Model whose performance is right up there with the best. The company states that its model, Inflection-2, is the second most...

Meta Research Introduces System 2 Attention (S2A): An AI Technique that Enables an LLM to Decide on the Important Parts of the Input Context...

Large Language Models (LLMs), although highly competent in a wide array of language tasks, often display weak reasoning capabilities by making very simple mistakes....

This AI Research from China Explores the Illusionary Mind of AI: A Deep Dive into Hallucinations in Large Language Models

Large language models have recently brought about a paradigm change in natural language processing, leading to previously unheard-of advancements in language creation, comprehension, and...

Choosing the Right Whisper Model: When To Use Whisper v2, Whisper v3, and Distilled Whisper?

In the field of Artificial Intelligence and Machine Learning, speech recognition models are transforming the way people interact with technology. These models based on...

NVIDIA AI Research Releases HelpSteer: A Multiple Attribute Helpfulness Preference Dataset for SteerLM with 37k Samples

In the rapidly advancing field of Artificial Intelligence (AI) and Machine Learning (ML), developing intelligent systems that smoothly align with human preferences is crucial....

This AI Paper Proposes ML-BENCH: A Novel Artificial Intelligence Approach Developed to Assess the Effectiveness of LLMs in Leveraging Existing Functions in Open-Source Libraries

LLMs have been increasingly deployed as potent linguistic agents capable of performing various programming-related activities. Despite these impressive advances, a sizable chasm still...

Researchers from Microsoft Research and Tsinghua University Proposed Skeleton-of-Thought (SoT): A New Artificial Intelligence Approach to Accelerate Generation of LLMs

Large Language Models (LLMs), such as GPT-4 and LLaMA, have undoubtedly transformed the technological landscape. However, sluggish processing speed is a recurring challenge limiting...

NVIDIA AI Researchers Propose Tied-Lora: A Novel Artificial Intelligence Approach that Aims to Improve the Parameter Efficiency of the Low-rank Adaptation (LoRA) Methods

A group of researchers from NVIDIA has developed a new technique called Tied-LoRA, which aims to improve the parameter efficiency of the Low-rank Adaptation...