Author: Mohammad Asjad

Asjad is an intern consultant at Marktechpost. He is pursuing a B.Tech in mechanical engineering at the Indian Institute of Technology, Kharagpur. Asjad is a machine learning and deep learning enthusiast who is always researching the applications of machine learning in healthcare.

MoEUT: A Robust Machine Learning Approach to Addressing Universal Transformers’ Efficiency Challenges

Transformers are essential in modern machine learning, powering large language models, image processors, and reinforcement learning agents. Universal Transformers (UTs) are a promising alternative...

Llama3-V: A SOTA Open-Source VLM with Performance Comparable to GPT4-V, Gemini Ultra, and Claude Opus from a 100x Smaller Model

Llama 3 has significantly outperformed GPT-3.5 and even surpassed GPT-4 in several benchmarks, showcasing its strength in efficiency and task-specific performance despite having fewer...

In-Context Learning Capabilities of Multi-Layer Perceptrons (MLPs): A Comparative Study with Transformers

Recent years have seen significant advances in neural language models, particularly Large Language Models (LLMs) enabled by the Transformer architecture and increased scale. LLMs...

Question-Answer Cross Attention Networks (QAN): Advancing Answer Selection in Community Question Answering

Community Question Answering (CQA) platforms, exemplified by Quora, Yahoo! Answers, and StackOverflow, serve as interactive hubs for information exchange. Despite their popularity, the varying...

Inductive Biases in Deep Learning: Understanding Feature Representation

Machine learning research aims to learn representations that enable effective downstream task performance. A growing subfield seeks to interpret these representations' roles in model...

Optimizing Agent Planning: A Parametric AI Approach to World Knowledge

Large Language Models (LLMs) have advanced natural language processing tasks significantly. Recently, using LLMs for physical world planning tasks has shown promise. However, LLMs,...

Unlocking the Potential of SirLLM: Advancements in Memory Retention and Attention Mechanisms

The rapid growth of large language models (LLMs) has catalyzed the development of numerous NLP applications, such as chatbots, writing assistants, and programming aids....

Achieving Balance in Lifelong Learning: The WISE Memory Approach

LLMs demonstrate emergent intelligence with increased parameters, compute, and data, hinting at artificial general intelligence. Despite advancements, deployed LLMs still exhibit errors like hallucinations,...

A Paradigm Shift: MoRA’s Role in Advancing Parameter-Efficient Fine-Tuning Techniques

Parameter-efficient fine-tuning (PEFT) techniques adapt large language models (LLMs) to specific tasks by modifying a small subset of parameters, unlike Full Fine-Tuning (FFT), which...

Transparency in Foundation Models: The Next Step in the Foundation Model Transparency Index (FMTI)

Foundation models are central to AI's influence on the economy and society. Transparency is crucial for accountability, competition, and understanding, particularly regarding the data...

An Efficient AI Approach to Memory Reduction and Throughput Enhancement in LLMs

The efficient deployment of large language models (LLMs) necessitates high throughput and low latency. However, LLMs' substantial memory consumption, particularly by the key-value (KV)...

Apple Researchers Propose KV-Runahead: An Efficient Parallel LLM Inference Technique to Minimize the Time-to-First-Token

Large language models (LLMs), particularly Generative Pre-trained Transformer (GPT) models, have demonstrated strong performance across various language tasks. However, challenges persist in their decoder...
