Large Language Model

GraCoRe: A New AI Benchmark for Unveiling Strengths and Weaknesses in LLM Graph Comprehension and Reasoning

Graph comprehension and complex reasoning in artificial intelligence involve developing and evaluating the abilities of Large Language Models (LLMs) to understand and reason about...

This AI Research from Ohio State University and CMU Discusses Implicit Reasoning in Transformers And Achieving Generalization Through Grokking

Large Language Models (LLMs) with parametric memory of rules and knowledge have shown limitations in implicit reasoning. Research has shown that these models, even...

Innodata’s Comprehensive Benchmarking of Llama2, Mistral, Gemma, and GPT for Factuality, Toxicity, Bias, and Hallucination Propensity

In a recent study by Innodata, various large language models (LLMs) such as Llama2, Mistral, Gemma, and GPT were benchmarked for their performance in...

This AI Paper from Cohere for AI Presents a Comprehensive Study on Multilingual Preference Optimization

Multilingual natural language processing (NLP) is a rapidly advancing field that aims to develop language models capable of understanding and generating text in multiple...

Salesforce Research Introduces INDICT: A Groundbreaking Framework Enhancing the Safety and Helpfulness of AI-Generated Code Across Diverse Programming Languages

The ability to automate and assist in coding has the potential to transform software development, making it faster and more efficient. However, ensuring these...

Tsinghua University Open Sources CodeGeeX4-ALL-9B: A Groundbreaking Multilingual Code Generation Model Outperforming Major Competitors and Elevating Code Assistance

In a significant leap forward for the field of code generation, the Knowledge Engineering Group (KEG) and Data Mining team at Tsinghua University have...

T-FREE: A Tokenizer-Free Approach for Efficient and Scalable Text Encoding in Large Language Models

Natural language processing (NLP) drives researchers to develop algorithms that enable computers to understand, interpret, and generate human languages. These efforts cover various applications,...

InternLM2.5-7B-Chat: Open Sourcing Large Language Models with Unmatched Reasoning, Long-Context Handling, and Enhanced Tool Use

InternLM has unveiled its latest advancement in open large language models, the InternLM2.5-7B-Chat, available in GGUF format. This model is compatible with llama.cpp, an...

WorldBench: A Dynamic and Flexible LLM Benchmark Composed of Per-Country Data from the World Bank

Large language models (LLMs) have gained significant attention for their impressive performance across various tasks, from summarizing news to writing code and answering trivia...

Accelerating LLM Inference: Introducing SampleAttention for Efficient Long Context Processing

Large language models (LLMs) now support very long context windows, but the quadratic complexity of standard attention results in significantly prolonged Time-to-First-Token (TTFT) latency....

MInference (Milliontokens Inference): A Training-Free Efficient Method for the Pre-Filling Stage of Long-Context LLMs Based on Dynamic Sparse Attention

The computational demands of LLMs, particularly with long prompts, hinder their practical use due to the quadratic complexity of the attention mechanism. For instance,...

A Comprehensive Guide to Fine-Tuning ChatGPT for Your Business

Businesses continually seek ways to leverage AI to enhance their operations. One of the most impactful applications of AI is conversational agents, with OpenAI's...

NuminaMath 7B TIR Released: Transforming Mathematical Problem-Solving with Advanced Tool-Integrated Reasoning and Python REPL...

Numina has announced the release of its latest model, NuminaMath 7B TIR. This advanced language model is designed specifically for solving mathematical problems. The...

Jina AI Releases Jina Reranker v2: A Multilingual Model for RAG and Retrieval with...

Jina AI has released the Jina Reranker v2 (jina-reranker-v2-base-multilingual), an advanced transformer-based model fine-tuned for text reranking tasks. This model is designed to significantly...

Google Releases Gemma 2 Series Models: Advanced LLM Models in 9B and 27B Sizes...

Google has unveiled two new models in its Gemma 2 series: the 27B and 9B. These models showcase significant advancements in AI language processing,...
