University College London

FlashAttention-3, the latest release in the FlashAttention series, has been designed to address the inherent bottlenecks of the attention layer in Transformer architectures. These bottlenecks are crucial for the performance of large language models (LLMs) and...
One of the emerging challenges in artificial intelligence is whether next-token prediction can truly model human intelligence, particularly in planning and reasoning. Despite its extensive application in modern language models, this method might be inherently limited...

Facebook AI Open-Sources CO3D (Common Objects in 3D) Data Set For 3D Reconstruction in Computer Vision Research

3D object reconstruction is a significant computer vision problem with AR/VR technology applications, such as telepresence and the generation of 3D models for gaming....

NVIDIA And King’s College London Uses Cambridge-1 To build AI Models To Generate Synthetic Brain Images

NVIDIA and King's College London have revealed new information about one of the first projects to be run on Cambridge-1, the UK's most powerful...

DeepMind and University College London Introduce Alchemy, A Novel Open-Source Benchmark For Meta-Reinforcement learning (RL) Research

Alchemy, a novel open-source benchmark for meta Reinforcement learning (RL) in the recent decade, has garnered much attention in the ML field. The RL...

NuminaMath 7B TIR Released: Transforming Mathematical Problem-Solving with Advanced Tool-Integrated Reasoning and Python REPL...

0
Numina has announced the release of its latest model, NuminaMath 7B TIR. This advanced language model is designed specifically for solving mathematical problems. The...

Tsinghua University Open Sources CodeGeeX4-ALL-9B: A Groundbreaking Multilingual Code Generation Model Outperforming Major Competitors...

0
In a significant leap forward for the field of code generation, the Knowledge Engineering Group (KEG) and Data Mining team at Tsinghua University have...

InternLM2.5-7B-Chat: Open Sourcing Large Language Models with Unmatched Reasoning, Long-Context Handling, and Enhanced Tool...

0
InternLM has unveiled its latest advancement in open large language models, the InternLM2.5-7B-Chat, available in GGUF format. This model is compatible with llama.cpp, an...

Jina AI Releases Jina Reranker v2: A Multilingual Model for RAG and Retrieval with...

0
Jina AI has released the Jina Reranker v2 (jina-reranker-v2-base-multilingual), an advanced transformer-based model fine-tuned for text reranking tasks. This model is designed to significantly...

Google Releases Gemma 2 Series Models: Advanced LLM Models in 9B and 27B Sizes...

0
Google has unveiled two new models in its Gemma 2 series: the 27B and 9B. These models showcase significant advancements in AI language processing,...

Recent articles

🐝 FREE AI Courses on RAG + Deployment of an Healthcare AI App + LangChain Colab Notebook all included

X