This AI Paper Introduces a Machine Learning Framework to Estimate the Inference Budget for Self-Consistency and GenRMs (Generative Reward Models)

Large Language Models (LLMs) have demonstrated significant advancements in reasoning capabilities across diverse domains, including mathematics and science. However, improving these reasoning abilities at test time remains a challenge researchers…

TorchSim: A Next-Generation PyTorch-Native Atomistic Simulation Engine for the MLIP Era

Radical AI has released TorchSim, a next-generation PyTorch-native atomistic simulation engine for the MLIP era. It accelerates materials simulation by orders of magnitude, transforming traditional scientific approaches. Current materials research…

Google Introduces Agent2Agent (A2A): A New Open Protocol that Allows AI Agents to Securely Collaborate Across Ecosystems Regardless of Framework or Vendor

Google AI recently announced Agent2Agent (A2A), an open protocol designed to facilitate secure, interoperable communication among AI agents built on different platforms and frameworks. By offering a standardized approach to…

Google Releases Agent Development Kit (ADK): An Open-Source AI Framework Integrated with Gemini to Build, Manage, Evaluate and Deploy Multi Agents

Google has released the Agent Development Kit (ADK), an open-source framework aimed at making it easier for developers to build, manage, and deploy multi-agent systems. ADK is written in Python…

Unveiling Attention Sinks: The Functional Role of First-Token Focus in Stabilizing Large Language Models

LLMs often show a peculiar behavior where the first token in a sequence draws unusually high attention, known as an “attention sink.” Though seemingly unimportant, this token frequently dominates attention across…

Salesforce AI Released APIGen-MT and xLAM-2-fc-r Model Series: Advancing Multi-Turn Agent Training with Verified Data Pipelines and Scalable LLM Architectures

AI agents are quickly becoming core components in handling complex human interactions, particularly in business environments where conversations span multiple turns and involve task execution, information extraction, and adherence to specific…

OpenAI Introduces the Evals API: Streamlined Model Evaluation for Developers

In a significant move to empower developers and teams working with large language models (LLMs), OpenAI has introduced the Evals API, a new toolset that brings programmatic evaluation capabilities to…

This AI Paper from ByteDance Introduces MegaScale-Infer: A Disaggregated Expert Parallelism System for Efficient and Scalable MoE-Based LLM Serving

Large language models are built on transformer architectures and power applications like chat, code generation, and search, but their growing scale with billions of parameters makes efficient computation increasingly challenging.…

Huawei Noah’s Ark Lab Released Dream 7B: A Powerful Open Diffusion Reasoning Model with Advanced Planning and Flexible Inference Capabilities

LLMs have revolutionized artificial intelligence, transforming various applications across industries. Autoregressive (AR) models dominate current text generation, with leading systems like GPT-4, DeepSeek, and Claude all using sequential left-to-right architectures.…

Repurposing Protein Folding Models for Generation with Latent Diffusion – The Berkeley Artificial Intelligence Research Blog

PLAID is a multimodal generative model that simultaneously generates protein 1D sequence and 3D structure, by learning the latent space of protein folding models. The awarding of the 2024 Nobel…