Kyutai Releases Hibiki: A 2.7B Real-Time Speech-to-Speech and Speech-to-Text Translation with Near-Human Quality and Voice Transfer

Real-time speech translation presents a complex challenge, requiring seamless integration of speech recognition, machine translation, and text-to-speech synthesis. Traditional cascaded approaches often introduce compounding errors, fail to retain speaker identity,…

Asymmetric Certified Robustness via Feature-Convex Neural Networks – The Berkeley Artificial Intelligence Research Blog

Asymmetric Certified Robustness via Feature-Convex Neural Networks TLDR: We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios. This focused…

Fine-Tuning of Llama-2 7B Chat for Python Code Generation: Using QLoRA, SFTTrainer, and Gradient Checkpointing on the Alpaca-14k Dataset

In this tutorial, we demonstrate how to efficiently fine-tune the Llama-2 7B Chat model for Python code generation using advanced techniques such as QLoRA, gradient checkpointing, and supervised fine-tuning with…

Detecting Text Ghostwritten by Large Language Models – The Berkeley Artificial Intelligence Research Blog

The structure of Ghostbuster, our new state-of-the-art method for detecting AI-generated text. Large language models like ChatGPT write impressively well—so well, in fact, that they’ve become a problem. Students have…

Chain-of-Associated-Thoughts (CoAT): An AI Framework to Enhance LLM Reasoning

Large language models (LLMs) have revolutionized artificial intelligence by demonstrating remarkable capabilities in text generation and problem-solving. However, a critical limitation persists in their default “fast thinking” approach—generating outputs based…

The Shift from Models to Compound AI Systems – The Berkeley Artificial Intelligence Research Blog

AI caught everyone’s attention in 2023 with Large Language Models (LLMs) that can be instructed to perform general tasks, such as translation or coding, just by prompting. This naturally led…

Meet Crossfire: An Elastic Defense Framework for Graph Neural Networks under Bit Flip Attacks

Graph Neural Networks (GNNs) have found applications in various domains, such as natural language processing, social network analysis, recommendation systems, etc. Due to its widespread usage, improving the defences of…

2024 BAIR Graduate Directory – The Berkeley Artificial Intelligence Research Blog

Every year, the Berkeley Artificial Intelligence Research (BAIR) Lab graduates some of the most talented and innovative minds in artificial intelligence and machine learning. Our Ph.D. graduates have each expanded…

Creating an AI Agent-Based System with LangGraph: Putting a Human in the Loop

In our previous tutorial, we built an AI agent capable of answering queries by surfing the web and added persistence to maintain state. However, in many scenarios, you may want…

Modeling Extremely Large Images with xT – The Berkeley Artificial Intelligence Research Blog

As computer vision researchers, we believe that every pixel can tell a story. However, there seems to be a writer’s block settling into the field when it comes to dealing…