Reinforcement Learning LLM

verl: Volcano Engine Reinforcement Learning for LLMs

Build RL dataflows such as GRPO and PPO in a few lines of code. Seamless integration of existing LLM infra with modular APIs: decouples computation and data dependencies, enabling seamless integration ...
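
For readers unfamiliar with GRPO, the snippet below is a minimal sketch of its core idea: each sampled response to a prompt is scored relative to the other responses in the same group, so no separate value model is needed. It does not use verl's API. The function name `group_relative_advantages`, the `eps` parameter, and the use of the population standard deviation are assumptions made for illustration, and implementations may differ on these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses to the same prompt.

    Hypothetical helper for illustration only; not part of verl.
    Each advantage is (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one prompt, scored by a 0/1 reward.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Responses that score above the group mean receive a positive advantage and those below receive a negative one. If every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.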

diginomica · 2d

Want to get AI agents right? Get your real-time evaluation metrics right first

The AI agent hype has reached a new crescendo, but that doesn't bring us closer to successful projects. Enter AI evaluation - ...

ExtremeTech on MSN · 2d

What Is an LLM and How Does It Work?

Modern AI LLMs can seem almost magical when you use them. But, just like even the best magic tricks, there is an explanation ...

DeepSeek Blows Up Meta's AI Strategy

Meta faces challenges in AI as Chinese models like DeepSeek's R1 outperform with cost-effective innovation. Read an analysis ...

How to Build Custom LLM Benchmarks for Your AI Applications

Custom benchmarks are essential for evaluating and optimizing LLMs to meet specific application needs, especially for ...

Coordinated Science Laboratory · 4d

Grainger faculty awarded CSL AICE grants

Four faculty members from the Illinois Grainger College of Engineering have received a total of $475,000 in grants to support ...

IEEE · 8d

Optimizing Plastic Waste Collection in Water Bodies Using Heterogeneous Autonomous Surface Vehicles With Deep Reinforcement Learning

Abstract: This letter presents a model-free deep reinforcement learning framework for informative path planning with heterogeneous fleets of autonomous surface vehicles to locate and collect plastic ...

Now it’s TikTok parent ByteDance’s turn for a reasoning AI: enter Seed-Thinking-v1.5!

It achieved an 8.0% higher win rate over DeepSeek R1, suggesting that its strengths generalize beyond just logic or math-heavy challenges.

11d

DeepSeek unveils new technique for smarter, scalable AI reward models

Reward models holding back AI? DeepSeek's SPCT creates self-guiding critiques, promising more scalable intelligence for enterprise LLMs.

DIGITIMES · 12d

DeepSeek, Tsinghua unveil leaner path to smarter AI

Chinese AI startup DeepSeek has teamed up with Tsinghua University researchers to develop a new reinforcement learning ...

Tech Xplore on MSN · 16d

What is artificial general intelligence and how does it differ from other types of AI?

Turns out, training artificial intelligence systems is not unlike raising a child. That's why some AI researchers have begun mimicking the way children naturally acquire knowledge and learn about the ...
