Category

AI Models

13 articles in this category

MedQA: Fine-Tuning a Clinical AI on AMD ROCm — No CUDA Required

A complete walkthrough of LoRA fine-tuning Qwen3-1.7B on MedMCQA using AMD MI300X, built for the AMD Developer Hackathon on lablab.ai.

Hugging Face
May 08, 2026·1 min read
OpenAI Unveils Three Realtime Audio Models for Advanced Voice Applications

OpenAI has released three new audio models through its Realtime API, enabling developers to build more sophisticated voice applications with capabilities like live speech translation and streaming transcription.

MarkTechPost
May 08, 2026·1 min read
OpenAI's new voice model brings GPT-5-level reasoning to real-time conversations

OpenAI unveils three new voice models that enable real-time reasoning, translation across 70+ languages, and live speech transcription.

The Decoder
May 07, 2026·1 min read
Meet ZAYA1-8B, a Super Efficient, Open Reasoning Model Trained on AMD Instinct MI300 GPUs

Zyphra releases ZAYA1-8B, an open, efficient reasoning model with 8 billion parameters, trained on AMD Instinct MI300 GPUs, achieving competitive performance with far fewer parameters.

VentureBeat
May 07, 2026·1 min read
Mistral AI Launches Remote Agents in Vibe and Mistral Medium 3.5 with 77.6% SWE-Bench Verified Score

Mistral AI debuts remote agents in Vibe and Mistral Medium 3.5, a 128B dense model scoring 77.6% on SWE-Bench Verified, marking a significant upgrade in its coding agent ecosystem.

MarkTechPost
May 02, 2026·1 min read
Nvidia Unveils Nemotron 3 Nano Omni: A Glimpse into Modern Multimodal Models

Nvidia releases Nemotron 3 Nano Omni, an open multimodal model capable of processing text, image, video, and audio.

The Decoder
Apr 29, 2026·1 min read
Poolside AI Unveils Laguna XS.2 and M.1: Breakthrough Agentic Coding Models

Poolside AI releases Laguna M.1 and Laguna XS.2, two agentic coding models achieving 72.5% and 68.2% on SWE-bench Verified, respectively.

MarkTechPost
Apr 29, 2026·1 min read
A Glimpse into the Past: What a 1930s-Era Trained AI Thinks the World Will Be Like in 2026

Meet 'Talkie', a 13B-parameter language model that paints a nostalgic picture of the future based on texts from a bygone era.

The Decoder
Apr 28, 2026·1 min read
Better Hardware Could Turn Zeros into AI Heroes

New hardware that takes advantage of sparsity in AI models could significantly reduce their energy consumption and increase performance.

IEEE Spectrum
Apr 28, 2026·1 min read
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

NVIDIA unveils Nemotron 3 Nano Omni, a model that processes long-context multimodal data, including documents, audio, and video.

Hugging Face
Apr 28, 2026·1 min read
Xiaomi Unveils Highly Efficient Open Source AI Models for Agentic Tasks

Xiaomi releases the open-source large language models MiMo-V2.5 and MiMo-V2.5-Pro under the MIT License, positioned as efficient and affordable options for agentic 'claw' tasks.

VentureBeat
Apr 27, 2026·1 min read
Three reasons why DeepSeek’s new model matters

Chinese AI firm DeepSeek releases a preview of V4, its long-awaited new flagship model, which can process longer prompts and is more cost-effective than its predecessors.

MIT Technology Review
Apr 24, 2026·1 min read
DeepSeek-V4: A Million-Token Context That Agents Can Actually Use

DeepSeek releases V4 with a 1M-token context window, competitive benchmark numbers, and an architecture designed to support long contexts efficiently.

Hugging Face
Apr 23, 2026·1 min read