nuneybits Vector art of the iconic Microsoft Windows logo on a c7e8e82a-c8b6-4bb6-b555-19a4b7abcd08-1

Microsoft launches 3 new AI models in direct shot at OpenAI and Google

The trio of models — MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 — are available immediately through Microsoft Foundry and a new MAI Playground. They span three of the most commercially valuable modalities in enterprise AI: converting speech to text, generating realistic human voice, and creating images. Together, they represent the opening salvo from Microsoft's superintelligence team, which Suleyman formed just six months ago to pursue what he calls "AI self-sufficiency."

Michael Nuñez
nuneybits Vector art of burnt orange voice waves rising from a b4e149ce-8692-4f29-8c57-9d1ac9c21a08

Mistral AI just released a text-to-speech model it says beats ElevenLabs — and it's giving away the weights for free

The enterprise voice AI market is in the middle of a land grab. ElevenLabs and IBM announced a collaboration just this week to bring premium voice capabilities into IBM's watsonx Orchestrate platform. Google Cloud has been expanding its Chirp 3 HD voices. OpenAI continues to iterate on its own speech synthesis. And the market underpinning all of this activity is enormous — voice AI crossed $22 billion globally in 2026, with the voice AI agents segment alone projected to reach $47.5 billion by 2034, according to industry estimates.

Michael Nuñez
Subscribe to get latest news!

Deep insights for enterprise AI, data, and security leaders

By submitting your email, you agree to our Terms and Privacy Notice.