August 29, 2025 – On Thursday, Microsoft's AI division unveiled its first two in-house AI models: the MAI-Voice-1 speech model and the MAI-1-preview general-purpose model.
The new MAI-Voice-1 speech model boasts impressive capabilities. According to Microsoft, it can generate one minute of audio in just one second using only a single GPU, an efficiency that sets it apart in the field of speech synthesis.
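As a back-of-the-envelope check, the quoted figure implies a real-time factor (RTF) of roughly 1/60. The sketch below simply works that out from the article's numbers; it is illustrative arithmetic, not a measurement:

```python
# Implied real-time factor of MAI-Voice-1, using only the figures
# quoted above (one minute of audio generated in one second).
audio_seconds = 60.0       # duration of the generated audio
generation_seconds = 1.0   # reported generation time on a single GPU

rtf = generation_seconds / audio_seconds      # lower is faster
speedup = audio_seconds / generation_seconds  # times faster than real time

print(f"RTF: {rtf:.4f} ({speedup:.0f}x faster than real time)")
```

By this rough measure, the model would produce speech about sixty times faster than it plays back, which is what makes single-GPU serving notable.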
Microsoft has already integrated MAI-Voice-1 into several features. For instance, in the "Copilot Daily" function, an AI host uses the model to deliver the day's top news. It can also create podcast-style dialogues that help users understand various topics.
Regular users can try MAI-Voice-1 hands-on via the Copilot Labs platform. There, they can enter the text they want the AI to speak and customize the voice tone and speaking style to their preferences.

Alongside the speech model, Microsoft also rolled out the MAI-1-preview model. Training it was a massive undertaking, involving approximately 15,000 NVIDIA H100 GPUs. Designed for specific user needs, it can follow instructions and provide practical responses to everyday questions.
Mustafa Suleyman, the head of Microsoft AI, shed light on the company's internal AI model development strategy last year during an episode of the "Decoder" podcast. He stated that enterprise-level application scenarios were not the core focus of their in-house AI models. "My vision is that we must create a product that offers an excellent experience for consumers and is deeply optimized for our own application scenarios. We have a vast amount of highly predictive and practical data in areas like advertising and consumer behavior. So, my main task is to build models that truly fit the role of a 'consumer partner'," he explained.
Microsoft AI reportedly plans to apply the MAI-1-preview model to specific text-related scenarios in the Copilot assistant, which currently relies mainly on OpenAI's large language models. The MAI-1-preview model has also begun public testing on the AI benchmarking platform LMArena.
The Microsoft AI team expressed its ambitious plans for the future in a blog post. “We have grand plans for what lies ahead. Not only will we continue to push for technological breakthroughs, but we also firmly believe that by integrating a series of professional models tailored to different user needs and application scenarios, we can unlock tremendous value,” the team wrote.