23 June 2026
Introducing Async Pro v1.0 and Flash v1.5

We’re expanding the Async text-to-speech lineup with two new models built for different production needs: Async Pro v1.0 and Async Flash v1.5.

Together with Async Flash v1.0, the lineup now gives developers a clearer way to choose the right TTS model based on quality, latency, language coverage, and workflow compatibility.

Async Pro v1.0 is our highest-quality English text-to-speech model to date, built for natural-sounding speech and improved pronunciation accuracy.

Async Flash v1.5 is our next-generation low-latency streaming model for real-time voice applications. It delivers improved responsiveness and speech quality for conversational experiences.

Async Flash v1.0 remains the best option for teams that need broad multilingual coverage, timestamp generation, synchronous TTS workflows, or compatibility with existing Flash v1.0 integrations.

In short, each Async model is optimized for a different use case, making it easier to choose the right balance of quality, latency, and language support.

A quick look at the Async TTS model lineup

| Model | Best for | Languages |
| --- | --- | --- |
| Async Pro v1.0 | Highest-quality English speech generation | English |
| Async Flash v1.5 | Low-latency multilingual voice agents | English, Spanish, French, German, Italian, Portuguese |
| Async Flash v1.0 | Broad language coverage and compatibility | 15 languages |

Whether you’re building real-time voice agents, customer support automation, podcasts, audiobooks, or multilingual applications, the Async lineup now includes a model optimized for that use case.

What is new in the Async TTS lineup?

Async now offers three text-to-speech models for three different production needs: high-quality English generation, low-latency multilingual conversations, and broad language coverage.

That matters because TTS requirements change depending on the application. A podcast narration workflow does not have the same constraints as a live customer support agent. An audiobook project may prioritize natural pacing and pronunciation accuracy, while a phone agent needs fast streaming, reliable text handling, and responsive turn-taking.

The expanded Async lineup is designed around those tradeoffs. Developers can now choose a model based on the job their application needs to do: generate polished English speech, power real-time multilingual conversations, or support broader language and endpoint compatibility.

Async Pro v1.0 is built for high-quality English TTS

Async Pro v1.0 is the quality-first model in the Async TTS lineup. It is designed for English voice experiences where pronunciation accuracy, natural delivery, and reliable text handling directly affect the user experience.

Natural English speech generation

Async Pro v1.0 is optimized for natural English speech generation and accurate pronunciation. This makes it a good fit for products where the voice is part of the experience, not just a utility layer.

For example, audiobooks, narration, podcasts, and premium voice assistants all depend on speech that feels polished over longer listening sessions. In those workflows, small issues with pacing, pronunciation, or delivery can become noticeable quickly.

Automatic text normalization

One of the main improvements in Async Pro v1.0 is automatic text normalization. The model can handle dates, numbers, currencies, abbreviations, and structured content during generation, helping developers reduce the need for separate preprocessing pipelines.

This is especially useful when the input text is dynamic. A voice assistant might need to read account balances, calendar dates, product names, or formatted IDs. A narration workflow might include headings, lists, numbers, and abbreviations in the same script.
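To make the preprocessing burden concrete, here is a minimal Python sketch of the kind of hand-written normalization step teams often place in front of a TTS model. It is not Async’s implementation and calls no Async API; the `normalize_for_tts` name, the abbreviation table, and the digit rule are hypothetical and deliberately incomplete.

```python
import re

# Hypothetical, deliberately tiny rule set for illustration only.
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Street", "approx.": "approximately"}
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_digits(digits: str) -> str:
    """Read a digit string one digit at a time, as for IDs or order numbers."""
    return " ".join(DIGIT_WORDS[int(d)] for d in digits)


def normalize_for_tts(text: str) -> str:
    """Expand a few abbreviations and spell out ID-like digit runs."""
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    # Assumption: runs of 5+ digits are identifiers, read digit by digit.
    return re.sub(r"\b\d{5,}\b", lambda m: spell_digits(m.group()), text)


print(normalize_for_tts("Dr. Lee, your order 48213 ships to 9 Elm St."))
# Doctor Lee, your order four eight two one three ships to 9 Elm Street
```

Rule sets like this grow quickly once currencies, dates, and locale differences are involved, which is the maintenance cost that model-side normalization is meant to reduce.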

Low-latency streaming for quality-first use cases

Although Async Pro v1.0 is built around output quality, it still maintains low-latency streaming performance. That makes it suitable for both content generation and real-time conversational applications where English voice quality matters.

This is useful for teams that do not want to choose between natural speech and responsive delivery. If the product experience depends on premium English output but still needs to feel interactive, Async Pro v1.0 is the model to evaluate first.

Best use cases for Async Pro v1.0

Async Pro v1.0 is best for voice assistants, conversational AI, audiobooks, narration, podcasts, and premium English voice experiences.

Choose Async Pro v1.0 when English speech quality is the priority and your application needs output that feels polished, accurate, and ready for longer listening experiences.

Async Flash v1.5 is built for real-time multilingual voice agents

Async Flash v1.5 is the low-latency streaming model in the Async TTS lineup. It is built for real-time voice applications where speech needs to feel fast, natural, and responsive across live conversations.

Low-latency streaming for live conversations

Real-time voice applications depend on timing. If the generated speech takes too long to start, the conversation can feel delayed, even if the voice quality is strong.

Async Flash v1.5 is designed for low-latency streaming, making it a strong fit for conversational products where fast response time is part of the user experience. This includes voice agents, phone agents, AI assistants, and support automation workflows where users expect the system to respond naturally in the moment. For teams designing a streaming TTS system, Flash v1.5 is the model to evaluate when fast turn-taking and conversational responsiveness are central to the experience.
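When evaluating any streaming TTS model for turn-taking, a useful number to track is the time until the first audio chunk arrives. The sketch below measures that for an arbitrary iterable of byte chunks. It does not use the Async API: `fake_stream` is a hypothetical stand-in for a real streaming response, and `measure_first_chunk_latency` is an illustrative helper name.

```python
import time
from typing import Callable, Iterable, Iterator


def measure_first_chunk_latency(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> tuple[float, int]:
    """Return (seconds until the first non-empty chunk, total bytes received)."""
    start = clock()
    first_chunk_at = None
    total_bytes = 0
    for chunk in chunks:
        if chunk and first_chunk_at is None:
            first_chunk_at = clock()
        total_bytes += len(chunk)
    if first_chunk_at is None:
        raise ValueError("stream produced no audio")
    return first_chunk_at - start, total_bytes


def fake_stream() -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming response; yields three chunks."""
    time.sleep(0.05)  # simulated start-up delay before the first audio
    for _ in range(3):
        yield b"\x00" * 320


latency, size = measure_first_chunk_latency(fake_stream())
print(f"first audio after {latency * 1000:.0f} ms, {size} bytes total")
```

Swapping `fake_stream()` for the chunk iterator of a real client would let the same helper compare models under identical conditions; the injectable `clock` keeps the function testable without real delays.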

Six-language support for multilingual agents

Async Flash v1.5 supports English, Spanish, French, German, Italian, and Portuguese. That makes it a practical choice for teams building multilingual voice agents across major markets without needing the full 15-language coverage of Async Flash v1.0.

For developers building a multilingual voice agent, this creates a more focused option: use Flash v1.5 when your application needs real-time multilingual performance across these six languages, and use Flash v1.0 when broader language coverage is the priority.

Better handling of structured text

Flash v1.5 introduces major improvements in pronunciation, text normalization, voice cloning quality, and conversational responsiveness. It also performs better on the structured text commonly found in production voice agents, including dates, phone numbers, currencies, account numbers, abbreviations, and IDs.

This matters because voice agents often need to speak dynamic information during a live conversation. A support agent might confirm an order number, read a phone number, explain a payment amount, and reference an account ID in the same interaction. If that text is not handled cleanly, the experience can quickly feel robotic or unreliable.

More consistent voice cloning quality

Flash v1.5 also improves voice cloning consistency and intonation compared to previous generations. For production voice agents, consistency matters because users may hear the same cloned voice across many conversations, languages, or support scenarios.

A more consistent voice helps the experience feel stable, especially in customer-facing workflows where the assistant’s voice becomes part of the brand experience.

Best use cases for Async Flash v1.5

Async Flash v1.5 is best for voice agents, real-time AI assistants, customer support automation, phone agents, and multilingual conversational AI.

Choose Async Flash v1.5 when low-latency streaming, multilingual support, and conversational responsiveness are more important than maximum language coverage.

Async Flash v1.0 remains the broadest compatibility model

Async Flash v1.0 remains the best choice for developers who need broad multilingual support, synchronous workflows, timestamp generation, or compatibility with existing Flash v1.0 integrations.

While Async Pro v1.0 and Async Flash v1.5 are optimized for more specific use cases, Flash v1.0 continues to serve teams that need the widest coverage across languages and endpoints.

15-language support

Async Flash v1.0 supports 15 languages:

  • English
  • French
  • Spanish
  • German
  • Italian
  • Portuguese
  • Arabic
  • Russian
  • Romanian
  • Japanese
  • Hebrew
  • Armenian
  • Turkish
  • Hindi
  • Chinese

That makes it the best fit when language coverage matters more than using the newest specialized model. For example, a multilingual education platform, global content workflow, or international customer experience may need broader coverage than the six languages available in Flash v1.5.
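The coverage tradeoff can be checked mechanically. The following sketch encodes the language lists given in this post and reports which models cover a required set of languages. The dictionary keys are display names rather than confirmed API model identifiers, and `models_covering` is a hypothetical helper.

```python
# Language lists as stated in this post; confirm against the model
# documentation before relying on them. Keys are display names, not
# confirmed API model identifiers.
MODEL_LANGUAGES = {
    "Async Pro v1.0": {"English"},
    "Async Flash v1.5": {
        "English", "Spanish", "French", "German", "Italian", "Portuguese",
    },
    "Async Flash v1.0": {
        "English", "French", "Spanish", "German", "Italian", "Portuguese",
        "Arabic", "Russian", "Romanian", "Japanese", "Hebrew", "Armenian",
        "Turkish", "Hindi", "Chinese",
    },
}


def models_covering(required: set[str]) -> list[str]:
    """Return every model whose language set includes all required languages."""
    return [name for name, langs in MODEL_LANGUAGES.items() if required <= langs]


print(models_covering({"English", "Spanish"}))
# ['Async Flash v1.5', 'Async Flash v1.0']
print(models_covering({"English", "Japanese"}))
# ['Async Flash v1.0']
```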

Synchronous generation and timestamp APIs

Async Flash v1.0 supports all Async API endpoints, including synchronous generation and timestamp APIs.

That matters for teams building workflows where streaming is not the only requirement. Some applications need generated audio returned in a synchronous flow. Others need timestamps to align spoken audio with captions, transcripts, visual elements, or editing workflows.
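As an illustration of the caption use case, the sketch below turns word-level timings into SRT caption cues. The `WordTiming` shape is an assumption made for this example, not the documented response format of the Async timestamp API, so the field names would need to be mapped from whatever the API actually returns.

```python
from typing import NamedTuple


class WordTiming(NamedTuple):
    # Assumed shape for illustration; check the timestamp API docs for the real one.
    word: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{millis:03}"


def to_srt(words: list[WordTiming], words_per_cue: int = 4) -> str:
    """Group word timings into numbered SRT caption cues."""
    cues = []
    for i in range(0, len(words), words_per_cue):
        group = words[i:i + words_per_cue]
        text = " ".join(w.word for w in group)
        cues.append(
            f"{len(cues) + 1}\n"
            f"{srt_time(group[0].start)} --> {srt_time(group[-1].end)}\n"
            f"{text}\n"
        )
    return "\n".join(cues)


sample = [WordTiming("Hello", 0.0, 0.4), WordTiming("world", 0.5, 0.9)]
print(to_srt(sample, words_per_cue=2))
```

Grouping by a fixed word count is the simplest option; a production captioner would more likely break cues on punctuation or pauses.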

For those cases, Flash v1.0 remains the most compatible option in the lineup.

Existing integration compatibility

Flash v1.0 also remains important for teams already building on existing Async integrations. Not every application needs to migrate to a newer model immediately, especially if the current workflow depends on broad endpoint support or a larger language set.

For existing applications, the safest approach is to treat Async Pro v1.0 and Async Flash v1.5 as specialized additions rather than automatic replacements.

Best use cases for Async Flash v1.0

Async Flash v1.0 is best for broad multilingual deployments, applications requiring timestamp generation, synchronous TTS workflows, and existing Flash v1.0 integrations.

Choose Async Flash v1.0 when language coverage, endpoint compatibility, or integration stability matters more than using the newest low-latency or highest-quality model.

How to choose the right Async TTS model

The right Async TTS model depends on the constraint that matters most in your application: speech quality, real-time responsiveness, language coverage, or endpoint compatibility.

| If your priority is… | Choose… |
| --- | --- |
| Highest-quality English speech | Async Pro v1.0 |
| Real-time multilingual conversation | Async Flash v1.5 |
| Maximum language coverage | Async Flash v1.0 |
| Timestamp generation | Async Flash v1.0 |
| Synchronous TTS workflows | Async Flash v1.0 |
| Existing Flash v1.0 integrations | Async Flash v1.0 |

A simple way to make the decision: choose Pro for English quality, Flash v1.5 for live multilingual conversations, and Flash v1.0 for coverage and compatibility.
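That rule of thumb can be written down as a small decision function. The sketch below is one possible encoding of the guidance in this post, not an official selection algorithm: the `Requirements` fields and `choose_model` are hypothetical, the returned strings are display names rather than confirmed API identifiers, and the ordering of the checks is a simplifying assumption.

```python
from dataclasses import dataclass

# Languages listed for Flash v1.5 in this post.
FLASH_15_LANGUAGES = {
    "English", "Spanish", "French", "German", "Italian", "Portuguese",
}


@dataclass
class Requirements:
    languages: set[str]
    needs_timestamps: bool = False
    needs_sync_endpoint: bool = False
    latency_first: bool = False  # conversational, latency-sensitive product


def choose_model(req: Requirements) -> str:
    """Map product requirements to a model name, following the post's guidance."""
    # Endpoint compatibility comes first: only Flash v1.0 is described as
    # supporting timestamp and synchronous generation endpoints.
    if req.needs_timestamps or req.needs_sync_endpoint:
        return "Async Flash v1.0"
    # Anything outside the six Flash v1.5 languages needs broader coverage.
    # This sketch does not verify that Flash v1.0 supports the language.
    if not req.languages <= FLASH_15_LANGUAGES:
        return "Async Flash v1.0"
    # English-only and quality-first: the post points to Pro.
    if req.languages == {"English"} and not req.latency_first:
        return "Async Pro v1.0"
    return "Async Flash v1.5"


print(choose_model(Requirements(languages={"English"})))
# Async Pro v1.0
print(choose_model(Requirements(languages={"English", "Spanish"}, latency_first=True)))
# Async Flash v1.5
```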

Before implementing, review the model documentation for the latest supported languages, features, and endpoint availability.

How Async compares with other TTS APIs

The TTS market has moved quickly, especially as more teams build voice agents, AI assistants, customer support bots, and content generation workflows. Most major providers now offer high-quality speech generation, streaming support, or multilingual voices, but they do not always organize model choice around the same production tradeoffs.

OpenAI’s speech API, for example, offers built-in voices and real-time audio streaming for developers building voice experiences. ElevenLabs offers multiple speech models, including low-latency options for real-time applications. Google Cloud Text-to-Speech provides a broad cloud API with neural voice tiers, while Amazon Polly offers several TTS engines and enterprise-friendly AWS infrastructure.

Async’s approach is more focused on helping developers choose the right model for the job. Instead of positioning one model as the answer for every workflow, the Async TTS lineup separates model choice into three clear production paths: English quality, real-time multilingual conversation, and broad language or endpoint compatibility.

Async vs broad cloud TTS providers

Broad cloud TTS providers are often strongest when teams already rely on a larger cloud ecosystem. They can be a good fit for infrastructure-heavy teams that want speech synthesis alongside other cloud services.

Async is more focused on production voice workflows where model selection needs to be direct. Developers can choose Async Pro v1.0 for higher-quality English speech, Async Flash v1.5 for low-latency multilingual conversations, or Async Flash v1.0 for wider language and endpoint compatibility.

Async vs voice-first AI platforms

Voice-first AI platforms often focus on realistic voices, cloning, dubbing, creator workflows, or agent experiences. These platforms can be powerful, but the right choice depends on whether the team is optimizing for content, real-time interaction, language support, or API control.

Async fits best when developers want a clear TTS model lineup that maps directly to production requirements. Pro v1.0 is built for polished English output, Flash v1.5 is built for responsive multilingual agents, and Flash v1.0 remains available for teams that need broader coverage or existing endpoint support.

Where Async fits best

Async is a strong fit for teams building with text-to-speech in production and trying to balance quality, latency, multilingual support, and compatibility.

Use Async when you want to build around a dedicated voice API rather than forcing every TTS workflow through the same model. The expanded lineup gives developers a cleaner way to match the model to the experience they are creating, whether that is a premium English voice product, a real-time support agent, or a multilingual application with broader endpoint needs.

What developers should check before implementation

Before choosing a model, developers should review the latest model documentation for supported languages, features, and endpoint availability.

This is especially important if your application depends on a specific workflow, such as streaming generation, synchronous TTS, timestamp generation, or an existing Flash v1.0 integration.

For new builds, start with the use case:

  • If the product is English-first and quality-sensitive, evaluate Async Pro v1.0.
  • If the product is conversational and latency-sensitive, evaluate Async Flash v1.5.
  • If the product needs broader multilingual coverage or timestamp support, use Async Flash v1.0.

Existing applications do not need to migrate immediately. Flash v1.0 continues to support broad language coverage and all Async API endpoints, so teams with established workflows can keep using it wherever compatibility matters and adopt the newer, more specialized models only where they fit.

Async Pro v1.0 and Async Flash v1.5 give developers more control over model selection. Instead of forcing every TTS workflow into the same model, you can now align the model with the actual product requirement: quality, latency, language coverage, or compatibility.

Start building with the right Async TTS model

Async Pro v1.0 and Async Flash v1.5 expand the Async TTS lineup with more specialized options for production voice applications.

Use Async Pro v1.0 when English voice quality matters most. Use Async Flash v1.5 when you need low-latency multilingual speech for live conversations. Stay with Async Flash v1.0 when broad language coverage, timestamp generation, synchronous workflows, or existing integration compatibility is the priority.

Start with the Async Voice API, review the model documentation, and choose the model that fits your application’s quality, latency, language, and endpoint requirements.

FAQ

What is the difference between Async Pro v1.0 and Async Flash v1.5?

Async Pro v1.0 is optimized for the highest-quality English speech generation. Async Flash v1.5 is optimized for low-latency multilingual voice applications in English, Spanish, French, German, Italian, and Portuguese.

Which Async TTS model should I use for voice agents?

Use Async Flash v1.5 for real-time voice agents, phone agents, customer support automation, and multilingual conversational AI. It is built for low-latency streaming, conversational responsiveness, and improved handling of structured text during live interactions.

Which Async TTS model supports the most languages?

Async Flash v1.0 supports the most languages in the Async TTS lineup. It supports 15 languages, making it the best fit for broad multilingual deployments and applications that need maximum language coverage.

Does Async handle text normalization automatically?

Yes. Async Pro v1.0 and Async Flash v1.5 automatically handle difficult text normalization scenarios such as dates, numbers, currencies, phone numbers, account numbers, abbreviations, IDs, and structured content.

When should I use Async Flash v1.0 instead of the newer models?

Use Async Flash v1.0 when your application needs broad language coverage, synchronous TTS workflows, timestamp generation, or compatibility with existing Flash v1.0 integrations. It remains the best option for maximum language and endpoint compatibility.

Where can developers check supported languages and endpoints?

Developers can review the Async model documentation for the latest supported languages, features, and endpoint availability. This is the best source to confirm compatibility before choosing a model for production.
