Speech Synthesis 2.0

With this release, speech synthesis gets even better! We have changed how we train the model, which gives better results on longer fragments. You can head to the usual panel to test it out straight away! Our core changes include:

  • Support for cased input - this makes it easier for the model to read names (like OpenAI or ChatGPT) and to place pauses between fragments or names
  • Longer & better training - the model seems to perform better on our long-form benchmarks and on the loss functions
  • Necessary components to support infilling - contextual changes to fragments
  • Necessary components to extend the model across languages on the same platform

Expect your cloned or default Voices to change slightly as a result. Enjoy!

