New Text-to-Speech endpoints with timestamps

We have released two new text-to-speech endpoints that return timestamps for when each character was spoken, without requiring websockets. One endpoint is streaming and the other is non-streaming. You can read more about it
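As a rough illustration of what per-character timestamps make possible, the sketch below groups characters into word-level timings, for example to drive captions or word highlighting. The input shape is an assumption made for this example only: three parallel lists holding the characters, their start times, and their end times in seconds. The function name `word_timings` is hypothetical, and the actual response format of the endpoints may differ.

```python
from typing import List, Tuple


def word_timings(
    chars: List[str],
    starts: List[float],
    ends: List[float],
) -> List[Tuple[str, float, float]]:
    """Group per-character timings into (word, start, end) tuples.

    Assumed input (hypothetical, not the documented response format):
    three parallel lists of equal length, with times in seconds.
    """
    if not (len(chars) == len(starts) == len(ends)):
        raise ValueError("chars, starts and ends must have the same length")

    words: List[Tuple[str, float, float]] = []
    current: List[str] = []
    word_start = 0.0
    word_end = 0.0

    for ch, start, end in zip(chars, starts, ends):
        if ch.isspace():
            # Whitespace closes the word in progress, if there is one.
            if current:
                words.append(("".join(current), word_start, word_end))
                current = []
            continue
        if not current:
            # First character of a new word sets the word's start time.
            word_start = start
        current.append(ch)
        word_end = end

    # Flush the final word when the text does not end in whitespace.
    if current:
        words.append(("".join(current), word_start, word_end))
    return words


# Made-up sample data for illustration only.
sample_chars = list("Hi there")
sample_starts = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
sample_ends = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

for word, start, end in word_timings(sample_chars, sample_starts, sample_ends):
    print(f"{word}: {start:.1f}s to {end:.1f}s")
```

With a streaming endpoint, the same grouping could be applied chunk by chunk, provided a word that spans two chunks is carried over rather than flushed at the chunk boundary.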

