Microsoft has officially unveiled VALL-E, an artificial intelligence model that converts text to speech while closely imitating a specific human voice. The system needs only a three-second recording as a sample, and the emotional tone of the original speech carries over to the synthesized one.
Microsoft calls VALL-E a “neural codec language model.” It builds on EnCodec technology. The authors emphasize that the system analyzes how a person sounds, breaks that information into discrete “tokens,” and uses its training data to match it against how that voice would sound speaking other phrases. Other text-to-speech methods, as a rule, synthesize speech by manipulating waveforms.
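To give a rough sense of what "tokens" means here, the toy sketch below turns a short signal into a sequence of integer codes by matching each frame to its nearest entry in a small codebook. This is only an illustration of the general idea of vector quantization, not VALL-E's or EnCodec's actual code: the codebook, the signal, and the function names are all hypothetical, and a real neural codec learns its codebooks and quantizes encoded features rather than raw samples.

```python
def frame_signal(samples, frame_size):
    """Split a list of samples into consecutive, non-overlapping frames."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples) - frame_size + 1, frame_size)]


def nearest_code(frame, codebook):
    """Return the index of the codebook vector closest to the frame."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)),
               key=lambda i: squared_distance(frame, codebook[i]))


def tokenize(samples, codebook):
    """Map a signal to a list of integer tokens, one per frame."""
    frame_size = len(codebook[0])
    return [nearest_code(frame, codebook)
            for frame in frame_signal(samples, frame_size)]


# Hypothetical 4-entry codebook of 2-sample frames (made up for illustration).
CODEBOOK = [[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]

# Made-up signal: 8 samples, so 4 frames and 4 tokens.
signal = [0.1, -0.1, 0.9, 1.1, -0.8, -1.2, 0.9, -0.7]
tokens = tokenize(signal, CODEBOOK)
print(tokens)
```

Once audio is represented as a sequence of integers like this, a language model can in principle treat it much like text, predicting the next token given the previous ones.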
VALL-E was trained on the LibriLight audio library, which contains 60 thousand hours of English speech from more than 7 thousand speakers. A separate site hosts many examples of the model's output that anyone can try.