At the annual Neural Information Processing Systems (NeurIPS) conference on artificial intelligence and machine learning, currently underway, Intel presented two projects related to the recognition and transcription of spoken language. The People’s Speech project targets automatic speech recognition (ASR) tasks, while the Multilingual Spoken Words Corpus (MSWC) project targets keyword spotting.
Each project produced a dataset containing a significant amount of audio data; both are among the largest collections in their class. The two initiatives were launched in 2018 with the goal of identifying the 50 most widely used languages in the world, compiling them into a single dataset, and putting that information to use. For The People’s Speech and MSWC, Intel engineers collaborated with colleagues from Alibaba, Oracle, Google, Baidu, and other companies.
As part of The People’s Speech project, developers created a dataset comprising tens of thousands of hours of supervised (transcribed) spoken audio. It is currently one of the largest English-language datasets in its class, licensed for academic and commercial use and available for free download.
MSWC, in turn, is a spoken-audio dataset containing more than 300 thousand keywords in dozens of languages, intended for training keyword-spotting models for voice-enabled smart devices. The MSWC dataset covers languages spoken by more than 5 billion people and supports the development of voice applications for a broad audience. Both datasets will be available to developers.