Posts

Showing posts from February, 2021

Telugu ASR speech data collection

Image Source: IIIT-H

Developing an indigenous ASR (automatic speech recognition) system for Indian languages has been a goal of ours for a long time. To that end we have been experimenting a lot, trying out various neural network architectures. During these experiments we found that there was no good dataset for Indian languages. While talking with IIIT professors, we learned that the Government of India was also exploring options to generate a good dataset. We immediately offered our help and our platform for this endeavor. So, as a starting step, we have come up with a few campaigns to encourage users to donate speech data. We wanted to make it fun, so our first few campaigns are along the lines of JAMs (Just a Minute speech topics) etc. A topic will be provided, and you need to speak for a minute on that topic. We have started this campaign with college students. Of course, anyone can participate and contribute their data. The more the merrier :) We will be adding a lot more innovative ways utiliz