Creating a speech corpus #1: Before you begin
Before you start collecting data, you need to do some due diligence. As important as speech data sets are, they are not trivial to create, and you need to balance what you want from the data against the time and resources you have available. I don’t mean to suggest that developing speech data sets isn’t important (it is), but rather that it needs to happen after careful consideration. This post covers some of the things you’ll want to think about before you start planning your dream corpus. ...