This issue and #228 both stem from a related concern: stable storage (and long-term archiving) not just of primary data, but also of 'intermediate' states of the data sets (preprocessed data sets) and of 'computational outputs' such as acoustic models trained on a given data set.
Even if the tool and the data are available online, training can take days, so wouldn't it make good sense to make the trained models available for download, too? (To the extent that the colleagues who produced them wish to make them available, of course.)
Possible benefits:
- facilitating experiments on transfer learning
- opening possibilities for smaller companies working in Natural Language Processing that need acoustic models but cannot easily invest in the data acquisition campaigns required to create them