As part of the multi-source g2p project we hope to have generated pronunciations for many more sources (including PronLex, CELEX, and NETTalk) and formats (e.g., DISC, ARPAbet, and NETTalk). These will be read from some remote location by `populate.py` and will live in the database rather than being generated on the fly. We will also want separate "source" checkboxes for those generated automatically and those actually found in the DB. This might look a bit like the following if you select PronLex and WikiPron US...
- [x] ARPAbet (generated)
- [ ] DISC (generated)
- [ ] IPA (generated)
- [ ] NETTalk (generated)
- [ ] ARPAbet (generated)
- [ ] DISC (generated)
- [x] IPA
- [ ] IPA (generated)
- [ ] NETTalk (generated)
...
(NB: PronLex itself is proprietary, but my plan was to distribute generated forms for it under a CC license, obscuring whether or not a given form was in the original.)
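
To make the "generated" vs. attested distinction concrete, here is a rough sketch of how the database could record it and how the checkbox labels could be derived from it. Everything in it is an assumption for illustration only: the `pronunciation` table, its columns, the `available_formats` and `checkbox_label` helpers, and the sample rows are all hypothetical and not taken from the existing schema or from `populate.py`.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE pronunciation (
    word      TEXT NOT NULL,
    source    TEXT NOT NULL,   -- e.g. "PronLex", "WikiPron US"
    format    TEXT NOT NULL,   -- e.g. "IPA", "ARPAbet", "DISC"
    form      TEXT NOT NULL,
    generated INTEGER NOT NULL CHECK (generated IN (0, 1))
)
"""


def available_formats(conn, source):
    """Return (format, generated) pairs stored for one source."""
    rows = conn.execute(
        "SELECT DISTINCT format, generated FROM pronunciation "
        "WHERE source = ? ORDER BY format, generated",
        (source,),
    )
    return [(fmt, bool(gen)) for fmt, gen in rows]


def checkbox_label(fmt, generated):
    """Build the label shown next to a format checkbox."""
    return f"{fmt} (generated)" if generated else fmt


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# Placeholder rows, standing in for whatever populate.py would load.
conn.executemany(
    "INSERT INTO pronunciation VALUES (?, ?, ?, ?, ?)",
    [
        ("cat", "WikiPron US", "IPA", "k æ t", 0),
        ("cat", "WikiPron US", "ARPAbet", "K AE1 T", 1),
        ("cat", "PronLex", "ARPAbet", "K AE1 T", 1),
    ],
)

labels = [checkbox_label(f, g) for f, g in available_formats(conn, "WikiPron US")]
print(labels)
```

Storing the flag per row rather than per source would let a single source expose both an attested and a generated variant of the same format, as in the IPA entries above. For PronLex the flag could simply be set on every distributed row, which is consistent with not revealing which forms were in the original.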