This is the first publicly available corpus of Hmong [ISO 639-3: mww, hmj], a minority language of China, Vietnam, Laos, Thailand, and various countries in Europe, America, and Australia. The corpus has been scraped from a long-running Usenet newsgroup called soc.culture.hmong and consists of approximately 12 million tokens. This corpus (called SCH) is also the first substantial corpus to be annotated for elaborate expressions, a kind of four-part coordinate construction that is common and important in the languages of mainland Southeast Asia.
Dataloader name: sch/sch.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?sch