Chinese Annotated Spontaneous Speech (CASS) Corpus

❑ CASS w/ Five-Tier Transcription
  ❖ Character level: base form
  ❖ Syllable (or Pinyin) level (w/ tone): base form
  ❖ Initial/Final (IF) level: w/ time boundary for base form
  ❖ SAMPA-C level: surface form
  ❖ Miscellaneous level: used for garbage modeling
    ➢ Lengthening, breathing, laughing, coughing, disfluency, noise, silence, murmur (unclear), modal, smack, non-Chinese
  ❖ Example

    Character       我    们    多    认    识    点      人
    Syllable        wo3   men0  duo1  ren4  shi0  dian3   ren2
    CASS Syllable   wo3   men0  duo1  ren4  shi0  dianr3  ren2
    IF              uo m @_n t uo z` @_n s` i` t iE_n z` @_n
    GIF             uo @_n t_v uo z` @_n s`_v t_v ia` z` @_n
    Misc            noise< noise> mum< mum>

Center of Speech Technology, Tsinghua University Slide 5
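The tiers in the example can be held in a small per-syllable record, which makes it easy to see where the surface form (GIF) departs from the base form (IF). The following is a minimal Python sketch, not the corpus's actual file format: the class name `SyllableAnnotation`, its field names, and the helper `changed_syllables` are hypothetical, and the grouping of IF and GIF units under each syllable is an assumption read off the example row, since the slide text gives the units only as flat sequences. Time boundaries and the Miscellaneous tier are omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SyllableAnnotation:
    """One syllable across four of the five tiers (hypothetical layout)."""
    character: str          # Character tier (base form)
    syllable: str           # Syllable/Pinyin tier with tone (base form)
    cass_syllable: str      # CASS Syllable row from the example
    base_if: List[str]      # Initial/Final units (base form)
    surface_gif: List[str]  # SAMPA-C units (surface form)


# Example utterance from the slide. The per-syllable grouping of the
# IF and GIF units is an assumption, not something the slide states.
utterance = [
    SyllableAnnotation("我", "wo3", "wo3", ["uo"], ["uo"]),
    SyllableAnnotation("们", "men0", "men0", ["m", "@_n"], ["@_n"]),
    SyllableAnnotation("多", "duo1", "duo1", ["t", "uo"], ["t_v", "uo"]),
    SyllableAnnotation("认", "ren4", "ren4", ["z`", "@_n"], ["z`", "@_n"]),
    SyllableAnnotation("识", "shi0", "shi0", ["s`", "i`"], ["s`_v"]),
    SyllableAnnotation("点", "dian3", "dianr3", ["t", "iE_n"], ["t_v", "ia`"]),
    SyllableAnnotation("人", "ren2", "ren2", ["z`", "@_n"], ["z`", "@_n"]),
]


def changed_syllables(utt: List[SyllableAnnotation]) -> List[str]:
    """Return the characters whose surface units differ from the base units."""
    return [s.character for s in utt if s.base_if != s.surface_gif]


if __name__ == "__main__":
    print(changed_syllables(utterance))
```

Under this grouping, joining the `base_if` and `surface_gif` lists back together reproduces the IF and GIF rows of the example exactly, which is a useful sanity check on the assumed alignment.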