Necessity to establish a new annotated spontaneous speech corpus

❑ The existing databases (incl. Broadcast News, CallHome, CallFriend, ...) do not cover all the Chinese spoken language phenomena
  ❖ Sound changes: voiced, unvoiced, nasalization, ...
  ❖ Phone changes: retroflexed, OOV-phoneme, ...
❑ The existing databases do not contain pronunciation variation information for use in bootstrap training
❑ A Chinese Annotated Spontaneous Speech (CASS) Corpus was established before WS00 on LSP at JHU
  ❖ Completely spontaneous (discourses, lectures, ...)
  ❖ Remarkable background noise, accent background, ...
  ❖ Recorded onto tapes and then digitized

Center of Speech Technology, Tsinghua University, Slide 4