Tsinghua University: Making Full Use of Chinese Speech Corpora (PPT lecture slides)

Making Full Use of Chinese Speech Corpora

Thomas Fang Zheng
Center of Speech Technology, State Key Laboratory of Intelligent Technology and Systems, Tsinghua University
http://sp.cs.tsinghua.edu.cn/
Beijing d-Ear Technologies Co., Ltd. ("Your Partner in the Century of Speech")
http://www.d-Ear.com
Oct. 2, 2003

O-COCOSDA, Oct. 1-3, 2003, Sentosa, Singapore

Outline
❑ Purpose of speech corpora
❑ Factors to be considered in data creation
❑ Data creation
❑ Data transcription
❑ Learning from corpora
❑ Chinese Corpus Consortium (CCC)

Purpose of Speech Corpora

Item                            Percentage  Description
1. Speech/speaker recognition   73%         system development, evaluation, sentence comprehension and summarization, speech recognition, speaker recognition
2. Speech synthesis             11%         system development, prosodic analysis
3. Acoustic analysis            9%          acoustic analysis, speech coding
4. Sentence analysis            5%          syntactic and semantic analysis
5. Speech/language education    2%          speech and language education

Factors to be considered in data creation (1)
❑ The language
  ❖ Language: e.g., Chinese or English
  ❖ Dialectal background (e.g., for Chinese):
    ▪ Putonghua or standard Chinese (普通话);
    ▪ Mandarin (官话: Northern China);
    ▪ Wu (吴语: Southern Jiangsu, Zhejiang, and Shanghai);
    ▪ Yue (粤语: Guangdong; Hong Kong; Nanning, Guangxi);
    ▪ Min (闽南话: Fujian; Shantou, Guangdong; Haikou, Hainan; Taipei, Taiwan);
    ▪ Hakka (客家话: Meixian, Guangdong; Hsin-Chu, Taiwan);
    ▪ Xiang (湘: Hunan);
    ▪ Gan (赣: Jiangxi);
    ▪ Hui (徽: Anhui); and
    ▪ Jin (晋: Shanxi).
  ❖ Specific to Chinese (script):
    ▪ Simplified Chinese
    ▪ Traditional Chinese

[Figure: distribution map of the Mandarin (官话) dialects (map A2 of the source atlas)]

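[Figure: map of the subdivisions of modern Wu dialects (现代吴语方言分区图). Labeled areas: Taihu (太湖片), including the Su-Hu-Jia (苏沪嘉小片), Hangzhou (杭州小片), and Lin-Shao (林绍小片) sub-groups; Taizhou (台州片); Oujiang (瓯江片); Chuqu (处衢片); Xuanzhou (宣州片); and the neighboring Jianghuai Mandarin (江淮官话) and Hui (徽语) areas]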

Factors to be considered in data creation (2)
❑ Speaking style:
  ❖ Read: for ASR in earlier research, or for TTS
  ❖ Spontaneous/conversational: for ASR nowadays
❑ Recording channel
  ❖ Depends on the goal of the task or application, or on the application environment:
    ▪ Close-talk microphones: for personal computers (PCs)
    ▪ Telephone and/or cellular phone: for telephony applications
    ▪ Specific channel: for embedded applications (PDA, digital recorder, ...), or for broadcast news and TV news
  ❖ Normally a mono channel rather than a stereo channel.
  ❖ However, a microphone array may be used for some research purposes.

Factors to be considered in data creation (3)
❑ Sampling rate:
  ❖ 8 kHz: for the telephone/mobile-phone channel, where the bandwidth is about 3.4 kHz
  ❖ 16 kHz: for the close-talk microphone PC channel, even though the channel bandwidth exceeds the 8 kHz that a 16 kHz sampling rate can represent
❑ Sampling precision:
  ❖ 16 bits, normally.
  ❖ 8-bit A-law or μ-law (13 bits wide for A-law and 14 bits wide for μ-law after decompression).
❑ Signal-to-Noise Ratio (SNR) level:
  ❖ Speech was, and still is, often collected in a good environment (clean speech database).
  ❖ For noise-related research, noisy data are obtained by either:
    ▪ mixing noises (NOISEX-92) with clean speech; or
    ▪ collecting speech in real-world noisy environments.
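
The first way of obtaining noisy data, mixing a noise recording into clean speech, can be sketched as below. This is a minimal illustration rather than the procedure used for any particular corpus. The function names (`power`, `mix_at_snr`, `measure_snr_db`), the synthetic sine "speech", and the Gaussian "noise" are all assumptions made for the example; a real pipeline would read recorded speech and NOISEX-92 noise files instead.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + noise has the requested SNR in dB.

    Returns (mixed, scaled_noise). Hypothetical helper, not from the slides.
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]
    p_clean = power(clean)
    p_noise = power(noise)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("clean and noise signals must have non-zero power")
    # SNR_dB = 10 * log10(P_clean / (g^2 * P_noise)), solved for the gain g.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (target_snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    mixed = [c + n for c, n in zip(clean, scaled_noise)]
    return mixed, scaled_noise


def measure_snr_db(clean, noise):
    """SNR in dB between a clean signal and the noise added to it."""
    return 10 * math.log10(power(clean) / power(noise))


# Assumed test signals: one second at 16 kHz of a 440 Hz tone plus white noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

mixed, scaled_noise = mix_at_snr(clean, noise, 10.0)
achieved = measure_snr_db(clean, scaled_noise)
print(f"achieved SNR: {achieved:.2f} dB")
```

Scaling the noise rather than the speech keeps the speech level of the clean database unchanged, so the same clean utterance can be reused at several SNR levels.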

Factors to be considered in data creation (4)
❑ Number of speakers and speaker balance:
  ❖ The more speakers, the better, with good speaker diversity in terms of:
    ▪ Gender;
    ▪ Age;
    ▪ Education;
    ▪ Birthplace (or dialectal background);
    ▪ Occupation;
    ▪ and so on.
❑ Corpus size:
  ❖ Measured by the number of speakers, by the length of valid speech in hours, or by both
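
As a rough sketch of the two corpus-size measures and of a speaker-balance check, the example below tallies speakers and hours from per-utterance records. The `Utterance` record layout, the field names, and the sample values are hypothetical and exist only for illustration; an actual corpus would supply this metadata from its own speaker and recording tables.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical per-utterance metadata record."""
    speaker_id: str
    gender: str
    num_samples: int
    sampling_rate_hz: int


def corpus_size(utterances):
    """Return (number of distinct speakers, hours of speech)."""
    speakers = {u.speaker_id for u in utterances}
    seconds = sum(u.num_samples / u.sampling_rate_hz for u in utterances)
    return len(speakers), seconds / 3600.0


def speaker_balance(utterances, attribute="gender"):
    """Count speakers (not utterances) per value of one attribute."""
    per_speaker = {u.speaker_id: getattr(u, attribute) for u in utterances}
    return Counter(per_speaker.values())


# Made-up records: two 16 kHz microphone speakers and one 8 kHz telephone speaker.
utterances = [
    Utterance("spk01", "F", 48000, 16000),
    Utterance("spk01", "F", 80000, 16000),
    Utterance("spk02", "M", 64000, 16000),
    Utterance("spk03", "M", 24000, 8000),
]

num_speakers, hours = corpus_size(utterances)
balance = speaker_balance(utterances)
print(num_speakers, round(hours * 3600), dict(balance))
```

Counting each speaker once, rather than each utterance, keeps a speaker with many recordings from distorting the balance figures.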
