Automatic Speaker Recognition (PPT lecture slides)

Automatic Speaker Recognition 于嘉威 2018/8/13

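Outline • First, since some of you are not familiar with work on SRE, I will quickly review what I presented at the biweekly meeting, so that everyone has an intuitive picture • Second, I will summarize the key points of some recent research (mostly ICASSP 2018 work on SRE), along with some of my own thoughts and questions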

Outline • Introduction • The i-vector methodology of speaker recognition • The d-vector methodology of speaker recognition • The end-to-end methodology of speaker recognition • Inter-speaker variability in speaker recognition • Example of variations in speaker recognition • State-of-the-art approaches in SRE

Introduction • Definition: Speaker recognition is the task of recognizing a person from his or her voice • Tasks: speaker identification, speaker verification, speaker diarization • Variants: text-dependent vs. text-independent, open-set vs. closed-set

Speaker Identification • Definition: Determine which one of a set of known speakers an unknown speaker matches • One-to-many mapping • It is often assumed that the unknown voice must come from the set of known speakers; this is referred to as closed-set identification • Adding a “none of the above” option to closed-set identification gives open-set identification • (Figure caption: Whose voice is this?)
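
The difference between closed-set and open-set identification can be sketched in a few lines. This is only an illustration, not the method used in the slides: it assumes each speaker is already represented by a fixed-length vector, and the cosine score, the `identify` function and the toy two-dimensional vectors are all hypothetical choices.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def identify(test_vec, enrolled, threshold=None):
    """One-to-many matching against every enrolled speaker.

    threshold=None  -> closed-set: always return the top-scoring speaker.
    threshold=float -> open-set: return None ("none of the above") when
                       even the best score falls below the threshold.
    """
    best_name, best_score = max(
        ((name, cosine(test_vec, vec)) for name, vec in enrolled.items()),
        key=lambda item: item[1],
    )
    if threshold is not None and best_score < threshold:
        return None
    return best_name

# Toy enrolled models (assumed values, for illustration only).
enrolled = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
print(identify([0.9, 0.1], enrolled))                 # closed-set decision
print(identify([0.7, 0.7], enrolled, threshold=0.9))  # open-set rejection
```

The only change between the two modes is the rejection threshold; in practice that threshold would have to be tuned on held-out data.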

Speaker Verification • Determine whether an unknown speaker matches a specific (claimed) speaker • One-to-one mapping • Closed-set verification: the population of clients is fixed • Open-set verification: new clients can be added without having to redesign the system • (Figure caption: Is this Bob's voice?)
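
As a rough sketch of the one-to-one decision, the example below scores a test vector against a single claimed speaker model and accepts the claim if the score reaches a threshold. The averaging-based `enroll`, the cosine score and the threshold value 0.8 are assumptions made for illustration; the slides do not prescribe them.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def enroll(utterance_vecs):
    """Build a speaker model as the mean of the enrollment vectors."""
    n = len(utterance_vecs)
    return [sum(col) / n for col in zip(*utterance_vecs)]

def verify(test_vec, claimed_model, threshold=0.8):
    """One-to-one check: accept the identity claim if the score is high enough."""
    return cosine(test_vec, claimed_model) >= threshold

# New clients are added by enrolling them; nothing else is rebuilt.
models = {"bob": enroll([[0.0, 1.0], [0.2, 0.8]])}
print(verify([0.0, 1.0], models["bob"]))  # claim "I am Bob" from a Bob-like voice
print(verify([1.0, 0.0], models["bob"]))  # same claim from a different voice
```

Because each client has an independent model, adding a client in this sketch is a dictionary insertion, which matches the open-set verification case described on the slide.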

Speaker diarization • Determine when a speaker change has occurred in the speech signal (segmentation) • Group together speech segments corresponding to the same speaker (clustering) • Prior speaker information may or may not be available • (Figure captions: Where are the speaker changes? Which segments are from the same speaker?)
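
The two diarization steps can be illustrated with a deliberately simple sketch. It assumes the signal has already been turned into one feature vector per frame, detects a change whenever consecutive frames differ by more than a threshold, and then clusters segments greedily. Both rules and both thresholds are hypothetical simplifications, not the algorithm from the slides.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def segment(frames, change_threshold):
    """Segmentation: start a new segment where consecutive frames differ a lot."""
    segments = [[frames[0]]]
    for prev, cur in zip(frames, frames[1:]):
        if dist(prev, cur) > change_threshold:
            segments.append([])
        segments[-1].append(cur)
    return segments

def cluster(segments, same_speaker_threshold):
    """Clustering: assign each segment to the nearest existing cluster,
    or open a new cluster. A cluster's centroid stays fixed at the mean
    of the first segment assigned to it."""
    centroids, labels = [], []
    for seg in segments:
        m = mean(seg)
        dists = [dist(m, c) for c in centroids]
        if dists and min(dists) <= same_speaker_threshold:
            labels.append(dists.index(min(dists)))
        else:
            centroids.append(m)
            labels.append(len(centroids) - 1)
    return labels

# Toy 1-D frame features: speaker A, then speaker B, then speaker A again.
frames = [[0.0], [0.1], [5.0], [5.1], [0.05], [0.0]]
segments = segment(frames, change_threshold=1.0)
labels = cluster(segments, same_speaker_threshold=1.0)
print(len(segments), labels)
```

No prior speaker information is used here; if speaker models were available, the clustering step could instead score each segment against them.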

Introduction: Generic Speaker Recognition System • Basic structure of a speaker recognition system • Enrollment path: speech → preprocessing → feature extraction → speaker models • Recognition path: unknown speech → preprocessing (analysis frames) → feature extraction (feature vectors) → pattern matching against the speaker models → scoring → decision
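
The block structure above can be mirrored by a toy pipeline. Everything in it is an assumption chosen to keep the sketch short: the frame size and hop, the two features (log energy and zero-crossing rate), the mean-vector speaker model and the negative-distance score are illustrative stand-ins for the real components.

```python
import math

def frames(signal, size=4, hop=2):
    """Preprocessing: cut the signal into overlapping analysis frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def features(frame):
    """Feature extraction: log energy and zero-crossing rate of one frame."""
    energy = sum(s * s for s in frame) / len(frame)
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (len(frame) - 1)
    return [math.log(energy + 1e-10), zcr]

def utterance_vector(signal):
    """Average the frame-level feature vectors over the whole utterance."""
    feats = [features(f) for f in frames(signal)]
    return [sum(col) / len(feats) for col in zip(*feats)]

def enroll(signals):
    """Enrollment: the speaker model is the mean utterance vector."""
    vecs = [utterance_vector(s) for s in signals]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def score(signal, model):
    """Pattern matching and scoring: negative Euclidean distance to the model."""
    vec = utterance_vector(signal)
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(vec, model)))

def decide(signal, model, threshold):
    """Decision: accept when the score reaches the threshold."""
    return score(signal, model) >= threshold

# Two synthetic "voices": a quiet constant signal and a loud alternating one.
quiet = [0.1] * 8
loud = [1.0, -1.0] * 4
model = enroll([quiet])
print(decide(quiet, model, threshold=-1.0), decide(loud, model, threshold=-1.0))
```

Enrollment and recognition share `frames` and `features`, which is the point of the diagram: only the last stages (model building versus matching, scoring and decision) differ.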

Introduction: Main Research Fields on SRE • Feature Extraction • Pattern matching • Scoring method

PROPERTIES OF IDEAL FEATURES • Ideally, a feature parameter should (F. Nolan, 1983): • show high between-speaker variability and low within-speaker variability • be resistant to attempted disguise or mimicry • have a high frequency of occurrence in relevant materials • be robust in transmission • be relatively easy to extract and measure
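
The first property can be quantified with a simple variance ratio, often called an F-ratio: the variance of the per-speaker means divided by the average variance within each speaker. The sketch below is one common way to compute it for a single scalar feature; the function name and the toy measurements are assumptions, and the slides do not state that this particular measure was used.

```python
def f_ratio(samples_by_speaker):
    """Between-speaker variance of the speaker means divided by the
    average within-speaker variance, for one scalar feature."""
    means = {s: sum(v) / len(v) for s, v in samples_by_speaker.items()}
    grand_mean = sum(means.values()) / len(means)
    between = sum((m - grand_mean) ** 2 for m in means.values()) / len(means)
    within = sum(
        sum((x - means[s]) ** 2 for x in v) / len(v)
        for s, v in samples_by_speaker.items()
    ) / len(means)
    return between / within

# Toy measurements of two candidate features for speakers "a" and "b".
good_feature = {"a": [1.0, 1.2, 0.8], "b": [5.0, 5.2, 4.8]}  # speakers well separated
bad_feature = {"a": [1.0, 5.0, 3.0], "b": [1.5, 4.5, 3.0]}   # speakers overlap
print(f_ratio(good_feature) > f_ratio(bad_feature))
```

A larger ratio means the feature separates speakers well relative to how much it varies for the same speaker; the other properties on the slide (robustness, resistance to mimicry, ease of extraction) are not captured by this number.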
