Automatic Speech Recognition (ASR) was considered science fiction until not long ago. What has changed? Is everything now possible? The talk will include a brief description of ASR technologies covering speaker-related, language-related and content-based applications. The complexity of ASR will be illustrated through examples of input ambiguity, mismatched conditions, and the large variability in human vocal expression. The early days of ASR will be briefly reviewed, starting from DTW for isolated digit recognition, through HMMs for continuous speech recognition, up to the recent HMM/DNN systems in use today. Speaker recognition will also be presented, starting from GMM-based speaker verification, through i-vector speaker identification, leading to DNN-based x-vectors. These algorithmic and computational advances will be discussed in an attempt to address the ultimate question: can we “compete” with or even beat human performance in all these tasks? This question will be addressed using examples of the role of ASR in a few domains such as commercial speech analytics, medical analysis and intelligence. Old and new challenges of ASR will be pinpointed as a means of differentiating between probable and less probable achievements.
Irit Opher is the head of the Afeka Center for Language Processing (ACLP), an applied R&D center specializing in Automatic Speech Recognition. She is also a senior lecturer in the EE and Math departments at Afeka College of Engineering. Her main research interests include automatic transcription in challenging conditions, keyword spotting, spoken language understanding and speech-based disease diagnosis.
Prior to joining ACLP in 2014, Irit led research teams focusing on ASR and pattern analysis at Nice Systems and at a few start-up companies. Irit holds a PhD in Physics from Tel-Aviv University (1999), where she specialized in Computational Neuroscience.