
We are studying speech and natural language processing to automate
metadata indexing and annotation of digital archives, which are vital
for efficient access to multimedia digital content. We are also
exploring methods to automatically generate thumbnails of audio-visual
streams and to search large archives interactively through dialogue
with users.
Automatic Speech Recognition and Indexing
overview for a general audience (PDF)
overview for speech researchers (PDF)
introduction for graduate school applicants (PDF)
-
Large Vocabulary Continuous Speech Recognition Platform
...Julius site
We are developing open-source speech recognition software named
Julius, which has become a widely used platform for speech research
and application development.
-
Spontaneous Speech Recognition
We are studying next-generation speech recognition that can
automatically transcribe real-world spontaneous speech such as
lectures and meetings. Specifically, we investigate the modeling of
acoustic and pronunciation variations and methods for adapting
language models.
-
Indexing and Annotation of Lectures & Meetings
We are also studying content-based indexing and annotation for speech
archives. This involves transforming the transcript into document
style, indexing key sentences, and annotating discourse tags, as well
as speaker indexing.
-
Robust Speech Processing
We are also conducting research on robust speech processing that is
necessary for speech recognition systems to be deployed in the real
world. Topics include voice activity detection (VAD), denoising and
dereverberation.
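As an illustration of what voice activity detection does, the
following is a minimal sketch of a simple energy-threshold baseline.
It is not the method studied by our group; the function names, the
frame length of 160 samples, and the 15 dB margin are all assumptions
chosen for the example.

```python
import math


def frame_energies_db(samples, frame_len):
    """Return the log energy (dB) of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # A small constant avoids log10(0) on all-zero frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def energy_vad(samples, frame_len=160, margin_db=15.0):
    """Mark a frame as speech when its energy exceeds the quietest
    frame by more than margin_db (hypothetical default values)."""
    energies = frame_energies_db(samples, frame_len)
    if not energies:
        return []
    noise_floor = min(energies)  # crude noise estimate: quietest frame
    return [e > noise_floor + margin_db for e in energies]


def synth_signal():
    """Synthetic test input: quiet, loud, quiet (320 samples each)."""
    quiet = [0.001 * math.sin(0.3 * n) for n in range(320)]
    loud = [0.5 * math.sin(0.3 * n) for n in range(320)]
    return quiet + loud + quiet


if __name__ == "__main__":
    print(energy_vad(synth_signal()))
    # prints [False, False, True, True, False, False]
```

A fixed margin above the quietest frame only works when background
noise is stationary and well below the speech level, which is exactly
the assumption that breaks down in real-world conditions.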
Spoken Language Understanding and Dialogue
-
Speech Understanding
An effective human-machine speech interface requires a robust approach
to speech understanding and domain handling. We are studying an
approach based on concept-level key-phrase detection and verification,
as well as in-domain classification and verification.
-
Spoken Dialogue Interface for Document Information Retrieval
Current information retrieval systems such as Web search engines
require users to specify keywords, but do not ask questions to narrow
the results down to the intended items. We are studying interactive
systems that enable users to efficiently search large archives through
natural spoken dialogue.
CALL (Computer Assisted Language Learning)
We are involved in the research and development of a next-generation
CALL system that can automatically check the pronunciation of foreign
language learners and serve as a virtual language teacher for
simulated conversation practice.