Title
The Corpus DIMEx100: transcription and evaluation
Authors
LUIS ALBERTO PINEDA CORTES
HAYDE CASTELLANOS VARGAS
JANET JUAREZ ESCOBAR
Joaquim Llisterri
LUIS VILLASEÑOR PINEDA
Access level
Open Access
Subjects
Phonetic corpus
Phonetic transcription
Transcription granularity
Mexican Spanish
Acoustic models
Physical-mathematical sciences and Earth sciences (CTI)
Mathematics (CTI)
Computer science (CTI)
Abstract or description
This paper presents the transcription and evaluation of the DIMEx100 corpus for Mexican Spanish. We first describe the corpus and explain the linguistic and computational motivation for its design and collection process. We then present the phonetic antecedents and the alphabet adopted for the transcription task. The corpus has been transcribed at three different granularity levels, which are specified in detail, and corpus statistics are given for each level. A set of phonetic rules describing phonetic contexts observed empirically in spontaneous conversation is also validated against the transcription. The corpus has been used to build acoustic models and a phonetic dictionary for a speech recognition system. Initial performance results suggest that the data can be used to train good-quality acoustic models.
Publisher
Springer Science+Business Media B.V.
Publication date
2010
Publication type
Article
Publication version
Accepted version
Information resource
Format
application/pdf
Language
English
Relation
Language Resources & Evaluation, (44): 347–370
Audience
Students
Researchers
General public
Suggested citation
Pineda, L.A., et al. (2010). The Corpus DIMEx100: transcription and evaluation. Language Resources & Evaluation, (44): 347–370
Source repository
Repositorio Institucional del INAOE
Downloads
288