Title
Bilingual document clustering using Translation-Independent features
Authors
Claudia Denicia Carral
Manuel Montes y Gómez
Luis Villaseñor Pineda
Rita Mariana Aceves Perez
Access level
Open access
Subjects
Abstract or description
This paper focuses on the task of bilingual clustering, which involves dividing a set of documents written in two different languages into thematically homogeneous groups. It proposes a translation-independent approach especially suited to linguistically related languages. In particular, it proposes representing the documents by pairs of words that are orthographically or thematically related. An experimental evaluation on three bilingual collections, using two clustering algorithms, demonstrated the appropriateness of the proposed representation, whose results are comparable to those of other approaches based on complex linguistic resources such as machine translation systems, part-of-speech taggers, and named entity recognizers.
Publisher
IJCLA
Publication date
2010
Publication type
Article
Publication version
Accepted version
Information resource
Format
application/pdf
Language
English
Audience
Students
Researchers
General public
Suggested citation
Denicia-Carral, C., et al. (2010). Bilingual document clustering using Translation-Independent features. IJCLA, Vol. 1 (1-2): 217-230.
Source repository
Repositorio Institucional del INAOE
Downloads
286