Title

Bilingual document clustering using Translation-Independent features

Authors

Claudia Denicia Carral

Manuel Montes y Gómez

Luis Villaseñor Pineda

Rita Mariana Aceves Perez

Access level

Open access

Abstract or description

This paper addresses the task of bilingual clustering, which involves dividing a set of documents written in two different languages into thematically homogeneous groups. Its main proposal is a translation-independent approach especially suited to linguistically related languages. In particular, it represents documents by pairs of words that are orthographically or thematically related. An experimental evaluation on three bilingual collections, using two clustering algorithms, demonstrated the appropriateness of the proposed representation, whose results are comparable to those of other approaches based on complex linguistic resources such as machine translation systems, part-of-speech taggers, and named entity recognizers.

Publisher

IJCLA

Publication date

2010

Publication type

Article

Publication version

Accepted version

Format

application/pdf

Language

English

Audience

Students

Researchers

General public

Suggested citation

Denicia-Carral, C., et al. (2010). Bilingual document clustering using Translation-Independent features. IJCLA, Vol. 1 (1-2): 217-230.

Source repository

Repositorio Institucional del INAOE

Downloads

286
