Particle Swarm Model Selection
HUGO JAIR ESCALANTE BALDERAS MANUEL MONTES Y GOMEZ LUIS ENRIQUE SUCAR SUCCAR (2009)
This paper proposes the application of particle swarm optimization (PSO) to the problem of full model selection (FMS) for classification tasks. FMS is defined as follows: given a pool of preprocessing methods, feature selection and learning algorithms, select the combination of these that obtains the lowest classification error for a given data set; the task also includes the selection of hyperparameters for the considered methods. This problem generates a vast search space to be explored, well suited for stochastic optimization techniques. FMS can be applied to any classification domain as it does not require domain knowledge. Different model types and a variety of algorithms can be considered under this formulation. Furthermore, competitive yet simple models can be obtained with FMS. We adopt PSO for the search because of its proven performance in different problems and because of its simplicity, since neither expensive computations nor complicated operations are needed. Interestingly, the way the search is guided allows PSO to avoid overfitting to some extent. Experimental results on benchmark data sets give evidence that the proposed approach is very effective, despite its simplicity. Furthermore, results obtained in the framework of a model selection challenge show the competitiveness of the models selected with PSO, compared to models selected with other techniques that focus on a single algorithm and that use domain knowledge.
Article
Full model selection Machine learning challenge Particle swarm optimization Experimentation Cross validation CIENCIAS FÍSICO MATEMÁTICAS Y CIENCIAS DE LA TIERRA MATEMÁTICAS CIENCIA DE LOS ORDENADORES
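The search strategy named in this abstract can be illustrated with a minimal global-best PSO loop. This is only a sketch under stated assumptions, not the PSMS method itself: the function `pso_minimize`, its parameter values, and the stand-in objective `toy_error` are all hypothetical, and the search space here is purely continuous, whereas full model selection also involves discrete choices of methods.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Global-best PSO over a box-constrained continuous space (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_cost = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_cost.__getitem__)
    global_best = personal_best[best_idx][:]
    global_cost = personal_cost[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to own best + pull to swarm best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                    + c_global * r2 * (global_best[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            cost = objective(positions[i])
            if cost < personal_cost[i]:
                personal_cost[i] = cost
                personal_best[i] = positions[i][:]
                if cost < global_cost:
                    global_cost = cost
                    global_best = positions[i][:]
    return global_best, global_cost


def toy_error(x):
    # Hypothetical stand-in for a cross-validated classification error.
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best, cost = pso_minimize(toy_error, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a model selection setting, `objective` would be replaced by a function that builds a model from the particle's position and returns its estimated error.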
Algoritmos de agrupamiento global para datos mezclados
SAUL LOPEZ ESCOBAR (2007)
The clustering problem arises in many practical applications in areas such as Pattern Recognition, Machine Learning, Data Mining, and Digital Image Processing. The k-means algorithm is one of the most frequently used algorithms for solving the clustering problem, mainly because of its simplicity, but it has several drawbacks, among them: i) it only works with numeric data and ii) it depends heavily on the initial conditions.
On the other hand, in soft sciences such as Medicine, Geology, Sociology, and Marketing, objects are commonly described by numeric and non-numeric features simultaneously (mixed data).
In this context, we propose two restricted clustering algorithms based on the k-means algorithm. Both algorithms work with mixed data and do not depend on the initial conditions. The proposed algorithms are tested on data sets obtained from a public repository and compared against other restricted clustering algorithms.
Master thesis
Pattern recognition Pattern clustering Machine learning INGENIERÍA Y TECNOLOGÍA CIENCIAS TECNOLÓGICAS TECNOLOGÍA DE LOS ORDENADORES BANCOS DE DATOS
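To make the notion of mixed data concrete, the sketch below shows one common way to compare objects described by numeric and non-numeric features: squared Euclidean distance on the numeric part plus a weighted mismatch count on the categorical part, in the style of k-prototypes. This is an assumption for illustration only and is not the dissimilarity used in the thesis; the names `mixed_dissimilarity`, `assign_to_nearest`, and the weight `gamma` are hypothetical.

```python
def mixed_dissimilarity(a, b, numeric_idx, categorical_idx, gamma=1.0):
    """Squared Euclidean distance on numeric features plus gamma times
    the number of mismatching categorical features."""
    numeric_part = sum((a[i] - b[i]) ** 2 for i in numeric_idx)
    categorical_part = sum(1 for i in categorical_idx if a[i] != b[i])
    return numeric_part + gamma * categorical_part


def assign_to_nearest(obj, prototypes, numeric_idx, categorical_idx, gamma=1.0):
    """Index of the prototype with the smallest mixed dissimilarity to obj."""
    return min(
        range(len(prototypes)),
        key=lambda k: mixed_dissimilarity(
            obj, prototypes[k], numeric_idx, categorical_idx, gamma
        ),
    )


# Toy objects: two scaled numeric features followed by two categorical ones.
prototypes = [(0.2, 0.1, "no", "low"), (0.9, 0.8, "yes", "high")]
patient = (0.8, 0.7, "yes", "low")
cluster = assign_to_nearest(patient, prototypes, [0, 1], [2, 3])
```

The assignment step of a k-means style loop would call `assign_to_nearest` for every object; how prototypes are updated for categorical features is a separate design choice not shown here.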
Mexican sign language alphanumerical gestures recognition using 3D haar-like features
JAVIER ARMANDO JIMENEZ VILLAFAÑA ANABEL MARTÍN GONZALEZ VICTOR EMANUEL DE ATOCHA UC CETINA ARTURO ESPINOSA ROMERO (2017)
The Mexican Sign Language (LSM) is a language of the deaf Mexican community, which consists of a series of gestural signs articulated by hands and accompanied with facial expressions. The lack of automated systems to translate signs from LSM makes integration of hearing-impaired people to society more difficult. This work presents a new method for LSM alphanumerical signs recognition based on 3D Haar-like features extracted from depth images captured by the Microsoft Kinect sensor. Features are processed with a boosting algorithm. To evaluate performance of our method, we recognized a set of signs from letters and numbers, and compared the results with the use of traditional 2D Haar-like features. Our system is able to recognize static LSM signs with a higher accuracy rate than the one obtained with widely used 2D features.
Article
INGENIERÍA Y TECNOLOGÍA Boosting Gesture recognition Sign language Machine learning 3D Haar-like features
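For readers unfamiliar with the traditional 2D Haar-like features used as the baseline in this abstract, the sketch below computes a two-rectangle feature with an integral image (summed-area table). It is a minimal illustration under assumptions: the function names and the toy depth patch are hypothetical, and the 3D features proposed in the paper are not reproduced here.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a window (a vertical edge feature)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Toy depth patch: near surface on the left, far surface on the right.
depth = [[1, 1, 5, 5],
         [1, 1, 5, 5]]
ii = integral_image(depth)
edge_response = two_rect_feature(ii, 0, 0, 2, 4)
```

A boosting algorithm would then treat many such feature responses, computed at different positions and scales, as inputs to weak classifiers.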
Social-ecological analysis of timely rice planting in Eastern India
Anton Urfels Andrew Mcdonald Paul Struik Balwinder-Singh Timothy Joseph Krupnik (2021)
Article
CIENCIAS AGROPECUARIAS Y BIOTECNOLOGÍA SUSTAINABLE AGRICULTURE CLIMATE AGROECOSYSTEMS CROPPING SYSTEMS MACHINE LEARNING GROUNDWATER MONSOONS SOWING DATE LANDSCAPE
Using machine learning for extracting information from natural disaster news reports
Usando aprendizaje automático para extraer información de noticias de desastres naturales
ALBERTO TELLEZ VALERO MANUEL MONTES Y GOMEZ LUIS VILLASEÑOR PINEDA (2009)
Disasters caused by natural phenomena have been present throughout human history; nevertheless, their consequences are greater each time. This tendency will not be reversed in the coming years; on the contrary, it is expected that natural phenomena will increase in number and intensity due to global warming. Because of this situation, it is of great interest to have sufficient data related to natural disasters, since these data are absolutely necessary to analyze their impact as well as to establish links between their occurrence and their effects. In accordance with this need, in this paper we describe a system based on Machine Learning methods that improves the acquisition of natural disaster data. This system automatically populates a natural disaster database by extracting information from online news reports. In particular, it extracts information about five different types of natural disasters: hurricanes, earthquakes, forest fires, floods, and droughts. Experimental results on a collection of Spanish news show the effectiveness of the proposed system for detecting relevant documents about natural disasters (reaching an F-measure of 98%), as well as for extracting relevant facts to be inserted into a given database (reaching an F-measure of 76%).
Article
Machine Learning Information Extraction Text Categorization Natural Disasters Databases Aprendizaje Automático Extracción de Información Clasificación Temática de Textos Desastres Naturales Bases de Datos CIENCIAS FÍSICO MATEMÁTICAS Y CIENCIAS DE LA TIERRA MATEMÁTICAS CIENCIA DE LOS ORDENADORES
Walter Mupangwa Isaiah Nyagumbo Mainassara Zaman-Allah (2020)
Maize kernel traits such as kernel length, kernel width, and kernel number determine the total kernel weight and, consequently, maize yield. Therefore, the measurement of kernel traits is important for maize breeding and the evaluation of maize yield. There are a few methods that allow the extraction of ear and kernel features through image processing. We evaluated the potential of deep convolutional neural networks and binary machine learning (ML) algorithms (logistic regression (LR), support vector machine (SVM), AdaBoost (ADB), classification tree (CART), and k-nearest neighbors (kNN)) for accurate maize kernel abortion detection and classification. The algorithms were trained using 75% of the 66 total images, and the remaining 25% was used for testing their performance. Confusion matrix, classification accuracy, and precision were the major metrics in evaluating the performance of the algorithms. The SVM and LR algorithms were highly accurate and precise (100%) under all the abortion statuses, while the remaining algorithms had a performance greater than 95%. Deep convolutional neural networks were further evaluated using different activation and optimization techniques. The best performance (100% accuracy) was reached using the rectified linear unit (ReLU) activation function and the Adam optimization technique. Maize ears with abortion were accurately detected by all tested algorithms with minimum training and testing time compared to ears without abortion. The findings suggest that deep convolutional neural networks, supplemented with the binary machine learning algorithms, can be used to detect the maize ear abortion status in maize breeding programs. By using a convolutional neural network (CNN) method, more data (big data) can be collected and processed for hundreds of maize ears, accelerating the phenotyping process.
Article
CIENCIAS AGROPECUARIAS Y BIOTECNOLOGÍA MACHINE LEARNING NEURAL NETWORKS MAIZE CROP YIELD IMAGE ANALYSIS
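The evaluation metrics named in this abstract (confusion matrix, classification accuracy, and precision) can be computed for a binary task as sketched below. The function `binary_metrics` and the toy labels are hypothetical and are not data from the study; label 1 is assumed here to mean "ear with abortion".

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix counts plus accuracy and precision for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Precision is undefined when nothing is predicted positive; report 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "accuracy": accuracy, "precision": precision}


# Toy held-out labels (hypothetical): 1 = ear with abortion, 0 = without.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```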
Mexican experience in Spanish question answering
Experiencia mexicana en la búsqueda de respuestas en español
MANUEL MONTES Y GOMEZ LUIS VILLASEÑOR PINEDA AURELIO LOPEZ LOPEZ (2008)
Nowadays, due to the great advances in communication and storage media, there is more information available than ever before. This information can satisfy almost every information need; nevertheless, without appropriate management facilities, all of it is practically useless. This fact has motivated the emergence of several text processing applications that help in accessing large document collections. Currently, there are three main approaches for this purpose: information retrieval, information extraction, and question answering. Question answering (QA) systems aim to identify the exact answer to a question from a given document collection. This paper presents a survey of the Mexican experience in Spanish QA. In particular, it presents an overview of the participations of the Language Technologies Laboratory of INAOE (LabTL) in the Spanish QA evaluation task at CLEF, from 2004 to 2007. Through these participations, the LabTL has mainly explored two different approaches for QA: a language-independent approach based on statistical methods, and a language-dependent approach supported by sophisticated linguistic analyses of texts. It is important to point out that, due to these works, the LabTL has become one of the leading research groups in Spanish QA.
Article
Question Answering Passage Retrieval Answer Extraction Machine Learning Búsqueda de Respuestas Recuperación de Pasajes Extracción de Respuestas Aprendizaje Automático CIENCIAS FÍSICO MATEMÁTICAS Y CIENCIAS DE LA TIERRA MATEMÁTICAS CIENCIA DE LOS ORDENADORES
Multi-class particle swarm model selection for automatic image annotation
Hugo Jair Escalante Balderas Manuel Montes y Gómez Luis Enrique Sucar Succar (2012)
This article describes the application of particle swarm model selection (PSMS) to the problem of automatic image annotation (AIA). PSMS can be considered a black-box tool for the selection of effective classifiers in binary classification problems. We approach the AIA problem as one of multi-class classification, considering a one-vs-all (OVA) strategy. OVA turns a multi-class problem into a series of binary classification problems, each of which deals with whether a region belongs to a particular class or not. We use PSMS to select the models that compose the OVA classifier and propose a new technique for making multi-class decisions from the selected classifiers. This way, effective classifiers can be obtained in acceptable times; specific methods for preprocessing, feature selection and classification are selected for each class; and, most importantly, very good annotation performance can be obtained. We present experimental results on six data sets that give evidence of the validity of our approach; to the best of our knowledge the results reported herein are the best obtained so far on the data sets we consider. It is important to emphasize that although the application domain we consider is AIA, nothing prevents us from applying the methods described in this article to any other multi-class classification problem.
Article
Classification Particle swarm optimization Particle swarm model selection Machine learning Image annotation Object recognition CIENCIAS FÍSICO MATEMÁTICAS Y CIENCIAS DE LA TIERRA MATEMÁTICAS CIENCIA DE LOS ORDENADORES
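The one-vs-all decomposition described in this abstract can be sketched as follows. This is an illustration under assumptions, not the article's method: each binary model here is a simple centroid-based scorer rather than a classifier selected by PSMS, the final decision is the plain "highest score wins" rule rather than the new decision technique the article proposes, and all names and data are hypothetical.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_binary(X, is_positive):
    """Stand-in binary model: positive score means closer to the class."""
    pos = [x for x, p in zip(X, is_positive) if p]
    neg = [x for x, p in zip(X, is_positive) if not p]
    pos_c, neg_c = centroid(pos), centroid(neg)

    def score(x):
        return sq_dist(x, neg_c) - sq_dist(x, pos_c)

    return score


def train_ova(X, y):
    """One binary model per class: that class against all the others."""
    return {label: train_binary(X, [yi == label for yi in y])
            for label in sorted(set(y))}


def predict_ova(models, x):
    """Assign the class whose binary model gives the highest score."""
    return max(models, key=lambda label: models[label](x))


# Toy region descriptors (hypothetical 2-D features) with three labels.
X = [(0, 0), (0, 1), (1, 0), (5, 0), (5, 1), (6, 0), (0, 5), (1, 5), (0, 6)]
y = ["sky"] * 3 + ["grass"] * 3 + ["water"] * 3
models = train_ova(X, y)
label = predict_ova(models, (5.5, 0.5))
```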
OCTAVIO GOMEZ RAMOS (2007)
In order to reach a balance between the fulfillment of human needs and the protection of the environment, it is necessary to have detailed and accurate information about natural resources. Such information can be obtained through thematic maps, one of the products of remote sensing. In remote sensing, the generation of accurate thematic maps presents many research challenges, one of which is image segmentation.
In this thesis, a novel segmentation algorithm based on seeded region growing and instance-based learning is proposed. The algorithm includes a novel automatic seed generation approach based on histogram analysis, a new weighted instance-based learning algorithm (WIBK) which obtains one or more weights per feature per class, a novel region growing algorithm (SRG-WIBK) that uses WIBK as its decision criterion, and a novel region-merging scheme based on ownership tables which allows regions to be merged according to user needs. The WIBK algorithm was experimentally evaluated on several databases from the UCI repository and compared against instance-based and non-instance-based learning algorithms, showing very competitive performance. The SRG-WIBK algorithm was tested on synthetic multispectral images and compared against the algorithms implemented in the ERDAS software, showing comparable results.
Master thesis
Remote sensing Machine learning Computer vision INGENIERÍA Y TECNOLOGÍA CIENCIAS TECNOLÓGICAS TECNOLOGÍA DE LOS ORDENADORES LENGUAJES ALGORÍTMICOS
Using Wittgenstein’s family resemblance principle to learn exemplars
ANDRES FLORENCIO RODRIGUEZ MARTINEZ LUIS ENRIQUE SUCAR SUCCAR Jia Wu (2008)
The introduction of the notion of family resemblance represented a major shift in Wittgenstein’s thoughts on the meaning of words, moving away from a belief that words were well defined, to a view that words denoted less well defined categories of meaning. This paper presents the use of the notion of family resemblance in the area of machine learning as an example of the benefits that can accrue from adopting the kind of paradigm shift taken by Wittgenstein. The paper presents a model capable of learning exemplars using the principle of family resemblance and adopting Bayesian networks for a representation of exemplars. An empirical evaluation is presented on three data sets and shows promising results that suggest that previous assumptions about the way we categorise need reopening.
Article
Machine learning Family resemblance Bayesian networks CIENCIAS FÍSICO MATEMÁTICAS Y CIENCIAS DE LA TIERRA MATEMÁTICAS CIENCIA DE LOS ORDENADORES