Author: Mario Graff
Recently, sentiment analysis has received a lot of attention due to the interest in mining the opinions of social media users. Sentiment analysis consists of determining the polarity of a given text, i.e., its degree of positiveness or negativeness. Traditionally, sentiment analysis algorithms have been tailored to a specific language, given the complexity introduced by the many lexical variations and errors made by the people generating content. In this contribution, our aim is to provide a multilingual framework that is simple to implement and easy to use, and that can serve as a baseline for sentiment analysis contests and as a starting point for building new sentiment analysis systems. We evaluate our approach in eight different languages, three of which correspond to important international contests, namely SemEval (English), TASS (Spanish), and SENTIPOLC (Italian). Within the competitions, our approach reaches medium to high positions in the rankings, whereas in the remaining languages it outperforms the reported results.
Multilingual sentiment analysis; Error-robust text representations; Opinion mining; Engineering and Technology; Technological Sciences; Computer Technology; Artificial Intelligence
This paper describes a novel approach to learning term-weighting schemes (TWSs) in the context of text classification. In text mining, a TWS determines how documents are represented in a vector space model before a classifier is applied. Although acceptable performance has been obtained with standard TWSs (e.g., Boolean and term-frequency schemes), defining TWSs has traditionally been an art. Further, it remains difficult to determine the best TWS for a particular problem, and it is not yet clear whether schemes better than those currently available can be generated by combining known TWSs. In this article we propose a genetic program that aims at learning effective TWSs that improve on the performance of current schemes in text classification. The genetic program learns how to combine a set of basic units to give rise to discriminative TWSs. We report an extensive experimental study comprising data sets from thematic and non-thematic text classification as well as from image classification. Our study shows the validity of the proposed method: TWSs learned with the genetic program outperform traditional schemes and other TWSs proposed in recent works. Further, we show that TWSs learned from a specific domain can be effectively used for other tasks.
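To make the notion of a term-weighting scheme concrete, the following minimal sketch computes the two standard schemes named in the abstract (Boolean and term frequency) and one well-known combination of basic units, term frequency multiplied by inverse document frequency. It does not implement the genetic program, and the tokenizer, the function names, and the toy corpus are assumptions made only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def boolean_weights(tokens, vocabulary):
    """Boolean scheme: 1.0 if the term occurs in the document, else 0.0."""
    present = set(tokens)
    return [1.0 if term in present else 0.0 for term in vocabulary]


def tf_weights(tokens, vocabulary):
    """Term-frequency scheme: raw count of each term in the document."""
    counts = Counter(tokens)
    return [float(counts[term]) for term in vocabulary]


def idf_unit(corpus_tokens, vocabulary):
    """Inverse document frequency, log(N / df), one value per term."""
    n_docs = len(corpus_tokens)
    doc_freq = Counter(term for tokens in corpus_tokens for term in set(tokens))
    return [math.log(n_docs / doc_freq[term]) for term in vocabulary]


def combine(unit_a, unit_b):
    """Combine two basic units by element-wise product (e.g. TF x IDF)."""
    return [a * b for a, b in zip(unit_a, unit_b)]


# Toy corpus, invented purely for illustration.
corpus = ["good good movie", "bad movie"]
corpus_tokens = [tokenize(doc) for doc in corpus]
vocabulary = sorted({term for tokens in corpus_tokens for term in tokens})
idf = idf_unit(corpus_tokens, vocabulary)

for tokens in corpus_tokens:
    tf = tf_weights(tokens, vocabulary)
    print(boolean_weights(tokens, vocabulary), tf, combine(tf, idf))
```

Each function maps a document to a vector over the same vocabulary, so any of them can feed a classifier. A learned TWS, in the sense of the abstract, would replace the hand-written `combine` with an expression discovered by search over such units.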
Quality tests applied to hydraulic concrete, such as compressive, tensile, and bending strength tests, are used to guarantee proper characteristics of materials. All these assessments are performed by destructive tests (DTs). The trend is to carry out quality analysis using nondestructive tests (NDTs), which have been widely used for decades. This paper proposes a framework for predicting concrete compressive strength and modulus of rupture by combining data from four NDTs (electrical resistivity, ultrasonic pulse velocity, resonant frequency, and hammer test rebound) with DT data. The model, determined with the multiple linear regression technique, produces accurate predictions of these indicators and categorizes the importance of each NDT estimate. The model is identified from all the possible linear combinations of the available NDTs, and it was selected using a cross-validation technique. Furthermore, the generality of the model was assessed by comparing results from additional specimens fabricated afterwards.
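The model-selection idea in the abstract above can be sketched as follows: fit a multiple linear regression for every subset of the NDT predictors and keep the subset with the lowest cross-validation error. This is an illustrative sketch, not the authors' procedure. Reading "all the possible linear combinations" as all predictor subsets, using leave-one-out cross-validation, and the synthetic data are all assumptions.

```python
import random
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Least squares with an intercept, via the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Intercept plus the weighted sum of the predictors."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def loo_cv_error(rows, targets):
    """Mean squared error under leave-one-out cross-validation."""
    total = 0.0
    for i in range(len(rows)):
        coefs = fit_ols(rows[:i] + rows[i + 1:], targets[:i] + targets[i + 1:])
        total += (predict(coefs, rows[i]) - targets[i]) ** 2
    return total / len(rows)


def select_subset(data, targets, names):
    """Try every non-empty predictor subset; return (best error, names)."""
    best = None
    for size in range(1, len(names) + 1):
        for idx in combinations(range(len(names)), size):
            rows = [[row[j] for j in idx] for row in data]
            err = loo_cv_error(rows, targets)
            if best is None or err < best[0]:
                best = (err, [names[j] for j in idx])
    return best


# Synthetic data, NOT measurements: the target depends only on the first
# and last predictor plus small noise, so selection should recover both.
rng = random.Random(0)
NDT_NAMES = ["resistivity", "pulse_velocity", "resonant_frequency", "rebound"]
data = [[rng.random() for _ in NDT_NAMES] for _ in range(20)]
strength = [5.0 + 2.0 * r[0] + 3.0 * r[3] + rng.gauss(0.0, 0.05) for r in data]

best_error, best_names = select_subset(data, strength, NDT_NAMES)
print(best_names, best_error)
```

With four predictors there are only 15 subsets, so exhaustive search is cheap. The fitted coefficients of the selected model (after standardizing the predictors) are one way to rank the contribution of each NDT, although the abstract does not say how importance was measured.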