Técnicas inteligentes para el análisis de condiciones medioambientales

  1. Arroyo Puente, Angel
Supervised by:
  1. Emilio Santiago Corchado Rodríguez Director
  2. Álvaro Herrero Cosío Director
  3. Verónica Tricio Gómez Director

Defence university: Universidad de Salamanca

Defence date: 19 May 2017

Committee:
  1. Hilde Pérez García Chair
  2. José Luis Calvo Rolle Secretary
  3. Paulo Novais Committee member

Type: Thesis

Teseo: 476828 DIALNET

Abstract

It is well known that air quality is an important and worrying issue nowadays, affecting not only human health but also many other aspects such as climate change and the survival of the biosphere. In recent years, many public institutions have adapted to the restrictive rules on environmental pollution imposed by European regulations, Spain being one of the countries that must comply with them. Both in Spain and in other countries there are various air-quality networks and stations for the continuous acquisition of meteorological parameters. These networks are present not only in big cities, but also in peripheral and industrial areas, as well as in places where the preservation of nature is a key issue. Furthermore, they are constantly rearranged to improve their operation. In this PhD thesis, different intelligent techniques (more specifically, Soft Computing techniques) have been applied to publicly available databases with air quality and/or meteorological information. The applied techniques perform two fundamental tasks: dimensionality reduction and clustering. They have been applied both in isolation and in conjunction in order to improve the results in the analysis of environmental conditions.
The dimensionality reduction techniques applied are: Principal Component Analysis (PCA), applied first to obtain an approximation to the structure of the dataset; Locally Linear Embedding (LLE), a non-linear local technique; Maximum Likelihood Hebbian Learning (MLHL) and Cooperative Maximum Likelihood Hebbian Learning (CMLHL), neural models that implement Exploratory Projection Pursuit; Curvilinear Component Analysis (CCA), a non-linear technique that tries to preserve interpoint distances in the output space; Multidimensional Scaling (MDS), a non-linear global technique that operates on the distance matrix; Isometric Mapping (ISOMAP), a technique derived from MDS; and Self-Organizing Maps (SOM), a competitive-learning neural model. The clustering techniques applied are, on the one hand, partitional techniques: k-means, applied first, which assigns samples to groups using distance metrics; SOM k-means, which uses the SOM algorithm for the weight-updating process; k-medoids, a technique derived from k-means that takes one of the samples belonging to each cluster as the centre of that cluster; and fuzzy c-means, a fuzzy-logic-based technique for grouping samples. On the other hand, hierarchical agglomerative techniques (where groups are formed bottom-up) have also been applied, together with different clustering evaluation indexes, used to estimate the number of groups present in a dataset, and dendrograms, which represent the clustering graphically as a tree. The case studies have been carefully selected and range from local and regional to national contexts. The periods of time analyzed have been selected with equal care.
In some of the studies, the analyzed period is one day long, chosen to examine meteorological or air quality conditions over a short time interval in a certain place, while in other cases long periods of time (close to a decade) are used to analyze some of the most climatologically representative places in Spain. Starting from one or more public datasets containing information about environmental conditions (weather, air quality, or both), and always analyzing the key variables that characterize those conditions, the goal is to extract the meaningful information in the datasets by applying intelligent techniques. This leads to an analysis of the environmental conditions in the selected case studies. In each case study, the weather or air quality conditions in the selected places and periods of time are analyzed, searching for similarities and differences among the data samples, emphasizing the anomalous situations detected and trying to explain these phenomena. A comparative analysis of the results obtained with the different techniques is also performed, considering the advantages and disadvantages of each of them in each case study. Dimensionality reduction techniques are useful for graphically analyzing high-dimensional datasets, finding relationships in them and detecting anomalous situations. Complementarily, clustering techniques reveal the structure of datasets by assigning the data samples to different clusters depending on the distance and similarity measures applied. In the present work, this is useful for understanding the similarities and differences in the meteorological and/or air quality conditions of the different locations selected in each case study.
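
As an illustration of the partitional clustering mentioned in the abstract, the following is a minimal sketch of k-means in plain Python, with a helper that picks a medoid (an actual sample) as the cluster centre, as k-medoids does. The toy two-dimensional samples, the function names and the choice of the first k samples as initial centroids are all assumptions made for illustration; they are not the data or the implementation used in the thesis.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest_centroid(sample, centroids):
    """Index of the centroid closest to the given sample."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(sample, centroids[j]))


def k_means(samples, k, max_iter=100):
    """Basic k-means: alternate assignment and centroid update until stable."""
    # Assumption for reproducibility: the first k samples seed the centroids.
    centroids = [tuple(s) for s in samples[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest_centroid(s, centroids) for s in samples]
        if new_labels == labels:  # assignments unchanged, so we have converged
            break
        labels = new_labels
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = mean_point(members)
    return labels, centroids


def medoid(points):
    """The sample with the smallest total squared distance to all the others."""
    return min(points, key=lambda p: sum(squared_distance(p, q) for q in points))


# Hypothetical toy samples (two made-up variables per sample).
samples = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.8), (0.8, 1.2), (4.8, 5.2)]
labels, centroids = k_means(samples, k=2)
first_cluster = [s for s, lab in zip(samples, labels) if lab == 0]
print(labels, centroids, medoid(first_cluster))
```

The difference between the two centres is that the k-means centroid is a mean and need not coincide with any sample, whereas the medoid is always one of the samples in the cluster.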