abstract
| - Data mining is the process of extracting knowledge and patterns from data. Classification and Regression are among the major data mining tasks, where the goal is to predict a value of an attribute of interest for each data instance, given the values of a set of predictive attributes. Most classification and regression problems involve continuous, ordinal and categorical attributes. Currently Ant Colony Optimization (ACO) algorithms have focused on directly handling categorical attributes only; continuous attributes are transformed using a discretisation procedure in either a preprocessing stage or dynamically during the rule creation. The use of a discretisation procedure has several limitations: (i) it increases the computational runtime, since several candidates values need to evaluated; (ii) requires access to the entire attribute domain, which in some applications all data is not available; (iii) the values used to create discrete intervals are not optimised in combination with the values of other attributes. This thesis investigates the use of solution archive pheromone model, based on Ant Colony Optimization for mixed-variable (ACOMV) algorithm, to directly cope with all attribute types. Firstly, an archive-based ACO classification algorithm is presented, followed by an automatic design framework to generate new configuration of ACO algorithms. Then, we addressed the challenging problem of mining data streams, presenting a new ACO algorithm in combination with a hybrid pheromone model. Finally, the archive-based approach is extended to cope with regression problems. All algorithms presented are compared against well-known algorithms from the literature using publicly available data sets. Our results have been shown to improve the computational time while maintaining a competitive predictive performance.
|