abstract
| - Regression problems provide some of the most challenging research opportunities in machine learning,
where the prediction of a target variable is critical to a specific application. Rainfall is a prime example, as it
exhibits unique characteristics of high volatility and chaotic patterns that do not exist in other time series data. Moreover,
rainfall is essential for applications built around financial securities, such as rainfall derivatives. This paper
extensively evaluates a novel algorithm called Decomposition Genetic Programming (DGP), which
decomposes the problem of rainfall prediction into subproblems. Decomposition allows the GP to focus on each subproblem
before the subproblems are combined back into the full problem. The GP does this by having a separate regression equation for
each subproblem, based on the level of rainfall. Turning attention to subproblems reduces the difficulty
of dealing with data sets with high volatility and extreme rainfall values, since these values can be focused on
independently. We extensively evaluate our algorithm on 42 cities from Europe and the USA, and compare its performance
to the current state of the art (Markov chain extended with rainfall prediction) and six other popular machine
learning algorithms (Genetic Programming without decomposition, Support Vector Regression, Radial Basis Neural
Networks, M5 Rules, M5 Model trees, and k-Nearest Neighbours). Results show that the DGP is able to consistently
and significantly outperform all other algorithms. Lastly, this work also discusses the effect that
DGP has on the coverage of the rainfall predictions and whether it shows robust performance across different
climates.
|
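The decomposition idea described in the abstract, one regression equation per rainfall level, can be illustrated with a minimal sketch. This is not the DGP algorithm itself: ordinary least-squares lines stand in for the evolved GP equations, and the names `fit_line`, `band_of`, `fit_decomposed` and `predict`, the single threshold, and the toy data are all hypothetical. The abstract does not say how a rainfall band is chosen at prediction time, so the sketch simply takes the band as an input.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a stand-in for an evolved GP equation)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # No spread in x: fall back to a constant prediction.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def band_of(rainfall, thresholds):
    """Index of the rainfall band a value falls into (thresholds sorted ascending)."""
    return sum(rainfall >= t for t in thresholds)


def fit_decomposed(xs, ys, thresholds):
    """Split training rows by target rainfall level and fit one equation per band."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(band_of(y, thresholds), []).append((x, y))
    return {
        band: fit_line([x for x, _ in rows], [y for _, y in rows])
        for band, rows in groups.items()
    }


def predict(models, x, band):
    """Apply the equation of the given band (band selection is assumed to be external)."""
    a, b = models[band]
    return a + b * x


# Toy data (hypothetical): a low-rainfall regime and an extreme-rainfall regime.
xs = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
ys = [1.0, 1.5, 2.0, 2.5, 3.0, 20.0, 22.0, 24.0, 26.0, 28.0]
models = fit_decomposed(xs, ys, thresholds=[10.0])
```

Fitting each band separately keeps the extreme values from distorting the equation for ordinary days, which is the motivation the abstract gives for decomposition.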