Paolo Foschi and Efstratios Gallopoulos Description:
This WG will collaborate with WG B
for the development of Task 3
, and with WG D
for the development of Task 4
. Initially, the existing computational methods usedfor the most basic problems will be improved both in efficiency and accuracy. Subsequently, new algorithms to make the results of WG B
applicable in practice will be developed. Cross-sectional research by providing alternatives to heuristic algorithms will be developed. Computationally intensive key-problems in robust clustering, decision learning trees, multiple testing and feature selection will be addressed. The setting includes problems traditionally solved by using stepwise algorithms which cannot be guaranteed to find either the optimal solution or any bound on the error. One of the innovative objectives is to develop new strategies able to provide such a bound when it is not computationally feasible to attain the optimum based on exact algorithms which are feasible for small size. The computational burden of these methods is high, but can be greatly reduced by parallelization. Thus, the parallelization of the derived strategies and their implementations in different architectures, such as GPU and clusters of PCs, will be explored. Algorithms based on the paradigms of computing coresets and/or sketches of data will be investigated. These algorithms can, by design, be easily distributed (for example, using Map-Reduce), parallelized or deal with streams of data.