On the Scalability of Genetic Algorithms to Very Large-Scale Feature Selection

Moser, Andreas and Murty, Narasimha M (2000) On the Scalability of Genetic Algorithms to Very Large-Scale Feature Selection. In: Lecture Notes in Computer Science, April 2000, Edinburgh, Scotland, UK, pp. 77-86.

Official URL: http://www.springerlink.com/content/10a9u7rr22fj0k...

Abstract

Feature Selection is a very promising optimisation strategy for Pattern Recognition systems. As an NP-complete task, however, it is extremely difficult to carry out. Past studies were therefore rather limited in either the cardinality of the feature space or the number of patterns used to assess feature subset performance. This study examines the scalability of Distributed Genetic Algorithms (GAs) to very large-scale Feature Selection. As the domain of application, a classification system for optical characters is chosen. The system is tailored to classify hand-written digits and involves 768 binary features. Owing to the size of the investigated problem, this study forms a step into new territory in Feature Selection for classification. We present a set of customisations of GAs that allow known concepts to be applied to Feature Selection problems of practical interest. Some limitations of GAs in the domain of Feature Selection are revealed and improvements are suggested. A widely used strategy to accelerate the optimisation process, Training Set Sampling, was observed to fail in this domain of application. Experiments on unseen validation data suggest that Distributed GAs are capable of reducing the problem complexity significantly. The results show that the classification accuracy can be maintained while reducing the feature space cardinality by about 50%. Genetic Algorithms are demonstrated to scale well to very large-scale problems in Feature Selection.
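
For readers unfamiliar with the approach, the following is a minimal sketch of how a distributed (island-model) GA can search over feature subsets. It is not the authors' implementation: the abstract gives no algorithmic details, so the operators (binary tournament selection, uniform crossover, bit-flip mutation, ring migration), all parameter values, and all function names are assumptions chosen for illustration. The toy fitness function is likewise hypothetical and stands in for the classifier accuracy that a real study would measure on training patterns. Each chromosome is a bit mask with one bit per feature, so the paper's setting would correspond to 768 bits.

```python
import random

def make_fitness(informative, penalty=0.01):
    """Toy stand-in for classifier accuracy (an assumption, not from the paper):
    reward each selected informative feature and penalise subset size."""
    def fitness(mask):
        hits = sum(1 for i in informative if mask[i])
        return hits - penalty * sum(mask)
    return fitness

def evolve_island(pop, fitness, rng, mutation_rate):
    """One generation on one island, with elitism."""
    def pick():
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b
    children = [max(pop, key=fitness)[:]]  # keep a copy of the island's best
    while len(children) < len(pop):
        p1, p2 = pick(), pick()
        # Uniform crossover followed by bit-flip mutation.
        child = [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]
        child = [1 - g if rng.random() < mutation_rate else g for g in child]
        children.append(child)
    return children

def distributed_ga(n_features, fitness, n_islands=4, pop_size=20,
                   generations=60, migrate_every=10, seed=0):
    """Island-model GA over bit masks; returns the best mask found."""
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_features)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    mutation_rate = 1.0 / n_features
    for gen in range(1, generations + 1):
        islands = [evolve_island(pop, fitness, rng, mutation_rate)
                   for pop in islands]
        if gen % migrate_every == 0:
            # Ring migration: each island's best replaces the worst
            # individual of the next island in the ring.
            bests = [max(pop, key=fitness) for pop in islands]
            for k, pop in enumerate(islands):
                worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
                pop[worst] = bests[k - 1][:]
    return max((ind for pop in islands for ind in pop), key=fitness)

if __name__ == "__main__":
    informative = set(range(0, 64, 4))  # hypothetical: 16 useful features of 64
    fit = make_fitness(informative)
    best = distributed_ga(64, fit)
    print(sum(best), "features selected, fitness", round(fit(best), 2))
```

In a realistic setting the fitness call dominates the cost, since every evaluation means classifying a set of patterns with the candidate feature subset. That cost is what strategies such as Training Set Sampling try to reduce, and the islands are natural units for parallel evaluation.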

Item Type: Conference Paper
Additional Information: Copyright of this article belongs to Springer.
Department/Centre: Division of Electrical Sciences > Computer Science & Automation (Formerly, School of Automation)
Date Deposited: 17 Sep 2004
Last Modified: 11 Jan 2012 09:19
URI: http://eprints.iisc.ernet.in/id/eprint/1773
