CINF 81 |
| The Decision Tree (DT), as a classification algorithm, has certain advantages over other methods such as Neural Networks or Support Vector Machines. Apart from producing interpretable models, DTs inherently select the descriptors that are relevant to modeling the given property during tree building itself. However, cheminformatics data are characterized by a high-dimensional feature space and few samples available for training, and in this setting DTs tend to suffer. Here, ‘parameter tuning’ and ‘feature selection’ become important. In this study, we present our findings on the influence of parameters such as ‘attribute selection measure’, ‘tree stopping criterion’ and ‘tree pruning method’ on the size and performance of the learned Decision Trees. Further, we introduce an initial feature selection step, using wrappers, before invoking DT learning in order to handle high-dimensional data. Finally, we compare our results with those obtained from ‘Decision Forest’, which is an ensemble of DTs. |
|
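As an illustration of what an ‘attribute selection measure’ does during tree building, the following minimal sketch scores two candidate descriptors with two commonly used measures, information gain (entropy-based) and Gini gain. The abstract does not state which measures were studied, so the choice of these two is an assumption, and the descriptor names (`has_ring`, `is_heavy`), labels, and data are hypothetical toy values, not results from the study.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gain(values, labels, impurity):
    """Impurity reduction obtained by partitioning the samples on a
    discrete descriptor; `impurity` is either `entropy` or `gini`."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / n * impurity(g) for g in groups.values())
    return impurity(labels) - weighted


# Hypothetical toy data: six compounds, two binary descriptors.
labels = ["active", "active", "active", "inactive", "inactive", "inactive"]
descriptors = {
    "has_ring": [1, 1, 1, 0, 0, 0],  # separates the two classes perfectly
    "is_heavy": [1, 0, 1, 0, 1, 0],  # only weakly related to the class
}

for name, values in descriptors.items():
    info_gain = split_gain(values, labels, entropy)
    gini_gain = split_gain(values, labels, gini)
    print(f"{name}: information gain={info_gain:.3f}, Gini gain={gini_gain:.3f}")
```

Under either measure the tree would split on the descriptor with the larger gain first; on less clear-cut data the two measures can rank descriptors differently, which is one way the choice of measure can affect tree size and performance.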
General Papers
8:30 AM-11:50 AM, Thursday, August 23, 2007 BCEC -- 252 A, Oral
Division of Chemical Information |