CINF 76 |
| We present an extensive study to build classification and regression models using five different ADMET data sets (HIA, LogP, LogS, BBB, and two toxicological data sets causing cancer in rats and mice). We compare especially the relevance of vector based coding for molecules using descriptors and fingerprints and a coordinate-free coding working directly on the molecular structures avoiding a temporary abstract vector representation. We see that the vector coding can be used for large data sets by loosing accuracy and the coordinate-free approach avoids the feature selection problem, but is only applicable for smaller data sets. Furthermore we discuss shortly the underlying space and time complexities. |
|
ADME/tox Informatics
2:00 PM-5:00 PM, Tuesday, 15 March 2005 Convention Center -- Room 33A, Oral
Division of Chemical Information |