Realizing Prospective QSAR through data fusion and modern descriptors

COMP 174

Curt M. Breneman (1), brenec@rpi.edu; N Sukumar (2), nagams@rpi.edu; Mark J. Embrechts (3); Kristin P. Bennett (4), bennek@rpi.edu; C. Matthew Sundling (5), sundlm@rpi.edu; Mike Krein (1); and Theresa Hepburn (1). (1) Department of Chemistry / RECCR Center, Rensselaer Polytechnic Institute, 110 8th Street, Center for Biotechnology and Interdisciplinary Studies, Troy, NY 12180, (2) Department of Chemistry and Center for Biotechnology, Rensselaer Polytechnic Institute, Cogswell Laboratory, 110 8th Street, Troy, NY 12180-3590, (3) Department of Decision Sciences & Engineering Systems, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, (4) Department of Mathematics, Rensselaer Polytechnic Institute, Amos Eaton Building, 110 8th St, Troy, NY 12180, (5) Department of Chemistry and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180-3590
The evolution of "prospective" molecular property prediction methods that truly fulfill the promise of QSAR has been paced by the need for parallel development of information-rich molecular descriptors and modern multi-objective machine-learning schemes. Building multiple models that employ data fusion techniques and multiple endpoints extracts the maximum benefit from the relationship between the chemical information encoded within modern molecular descriptors and several channels of available experimental data. Examples of data fusion QSAR will be discussed, including means for determining the applicability domain of the resulting models.
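
As a purely illustrative sketch of the two ideas named above, and not the authors' method, the following Python fragment averages the predictions of several models (one simple form of late data fusion) and flags whether a query compound falls inside a distance-based applicability domain. The function names, the nearest-neighbor criterion, the threshold, and the toy numbers are all assumptions introduced here for illustration.

```python
from math import sqrt
from statistics import mean


def fuse_predictions(predictions):
    """Average per-compound predictions from several models (simple late fusion).

    `predictions` is a list of equal-length lists, one list per model.
    """
    return [mean(per_compound) for per_compound in zip(*predictions)]


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def in_domain(query, training_set, threshold):
    """Return True if the nearest training descriptor vector lies within `threshold`.

    This nearest-neighbor distance test is one assumed applicability-domain
    criterion among many; the threshold is a hypothetical tuning parameter.
    """
    return min(euclidean(query, t) for t in training_set) <= threshold


# Hypothetical predictions for three compounds from two models.
model_a = [1.0, 2.0, 3.0]
model_b = [2.0, 2.0, 5.0]
fused = fuse_predictions([model_a, model_b])

# Hypothetical two-dimensional descriptor vectors for the training compounds.
training = [(0.0, 0.0), (1.0, 1.0)]
near_query_ok = in_domain((0.5, 0.5), training, threshold=1.0)
far_query_ok = in_domain((5.0, 5.0), training, threshold=1.0)
```

In this toy setting the fused value for each compound is the mean of the two model outputs, and a prediction would only be trusted for a query whose descriptors lie close to at least one training compound.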