QSAR modeling and knowledge discovery of a large unbalanced dataset of hERG K+ channel blockers and openers

COMP 259

Kun Wang, kunwang@email.unc.edu, Medicinal Chemistry, University of North Carolina at Chapel Hill, School of Pharmacy, CB# 7360, Beard Hall Rm 301, Chapel Hill, NC 27599
Alexander Golbraikh, golbraik@email.unc.edu, School of Pharmacy, University of North Carolina, CB# 7360, Beard Hall, Chapel Hill, NC 27599-7360
Bryan L. Roth, bryan_roth@med.unc.edu, National Institute of Mental Health Psychoactive Drug Screening Program and Department of Pharmacology, University of North Carolina at Chapel Hill, 8032 Burnett-Womack, CB# 7365, Chapel Hill, NC 27599
Alexander Tropsha, tropsha@email.unc.edu, Laboratory of Molecular Modeling, School of Pharmacy, The University of North Carolina at Chapel Hill, 301 Beard Hall, CB# 7360, UNC-CH, Chapel Hill, NC 27599

The human ether-a-go-go related gene (hERG) K+ channel can be both a target and an antitarget in drug discovery. It is important to screen out hERG channel blockers, which cause QT prolongation and fatal arrhythmia, or to tune out QT liability in a lead compound at an early stage, and to find openers as potential therapeutics for long QT syndrome (LQTS). For a diverse, imbalanced dataset of 1878 compounds (openers, blockers, and inactives) with class overlap, we combined the k-nearest-neighbor (kNN) QSAR classification algorithm with class boundary cleaning, class boundary mining, and active learning techniques. We then built models for (i) blockers vs. openers, (ii) blockers vs. inactives, and (iii) hits (openers and blockers) vs. inactives. Each model achieved prediction accuracy exceeding 90% on the training, test, and external validation sets, with false positive and false negative rates below 10%. Our results compare favorably with those generated using other algorithms for imbalanced datasets. The knowledge discovered should extend the application scope of hERG in drug development and regulatory settings.
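
As a rough illustration of the general idea of pairing kNN classification with class boundary cleaning on an imbalanced dataset, a minimal sketch follows. The abstract does not specify the cleaning procedure, the descriptors, the distance metric, or the value of k, so everything here is an assumption: the cleaning step is a Wilson-style nearest-neighbor editing rule used as a stand-in, the two-dimensional "descriptors" are made-up toy values, and the function names `knn_predict` and `clean_boundary` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3, skip_index=None):
    """Majority vote among the k nearest training points (Euclidean distance).

    skip_index lets a training point be classified by its neighbors
    without counting itself.
    """
    dists = [
        (math.dist(x, query), label)
        for i, (x, label) in enumerate(zip(train_X, train_y))
        if i != skip_index
    ]
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def clean_boundary(X, y, majority_label, k=3):
    """Drop majority-class points whose k nearest neighbors vote otherwise.

    Assumption: a Wilson-style editing rule standing in for the
    unspecified "class boundary cleaning" step; minority points are
    always kept so the smaller class is not eroded further.
    """
    keep = [
        i
        for i in range(len(X))
        if y[i] != majority_label
        or knn_predict(X, y, X[i], k, skip_index=i) == majority_label
    ]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (invented for illustration): seven "inactive" points, one of
# which sits inside the small "blocker" cluster (class overlap).
X = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.0, 0.4), (0.4, 0.0),
    (2.9, 3.0),                                # inactive among blockers
    (3.0, 3.0), (3.2, 3.1), (3.1, 2.8),        # blockers (minority class)
]
y = ["inactive"] * 7 + ["blocker"] * 3

X_clean, y_clean = clean_boundary(X, y, majority_label="inactive", k=3)
prediction = knn_predict(X_clean, y_clean, (3.0, 2.9), k=3)
```

In this toy setting the cleaning step removes only the majority-class point embedded in the minority cluster, which is the kind of overlap that tends to inflate false negative rates for the smaller class.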
 

Poster Session
6:00 PM-8:00 PM, Tuesday, August 18, 2009, Walter E. Washington Convention Center -- Ballroom A, Poster

Division of Computers in Chemistry

The 238th ACS National Meeting, Washington, DC, August 16-20, 2009