The more things change, the more they stay the same: Issues in data analysis from QSAR to HTS

CINF 73

David Rogers, drogers@scitegic.com, Workplace, 10188 Telesis Court, Suite 100, San Diego, CA 92121-4779
In the 20+ years between the development of the earliest quantitative structure-activity relationship (QSAR) methods and the current era of high-throughput screening (HTS) data, it would appear that much has changed: data sets have grown from dozens to millions of samples, novel analysis techniques have appeared, and new descriptors (including high-dimensional molecular fingerprints) have greatly increased the data content of each data point. However, throughout this transformation of the available data, the primary issues affecting the choice of analysis method remain the same: data dimensionality, algorithm scaling, and algorithm tuning. In this talk I will discuss a number of methods used in QSAR and HTS analysis and how the choice among them follows from these same underlying issues.