How well can we score now and where do we go from here: Comprehensive evaluation of 13 scoring functions on 800 protein-ligand complexes and development of new scoring functions

COMP 103

Shaomeng Wang (1), Renxiao Wang (2), Xueliang Fang (2), Chao Yie Yang (2), and Yipin Lu (2). (1) The Department of Internal Medicine and the Department of Medicinal Chemistry, The University of Michigan, 1500 E. Med. Center Dr, Ann Arbor, MI 48109-0934, (2) Departments of Internal Medicine and Medicinal Chemistry, University of Michigan, 1500 E. Medical Center Dr, CCGC/3316, Ann Arbor, MI 48109
We have evaluated 13 popular scoring functions against 800 diverse protein-ligand complexes with known Ki or Kd values. Four scoring functions, namely X-Score, DrugScore, Sybyl::ChemScore, and Cerius2::PLP, gave better correlations between their scores and the experimentally determined binding constants of the 800 complexes than the other scoring functions evaluated. After outliers were removed from the correlation evaluation, these four scoring functions reproduced the binding constants of the entire test set with a standard deviation of 1.4 to 1.7 log units (corresponding to 1.9 to 2.3 kcal/mol in binding free energy at room temperature). To examine whether a scoring function generally works better when analyzing ligand molecules bound to the same target protein, we also re-evaluated the 13 scoring functions on three subsets of protein-ligand complexes extracted from our test set: HIV-1 protease complexes (82 entries), trypsin complexes (45 entries), and carbonic anhydrase II complexes (40 entries). For the HIV-1 protease complexes, the performance of almost all scoring functions was disappointing; for the trypsin complexes, a good number of scoring functions gave excellent results; and for the carbonic anhydrase II complexes, the performance of several scoring functions was acceptable.
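
The correspondence between log units and binding free energy quoted above follows from the standard relation ΔG = -RT ln K, so an error of one log10 unit in Ki or Kd equals RT ln(10) in free energy. The short sketch below reproduces that arithmetic. It assumes "room temperature" means 298.15 K, which the abstract does not state, and the function name `log_units_to_kcal` is purely illustrative.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def log_units_to_kcal(delta_log_k, temperature_k=298.15):
    """Convert a spread in log10(K) units to binding free energy in kcal/mol.

    Uses |dG| = RT * ln(10) * |d log10 K|. The default temperature of
    298.15 K is an assumption; the abstract only says "room temperature".
    """
    return math.log(10) * R_KCAL * temperature_k * delta_log_k


if __name__ == "__main__":
    # The two ends of the standard-deviation range quoted in the abstract.
    for sd in (1.4, 1.7):
        print(f"{sd} log units ~ {log_units_to_kcal(sd):.1f} kcal/mol")
```

Under the assumed temperature, one log unit is about 1.36 kcal/mol, which is consistent with the range of 1.9 to 2.3 kcal/mol quoted for 1.4 to 1.7 log units.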

We also wish to present new results from our recent efforts in the development of scoring functions. These efforts include the development of a large publicly accessible protein-ligand binding database (the PDBbind database) for protein-ligand complexes whose experimental 3D structures are available from the Protein Data Bank and whose experimental binding affinities have been published in the literature, and of new algorithms for the calculation of conformational entropy changes for both ligand molecules and proteins during the binding process.

 

Docking and Scoring
1:30 PM-4:50 PM, Tuesday, August 24, 2004, Pennsylvania Convention Center -- 109B, Oral

Division of Computers in Chemistry

The 228th ACS National Meeting, in Philadelphia, PA, August 22-26, 2004