Encoding and exchange of chemical information using substructural molecular fragments

COMP 48

Alexandre Varnek, varnek@chimie.u-strasbg.fr, Denis Fourches, and Vitaly P. Solov’ev. Laboratoire d’Infochimie, Louis Pasteur University, 4, rue B. Pascal, Strasbourg, 67000, France
In this presentation, we describe how chemical information can be encoded in substructural molecular fragments and then used for the “in silico” design of new compounds. The Substructural Molecular Fragments method represents a molecule by its fragments and calculates their contributions to a given property. Two classes of fragments are considered: “sequences” and “augmented atoms”. A sequence represents the shortest path between a pair of atoms; sequence lengths vary from 2 to 15 atoms. An augmented atom represents a selected atom together with its first coordination sphere. Both classes of fragments involve either atoms and bonds, atoms only, or bonds only. Once a given compound is split into its constituent fragments, any of its quantitative properties can be calculated from the fragment contributions using several linear and non-linear fitting equations. The best structure-property models are selected according to statistical criteria. Thus, the information concerning a given data set is stored in files containing the types of fragments, their contributions, and the corresponding fitting equations. In the framework of the ISIDA project (http://infochim.u-strasbg.fr/recherche/isida/index.php), we have developed a knowledge base, built on PostgreSQL, that stores structure-property models based on fragment descriptors. Once a model is loaded into the knowledge base, it immediately becomes available to all intranet users via a client application. The stored models can be used efficiently for virtual screening of large combinatorial libraries. Several examples will be given in which chemical information encoded in substructural molecular fragments is applied to the “in silico” design of new compounds possessing desirable chemical or biological activities.
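The fragmentation scheme described above can be illustrated with a minimal sketch. It is not the ISIDA implementation: the molecule format (an atom list plus a bond dictionary), the function names, the fragment label syntax, and the single linear equation (intercept plus the sum of contribution times fragment count) are all assumptions made for illustration. The sketch also keeps only one shortest path per atom pair, which may differ from how the actual method treats rings.

```python
from collections import Counter, deque

# Toy molecule (hypothetical input format): heavy atoms of ethanol, C-C-O.
ATOMS = ["C", "C", "O"]
BONDS = {(0, 1): "-", (1, 2): "-"}  # keys are (low index, high index)

def adjacency(n_atoms, bonds):
    """Build a neighbour list for each atom index."""
    adj = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj

def shortest_path(adj, start, goal):
    """Breadth-first search; returns one shortest path as atom indices."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    if goal not in prev:
        return None  # atoms lie in disconnected parts
    path = [goal]
    while prev[path[-1]] is not None:
        path.append(prev[path[-1]])
    return path[::-1]

def label(path, atoms, bonds, mode):
    """Write a path as text: 'AB' atoms and bonds, 'A' atoms only, 'B' bonds only."""
    parts = []
    for k, idx in enumerate(path):
        if k > 0 and mode in ("AB", "B"):
            parts.append(bonds[tuple(sorted((path[k - 1], idx)))])
        if mode in ("AB", "A"):
            parts.append(atoms[idx])
    return "".join(parts)

def sequences(atoms, bonds, mode="AB", min_len=2, max_len=15):
    """Count sequence fragments: shortest paths of min_len..max_len atoms."""
    adj = adjacency(len(atoms), bonds)
    counts = Counter()
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            path = shortest_path(adj, i, j)
            if path is None or not (min_len <= len(path) <= max_len):
                continue
            # A path and its reverse are the same fragment; keep one spelling.
            fwd = label(path, atoms, bonds, mode)
            rev = label(path[::-1], atoms, bonds, mode)
            counts[min(fwd, rev)] += 1
    return counts

def augmented_atoms(atoms, bonds):
    """Count augmented atoms: a centre atom plus its directly bonded neighbours."""
    adj = adjacency(len(atoms), bonds)
    counts = Counter()
    for i, nbrs in adj.items():
        shell = sorted(bonds[tuple(sorted((i, n)))] + atoms[n] for n in nbrs)
        counts[f"{atoms[i]}({','.join(shell)})"] += 1
    return counts

def predict(counts, contributions, intercept=0.0):
    """Assumed linear model: intercept + sum(contribution * fragment count)."""
    return intercept + sum(contributions.get(f, 0.0) * n for f, n in counts.items())

if __name__ == "__main__":
    seq = sequences(ATOMS, BONDS)
    print(dict(seq))
    print(dict(augmented_atoms(ATOMS, BONDS)))
    # Placeholder contributions, arbitrary values, not fitted to any data set.
    print(predict(seq, {"C-C": 0.1, "C-O": -0.2, "C-C-O": 0.3}, intercept=0.5))
```

In a real workflow the contributions would come from fitting the fragment counts of a training set to measured property values; the placeholder numbers here only show how a stored model (fragment types, contributions, and equation) could be applied to a new structure.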