Metasite

Site of metabolism predictions.

Figure 2.117. Molecular Discovery
Figure 2.119. Molecular Discovery

2.26 Molecular Networks

I. Molecular Networks GmbH: (http://www.mol-net.com)

II. Product Summaries:

a. 2DCOOR coordinates generator for publishing quality 2D depictions.

b. ADRIANA provides algorithms for the search, identification, and optimization of hits and lead structures.

c. ADRIANA.Code calculation of molecular physicochemical properties, autocorrelation of 2D and 3D interatomic distance distributions, RDF of 3D interatomic distances and autocorrelation of distances between surface points.

d. BioPath database that provides access to biological transformations and regulations as described on the Roche Applied Science "Biochemical Pathways" wall chart.

e. [email protected] data warehousing for 2D structures, multiple 3D conformations, and experimental information.

f. CHECK structure integrity check and normalization of chemical state.

g. CONVERT inter-conversion of 40 different chemical file formats.

h. CORINA 3D structure generation.

i. CORINA.direct graphical user interface for CORINA including a molecule editor and a 3D structure viewer.

j. CORINA_F CORINA interfaced to FlexX docking program.

k. IMAGE conversion of chemical files into images.

l. PAGE conversion of chemical files into formatted documents.

m. ROTATE generation of ensemble of conformations.

n. SONNIA self-organizing neural network package.

o. SPLIT/JOIN&MERGE splitting and concatenating of a chemical file or merging with external data files.

p. STERGEN enumeration of stereoisomers.

q. TABLE conversion of chemical files into spreadsheet file formats.

r. TAUTOMER enumeration of tautomers.

s. WODCA synthesis design by retro-synthetic analysis.

III. Key capabilities and offerings:

a. Warehousing Structures and Data: [email protected] is a chemical warehouse system designed to store 2D structures, multiple 3D conformations of chemical compounds as well as chemical reactions along with related (e.g., experimental or computed) data. The system is implemented as a client-server application. The web-based user interface provides access to the features of the structure search engine for the retrieval of chemical compounds and their related data. This engine can perform structure and sub-structure search, similarity search, and transformation search. When [email protected] is linked to the Commercially Available Compound database the system generates compound purchase orders. There are a variety of methods for Structure, Reaction and Data Retrieval:

i. String Search

ii. Property Search

iii. Full Structure and Sub-Structure Search

iv. 3D Pharmacophore-Type Search

v. Reaction Center Search

[Reitz M, Sacher O, Tarkhov A, Trümbach D, Gasteiger J, (2004) Enabling the exploration of biochemical pathways. Org. Biomol. Chem. 2, 3226-3237.]

b. Generating 2D Coordinates: 2DCOOR generates high-quality 2D depictions (atomic coordinates) of chemical compounds. 2D depictions are 2-dimensional representations of chemical compounds similar to the ones a chemist would sketch. 2DCOOR is able to align the layout of images according to a given substructure template.

c. Generating 3D Coordinates: CORINA generates 3-dimensional molecular models from information on atom types and atom connectivity only. CORINA is used routinely for conversion of large datasets. This application is currently used by MDL, NIH/NCI and all major pharmaceutical companies to convert their 2D structures into 3D. Although often represented in 2D (2D depictions) by chemists, the molecular structure of a compound is three-dimensional (3D). This 3-dimensional structure is closely associated with the chemical, physical, and biological properties of chemical compounds. 3D structures are starting materials for 3D QSAR, ligand-protein docking and drug design studies. CORINA can be used in batch mode or interactively by applying the graphical user interface CORINA.direct. CORINA_F is a feature-restricted version of CORINA that has been interfaced to FlexX, the flexible docking program distributed by BioSolveIT GmbH. [Sadowski J, Gasteiger J, (1993) From Atoms and Bonds to Three-dimensional Atomic Coordinates: Automatic Model Builders. Chem. Rev. 93: 2567-2581.]

d. Controlling Structural State & Integrity: CHECK performs high-throughput structure integrity checks and can be used to normalize the state of each compound by applying certain business rules. The integrity check covers the atomic valence, hybridization state, and ionization state of a molecule. The state of each structure can be controlled by specifying its ionization state, by removing salts and solvents, or by adding missing hydrogen atoms. Furthermore, duplicate structures in large structure files can be detected and removed.

e. Enumerating Stereoisomers & Tautomers: STERGEN automatically identifies stereocenters (tetrahedral centers and cis/trans double bonds) and enumerates all possible combinations of stereoisomers. This application can be used to process large datasets of chemical structures. TAUTOMER enumerates all tautomeric forms of a compound. TAUTOMER can be restrained to generate only one tautomer, the proposed form being assumed to be one of the most prevalent in solution.
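The combinatorial step described for STERGEN can be illustrated with a minimal Python sketch. It is not STERGEN's implementation: the center labels and the dictionary input are hypothetical, the stereocenters are assumed to be identified already, and symmetry-equivalent results (for example meso forms) are not removed.

```python
from itertools import product

def enumerate_stereoisomers(centers):
    """Return every assignment of one configuration to each stereocenter.

    `centers` maps a (hypothetical) center label to its allowed
    configurations, e.g. ("R", "S") for a tetrahedral center or
    ("E", "Z") for a cis/trans double bond.
    """
    labels = sorted(centers)
    choices = [centers[label] for label in labels]
    # Cartesian product: one pick per center, all combinations.
    return [dict(zip(labels, combo)) for combo in product(*choices)]

# Made-up molecule with two tetrahedral centers and one double bond.
centers = {"C2": ("R", "S"), "C5": ("R", "S"), "C7=C8": ("E", "Z")}
isomers = enumerate_stereoisomers(centers)
print(len(isomers))  # 2 * 2 * 2 = 8 combinations
```

The count doubles with every additional stereocenter, which is why a tool of this kind needs to handle large outputs when run over whole datasets.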

f. Exploring Conformational Space: ROTATE automatically generates conformational ensembles from a starting 3D structure, such as one obtained with CORINA, including conformers that come close to biologically active ones. ROTATE can perform a user-defined and balanced sampling of the conformational space in order to obtain a set of diverse conformations. [Schwab, C.H. (2003) Conformational analysis and searching. In: Handbook of chemoinformatics - from data to knowledge. Gasteiger J (ed.) Weinheim D, Wiley-VCH. pp. 262-301.]

g. Computing Descriptors: ADRIANA.Code calculates a series of molecular descriptors that can be applied in the area of in silico discovery and optimization of new chemical entities. The descriptors encode physicochemical, topological, geometrical, and surface properties of molecules [Gasteiger J (2003) Physicochemical effects in the representation of molecular structures for drug designing. Mini Rev Med Chem 3: 789-796.]. The following descriptors can be calculated:

i. Global molecular descriptors including number of H bond acceptors and H bond donors, TPSA, molecular weight, dipole moment, log P, log S and molecular polarizability

ii. Autocorrelation of 2D and 3D interatomic distance distributions weighted by partial charges, electronegativities, and polarizabilities [Gasteiger J, Teckentrup A, Terfloth L, Spycher S (2003) Neural networks as data mining tools in drug design. J Phys Org Chem 16: 232-245.]

iii. Radial Distribution Functions of 3D interatomic distances weighted by partial charges, electronegativities, and polarizabilities [Terfloth L, Gasteiger J (2003) Electronic screening: lead finding from database mining. In: The practice of medicinal chemistry, 2nd Edition, Wermuth CG (ed.) Amsterdam, NL, 2003, pp. 131-145.]

iv. Autocorrelation Functions of distances between surface points weighted by molecular electrostatic potential, hydrogen bonding potential, and hydrophobicity potential [Teckentrup A, Briem H, Gasteiger J (2004) Mining high-throughput screening data of combinatorial libraries: development of a filter to distinguish hits from nonhits. J. Chem. Inf. Comput. Sci. 44: 626-634.]
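As an illustration of the 2D autocorrelation descriptors in item ii, the sketch below uses the common topological form A(d) = sum of p_i * p_j over all atom pairs separated by d bonds, with A(0) as the sum of squared property values. This is an assumption about the general technique, not a description of how ADRIANA.Code computes it, and the atom property values are made up rather than real partial charges.

```python
from collections import deque

def topological_distances(adjacency, start):
    """Breadth-first search: number of bonds from `start` to each reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist

def autocorrelation_2d(adjacency, prop, max_distance):
    """Return [A(0), A(1), ..., A(max_distance)] for the atom property `prop`."""
    atoms = sorted(adjacency)
    vector = [0.0] * (max_distance + 1)
    vector[0] = sum(prop[a] ** 2 for a in atoms)
    for i in atoms:
        dist = topological_distances(adjacency, i)
        for j in atoms:
            # Count each pair once (j > i) and bin it by bond distance.
            if j > i and j in dist and 1 <= dist[j] <= max_distance:
                vector[dist[j]] += prop[i] * prop[j]
    return vector

# Hypothetical four-atom chain 0-1-2-3 with made-up property values.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
prop = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
print(autocorrelation_2d(adjacency, prop, 3))  # [30.0, 20.0, 11.0, 4.0]
```

The result is a fixed-length vector regardless of molecule size, which is what makes descriptors of this kind convenient inputs for statistical and neural network models.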

h. Analyzing and Modeling: ADRIANA (Automated Drug Research by Interactive Application of Non-linear Algorithms) bundles the two software packages ADRIANA.Code and SONNIA. SONNIA is a self-organizing neural network package including both unsupervised (Kohonen) and supervised (counter-propagation network) learning techniques. SONNIA has a graphical user-interface for the visualization of chemical structures, reactions, and spectra. Statistical or machine learning methods are widely used to establish relationships between biological activities, physical or chemical properties of a compound and its chemical structure. These methods, in combination with structure descriptors, are used to derive models that can be applied to predict properties of new compounds. [Anzali S, Gasteiger J, Holzgrabe U, Polanski J, Sadowski J, Teckentrup A, Wagener M (1998) The use of self-organizing neural networks in drug design. In: Kubinyi H, Folkers G, Martin YC (eds) 3D QSAR in Drug Design - Volume 2, Kluwer/ESCOM, Dordrecht, NL, pp. 273-299.]
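The unsupervised (Kohonen) learning mentioned for SONNIA can be sketched in a few lines of Python. This is a generic, minimal self-organizing map on a one-dimensional chain of neurons, not SONNIA's algorithm or interface; the function names, the linear decay schedule, and the two-dimensional toy data are all assumptions made for illustration.

```python
import random

def winner(weights, vector):
    """Return the index of the neuron whose weights are closest to `vector`."""
    def squared_distance(w):
        return sum((wi - vi) ** 2 for wi, vi in zip(w, vector))
    return min(range(len(weights)), key=lambda k: squared_distance(weights[k]))

def train_som(data, n_neurons=4, epochs=40, rate=0.5, seed=0):
    """Train a chain of neurons; returns one weight vector per neuron."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_neurons)]
    for epoch in range(epochs):
        # Learning rate and neighborhood radius both shrink over time.
        frac = 1.0 - epoch / epochs
        radius = int(frac * n_neurons / 2)
        for vector in data:
            best = winner(weights, vector)
            for k in range(n_neurons):
                if abs(k - best) <= radius:
                    # Pull the winner and its chain neighbors toward the input.
                    weights[k] = [w + rate * frac * (v - w)
                                  for w, v in zip(weights[k], vector)]
    return weights

# Made-up descriptor vectors forming two loose clusters.
data = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
weights = train_som(data)
print([winner(weights, v) for v in data])
```

Each input is finally assigned to its winning neuron, so similar descriptor vectors tend to land on the same or neighboring neurons; a production package would use a two-dimensional map and many more training vectors.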

i. Warehousing Reactions: [email protected] is a molecule-oriented data warehousing system for the storage of chemical reactions, 2D structures, and multiple 3D conformations of chemical compounds. [email protected] is a client-server application that includes a structure search engine for the retrieval of chemical reactions and their related data. This engine can perform reaction searches, structure and sub-structure searches, similarity searches and precursor searches.

j. Designing Synthesis: WODCA provides synthesis design solutions for organic compounds. This application relies on a retrosynthetic approach to provide acceptable synthetic pathways. By identifying strategic bonds in the product, WODCA suggests suitable precursors. Retrieval in reaction databases shows the viability of retrosynthetic steps and provides experimental conditions. This procedure can be applied recursively until commercially available starting materials are identified. [Gasteiger J (2003) The prediction of chemical reactions. In: Gasteiger J, Engel T (eds), Chemoinformatics - A Textbook, Wiley-VCH, Weinheim, pp. 542-567. Sitzmann M, Pförtner M (2003) Computer-assisted synthesis design. In: Gasteiger J, Engel T (eds), Chemoinformatics - A Textbook, Wiley-VCH, Weinheim, pp. 567-593.]
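The recursive procedure described for WODCA, in which precursors are proposed until commercially available starting materials are reached, can be sketched as follows. This is a toy illustration only: the compound names, the `disconnections` lookup table (standing in for strategic-bond analysis and reaction database retrieval), and the depth limit are hypothetical and do not reflect WODCA's actual interface.

```python
def plan_synthesis(target, disconnections, available, depth=0, max_depth=5):
    """Expand `target` into precursors until every leaf is purchasable.

    Returns a nested dict {compound: [sub-plans]}, or None if no route
    is found within `max_depth` retrosynthetic steps.
    """
    if target in available:
        return {target: []}  # commercially available: stop here
    if depth >= max_depth:
        return None
    # Try each proposed set of precursors in turn.
    for precursors in disconnections.get(target, []):
        subplans = [
            plan_synthesis(p, disconnections, available, depth + 1, max_depth)
            for p in precursors
        ]
        if all(plan is not None for plan in subplans):
            return {target: subplans}
    return None

# Hypothetical retrosynthetic steps and catalog of purchasable compounds.
disconnections = {
    "product": [("intermediate_A", "reagent_B")],
    "intermediate_A": [("reagent_C", "reagent_D")],
}
available = {"reagent_B", "reagent_C", "reagent_D"}
route = plan_synthesis("product", disconnections, available)
print(route)
```

The sketch returns the first complete route it finds; ranking alternative routes by viability would need the reaction database evidence the text describes.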

k. Converting and Manipulating files: CONVERT enables the inter-conversion of 40 different structure and reaction file formats. This application automatically detects the format of the input file and converts it into the format specified by the user. TABLE converts files containing structures and data into spreadsheet and internet compatible file formats. Formats supported include Excel, dBASEIII and HTML. During that process, structures are converted to WMF (Excel) or GIF images. SPLIT/JOIN&MERGE splits a structure file including n structures into n separate files, concatenates a series of n structure files into one single file, or merges separate structure files and data files into a single SDFile.

l. Drawing and printing: IMAGE is designed to convert structure files into raster (GIF, PNG, BMP) or vector (WMF, EMF, EPS) images. These file formats can be used to embed 2D depictions or flattened 3D structures into documents. They can also be imported into digital imaging software for graphic design. PAGE converts chemical files into formatted documents. The application provides many parameters to control the page layout. The resulting PostScript file can be printed or converted into a PDF document.

m. Databases:

i. [email protected] database includes about 1 million compounds from diverse commercial providers. These structures are stored with physicochemical properties such as molecular weight, predicted log P, solubility and Lipinski's rule of 5. Structure, substructure and similarity searches can be performed along with any regular data and string search.

ii. The Biochemical Pathways database (BioPath) contains data derived from the Roche Applied Science "Biochemical Pathways" wall chart. BioPath provides access to biological transformations and regulations as described on the "Biochemical Pathways" chart. [Reitz M, Sacher O, Tarkhov A, Trümbach D, Gasteiger J (2004) Enabling the exploration of biochemical pathways. Org Biomol Chem 2, 3226-3237.]
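The rule of 5 stored with each compound in item i can be evaluated from four properties. The sketch below uses the standard published thresholds (molecular weight at most 500, log P at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors); the function name and the input values are made up for illustration and say nothing about how the database itself computes or stores the flag.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Return the list of rule-of-5 criteria that a compound violates."""
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500")
    if log_p > 5:
        violations.append("log P > 5")
    if h_donors > 5:
        violations.append("H bond donors > 5")
    if h_acceptors > 10:
        violations.append("H bond acceptors > 10")
    return violations

# Two hypothetical compounds with invented property values.
print(lipinski_violations(320.4, 2.1, 2, 5))   # []
print(lipinski_violations(612.7, 6.3, 3, 11))
```

Returning the violated criteria rather than a single yes/no makes it easy to apply the common relaxation that tolerates one violation.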

IV. Review:

Molecular Networks products grew largely out of Professor Gasteiger's work on a range of chemoinformatics technologies for representing molecules and chemical reactions, as described above. Many of these programs have been widely used. Of particular note is the CORINA program, which is broadly used to generate 3D structures from 2D representations and is integrated into other vendors' products. Molecular Networks collaborates with MDL on databases that feature CORINA-generated models to provide researchers with a more complete and realistic set of searchable models in an industry-standard format. Molecular Networks collaborates with SciTegic to integrate CORINA and other programs into Pipeline Pilot data processing protocols. Molecular Networks also collaborates with Biomax to join bioinformatics and chemoinformatics into one stream of research, with BioSolveIT to integrate collections of virtual screening tools to facilitate in silico synthesis of chemical fragments optimized towards certain target-dependent properties (see CORINA_F), and with Inte:Ligand to integrate CORINA into i:lib diverse for virtual compound library generation.

Figure 2.123. Molecular Networks

2.27 OpenEye Scientific Software

I. OpenEye Scientific Software: (http://www.eyesopen.com/)

II. Product Summaries:

a. General: OpenEye Scientific Software, Inc. develops large-scale molecular modeling applications and toolkits. Primarily geared towards drug discovery and design, areas of application include structure generation, docking, shape comparison, charge/electrostatics, chemical informatics and visualization. OpenEye makes much of its technology available as toolkits - programming libraries suitable for custom development.

b. OpenEye offers the following application software:

i. EON - chemical similarity analysis via comparison of electrostatics overlay.

ii. FILTER - molecular screening and selection based on physical property or functional group.

iii. FRED - fast, systematic docking search for ligand binding within a protein active site.

iv. OMEGA - systematic high-throughput conformer generation, including 1D or 2D to 3D structure generation.

v. QUACPAC - quality charge states and charges for small molecules and proteins.

vi. ROCS - chemical similarity analysis via rapid 3D molecular shape searches.

vii. SMACK - molecular databases query converter and optimizer (SMARTS and MDL).

viii. SZYBKI - fast structure optimization of ligands in gas-phase, solution, or within a protein active site.

ix. VIDA - graphical user interface that visualizes, analyzes and manages corporate collections of molecular structures and information.

x. WABE - electrostatics optimization of a lead compound.

c. OpenEye offers the following toolkits as programming libraries providing other applications with object-oriented accessibility to a given set of capabilities:

i. Case - generalized function optimization, e.g. molecular structure optimization.

ii. LexiChem - state-of-the-art compound name and structure interconversion.

iii. OEChem - chemoinformatics and 3D molecular data handling.

1. SCUT Monkeys - all the apps included with OEChem.

iv. Ogham - elegant 2D structure rendering of compounds.

v. Shape - molecular shape comparisons based on 3D overlays.

vi. Zap - an efficient Poisson-Boltzmann electrostatics solver.

III. Key capabilities and offerings:

a. OMEGA is a high-throughput structure generation tool designed to handle large databases and combinatorial libraries valuable to computer-aided drug design. Exhaustive conformational expansion of drug-like molecules can be performed in fractions of a second using a systematic algorithm to ensure reproducibility, yielding a throughput of hundreds of thousands of compounds per processor per day. OMEGA accepts a wide variety of input file formats, including 1D connection tables. Conformers can be stored in a number of formats, including an ultra-compact one averaging 20 bytes per structure. The output ensembles are designed to include the bioactive conformer and can be minimized against MMFF and solvent forces. OMEGA provides a natural entry point to OpenEye structural analysis software and is available for a variety of operating systems and hardware. Over a dozen features are described on the website.

b. ROCS is a shape comparison program, based on the idea that molecules have similar shape if their volumes overlay well. Any volume mismatch is a measure of dissimilarity. Implementation of the global search required to find the best overlay is difficult with hard-sphere representations. Instead ROCS uses a Gaussian representation of the molecular volume. Because the volume function is smooth, it is possible to routinely minimize to the best global match from a few starting arrangements. ROCS is capable of processing 600-800 queries each second. At this speed it is possible to search multi-conformer representations of corporate collections in a day on a single processor. ROCS is also able to include chemical knowledge by the addition of a SMARTS-defined chemical force field. The force field can be discrete or Gaussian. Discrete scores the overlays based on proximity of SMARTS patterns, while Gaussian weights such scores based on distance. Furthermore, because the Gaussian force field is differentiable, it can be included in the minimization to arrive at a different overlay than would be found from shape alone. Nine features are described on the website.
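The Gaussian volume idea behind ROCS can be illustrated with a short Python sketch. It represents each atom as a spherical Gaussian, uses the closed-form overlap integral of two such Gaussians, sums pairwise overlaps (a first-order approximation), and scores a fixed alignment with a shape Tanimoto. The amplitude `P`, the width `ALPHA`, and the coordinates are illustrative assumptions, and the sketch neither optimizes the overlay nor reproduces ROCS's actual implementation.

```python
import math

P = 2.7      # Gaussian amplitude (illustrative assumption)
ALPHA = 0.8  # Gaussian width in 1/Angstrom^2 (illustrative assumption)

def pair_overlap(a, b):
    """Overlap integral of two identical-width spherical Gaussians at a and b."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    # Closed form: p^2 * (pi / (2*alpha))^(3/2) * exp(-alpha * d^2 / 2)
    return P * P * (math.pi / (2 * ALPHA)) ** 1.5 * math.exp(-ALPHA * d2 / 2)

def volume_overlap(mol_a, mol_b):
    """First-order overlap volume: sum over all atom pairs."""
    return sum(pair_overlap(a, b) for a in mol_a for b in mol_b)

def shape_tanimoto(mol_a, mol_b):
    """Overlap divided by the combined volume, for the given alignment."""
    o_ab = volume_overlap(mol_a, mol_b)
    o_aa = volume_overlap(mol_a, mol_a)
    o_bb = volume_overlap(mol_b, mol_b)
    return o_ab / (o_aa + o_bb - o_ab)

# Made-up three-atom "molecules": identical shapes, one shifted by 1.5 Angstrom.
mol_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
mol_b = [(x, y + 1.5, z) for x, y, z in mol_a]
print(shape_tanimoto(mol_a, mol_a))  # 1.0 for a perfect overlay
print(shape_tanimoto(mol_a, mol_b))  # below 1.0 because the volumes mismatch
```

Because every term is a smooth function of the atomic coordinates, the overlap can be differentiated with respect to rotation and translation, which is what makes the gradient-based overlay optimization described above practical.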

c. OEChem is a programming library for chemistry and chemical informatics that is fast, stable and well documented. OEChem has many simple yet powerful functions, which handle the details of working with molecules. For routine tasks, OEChem offers clear and efficient scripting in Python; over 70 scripts are provided as examples for common tasks. For more advanced software development and enterprise solutions, OEChem offers a stable API in Python and C++. High-level functions provide simplicity while low-level functions provide flexibility. OEChem is available on many platforms and is the core chemistry toolkit for all OpenEye products. Fifteen key features of the OEChem Toolkit are described on the website.

d. FILTER is a molecular screening and selection tool that uses a combination of physical-property calculations and functional-group knowledge to assess compound collections. In selection mode, FILTER can be used to choose reagents appropriate for specific syntheses. In filter mode, it quickly removes compounds with undesirable elements, functional groups, or physical properties. FILTER is a command line utility that, like other OpenEye products, reads and writes numerous file formats. FILTER currently has the ability to select or reject compounds based on eleven criteria described on the website, including extensive definitions of undesirable functional groups.

e. ZAP: Poisson-Boltzmann (PB) is an efficient way to simulate electrostatics in a medium of varying dielectric, such as organic molecules (drugs, proteins) in water. It requires a molecular charge description and a designation of low (molecular) and high (solvent) dielectric regions. The ZAP toolkit provides facilities to produce a grid of PB electrostatic potentials and, from this, a long list of biologically interesting quantities. These include solvent transfer energies, binding energies, pKa shifts, solvent forces, electrostatic descriptors, surface potentials and effective dielectric constants. Unique to ZAP is a dielectric function based on atomic-centered Gaussian functions. ZAP avoids many pitfalls of discrete dielectric models and works well not only for small molecules but also proteins and macro-molecular ensembles. Eight key features of the ZAP toolkit are described on the website.

f. VIDA is a graphical interface designed from the ground up to browse, manage and manipulate large sets of molecular information. Built-in chemoinformatics (SMARTS-matching, SMILES parsing), an advanced list manager, spreadsheet, annotation and graphing capabilities make it possible to operate in real-time on corporate collections of a million structures. It supports all standard visualization paradigms for both small molecules and proteins, including 2D depictions, both hardware and software stereo, and many unique facilities including surface selection and manipulation. Many methods of physical property calculation are included, and a wide range of formats can be read and written. VIDA can also act as an interface to OpenEye's other modeling software for initial setup or analysis of generated data. The eighteen primary features of VIDA are described on the website.

g. FRED stands for Fast Rigid Exhaustive Docking. For every ligand, FRED exhaustively searches all possible poses within a protein active site, filtering for shape complementarity and pharmacophoric features before evaluating with several scoring functions (ChemScore, PLP, ScreenScore, ChemGauss, PBSA). FRED uses a systematic search algorithm, accurately predicting binding modes in a reproducible manner, unlike many other docking programs, which use stochastic methods. Despite being exhaustive, FRED is extremely fast, out-performing all competitive methods; FRED docks about a dozen ligand conformers per second per processor. Furthermore, FRED will perform constrained docking, wherein certain pharmacophoric features are guaranteed to be in specific regions of the active site, allowing scientists to take advantage of known structure-activity relationships. The twelve primary features of FRED are described on the website.

h. SHAPE is an object-oriented toolkit in C, modeled in form after the Daylight chemical toolkit. SHAPE makes accessible the functions and capabilities that underlie ROCS, OpenEye's shape comparison program, and facilitates the incorporation of molecular overlay into other applications. SHAPE encompasses the calculation of molecular descriptors for shape (steric multi-poles), the volume overlap between molecules, the spatial similarity of chemical groups (color force field) and the optimization of the latter two quantities. SHAPE utilizes a variety of methods, some tuned for performance, others for accuracy. Example applications are provided. SHAPE has been extended to allow operation on generic shape fields, e.g. grids, and can be applied to such diverse problems as comparing and aligning active sites and real-space fitting of electron density. Eight example uses of SHAPE are included on the website.

IV. Review:

OpenEye's modular software applications and toolkits allow developers to build their own tools. OpenEye software is widely used by developers for creating customized technologies. Instead of the tools that traditionally dominate the field, both in academia and in industry, OpenEye attempts to provide tools that vastly increase the scale of operation of computational chemistry in drug design. OpenEye sells software designed to be useful in drug discovery and drug optimization. Simple products for non-programmers and non-experts could widen the benefit to larger communities of scientists. Java interfaces to all toolkits are currently under development to specifically meet this need.

Figure 2.125. (eon_mdd) Two molecules from the MDDR with substantially different chemistry, but high shape and electrostatic similarity (Tshape > 0.75, relectrostatic > 0.3)
Figure 2.126. (fred_cdk2) The crystallographic structure of a CDK-2 ligand in its bound conformation (green) overlaid by the pose predicted by FRED
