Representation of chemical structures and reactions
Traditional small-molecule pharma compounds, statistically distributed compound structures in the cosmetic or chemical industry, and peptide sequences for diagnostics are examples of the diverse world of chemical structures. The requirements for chemical representation vary considerably between industry sectors and lead to individual solutions for each field. Besides consistent drawing rules that ensure the reproducibility of chemical depictions, the registration and retrieval systems for chemical compounds and reactions have to support the individual needs of each sector.
Most of the standard drawing tools (Accelrys/Draw, ChemDraw, Marvin) support the needs of small pharma molecules out of the box. However, differing opinions about, for example, the right way to draw functional groups or to handle stereochemistry make it necessary to have enterprise-wide drawing rules in place and to enforce them during registration (for more details see "Chemical Drawing Rules on Enterprise Level" and "Automated Structure Modifications and Normalizations").
In contrast, peptide, DNA, and RNA sequences are mainly described as text, because they use standard abbreviations for the standard amino acids and nucleobases and can be handled by methods like BLAST or FASTA. But as soon as amino acids or nucleobases are chemically modified, or linkers and protecting groups are introduced into the molecules, only structure-based DBMS provide unique registration and retrieval of these compound classes. HELM, Biochemfusion’s PLN, and Accelrys’ SCSR extension for the V3 molfile format address this area. (More details will follow ...)
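The difference between a plain text sequence and a monomer-based notation can be illustrated with a minimal sketch. It assumes the simple HELM form for a single linear peptide, in which monomers are separated by dots, multi-character monomer IDs are written in square brackets, and the string ends with four `$` separators for the (here empty) trailing sections. The function names and the monomer ID `meA` (assumed to denote N-methylalanine in the monomer library in use) are illustrative and not taken from any particular toolkit.

```python
from typing import Sequence

# One-letter codes of the 20 standard amino acids.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def is_plain_sequence(monomers: Sequence[str]) -> bool:
    """True if every monomer is a standard amino acid, i.e. the peptide
    can still be handled as a plain text sequence (BLAST/FASTA style)."""
    return all(m in STANDARD_AMINO_ACIDS for m in monomers)


def to_helm_monomer(monomer_id: str) -> str:
    """Single-character IDs are written bare; longer IDs go in brackets."""
    if not monomer_id:
        raise ValueError("empty monomer ID")
    return monomer_id if len(monomer_id) == 1 else f"[{monomer_id}]"


def peptide_to_helm(monomers: Sequence[str], polymer_id: str = "PEPTIDE1") -> str:
    """Build a simple HELM string for one linear peptide chain.

    Assumption: no inter-chain connections or annotations, so the
    trailing sections after the polymer are left empty.
    """
    if not monomers:
        raise ValueError("a peptide needs at least one monomer")
    body = ".".join(to_helm_monomer(m) for m in monomers)
    return f"{polymer_id}{{{body}}}$$$$"


if __name__ == "__main__":
    plain = list("AGC")
    modified = ["A", "meA", "C"]  # "meA" is a hypothetical library ID
    print(is_plain_sequence(plain), peptide_to_helm(plain))
    print(is_plain_sequence(modified), peptide_to_helm(modified))
```

The unmodified peptide can be expressed either way, while the modified one no longer fits the one-letter alphabet and needs a monomer library behind the bracketed ID to resolve to a unique chemical structure.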
The most challenging part is the structural representation of polymers and other compound classes that are not precisely defined but are described by statistical distributions (see "Sgroups – Abbreviations, Mixtures, Formulations, Polymers, Structures with Statistical Distribution and Other Special Cases" for more details). In this case the drawing, registration, and retrieval tools must be packaged so that a unique representation of the structures is guaranteed while end users still get a simple, user-friendly interface for registration and retrieval.
In this context StructurePendium has gathered experience in the following domains:
- Development of database-consistent company drawing rules
- Normalization of chemical structures and reactions using tools like Cheshire and Pipeline Pilot, with other tools under investigation
- Handling of biologics and biopolymers based on V2/V3 molfiles, the Accelrys SCSR format, HELM, or Biochemfusion’s PLN notation, and their storage in the related cartridge systems of Accelrys, Biochemfusion, and ChemAxon
- Special compound classes like polymers, mixtures, formulations, or compounds with structure sections described by statistical distributions, based on the V2/V3 molfile format
The application side of these services is mostly embedded in registration and retrieval systems or in data pipelining tools like Accelrys Pipeline Pilot and KNIME.