Erasmus Mundus Joint Master - ChEMoinformatics+ : Chemoinformatics: Past, Present, Future

by: Eliya Davidov, Track «Chemoinformatics and Materials Informatics», Bar Ilan-Strasbourg, 2023

In 1998, Frank Brown coined the term Chemoinformatics, defining it as "all the information resources that a scientist needs to optimize the properties of a ligand to become a drug." Yet, the foundations of chemoinformatics trace back much earlier, to the 1950s and 60s, when computational chemistry first began taking shape [1].

In 1957, Ray and Kirsch published the first algorithm for substructure searching. Their groundbreaking paper described "a collection of machines... capable of performing a complete data processing task involving data storage facilities." This laid the groundwork for structure, similarity, and substructure searching in databases, core concepts that would later become vital in chemoinformatics. In 1963, Vleduts proposed the concept of "skeleton reaction schemes" and reaction centers, suggesting the possibility of machine-aided synthesis: "the possibility of a machine solution... the selection of ways synthesizing a given compound" [2]. Another pivotal moment had come a year earlier, in 1962, when Hansch introduced QSAR (Quantitative Structure–Activity Relationships), which relate biological activity to chemical structure through molecular descriptors such as steric effects, electronic properties, and hydrophobicity.
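To make the idea of substructure searching concrete, here is a minimal sketch of a backtracking search over molecular graphs. It is an illustration only, not a reconstruction of Ray and Kirsch's original procedure. The helper names `make_molecule` and `find_substructure` are hypothetical, and the sketch assumes that atoms are matched by element symbol alone, with hydrogens omitted and bond orders ignored.

```python
def make_molecule(atoms, bonds):
    """Build a simple molecular graph (hypothetical helper).

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: iterable of (i, j) atom-index pairs; bond orders are ignored here.
    """
    adjacency = {i: set() for i in range(len(atoms))}
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return {"atoms": list(atoms), "adj": adjacency}


def find_substructure(query, target):
    """Return a dict mapping query atom index -> target atom index, or None."""
    q_atoms, q_adj = query["atoms"], query["adj"]
    t_atoms, t_adj = target["atoms"], target["adj"]

    def extend(mapping):
        # All query atoms placed: a full match has been found.
        if len(mapping) == len(q_atoms):
            return dict(mapping)
        q = len(mapping)  # query atoms are placed in index order
        used = set(mapping.values())
        for t in range(len(t_atoms)):
            if t in used or t_atoms[t] != q_atoms[q]:
                continue
            # Every bond from q to an already placed query atom
            # must also exist between the corresponding target atoms.
            if all(mapping[n] in t_adj[t] for n in q_adj[q] if n in mapping):
                mapping[q] = t
                result = extend(mapping)
                if result is not None:
                    return result
                del mapping[q]  # backtrack and try the next candidate
        return None

    return extend({})


# Ethanol as a heavy-atom graph: C-C-O
ethanol = make_molecule(["C", "C", "O"], [(0, 1), (1, 2)])
# Query fragments: a C-O bond (present) and a C-N bond (absent)
c_o = make_molecule(["C", "O"], [(0, 1)])
c_n = make_molecule(["C", "N"], [(0, 1)])

match = find_substructure(c_o, ethanol)
no_match = find_substructure(c_n, ethanol)
```

In this example the C-O query maps onto the second carbon and the oxygen of ethanol, while the C-N query finds no match. Production toolkits add bond orders, aromaticity, and fast pre-screening, which this sketch deliberately leaves out.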

In recent decades, with the rise of artificial intelligence, the scope of chemoinformatics has broadened. It now extends beyond ligand optimization to encompass "the application of informatics methods to solve chemical problems" [3]. Without being exhaustive, this includes predictive modeling for biological activity, drug discovery, ligand-based design, 3D molecular docking, protein-ligand interactions, virtual screening, simulations, and molecular dynamics (Figure 1). Although much of the field focuses on biology, chemoinformatics also plays a role in materials science, aiding in the design of batteries, energetic materials, and other physical systems.

What lies ahead for chemoinformatics? With AI, increasing computational power, and the surge of big data, the future promises new breakthroughs. AI is expected to push chemoinformatics into uncharted territories, such as drug discovery for rare diseases. Quantum computing could also be a major game changer in simulations and modeling, allowing new algorithmic approaches to hard problems such as graph isomorphism.

Figure 1. Chemoinformatics emerged as a field from the solutions found to data-related problems shared by many other scientific domains. Medicinal chemistry and drug discovery remain strong driving forces in chemoinformatics today.

References
1. P. Willett, Chemoinformatics: a history. WIREs Comput. Mol. Sci. 2011, 1, 46–56. https://doi.org/10.1002/wcms.1
2. G.E. Vleduts, Concerning one system of classification and codification of organic reactions. Inf. Stor. Ret. 1963, 1, 117–146. https://doi.org/10.1016/0020-0271(63)90013-5
3. J. Gasteiger, The central role of chemoinformatics. Chemometr. Intell. Lab. Syst. 2006, 82, 200–209. https://doi.org/10.1016/j.chemolab.2005.06.022