Since the 1990s, researchers in industry and academia have sought computational models to assist chemists with the complex task of designing “better” molecules to treat disease. In the ensuing decades, practitioners of de novo molecular design have learned many lessons on how best to apply these models to generate actionable molecules for active drug discovery programmes.
De novo molecular design is the art of designing molecules that optimally satisfy a desired objective. In our case, this objective is producing better drugs, which involves balancing a number of molecular properties and is therefore a multiobjective problem. We previously published GuacaMol, an open benchmark for measuring the aptitude of de novo design algorithms.
In our review, “De novo molecular design and generative models”, we categorise existing methods for de novo design under a new paradigm, based on how practical these methods are to use in real drug discovery projects. Namely, we classify generative chemistry algorithms by the coarseness of their molecular representations, whether atom-based, fragment-based or reaction-based, each of which has profound implications for the types of molecular design that can be achieved. We also distinguish between modern AI methods (gradient-based) and traditional chemoinformatics approaches (metaheuristic), and emphasize that while the older methods are out of vogue, they can offer competitive performance and practical advantages.
Beyond the choice of automated molecular design algorithm, we offer our perspective on the practical use of these algorithms. For example, some algorithms for de novo design allow the practitioner to “grow” from an initial starting molecule more easily than others. Algorithms that allow users to ask a broad range of questions have a clear practical advantage.
While there has been much focus on developing new algorithms for de novo design, we believe that the design of a suitable fitness objective remains the main challenge for most de novo design endeavours. Automated design algorithms are able to exploit loopholes in calculated scores, which can result in less useful outputs being generated. It is therefore sensible to take steps to close such loopholes, in addition to providing a number of competing objectives.
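To make the loophole problem concrete, here is a minimal sketch of how the choice of aggregation can matter when several objectives are combined into one fitness score. It is an illustration only, not the scoring scheme of any particular tool: the candidate names, the three properties and their scores are all hypothetical, and each score is assumed to be already normalised to the range 0 to 1.

```python
from math import prod

# Hypothetical candidates with made-up, pre-normalised scores in [0, 1].
# "loophole" maximises two objectives while completely failing the third.
candidates = {
    "balanced": {"potency": 0.6, "solubility": 0.6, "synthetic_accessibility": 0.6},
    "loophole": {"potency": 1.0, "solubility": 1.0, "synthetic_accessibility": 0.0},
}


def arithmetic_mean(scores):
    """Average of the scores; one failed objective can be offset by the others."""
    return sum(scores) / len(scores)


def geometric_mean(scores):
    """n-th root of the product; a zero in any objective gives an overall zero."""
    return prod(scores) ** (1.0 / len(scores))


def rank(candidates, aggregate):
    """Return candidate names sorted from best to worst under `aggregate`."""
    return sorted(
        candidates,
        key=lambda name: aggregate(list(candidates[name].values())),
        reverse=True,
    )


if __name__ == "__main__":
    for aggregate in (arithmetic_mean, geometric_mean):
        print(aggregate.__name__, rank(candidates, aggregate))
```

Under the arithmetic mean, the candidate that fails one objective outright still ranks first, which is exactly the kind of gap an optimiser will find and exploit. Under the geometric mean it scores zero and drops to the bottom. Neither aggregation removes loopholes inside an individual property score; that still requires checking each calculated score against chemical judgement.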
Medicinal chemists now have a variety of tools at their disposal that are proficient generators of sensible molecular structures. The challenge now is to evaluate whether our generators and optimization objectives are useful for the tasks at hand. De novo molecular design and generative chemistry models remain a controversial topic in the field, but we believe there is strong evidence to support adding atom-based, fragment-based and reaction-based de novo design tools to the medicinal chemistry toolbox.