Machine Learning approach to accelerate new drug development

A new Machine Learning model has been developed by researchers from the University of Cambridge that has the potential to accelerate the design process for new drugs.

The new Machine Learning approach combines automated experiments with AI to predict how chemicals will react with one another. The data-driven approach, inspired by genomics, is called the chemical reactome.

The results, validated on a dataset of more than 39,000 pharmaceutically relevant reactions, have been reported in the journal Nature Chemistry.

The University of Cambridge team was assisted by Pfizer.

Ways to predict molecule reactivity

Predicting the reactivity of molecules is vital for the discovery and manufacture of new pharmaceuticals.

However, this has historically been a trial-and-error process, and the reactions often fail.

To predict how molecules will react, chemists typically simulate electrons and atoms in simplified models. This process is computationally expensive and often inaccurate.

The new reactome approach

“The reactome could change the way we think about organic chemistry,” said Dr Emma King-Smith from Cambridge’s Cavendish Laboratory, the paper’s first author.

“A deeper understanding of the chemistry could enable us to make pharmaceuticals and so many other useful products much faster. But more fundamentally, the understanding we hope to generate will be beneficial to anyone who works with molecules.”

The reactome approach picks out relevant correlations between reactants, reagents, and the performance of the reaction from the data. It also points out gaps in the data itself.

The data is generated from high-throughput automated experiments.
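As a rough illustration of what relating reaction components to outcomes and flagging gaps can mean, here is a minimal Python sketch. It is not the researchers' model: the records, the names such as amine_A and catalyst_X, and the helper functions mean_yield_by and untested_pairs are all invented for this example, and a simple per-reagent average stands in for the far richer analysis described in the paper.

```python
from collections import defaultdict
from itertools import product

# Invented toy records, not real data: (reactant, reagent, yield in percent).
records = [
    ("amine_A", "catalyst_X", 82.0),
    ("amine_A", "catalyst_Y", 15.0),
    ("amine_B", "catalyst_X", 74.0),
    ("amine_C", "catalyst_Y", 22.0),
]


def mean_yield_by(records, index):
    """Average the yield over all records that share the component at `index`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[index]].append(record[2])
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}


def untested_pairs(records):
    """List every reactant/reagent combination that has no recorded experiment."""
    tested = {(reactant, reagent) for reactant, reagent, _ in records}
    reactants = sorted({reactant for reactant, _, _ in records})
    reagents = sorted({reagent for _, reagent, _ in records})
    return [pair for pair in product(reactants, reagents) if pair not in tested]


reagent_means = mean_yield_by(records, 1)  # how each reagent performs on average
gaps = untested_pairs(records)             # combinations missing from the data

print(reagent_means)
print(gaps)
```

In this toy table, one reagent clearly outperforms the other on average, and two reactant/reagent combinations have never been run. Those are the kinds of patterns and gaps that a data-driven approach is meant to surface at a much larger scale.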

“High throughput chemistry has been a game-changer, but we believed there was a way to uncover a deeper understanding of chemical reactions than what can be observed from the initial results of a high throughput experiment,” said King-Smith.

“Our approach uncovers the hidden relationships between reaction components and outcomes,” said Dr Alpha Lee, who led the research.

“The dataset we trained the model on is massive – it will help bring the process of chemical discovery from trial-and-error to the age of big data.”

Machine Learning approach to enable faster drug design

The team has also developed a Machine Learning approach that enables chemists to introduce precise transformations to pre-specified regions of a molecule. The work, published in a related paper in Nature Communications, is set to enable faster drug design.

The approach allows chemists to tweak complex molecules without having to make them from scratch.

Variations to the core of a molecule are important to medicine design, but the conventional way to make them is to rebuild the molecule from scratch.

Late-stage functionalisation reactions attempt to directly introduce chemical transformations to the core, avoiding the need to start from scratch.

However, it is challenging to make late-stage functionalisation selective and controlled. This is because there are many regions of the molecule that can react – making it difficult to predict the outcome.

“Late-stage functionalisations can yield unpredictable results and current methods of modelling, including our own expert intuition, isn’t perfect,” said King-Smith. “A more predictive model would give us the opportunity for better screening.”

The Machine Learning approach predicts where a molecule reacts

The researchers developed a Machine Learning model that predicts where a molecule would react. The approach also shows how the rate of reaction varies as a function of different reaction conditions. This helps chemists find ways to tweak the core of a molecule.

“We pretrained the model on a large body of spectroscopic data – effectively teaching the model general chemistry – before fine-tuning it to predict these intricate transformations,” said King-Smith.

The Machine Learning approach allowed the team to overcome the limitation of low data – there are few late-stage functionalisation reactions reported in the scientific literature.
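The pretrain-then-fine-tune idea described above can be sketched with a toy example in Python. This is an illustration under stated assumptions, not the team's model: a three-weight linear model stands in for their network, synthetic noise-free data stands in for the spectroscopic and reaction data, and the weights related_w and target_w, the dataset sizes, and the learning rates are all invented.

```python
import random


def make_data(weights, n, rng):
    """Sample n noise-free points (x, y) with y = weights . x."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in weights]
        y = sum(w * xi for w, xi in zip(weights, x))
        data.append((x, y))
    return data


def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def mse(w, data):
    """Mean squared error of weights w on a dataset."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)


def train(w, data, steps, lr):
    """Full-batch gradient descent on mean squared error, starting from w."""
    w = list(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for i, xi in enumerate(x):
                grad[i] += 2.0 * err * xi / len(data)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


rng = random.Random(0)
related_w = [1.0, -2.0, 0.5]  # hypothetical large, related task
target_w = [1.2, -1.8, 0.5]   # hypothetical small target task, similar but not the same

big = make_data(related_w, 500, rng)   # plentiful related data
small = make_data(target_w, 5, rng)    # scarce target data
test = make_data(target_w, 200, rng)   # held-out target data

zeros = [0.0, 0.0, 0.0]
pretrained = train(zeros, big, steps=200, lr=0.1)         # learn the related task first
fine_tuned = train(pretrained, small, steps=5, lr=0.05)   # then adapt on scarce data
from_scratch = train(zeros, small, steps=5, lr=0.05)      # baseline without pretraining

print("fine-tuned test error:  ", mse(fine_tuned, test))
print("from-scratch test error:", mse(from_scratch, test))
```

With only five target examples and a handful of training steps, the model that starts from pretrained weights begins close to the right answer, while the model trained from scratch does not. That is the intuition behind learning from a large, related dataset before tackling a low-data problem.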

The team validated the model on a diverse set of drug-like molecules and was able to predict the sites of reactivity under different conditions.

“The application of Machine Learning to chemistry is often throttled by the problem that the amount of data is small compared to the vastness of chemical space,” said Lee.

“Our approach – designing models that learn from large datasets that are similar but not the same as the problem we are trying to solve – resolves this fundamental low-data challenge and could unlock advances beyond late-stage functionalisation.”

The research was supported by Pfizer and the Royal Society.
