Publications
Most recent publications available on Google Scholar
Matthew A. McDonald, Brent A. Koscher, Richard B. Canty, Jason Zhang, Angelina Ning, and Klavs F. Jensen*
ACS Central Science, 11, 346-356 (2025)
Different experiments of differing fidelities are commonly used in the search for new drug molecules. In classic experimental funnels, libraries of molecules undergo sequential rounds of virtual, coarse, and refined experimental screenings, with each level balanced between the cost of experiments and the number of molecules screened. Bayesian optimization offers an alternative approach, using iterative experiments to locate optimal molecules with fewer experiments than large-scale screening, but without the ability to weigh the costs and benefits of different types of experiments. In this work, we combine the multifidelity approach of the experimental funnel with Bayesian optimization to search for drug molecules iteratively, taking full advantage of different types of experiments, their costs, and the quality of the data they produce. We first demonstrate the utility of the multifidelity Bayesian optimization (MF-BO) approach on a series of drug targets with data reported in ChEMBL, emphasizing what properties of the chemical search space result in substantial acceleration with MF-BO. Then we integrate the MF-BO experiment selection algorithm into an autonomous molecular discovery platform to illustrate the prospective search for new histone deacetylase inhibitors using docking scores, single-point percent inhibitions, and dose–response IC50 values as low-, medium-, and high-fidelity experiments. A chemical search space with appropriate diversity and fidelity correlation for use with MF-BO was constructed with a genetic generative algorithm. The MF-BO integrated platform then docked more than 3,500 molecules, automatically synthesized and screened more than 120 molecules for percent inhibition, and selected a handful of molecules for manual evaluation at the highest fidelity. Many of the molecules screened have never been reported in any capacity. At the end of the search, several new histone deacetylase inhibitors were found with submicromolar inhibition, free of problematic hydroxamate moieties that constrain the use of current inhibitors.
Matthew A. McDonald, Brent A. Koscher, Richard B. Canty, and Klavs F. Jensen*
Chemical Science, 15, 10092-10100 (2024)
Reaction optimization and characterization depend on reliable measures of reaction yield, often measured by high-performance liquid chromatography (HPLC). Peak areas in HPLC chromatograms are correlated to analyte concentrations by way of calibration standards, typically pure samples of known concentration. Preparing the pure material required for calibration runs can be tedious for low-yielding reactions and technically challenging at small reaction scales. Herein, we present a method to quantify the yield of reactions by HPLC without needing to isolate the product(s), combining a machine learning model for molar extinction coefficient estimation with both UV-vis absorption and mass spectra. We demonstrate the method for a variety of reactions important in medicinal and process chemistry, including amide couplings, palladium-catalyzed cross-couplings, nucleophilic aromatic substitutions, aminations, and heterocycle syntheses. The reactions were all performed using an automated synthesis and isolation platform. Calibration-free methods such as the presented approach are necessary for such automated platforms to be able to discover, characterize, and optimize reactions automatically.
Brent A. Koscher†, Richard B. Canty†, Matthew A. McDonald†, Kevin P. Greenman, Charles J. McGill, Camille L. Bilodeau, Wengong Jin, Haoyang Wu, Florence H. Vermeire, Brooke Jin, Travis Hart, Timothy Kulesza, Shih-Cheng Li, Tommi S. Jaakkola, Regina Barzilay, Rafael Gómez-Bombarelli, William H. Green, Klavs F. Jensen*
† These authors contributed equally to this work
Science, Vol 382, Issue 6677 (2023)
A closed-loop, autonomous molecular discovery platform driven by integrated machine learning tools was developed to accelerate the design of molecules with desired properties. We demonstrated two case studies on dye-like molecules, targeting absorption wavelength, lipophilicity, and photooxidative stability. In the first study, the platform experimentally realized 294 unreported molecules across three automatic iterations of molecular design-make-test-analyze cycles while exploring the structure-function space of four rarely reported scaffolds. In each iteration, the property prediction models that guided exploration learned the structure-property space of diverse scaffold derivatives, which were realized with multistep syntheses and a variety of reactions. The second study exploited property models trained on the explored chemical space and previously reported molecules to discover nine top-performing molecules within a lightly explored structure-property space.
Christian P. Haas, Maximilian Lübbesmeyer, Edward H. Jin, Matthew A. McDonald, Brent A. Koscher, Nicolas Guimond, Laura Di Rocco, Henning Kayser, Samuel Leweke, Sebastian Niedenführ, Rachel Nicholls, Emily Greeves, David M. Barber, Julius Hillenbrand,* Giulio Volpin,* and Klavs F. Jensen
ACS Central Science, 9, 307−317 (2023)
Automation and digitalization solutions in the field of small molecule synthesis face new challenges for chemical reaction analysis, especially in the field of high-performance liquid chromatography (HPLC). Chromatographic data remains locked in vendors’ hardware and software components, limiting their potential in automated workflows and data science applications. In this work, we present an open-source Python project called MOCCA for the analysis of HPLC−DAD (photodiode array detector) raw data. MOCCA provides a comprehensive set of data analysis features, including an automated peak deconvolution routine of known signals, even if overlapped with signals of unexpected impurities or side products. We highlight the broad applicability of MOCCA in four studies: (i) a simulation study to validate MOCCA’s data analysis features; (ii) a reaction kinetics study on a Knoevenagel condensation reaction demonstrating MOCCA’s peak deconvolution feature; (iii) a closed-loop optimization study for the alkylation of 2-pyridone without human control during data analysis; (iv) a well plate screening of categorical reaction parameters for a novel palladium-catalyzed cyanation of aryl halides employing O-protected cyanohydrins. By publishing MOCCA as a Python package with this work, we envision an open-source community project for chromatographic data analysis with the potential of further advancing its scope and capabilities.
Matthew A. McDonald, Hossein Salami, Patrick R. Harris, Colton E. Lagerman, Xiaochuan Yang, Andreas S. Bommarius, Martha A. Grover, and Ronald W. Rousseau*
Reaction Chemistry & Engineering, 6, 364-400 (2021)
Reactive crystallization is not new, but there has been recent growth in its use as a means of improving performance and sustainability of industrial processes. This review examines phenomena and processes in which reaction and crystallization are coupled in the production of a desired chemical species. Coverage includes fundamental phenomena, such as solubility, supersaturation, crystal nucleation and growth, and chemical kinetics. Systems examined are divided into two groups, those best described as undergoing ionic reactions (including neutralizations), which have near instantaneous rates and result in the formation of ionic bonds, and those undergoing covalent reactions in which the key step occurs at measurable rates and results in the formation of covalent bonds. Discussion of the latter category also includes the impact of catalysis. Examples of a variety of reactions and applications are enumerated, and special attention is given to the utility of reactive crystallization in chiral resolution. Integration of reactive crystallization into process design, including both batch and continuous operations, and the development and efficacy of modeling, monitoring and control are reviewed. Finally, a perspective addressing needs to advance the usefulness and applications of reactive crystallization is included.
For more, visit Google Scholar