Data Science and Cocrystallization

To enable engineering of bespoke crystals, combining the best of molecular and self-assembled properties, powered by data science

With cocrystallization a chemical engineer can decouple the molecular properties of their target molecule from the emergent properties of the solid form. So far, the technology has primarily been used to improve the aqueous solubility of pharmaceutical compounds, but there are many other properties that derive from crystal structure that one can imagine engineering. Creating crystals of a desired shape to facilitate manufacturing, decreasing hygroscopicity to prolong shelf life, sequestering soluble compounds from solutions, and tuning the electronic properties of thin film assemblies are all possible with control over cocrystallization.

What is a cocrystal?

Types of crystals, the FDA defines cocrystals as any multi-molecular crystal

Machine learning and other data science techniques could revolutionize cocrystal discovery. The Cambridge Structural Database (CSD) is full of manually annotated and uniform data. A significant challenge is the imbalance in number of coformers represented in the CSD; fewer than 25 coformers account for more than 50% of the 10,000 deposited cocrystal structures. My group will use clever sampling techniques, supplemented with additional experimental data, to overcome this challenge.