Introducing Generative Biology Part II: On the Brink of Transforming Drug Discovery


Our previous article examined how machine learning is well-suited to uncovering the governing principles of biology that are needed before truly novel, purpose-built molecules can be fully engineered.

In this article, we’ll take a look at the flaws in our current drug discovery paradigm and discuss how generative biology, driven by machine learning, is poised to transform the future of medicine.

Drug discovery today is essentially an expensive process of trial and error that has scarcely changed in a hundred years. Decades of experience have not allowed researchers to predict which molecules will serve which therapeutic purposes, and this process of guesswork has only become more inefficient and costly over time.

The advent of high-throughput technologies suggested that the twin problems of inefficiency and cost could be solved by the “brute-force” approach of randomly creating and screening more and more molecules. Yet this mindset has actually exacerbated the problem: since 2001, the cost of commercializing a new drug swelled from approximately $800 million to more than $2.5 billion. The steady decline in successful drug discovery has been creatively termed “Eroom’s law” – the inverse of Moore’s law, which predicts the doubling of computing power, and consequent improvements in costs and efficiencies, every two years.

Evolution and modern medicine rarely align

The conventional therapeutics discovery paradigm looks for molecules in close vicinity to those that nature has already evolved. This severely limits the space to search for therapeutic proteins because the goals of evolution and modern medicine are rarely the same. Indeed, the probability that nature has sampled any given functional protein sequence at any point in the history of life is far smaller than the share of Earth’s oceans taken up by a single drop of water. This puts an almost incomprehensible number of possible proteins, and their therapeutic potential, beyond the reach of conventional therapeutics discovery.

As a result, many of the most powerful medicines of the first half of the twentieth century, such as insulin, are direct copies of what nature has created. More recent medicines also tend to rely on naturally derived molecules that have been minimally tailored for therapeutic use, such as monoclonal antibodies (mAbs). The tailoring process expands the search space only minutely compared with the ocean of possibilities because drug discovery currently uses nature as the starting point and guide.

The same is true for technologies used to deliver medicines such as gene therapies. The promise of gene therapy wasn’t realized until researchers discovered natural biological machinery, in the form of viruses, that had evolved to deliver nucleic acids inside cells. Today, approved gene therapies are predominantly delivered with adeno-associated viruses (AAVs) or lentiviruses whose machinery is only slightly altered from their natural forms, rather than being explicitly engineered for their new therapeutic functions.

It is unrealistic to believe drug discovery can rely on nature’s medicine cabinet forever to serve humanity’s growing and diverse medical challenges. Biotech and pharmaceutical companies must develop new capabilities for generating never-before-seen therapies tailored to meet human needs. To do this, they have to find the principles that govern how a protein’s sequence determines its function, how viral machinery delivers nucleic acid cargo to cells, and a host of other biological “hows” that will enable direct engineering of the next generation of therapies – without the guesswork of the past.

Machine learning, coupled with investments in obtaining the right datasets, can extract the principles needed to engineer that next generation of medicines, including those that biologists have yet to imagine, through the new paradigm of generative biology.

Transforming the future of medicine with generative biology 

At Generate Biomedicines, we believe that the transition from conventional drug discovery to generative biology will constitute a sea change in life sciences. No longer will we be discovering suboptimal medicines that nature evolved for its own purposes; instead, we will be creating, or generating, purpose-built medicines. While this was once purely science fiction, the transformation that generative biology offers is now a realistic goal, thanks to the combination of decreased costs of high-throughput DNA sequencing and DNA synthesis, exponential increases in computing power, and the rapid advancement of machine learning algorithms.

By definition, bringing to life the vision of generative biology requires the categorical merger of scientific and technological expertise. Working at the intersection of biology, machine learning, and engineering, scientists from diverse disciplines need to collaborate in new ways, inventing and innovating together. At Generate, we are building cross-disciplinary teams of experts and taking on this mission. We are pioneering a future in which chance and unpredictability in drug discovery give way to intentionally and controllably generated biology and medicine.


About the authors

Andrew Beam, Ph.D., is an Assistant Professor at Harvard University and the Founding Head of Machine Learning at Generate Biomedicines

Molly Gibson, Ph.D., is Co-Founder and Chief Strategy and Innovation Officer at Generate Biomedicines

Gevorg Grigoryan, Ph.D., is Co-founder and Chief Technology Officer at Generate Biomedicines