AI goes viral

Using machine learning to accelerate gene therapy

By Han Le

Designs by Kristina Boyko

December 3, 2024

Did you know that over 10,000 human diseases are caused by an abnormality in a single gene? Even though one genetic defect may seem minor among the thousands of genes that make up a human, it can cause pain, deformity, and even death. Among modern medical advances, gene therapy offers a cure for genetic diseases. Scientists first identify the faulty gene causing the disease and then use a vector to deliver genetic material to repair it. In this process, viruses—masters at injecting foreign DNA into human cells—have become the go-to vehicle for delivering DNA cargo to target genes.

Yet skepticism about the safety of gene therapy has long simmered, especially after the story of Jesse Gelsinger, a patient who died days after being given an adenovirus to ferry DNA into his liver. Gelsinger suffered a massive immune response triggered by the virus, leading to multiple organ failure and brain death. Indeed, the success of any viral gene therapy relies on evading the body’s immune system. Introducing a gene into patient DNA, where it was not originally meant to be, can have unpredictable effects. Immune responses can hinder gene delivery and reduce efficacy. In some instances, the response can be severe, leading to rapid destruction of the modified cells. While some viral vectors, such as adeno-associated viruses (AAVs), are better tolerated, they often fail to specifically target diseased cells and can be neutralized by antibodies from prior exposures. Increasing the dosage to counteract these issues, however, can provoke the body’s immune system, raising the risk of an adverse reaction.

David Schaffer’s laboratory at UC Berkeley is enhancing the targeted delivery of AAV vectors into diseased tissues. Virus assembly, or “packaging,” involves three steps: first, the viral proteins fold into a specific structure, then they assemble into the capsid, and lastly the genetic material is packaged within the capsid. To boost the effectiveness of gene therapy, the key is to optimize the capsid, the virus’s outer protein shell that directly interacts with target cells. The capsid determines which cells are targeted, how efficiently the virus enters cells, and how likely the gene therapy is to cause an immune response. Using an approach that mimics natural evolution, termed “directed evolution,” the researchers create large libraries of new capsids by randomly modifying their traits and evaluating which changes improve the delivery of genetic material to tissues. The most promising variants are selected, amplified, and fed into the next round, and the cycle repeats until the desired traits are achieved.
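The mutate, screen, and select cycle of directed evolution can be illustrated with a toy simulation. This is only a minimal sketch, not the laboratory's protocol: the scoring function `toy_score`, the made-up `target` sequence, and every parameter value are assumptions invented for illustration, standing in for a real experimental screen.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def mutate(seq, rng, rate=0.1):
    """Randomly replace each position with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq
    )


def toy_score(seq, target="ACDEFGH"):
    """Hypothetical stand-in for a lab screen: positions matching a made-up target."""
    return sum(a == b for a, b in zip(seq, target))


def directed_evolution(seed_seq, rounds=5, library_size=200, keep=10, seed=0):
    """Repeat: build a mutant library, score it, keep the best as new parents."""
    rng = random.Random(seed)
    parents = [seed_seq]
    history = []  # best score after each round
    for _ in range(rounds):
        library = [mutate(rng.choice(parents), rng) for _ in range(library_size)]
        # Parents stay in the pool, so the best score can never drop.
        pool = list(dict.fromkeys(library + parents))  # de-duplicate, keep order
        parents = sorted(pool, key=toy_score, reverse=True)[:keep]
        history.append(toy_score(parents[0]))
    return parents, history


parents, history = directed_evolution("GGGGGGG")
print(history)
```

In this toy, every mutant can be scored. In the real setting, many random variants fail to assemble at all, which is the wasted effort the machine learning approach aims to reduce.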

However, the researchers found that because most random changes reduce functionality, most of the capsids in these libraries cannot assemble correctly. Thus, much of the library is wasted even at the initial filter, lowering success rates for future iterations. One day, Schaffer received an email from Jennifer Listgarten, a professor in the Departments of Electrical Engineering and Computer Science and Bioengineering. Listgarten, who has spent her career applying machine learning to complex biological problems, had seen an advertisement for a talk Schaffer was giving on this challenge.

The duo came up with a plan: implement a machine learning approach to increase the proportion of successfully packaged viruses in these libraries, thus increasing downstream chances of success. The goal was to achieve a high number of viruses that can package successfully, a property known as “packaging fitness,” and to maintain structural diversity within the library. However, these two properties trade off against each other. While high structural diversity originally led to much of the library being wasted, optimizing solely for packaging fitness would result in a less diverse library.

The best libraries, the researchers found, live on a trade-off curve known as the Pareto frontier. Each point on the frontier represents a library for which neither property can be improved without sacrificing the other. By varying the relative weights placed on packaging fitness and diversity, the researchers can trace out the Pareto frontier and identify the best library for any desired balance between the two. The results were remarkable: the model generated libraries with five times higher packaging fitness than existing libraries while maintaining similar diversity.
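One way to picture the frontier is a small numerical sketch. The (packaging fitness, diversity) scores below are made-up numbers for hypothetical candidate libraries, and the simple weighted sum is an illustrative assumption, not necessarily the objective the researchers used. The sketch finds the non-dominated points directly, then shows which of them a sweep over weights recovers.

```python
def pareto_front(points):
    """Return the points not dominated by any other (higher is better on both axes)."""
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] >= p[0] and q[1] >= p[1] for q in points
        )
        if not dominated:
            front.append(p)
    return front


def trace_by_weights(points, weights):
    """For each weight w, pick the point maximizing w*fitness + (1-w)*diversity."""
    chosen = []
    for w in weights:
        best = max(points, key=lambda p: w * p[0] + (1 - w) * p[1])
        if best not in chosen:
            chosen.append(best)
    return chosen


# Made-up (packaging fitness, diversity) scores, for illustration only.
candidates = [(0.9, 0.1), (0.7, 0.5), (0.5, 0.6), (0.4, 0.4), (0.2, 0.9), (0.6, 0.3)]

front = pareto_front(candidates)
picked = trace_by_weights(candidates, [0.1, 0.3, 0.5, 0.7, 0.9])

print("Pareto front:     ", front)
print("Found by weights: ", picked)
```

Every point picked by a strictly positive weighting lies on the frontier, but a plain weighted sum can skip frontier points that sit in a concave stretch of the curve, as happens to (0.5, 0.6) in this toy data.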

The researchers then wondered if their model could be generalized to any directed evolution problem. To test this, they applied the algorithm to develop vectors that target adult brain tissues, aiming to treat neurological diseases. Impressively, the model produced 10 times more successful variants than existing libraries. During the process, the systematic efforts also uncovered a variant that specifically targets glial cells, crucial support cells for neurons. Historically, efforts to treat neurological diseases with gene therapy have yielded limited results, due to a lack of targeting tools for brain cells and the need for invasive procedures, such as drilling holes in a patient’s skull. The algorithm’s success could advance the design of vectors that target brain cells, potentially transforming treatments for neurodegenerative conditions such as Parkinson’s and Huntington’s diseases. The variants identified through this algorithm could advance to clinical trials and expand the treatment landscape for gene therapies, unlocking new opportunities to cure a wide range of genetic diseases.

This article is part of the Fall 2024 issue.