A new study, published in the journal Cell, presents an artificial intelligence (AI) tool that designs synthetic molecules capable of controlling gene expression in healthy mammalian cells.
A team at the Centre for Genomic Regulation (CRG) has developed a model that generates regulatory DNA sequences not found in nature. The AI model can be asked to create synthetic DNA fragments that meet custom criteria such as: “Activate this gene in stem cells that will become red blood cells, but not in platelets,” the authors point out.
Next, the model predicts the combination of DNA letters (A, T, C, G) needed to produce the desired gene expression patterns in specific cell types. Researchers can then use this information to chemically synthesize DNA fragments of approximately 250 letters and add them to a virus that delivers them into cells.
To verify their effectiveness, the researchers asked the AI to design synthetic fragments that would activate a gene encoding a fluorescent protein in some cell types while leaving gene expression patterns unaltered in others.
The machine learning model created these fragments from scratch, and the researchers inserted them into mouse blood cells, where the synthetic DNA integrated into the genome at random locations. The result was exactly as intended.
Possible application in gene therapies
According to the scientists, the study could help develop new gene therapies that increase or reduce the activity of specific genes. It also paves the way for new strategies to fine-tune a patient’s genes, making treatments more effective and reducing side effects.
“The potential applications are huge. It is like writing software, but for biology. It provides new ways of instructing a cell and guiding how it develops and behaves with unprecedented precision,” says Robert Frömel, first author of the study, who carried out the work at the CRG.

Milestone in generative biology
The study marks a milestone in the field of generative biology. To date, advances in this field have focused on protein design to create enzymes and antibodies. However, many human diseases stem from faulty gene expression that is specific to a cell type, so a perfect protein target for a potential drug may never exist.
Lars Velten, group leader at the CRG, told SINC that “the objective of this work is to understand the ‘grammar’ of enhancer DNA, that is, how gene expression works and is regulated.” The results were obtained using mouse blood cells and “if it were used in living organisms in the future, we would need short-term safety studies in animals,” he clarifies.

Regarding bioethical issues, the co-author adds: “We do not rewrite complete genomes. We have created an AI that helps us understand and compose short phrases of DNA, a step forward in knowledge that could one day make gene therapies safer and more precise.”
“The model suggests DNA designs, but rigorous laboratory validation decides which ones move forward. In addition, we add control elements only to non-reproductive (somatic) cells, so no change is passed on to future generations,” he clarifies.
Huge volumes of biological data
Gene expression is controlled by regulatory elements such as enhancers, small DNA fragments that switch genes on or off. To correct faulty gene expression, AI can help design ultra-selective enhancers that do not exist in nature and create therapies without unwanted effects on healthy cells.
Developing this machine learning technology requires a large amount of high-quality data, but for enhancers the existing record is scarce. For that reason, the researchers generated huge volumes of biological data to build their AI model.
Through thousands of experiments with laboratory models of human blood formation, they studied both enhancers and transcription factors, proteins that are also involved in controlling gene expression.
“To create a language model for biology, we must understand the language that cells speak. We set out to decipher these grammar rules for enhancers so that we can create completely new words and phrases,” explains Velten.
The largest enhancer library
Until now, enhancers and transcription factors have been studied using cancer cell lines because they are easier to work with. The authors of this research instead studied healthy cells because they are more representative of human biology. Their work helped uncover subtle mechanisms that shape our immune system and the production of blood cells.
Over five years, the team designed more than 64,000 synthetic enhancers, each carefully built to test its interaction with binding sites for 38 different transcription factors. It is the largest library of synthetic enhancers built in blood cells to date.
Once the enhancers were inserted into cells, the team measured the activity of each one at seven stages of blood cell development. With this method they discovered that many enhancers activate genes in one cell type but repress gene activity in another.
Most enhancers function like the volume dial of a radio, turning gene activity up or down, while others act as on/off switches. The study authors call this switch-like behavior ‘negative synergy’.
The experimental data were crucial for establishing the design principles of the machine learning model. Once the model had enough measurements of how each synthetic enhancer changed gene activity in real cells, it could predict new DNA sequences.
The study was designed as a pilot project to determine whether the technology would work before starting larger-scale research. This is only the tip of the iceberg, since both humans and mice have approximately 1,600 transcription factors responsible for regulating their genomes.
Reference:
Frömel, R. et al. “” Cell (2025)