
A team of Catalan researchers has developed an artificial intelligence (AI) tool that marks a decisive step in understanding Alzheimer’s and other diseases linked to protein aggregation. The tool, named Canya, was presented in a study published in the journal Science Advances, the result of a collaboration between the Centre for Genomic Regulation (CRG) and the Institute for Bioengineering of Catalonia (IBEC).
The AI has managed to decipher the secret language that proteins use to decide whether they fold correctly or form “sticky” clumps, a pathological condition that disrupts cell function and is linked to more than 50 diseases, including Alzheimer’s disease, which together affect about 500 million people worldwide.
Unlike “black box” AI systems, whose inner workings are opaque, Canya was designed to be “explainable”, which allows scientists to understand how it reaches its predictions. This transparency has made it possible to identify the chemical patterns that drive harmful aggregation, a key advance for both biomedical research and the pharmaceutical industry.
A difficult biological language to translate
Proteins are made up of 20 types of amino acids, whose order determines their shape and function. Understanding which combinations cause pathological aggregation has been a scientific challenge for decades. Previous tools faced a major limitation: the shortage of reliable data on this process.
To train Canya, the team generated more than 100,000 synthetic protein fragments, each 20 amino acids long, and studied their behavior in yeast cells. If a fragment caused aggregation, the cells responded in a measurable way, which made it possible to establish cause-and-effect relationships.
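To give a sense of what such a dataset looks like to a computer, here is a minimal illustrative sketch in Python. It is not the team's code: the article does not say how the fragments were designed or encoded, so the random sequences, the one-hot encoding and every name below (`random_fragment`, `one_hot`) are assumptions made purely for illustration.

```python
import random

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
FRAGMENT_LENGTH = 20  # each synthetic fragment is 20 amino acids long


def random_fragment(rng):
    """Return a random 20-residue fragment (hypothetical stand-in for the real library)."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(FRAGMENT_LENGTH))


def one_hot(fragment):
    """Encode a fragment as a list of rows, one per position, each with a single 1."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for aa in fragment:
        row = [0] * len(AMINO_ACIDS)
        row[index[aa]] = 1
        rows.append(row)
    return rows


# Build a tiny toy library; the real study used more than 100,000 fragments.
rng = random.Random(0)
library = [random_fragment(rng) for _ in range(5)]
encoded = [one_hot(fragment) for fragment in library]
```

In a setup like this, each encoded fragment would be paired with a yes-or-no label recording whether it caused aggregation in the yeast cells, and a model would learn from those pairs.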
“The language of proteins is like an unexplored galaxy,” explains Dr. Mike Thompson of the CRG. “Until now, only a small fraction had been mapped. Our approach allows us to expand that map with thousands of combinations never seen in nature.”
Implications for health and industry
Beyond its relevance to neurodegenerative diseases, Canya’s most immediate impact could be in the biotechnology and pharmaceutical sectors. Many modern medicines are based on therapeutic proteins that, if they aggregate, lose their effectiveness and cause manufacturing problems.
“Protein aggregation is a major headache for pharmaceutical companies,” says Dr. Benedetta Bolognesi of the IBEC. “Canya can help design more stable antibodies and enzymes, reducing production failures and economic costs.”
Despite forgoing the predictive power offered by more opaque AI models, Canya has outperformed previous models by 15% in accuracy, demonstrating that transparency and effectiveness do not have to be at odds.
This advance places Catalan science at the forefront of artificial intelligence applied to molecular biology, and opens a new stage in the fight against diseases that, until now, seemed impossible to prevent or treat at their molecular origin.