Artificial Intelligence Deciphers Biological Source Code

Scientists from Harvard University, led by graduate student Yunha Hwang, have developed an artificial intelligence system that can decipher the complex language of genomics. The system, described in the journal Nature Communications, opens new horizons for biologists in understanding gene function and regulation.

Genomic sequences are a kind of “source code” of biology, describing how various biological systems work and interact. The researchers trained the system on microbial metagenomes, the largest and most diverse set of genomic data available, to create a genomic language model (GLM). By learning from this data, the model captures the functional “semantics” and the regulatory “syntax” of each gene.

The importance of the study is underscored by the fact that most genes, even in well-studied organisms, remain poorly characterized. GLM offers a way to understand how different genes work together, helping to identify new biological mechanisms and functions.

An interdisciplinary group of researchers spanning microbiology, genomics, bioinformatics, protein science and machine learning overcame the limitations of traditional protein annotation methods, which consider each protein in isolation and ignore its interactions with others. GLM, by contrast, combines the concept of the “gene neighborhood” with language models, giving a more comprehensive view of how proteins interact.

GLM opens the door to new discoveries in genomics, allowing scientists to uncover new genomic patterns and study previously unannotated genes. This could accelerate the discovery of new biotechnological solutions to problems of climate change and the bioeconomy.

The GLM work is a significant achievement of interdisciplinary collaboration, advancing the life sciences and providing scientists with a powerful tool for studying biological mechanisms at the genetic level.
