On May 19th, Nucleic Acids Research, an internationally renowned academic journal, published online a Breakthrough Article titled "Synthetic promoter design in Escherichia coli based on a deep generative network" by Wang Xiaowo's research team from the Department of Automation, Tsinghua University. This study is the first to use artificial intelligence to design and generate new gene promoters, providing a new means for the design and optimization of biological regulatory elements.
Gene regulatory elements, the cornerstone of synthetic biological systems, are widely used in metabolic engineering, gene therapy and other fields. Constructing engineered biological systems requires a large number of high-performance regulatory elements that can meet the requirements of different chassis cells and working environments. In the past, synthetic elements were obtained mainly through simple modification of natural ones, such as random mutation of natural sequences, assembly of functional fragments, and experimental screening by directed evolution. These methods have a low success rate, and they usually yield only elements that are very similar to the natural sequence. As a result, it is difficult to find new regulatory elements. A sequence 100 bases long, for example, has 4^100 possible variants, but natural elements account for only a small part of them, and the potential sequence space is far beyond the screening capacity of current experimental libraries.
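The scale of the 4^100 figure can be illustrated with a short calculation. The Python sketch below is only an illustration: the library size of one billion variants is a hypothetical round number chosen for comparison, not a figure from the study.

```python
# Size of the sequence space for a DNA sequence of a given length.
ALPHABET_SIZE = 4  # the four bases A, C, G and T


def sequence_space_size(length: int) -> int:
    """Number of distinct DNA sequences of the given length."""
    return ALPHABET_SIZE ** length


space = sequence_space_size(100)

# ASSUMPTION: a hypothetical screening library of one billion variants,
# used here only to give a sense of scale.
hypothetical_library_size = 10 ** 9
fraction_covered = hypothetical_library_size / space

print(f"Possible 100-base sequences: {space:.3e}")
print(f"Fraction covered by the hypothetical library: {fraction_covered:.3e}")
```

Even under this generous assumption, the library would cover a vanishingly small fraction of the possible sequences, which is why a guided search is attractive.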
With the advent of the era of artificial intelligence and big data, deep learning has displayed unique advantages in complex object characterization, multi-modal fusion, automatic sample generation and other tasks, which opens new possibilities for the design of biomolecules. In this study, artificial intelligence was applied to construct new gene regulatory elements: deep learning and prior biological knowledge were combined to build a generative model of regulatory elements for automatic design. Replacing random search in biological experiments with algorithmic search on a computer can greatly improve the experimental success rate. The research team successfully designed and generated new gene promoters in E. coli. The method can generate a large number of new promoters, and the success rate of experimental verification exceeded 70% after iterative optimization. The elements designed by artificial intelligence retain the statistical characteristics of the key features of natural elements, and also exhibit some sequence patterns that are not typical of natural promoters. Their overall sequence can have very low similarity to natural promoters, which reduces the risk of homologous recombination with the natural genome. In addition, the optimized artificial elements can have higher transcriptional activity than natural sequences. In theory, this approach can generate a vast number of completely new elements, far exceeding the number of natural promoters, and so greatly enrich the regulatory element library available for engineering biology research.
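One simple way to picture what "low similarity with natural promoters" means is position-by-position identity between sequences of equal length. The Python sketch below is an assumption made for illustration only: the article does not state which similarity measure the team used, and the function names and the short toy sequences are hypothetical placeholders, not data from the study.

```python
from typing import Iterable


def identity(seq_a: str, seq_b: str) -> float:
    """Fraction of positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def max_identity(candidate: str, natural_set: Iterable[str]) -> float:
    """Highest identity between a candidate and any sequence in a reference set."""
    return max(identity(candidate, natural) for natural in natural_set)


# ASSUMPTION: toy 10-base strings standing in for real promoter sequences.
toy_natural = ["TTGACAATTA", "TTGACTGGCA"]
toy_candidate = "TAGACAGTTC"

print(max_identity(toy_candidate, toy_natural))
```

A candidate whose maximum identity to every natural sequence is low is, by this simple measure, far from the natural set. An alignment-based score would be needed for sequences of different lengths.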
This study has proved the feasibility of using artificial intelligence to create new biological regulatory elements in practice, which is of great significance for promoting the intelligent design and construction of engineered biological systems with higher efficiency, safety and controllability. The intersection of artificial intelligence technology and engineering biotechnology may have a profound impact on promoting the development of metabolic engineering, molecular breeding, gene therapy and other fields.
The editorial board of Nucleic Acids Research selected this paper as a Breakthrough Article, a category that accounts for 1-3% of all articles published in the journal. Breakthrough Articles are considered to "answer long-standing key questions in nucleic acid research, open new research areas, and represent the most influential and innovative research results published in the journal."
Associate Professor Wang Xiaowo from the Department of Automation is the corresponding author of this paper. Wang Ye and Wang Haochen, PhD students from the Department of Automation, are the co-first authors. Wei Lei, postdoctoral researcher in the Department of Automation, Liu Liyang, instructor in the Department of Automation, and Li Shuailin, undergraduate student in the School of Life Sciences, are co-authors. The research was funded by the Innovative Research Groups Program of the National Natural Science Foundation of China.
An introduction to the Breakthrough Article can be found on the NAR journal website: http://www.narbreakthrough.com
Original text link:
https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkaa325/5837049