Computational identification of eukaryotic promoters based on cascaded deep capsule neural networks

Date

2021

Authors

Zhu, Y.
Li, F.
Xiang, D.
Akutsu, T.
Song, J.
Jia, C.

Type:

Journal article

Citation

Briefings in Bioinformatics, 2021; 22(4):1-11

Statement of Responsibility

Yan Zhu, Fuyi Li, Dongxu Xiang, Tatsuya Akutsu, Jiangning Song and Cangzhi Jia

Abstract

A promoter is a region of the DNA sequence at which RNA polymerase initiates transcription of a gene; it is typically located proximal to the transcription start site (TSS). Correctly identifying the gene TSS and the core promoter is essential for understanding the transcriptional regulation of genes. As a complement to conventional experimental methods, computational techniques with easy-to-use platforms can serve as essential bioinformatics tools for annotating the functions and physiological roles of promoters. In this work, we propose a deep learning-based method termed Depicter (Deep learning for predicting promoter) for identifying three specific types of promoters: promoter sequences with the TATA-box (TATA model), promoter sequences without the TATA-box (non-TATA model), and indistinguishable promoters (TATA and non-TATA model). Depicter is developed on an up-to-date, species-specific dataset covering Homo sapiens, Mus musculus, Drosophila melanogaster and Arabidopsis thaliana promoters. A convolutional neural network coupled with capsule layers is proposed to train and optimize the prediction model of Depicter. Extensive benchmarking and independent tests demonstrate that Depicter achieves improved predictive performance compared with several state-of-the-art methods. The webserver of Depicter is implemented and freely accessible at https://depicter.erc.monash.edu/.
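As context for the architecture described above, the sketch below illustrates two standard building blocks of such a model in plain NumPy: the "squash" non-linearity that defines a capsule layer (Sabour et al., 2017) and the one-hot encoding commonly used to feed DNA sequences into a convolutional network. The function names, shapes, and base ordering are illustrative assumptions, not Depicter's actual implementation.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Capsule 'squash' non-linearity: rescales vector s so that its
    length lies in [0, 1) while preserving its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)          # length -> ||s||^2 / (1 + ||s||^2)
    return scale * s / np.sqrt(sq_norm + eps)  # unit direction, rescaled

def one_hot_dna(seq):
    """Encode a DNA string as a (len, 4) one-hot matrix (base order: A, C, G, T)."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    mat = np.zeros((len(seq), 4))
    for i, base in enumerate(seq.upper()):
        if base in index:                      # ambiguous bases (e.g. N) stay all-zero
            mat[i, index[base]] = 1.0
    return mat

# A vector of length 5 is squashed to length 25/26 (direction unchanged),
# and a TATA-box motif becomes a 6 x 4 one-hot matrix.
v = squash(np.array([3.0, 4.0]))
x = one_hot_dna("TATAAA")
```

In a capsule network, the length of a squashed output vector is interpreted as the probability that the entity represented by the capsule (here, e.g., a promoter motif) is present, which is why the activation bounds it to [0, 1).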

Rights

© The Author(s) 2020. Published by Oxford University Press. All rights reserved.
