SIMLIN: a bioinformatics tool for prediction of S-sulphenylation in the human proteome based on multi-stage ensemble-learning models
dc.contributor.author | Wang, X. | |
dc.contributor.author | Li, C. | |
dc.contributor.author | Li, F. | |
dc.contributor.author | Sharma, V.S. | |
dc.contributor.author | Song, J. | |
dc.contributor.author | Webb, G.I. | |
dc.date.issued | 2019 | |
dc.description.abstract | Background: S-sulphenylation is a ubiquitous protein post-translational modification (PTM) where an S-hydroxyl (-SOH) bond is formed via the reversible oxidation on the sulfhydryl group of cysteine (C). Recent experimental studies have revealed that S-sulphenylation plays critical roles in many biological functions, such as protein regulation and cell signaling. State-of-the-art bioinformatic advances have facilitated high-throughput in silico screening of protein S-sulphenylation sites, thereby significantly reducing the time and labour costs traditionally required for the experimental investigation of S-sulphenylation. Results: In this study, we have proposed a novel hybrid computational framework, termed SIMLIN, for accurate prediction of protein S-sulphenylation sites using a multi-stage neural-network based ensemble-learning model integrating both protein sequence derived and protein structural features. Benchmarking experiments against the current state-of-the-art predictors for S-sulphenylation demonstrated that SIMLIN delivered competitive prediction performance. The empirical studies on the independent testing dataset demonstrated that SIMLIN achieved 88.0% prediction accuracy and an AUC score of 0.82, which outperforms currently existing methods. Conclusions: In summary, SIMLIN predicts human S-sulphenylation sites with high accuracy, thereby facilitating biological hypothesis generation and experimental validation. The web server, datasets, and online instructions are freely available at http://simlin.erc.monash.edu/ for academic purposes. | |
dc.description.statementofresponsibility | Xiaochuan Wang, Chen Li, Fuyi Li, Varun S. Sharma, Jiangning Song, and Geoffrey I. Webb | |
dc.identifier.citation | BMC Bioinformatics, 2019; 20(1) | |
dc.identifier.doi | 10.1186/s12859-019-3178-6 | |
dc.identifier.issn | 1471-2105 | |
dc.identifier.orcid | Li, F. [0000-0001-5216-3213] | |
dc.identifier.uri | https://hdl.handle.net/2440/139591 | |
dc.language.iso | en | |
dc.publisher | Springer Science and Business Media LLC | |
dc.relation.grant | http://purl.org/au-research/grants/arc/DP120104460 | |
dc.relation.grant | http://purl.org/au-research/grants/arc/LP110200333 | |
dc.relation.grant | http://purl.org/au-research/grants/nhmrc/1144652 | |
dc.relation.grant | http://purl.org/au-research/grants/nhmrc/490989 | |
dc.rights | © The Author(s). 2019. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. | |
dc.source.uri | https://doi.org/10.1186/s12859-019-3178-6 | |
dc.subject | Protein post-translational modification; S-sulphenylation; Bioinformatics software; Machine learning; Ensemble learning | |
dc.subject.mesh | Humans | |
dc.subject.mesh | Sulfamerazine | |
dc.subject.mesh | Proteome | |
dc.subject.mesh | Area Under Curve | |
dc.subject.mesh | ROC Curve | |
dc.subject.mesh | Computational Biology | |
dc.subject.mesh | Amino Acid Sequence | |
dc.subject.mesh | Amino Acid Motifs | |
dc.subject.mesh | Conserved Sequence | |
dc.subject.mesh | Algorithms | |
dc.subject.mesh | Software | |
dc.subject.mesh | Databases, Protein | |
dc.subject.mesh | Gene Ontology | |
dc.subject.mesh | Neural Networks, Computer | |
dc.title | SIMLIN: a bioinformatics tool for prediction of S-sulphenylation in the human proteome based on multi-stage ensemble-learning models | |
dc.type | Journal article | |
pubs.publication-status | Published |
Files
Original bundle
- Name: hdl_139591.pdf
- Size: 2.08 MB
- Format: Adobe Portable Document Format
- Description: Published version