College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
School of Life Sciences, University of Science and Technology of China, Hefei 230027, China
Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
School of Mathematics and Statistics, Changsha University of Science and Technology, Changsha 410114, China
To whom correspondence should be addressed. [email protected] or [email protected]

Bin Yu, Wenying Qiu, Cheng Chen, Anjun Ma, Jing Jiang, Hongyan Zhou, Qin Ma, SubMito-XGBoost: predicting protein submitochondrial localization by fusing multiple feature information and eXtreme gradient boosting, Bioinformatics, Volume 36, Issue 4, February 2020, Pages 1074–1081, https://doi.org/10.1093/bioinformatics/btz734

Abstract

Motivation

Mitochondria are essential organelles in most eukaryotes. They not only play an important role in energy metabolism but also take part in many critical cytopathological processes. Abnormal mitochondria can trigger a series of human diseases, such as Parkinson's disease, multifactor disorder and type II diabetes. Knowing the submitochondrial location of a protein helps clarify its function, which in turn supports the study of disease pathogenesis and drug design.

Results

We propose a new method, SubMito-XGBoost, for predicting protein submitochondrial localization. The method comprises three steps: (i) the g-gap dipeptide composition (g-gap DC), pseudo-amino acid composition (PseAAC), auto-correlation function (ACF) and Bi-gram position-specific scoring matrix (Bi-gram PSSM) are employed to extract protein sequence features; (ii) the Synthetic Minority Oversampling Technique (SMOTE) is used to balance the samples, and the ReliefF algorithm is applied for feature selection; and (iii) the resulting feature vectors are fed into XGBoost to predict protein submitochondrial locations. Evaluated by leave-one-out cross-validation (LOOCV), SubMito-XGBoost obtains satisfactory prediction results compared with existing methods. Its prediction accuracies on the two training datasets M317 and M983 were 97.7% and 98.9%, which are 2.8–12.5% and 3.8–9.9% higher than other methods, respectively. Its prediction accuracy on the independent test set M495 was 94.8%, which is significantly better than existing methods. The proposed method also achieves satisfactory predictive performance on plant and non-plant protein submitochondrial datasets. SubMito-XGBoost may also support new drug design for the treatment of related diseases.
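
As an illustration of the first feature extractor named in step (i), the following is a minimal sketch of a g-gap dipeptide composition. It assumes the commonly used definition, in which each of the 400 ordered amino acid pairs is counted at positions separated by g intervening residues and normalized by the number of such position pairs, L - g - 1. The function name `g_gap_dipeptide_composition` and the choice to skip non-standard residues are assumptions made for this example and are not taken from the authors' released code.

```python
from itertools import product

# The 20 standard amino acids and all 400 ordered pairs (hypothetical ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def g_gap_dipeptide_composition(sequence, g=0):
    """Return a 400-dimensional g-gap dipeptide frequency vector.

    Assumed definition: frequency of residue pairs (seq[i], seq[i + g + 1]),
    normalized by the number of such position pairs, len(seq) - g - 1.
    """
    seq = sequence.upper()
    n_pairs = len(seq) - g - 1
    if g < 0 or n_pairs <= 0:
        raise ValueError("sequence too short for the requested gap g")

    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n_pairs):
        pair = seq[i] + seq[i + g + 1]
        # Assumption: pairs containing non-standard residues (e.g. X) are skipped.
        if pair in counts:
            counts[pair] += 1

    return [counts[d] / n_pairs for d in DIPEPTIDES]


if __name__ == "__main__":
    # Toy sequence, not taken from the M317, M983 or M495 datasets.
    features = g_gap_dipeptide_composition("ACAC", g=1)
    print(len(features), features[DIPEPTIDES.index("AA")])
```

With g = 0 this reduces to the ordinary adjacent dipeptide composition. In a pipeline of the kind described above, vectors like this would be concatenated with the other feature groups before sample balancing and feature selection.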

Availability and implementation

The source code and data are publicly available at https://github.com/QUST-AIBBDRC/SubMito-XGBoost/.

Supplementary information

Supplementary data are available at Bioinformatics online.