Note: ENNAACT is a novel tool which employs neural networks for anticancer activity classification for therapeutic peptides
doi:10.1016/j.biopha.2020.111051
Uses a novel sequence-based deep neural network classifier to predict anticancer peptides (ACPs)
ENNAACT is comparable to best-in-class: CV accuracy ~ 98.3%, Matthews correlation coefficient (MCC) ~ 0.91, AUC ~ 0.95
ACP
5-30 AA
Often derived from host defence peptides (HDPs ~ peptides active against microbes)
3 mechanisms have been proposed in terms of killing cancer cells
Cytoplasmic membrane disruption via micellization or pore formation
Induction of apoptosis via disruption of the mitochondrial membrane (does the peptide enter the cell without disrupting the cell membrane?)
Healthy cell vs cancer cells
Healthy cells
Zwitterionic cell membrane
Cancerous cells
Cell membrane ~ net negative charge, due to:
Phosphatidylserine
O-glycosylated mucins
Sialylated gangliosides
Heparan sulfate
High cell surface area
Increased membrane fluidity
Most previous machine learning methods
SVM
Random forests
Features used in previous studies:
Physicochemical features
AA composition
Dipeptide composition
Chou’s pseudo AA composition
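To make the composition features above concrete, here is a minimal sketch of AA composition and dipeptide composition. The function names and the example sequence are illustrative assumptions, not taken from the paper, and the paper's exact feature definitions may differ.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def aa_composition(seq):
    """Fraction of each standard amino acid in the sequence (20 values)."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]

def dipeptide_composition(seq):
    """Fraction of each ordered residue pair among adjacent pairs (400 values).

    Assumes the sequence has at least two residues.
    """
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    return [counts[a + b] / len(pairs) for a, b in product(AMINO_ACIDS, repeat=2)]

example = "KWKLFKKIGAVLKVL"  # hypothetical sequence, for illustration only
aac = aa_composition(example)
dpc = dipeptide_composition(example)
print(len(aac), len(dpc))
```

Both vectors have a fixed length regardless of peptide length, which is what lets classifiers such as SVM or RF take peptides of different sizes as input.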
Deep learning approach (a subset of ML)
Uses neural networks
ENNAACT
Uses a neural network approach to build the model
Uses the primary AA sequence as input to the model
Chou’s 5-step rule for building an ML classifier
Valid dataset
Effective representation
ML should be trained on the representation of the dataset
Predictive performance ~ evaluate through means of CV
Predictor made available as web-server
NNs perform well when trained on a large dataset
Sources of ACPs:
DBAASP
dbAMP
CancerPPD
BioPepDB
Peptides used as negatives (because there is no experimentally validated set of non-ACPs)
Random non-secretory sequences from the Universal Protein Resource (UniProt)
ACPs ~ usually secretory in nature (based on an antimicrobial activity paper)
After cleaning (removing redundancy and gathering all information)
ACP ~ 659 experimentally validated peptides
non-ACP ~ 5298 peptides
7-40 AAs
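A rough sketch of what the cleaning step could look like, assuming that "redundancy" means exact duplicate sequences and that peptides with non-standard residues are dropped. Both are assumptions; the paper may use a different redundancy criterion. The function name and the toy input are hypothetical.

```python
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

def clean_peptides(seqs, min_len=7, max_len=40):
    """Keep unique sequences of 7-40 standard residues, preserving input order."""
    seen = set()
    kept = []
    for s in seqs:
        s = s.strip().upper()
        if not (min_len <= len(s) <= max_len):
            continue  # outside the 7-40 AA window
        if any(ch not in STANDARD for ch in s):
            continue  # assumption: drop sequences with non-standard residues
        if s in seen:
            continue  # assumption: redundancy = exact duplicate
        seen.add(s)
        kept.append(s)
    return kept

raw = ["KWKLFKKIGAVLKVL", "kwklfkkigavlkvl", "GLFD", "KWKLFKKXGAVLKVL"]  # toy input
cleaned = clean_peptides(raw)
print(cleaned)
```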
Model validation
Dataset is split into 12 subsets
10 subsets used for 10-fold cross-validation
Models from cross-validation -> ensembled and evaluated on the external dataset (the 2 remaining subsets)
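The validation scheme above can be sketched as follows. This is only an index-level illustration: the split here is a plain random shuffle (whether the paper stratifies by class is not stated in these notes), and averaging the fold models' scores is an assumed way of ensembling. All function names are hypothetical.

```python
import random

def split_into_subsets(n_samples, n_subsets=12, seed=0):
    """Shuffle sample indices and deal them into n_subsets roughly equal parts."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_subsets] for i in range(n_subsets)]

def cv_and_external(subsets, n_cv=10):
    """First n_cv subsets form the CV folds; the rest are held out as external."""
    cv_subsets, external = subsets[:n_cv], subsets[n_cv:]
    folds = []
    for k in range(n_cv):
        val = cv_subsets[k]
        train = [i for j, s in enumerate(cv_subsets) if j != k for i in s]
        folds.append((train, val))
    external_idx = [i for s in external for i in s]
    return folds, external_idx

def ensemble_predict(models, x):
    """Average the scores of the fold models (averaging is an assumption)."""
    return sum(m(x) for m in models) / len(models)

subsets = split_into_subsets(659 + 5298)  # ACP + non-ACP counts from the notes
folds, external_idx = cv_and_external(subsets)
print(len(folds), len(external_idx))
```

Each of the 10 folds would yield one trained model; the external indices are never seen during training, so they give an independent check of the ensemble.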
Feature extraction
Physicochemical descriptors
Composition descriptors
ML
PCA
t-SNE
SVM
RF
Dense fully connected neural networks
Neural network
Built using Keras and TensorFlow
Performance evaluation
Metrics suited to evaluating an imbalanced dataset
Acc/Sn/Sp/MCC
ROC
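For reference, the threshold metrics listed above can be computed from a confusion matrix as in this small sketch. The counts in the example are made up to illustrate an imbalanced case and are not results from the paper.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    sn = tp / (tp + fn)  # sensitivity: fraction of true ACPs recovered
    sp = tn / (tn + fp)  # specificity: fraction of non-ACPs correctly rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Acc": acc, "Sn": sn, "Sp": sp, "MCC": mcc}

# Hypothetical imbalanced example: 90 positives, 910 negatives
m = binary_metrics(tp=50, tn=900, fp=10, fn=40)
print(m)
```

In this toy case accuracy is high while sensitivity is much lower, which is why MCC, Sn and Sp are reported alongside accuracy when negatives heavily outnumber positives.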
The authors manually checked each ACP against the original literature to confirm that it is a genuine ACP.