Note: ENNAACT is a novel tool which employs neural networks for anticancer activity classification for therapeutic peptides

doi:10.1016/j.biopha.2020.111051


  • Using a novel sequence-based deep neural network classifier to predict ACP

  • ENNAACT is comparable to best-in-class: CV accuracy ~ 98.3%, Matthews correlation coefficient (MCC) ~ 0.91, AUC ~ 0.95

  • https://research.timmons.eu/ennaact


ACP (anticancer peptide)

  • 5-30 AA

  • Often derived from host defence peptides (HDP ~ peptides against microbes)


3 mechanisms have been proposed in terms of killing cancer cells

  • Cytoplasmic mb disruption via micellization or pore formation

  • Induction of apoptosis via disruption of the mitochondrial mb (peptide enters the cell without disrupting the cell mb?)


Healthy cell vs cancer cells

Healthy cells

  • Zwitterionic cell mb

Cancerous cells

  • Cell mb ~ net negative charge, this is due to

    • phosphatidylserine 

    • O-glycosylated mucins

    • Sialylated ganglioside

    • Heparan sulfate

  • High cell surface area

  • Increased membrane fluidity


Most previous machine learning methods

  • SVM

  • Random forests

Features used in previous studies:

  • Physicochemical features

  • AA composition

  • Dipeptide composition

  • Chou’s pseudo AA composition
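
The two composition features listed above can be sketched in a few lines. This is only an illustration of what AA composition and dipeptide composition mean, not the feature code of any of the cited studies; the function names and the example peptide are made up for the sketch, and sequences are assumed to contain only the 20 standard residues and to be at least 2 residues long.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Fraction of each standard amino acid in the sequence (20 values)."""
    seq = seq.upper()
    return {aa: seq.count(aa) / len(seq) for aa in AMINO_ACIDS}


def dipeptide_composition(seq):
    """Fraction of each ordered residue pair among adjacent pairs (400 values)."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for pair in pairs:
        if pair in counts:  # pairs with non-standard residues are ignored
            counts[pair] += 1
    return {dp: c / len(pairs) for dp, c in counts.items()}


example = "KWKLFKKIGAVLKVL"  # hypothetical peptide, for illustration only
aac = aa_composition(example)
dpc = dipeptide_composition(example)
```

AA composition gives a fixed 20-value vector regardless of peptide length, and dipeptide composition adds some local order information at the cost of 400 mostly-zero values for short peptides.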


Deep learning approach (part of the ML)

  • Using neural network 


ENNAACT


Chou’s 5-step rule for ML classifiers

  • Valid dataset

  • Effective representation

  • ML should be trained on the representation of the dataset

  • Predictive performance ~ evaluated by means of cross-validation (CV)

  • Predictor made available as web-server


NNs perform well when trained on large datasets

Sources of ACPs:

  • DBAASP

  • dbAMP

  • CancerPPD

  • BioPepDB

Peptides used as negatives (because no peptides are experimentally validated as non-ACPs)

  • Random non-secretory sequences from the Universal Protein Resource (UniProt)

  • ACP ~ usually secretory in nature (based on antimicrobial activity paper)

After cleaning (removing redundancy and gathering all information)

  • ACP ~ 659 experimentally validated peptides

  • non-ACP ~ 5298 peptides

  • 7-40 AAs
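
A minimal sketch of this kind of cleaning step, assuming "redundancy" means exact duplicate sequences (the notes do not record the paper's actual criterion) and using the 7-40 AA length window above. The function name and the example sequences are hypothetical.

```python
def clean_peptides(sequences, min_len=7, max_len=40):
    """Normalise case, drop exact duplicates, keep lengths in [min_len, max_len]."""
    seen = set()
    kept = []
    for seq in sequences:
        seq = seq.strip().upper()
        if seq in seen or not (min_len <= len(seq) <= max_len):
            continue
        seen.add(seq)
        kept.append(seq)
    return kept


# Hypothetical input: one duplicate (differing only in case) and one too-short peptide.
raw = ["GLFDIVKKVVGALGSL", "glfdivkkvvgalgsl", "KWKL", "FLPIIAKLLSGLL"]
cleaned = clean_peptides(raw)

# Class sizes taken from the notes above; the ratio shows how imbalanced the set is.
n_acp, n_non_acp = 659, 5298
imbalance_ratio = n_non_acp / n_acp  # roughly 8 negatives per positive
```

With about 8 non-ACPs per ACP, plain accuracy is a weak measure on its own, which is why the evaluation later leans on sensitivity, specificity and MCC.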


Model validation

  • Dataset is split into 12 subsets

  • 10 used for 10-fold cross validation

  • Models from cross-validation are ensembled and evaluated on the external dataset (the 2 remaining subsets)
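
The validation scheme above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the split here is a plain shuffled split (the paper may stratify by class), and averaging the fold models' outputs is an assumed ensembling rule. All function names are hypothetical, and a "model" is any callable returning a probability.

```python
import random


def split_into_subsets(items, n_subsets=12, seed=0):
    """Shuffle, then deal items round-robin into n_subsets near-equal subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def cv_folds(cv_subsets):
    """Yield (train, validation) pairs, holding out one subset per fold."""
    for i, held_out in enumerate(cv_subsets):
        train = [x for j, s in enumerate(cv_subsets) if j != i for x in s]
        yield train, held_out


def ensemble_predict(models, x):
    """Average the fold models' probability outputs (assumed ensembling rule)."""
    return sum(m(x) for m in models) / len(models)


data = list(range(120))  # stand-in for labelled peptides
subsets = split_into_subsets(data)
cv_subsets = subsets[:10]                          # used for 10-fold CV
external = [x for s in subsets[10:] for x in s]    # 2 remaining subsets
folds = list(cv_folds(cv_subsets))
```

The point of holding back the last 2 subsets is that none of the 10 fold models ever sees them during training, so the ensemble's score on them is an independent check.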

Feature extraction

  • Physicochemical descriptors

  • Composition descriptors
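
To make "physicochemical descriptors" concrete, here is a small sketch computing three simple ones: length, an approximate net charge, and mean hydropathy on the Kyte-Doolittle scale. These are examples chosen for illustration; the notes do not list which descriptors ENNAACT actually uses. The charge estimate is a crude assumption (K and R count +1, D and E count -1, histidine and the termini are ignored), and sequences are assumed to contain only standard residues.

```python
# Kyte-Doolittle hydropathy index per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(seq):
    """Crude net charge near neutral pH: (K + R) minus (D + E)."""
    seq = seq.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


def mean_hydropathy(seq):
    """Average Kyte-Doolittle hydropathy over all residues."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def descriptor_vector(seq):
    """Hypothetical 3-value descriptor: [length, net charge, mean hydropathy]."""
    return [len(seq), net_charge(seq), mean_hydropathy(seq)]


features = descriptor_vector("KWKLFKKIGAVLKVL")  # hypothetical peptide
```

In practice such descriptors would be concatenated with the composition descriptors into one feature vector per peptide before being passed to the classifier.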

ML

  • PCA

  • t-SNE

  • SVM

  • RF

  • Dense fully connected neural networks

Neural network

  • Built using Keras and TensorFlow

Performance evaluation

  • Metrics chosen to suit an imbalanced dataset

  • Acc/Sn/Sp/MCC

  • ROC
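
The four threshold metrics above follow directly from the confusion matrix. The sketch below uses the standard definitions; the function name is hypothetical, and returning 0 when a denominator is zero is a common convention rather than something stated in the notes. The example reuses the class sizes from the dataset section to show why accuracy alone misleads on imbalanced data.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    sn = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity (recall on ACPs)
    sp = tn / (tn + fp) if (tn + fp) else 0.0  # specificity (recall on non-ACPs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Acc": acc, "Sn": sn, "Sp": sp, "MCC": mcc}


# A useless classifier that labels every peptide as non-ACP
# (659 ACPs all missed, 5298 non-ACPs all correct).
all_negative = binary_metrics(tp=0, tn=5298, fp=0, fn=659)
```

Here accuracy is about 0.89 even though not a single ACP is found, while Sn and MCC are both 0, which is the reason MCC is reported alongside accuracy.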


They manually verified against the original literature that each ACP is a genuine ACP.

