Notes for the YouTube lecture: Computational Drug Discovery: Machine Learning for Making Sense of Big Data in Drug Discovery

Link: https://youtu.be/uoVAd_zd-90

Drug

1.       Biological entity: a biologic

2.       Chemical-based drugs: synthetic drugs, natural products, small molecules

Drug discovery, for one particular drug

-          Takes 10-15 years

-          Failure rate: >90%

-          Cost: ~2 billion USD

Drug discovery process (millions of compounds are narrowed down to one, and that one can still fail!)

1.       Identify the target; ~30,000 proteins (not including PTM, post-translational modification, processes)

2.       Screen for hit compounds: molecules that disrupt the activity of the particular protein

3.       Optimize the hit compounds to obtain a potent candidate; this is medicinal chemistry, using scaffold hopping, bioisosteres, and structure-activity relationships

4.       ADMET (absorption, distribution, metabolism, excretion, toxicity): balance potency against toxicity

Computational drug discovery

-          Green chemistry: a safer way to generate compounds that is also environmentally friendly, with fewer process steps (meaning less chemical waste)

How do we search for compounds?

-          Natural resources

-          Computational approach: training a computer to learn organic reactions; GDB-13 and GDB-17 databases

-          Known compounds: PubChem, ChEMBL

Drug discovery toolbox

Combinatorial chemistry

-          Scaffold

-          Functional group

Chemical libraries

Chemical space: the diversity of compounds, approximately 10^60 molecules of <500 Da

HTS (high-throughput screening)

Property filters

Computational chemistry

Machine learning

QSAR

Proteochemometrics

Molecular modeling

Molecular dynamics

Molecular docking

Computational models in drug discovery

-          Link the chemical library to bioactivity

-          Train the computer to learn this relationship

-          With this approach:

o   The model serves as a guideline for designing compounds with good potency

-          Chemists generate many compounds; a model helps answer:

o   Which proteins could bind to a particular compound?

o   Are there off-target effects?

o   Do similar compounds bind similarly?

o   How does the compound bind to the protein?

Quantum chemistry

-          Translates the electron distribution into quantitative descriptors

-          This makes ligand-based drug design feasible

Fragment-based drug design

-          Molecules of 13 heavy atoms serve as fragments

-          Linking fragment to fragment gives larger, more diverse compounds

Lipinski’s rule of 5

-          MW ≤500 Da, ≤5 hydrogen bond donors, ≤10 hydrogen bond acceptors, partition coefficient (log P) ≤5 [PubMed:11259830]

-          Based on a collection of about 2,000 FDA drugs

-          Applies to orally administered drugs

-          The data were analyzed to find common properties (chemical profiles that are safer for human use)

-          What about chemicals that pass the rule but are still toxic?
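The four thresholds above are easy to turn into a filter. Below is a minimal sketch in Python, assuming the descriptors have already been computed by some cheminformatics tool. The `Descriptors` container, the function names, and the `max_violations` parameter are all hypothetical, and the example values are made up for illustration rather than measured.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float   # molecular weight in Da
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors
    logp: float         # partition coefficient (log P)


def lipinski_violations(d: Descriptors) -> List[str]:
    """Return the list of rule-of-5 criteria that the compound breaks."""
    violations = []
    if d.mol_weight > 500:
        violations.append("MW > 500 Da")
    if d.h_donors > 5:
        violations.append("H-bond donors > 5")
    if d.h_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    if d.logp > 5:
        violations.append("log P > 5")
    return violations


def passes_rule_of_five(d: Descriptors, max_violations: int = 0) -> bool:
    """Strict by default; max_violations is an assumed knob for looser filters."""
    return len(lipinski_violations(d)) <= max_violations


# Illustrative, made-up descriptor values (not measured data).
small_compound = Descriptors(mol_weight=180.0, h_donors=1, h_acceptors=4, logp=1.2)
large_compound = Descriptors(mol_weight=812.0, h_donors=7, h_acceptors=12, logp=3.0)

print(passes_rule_of_five(small_compound))
print(lipinski_violations(large_compound))
```

Note that the filter says nothing about toxicity, which is exactly the open question in the last bullet: passing the rule only means the compound resembles known oral drugs in these four properties.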

Lead-like rule of 3

-          Compounds should be <300 Da

-          A lead should be as small as possible because the molecule grows as groups are added during the modification steps

Biological space

-          List of proteins which are druggable

-          Small molecules which are active on specific targets

-          Amino acid (AA) sequences are not random

Structural classification of natural products

-          Arranges the scaffolds of natural products in a tree-like fashion

-          Provides a viable analysis and hypothesis-generating tool for the design of natural-product-derived compound collections

Chemical space > Biological space

-          Privileged substructures: substructures that are present in many drugs and predisposed to bioactivity.

Polypharmacology

-          One drug -> multiple targets

-          Basic idea: take a fragment for target 1 and a fragment for target 2 -> link the two fragments -> make sure the result still fits the rules above (rule of 3 and rule of 5) -> check whether it is feasible to synthesize

QSAR

-          Seeks the relationship between structure and activity

-          Chemical structure (functional groups) vs. activity (biological property: IC50, EC50, Ki, Km, MIC)

-          Multiple linear regression: Y (biological activity) = F(energy, Qm, dipole moment, ...)

-          Regression coefficients indicate whether a particular feature (factor, e.g., energy, dipole moment) has more or less effect on the bioactivity (how strongly the activity depends on each factor)

-          Application -> we can predict biological properties
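To make the regression idea concrete, here is a minimal sketch of multiple linear regression in plain Python, solving the normal equations directly. The two descriptors are hypothetical stand-ins for features such as energy and dipole moment, and the data are synthetic: they were generated from Y = 2.0 + 0.5*x1 - 1.0*x2, so the fit should recover those coefficients. The function names `fit_mlr` and `predict` are illustrative, not taken from the lecture.

```python
def fit_mlr(X, y):
    """Ordinary least squares with an intercept, via (X^T X) b = X^T y."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = []
    for i in range(p):
        line = [sum(r[i] * r[j] for r in rows) for j in range(p)]
        line.append(sum(r[i] * yi for r, yi in zip(rows, y)))
        A.append(line)
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("descriptors are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]  # [intercept, coef_1, coef_2, ...]


def predict(coefs, x):
    """Predicted activity for one compound's descriptor vector."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], x))


# Synthetic training set: (descriptor_1, descriptor_2) -> activity
X_train = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 2)]
y_train = [2.5, 1.0, 1.5, 2.0, 1.5]

coefs = fit_mlr(X_train, y_train)
print("intercept and coefficients:", [round(c, 6) for c in coefs])
print("predicted activity for (2, 2):", round(predict(coefs, (2, 2)), 6))
```

In this toy fit the second descriptor has the larger coefficient in absolute value and a negative sign, so it lowers the predicted activity more strongly than the first one raises it. Comparing coefficients this way is only meaningful when the descriptors are on comparable scales.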

QSAR vs Proteochemometrics

-          QSAR

o   Multiple chemical compounds -> a single target protein

o   Each prediction has a confidence score: the tested compound is compared with the training set; higher similarity -> more confidence that the prediction is correct, lower similarity -> less confidence

-          Proteochemometrics

o   Multiple compounds -> multiple target proteins -> enables drug repurposing

o   Similar to a meta-analysis: observes the relationship between many factors and many results

o   We can study the selectivity of a particular compound across many proteins

o   This approach can also help with orphan receptors
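The QSAR point about confidence (a tested compound that resembles the training set gets a more trustworthy prediction) can be sketched with a similarity measure. The example below uses Tanimoto similarity on sets of fingerprint feature IDs, which is one common choice and an assumption here; the lecture notes do not say which measure was meant. The feature IDs, the function names, and the 0.5 threshold are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint features."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def nearest_training_similarity(query, training_set):
    """Similarity of the query compound to its closest training compound."""
    return max(tanimoto(query, t) for t in training_set)


def confidence_label(query, training_set, threshold=0.5):
    """Assumed rule of thumb: close to the training set -> higher confidence."""
    if nearest_training_similarity(query, training_set) >= threshold:
        return "higher confidence"
    return "lower confidence"


# Hypothetical fingerprints: each compound is a set of feature IDs.
training_set = [{1, 2, 3, 4}, {5, 6, 7}]
similar_query = {1, 2, 3}     # shares most features with the first compound
dissimilar_query = {8, 9}     # shares nothing with the training set

print(confidence_label(similar_query, training_set))
print(confidence_label(dissimilar_query, training_set))
```

The same check carries over to proteochemometrics in principle, except that similarity would have to be judged on the compound-protein pair rather than on the compound alone.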

 

 

