Development of Methods for de novo Design of Functional Drugs and Catalyst Compounds

PhD thesis defense

Yunhan Chu
Department of Chemistry, Norwegian University of Science and Technology (NTNU)

21 June 2011

Outline
Introduction
Overview of GeneGear for de novo design
Evolutionary de novo drug design by GeneGear
Evolutionary de novo coordination catalyst design by GeneGear
A knowledge-based approach of GeneGear for constraining de novo EA search space
Conclusions
Acknowledgements


How to explore chemical space?

Exploring known chemical space
    High Throughput Screening (HTS)
    Virtual Screening (VS)
Exploring novel chemical space
    De novo design: using a computer to produce novel molecular structures with desired properties, taking chemical space as the source

Exploring chemical space by de novo design


De novo design
How to sample chemical structures
    Molecular representation
    Building blocks
    Structural operations
How to evaluate chemical structures
    Scoring function
How to navigate through the space smartly
    Search algorithm

Schneider G. et al., Nat. Rev. Drug Discov., 4:649-663, 2005

GeneGear: open-source software for de novo design


Advantages of GeneGear:
    Freedom in how the system is used, modified and extended
    Design of non-medicinal compounds

GeneGear: de novo design by an Evolutionary Algorithm (EA)
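
The loop behind an evolutionary algorithm can be sketched in a few lines. This is a generic illustration, not GeneGear's implementation: molecules are reduced to fixed-length lists of fragment labels, and the function name `evolve`, its parameters, the one-point crossover and the toy fitness are all assumptions made for the example.

```python
import random


def evolve(fragments, score, pop_size=20, genome_len=4, generations=30,
           mutation_rate=0.2, seed=0):
    """Generic evolutionary loop: rank, select, cross over, mutate."""
    rng = random.Random(seed)
    # Initial population: random sequences of fragment labels.
    population = [[rng.choice(fragments) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness (higher is better) and keep the top half as parents.
        population.sort(key=score, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)
            child = a[:cut] + b[cut:]            # one-point crossover
            if rng.random() < mutation_rate:     # point mutation
                child[rng.randrange(genome_len)] = rng.choice(fragments)
            children.append(child)
        # Parents survive unchanged, so the best score never decreases.
        population = parents + children
    return max(population, key=score)


# Toy usage: hypothetical fragment labels and a fitness that counts "N".
toy_fragments = ["C", "N", "O", "F", "Cl"]


def toy_score(mol):
    return mol.count("N")


best = evolve(toy_fragments, toy_score)
```

In a real de novo run the fitness would be one of the scoring functions discussed later (docking energy, similarity, or a QSAR prediction) rather than a label count.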


GeneGear: building blocks (fragments)

Molecules: National Cancer Institute (NCI) diversity set (1990 molecules)
Split
Screen
Fragments: 1151 fragments

GeneGear: structural representation and operation

Crossover
[Figure: crossover illustrated on structures numbered 1 to 4]

GeneGear: scoring function


Receptor-based scoring
    Receptor-ligand binding free energy (affinity)
    Force-field based function (AutoDock)
    Empirical and knowledge-based function (Vina)

Ligand-based scoring
    Molecular similarity
    Quantitative structure-activity relationship (QSAR)

Multiobjective scoring
    f(p) = w_1 p_1 + w_2 p_2 + ... + w_n p_n
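
The weighted sum above translates directly into code. The sketch below is illustrative only: the function name `multiobjective_score` and the example values are assumptions, and it presumes the individual properties have already been put on comparable scales with higher values meaning better.

```python
def multiobjective_score(properties, weights):
    """Weighted sum f(p) = w_1*p_1 + ... + w_n*p_n."""
    if len(properties) != len(weights):
        raise ValueError("need exactly one weight per property")
    return sum(w * p for w, p in zip(weights, properties))


# Two hypothetical, already normalized property scores with equal weights.
example = multiobjective_score([0.8, 0.4], [0.5, 0.5])
```

With equal weights the result is simply the mean of the property scores; shifting weight toward one term biases the search toward that objective.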


GeneGear application - Evolutionary drug design


Building a fragment library


NCI diversity set (1990 molecules)
Split
Indinavir, an HIV-1 protease inhibitor
Screen
1154 fragments
Select
98 entries

Design of HIV-1 protease inhibitor


Ligand-based scoring (similarity)
Receptor-based scoring (binding energy)

Indinavir fragment set


NCI fragments
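
The slides name similarity as the ligand-based score but do not say which measure is used. As one common possibility, the sketch below computes the Tanimoto coefficient between two fingerprints represented as sets of "on" bit positions; the function name and the bit sets are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient |A & B| / |A | B| for two sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical fingerprints for a reference ligand and a candidate.
reference_bits = {1, 4, 7, 9}
candidate_bits = {1, 4, 8}
similarity = tanimoto(reference_bits, candidate_bits)  # 2 shared / 5 total
```

A score of 1 means identical fingerprints and 0 means no shared bits, so the value can be used directly as a fitness to maximize.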

Design of HIV-1 protease inhibitor - contd


Multiobjective scoring (equally weighted combination of the receptor- and ligand-based strategies)
Indinavir

GeneGear application - Evolutionary coordination catalyst design


Characteristics of coordination compounds


[Figure: bonding in a coordination compound; labels: ionic, neutral, covalent bond, dative bond, metal]

Traditional de novo methods lack the following functions:
    Maintaining and protecting the coordination center
    Retrieving information associated with the coordination center
    Varying ligand groups in a restricted and meaningful manner
    Maintaining possible characteristics of symmetric structures

Representation of coordination compound


Model Example

Lead
Labels: c = core, t = trial, f = free

Assembly of coordination compounds

Growing from a lead


Assembly of coordination compounds - contd


Crossover of free parts

Mutation of free part
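
Restricting the genetic operators to the free parts can be illustrated as follows. This is a sketch under stated assumptions, not the GeneGear code: the `CoordinationCompound` class, the requirement that both parents share the same lead, the one-point exchange, and all fragment names other than Cl2Ru=CH2 and NHC are hypothetical. The core and trial parts are simply carried along untouched.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CoordinationCompound:
    core: str     # 'c' part: coordination center, never modified here
    trial: tuple  # 't' parts: carried along unchanged in this sketch
    free: tuple   # 'f' parts: the only parts the operators may vary


def crossover_free(a, b, rng):
    """Exchange the tails of the free parts of two compounds."""
    if (a.core, a.trial) != (b.core, b.trial) or len(a.free) != len(b.free):
        raise ValueError("parents must share the lead and free-site count")
    cut = rng.randrange(1, len(a.free))
    return (replace(a, free=a.free[:cut] + b.free[cut:]),
            replace(b, free=b.free[:cut] + a.free[cut:]))


def mutate_free(compound, free_library, rng):
    """Replace one randomly chosen free part with a library fragment."""
    i = rng.randrange(len(compound.free))
    new_free = compound.free[:i] + (rng.choice(free_library),) + compound.free[i + 1:]
    return replace(compound, free=new_free)


rng = random.Random(1)
p1 = CoordinationCompound("Cl2Ru=CH2", ("NHC",), ("methyl", "phenyl"))
p2 = CoordinationCompound("Cl2Ru=CH2", ("NHC",), ("ethyl", "mesityl"))
c1, c2 = crossover_free(p1, p2, rng)
mutant = mutate_free(c1, ["isopropyl", "cyclohexyl"], rng)
```

Because both operators return new objects whose `core` and `trial` fields are copied verbatim, the coordination center is protected by construction.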


Case study: Ruthenium catalyst for olefin metathesis

Occhipinti G. et al., J. Am. Chem. Soc., 128:6952-6964, 2006.

A PLSR-based QSAR model for productivity

PM6-optimized geometry of the 14-electron active complex
Q2 = 0.85, RMSECV = 1.46 kcal/mol
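
The two statistics quoted for the model can be computed from cross-validated predictions as sketched below. The slides give only the values, so the definitions used here (Q2 = 1 - PRESS / total sum of squares about the observed mean, RMSECV = square root of PRESS / n) are the commonly used ones and are an assumption; fitting the PLSR model itself is not shown, and the numbers in the example are made up.

```python
import math


def q2_and_rmsecv(observed, cv_predicted):
    """Return (Q2, RMSECV) from observed values and cross-validated predictions."""
    n = len(observed)
    if n == 0 or n != len(cv_predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / n
    # PRESS: sum of squared cross-validated prediction errors.
    press = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, cv_predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - press / ss_tot, math.sqrt(press / n)


# Made-up values purely to exercise the function.
q2, rmsecv = q2_and_rmsecv([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

RMSECV carries the units of the response, which is why the slide reports it in kcal/mol, while Q2 is dimensionless.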


Design of ruthenium catalysts


[Figure: ruthenium dichloride catalyst structures with a ligand L, shown with P- and N-containing ligands carrying substituents R1 to R4]

Fragment library design


Lead library
[Figure: lead structures on the metal fragment M with phosphine (PR3, PAr3) and NHC ligands (variants labelled Cl, F F, H and H H), with substituents R1 to R4]
M: Cl2Ru=CH2

Free library

2238 fragments (1155 side chains + 1083 scaffolds) derived from the KEGG database

Parameter setup for EA experiments


Results of EA experiments: evolution trends

NHC (second gen.) > phosphine (first gen.)

The average predicted productivity increases smoothly over generations

Results of EA experiments: evolution trends (contd)

Results of EA experiments: highly active complexes

Complex    DFT-calc. prod. (kcal/mol)
N1         2.8
N2         1.9
N3         4.8

A knowledge-based approach of GeneGear for constraining de novo EA search space


Advantages and challenges of evolutionary algorithms in de novo design


Advantages:
    Sampling a diverse chemical space
    Providing solutions to a wide range of objective problems
    Performing well in searching a large and complex space

Challenge:
    Production of chemically nonsensical structures

[Figure: two workflows. Without a bias filter: EA, new molecules, fitness function. With a bias filter: EA, new molecules, bias filter, then either fitness function or discarded]

Building a bias filter (BF)


A bias molecule set to sample positive and negative examples.
A set of structure descriptors to characterize the structure space of the bias set.
A classification method to model the positive/negative boundary.
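
How these three pieces sit in front of the fitness function can be sketched as a simple gate. The sketch is hypothetical: `evaluate_generation`, `describe`, `is_sensible` and `fitness` are placeholder names, and the toy descriptor and threshold stand in for a real descriptor set and a classifier trained on the bias set.

```python
def evaluate_generation(candidates, describe, is_sensible, fitness):
    """Score only candidates accepted by the bias filter; discard the rest."""
    scored, discarded = [], []
    for mol in candidates:
        # describe() maps a molecule to its descriptors; is_sensible() is the
        # classifier decision learned from the positive/negative bias set.
        if is_sensible(describe(mol)):
            scored.append((mol, fitness(mol)))
        else:
            discarded.append(mol)
    return scored, discarded


# Toy stand-ins: the descriptor is string length, the filter caps it at 5.
scored, discarded = evaluate_generation(
    ["CCN", "NNNNNNNN", "CNO"],
    describe=len,
    is_sensible=lambda n_atoms: n_atoms <= 5,
    fitness=lambda mol: mol.count("N"),
)
```

Filtering before scoring means rejected structures never consume a fitness evaluation, which matters when the fitness is an expensive docking or quantum chemical calculation.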


Application of bias filter (BF)


Bias filter: 290 Factor Xa inhibitors, 77 descriptors, k-NN (k = 2)

Validation        Accuracy
LOO CV            94%
Test set (145)    93%

Highly lipophilic region: logP > 4.0
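
A k-NN classifier with leave-one-out cross-validation (LOO CV), as reported above, can be sketched with the standard library alone. The details are assumptions: Euclidean distance, ties between the two neighbours resolved in favour of the closer one, and a made-up two-descriptor data set in place of the 77 descriptors.

```python
import math


def knn_predict(train_x, train_y, query, k=2):
    """Majority vote among the k nearest neighbours; ties go to the nearest."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = [train_y[i] for i in order[:k]]
    votes = {}
    for label in nearest:
        votes[label] = votes.get(label, 0) + 1
    top = max(votes.values())
    # Walk neighbours from closest to farthest so ties favour the closest.
    for label in nearest:
        if votes[label] == top:
            return label


def loo_accuracy(xs, ys, k=2):
    """Leave each sample out in turn and predict it from all the others."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, xs[i], k) == ys[i]
    return hits / len(xs)


# Made-up descriptors: label 1 = positive example, 0 = negative example.
xs = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
ys = [1, 1, 1, 0, 0, 0]
accuracy = loo_accuracy(xs, ys)
```

With an even k such as 2, the tie-breaking rule decides every split vote, so it should be stated explicitly whenever accuracies are compared.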


Results of BF and non-BF experiment

logP > 4

logP > 4.8


Conclusions
De novo design is an important concept that allows a variety of computational knowledge, methods and tools to be implemented to explore chemical space.
GeneGear has been shown to be effective for de novo design of functional molecules such as drugs through its implementation of a parallel EA framework.
A new EA, equipped with a dedicated molecular representation and operations, quantum chemistry, and QSAR analysis, has been adapted for the optimization of coordination compounds.
A knowledge-based approach built with chemometrics, multivariate analysis, and machine learning is able to constrain the de novo EA search space.

Acknowledgments

The Department of Chemistry, NTNU, is gratefully thanked for funding this research.
Prof. Bjørn K. Alsberg is thanked for all his support and help.
Members of the Physical Chemistry group are thanked for their good advice.

Thank you for your attention!
