
Quantitative Structure-Activity Relationships

(QSAR)

Rationale for QSAR Studies


In drug design, in vitro potency addresses only part
of the need; a successful drug must also be able to
reach its target in the body while still in its active
form.
The in vivo activity of a substance is a composite of
many factors, including the intrinsic reactivity of
the drug, its solubility in water, its ability to pass the
blood-brain barrier, and its non-reactivity with non-target
molecules that it encounters on its way to the target.

Rationale for QSAR Studies...


A quantitative structure-activity relationship (QSAR)
correlates measurable or calculable physical or
molecular properties to some specific biological
activity in terms of an equation.
Once a valid QSAR has been determined, it should be
possible to predict the biological activity of related
drug candidates before they are put through
expensive and time-consuming biological testing. In
some cases, only computed values need to be known
to make an assessment.

History of QSAR
The first application of QSAR is attributed to Hansch
(1969), who developed an equation that related
biological activity to certain electronic characteristics
and the hydrophobicity of a set of structures.
$$\log(1/C) = k_1 \log P - k_2 (\log P)^2 + k_3 \sigma + k_4$$
where: C = minimum effective dose
P = octanol-water partition coefficient
σ = Hammett substituent constant
k_x = constants derived from regression analysis
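
As a rough illustration of how an equation of this form is used, the sketch below evaluates it for a given set of coefficients. The coefficient values are hypothetical placeholders, not fitted constants from any study, and the function names are illustrative. Because the equation is parabolic in log P, setting its derivative with respect to log P to zero gives an optimum at log P = k1/(2 k2), which the second function computes.

```python
def hansch_log_inv_c(log_p, sigma, k1, k2, k3, k4):
    """Evaluate log(1/C) = k1*logP - k2*(logP)^2 + k3*sigma + k4."""
    return k1 * log_p - k2 * log_p ** 2 + k3 * sigma + k4


def optimal_log_p(k1, k2):
    """log P that maximizes log(1/C); needs k2 > 0 so the parabola opens downward."""
    if k2 <= 0:
        raise ValueError("k2 must be positive for a maximum to exist")
    return k1 / (2.0 * k2)


# Hypothetical coefficients, chosen only to show the shape of the curve.
K1, K2, K3, K4 = 1.0, 0.25, 0.5, 3.0

best_log_p = optimal_log_p(K1, K2)
activity_at_best = hansch_log_inv_c(best_log_p, 0.0, K1, K2, K3, K4)
activity_off_peak = hansch_log_inv_c(best_log_p + 1.0, 0.0, K1, K2, K3, K4)
print(best_log_p, activity_at_best, activity_off_peak)
```

The squared term is what lets the model capture the common observation that activity rises with hydrophobicity only up to a point and then falls off.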

Hansch's Approach
Log P is a measure of the drug's hydrophobicity,
which was selected as a measure of its ability to
pass through cell membranes.
The log P (or log Po/w) value reflects the relative
solubility of the drug in octanol (representing the
lipid bilayer of a cell membrane) and water (the
fluid within the cell and in blood).
Log P values may be measured experimentally
or, more commonly, calculated.

Calculating Log P
Log P = log K(o/w) = log ([X]octanol / [X]water)
Most programs use a group additivity approach, shown here
for benzyl bromide (C6H5CH2Br):

Fragment             Contribution
1 aromatic ring         0.780
7 H's on carbon         1.589
1 C-Br bond            -0.120
1 alkyl C               0.195

Sum = 2.924 = calc. log P

Some use more complicated algorithms, including
factors such as the dipole moment, molecular size
and shape.
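
The group additivity idea can be sketched in a few lines: each fragment carries a fixed contribution, and the estimate is their sum. The pairing of fragment labels with values below follows the order in which they are listed and is an assumption. Note that the four contributions shown add up to 2.444, so the quoted total of 2.924 includes 0.480 that is not accounted for by the terms listed; the sketch only demonstrates the summation itself.

```python
# Fragment contributions as listed for the example molecule.
# Label/value pairing is assumed from the order of the listing.
CONTRIBUTIONS = [
    ("1 aromatic ring", 0.780),
    ("7 H's on carbon", 1.589),
    ("1 C-Br bond", -0.120),
    ("1 alkyl C", 0.195),
]


def additive_log_p(contributions):
    """Estimate log P as the sum of fragment contributions."""
    return sum(value for _label, value in contributions)


listed_total = additive_log_p(CONTRIBUTIONS)
print(f"sum of listed contributions = {listed_total:.3f}")
```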

Hansch's Approach...
The Hammett substituent constant (σ) reflects the
drug molecule's intrinsic reactivity, related to
electronic factors caused by aryl substituents.
In chemical reactions, aromatic ring substituents
can alter the rate of reaction by up to 6 orders of
magnitude!
For example, the rate of the reaction below is ~10^5
times slower when X = NO2 than when X = CH3:

[Reaction scheme: X-substituted aryl-CH-Cl + CH3OH → aryl-CH-OCH3 + HCl]

Hammett Equation
Hammett observed a linear free energy
relationship between the log of the relative rate
constants for ester hydrolysis and the log of the
relative acid ionization (equilibrium) constants
for a series of substituted benzoic esters & acids.
$$\log(k_X/k_H) = \rho \log(K_X/K_H) = \rho\sigma$$
He arbitrarily assigned ρ, the reaction constant,
of the acid ionization of benzoic acid a value of 1.

Definition of Hammett σ
[Scheme: ionization of a para-substituted benzoic acid, X-C6H4-COOH ⇌ X-C6H4-COO- + H+]

Substituent    σp      Eq. constant (K)    log K
-NH2          -0.66    0.00000554         -5.25649
-OCH3         -0.27    0.000015           -4.82391
-CH3          -0.17    0.000023           -4.63827
-H             0.00    0.000034           -4.46852
-Cl            0.23    0.000055           -4.25964
-COCH3         0.50    0.000088           -4.05552
-CN            0.66    0.000128           -3.89279
-NO2           0.78    0.000166           -3.77989

Hammett Plot
[Plot: log K versus σp; best-fit line y = 0.9992x - 4.5305, R² = 0.9907]

These σp values are obtained from the best-fit line having a slope = 1.
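
The line on the Hammett plot is an ordinary least-squares fit of log K against σp. A minimal sketch of that fit, using the eight tabulated (σp, log K) pairs and plain Python, is shown below; the function name is illustrative. The result should come out close to the line reported on the plot (slope near 1, intercept near -4.53).

```python
# sigma_p and log K values for the eight para substituents in the table.
SIGMA_P = [-0.66, -0.27, -0.17, 0.00, 0.23, 0.50, 0.66, 0.78]
LOG_K = [-5.25649, -4.82391, -4.63827, -4.46852,
         -4.25964, -4.05552, -3.89279, -3.77989]


def linear_fit(xs, ys):
    """Least-squares fit y = slope*x + intercept; also returns R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_fit(SIGMA_P, LOG_K)
print(f"log K = {slope:.4f} * sigma_p + {intercept:.4f} (R^2 = {r_squared:.4f})")
```

The slope plays the role of the reaction constant ρ for this reference equilibrium, which is why it sits at about 1.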

Hammett Plot
Aryl substituent constants (σ) were determined by
measuring the effect of a substituent on a reaction
rate (or Keq). These are listed in tables, and are
constant in widely different reactions.
Reaction constants (ρ) for other reactions may also
be determined by comparison of the relative rates
(or Keq) of two differently substituted reactants,
using the substituent constants described above.
Some of these values (ρ and σ) are listed on the
following slide.

Hammett Rho & Sigma Values

Reaction (Rho) Values
[Scheme: X-substituted aryl-CH2C(=O)OCH3 → aryl-CH2C(=O)OH + CH3OH]    ρ = +2.4
[Scheme: X-substituted aryl-CH-Cl + CH3OH → aryl-CH-OCH3 + HCl]        ρ = -5.0

Substituent (Sigma) Values (the electronic effect of the substituent;
negative values are electron donating)

Substituent    σ        Substituent    σ
p-NH2         -0.66     p-Cl           0.23
p-OCH3        -0.27     p-COCH3        0.50
p-CH3         -0.17     p-CN           0.66
m-CH3         -0.07     p-NO2          0.78
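
With a reaction constant and a table of substituent constants in hand, the Hammett relation log(kX/kH) = ρσ predicts relative rates directly. The sketch below applies it to the chloride methanolysis (ρ = -5.0) with the para σ values listed above, and compares X = NO2 with X = CH3. The names are illustrative, and using the plain σp scale here is a simplifying assumption.

```python
RHO_METHANOLYSIS = -5.0  # reaction constant quoted for the chloride + CH3OH reaction
SIGMA_P = {"NO2": 0.78, "CH3": -0.17, "H": 0.00}  # para sigma values from the table


def relative_rate(rho, sigma):
    """Return k_X / k_H from log(k_X / k_H) = rho * sigma."""
    return 10.0 ** (rho * sigma)


ratio_no2_to_ch3 = (relative_rate(RHO_METHANOLYSIS, SIGMA_P["NO2"])
                    / relative_rate(RHO_METHANOLYSIS, SIGMA_P["CH3"]))
print(f"k(NO2) / k(CH3) = {ratio_no2_to_ch3:.2e}")
```

The exponent is -5.0 × (0.78 + 0.17) = -4.75, consistent with the statement that the nitro compound reacts roughly 10^5 times more slowly than the methyl compound.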

Molecular Properties in QSAR


Many other molecular properties have been
incorporated into QSAR studies; some of these
are measurable physical properties, such as:

density
pKa
ionization energy
boiling point
ΔH vaporization
refractive index
molecular weight
dipole moment (μ)
ΔH hydration
reduction potential
lipophilicity parameter π = log PX - log PH

Molecular Properties in QSAR


Other molecular properties (descriptors) that
have been incorporated into QSAR studies
include calculated properties, such as:

ovality
HOMO energy
polarizability
molecular volume
vdW surface area
molar refractivity
hydration energy

surface area, molec. volume
LUMO energy
charges on individual atoms
solvent accessible surface area
maximum + and - charge
hardness
Taft's steric parameter

QSAR Methodology
Often it is found that several descriptors are
correlated; that is, they describe observables that
are closely related, such as MW and boiling point
in a homologous series.
Statistical analysis is used to determine which of
the variables best describe (correlate with) the
observed biological activity, and which are
cross-correlated. The final QSAR involves only the most
important 3 to 5 descriptors, eliminating those
with high cross-correlation.
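
One simple way to spot cross-correlated descriptors is to compute the Pearson correlation coefficient for every pair and flag those above a cutoff. The sketch below does this in plain Python. The 0.9 threshold, the function names and the "made_up" descriptor are assumptions for illustration; the molecular weights and boiling points are approximate values for n-pentane through n-octane, a homologous series.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


def cross_correlated_pairs(descriptors, threshold=0.9):
    """Return (name_a, name_b, r) for descriptor pairs with |r| >= threshold."""
    names = list(descriptors)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(descriptors[a], descriptors[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# n-pentane, n-hexane, n-heptane, n-octane (approximate values).
descriptors = {
    "MW": [72.15, 86.18, 100.20, 114.23],
    "bp_C": [36.1, 68.7, 98.4, 125.7],
    "made_up": [0.3, -0.1, 0.4, -0.2],  # arbitrary illustrative descriptor
}
flagged = cross_correlated_pairs(descriptors)
for a, b, r in flagged:
    print(f"{a} and {b} are cross-correlated (r = {r:.3f})")
```

Of a flagged pair, only one descriptor would normally be kept in the final equation, since the second adds little independent information.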

Limit to the # of Descriptors


The data set should contain at least 5 times as
many compounds as descriptors in the QSAR.
The reason for this is that too few compounds
relative to the number of descriptors will give a
falsely high correlation:

2 points exactly determine a line (2 compounds, 2 properties)
3 points exactly determine a plane (etc., etc.)
A data set of drug candidates that is similar in
size to the number of descriptors will give a high
(and meaningless) correlation.
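
The danger is easy to demonstrate. In the sketch below, a line fitted to two arbitrary, unrelated data points gives a perfect R² of 1, and the apparent correlation collapses as soon as a third point is added. All data values are made up, and the helper names are illustrative; the second helper simply applies the "at least 5 compounds per descriptor" rule.

```python
def r_squared_of_line_fit(xs, ys):
    """R^2 of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


def max_descriptors(n_compounds, compounds_per_descriptor=5):
    """Largest number of descriptors allowed by the compounds-per-descriptor rule."""
    return n_compounds // compounds_per_descriptor


# Two made-up points: the fitted line passes through both exactly.
r2_two_points = r_squared_of_line_fit([0.3, 1.7], [8.9, 7.2])

# Adding a third made-up point breaks the perfect fit.
r2_three_points = r_squared_of_line_fit([0.3, 1.7, 1.0], [8.9, 7.2, 9.5])

print(r2_two_points, r2_three_points, max_descriptors(22))
```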

Statistical Analysis of Data


Multiple linear regression analysis can be
accomplished using standard statistical software,
typically incorporated into sophisticated (and
expensive) drug design software packages, such as
MSI's Cerius2 (academic price, over $20K).
An inexpensive statistical analysis package, StatMost
(academic price, $39), works just fine.
To discover correlated variables and determine which
descriptors correlate best, a partial least squares or
principal component analysis is done.

Example of a QSAR
[Structure: ring-substituted (X, Y) N,N-dimethyl-α-bromophenethylamine]

Anti-adrenergic Activity and Physicochemical Properties
of 3,4-disubstituted N,N-dimethyl-α-bromophenethylamines
π = Lipophilicity parameter
σ+ = Hammett Sigma+ (for benzylic cations)
Es(meta) = Taft's steric parameter

Example of a QSAR...
                                          log (1/C)   log (1/C)   log (1/C)
m-X   p-Y     π       σ+     Es(meta)     obs         calc. a     calc. b
H     H      0.00    0.00     1.24        7.46        7.82        7.88
F     H      0.13    0.35     0.78        7.52        7.45        7.43
H     F      0.15   -0.07     1.24        8.16        8.09        8.17
Cl    H      0.76    0.40     0.27        8.16        8.11        8.05
Cl    F      0.91    0.33     0.27        8.19        8.38        8.34
Br    H      0.94    0.41     0.08        8.30        8.30        8.22
I     H      1.15    0.36    -0.16        8.40        8.61        8.51
Me    H      0.51   -0.07     0.00        8.46        8.51        8.36
Br    F      1.09    0.34     0.08        8.57        8.57        8.51
H     Cl     0.70    0.11     1.24        8.68        8.46        8.60
Me    F      0.66   -0.14     0.00        8.82        8.78        8.65
H     Br     1.02    0.15     1.24        8.89        8.77        8.94
Cl    Cl     1.46    0.51     0.27        8.89        8.75        8.77
Br    Cl     1.64    0.52     0.08        8.92        8.94        8.94
Me    Cl     1.21    0.04     0.00        8.96        9.15        9.08
Cl    Br     1.78    0.55     0.27        9.00        9.06        9.11
Me    Br     1.53    0.08     0.00        9.22        9.46        9.43
H     I      1.26    0.14     1.24        9.25        9.06        9.26
H     Me     0.52   -0.31     1.24        9.30        8.87        8.98
Me    Me     1.03   -0.38     0.00        9.30        9.56        9.47
Br    Br     1.96    0.56     0.08        9.35        9.25        9.29
Br    Me     1.46    0.10     0.08        9.52        9.35        9.33

Example of a QSAR...
QSAR Equation a: (using 2 variables)
$$\log(1/C) = 1.151\,\pi - 1.464\,\sigma^{+} + 7.817$$
(n = 22; r = 0.945)
QSAR Equation b: (using 3 variables)
$$\log(1/C) = 1.259\,\pi - 1.460\,\sigma^{+} + 0.208\,E_s(\text{meta}) + 7.619$$
(n = 22; r = 0.959)
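
To show how the two equations are applied, the sketch below evaluates them with π, σ+ and Es(meta) as inputs for two compounds from the data table: the unsubstituted compound (m-H, p-H) and the m-Br, p-Me compound. The function and dictionary names are illustrative; the coefficients and descriptor values are those given in the equations and the table.

```python
def log_inv_c_eq_a(pi, sigma_plus):
    """Equation a (two variables): 1.151*pi - 1.464*sigma+ + 7.817."""
    return 1.151 * pi - 1.464 * sigma_plus + 7.817


def log_inv_c_eq_b(pi, sigma_plus, es_meta):
    """Equation b (three variables): adds the 0.208*Es(meta) term."""
    return 1.259 * pi - 1.460 * sigma_plus + 0.208 * es_meta + 7.619


# Descriptor values and observed activities taken from the data table.
compounds = {
    "H,H": {"pi": 0.00, "sigma_plus": 0.00, "es_meta": 1.24, "observed": 7.46},
    "Br,Me": {"pi": 1.46, "sigma_plus": 0.10, "es_meta": 0.08, "observed": 9.52},
}

for name, d in compounds.items():
    calc_a = log_inv_c_eq_a(d["pi"], d["sigma_plus"])
    calc_b = log_inv_c_eq_b(d["pi"], d["sigma_plus"], d["es_meta"])
    print(f"{name}: obs {d['observed']:.2f}, calc a {calc_a:.2f}, calc b {calc_b:.2f}")
```

Rounded to two decimals, these reproduce the calculated columns of the table for both compounds, which is a useful check that the coefficients and descriptors have been read correctly.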


QSAR of Antifungal Neolignans


The PM3 semi-empirical method was employed to
calculate a set of molecular properties (descriptors) of
18 neolignan compounds with activities against
Epidermophyton floccosum, a highly susceptible species
of dermatophytes. The correlation between biological
activity and structural properties was obtained by
using the multiple linear regression method. The QSAR
showed not only statistical significance but also
predictive ability. The significant molecular descriptors
related to the compounds with antifungal activity were:
hydration energy (HE) and the charge on C1' carbon
atom (Q1'). The model obtained was applied to a set of
10 new compounds derived from neolignans; five of
them presented promising biological activities against
E. floccosum.

Neolignans

Descriptors Used
Log P: the values of this property were obtained from the
hydrophobic parameters of the substituents;
surface area (A), molecular volume (V), log of the partition
coefficient (Log P) and hydration energy (HE): properties evaluated
with the molecular modeling package HyperChem 5.0;
partial atomic charges (Qn) and bond orders (Ln) derived from
the electrostatic potential;
energy of the HOMO (H) and LUMO (L) frontier orbitals;
hardness (η): obtained from the equation η = (ELUMO - EHOMO)/2;
Mulliken electronegativity (χ): calculated from the equation χ =
-(EHOMO + ELUMO)/2;
other electronic properties were calculated: total energy (ET),
heat of formation (ΔHf), ionization potential (IP), dipole moment
(μ) and polarizability (POL), whose values were obtained from the
molecular orbital program Ampac 5.0.
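
The two orbital-derived descriptors follow directly from the frontier orbital energies. A minimal sketch of both formulas is given below; the orbital energies are hypothetical values in eV chosen only for illustration, and the function names are not taken from any package.

```python
def hardness(e_homo, e_lumo):
    """Hardness: (E_LUMO - E_HOMO) / 2."""
    return (e_lumo - e_homo) / 2.0


def mulliken_electronegativity(e_homo, e_lumo):
    """Mulliken electronegativity: -(E_HOMO + E_LUMO) / 2."""
    return -(e_homo + e_lumo) / 2.0


# Hypothetical frontier orbital energies in eV, for illustration only.
E_HOMO, E_LUMO = -9.0, -0.5

eta = hardness(E_HOMO, E_LUMO)
chi = mulliken_electronegativity(E_HOMO, E_LUMO)
print(f"hardness = {eta:.2f} eV, electronegativity = {chi:.2f} eV")
```

Hardness is half the HOMO-LUMO gap, so a molecule with a large gap is "hard" and resists changes in its electron distribution.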

Two Most Important Descriptors

Antifungal QSAR
Log 1/C = -2.85 - 0.38 HE - 1.45 Q1'
F = 29.63, R² = 0.86, Q² = 0.80, SEP=0.
where:
F is the Fisher test for significance of the eqn.
R² is the general correlation coefficient,
Q² is the predictive capability, and
SEP is the standard error of prediction.
A.A.C. Pinheiro, R.S. Borges, L.S. Santos, C.N. Alves,
Journal of Molecular Structure: THEOCHEM, Vol 672, pp 215-219 (2004).
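
Applying the antifungal model to a new candidate only requires its two descriptor values. The sketch below evaluates the published equation; the input values are hypothetical and do not correspond to any compound in the study, and the unit for hydration energy (kcal/mol) is an assumption.

```python
def antifungal_log_inv_c(hydration_energy, q_c1_prime):
    """Evaluate log(1/C) = -2.85 - 0.38*HE - 1.45*Q1'."""
    return -2.85 - 0.38 * hydration_energy - 1.45 * q_c1_prime


# Hypothetical descriptor values for a made-up candidate compound.
he = -6.0     # hydration energy (assumed to be in kcal/mol)
q1p = -0.10   # partial charge on the C1' carbon atom

predicted = antifungal_log_inv_c(he, q1p)
print(f"predicted log(1/C) = {predicted:.3f}")
```

Both coefficients are negative, so within this model a more negative hydration energy and a more negative charge on C1' each raise the predicted activity.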

QSAR-Calculated Antifungal Activity

New Neolignans

Example of a Pharmacophore
2D Hypothesis and Alignment

3 Dimensional QSAR Methods


Important regions of bioactive molecules are
mapped in 3D space, such that regions of
hydrophobicity, hydrophilicity, H-bond
acceptor, H-bond donor, π-donor, etc. are rendered
so that they overlap, and a general 3D pattern of
the functionally significant regions of a drug is
determined.
CoMFA (Comparative Molecular Field Analysis)
is one such approach:
[Figure: testosterone]

CoMFA of Testosterone
Blue means electronegative groups enhance binding;
red means electronegative groups reduce binding.
Green means bulky groups enhance binding;
yellow means they reduce binding.
