Key Notes:
Ab Initio Basics:
Ab initio is the Latin term for “from first principles”, or, “from scratch”. In ab initio methods,
100% of the model is done mathematically, based primarily on Schrödinger’s equation. Using
several constants, such as the speed of light and Planck’s constant, and the masses of the electrons
and nuclei, we can use ab initio methods to calculate a wide variety of properties.
Ab Initio Methods:
The primary decision in the use of ab initio methods is the choice of what is known as the model
chemistry. The model chemistry describes a mathematical approach to solving the Schrödinger
equation for any molecule. In choosing a model chemistry, one proposes a level of theory (such as
a Hartree-Fock method) and a basis set (described earlier). At its most basic level, ab initio
methods state that if one knows the structure of the molecule, one should be able to perform a
complete calculation of that molecule completely from mathematical principles.
Advantages:
The primary advantage of ab initio methods is the accuracy with which calculations are
performed. To the degree that a chemist needs to know a property that most accurately matches
experimental data, or that most approximates a theoretical prediction, the ab initio method is
chosen. Fundamentally, ab initio is the most accurate and precise of all of the currently available
methods in molecular modeling.
Disadvantages:
Ab initio methods can currently only be applied to small molecular systems. As a general
guideline, most computational chemists hold the upper limit for use of ab initio methods to be
around 50 atoms. This upper limit is almost completely dependent on the computational power
one has at his or her disposal. As computing power improves (primarily through the use of
massively parallel supercomputers), we should be able to come closer to an exact solution of the
Schrödinger equation.
The more progress physical sciences make, the more they tend to enter the domain of
mathematics, which is a kind of centre to which they all converge. We may even judge the
The underlying physical laws necessary for the mathematical theory of a large part of
physics and the whole of chemistry are thus completely known, and the difficulty is only
that the exact application of these laws leads to equations much too complicated to be
soluble.
P.A.M. Dirac 1902-1984
We are perhaps not far removed from the time when we shall be able to submit the bulk
of chemical phenomena to calculation.
Joseph Louis Gay-Lussac 1778-1850
Ab Initio Basics:
Ab initio comes from the Latin phrase “from first principles”, or, more simply, “from scratch”. Ab initio is the only
computational chemistry method that is 100% mathematical. Unlike other methods that will subsequently be defined
and described, ab initio methods do not use any experimental data or other parameters to attempt to calculate
information about a molecule or molecular system. The quotes shown above describe ab initio well: the first
states that mathematics can “perfectly” describe a physical system, and this certainly applies to chemistry. The
Dirac quote states that (as of 1929, when this quote was made) we know all of the mathematics required to completely
describe a chemical system; the only problem (again, in 1929) was that there was no way to solve the resulting equations. This is
not the case in 2006, now that computers are capable of teraflop (trillions of calculations per second) speeds. And,
finally, the Gay-Lussac quote becomes more and more of a reality every day!
Ab Initio Methods:
Ab initio methods are unarguably the most accurate, as well as the most difficult, of all of the techniques currently in
use in the field of molecular modeling. A significant reason for this is that, unlike other methods, the ab initio
method really does start “from scratch”. Beginning with just the molecular structure and a few constants – the speed
of light (c), Planck’s constant (h), the mass (me) and charge (qe) of the electron – one can calculate a score of
chemical properties, make insights into the reactivity of a molecule, and “see” the shapes and sizes of molecular
orbitals. All of this comes at a price – both figurative and literal – as is discussed below under “Advantages” and
“Disadvantages”.
Needless to say, the underlying mathematics of ab initio methods are very complicated, involving the solution of
integrals, the establishment and solution of complicated matrices, and the establishment of equations that can only
be solved through the repetitive abilities of computers. Most of the mathematics found in ab initio methods lies well
beyond the scope of this Guide, although for a reader who has progressed through a solid year of calculus, the
mathematics are accessible.
What is important for all users to understand is the concept of model chemistry. A model chemistry is a complete
mathematical description of the particular calculation. In simplest terms, the model chemistry has two components:
the specific theory being used, and the specific basis set that is being used as the starting point for the calculation.
There are a number of theories, and we will describe a few of them in this reading. The most basic of all theories is
the Hartree-Fock method, named after the two physicists (note: not chemists!) who developed the system. The
“HF” method is also sometimes known as the “self-consistent field (SCF)” theory, which is a better description of
what happens. Most computational chemistry software packages, however, give you pull-down menus that say
“Hartree-Fock” or “RHF” (restricted Hartree-Fock, meaning that all of the electrons are paired) or “UHF”
(unrestricted Hartree-Fock, which allows for unpaired electrons).
So what is the self-consistent field theory? Mathematically, it is quite complicated, but conceptually relatively
simple. A procedural description is as follows:
1. Begin with a set of approximate orbitals (a basis set) for all of the electrons in the system
2. Select one electron as a starting electron
3. Calculate the potential (the energy of the system) in which it moves by "freezing" the distribution of all the
other electrons by treating their averaged distribution as a single ("centrosymmetric") source of potential
4. Solve the Schrödinger equation for the selected electron, resulting in a new, more accurate orbital for
that electron
5. Repeat the procedure for all the other electrons in the system.
6. A single cycle is complete once each electron has been evaluated
7. Begin the process again with the first electron evaluated, using the newly calculated orbitals as the starting
point.
8. Continue iterating (repeating, or cycling) until a pass through the
calculations does not change the values of the orbitals
9. Declare the calculation to be done, as the orbitals are now considered to be "self-consistent".
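The loop described in steps 1 through 9 can be sketched in a few lines of Python. This is only a toy illustration, not a real Hartree-Fock program: it assumes a hypothetical two-site model with two paired electrons, and the site energies `eps`, the hopping term `t`, and the repulsion strength `u` are invented numbers chosen so that the loop converges. What it does share with a real SCF calculation is the pattern of guessing, building an averaged field, solving for a new orbital, and repeating until nothing changes.

```python
import math

def lowest_orbital(a, d, b):
    """Lowest-energy eigenvector of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    if b == 0.0:
        return (1.0, 0.0) if a <= d else (0.0, 1.0)
    mean = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    lam = mean - half_gap              # the lower of the two eigenvalues
    v1, v2 = b, lam - a                # satisfies (a - lam) * v1 + b * v2 = 0
    norm = math.hypot(v1, v2)
    return (v1 / norm, v2 / norm)

def scf(eps=(-1.0, 0.0), t=1.0, u=1.0, tol=1e-10, max_cycles=200):
    """Toy self-consistent field loop: two paired electrons on two sites."""
    n = [1.0, 1.0]                     # initial guess: one electron per site
    for cycle in range(1, max_cycles + 1):
        # Averaged field: each electron feels half of the density on its site
        f11 = eps[0] + u * n[0] / 2.0
        f22 = eps[1] + u * n[1] / 2.0
        # Solve the one-electron problem in that frozen field
        c1, c2 = lowest_orbital(f11, f22, -t)
        # Both electrons occupy the new orbital, giving new site densities
        new_n = [2.0 * c1 * c1, 2.0 * c2 * c2]
        change = max(abs(new_n[0] - n[0]), abs(new_n[1] - n[1]))
        n = new_n
        if change < tol:               # densities no longer change: self-consistent
            return n, cycle, True
    return n, max_cycles, False

occupations, cycles, converged = scf()
print(occupations, cycles, converged)
```

With these made-up parameters the site with the lower energy ends up holding more of the electron density, and the loop stops as soon as one more pass leaves the densities unchanged to within `tol`.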
Several observations may have come to mind (and if they didn’t, you should not be concerned!). If you have not
read the chapter on Mathematics, you might consider doing so! In the procedure above, there is no mention of
the motion of the nuclei, which are treated as fixed (the Born-Oppenheimer approximation). The procedure also talks about treating the electrons as “averaged”
(the Hartree-Fock approximation). By calculating the energy of an electron as measured against all of the other
electrons combined into one big electron, we have an “uncorrelated” system. This lack of electron correlation
introduces a fair degree of inaccuracy to our calculations.
Hartree-Fock, or SCF methods, therefore, do not include electron correlation. This limitation is being addressed with
the development of newer, “post-SCF” methods that do attempt to take into account electron correlation. Some of
these methods are listed below:
• Møller-Plesset (MP) perturbation theory
• Configuration Interaction (CI) theory
• Coupled Cluster (CC) theory
There are several levels of the MP theory, indicated by the number following the abbreviation “MP”, as in MP2,
MP3, etc. The references will often indicate all of these methods with the notation “MP(n)”.
In CI theory, if we replace a single occupied electron orbital with a single virtual orbital, we call that a “single
substitution”, and use the notation CIS. Likewise, replacing two occupieds with two virtuals is a double
substitution, so indicated by the term CID. Why not include every possible substitution of occupied orbitals by virtual orbitals,
which we would label as “Full CI”? As you might be able to determine, the use of Full CI methods is very
impractical without a very powerful supercomputer studying very small molecules. The use of single, double,
triple, and quadruple substitutions is an acknowledgement of the near-impossibility of using a full CI level of
theory.
The problem with truncating the substitutions in this way is that the resulting methods do a fairly poor job of
maintaining size consistency. This is a requirement of any theoretical model. It states that the error in a calculation
should increase in proportion to the size of the molecule. Another way of describing size consistency is that
we can calculate the energy of two non-interacting molecules by adding up the energies of each molecule
calculated separately. The molecules would be non-interacting because of their large distance from each other.
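Written as an equation, the second description of size consistency might look like the following. The symbols are assumptions introduced here for illustration: E_A and E_B are the energies of the two molecules calculated separately, E_AB is the energy of the pair calculated together, and R_AB is the distance between them.

```latex
\lim_{R_{AB} \to \infty} E_{AB}(R_{AB}) = E_{A} + E_{B}
```

A method for which the left-hand side differs from the right-hand side is not size consistent.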
CC methods are the most advanced of the current group of theories. You can identify the coupled cluster theory
by a notational system such as CCSD(T), and this method is available on the NC High School Computational
Chemistry server, using the Gaussian software package. In this notational system, the “CC” refers to coupled
cluster. In the example above, the “SD” refers to the use of a combination of singly and doubly excited electron
calculations. The “T” in parentheses states that the method also includes triple excitations, which are estimated
approximately using the mathematics of perturbation theory (as in Møller-Plesset theory). On the Computational Chemistry server at Shodor,
Gaussian and GAMESS offer both CCSD and CCSD(T).
This leads us back to our description of model chemistry. As stated earlier, model chemistry provides a complete
mathematical description of how a calculation is to be performed. It consists of our choice of a theory and our
choice of a supporting basis set, the numbers used to begin the description of the electron orbital. If, for example,
we choose to do a calculation with the Hartree-Fock/SCF theory and a 6-31G* basis set, we would notate our model
chemistry as follows:
HF/6-31G*
Our calculation improves if we use a more robust theory – such as one of the electron correlation, or post-SCF
methods – and a more robust basis set, for example the triple-zeta valence, polarized and diffuse basis set 6-311+G(d,p).
If it were possible to choose the absolutely best theory and the most powerful basis set, we would reach an exact
solution of the Schrödinger equation! We are, however, a long way from reaching that goal. Indeed, an exact
analytic solution of Schrödinger’s equation is considered by many to be one of the “Holy Grail” areas of modern
chemistry.
On the right hand side of the diagram, we show the exact same configuration for the second oxygen atom. Now,
what happens when the two atoms of oxygen bond to form molecular oxygen, O2? (By the way: atomic oxygen is
quite toxic, while diatomic oxygen is quite necessary!). Electrons will move into molecular orbitals, or MOs.
Starting at the bottom, one electron from the first oxygen atom will move into the σ1s molecular orbital, and one
electron from the second oxygen will move to join it. The next two electrons move into the σ*1s orbital. As we
move up the diagram, we have this pairing going on, at least until we get to the “p” levels. At this level, we have 8
atomic electrons. Two of those electrons go into the σ2p molecular orbital, and the next four go into the π2p MOs.
The last two go into the π*2p MOs, one electron in each. You should note that these electrons are unpaired. Because of this, the
oxygen molecule has a characteristic known as paramagnetism (a molecule with all of its electrons paired would instead be diamagnetic).
The diagram also shows the approximate energy levels, in electron-volts (eV) for each of the molecular orbitals
(MOs). For example, the σ2s MO has an energy value of -38.293 electron-volts. As we move up the diagram, notice
that the energy value gets higher (a smaller negative number). There is also a significance to the use of the asterisk
* notation. Any molecular orbital that does not have an asterisk is known as a bonding orbital, whereas those that
are marked with an asterisk are anti-bonding orbitals. If we count up the number of electrons in bonding orbitals
(10), subtract from that the number of electrons in anti-bonding orbitals (6), and divide that number by 2 (4/2 = 2), we
get the bond order. In this case, a bond order of 2 indicates that molecular oxygen has two oxygen atoms connected with a double
bond.
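The bond order arithmetic above is simple enough to check in a few lines of Python. The sketch below is only an illustration: the function name and the dictionary of occupations are made up for this example, with the electron counts taken from the description of the O2 diagram and the asterisk marking anti-bonding orbitals as in the text.

```python
def bond_order(bonding_electrons, antibonding_electrons):
    """Bond order = (bonding electrons - anti-bonding electrons) / 2."""
    return (bonding_electrons - antibonding_electrons) / 2

# Electrons in each molecular orbital of O2, from the bottom of the diagram up
o2_occupations = {
    "σ1s": 2, "σ*1s": 2,
    "σ2s": 2, "σ*2s": 2,
    "σ2p": 2, "π2p": 4, "π*2p": 2,
}

# Orbitals marked with an asterisk are anti-bonding; all others are bonding
bonding = sum(n for name, n in o2_occupations.items() if "*" not in name)
antibonding = sum(n for name, n in o2_occupations.items() if "*" in name)

print(bonding, antibonding, bond_order(bonding, antibonding))  # prints: 10 6 2.0
```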
It should be noted that MOs are a mathematical construct, and do not actually exist! They are, however, a useful
model. MOs and related concepts (such as Natural Bond Orbitals, or NBOs) provide the chemistry researcher and
chemistry student with an excellent way to predict chemical properties and chemical reactivity. Keeping in mind
that MOs are a mathematical representation, and not a physical reality, is a good thing to do.
As need arises, more theories will be added to the pull-down menus. The available choices provide the educator and
the student researcher with enough variety to explore the various effects of these very different mathematical
models. As of this writing (summer 2006), the following ab initio basis sets are available:
• STO-3G
• 3-21G
• 6-31G(d)
• 6-311+G(d,p)
Again, these choices are provided to give the user a good, but not overwhelming, sample of very different basis sets.
With the five choices of theories and four choices of basis sets, the user can explore in some detail a number of
different model chemistries.
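To see how quickly the choices multiply, the short Python sketch below builds every theory/basis set combination in the "theory/basis set" notation used earlier. The basis sets are the four listed above; the five theory names are an assumption, taken from the column headings of the benchmark table later in this reading.

```python
from itertools import product

# Assumed list of theories (from the benchmark table headings)
theories = ["HF", "MP2", "MP4", "CCSD", "CCSD(T)"]
# Basis sets listed in the text
basis_sets = ["STO-3G", "3-21G", "6-31G(d)", "6-311+G(d,p)"]

# A model chemistry is written as theory/basis set, for example HF/6-31G(d)
model_chemistries = [f"{theory}/{basis}" for theory, basis in product(theories, basis_sets)]

print(len(model_chemistries))   # prints: 20
for name in model_chemistries:
    print(name)
```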
Advantages:
It should be clear to the reader that the choice of one of the ab initio approaches, which is known as a model
chemistry, provides the most accurate computational analysis of a molecule or molecular system possible. Again, as
discussed briefly earlier, the use of this methodology allows us, in the words of Gay-Lussac, to “submit the bulk of
chemical phenomena to calculation”.
Disadvantages:
The disadvantages of this method should not be too much of a surprise! The major disadvantage is that the
researcher has significant limitations on the size of the molecule that he or she can study. As a rule of thumb, ab
initio methods are typically limited to molecules of 50 atoms or fewer. For the biologist, this, of course, rules out any
study of proteins or other large molecules of biological importance, which are typically thousands of atoms in size.
Even for small molecules, the user must have access to some reasonably significant computing power. While the
North Carolina High School Computational Chemistry server is a high-end computing tool, a calculation that has
more than 20 atoms and uses one of the electron correlation methods will require run-times that measure in hours.
This is not atypical in the computational chemistry community. Educators and student researchers who wish to run
calculations of this size will need to request a research account. Classroom accounts, designed to allow educators
and students to investigate how the server is used and perform some small calculations, do not provide enough time
for the exploration of a model chemistry that incorporates one of the more advanced theories and/or one of the more
sophisticated basis sets.
The chart below shows what is known as a “benchmark” test. In this test, we ran the molecule benzene (C6H6) using
five different levels of theory and four different basis sets, for a total of 20 different and unique model chemistries.
The table shows both the amount of computing time required (the “runtime”) and the energies of the molecules in
units of Hartrees. A careful review of this data should reveal that there is a significant change in the runtimes with
the triple-zeta (6-311+G(d,p)) basis set, and a reasonable increase with a “standard” basis set such as 6-31G as we
increase the level of electron correlation (from HF, with no correlation, to CCSD(T), with substantial correlation).
RUNTIMES (in seconds)
Basis set   HF      MP2     MP4      CCSD     CCSD(T)
STO-3G      10.8    13.2    16.6     20.9     25.5
3-21G       11.2    14.5    89.0     96.0     172.0
6-31G       14.8    26.7    599.4    393.9    1064.1
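One way to read the runtimes above is to ask how many times longer each correlated method takes than plain Hartree-Fock with the same basis set. The Python sketch below does that division using the runtime values from the table; the function and variable names are made up for this illustration.

```python
# Benzene runtimes in seconds, copied from the table above
runtimes = {
    "STO-3G": {"HF": 10.8, "MP2": 13.2, "MP4": 16.6, "CCSD": 20.9, "CCSD(T)": 25.5},
    "3-21G":  {"HF": 11.2, "MP2": 14.5, "MP4": 89.0, "CCSD": 96.0, "CCSD(T)": 172.0},
    "6-31G":  {"HF": 14.8, "MP2": 26.7, "MP4": 599.4, "CCSD": 393.9, "CCSD(T)": 1064.1},
}

def slowdown_vs_hf(basis):
    """Runtime of each theory divided by the HF runtime for the same basis set."""
    row = runtimes[basis]
    return {theory: seconds / row["HF"] for theory, seconds in row.items()}

for basis in runtimes:
    ratios = slowdown_vs_hf(basis)
    print(basis, {theory: round(ratio, 1) for theory, ratio in ratios.items()})
```

With the smallest basis set, CCSD(T) takes a little over twice as long as HF; with 6-31G it takes roughly seventy times as long, which is why both halves of the model chemistry matter when estimating run-times.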
MOLECULAR ENERGIES (in Hartrees)