Elements of Programming
Chapter 4
4.0.15
Topics
Aim: Communicate certain fundamental concepts about computers and
computing. Relax.
Digital encoding
(Digital) computation
Fundamental programming
Applied programming
1 File:
coding-computation-slides.tex.
4.0.16
Digital encoding
Digital = discrete
(Analog = continuous)
. . . used for communication (Morse code, braille, "one if by land, . . ."), for
instructing computers (machine code), &c.
4.0.17
Digital encoding: Examples of codes
Written words: "Bob," "Carol," "Ted," "Alice"
Words "stand for" things, events, relations, etc. Make up sentences, etc.
What an idea! With a few letters, all the words; with a few thousand
words, an infinite number of sentences.
4.0.18
Digital encoding: Examples of codes
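Morse code: built from dots (.) and dashes (-).
A  .-      J  .---    S  ...
B  -...    K  -.-     T  -
C  -.-.    L  .-..    U  ..-
D  -..     M  --      V  ...-
E  .       N  -.      W  .--
F  ..-.    O  ---     X  -..-
G  --.     P  .--.    Y  -.--
H  ....    Q  --.-    Z  --..
I  ..      R  .-.
(There's more for other symbols.)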
4.0.19
Digital encoding: Comments on Morse code
How many elementary / primitive symbols? Answer: 2. Why?
How many symbols per code do we need to cover the 26 letters of the
alphabet?
Answer: add up $2^1 + 2^2 + \cdots + 2^n$ until the sum is at least 26, i.e., n = 4 (2 + 4 + 8 + 16 = 30).
Note: Why not just $2^5$?
Why do the different letters have the symbol patterns they have?
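The counting argument above can be checked with a few lines of Python. This is only a sketch: the function name min_code_length is made up for illustration, and it assumes codes of every length from 1 up to n are allowed, as in Morse code.

```python
def min_code_length(n_items: int, n_symbols: int = 2) -> int:
    """Smallest n such that codes of length 1..n cover n_items things."""
    total = 0  # number of distinct codes available so far
    n = 0      # longest code length considered so far
    while total < n_items:
        n += 1
        total += n_symbols ** n  # there are n_symbols**n codes of length n
    return n


print(min_code_length(26))  # variable-length codes (dots and dashes)
print(2 ** 5)               # number of fixed-length codes of length 5
```

With variable-length codes, four symbols suffice for 26 letters; a fixed-length code would need five symbols per letter, since 2^4 = 16 is too few.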
4.0.20
Digital encoding: Examples of codes
Braille: array of $3 \times 2 = 6$ dots, raised or not.
Character codes in computing
1. ASCII (American Standard Code for Information Interchange)
Standard in computing. See the IDT book.
2. EBCDIC: IBM mainframes
3. Unicode: For all the world.
4.0.21
ASCII
Binary code: 1s and 0s.
4.0.22
Number coding systems
Decimal (10-based) has 10 symbols possible per `slot': 0, 1, 2,. . . , 9
Binary (2-based) has 2 symbols possible per `slot': 0 and 1
Octal (8-based) has 8 symbols: 0, 1, 2, 3, 4, 5, 6, and 7
Hexadecimal (16-based) has 16 symbols: 0, 1, . . . , 9, A, B, C, D, E, and F
Number coding can be in any base: 2, 3, . . . , n. Why base 2 for computers?
Why base 8, base 16?
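As a rough illustration of writing one number in several bases, here is a small Python sketch. The helper name to_base is hypothetical; Python's built-in int(text, base) is used only to convert back.

```python
DIGITS = "0123456789ABCDEF"


def to_base(n: int, base: int) -> str:
    """Write a non-negative integer n using the digits of the given base."""
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, remainder = divmod(n, base)  # peel off the lowest 'slot'
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


for base in (2, 8, 10, 16):
    print(base, to_base(80, base))

# Converting back recovers the same number, whatever the base.
print(int(to_base(80, 2), 2))
```

One thing to notice when thinking about the questions above: each octal digit corresponds to exactly three binary digits, and each hexadecimal digit to exactly four.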
4.0.23
Digital encoding.
We encode from a list of atomic symbols (e.g., the alphabet) and compose more complex things by combining these symbols (e.g., words are
composed of letters of the alphabet, sentences are composed of words).
At the most abstract, general level, we can use numbers to be our atomic
symbols (numerals, actually).
So, e.g., P = 80 decimal = 01010000 binary in ASCII.
Let dash (
4.0.24
Digital encoding: Comments
Conventionality (arbitrariness)
Why not P = 1010111? etc.
Generality
Can one type of encoding encode everything that another type of encoding
can encode? Does it matter whether we do decimal or binary?
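The ASCII example and the two comments above can be tried out in Python. This is a sketch only; the helper names encode and decode are invented for illustration, and ord and chr are Python built-ins.

```python
def encode(text: str) -> str:
    """Encode each character as its 8-bit ASCII pattern."""
    return " ".join(format(ord(ch), "08b") for ch in text)


def decode(bits: str) -> str:
    """Decode space-separated 8-bit patterns back into characters."""
    return "".join(chr(int(pattern, 2)) for pattern in bits.split())


print(ord("P"), encode("P"))  # decimal and binary forms of the same code
print(chr(0b1010111))         # the character ASCII happens to assign to 1010111
print(decode(encode("Bob")))  # encoding and decoding round-trips
```

The pattern 1010111 is already taken by another letter in ASCII, which is the sense in which the assignment is a convention. Whether we write the code in decimal or in binary, the same character comes back.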
4.0.25
Digital computation
. . . or a program instruction?
Roughly, a computation is a manipulation and/or recognition of a digital
encoding.
A computer is a machine that does computations: it manipulates, recognizes, and acts on digital encodings.
Our computers work on binary (1s and 0s) encodings. Why?
4.0.26
Digital computation
Is that all? Can't some computers do more than others?
Yes, that's possible. Size, speed, of course.
Actually, just a few instructions are sufficient to compute all possible
manipulations on binary digital encodings, and these are in turn fully
general.
An amazing fact.
4.0.27
Digital computation
So, all real computers are fundamentally equivalent, just some are bigger
and faster than others.
What about interacting with the world? `I/O' as we say.
Same thing, just hooked to I/O devices.
4.0.28
Fundamental programming
Computers run by executing program instructions, one after another.
The program instructions instruct the computer to manipulate, recognize,
and act upon digital (binary) encodings.
How are program instructions represented to the computer? As binary
encodings. What about the data used by the program? Same thing. How
does it know?
Basic cycle: (1) fetch the next instruction (from memory into the CPU),
(2) execute the instruction, (3) figure out where the next instruction is
and go to (1).
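The basic cycle can be sketched as a toy machine in Python. Everything here is an assumption made for illustration: the instruction names (SET, ADD, DEC, JNZ, HALT) are invented, and the instructions are stored as Python tuples rather than as binary encodings.

```python
def run(program):
    """Run a toy program: a list of instruction tuples held in 'memory'."""
    registers = {}
    pc = 0  # program counter: index of the next instruction
    while True:
        op, *args = program[pc]   # (1) fetch the next instruction
        next_pc = pc + 1          # by default, continue with the one after
        if op == "HALT":          # (2) execute the instruction
            return registers
        elif op == "SET":
            registers[args[0]] = args[1]
        elif op == "ADD":
            registers[args[0]] += registers[args[1]]
        elif op == "DEC":
            registers[args[0]] -= 1
        elif op == "JNZ":         # jump if the register is not zero
            if registers[args[0]] != 0:
                next_pc = args[1]
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc = next_pc              # (3) figure out where to go next


# Multiply 3 by 4 using repeated addition.
program = [
    ("SET", "acc", 0),
    ("SET", "n", 4),
    ("SET", "x", 3),
    ("ADD", "acc", "x"),  # address 3: top of the loop
    ("DEC", "n"),
    ("JNZ", "n", 3),      # go back to address 3 while n is not zero
    ("HALT",),
]
print(run(program)["acc"])
```

Note that the program and the registers it works on live in the same kind of storage; only the program counter decides which entries are treated as instructions.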
4.0.29
Applied programming
Don't like them 1s and 0s (machine language)
So, `higher-level' languages: metaphor
Compiler: takes your `higher-level' jottings and translates them into machine language, so your program can be executed.
Note: machine language programs are specific to the machine type you
are running: Intel X, Intel Y, Macintosh, Sun, Alpha, IBM, etc.
4.0.30
Applied programming
Interpreter: Accepts a compact `semi-compiled' (byte-code) version of
your jottings and executes it by translating it on the fly to machine code
and sending it off for execution. Visual Basic, Excel.
Possibility: Interpreters for each type of machine (hardware), but all can
then execute the same byte-code. "Write once, run everywhere."
Think of the Internet.
And: Java.
4.0.31
Applied programming: On the Internet, etc.
Where does/can code execute?
On your PC (your client, whether Wintel, Mac, Linux, Unix,. . . ): Your
browser (Internet Explorer, Netscape, Hot Metal, . . . )
On your PC: Spreadsheet programs, word processing programs, . . .
4.0.32
Where does code execute? (con't.)
On the server: The Web server program that responds to requests from
your browser and serves up files to you.
On the server: Business programs that run in response to your inputs from
your browser: shopping carts, billing, etc. (Think of buying something on
the Web, or participating in an auction.)
On the server: Business programs that create HTML pages and send them
to you on the fly. PHP, ASP, JSP.
4.0.33
Where does code execute? (con't.)
On your PC, via your browser: JavaScript, VBScript for graphics and
animation (but primitively)
On your PC, via your browser: ActiveX (Microsoft) and Java applets
Downloaded in real time from the server! Why? Pluses? Minuses? Worries?
4.1
Bibliographic Note
Chapter 5
Why Program?
People program computers for all sorts of reasons and purposes. Without undue
abuse of reality, we can classify programming activities by degree of difficulty,
or required know-how. In increasing order of technical challenge we have:
1. End-user programming
2. Utility-and-analysis (U&A) programming
3. Applications programming
4. Systems programming
An end-user is anyone who interacts with a computer program (such as a
spreadsheet, presentation software, or a word processor) in order to accomplish
a task. Putting together a nontrivial spreadsheet and using it to solve problems counts as, and is perhaps the prototypical case of, end-user programming.
Typically, end-user programming is accomplished through visual or graphical
interfaces. The end-user manipulates these interfaces in order to send instructions to the program. Think (again, prototypically) of selecting a range in
Excel and clicking on a menu in order to set the color displayed in the range. Similarly, formatting in word processors (e.g., Microsoft Word, among others) and
presentation software (e.g., Microsoft PowerPoint, among others) is a form of
end-user programming that is usually done by manipulating a graphical user
interface. End-users program because no one else will do it for them. The hope
is that an end-user can use the software as a tool to solve the problem at hand,
and can do this faster, cheaper, and more effectively than a specialist programmer. Given the requisite domain knowledge and an appropriate software tool,
the end-user can proceed expeditiously to solve the problem at hand, without
having to take the time to explain the substance of the problem to a technician. Such hopes are in fact often reasonable and indeed fulfilled. End-user
programming is a main topic of this book.
At the other end of the technical challenge scale, systems programming includes writing operating systems, pieces of operating systems (such as device
gram control. In Excel, Word, PowerPoint and other end-user tools such
programs are called macros, and they can be very valuable indeed. Writing
a macro is a prototypical case of U&A programming. Prototypical tasks
for macros include automated loading and formatting of data from an external file, and automation of large or complex tasks involving repetitions
of certain subtasks.
Ad hoc modeling and analysis.
Internet-related tasks.
Extracting information from emails, from ftp sites, or from Web pages
is often valuable, if not necessary. Modern scripting languages facilitate
doing this under program control and on a large scale. This is a better alternative
to doing it manually.
Text, which is the source of so much information, is an even greater challenge than data when it comes to cleaning and formatting for subsequent
analysis. Again, modern scripting languages are excellent tools for this
purpose.
It is often possible and desirable to build special-purpose business applications by assembling them from existing software. Microsoft Office is often
used this way. A decision support system (DSS) for a special purpose
or analysis project is assembled ("glued together") from Excel, Access,
Word, and even PowerPoint. The user-analyst sees an Excel interface from
features that make certain U&A programming tasks very easy. Python is fundamentally a
scripting language, but Python programs can have GUIs (graphical user interfaces),
although we will not discuss them. Python is designed to be extensible and is
continually the beneficiary of extensions built by a large community of U&A
programmers. Python's extensions for Internet programming, for mathematical
and numerical computation, for string (or text) manipulation, and for manipulating Microsoft programs (such as Excel, Access, Word, and PowerPoint) are
especially useful for U&A programming problems.
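To give a flavor of the text manipulation just mentioned, here is a small Python sketch that pulls email addresses out of free text using the standard re module. The pattern is deliberately simple and is an assumption for illustration, not a complete definition of a valid address; the function name extract_emails and the sample text are made up.

```python
import re

# A deliberately simple pattern: name, "@", domain, ".", top-level part.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text: str) -> list:
    """Return the distinct addresses found in text, lowercased, in order."""
    found = []
    for match in EMAIL_PATTERN.findall(text):
        address = match.lower()
        if address not in found:  # skip repeats
            found.append(address)
    return found


sample = (
    "Contact Bob <bob@example.com>, or write to carol@example.org. "
    "BOB@example.com is the same mailbox."
)
print(extract_emails(sample))
```

A few lines like these, run over a folder of saved messages or pages, are typical of the small cleaning and extraction jobs that make up U&A programming.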
Finally, the skills one acquires in learning any good programming language
transfer readily when learning a new language. Learning Python and a little
VBA, as we shall be doing, positions you well for learning VB and VBA should
that become useful. On the other hand, there are important things to learn
that do not fit well (or at all) with VBA and that are easily done in Python.
5.1
Information Sources
What follows in this book about Python is meant to fill in the gaps and slightly
extend introductory material on Python. That material is readily available for
free on the Web. See the Python home page for everything to do with Python,
including documentation and tutorials:
http://www.python.org/
At
http://www.python.org/doc/Newbies.html
you will find a number of Python tutorials. I recommend reading first "A Non-Programmer's Tutorial for Python," by Josh Cogliati, at
http://www.honors.montana.edu/~jjc/easytut/easytut/
After that, you might try "How to Think Like a Computer Scientist," the
Python version of Allen Downey's open source book, with Jeff Elkner. It's
at
http://www.ibiblio.org/obp/thinkCSpy/
Then there's the "Python Tutorial," by Guido van Rossum, with Fred L. Drake,
Jr., editor. Guido created Python. This tutorial is excellent, but it's aimed at
people who are experienced C programmers. If you aren't one, many of the
remarks meant to clarify things will be utterly mysterious. Still, it's a good
source. Guido's tutorial is at
http://www.python.org/doc/current/tut/tut.html
In addition, Python's online documentation is excellent. The "Global Module
Index (for quick access to all documentation)" and the "Library Reference (keep
this under your pillow)" come with the Python installation and are available at
http://www.python.org/doc/.
Several good books on Python are in print, but I think the free tutorials
listed above, along with these notes, and the online documentation should be
plenty. If you are after all a book person, here's a short list.
1. Learning Python by Lutz and Ascher [14] is a good introduction to Python
programming, although not to programming in general.
File: why-program.tex
Chapter 6
Notes
6.0.1
1 File:
misnotes-vba-slides.tex.
6.0.2
Goals for lecture 1.
1. Introduce the basic concepts of macros in Excel (which are written in
VBA).
2. Show how to record and run a macro, and examine and edit its code in
the Visual Basic Editor.
3. Use the Visual Basic Editor to create a simple VBA program (Sub) and
call it for execution from a button on a worksheet.
4. Use the Visual Basic Editor to create a simple VBA program (Function)
and call it for execution from a cell in a worksheet.
5. Introduce the core structure of VBA programs.
(cf., Worksheets("Lecture1") code module Lecture1)
6.0.3
Macros
Programs in VBA. Rôle of VB and VBA for Microsoft.
What is VBA/Excel good for?
- Assembling, "gluing together," applications in MS Office. (Larger issue: "component-based applications.")
- Utility (small job) programming, e.g., for data preparation and manipulation, for programming the interface, . . .
- For learning how to program.
- For prototype programming.
- For learning about modern software concepts (e.g., OOP) and development
environments (now a good one in Excel for VBA).
6.0. NOTES
6.0.4
More on macros
Recording macros
- Tools → Macro → Record New Macro. . .
6.0.5
Recorded macro, called Bob
Sub Bob()
'
' Bob Macro
' Macro recorded 2/13/98 by Steven O. Kimbrough
'
'
Range("B3:C4").Select
Selection.Copy
Range("A1").Select
ActiveSheet.Paste
Application.CutCopyMode = False
Range("A1").Select
End Sub
6.0.6
Basics of Visual Basic for Applications
VB, VBA, VBA/Excel, VBA/Word, . . .
Macro modules
Subs
Functions
Variables & declaring them
Structure of a VBA/Excel application
6.0.7
A simple Sub
6.0.8
Now, make it run from Excel. . .
1. View → Toolbars → Control Toolbox
2. Select and draw a button.
3. Right-click with the button selected → Properties. Set the properties as
desired, then close the Property Window.
4. Right-click with the button selected → View Code
5. Add a call to HelloWorld:
Private Sub cmdHelloWold_Click()
HelloWorld
End Sub
6. Return to Excel, exit design mode, and click the button.
6.0.9
A simple Function
6.0.10
Try these procedures in a code module:
Function dblCubed(X As Double) As Double
dblCubed = X ^ 3
End Function
Sub CubeMe()
Dim dblDaNumber As Double
dblDaNumber = _
InputBox("What number " & _
"would you like to cube?")
dblDaNumber = dblCubed(dblDaNumber)
MsgBox "And the answer is " & dblDaNumber
End Sub
Dim dblDaNumber As Double declares (Dimensions) the program variable
dblDaNumber as type Double (double-precision floating point).
6.0.11
VBA/Excel Programs.
Are collections of Subs and Functions (and
Declarations). Typically they:
1. Are called (started) from Excel.
2. Read in information. From, e.g., worksheets, dialog boxes, files, databases.
3. Store this information in variables.
4. Computationally manipulate the variables.
5. Write out the information. To, e.g., worksheets, dialog boxes, files, databases.
Basic concepts at hand, details now follow.
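The same read, store, manipulate, write pattern shows up in any language. As a rough sketch in Python (the book's other language, not VBA), with comma-separated text standing in for a worksheet column and a hypothetical function name cube_column:

```python
import csv
import io


def cube_column(csv_text: str) -> str:
    """Read one column of numbers and write back each number with its cube."""
    # Read in information (here from text; a file would work the same way).
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    # Store the information in variables.
    values = [float(row[0]) for row in rows]
    # Computationally manipulate the variables.
    cubes = [value ** 3 for value in values]
    # Write out the information.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for value, cube in zip(values, cubes):
        writer.writerow([value, cube])
    return out.getvalue()


print(cube_column("1\n2\n3\n"))
```

The cubing step plays the role of dblCubed above; the reading and writing steps are where a VBA/Excel program would talk to worksheets instead.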
6.0.12
Goals for lecture 2.
1. Introduce program variables and how to declare them.
2. Discuss and show how, in VBA/Excel, to read and write information
from and to worksheets.
3. Introduce the Object Browser.
(cf., Worksheets("Lecture2") code module Lecture2)
6.0.13
Recall: VBA/Excel Programs.
Are collections of Subs and Functions (and
Declarations). Typically they:
1. Are called (started) from Excel.
2. Read in information. From, e.g., worksheets, dialog boxes, files, databases.
3. Store this information in variables.
4. Computationally manipulate the variables.
5. Write out the information. To, e.g., worksheets, dialog boxes, files, databases.
6.0.14
Variables
Expressions (think of names) in programs that can hold different values at
different times.
X, Hi, Lo, Risk in the dblUtility function.
Option Explicit
Sub FirstVariableExample()
Dim I As Integer
Dim MyFirstVariable As Integer
'Note: Dim I, MyFirstVariable as Integer
' leaves I an Integer and MyFirstVariable
' as a Variant. Thanks, Bill!
MyFirstVariable = 3
For I = 1 To MyFirstVariable
MsgBox "Showing and counting: " & I
Next I
End Sub
6.0.15
Variables have data types
Types:
- Integer, Long
- Single, Double
- Currency
- Date
- String
- Variant
Why?
6.0.16
Programs (in general, I emphasize):
Declare variables
Put values into variables
Make calculations with variables
Store results of calculations in variables
6.0.17
Declaring variables (in VBA)
Variables should always be declared.
- Why?
- Variant if not declared.
- Use Option Explicit in the declarations section to force declaration of
variables.
6.0.18
Value(s) of MySecondVariable?
Option Explicit
Private MySecondVariable As Integer
Sub PublicExample1()
MsgBox "We're in PublicExample1 " & _
"and MySecondVariable = " & _
MySecondVariable
End Sub
Sub PublicExample2()
MySecondVariable = 23
MsgBox "We're in PublicExample2 " & _
"and MySecondVariable = " & _
MySecondVariable
End Sub
6.0.19
Reading from, and writing to, a worksheet
Sub CosineHardWired()
Dim MyNumber As Double
MyNumber = _
Worksheets("Lecture2").Cells(9, 2).Value
'Note: Cells(9,2) = row 9, column 2 of the
'worksheet.
Worksheets("Lecture2").Cells(11, 2).Value = _
Cos(MyNumber)
'This also works:
'Worksheets("Lecture2").Range("B11").Value = _
'Cos(MyNumber)
'And suppose MyTestRange is defined B2:D13.
'Then this works, too:
'With Worksheets("Lecture2").Range("MyTestRange")
'    .Cells(10, 1).Value = Cos(MyNumber)
'End With
End Sub
Try a nonnumber in B9. Debug. Reset button. Later: the debugger.
6.0.20
Generalizing on reading & writing
Reading & writing the Excel worksheet are special cases of getting and
setting object properties.
So far, the Value property of a particular object, a particular cell.
Why not, say, the color of a cell?
Sub SimpleShowColor()
Dim MyTempHolder As Variant
MyTempHolder = _
Worksheets(2).Cells(14, 2).Interior.ColorIndex
MsgBox "The interior color of B14 is " _
& MyTempHolder & "."
End Sub
LOTS of objects and properties in Excel.
6.0.21
Generalizing on reading & writing (con't.)
Sub GetTheWorksheetName()
Dim Temp As String
Temp = Range("CellBob").Worksheet.Name
MsgBox "The name of the worksheet in which " & _
"the range CellBob resides is " & Temp
End Sub
Sub RenameMeTheWorksheet()
Dim Temp As String
Temp = _
InputBox("New name for this worksheet?")
ActiveSheet.Name = Temp
End Sub
6.0.22
The object browser, F2 in the VBA Editor
Displays, and lets you explore, all available objects, methods, and properties. Nifty!
member = (method or property)
Right-click on a member. Your code. Excel's code.
Example: Look in Excel, Range class, Cells member (a property). Call
for help.
6.0.23
The Object Browser (con't.)
A word to the wise:
Be ever vigilant. No program documentation is ever complete--or
completely accurate--and the VBA on-line Help is no exception.
Some of the descriptions are just plain wrong. Some of the code
samples don't work. And many, many "gotchas" are left unexplored.
Still, if you take the documentation with a small grain of salt, you'll
find an enormous amount of important information there. And the
easiest way to get to the information is via the Object Browser.
--from Excel 97 Annoyances, p. 205.
6.0.24
Goals for lecture 3.
1. Introduce control structures.
2. Introduce and discuss arrays (and their uses).
3. Discuss calling procedures from within procedures.
(cf., Worksheets("Lecture3") code module Lecture3)
6.0.25
Control structures: For...Next
For doing something a known number of times.
You've seen this already.
6.0.26
Control Structures: If...Then
For doing something (THEN) on condition that something else (IF) is true
(else, skip and continue).
If <condition> Then
[statements]
End If
6.0.27
Control Structures: If...Then...Else
For doing something (THEN) on condition that something else (IF) is
true; otherwise doing the ELSE clause.
If <condition> Then
[if-statements]
Else
[else-statements]
End If
Always: Either the if-statements are executed (when <condition> is true),
or the else-statements are executed (when <condition> is false).
6.0.28
If...Then...Else example
Code for a simple comparison of two (cell) values.
Sub SimpleCompare(Left, Right)
If (Left <> Right) Then
MsgBox "The two values are different."
If (Left < Right) Then
MsgBox "The Right value exceeds the Left."
Else
MsgBox "The Left value exceeds the Right."
End If
Else
MsgBox "The two values are the same."
End If
End Sub
6.0.29
If...Then...Else example (con't.)
The button code to call this Sub.
Private Sub cmdSimpleCompare_Click()
Dim Left
Dim Right
'Both Left and Right will be Variants
Left = _
Cells(4, 2).Value
Right = _
Cells(4, 3).Value
Call SimpleCompare(Left, Right)
'This will also work:
'SimpleCompare Left, Right
End Sub
6.0.30
An alternative, using If...Then...ElseIf ...
Sub SimpleCompareElseIf(Left, Right)
If (Left = Right) Then
MsgBox "The two values are the same."
ElseIf (Left < Right) Then
MsgBox "The two values are different."
MsgBox "The Right value exceeds the Left."
ElseIf (Left > Right) Then
'The following also works:
'ElseIf True Then
MsgBox "The two values are different."
MsgBox "The Left value exceeds the Right."
End If
End Sub
Which is better code? Why? What about ElseIf (Left > Right) Then versus ElseIf True Then?
6.0.31
Control structures: Select Case
A way of generalizing If...Then...Else...
Select Case <test expression>
Case <1st expression list>
[1st statements]
Case <2nd expression list>
[2nd statements]
:
:
Case Else
[else statements]
End Select
6.0.32
Example of Select Case
Function Bonus(performance, salary)
'This function is from the
'Microsoft VB Help files, on Select Case.
Select Case performance
Case 1
Bonus = salary * 0.1
Case 2, 3
Bonus = salary * 0.09
Case 4 To 6
Bonus = salary * 0.07
Case Is > 8
Bonus = 100
Case Else
Bonus = 0
End Select
End Function
6.0.33
Somewhat better
Function dblBonus(performance As Integer, _
salary As Double) As Double
If (performance < 1 Or performance > 10) Then
dblBonus = -9999
Exit Function
End If
Select Case performance
Case 1
dblBonus = salary * 0.1
Case 2, 3
dblBonus = salary * 0.09
Case 4 To 6
dblBonus = salary * 0.07
Case Is > 8
dblBonus = 100
Case Else
dblBonus = 0
End Select
End Function
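It can help to trace which branch each performance rating takes. The sketch below restates the same rules in Python, which is a swap of language made for illustration. It also differs from dblBonus in one labeled way: where the VBA version returns the sentinel value -9999 for an out-of-range rating, this sketch raises an exception instead.

```python
def bonus(performance: int, salary: float) -> float:
    """Python restatement of the Select Case rules in dblBonus."""
    if performance < 1 or performance > 10:
        # Assumption: raise an error rather than return -9999.
        raise ValueError("performance must be between 1 and 10")
    if performance == 1:             # Case 1
        return salary * 0.1
    if performance in (2, 3):        # Case 2, 3
        return salary * 0.09
    if 4 <= performance <= 6:        # Case 4 To 6
        return salary * 0.07
    if performance > 8:              # Case Is > 8
        return 100.0
    return 0.0                       # Case Else: ratings 7 and 8 land here


for rating in range(1, 11):
    print(rating, bonus(rating, 1000.0))
```

Running the loop makes one property of the rules easy to see: ratings 7 and 8 match none of the listed cases and so fall through to the Case Else branch.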
6.0.34
Control structures: Do...Loop
Alternative to For...Next
Differences?
Two versions, two cases each:
Do {While | Until} <condition>
[statements]
Loop
Do
[statements]
Loop {While | Until} <condition>
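One difference between the two versions is when the condition is tested. Python has no Do...Loop, so the sketch below imitates both forms with while; the function names are made up for illustration. (Until is simply While with the condition negated.)

```python
def count_pretest(start: int, limit: int) -> int:
    """Like 'Do While i < limit ... Loop': test first, then run the body."""
    runs = 0
    i = start
    while i < limit:
        runs += 1
        i += 1
    return runs


def count_posttest(start: int, limit: int) -> int:
    """Like 'Do ... Loop While i < limit': run the body, then test."""
    runs = 0
    i = start
    while True:
        runs += 1
        i += 1
        if not (i < limit):
            break
    return runs


print(count_pretest(0, 3), count_posttest(0, 3))  # same count when the test starts true
print(count_pretest(5, 3), count_posttest(5, 3))  # they differ when it starts false
```

When the condition is false from the start, the test-first form never runs its body, while the test-last form runs it once.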
6.0.35
Arrays
Fundamental, but basic, data structures. Very widely used in programming.
Similar to vectors and matrices in mathematics.
- But can have more than 2 dimensions
Can have 1, 2, 3, . . . dimensions
Named much as are variables
Great for capturing a range on a worksheet.
6.0.36
Reversing the contents of a column range
See VBATutor.xls, "Lecture3 Arrays" worksheet, Lecture3Arrays code module. Line numbers below added.
[1]  Sub ReverseRange(FromRange As Range, _
       ToRange As Range)
[2]  Dim intFromLength As Integer
[3]  Dim intToLength As Integer
[4]  Dim DaReverseArray() As Variant
[5]  Dim I As Integer
[6]  intFromLength = FromRange.Rows.Count
[7]  intToLength = ToRange.Rows.Count
[8]  If intFromLength <> intToLength Then
[9]    MsgBox intFromLength & " To length: " _
[10]     & intToLength
[11]   MsgBox "Sorry, the two ranges are " & _
[12]     "not conformable. Exiting."
[13]   Exit Sub
[14] End If
6.0.37
Reversing the contents of a column range (con't.)
[15]
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
[24]
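Lines [15] to [24] of the listing are not reproduced here. As a rough idea of what reversing a column involves, here is a sketch in Python rather than VBA, with plain lists standing in for the two ranges. It is not the author's code; the function name reverse_into is made up, and it mirrors only the conformability check shown in lines [8] to [14].

```python
def reverse_into(source: list, target: list) -> list:
    """Copy source into target in reverse order; lengths must match."""
    if len(source) != len(target):
        # Same idea as the check in lines [8]-[14] of the VBA listing.
        raise ValueError("Sorry, the two ranges are not conformable.")
    n = len(source)
    for i in range(n):
        # The first target slot gets the last source item, and so on.
        target[i] = source[n - 1 - i]
    return target


column = [10, 20, 30, 40]
print(reverse_into(column, [None] * len(column)))
```

Holding the values in an array (here, a list) between reading and writing is the point of the example: the reversal is done on the array, not cell by cell on the worksheet.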
6.0.38
The button code for calling ReverseRange and for clearing out the range reversed.
Private Sub cmdCallReverse_Click()
Dim FromRange As Range
Dim ToRange As Range
Set FromRange = Range("reverse")
Set ToRange = Range("reversed")
Call ReverseRange(FromRange, ToRange)
End Sub
Private Sub cmdClearReversed_Click()
Range("reversed").ClearContents
End Sub
6.0.39
Command button code for squaring the reverse
Private Sub cmdSquareReversed_Click()
Dim FromRange As Range
Dim ToRange As Range
Dim VectorLength As Integer
Dim FromVector() As Double
Dim ToVector() As Double
Set FromRange = Range("reversed")
Set ToRange = Range("reversedsquared")
VectorLength = FromRange.Rows.Count
ReDim FromVector(1 To VectorLength)
ReDim ToVector(1 To VectorLength)
Dim I As Integer
'Check for numeric input
For I = 1 To VectorLength
If Not IsNumeric(FromRange.Cells(I).Value) Then
MsgBox "Inputs must be numbers. Exiting."
Exit Sub
End If
Next I
6.0.40
(con't.)
For I = 1 To VectorLength
FromVector(I) = FromRange.Cells(I).Value
Next I
Dim MyErrorCode As String
MyErrorCode = "Calling"
Call SquareTheVector(FromVector, _
ToVector, MyErrorCode)
If MyErrorCode = "OK" Then
For I = 1 To VectorLength
ToRange.Cells(I).Value = ToVector(I)
Next I
Else
MsgBox "Failed in cmdSquareReversed. " & _
"Error: " & MyErrorCode
End If
End Sub
6.0.41
Code for SquareTheVector
Sub SquareTheVector(StartArray, _
ReturnArray, ErrorCode As String)
Dim intStartLower As Integer
Dim intStartUpper As Integer
Dim intReturnLower As Integer
Dim intReturnUpper As Integer
Dim IsOK As Boolean
Dim I As Integer
IsOK = False
intStartLower = LBound(StartArray, 1)
intStartUpper = UBound(StartArray, 1)
intReturnLower = LBound(ReturnArray, 1)
intReturnUpper = UBound(ReturnArray, 1)
If ((intStartUpper - intStartLower) = _
(intReturnUpper - intReturnLower)) Then
IsOK = True
Else
ErrorCode = "Not OK"
Exit Sub
End If
6.0.42
(con't.)
For I = 1 To (intStartUpper - _
intStartLower + 1)
If (Not IsNumeric(StartArray(intStartLower _
- 1 + I))) Then
MsgBox "Error code 303. Bibi."
Exit Sub
End If
ReturnArray(intReturnLower - 1 + I) = _
StartArray(intStartLower - 1 + I) ^ 2
Next I
ErrorCode = "OK"
End Sub
6.0.43
Comments
Code is perhaps more complex than is strictly required.
But, error-checking is awfully important and there could be more of it
here. (How? Why?)
Also, illustrates generality (e.g., arrays must be conformable, but need not
have the same indexing arrangement)
Useful generally for programming in VBA and for programming generally:
drop things into arrays, pass the arrays around and use them to record
computations and for looping through when you do computations.
6.0.44
Goals for lecture 5.
1. Introduce the VBA/Excel debugger.
2. Introduce forms programming.
3. Discuss VBATutorSortem.xls, a spreadsheet for sorting addresses.
6.0.45
Code in Module1 of Excel Workbook,
VBATutorSortem.xls
Option Explicit
Sub MakeAddressesHorizontal(StartRow _
As Integer, _
StopRow As Integer, DaColumn As Integer, _
DaSheet As String)
Dim intDaColumn As Integer
Dim I As Integer
intDaColumn = DaColumn
Dim Temp
Dim strDaSheet As String
strDaSheet = DaSheet
Dim intDaStartRow As Integer
intDaStartRow = StartRow
6.0.46
Code in Module1
Dim intAddressRow As Integer
Dim intAddressStartRow As Integer
Dim boolDone As Boolean
boolDone = False
'intDaStartRow is the first row in which
'an address lies.
'intAddressStartRow = the absolute
'row number in which a single address
'begins
'intAddressRow = the current (relative)
'row of the address (reset to 1 for the
'top of each address).
intAddressStartRow = intDaStartRow
6.0.47
Code in Module1
With Worksheets(strDaSheet)
intAddressRow = 1
Do While Not boolDone
'Do the next address
'if we have two empty rows, we're done.
If (.Cells(intAddressRow + _
intAddressStartRow, _
intDaColumn).Value = "" And _
.Cells(intAddressRow + _
intAddressStartRow _
+ 1, intDaColumn).Value = "") Then
boolDone = True
StopRow = intAddressRow + _
intAddressStartRow + 1
End If
6.0.48
Code in Module1
'if we have only one empty row,
'we start a new address
If .Cells(intAddressRow + _
intAddressStartRow, _
intDaColumn).Value = "" Then
intAddressStartRow = intAddressRow + _
intAddressStartRow + 1
intAddressRow = 1
End If
6.0.49
Code in Module1
.Cells(intAddressRow + intAddressStartRow, _
intDaColumn).Select
Selection.Copy
.Cells(intAddressStartRow, intDaColumn + _
intAddressRow).Select
ActiveSheet.Paste
.Cells(intAddressRow + intAddressStartRow, _
intDaColumn).Select
Application.CutCopyMode = False
Selection.ClearContents
intAddressRow = intAddressRow + 1
Loop
End With
End Sub
6.0.50
Code in Module1
Sub SortHorizontalAddresses(StartRow _
As Integer, StopRow As Integer, _
DaColumn As Integer, DaSheet As String)
Dim Top As Range
Dim Bottom As Range
Set Top = _
Worksheets(DaSheet).Cells(StartRow, _
DaColumn)
Set Bottom = _
Worksheets(DaSheet).Cells(StopRow, _
DaColumn + 4)
Range(Top, Bottom).Select
Selection.Sort _
Key1:=Cells(StartRow, DaColumn), _
Order1:=xlAscending, Header:=xlGuess, _
OrderCustom:=1, MatchCase:=False, _
Orientation:=xlTopToBottom
Cells.Select
Selection.Columns.AutoFit
Range("A1").Select
End Sub
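The two procedures above first turn vertical address blocks into horizontal rows and then sort the rows. The overall idea can be sketched in a few lines of Python (not VBA). This is an illustration under stated assumptions, not a translation: a list of strings stands in for the worksheet column, a blank entry separates addresses, and the function names are made up.

```python
def addresses_to_rows(lines: list) -> list:
    """Group consecutive non-blank lines into one row per address."""
    rows = []
    current = []
    for line in lines:
        if line.strip() == "":
            if current:            # a blank line closes the current address
                rows.append(current)
                current = []
        else:
            current.append(line.strip())
    if current:                    # the last address may lack a trailing blank
        rows.append(current)
    return rows


def sort_addresses(rows: list) -> list:
    """Sort rows by their first field, ignoring case."""
    return sorted(rows, key=lambda row: row[0].lower())


column = [
    "Ted Smith", "12 Elm St.", "Springfield",
    "",
    "alice Jones", "9 Oak Ave.", "Shelbyville",
    "",
    "Bob Brown", "3 Pine Rd.",
]
for row in sort_addresses(addresses_to_rows(column)):
    print(row)
```

The sort key here is the first field of each row compared without regard to case, which corresponds to sorting on the first column with MatchCase set to False.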
6.0.51
Code in Module1
Sub StartToSort()
frmSortem.Show
End Sub
6.0.52
The button code for Sheet2
Private Sub CommandButton1_Click()
Dim DaStartRow As Integer
Dim DaStopRow As Integer
Dim DaMainColumn As Integer
Dim OurDataSheet As String
OurDataSheet = "sheet2"
DaStartRow = 3
DaMainColumn = 2
DaStopRow = 0
Call MakeAddressesHorizontal(DaStartRow, _
DaStopRow, DaMainColumn, OurDataSheet)
Call SortHorizontalAddresses(DaStartRow, _
DaStopRow, DaMainColumn, OurDataSheet)
End Sub
6.0.53
The button code for Sheet1
Private Sub CommandButton1_Click()
StartToSort
End Sub
6.0.54
Code for the Form ("frmSortem") buttons
Private Sub cmdCancel_Click()
Unload frmSortem
End Sub
6.0.55
Code for the Form ("frmSortem") buttons (con't.)
Private Sub cmdOK_Click()
Dim DaSheetName As String
Dim DaStartRow As Integer
Dim DaMainColumn As Integer
Dim DaStopRow As Integer
DaSheetName = txtSheetName.Value
DaStartRow = txtStartRow.Value
DaMainColumn = txtStartColumn.Value
Application.ScreenUpdating = False
Sheets(DaSheetName).Select
Call MakeAddressesHorizontal(DaStartRow, _
DaStopRow, DaMainColumn, DaSheetName)
Call SortHorizontalAddresses(DaStartRow, _
DaStopRow, DaMainColumn, DaSheetName)
End Sub
6.0.56
VB List Boxes
To read values into a list box you must point to the values in the "Initialize" part of the code. You would do this to provide a list of options a user
may then select from.
Select or activate the sheet containing those values before showing the
dialog box or user form.
6.0.57
VB List Boxes con't.
The code to assign the selection(s) to variables should be placed in the
"Click" part of the code.
In the design mode, right click on the "OK" or some such button and
select "View Code." This is where variables are assigned values after a
button is clicked by the user.
You can call other subs from anywhere in the file to run from a form.
However, they must be called from the "Click" part of the code to be
recognized.
6.0.58
VB List Boxes Tips
Scroll bars appear/disappear automatically depending on the size of the
box and the values read into the box (e.g., if you have a list of 100 items
and design the box to be 2 inches long, a vertical scroll bar will appear).
Set "Integral Height" to false on the properties of a text box if you want
the size of the box to remain fixed.
6.0.59
Stepping through Code
Helpful for debugging
From the VB editor, select
Tools | Macro | Step Into
6.0.60
General tips
Use the macro recorder and then add control statements and variables.
Document all code thoroughly so that you can follow what has been done,
if it is not already obvious.
Use meaningful variable (and procedure) names (e.g., "Year1" is a good
name for a starting year, but "A" is not).
Type all code in lower case so VB can automatically capitalize letters to
better ensure you do not have typographical errors.
6.1
First Steps
Visual Basic for Applications (VBA) is the macro language for Excel. It closely
resembles Visual Basic, an independent language from Microsoft, and is also used
as the macro language for Microsoft Access and Word. In what follows, we will
be talking for the most part about Visual Basic for Applications as it applies to
Excel. We will feel free to call it VBA, EVB, Visual Basic, VB, etc., so long as
the context makes confusion unlikely.
Macros consist of one or more VBA code chunks. These code chunks
(procedures) are either functions or subroutines. Here are some simple examples.
' Here is a simple function.
Function bob(x)
bob = x ^ 2 + 3.34
End Function
' Here is a simple VBA sub.
Sub ted()
MsgBox "Hello, world!"
End Sub
6.2
Second Steps
6.2.1
Recording Macros
Macros (VBA procedures) can be recorded. Use Record Macro under the Tools
menu. After selecting Record New Macro, you will be prompted for the name
of this new macro. Either give it a new name, or accept the default. A small
window will then appear with a stop button in it. With the recorder running,
perform some action in the workbook as usual, e.g., copy one range of cells to
another place. When you are done, stop the macro recorder by clicking the stop
button. In sum, there is a four-step process to record a macro:
1. Start the macro recorder.
Do this by selecting the menu: Tools / Record Macro / Record New
Macro.
2. Name the macro.
You will be prompted for a name and may accept the default presented
by Excel, e.g., Macro1. Once you have done this, a window appears with
a button for stopping the recording of the macro.
3. Record the macro by performing normal activities in the workbook.
It is wise to plan these out before starting to record.
4. Stop recording the macro.
Do this by clicking the stop macro button.
This creates VBA code in a (usually new) module sheet, which Excel will
call Module1 or some such thing. Module sheets reside with the other sheets of
the workbook. As with the other sheets, you click on the tab to view the module
sheet. When it appears, you will see VBA code against a blank background.
While worksheets present spreadsheets (arrays of cells), macro sheets present
text editors. Thus, you can examine and edit the VBA code.
Notice, in particular, a couple of things with regard to your new macro
module sheet. First, macro sheets come with a context-sensitive text editor.
For example, comments (lines beginning with an apostrophe) come out green (by
default) and reserved words come out blue and get capitalized automatically.
Second, the new macro that you just recorded is a Sub, rather than a Function.
Recording macros and examining the results is a good way of learning about
VB, but it takes you only so far. We need to go further.
6.2.2
In order to run (or execute) a Sub macro, including macros created with the
Record New Macro facility in Excel, one can choose to assign the macro to a
graphic object that can call the macro. Assigning a macro to a button or graphic
object is easy. For a previously-existing object, select it (e.g., hold down the
Ctrl key and click on the button or graphic object), then choose Assign Macro...
from the Tools menu. You will be prompted with a list of existing Subs and you
make your choice from the list. That done, you may now simply click on the
graphic object and Excel will call the macro and cause it to be executed.
Note: Typically, you will want to create a new button and assign the macro
to it. Use Create Button from the Drawing icon and draw a new button. Excel
will automatically prompt you to assign a macro.
6.2.3
VBA functions return values (one value each), but cannot take actions otherwise.
VBA subs (subroutines) do not return values, but can take actions. (However,
VBA subs can set the values of variables and these variables may be accessed
by other procedures.) VBA functions, once defined in a macro sheet, may be
used in worksheet cells just as any of the functions Excel has built into it.
Functions and subs may call one another, thus you may create very complex
programs in VBA. We will discuss that later. First things first. Now, let's look
at variables in VBA.
6.3
Variables
6.3.1
The Very Basics of Variables
Sub variableExample1()
    ' Assign the number 3 to the variable, MyVariable
    ' Note: You make up your own, mnemonic,
    ' names for your variables
    MyVariable = 3
    For i = 1 To MyVariable
        MsgBox "Showing and counting: " & i
    Next i
End Sub
The two variables are: MyVariable and i.
Such program variables are used extensively in this sort of programming.
Variables hold values and their values may change during program execution.
Basically, you make computations and assign the results to variables. Then you
make new computations, based on the assigned values of these variables, and
you assign the results to other variables. And on and on.
6.3.2
Some variables are for holding numbers, some for text, some for dates, and so
on. VBA has a special type of variable, called the variant type. It can hold
about anything, but in general you should avoid being so loose.
The main data types in VB are
1. Boolean. Values: True or False
2. Integer. Values: -32,768 to 32,767
3. Long (integer). Values: -2,147,483,648 to 2,147,483,647
4. Single (single precision floating point). Values: [lots]
5. Double. Values: [lots more than singles]
6. Currency. Values: [lots]
7. Date. Values: January 1, 0100 through December 31, 9999
8. String. Values: 0 through 65,535 characters
9. Variant. Values: Any numeric value thru Double or any character text
You set the data type of a VB variable by declaring it. But, if you don't
declare the data type for a variable (as in the variableExample1 procedure,
above), then the default is that the variable is of type variant.
Within a procedure, you may declare variables with the Dim (dimension)
statement.
' Now here's variableExample1 again, but
' with the variables properly declared
Sub variableExample2()
    ' Assign the number 3 to the variable, MyVariable
    ' Note: You make up your own, mnemonic,
    ' names for your variables
    Dim MyVariable As Integer
    Dim I As Integer
    MyVariable = 3
    For I = 1 To MyVariable
        MsgBox "Showing and counting: " & I
    Next I
End Sub
6.3.3
6.3.4
6.3.5
6.4
Boolean Operators
Often we have to test for the truth or falsity of an expression, for example
MyVar > 7.3
will be true if MyVar has a value that is greater than 7.3. If its value is less
than or equal to 7.3 the expression will be false. Note: If MyVar is Null, then the
expression evaluates to Null. See comparison operators. This greatly complicates
things and in these notes, I'll ignore the question of nulls.
So, expressions may be either true or false, in which case we say they have
truth values. Expressions having truth values may be combined using Boolean
operators to yield larger expressions, which also have truth values. The Boolean
operators available in VB are: And, Or, and Not.
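Each of these operators has a characteristic truth table, as follows.

expression1   expression2   (expression1 And expression2)
T             T             T
T             F             F
F             T             F
F             F             F

expression1   expression2   (expression1 Or expression2)
T             T             T
T             F             T
F             T             T
F             F             F

expression   (Not expression)
T            F
F            T

Interestingly, many other Boolean (truth-functional) operators are possible.
That is, many other truth tables are possible. But these three suffice, and And
can itself be defined in terms of Not and Or, as Table 6.4 shows.

exp1   exp2   Not ((Not exp1) Or (Not exp2))   (exp1 And exp2)
T      T      T                                T
T      F      F                                F
F      T      F                                F
F      F      F                                F
Table 6.4: Truth Table Showing Definition of And in terms of Not and Or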
Can you think of a single Boolean operator that is by itself sufficient?
So, we often need Boolean combinations of statements (or expressions) in
programming. The bottom line is that And, Or, and Not are sufficient for
expressing anything we can possibly express in this way.
6.5
Control Structures
There are several of these in Visual Basic, and we'll look at a few of them. (And
you should search the online help under "control structures.") We have already
seen one, the For...Next statement.
6.5.1
For...Next
We've already seen this in action (above). The general structure for a For...Next
statement is:
For <counter> = <start> To <end> [Step <increment>]
[statements]
Next [<counter>]
Note: Items in square brackets, [...], are optional. Capitalized items are
required parts of the statement. Items between left and right angle brackets,
<...>, are required to be filled in by the programmer. Thus, valid examples for
the For...Next statement include the following.
For I = 1 To 3
    MsgBox "Hello, world!"
Next
Better style is to do this:
For I = 1 To 3
    MsgBox "Hello, world!"
Next I
6.5.2
If...Then...
6.5.3
If...Then...Else
6.5.4
Select Case
6.5.5
Do...Loop
There are really two forms of Do...Loop: condition-at-the-top and condition-at-the-bottom. Here they are:
Do {While | Until} <condition>
[statements]
Loop
and
Do
[statements]
Loop {While | Until} <condition>
where
{While | Until}
gets unpacked as either While or Until. While <condition> means so long
as the condition is true, and Until <condition> means until the condition
is true. The difference between the condition-at-the-top and the condition-at-the-bottom versions lies mainly in that the condition-at-the-bottom version is
guaranteed to execute its [statements] at least once.
6.5.6
Exiting a Loop
6.6
Arrays
Arrays in VB should not be confused with arrays and array commands in Excel,
even though Excel's terminology invites this. All standard third-generation programming languages support arrays, and programs in these languages typically
rely a lot on arrays. Arrays are rather like vectors and matrices in mathematics.
A one-dimensional array is an ordered collection of values, rather like a vector,
which you can access (store or retrieve values) by position. Here's a simple
example.
' From "Code Module5" of vbtutor.xls
Sub arraytester1()
Dim I, MyFirstArray(1 To 6) As Integer
6.7
Miscellaneous Topics
Now we'll discuss a list of useful things (methods and tricks) that didn't
fit easily in the previous discussion.
6.7.1
Constants
Constants are like variables, except that they don't change. You use constants
in order to improve the readability of your program and to help reduce errors.
For example, if the maximum number of students in a classroom is 132, and
you need this value a lot in your program, then you might want to consider
declaring a constant. You might do this:
6.7.2
Suppose you wish to copy one worksheet range to another worksheet range. You
can do this in Excel VBA with the copy method. For example:
Sub copytest1()
    ' Suppose "carol" is the range B3:C4 on Sheet3 and
    ' "alice" is E4:F5 on Sheet3.
    ' The following works:
    Worksheets("Sheet3").Range("carol").Copy _
        destination:=Worksheets("Sheet3").Range("alice")
    ' And so does this:
    Worksheets("Sheet3").Range("carol").Copy _
        destination:=Worksheets("Sheet3").Cells(9, 9)
    ' and so does this:
    Worksheets("Sheet3").Range("b3:c4").Copy _
        destination:=Worksheets("Sheet3").Cells(10, 2)
End Sub
6.7.3
Suppose the name denise refers to a range consisting of a single column. Then
Range("denise").Cells(1).Value refers to the value in the topmost cell in
the range.
Sub democells1()
x = Range("denise").Cells(1).Value
y = Range("denise").Cells(2).Value
MsgBox "x = " & x & " and y = " & y
End Sub
6.7.4
See the sort method. In Excel VBA you can direct the sorting of a worksheet
range. For example, the following subroutine sorts the range, DaRange, in worksheet, DaWorkSheet, on the column, DaColumn, in descending order.
Sub sort()
Worksheets("DaWorkSheet").Range("DaRange").sort _
key1:=Range("DaColumn"), order1:=xlDescending
End Sub
6.7.5
6.7.6
Very straightforward. See the bob function at the start of this appendix. Then,
here's an example.
Function bobagain(x)
bobagain = bob(x) * bob(x)
End Function
6.7.7
Selecting Ranges
6.7.8
The VBA/Excel function Month returns the number (integer) corresponding
to the month in the date passed to the function. This could be used to figure out
when the months change in the data.
Sub DaMonth()
'Assume that cell A5 on Sheet1
'contains a value that is a date,
'e.g., 4/15/98.
'The month function will return
'an integer corresponding to the
'month in the date.
Dim OurMonth As Integer
OurMonth = Month(Worksheets("sheet1").Range("a5").Value)
MsgBox "And the month is " & OurMonth & "."
End Sub
6.7.9
6.8
Bibliographic Note
6.9
Version Notes
Chapter 7
VBA Exercises
1. Consider the following code, Sub Question9.
Sub Question9()
    Dim I As Integer
    Dim J As Integer
    For I = 5 To 1 Step -1
        For J = 2 To 6 Step 2
            Worksheets("Question9").Cells(J, I).Value = (I / J) ^ 3
        Next J
    Next I
End Sub
Assuming that the worksheet "Question9" is empty before the Sub Question9 is executed, what is in cell D4 after Sub Question9 is executed?
(a) D4 is empty
(b) 15.625
(c) 1
(d) 0.125
(e) 8
2. Suppose that the worksheet "Question10" has the numbers 1, 2 and 3 in
the range A1:C1 (i.e., A1 has a 1 in it, B1 has a 2 in it, C1 has 3) and
2, 4, and 6 in the range A2:C2 (i.e., A2 = 2, B2 = 4, C2 = 6). Suppose
further that otherwise the worksheet is empty. What is in cell A5 after
the Sub Question10, below, is executed?
Sub Question10()
Dim I As Integer
Dim Huh As Double
      A     B
 1    4     1
 2    3     0
 3    2    -1
 4    1    -2
 5    0    -3
 6   -1    -4
 7   -2    -5
 8   -3    -6
 9   -4    -7
10   -5    -8
11   -6    -9
12   -7   -10
    Do
        J = J + 1
        L = Worksheets("Question16").Cells(J, 1).Value
        K = Worksheets("Question16").Cells(J, 2).Value
    Loop Until L + K < 2
    For L = 1 To J
        Worksheets("Question16").Cells(L, 3).Value = _
            Worksheets("Question16").Cells(L, 1).Value
        Worksheets("Question16").Cells(L, 4).Value = _
            Worksheets("Question16").Cells(L, 2).Value
    Next L
    For L = J + 1 To I
        Worksheets("Question16").Cells(L, 3).Value = _
            Worksheets("Question16").Cells(L, 2).Value
        Worksheets("Question16").Cells(L, 4).Value = _
            Worksheets("Question16").Cells(L, 1).Value
    Next L
End Sub
(a) 2
(b) -1
(c) 4
(d) 0
(e) 1
9. Suppose the worksheet "Question16" has the values in the indicated cells
in Figure 7.1 (above, question 5, e.g., B2 = 0) and is otherwise empty.
What is in cell D12 after the Sub Question16, above, is executed?
(a) -6
(b) -9
(c) The cell is empty
(d) -7
(e) -10
10. Excel supports text box validation on Dialog boxes. Which data type does
it NOT support for this purpose?
(a) Reference
(b) Text
(c) Number
(d) Integer
(e) Date
11. Excel supports a number of controls on Dialog boxes. Which control type
does it NOT support for this purpose?
(a) Text Box
(b) Check Box
(c) Scroll Bar
(d) Slider
(e) Spinner
12. TBA
13. TBA
14. TBA
     A   B   C   D   E
1    1   1   0   1   0
2    1   0   1   0   1
(b)
(c)
(d)
     A   B   C   D   E
4    1   0   1   0   1
5    1   1   0   1   0

     A   B   C   D   E
4    1   1   1   0   1
5    1   0   0   1   0

     A   B   C   D   E
4    1   1   0   0   1
5    1   0   1   1   0

     A   B   C   D   E
4    1   1   0   1   1
5    1   0   1   0   0
Sub Question15()
    Dim Top(1 To 5) As Integer
    Dim Bot(1 To 5) As Integer
    Dim ResultTop(1 To 5) As Integer
    Dim ResultBot(1 To 5) As Integer
    Dim I As Integer
    Dim J As Integer
    J = 0
    For I = 1 To 5
        Top(I) = Worksheets("Question15").Cells(1, I).Value
        Bot(I) = Worksheets("Question15").Cells(2, I).Value
        J = J + Top(I) + Bot(I)
    Next I
    J = (J Mod 4) + 1
    For I = 1 To J
        ResultTop(I) = Top(I)
        ResultBot(I) = Bot(I)
    Next I
    For I = J + 1 To 5
        ResultTop(I) = Bot(I)
        ResultBot(I) = Top(I)
    Next I
    For I = 1 To 5
        Worksheets("Question15").Cells(4, I).Value = ResultTop(I)
        Worksheets("Question15").Cells(5, I).Value = ResultBot(I)
    Next I
End Sub
Figure 7.2: Code for Question 15
This example uses the Mod operator to divide two numbers and
return only the remainder. If either number is a floating-point
number, it is first rounded to an integer.
Dim MyResult
MyResult = 10 Mod 5    ' Returns 0.
MyResult = 10 Mod 3    ' Returns 1.
MyResult = 12 Mod 4.3  ' Returns 0.
MyResult = 12.6 Mod 5  ' Returns 3.
Figure 7.3: Definitional information on Mod from Microsoft's online help for VBA
16. Suppose that worksheet "Question15" holds input data as indicated in
Table 7.1 (page 98), and that the sub Question16 (Figure 7.4, page 100)
is executed. What number appears when the resulting message box (from
the MsgBox J command in the code) is displayed?
(a) 3
(b) 21
(c) 6
(d) 10101
(e) 91
Sub Question16()
    Dim Bot(1 To 5) As Integer
    Dim I As Integer
    Dim J As Integer
    J = 0
    For I = 1 To 5
        Bot(I) = Worksheets("Question15").Cells(2, I).Value
    Next I
    For I = 1 To 5
        J = J + (Bot(6 - I) * 3 ^ (I - 1))
    Next I
    MsgBox J
End Sub
Figure 7.4: Code for Question 16
     A   B   C   D   E
1    1   0   1   1   0
2    1   0   1   0   1

A   B   C   D   E   F
0   1   1   1   1   1

A   B   C   D   E   F
0   1   0   1   1   1

A   B   C   D   E   F
1   0   1   1   1   1

A   B   C   D   E   F
1   0   1   0   1   1

A   B   C   D   E   F
1   1   1   1   1   1

Exp1   Exp2   Exp1 Or Exp2
1      1      1
1      0      1
0      1      1
0      0      0
Sub Question17()
    Dim Top(1 To 5) As Integer
    Dim Bot(1 To 5) As Integer
    Dim Result(1 To 6) As Integer
    Dim I As Integer
    Dim J As Integer
    Dim Carry As Integer
    Carry = 0
    For I = 1 To 5
        Top(I) = Worksheets("Question17").Cells(1, I).Value
        Bot(I) = Worksheets("Question17").Cells(2, I).Value
    Next I
    For I = 5 To 1 Step -1
        Result(I + 1) = (Top(I) Or Bot(I)) Or Carry
        Carry = (Top(I) And Bot(I)) And Carry
    Next I
    Result(1) = Carry
    For I = 1 To 6
        Worksheets("Question17").Cells(4, I).Value = Result(I)
    Next I
End Sub
Figure 7.5: Code for Question 17
18. Suppose you want to have a function that, when passed a one-dimensional
array holding floating point numbers, would return the largest number in
the array. For example, suppose this function is called Question18 and is
called from the following sub:
Sub TestQ18()
Dim Vector(1 To 4) As Double
Vector(1) = 5
Vector(2) = 17
Vector(3) = 34
Vector(4) = 4
Dim Result As Double
Result = Question18(Vector)
MsgBox Result
End Sub
The value displayed after executing the MsgBox Result command would
be 34.
Suppose, further, that someone has provided part of the required function,
Question18, and it is displayed (with absent code indicated) in Figure
7.6, page 105.
Function Question18(Vector) As Double
Dim Lower As Integer
Dim Upper As Integer
Dim I As Integer
Dim MyMax As Double
Lower = LBound(Vector)
Upper = UBound(Vector)
**** Missing line(s) of VBA code go here. ****
Question18 = MyMax
End Function
Figure 7.6: Code for Question 18
Figure 7.7: Illustration of Pythagorean theorem. The triangle has three sides:
$a$, $b$, $c$. The angle between sides $a$ and $b$ is $90^\circ$. The lengths of the sides are as
follows: length $a = \|a\|$, length $b = \|b\|$, length $c = \|c\|$. According, then, to
the Pythagorean theorem, $\|c\|^2 = \|a\|^2 + \|b\|^2$.
19. Suppose that we need a function, to be called EDistance2D, for calculating
the (Euclidean) distance between two points in a plane (hence "2D").
Let the points be $(x_1, x_2)$ and $(y_1, y_2)$ (these are coordinates in the 2-dimensional plane, e.g., $(4, 5.6)$). The function is to be given these four
numbers and is to return the distance between the corresponding two
points.
It may be helpful to recall the Pythagorean theorem, as it pertains to the
hypotenuse of a right ($90^\circ$) triangle. See Figure 7.7, page 106.
It will also be helpful to recall that Sqr is the square root function in
VBA. Also, you have been provided with a partial answer to the question,
in the form of a code template. See Figure 7.8, page 107.
Function EDistance2D(x1, x2, y1, y2) As Double
'The two points are (x1,x2) and (y1,y2)
**** Missing line(s) of VBA code go here. ****
End Function
Figure 7.8: Code template for Question 19
By way of answering Question 19, which one of the following VBA code
segments is best for filling in the missing line of the code in Figure 7.8,
page 107?
(a) Sqr(EDistance2D) = (x1 - y1) ^ 2 + (x2 - y2) ^ 2
(b) EDistance2D = Sqr((x1 - y1) ^ 2 + (x2 - y2) ^ 2)
(c) EDistance2D = Sqr((x1 - y2) ^ 2 + (y1 - x2) ^ 2)
(d) Sqr(EDistance2D) = (x1 - x2) ^ 2 + (y1 - y2) ^ 2
(e) EDistance2D = Sqr((x1 - x2) ^ 2 + (y1 - y2) ^ 2)
Chapter 8
This chunk was extracted from a Web page, which was downloaded from the
8.2
The general idea is to have a language in which we can express patterns (of
strings or text), which can be matched automatically to a given string (or text).
This is a familiar idea to all those who use word processing and are at least
acquainted with using computers. In any Web browser, in Word, and in many
other programs if you type ctrl-f you will get a dialog box that asks you what
you want to find. You type in a sequence of characters (e.g., theory), click the
appropriate button, and the program finds the first exact match to your search
string (if there is a match). All this is very helpful, but:
1. Once you find your pattern, you want your program to do something useful
with it.
Browsers and word processors normally just find things, then require you
the user to take any required actions.
2. What if you aren't exactly sure of the pattern you are seeking, so that a
strictly literal match won't do?
Example: You don't know whether the name is "Smith" or "Smyth", or
perhaps "color" might be spelled "colour", or perhaps you think the word
might be misspelled, or the target text will change over time, but within
a predictable pattern. As we saw above, the exact value of a stock index
will vary, yet we can expect it to fit within a stable pattern.
Regular expressions are a device for handling both of these problems. We'll
discuss the two problems in the context of REs in the next two subsections.
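As a first taste of problem 2, here is a minimal sketch of matching the uncertain spellings mentioned above. The sample sentence is made up for illustration; the only library calls used are re.findall and ordinary RE syntax (a character set and the optional-item marker "?").

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Mr. Smyth painted the wall a new colour. Ms. Smith chose the color."

# [iy] matches either 'i' or 'y'; 'u?' makes the 'u' optional.
names = re.findall(r"Sm[iy]th", text)
colors = re.findall(r"colou?r", text)

print(names)   # ['Smyth', 'Smith']
print(colors)  # ['colour', 'color']
```

A literal ctrl-f search would need one query per spelling; here a single pattern covers both.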
8.2.1
1. >>> import re
2. >>> data = open(r'c:\day\attfragment.html','r').read()
3. >>> len(data)
4. 1038
5. >>> sandpmatch = re.search(r'S&P 500',data,re.IGNORECASE)
6. >>> sandpmatch
7. <SRE_Match object at 00F12ED0>
8. >>> sandpmatch.span()
9. (681, 692)
10. >>> sandp = data[681:688]
11. >>> sandp
12. 'S&P'
Figure 8.1: Simple re.search Example
Exposition on the lines in the figure:
1. We begin by importing (making available) Python's re module (for regular
expressions).
2. The file we want, holding the target text, is attfragment.html, residing
on the C drive under the day directory. In this line we open the file for
reading, and we read() its contents into our variable data, which now
holds a string corresponding to the entire contents of the target file. In
open, the r in r'c:\day... stands for 'raw'. In raw mode the backslashes
are taken literally. Normally the backslash is an escape character (more
on this shortly) and if we actually want a backslash we have to escape it
with another backslash. Instead of r'c:\day... we would have 'c:\\day...
and so on. Raw mode makes things prettier.
3. Here we use the Python function len to obtain the length in characters
of the string data.
4. Python reports that our file and the corresponding string, data, are 1038
characters long.
5. The search method of the re module takes three arguments: a query
string as a regular expression, the target string, and (optionally) flags to
guide the query. The flag name re.IGNORECASE is itself case-sensitive (it must
be typed in capitals), but it tells the
query engine to match regardless of case. Here, this means that 'S&P
500' would match to 's&p 500'.
search pretty much does what its name suggests: it looks through the
target string to find the first match to the query string. search returns a
match object or None, depending on whether it finds a match or not. Note
this unsuccessful search:
>>> sandpmatch2 = re.search(r'S&Q 500',data,re.IGNORECASE)
>>> sandpmatch2
>>> sandpmatch2 == None
1
>>>
6. In line 6 of the figure we ask for the value of sandpmatch after a successful
search.
7. In line 7 Python tells us the search was successful.
8. Given that sandpmatch != None (i.e., that its search was successful),
there is some place that it found the match. Python's span method for
match objects gives us where the match begins and where the first character after the match is.
9. Python tells us that the match begins at character 681 and continues
through character 691. Character 692 is the first character after the match.
10. The slice data[681:688] gives us the 7 characters of data beginning with
character 681.
11. We ask Python for what was returned from the slice...
12. ...and Python tells us (surprise, surprise) that it is the string "S&P".
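The steps in this exposition can be tried without the original file. The following is a minimal sketch in current Python, assuming a short made-up string in place of the contents of attfragment.html; it shows how the pair returned by span lines up with slicing, and how an unsuccessful search is detected.

```python
import re

# Hypothetical stand-in for the file contents read in the figure.
data = "<td><p>Dow 10,000</p></td>\n<td><p>S&P 500</p></td>"

match = re.search(r"S&P 500", data, re.IGNORECASE)
if match is not None:
    start, end = match.span()         # end is the index just past the match
    found = data[start:end]           # slicing with the span recovers the match
    print(found)                      # S&P 500
    print(end - start == len(found))  # True

# An unsuccessful search returns None rather than a match object.
miss = re.search(r"S&Q 500", data, re.IGNORECASE)
print(miss is None)                   # True
```

Testing with "is None" is the usual idiom in current Python; it plays the same role as the comparison with None in the transcript above.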
8.2.2
REs support pattern matching against strings, not just literal matching, as in
Web browsers and as in the example of the previous section. Note that a minor
paradox lurks. We want to use query strings to form patterned (nonliteral)
matches against target strings. How does the matching program "know" which
strings are to be taken literally and which are to be taken otherwise? The
solution principle is obvious: some characters are to be understood as special.
They are metacharacters and are not taken literally.
Using metacharacters for pattern matching is familiar to most computer
users. When at the command prompt one types
C:\> dir *.xls
you are requesting a list of files in the current directory, ending in ".xls". The
asterisk, "*", is a metacharacter (in this context). It means "match to 0 or
more characters of any kind." So, dir *.xls means "Display all files ending in
'.xls', no matter what comes earlier." Ask yourself: What does dir e*e.xls
mean?
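The command-prompt wildcard can be imitated in Python, which makes the parallel with regular expressions concrete. This is a sketch under stated assumptions: the file names are invented, fnmatch is Python's standard module for shell-style wildcards, and the regular expression shown is one way (not the only way) to write the same request.

```python
import fnmatch
import re

# Hypothetical file names, invented for this illustration.
names = ["budget.xls", "notes.txt", "q1.xls", "readme"]

# Shell-style wildcard: "*" means "0 or more characters of any kind".
by_wildcard = fnmatch.filter(names, "*.xls")
print(by_wildcard)  # ['budget.xls', 'q1.xls']

# A regular expression for the same request: ".*" plays the role of the
# shell's "*", and the literal dot must be escaped as "\.".
pattern = re.compile(r".*\.xls$")
by_regex = [n for n in names if pattern.match(n)]
print(by_regex)     # ['budget.xls', 'q1.xls']
```

Note that the same character can be a metacharacter in both notations and still mean different things: the shell's "*" stands alone, while the RE "*" modifies whatever precedes it.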
The language of regular expressions has a short list of metacharacters (a
dozen or so) and a clever syntax that combine to provide a very powerful pattern
matching capability. Practice and examples are required to learn how to use
this capability. In §8.3 we summarize the syntax for purposes of reference and
in §8.4 we continue our discussion of problem 2 (pattern matches) in the context
of Python and regular expressions.
8.3
Python's RE Syntax
The following list is taken more or less directly from the online Python Library
Reference for the re module. We present it here for the sake of convenience
and completeness. (There is additional information online regarding the re
module.) The reader will likely find many of the items below unduly arcane,
and rarely used. Learning just a few of the syntactic elements will suffice for
most purposes. We offer the suggestion that items 1-13 and several of the items
in 25-35 are the most useful.
1. "." (Dot.)
In the default mode, this matches any character except a newline. If the
DOTALL flag has been specified, this matches any character including a
newline.
2. "^" (Caret.)
Matches the start of the string, and in MULTILINE mode also matches
immediately after each newline.
3. "$"
Matches the end of the string, and in MULTILINE mode also matches
before a newline. foo matches both 'foo' and 'foobar', while the regular
expression foo$ matches only 'foo'.
4. "*"
Causes the resulting RE to match 0 or more repetitions of the preceding
RE, as many repetitions as are possible. ab* will match 'a', 'ab', or 'a'
followed by any number of 'b's.
5. "+"
Causes the resulting RE to match 1 or more repetitions of the preceding
RE. ab+ will match 'a' followed by any non-zero number of 'b's; it will
not match just 'a'.
6. "?"
Causes the resulting RE to match 0 or 1 repetitions of the preceding RE.
ab? will match either 'a' or 'ab'.
7. *?, +?, ??
The "*", "+", and "?" qualifiers are all greedy; they match as much
text as possible. Sometimes this behaviour isn't desired; if the RE <.*>
is matched against '<H1>title</H1>', it will match the entire string,
and not just '<H1>'. Adding "?" after the qualifier makes it perform
the match in non-greedy or minimal fashion; as few characters as possible
will be matched. Using .*? in the previous expression will match only
'<H1>'.
8. {m,n}
9. {m,n}?
10. "\"
Either escapes special characters (permitting you to match characters like
"*", "?", and so forth), or signals a special sequence; special sequences
are discussed below.
If you're not using a raw string to express the pattern, remember that
Python also uses the backslash as an escape sequence in string literals;
if the escape sequence isn't recognized by Python's parser, the backslash
and subsequent character are included in the resulting string. However, if
Python would recognize the resulting sequence, the backslash should be
repeated twice. This is complicated and hard to understand, so it's highly
recommended that you use raw strings for all but the simplest expressions.
11. []
Used to indicate a set of characters. Characters can be listed individually,
or a range of characters can be indicated by giving two characters and
separating them by a "-". Special characters are not active inside sets.
For example, [akm$] will match any of the characters "a", "k", "m",
or "$"; [a-z] will match any lowercase letter, and [a-zA-Z0-9] matches
any letter or digit. Character classes such as \w or \S (defined below) are
also acceptable inside a range. If you want to include a "]" or a "-" inside
a set, precede it with a backslash, or place it as the first character. The
pattern []] will match ']', for example.
You can match the characters not within a range by complementing the
set. This is indicated by including a "^" as the first character of the set;
"^" elsewhere will simply match the "^" character. For example, [^5]
will match any character except "5".
12. "|"
A|B, where A and B can be arbitrary REs, creates a regular expression that
will match either A or B. An arbitrary number of REs can be separated
by the "|" in this way. This can be used inside groups (see below) as well.
REs separated by "|" are tried from left to right, and the first one that
13. (...)
Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group; the contents of a group can be retrieved
after a match has been performed, and can be matched later in the string
with the \number special sequence, described below. To match the literals
"(" or ")", use \( or \), or enclose them inside a character class: [(]
[)].
14. (?...)
This is an extension notation (a "?" following a "(" is not meaningful
otherwise). The first character after the "?" determines what the meaning
and further syntax of the construct is. Extensions usually do not create a
new group; (?P<name>...) is the only exception to this rule. Following
are the currently supported extensions.
15. (?iLmsux)
(One or more letters from the set "i", "L", "m", "s", "u", "x".) The
group matches the empty string; the letters set the corresponding flags
(re.I, re.L, re.M, re.S, re.U, re.X) for the entire regular expression. This is useful if you wish to include the flags as part of the regular
expression, instead of passing a flag argument to the compile() function.
Note that the (?x) flag changes how the expression is parsed. It should
be used first in the expression string, or after one or more whitespace
characters. If there are non-whitespace characters before the flag, the
results are undefined.
16. (?:...)
A non-grouping version of regular parentheses. Matches whatever regular
expression is inside the parentheses, but the substring matched by the
group cannot be retrieved after performing a match or referenced later in
the pattern.
17. (?P<name>...)
Similar to regular parentheses, but the substring matched by the group
is accessible via the symbolic group name name. Group names must be
valid Python identifiers. A symbolic group is also a numbered group, just
as if the group were not named. So the group named 'id' in the example
below can also be referenced as the numbered group 1.
For example, if the pattern is (?P<id>[a-zA-Z_]\w*), the group can be
referenced by its name in arguments to methods of match objects, such as
25. \A
Matches only at the start of the string.
26. \b
Matches the empty string, but only at the beginning or end of a word. A
word is defined as a sequence of alphanumeric characters, so the end of a
word is indicated by whitespace or a non-alphanumeric character. Inside a
character range, \b represents the backspace character, for compatibility
with Python's string literals.
27. \B
Matches the empty string, but only when it is not at the beginning or end
of a word.
28. \d
Matches any decimal digit; this is equivalent to the set [0-9].
29. \D
Matches any non-digit character; this is equivalent to the set [^0-9].
30. \s
Matches any whitespace character; this is equivalent to the set
[ \t\n\r\f\v].
31. \S
Matches any non-whitespace character; this is equivalent to the set
[^ \t\n\r\f\v].
32. \w
When the LOCALE and UNICODE flags are not specified, matches any
alphanumeric character; this is equivalent to the set [a-zA-Z0-9_]. With
LOCALE, it will match the set [0-9_] plus whatever characters are defined as letters for the current locale. If UNICODE is set, this will match
the characters [0-9_] plus whatever is classified as alphanumeric in the
Unicode character properties database.
33. \W
When the LOCALE and UNICODE flags are not specified, matches any
non-alphanumeric character; this is equivalent to the set [^a-zA-Z0-9_].
With LOCALE, it will match any character not in the set [0-9_], and
not defined as a letter for the current locale. If UNICODE is set, this will
match anything other than [0-9_] and characters marked as alphanumeric
in the Unicode character properties database.
34. \Z
Matches only at the end of the string.
35. \\
Matches a literal backslash.
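A few of the items above are easier to absorb by running them. The following is a small sketch, with made-up target strings, that exercises items 4 and 7 (greedy versus non-greedy repetition), items 5, 11, and 28 (repetition, character sets, and \d), and items 12, 16, and 26 (alternation, a non-grouping parenthesis, and word boundaries).

```python
import re

html = "<H1>title</H1>"

# Greedy vs. non-greedy repetition.
greedy = re.match(r"<.*>", html).group(0)
lazy = re.match(r"<.*?>", html).group(0)
print(greedy)  # <H1>title</H1>
print(lazy)    # <H1>

# One or more lowercase letters followed by a single decimal digit.
letters_digit = re.findall(r"[a-z]+\d", "ab1 cd22 e3")
print(letters_digit)  # ['ab1', 'cd2', 'e3']

# Either whole word, but not the same letters inside a longer word.
pets = re.findall(r"\b(?:cat|dog)\b", "cat catalog dog dogma")
print(pets)  # ['cat', 'dog']
```

All patterns are written as raw strings, as item 10 recommends, so that the backslashes reach the RE engine unchanged.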
8.4
Here in Figure 8.2 is a Python program using REs that uses pattern matching
to find the value of the S&P 500 index.
1. >>> sandpindex = re.search(r'(S&P 500)(.*?)(\d+\.\d+)',data,\
       re.IGNORECASE|re.DOTALL)
2. >>> sandpindex.group(0)
3. 'S&P 500</p></td>\n<td align="right"><p> 1145.96'
4. >>> sandpindex.group(1)
5. 'S&P 500'
6. >>> sandpindex.group(2)
7. '</p></td>\n<td align="right"><p> '
8. >>> sandpindex.group(3)
9. '1145.96'
10. >>> sandpindex.span(3)
11. (735, 742)
12. >>> data[735:742]
13. '1145.96'
>>>
Figure 8.2: RE Program for Finding S&P 500 Index
Points arising:
1. In line 1, in the RE we use parentheses to create three groups. Reading
from left to right: group 1 is our old friend (S&P 500); group 2,
(.*?), is the junk between group 1 and the index value; and group 3,
(\d+\.\d+), matches the index value. Lines 4-5 show the match to group
1; lines 6-7 show the match to group 2; lines 8-9 show the match to group
3; and since group 0 is the entire match, lines 2-3 show it.
2. Groups 2 and 3 use Python's RE metacharacters and syntax to effect
pattern matching. Group 1, as we saw previously, is an RE that effects
only literal matching.
3. Group 2 = (.*?). (See §8.3 items 1 (page 114), 4 (page 114), and 7 (page
114).) The parentheses define the group. The dot normally means "any character
at all, except a newline," but because the flag re.DOTALL is present
(line 1), even newlines are matched. So, the dot matches any character at
all in this query. The asterisk means "0 or more occurrences matching the
4. Group 3 = (\d+\.\d+). (See §8.3 items 28 (page 118), 5 (page 114), and
35 (page 119).) \d+ means "one or more decimal digits." \. means "one
occurrence of the dot character." Note: The backslash is used to escape
from the normal meaning of the dot character, causing the RE engine to
take it literally.
5. The backslash at the end of the top of line 1 is Python's line continuation
character. Not strictly necessary, it is useful for display purposes.
6. Note the re.IGNORECASE|re.DOTALL construction in line 1. This sets two
flags for the RE. As mentioned earlier, IGNORECASE says to do matching
regardless of upper or lower case, and DOTALL says that the dot should
match every character, including the newline character, \n, which by default
it fails to match.
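To see what the DOTALL flag buys us, the pattern from Figure 8.2 can be run with and without it. This is a sketch in Python 3; the data string is a small made-up stand-in for the downloaded page, built around the fragment shown in lines 2-3 of the figure.

```python
import re

# Stand-in for the downloaded page; only this fragment is assumed.
data = 'Markets: S&P 500</p></td>\n<td align="right"><p> 1145.96</p></td>'

pattern = r'(S&P 500)(.*?)(\d+\.\d+)'

# With DOTALL the dot can cross the newline between the label and the value.
with_dotall = re.search(pattern, data, re.IGNORECASE | re.DOTALL)
index_value = float(with_dotall.group(3))

# Without DOTALL the non-greedy (.*?) cannot get past the newline,
# so there is no match at all and search returns None.
without_dotall = re.search(pattern, data, re.IGNORECASE)

print(index_value)
print(without_dotall)
```

The span of group 3 can be used to slice the value back out of data, just as lines 10-13 of the figure do.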
8.5
8.6
Exercises
8.6.1
Find two sources on the Web that report the NASDAQ levels. As of December
25, 2001, both
http://money.cnn.com/markets/nasdaq.html, and
http://moneycentral.msn.com/investor/research/msnbc/newsnap.asp?symbol=$COMPX
do this.
Write a Python script, to be run from the command line (in script mode),
that grabs the reported value of the NASDAQ, and that prints out (to the
screen):
1. what the two values are,
2. where they came from,
3. what the difference between them is, and
4. the date/time at which these queries were made.
Hints: In addition to using the re module to extract necessary information,
you will find it useful to use the urllib module for downloading Web pages and
the time module for finding the date and time. The following interaction with
Python in interactive mode should be helpful to you in this regard.
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
>>> import urllib
>>> cnnnasdaq = "http://money.cnn.com/markets/nasdaq.html"
>>> cnnpage = urllib.urlopen(cnnnasdaq).read()
>>> len(cnnpage)
26697
>>> cnnfile = open('d:\\cnn.txt','w')
>>> cnnfile.write(cnnpage)
>>> cnnfile.close()
>>> import time
>>> now = time.localtime(time.time())
>>> print time.asctime(now)
Tue Dec 25 14:52:30 2001
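The session above is Python 2. For readers on Python 3, here is a hedged sketch of the same hints: urllib.urlopen has moved to urllib.request.urlopen, and print is now a function. The helper names fetch_page and timestamp are illustrative only. The 2001 URLs may no longer resolve, so fetch_page is defined here but not called.

```python
import time
import urllib.request


def fetch_page(url, timeout=10):
    """Download a Web page and return its contents as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Decoding as UTF-8 with replacement is an assumption about the page.
        return response.read().decode("utf-8", errors="replace")


def timestamp(seconds=None):
    """Return a local date/time string in the style of time.asctime."""
    if seconds is None:
        seconds = time.time()
    return time.asctime(time.localtime(seconds))


print(timestamp())
```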
8.7
if tomatch is None:
    print "tomatch is None."
else:
    print tomatch
mymatch = tomatch.search(data)
if mymatch is None:
    print "It's None"
else:
    print mymatch.group(3)
When run, bob.py produces this output (as it should):
>>>
1038
S&P 500
<SRE_Pattern object at 0CAB5590>
1145.96
>>>
Note:
1. If queries are to be done repeatedly with a regular expression, then compiling
it once and running particular queries many times will be more efficient.
2. Compilation produces an SRE_PatternObject from an RE. Roughly:
SRE_PatternObject = re.compile(RE)
Then, instead of re.search(RE,TargetString) we use
SRE_PatternObject.search(TargetString). The effects are the same.
3. I had trouble with the compilation flags. If re.VERBOSE was present the
search failed to find anything. Moreover, even with triple quoting, spreading
out the RE to be compiled across several lines only produced errors.
But what's above does work.
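Notes 1 to 3 can be illustrated with a short sketch in Python 3. One plausible explanation for the re.VERBOSE trouble in note 3 (an assumption, since the failing pattern is not shown) is that in verbose mode unescaped whitespace in the pattern is ignored, so the space in S&P 500 stops matching unless it is escaped or placed in a character class. The data string is a made-up stand-in for the page.

```python
import re

data = 'S&P 500</p></td>\n<td align="right"><p> 1145.96'

# Compile once, then reuse the pattern object for many searches.
compiled = re.compile(r'(S&P 500)(.*?)(\d+\.\d+)', re.IGNORECASE | re.DOTALL)

# The same RE spread over several lines with re.VERBOSE.
# The literal space is written as [ ] so that verbose mode keeps it.
verbose = re.compile(r"""
    (S&P[ ]500)     # group 1: the literal label
    (.*?)           # group 2: whatever lies between label and value
    (\d+\.\d+)      # group 3: the index value
    """, re.IGNORECASE | re.DOTALL | re.VERBOSE)

# With the space left bare, verbose mode drops it and the pattern
# effectively looks for 'S&P500', which does not occur in data.
bare_space = re.compile(r"(S&P 500) (.*?) (\d+\.\d+)",
                        re.IGNORECASE | re.DOTALL | re.VERBOSE)

print(compiled.search(data).group(3))
print(verbose.search(data).group(3))
print(bare_space.search(data))
```

The third search returning None is consistent with the symptom described in note 3, though other causes cannot be ruled out from the text.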
File: text-pattern.tex
Chapter 9
Programming Excel
Microsoft's COM (Component Object Model) is a standard by which so-called
client programs can manipulate other programs called server programs.1 In
the lingo of the trade, a COM server (server program) "exposes its objects" in
accordance with the COM standard so that client programs can get the server to
do things for them. Client programs, of course, must know how to communicate
with COM objects.
Excel, Access, Word, PowerPoint and the other parts of the Microsoft Office
suite all support and conform to COM. That is, each of these programs is capable
of being a COM server. Moreover, the VBA built into them is COM-aware so
that VBA programs can be COM clients. In fact, an Excel (Access, Word,. . . )
macro is a COM client that uses the Excel (Access, Word,. . . ) host as a COM
server.
From the perspective of a COM client, a COM server program has a certain
look and feel. A server is a hierarchical collection of objects, each object having
its own properties. Each application, of course, has its own characteristic objects
and hierarchy, its own object model. Once you know the object model for a
server, you can write a client-side program to manipulate the server. What this
amounts to is simply changing the properties of the server's objects.
This is in principle a very elegant design. When you work interactively with
Excel, manipulating it in the end-user programming style (using the graphical interface), you are essentially just creating and deleting objects (such as
worksheets in a workbook), and changing properties of objects. To enter a new
number in a cell (object) is just to change the cell's Value property. Coloring
and other formatting operations should be understood similarly. Here by way
of example is a line of Excel VBA code:
1 Different names abound and create confusion. Names have come and gone, including
OLE, Automation, DCOM, COM+, and ActiveX. The programming community has more or
less settled on `COM' as the name of this evolving technology standard. The following passage
from a recent Microsoft publication indicates that Microsoft may have acceded to common
practice. "The key technology that makes individual Office applications programmable and
makes creating an integrated Office solution possible is the Component Object Model (COM)
technology known as automation" [15, page 37]. We'll stick with `COM'.
Worksheets("Sheet1").Range("A1").Interior.ColorIndex = 3
This is a simple assignment statement. It assigns the value 3 to the stuff on the
left. Interpreted, what's going on is 3 is assigned to the ColorIndex property of
the Interior (object), which is part of the A1 Range object (the northwest-most
cell), which is part of the Sheet1 Worksheet object. Since 3 is Excel's code for
the color red, execution of this statement causes the A1 cell on Sheet1 to turn
red. Similarly,
Worksheets("Sheet1").Range("A1").Value = "Yo, world!"
is VBA code for putting "Yo, world!" in A1, i.e., setting the Value property of
A1 to "Yo, world!".
We're now going to discuss in more detail how to manipulate Excel from
Python, instead of from VBA. As we will see, however, the difference is slight.
This is a consequence of using COM. Once you've learned it from Python, you've
learned it from VBA, and vice versa.
9.1
Our focus will be entirely on using Python for COM client-side programming.
That is, our Python scripts will manipulate the exposed objects of a COM
server, here Microsoft Excel. Python can also be used to create a COM server
(which could be manipulated by an Excel VBA client, among others!). The
additional steps to do this are really minimal. We shall forbear in part because
security considerations prohibit this kind of programming on machines in public
labs. The interested reader might consult [9] or the Website
http://starship.python.net/crew/mhammond/ppw32/
and proceed cheerfully at home.
Here's the obligatory "Hello, world" program, with Python the client and
Excel the COM server. And to liven things up, we even turn the A1 cell red.
(Line numbers have been added.)
Python 2.1.1 (#20, Jul 20 2001, 01:19:29) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
1.>>> import win32com.client
2.>>> xl = win32com.client.Dispatch('Excel.Application')
3.>>> xl.Visible = 1
4.>>> xl.Workbooks.Add()
5.<COMObject Add>
6.>>> xl.Worksheets("Sheet1").Range("A1").Value = "Yo, world!"
7.>>> xl.Worksheets("Sheet1").Range("A1").Interior.ColorIndex = 3
What's going on here is really quite simple. Line 1 imports the Win32 COM
client module. Making Python COM-aware boils down to importing the right
module. That's it. Now we can do things. The first thing we do, in line 2, is to
launch ("dispatch" is the technical term) a new Excel process. If we wanted to dispatch
Word we would say "Word.Application" instead of "Excel.Application",
and so on. Note that xl is a Python variable, which now refers to an instance of
an Excel COM object. (Try typing type(xl) at the Python prompt.) We could
have used any other valid Python variable name. xl was chosen for purely mnemonic
reasons. Think how confusing it would have been to use word.
By default Excel is dispatched in the background, so you can't see it. In
line 3 we set the Visible property of the dispatched Excel application instance,
xl, to true, i.e., to 1. After Python executes this line you will see Excel come
alive but without any worksheets. We have to add them and we do in line 4.
Here, the interpretation is slightly new. Add() is not a property of Workbooks,
it is a method (note the parentheses), a program associated with the object
Workbooks, which now gets executed. The effect is that a new workbook, by
default Book1 with 3 worksheets, is added to the collection of workbooks in xl.
If we execute this line again, Book2 with 3 worksheets is added. We'll stick with
Book1 for now.
Line 6 is our "Hello, world" program. I wrote it by typing xl. and then
copying
Worksheets("Sheet1").Range("A1").Value = "Yo, world!"
from the VBA program discussed above. I did the same thing in line 7. The
effects are the same as in the VBA program, and for the same reasons. The
pattern is evident and straightforward. You program Excel from Python or
VBA or any other environment supporting COM clients by identifying the Excel
object you want to affect and then either assigning a value to one of its properties
or identifying and executing one of its methods. That's essentially it.
The rest is details. The big question now is "How do I learn about the Excel
object model?" We'll cover what you most need here. You can also explore
with the Object Browser in Excel (View then Object Browser from the VBA
development environment in Excel). The best documentation I have found is
in [15], but note the blurb on the cover: "The hard-core programming guide to
Microsoft Office XP development." (You don't need it.)
Perhaps the easiest way to get a handle on the Excel object model is to
record VBA macros in Excel and examine them. Here's one, for copying from
the range B2:C3 to B5:C6.
Sub Macro4()
'
' Macro4 Macro
' Macro recorded 12/26/2001 by Steven O. Kimbrough
'
    Range("B2:C3").Select
    Selection.Copy
    Range("B5").Select
    ActiveSheet.Paste
    Application.CutCopyMode = False
    Range("A1").Select
End Sub
What it does is plain. On Excel's ActiveSheet, it copies the range B2:C3 to
>>> bob
((u'a', u'c'), (u'b', u'd'))
>>> carol = str(bob[0][0]),str(bob[0][1])
>>> carol
('a', 'c')
If you have a string in Python and you want to write it out to Excel, Python
automatically makes the conversion to Unicode.
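A value read back from a multi-cell range arrives as a tuple of row tuples, like bob above. Here is a minimal Python 3 sketch for turning such a value into nested lists. The name to_rows is illustrative, and the literal tuple stands in for a value read from Excel, so no COM connection is needed to try it. In Python 3 every str is already Unicode, so neither the u prefix nor the str() conversion is needed.

```python
def to_rows(values):
    """Convert a tuple of row tuples into a list of lists."""
    return [list(row) for row in values]


# Stand-in for a value read from a 2x2 range.
bob = (("a", "c"), ("b", "d"))

rows = to_rows(bob)
first_row = rows[0]

print(rows)
print(first_row)
```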
Most of what you want to do using Python as a client to Excel is one of the
following three things:
1. Write information onto a worksheet
2. Read information off of a worksheet
3. Format a worksheet
Writing and reading to Excel are simple inverses. Here are some writing (to
Excel) examples.
>>> xl.Range("A6").Value = "Hello."
>>> xl.Range("A7").Value = 12.3
>>> xl.Range("A8").Value = 12.3
>>> xl.Cells(9,1).Formula = "=sum(a7:a8)"
>>> xl.Range("A1:B3").Value = 12
which assigns 12 to every cell in the range. These are things you learn when
you need them. Better when starting out to keep to the KISS principle.
And formatting is simply a minor variation. Instead of reading or writing
the Value or Formula property of a cell (or range of cells), we read or write
some other property, such as (see above) the Interior.ColorIndex property. The
details, however, are a bit daunting, especially since Excel relies on special
program constants, standing for integers whose values are really hard to find.
So, I'm going to show you how, but I'll put the information in an optional
section.
9.2
Suppose you recorded an Excel VBA macro while you selected the "Borders"
toolbar icon and chose the "Bottom Double Border" item. The resulting VBA
macro would look like this:
Sub Macro2()
'
' Macro2 Macro
' Macro recorded 12/28/2001 by Steven O. Kimbrough
'
'
    Selection.Borders(xlDiagonalDown).LineStyle = xlNone
    Selection.Borders(xlDiagonalUp).LineStyle = xlNone
    Selection.Borders(xlEdgeLeft).LineStyle = xlNone
    Selection.Borders(xlEdgeTop).LineStyle = xlNone
    With Selection.Borders(xlEdgeBottom)
        .LineStyle = xlDouble
        .Weight = xlThick
        .ColorIndex = xlAutomatic
    End With
    Selection.Borders(xlEdgeRight).LineStyle = xlNone
End Sub
Suppose now you wanted to format cell A9 with a "Bottom Double Border"
border. You might think this command will do that:
xl.Range("A9").Borders(xlEdgeBottom).LineStyle = xlDouble
You would be wrong. Python naturally thinks that xlEdgeBottom and xlDouble
are Python variables, and Excel will certainly get the wrong message, if it gets
any message at all. You could quote the variables, but then Excel will think
they are strings and will get very confused. The problem is that in VBA,
xlEdgeBottom is a constant, standing for the integer 9, and xlDouble is a constant whose value is the integer -4119. This command does work:
xl.Range("A9").Borders(9).LineStyle = -4119
This solves our problem, provided we can discover what Excel/VBA's constants
stand for. Here's what you do. Launch PythonWin and enter the following
commands at the prompt in interactive mode:
>>> import win32com.client
>>> xl = win32com.client.Dispatch('Excel.Application')
>>> xl.Visible = 1
>>> win32com.client.constants.xlNone
-4142
>>>
If this is what you get, then proceed after skipping the next paragraph.
If, instead of getting a -4142 in response to your last line you get an error message, do this. In PythonWin, under the Tools menu you will see the item COM
Makepy utility. Choose it. You will be presented with a very long list of applications. Scroll down until you find "Microsoft Excel X.Y Object Library (W.Z)"
where W, X, Y, and Z are all small integers. Select it, say OK, and wait until
Python builds a file for you. When the prompt returns in the interactive shell
(hit RETURN if things quiet down), try win32com.client.constants.xlNone
again. It should work. If so, your Python installation is set up for what we need
to do. This only needs doing once per Python installation.
At this point we're essentially done. If X is an Excel VBA constant, then
win32com.client.constants.X will reveal its value. Here are two examples.
>>> win32com.client.constants.xlEdgeBottom
9
>>> win32com.client.constants.xlDouble
-4119
>>>
So if you can identify the Excel constant, Python will tell you its value, and
you can happily program away. Also, you can use either the integers or their
Python expressions in your Python code. The integer version
xl.Range("A9").Borders(9).LineStyle = -4119
works just as well as the Python object version
xl.Range("A9").Borders(9).LineStyle = win32com.client.constants.xlDouble
My own druthers are to use Python variables as mnemonics.
>>> xlDouble = win32com.client.constants.xlDouble
>>> xlDouble
-4119
>>> xlEdgeBottom = win32com.client.constants.xlEdgeBottom
>>> xlEdgeBottom
9
>>>
And now this does work:
xl.Range("A9").Borders(xlEdgeBottom).LineStyle = xlDouble
You weren't so wrong after all.
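The integer values found in this section can also be collected in one place and used from a small helper. The sketch below is Python 3 and uses only the three values reported above. The dictionary XL and the helper bottom_double_border are hypothetical names, and the FakeBorder and FakeRange classes exist only so that the helper can be exercised without Excel. With a real dispatched xl object one would pass xl.Range("A9") instead.

```python
# Values as reported by win32com.client.constants in the text above.
XL = {"xlNone": -4142, "xlEdgeBottom": 9, "xlDouble": -4119}


def bottom_double_border(rng):
    """Give a range-like object a double bottom border."""
    rng.Borders(XL["xlEdgeBottom"]).LineStyle = XL["xlDouble"]


class FakeBorder:
    """Stand-in for an Excel Border object (for trying the helper only)."""
    LineStyle = XL["xlNone"]


class FakeRange:
    """Stand-in for an Excel Range object (for trying the helper only)."""

    def __init__(self):
        self._borders = {}

    def Borders(self, index):
        # Hand back the same border object each time an index is requested.
        return self._borders.setdefault(index, FakeBorder())


cell = FakeRange()
bottom_double_border(cell)
print(cell.Borders(9).LineStyle)
```

Keeping the constants in a dictionary is just one design choice; the mnemonic variables shown above work equally well.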
9.3
Excel and other Microsoft Office programs are structured as hierarchical collections of objects. The application is an object. The application's workbooks
are objects. The worksheets within a workbook are objects, as are the charts.
Ranges are objects contained by worksheets, but not by charts. And so on. Objects are particular; they are said to be instances of their classes. For example,
Workbooks(3).Worksheets(2).Range("A2:B4") is a particular object. It is an
instance of the Range class.
Objects have properties and methods. Together, they are called members.
(Typically, a property is also an object, which may be confusing.) Properties
are types of values, which usually can be set under program control. We have
seen that Value and Formula are two of the properties of objects in the Range
class. Methods are programs that their objects can execute. Methods may or
may not require parameters on input, but they always require parentheses. An
example of a parameterless method is Add, which we have seen before:
xl.Workbooks.Add()
So, Add is a method for objects in the Workbooks class. If we look at the Object
Browser in Excel (in the VBA editor, View|Object Browser) and we focus on
the Excel library, we find Workbooks among the classes. Selecting Workbooks
we see that Add is the very first member of the Workbooks class. (Notice how
the icons distinguish the properties and the methods.) If we select Add, right-click on it, and select Help, we see a display that tells us Add is a method for
adding objects to collections. It is used with many different types of collections
(classes). If we select Workbooks we see this information displayed.
Add Method (Workbooks Collection)
Creates a new workbook. The new workbook becomes the active
workbook. Returns a Workbook object.
Syntax
expression.Add(Template)
expression Required. An expression that returns a Workbooks object.
Template Optional Variant. Determines how the new workbook is
created. If this argument is a string specifying the name of an existing Microsoft Excel file, the new workbook is created with the
specified file as a template. If this argument is a constant, the new
workbook contains a single sheet of the specified type. Can be one of
the following XlWBATemplate constants: xlWBATChart, xlWBATExcel4IntlMacroSheet, xlWBATExcel4MacroSheet, or xlWBATWorksheet. If this argument is omitted, Microsoft Excel creates a
new workbook with a number of blank sheets (the number of sheets
is set by the SheetsInNewWorkbook property).
Remarks
If the Template argument specifies a file, the file name can include
a path.
Objects have properties and methods. You call the methods and get
or set the properties. See the Excel object browser and then use the
help facility. Also, record VBA macros.
Exploring in this manner will tell you much about Excel's object model. The
Object Browser is there when you need it, so you shouldn't bother with memorization. Recording VBA macros and studying them (when you need to know
something) is also a good way to learn about the object model.
Finally, Python is very helpful for learning about the object model. Recall:
>>> xl.Name
u'Microsoft Excel'
Microsoft Excel is our top-level object. We can also get the name and the parent
of any (particular) object:
>>> xl.Workbooks(1).Name
u'Book1'
>>> xl.Workbooks(1).Parent.Name
u'Microsoft Excel'
Notice that the top-level object is its own parent:
>>> xl == xl.Parent
1
>>> xl.Parent.Name
u'Microsoft Excel'
>>> xl.Workbooks(1) == xl.Workbooks(1).Parent
0
And the pattern generalizes:
>>> xl.ActiveSheet.Name
u'Sheet2'
>>> xl.ActiveSheet.Parent.Name
u'Book1'
Notice that the Name property is settable:
>>> xl.ActiveSheet.Name = 'Ted'
>>> xl.ActiveSheet.Name
u'Ted'
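Because every object reports a Name and a Parent, and the top-level object is its own parent, the chain from any object up to the application can be walked in a loop. A hedged Python 3 sketch follows: ancestry is a hypothetical helper, and the Node class merely imitates the Name and Parent properties so that the loop can be run without Excel.

```python
class Node:
    """Imitates a COM object exposing Name and Parent (illustration only)."""

    def __init__(self, name, parent=None):
        self.Name = name
        # Mirror what the text shows: the top-level object is its own parent.
        self.Parent = parent if parent is not None else self


def ancestry(obj):
    """Return the names from obj up to the top-level object."""
    names = [obj.Name]
    while obj.Parent != obj:
        obj = obj.Parent
        names.append(obj.Name)
    return names


app = Node("Microsoft Excel")
book = Node("Book1", app)
sheet = Node("Sheet2", book)

print(ancestry(sheet))
```

With a live session, ancestry(xl.ActiveSheet) would be the analogous call, assuming the comparison xl == xl.Parent behaves as in the interaction above.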
9.4
Miscellany
9.4.1
Gotchyas
Capitalization:
>>> xl.Name
u'Microsoft Excel'
>>> xl.name
Traceback (most recent call last):
File "<pyshell#15>", line 1, in ?
xl.name
File "C:\Python21\win32com\client\__init__.py", line 348, in __getattr__
raise AttributeError, attr
AttributeError: name
>>>
Lesson: Get it right.
9.4.2
Range Names
>>> xl.Workbooks(1).Names(1).Name
u'ted'
>>> xl.Workbooks(1).Names(1).RefersTo
u'=Sheet1!$C$2:$D$3'
>>> myted = xl.Workbooks(1).Names(1)
>>> myted.Name
u'ted'
>>> xl.Workbooks(1).Names.Add('alice','=Sheet1!R1C1')
<win32com.gen_py.Microsoft Excel 9.0 Object Library.Name>
>>> xl.Workbooks(1).Names.Count
2
>>> xl.Workbooks(1).Names(2).Name
u'ted'
>>> xl.Workbooks(1).Names(1).Name
u'alice'
>>> xl.Workbooks(1).Worksheets(1).Range("alice").Value = 19
>>> xl.Range("alice").Value = 22
The last line works only if workbook 1 and worksheet 1 are active; otherwise
Excel would be confused (and understandably so). Assuming it is, this line
turns an entire range red:
>>> xl.Range("ted").Interior.ColorIndex = 3
9.4.3
Saving Workbooks
>>> xl.ActiveWorkbook.SaveAs("C:\day\pythonegs3.xls")
>>> xl.ActiveWorkbook.Name
u'pythonegs3.xls'
>>> xl.ActiveWorkbook.Save()
>>>
9.4.4
Directories
9.4.5
When you run a Python program in script mode it is often useful to be able
to specify arguments on the command line, which the script will process as it
executes. Here's an example. The Python file, arglist.py, looks like this:
import sys
argumentlist = sys.argv
print argumentlist
Here is what it does when executed from the command line, with and without
arguments.
C:\>arglist.py
['C:\\arglist.py']
C:\>arglist.py Now is the time for 12.3
['C:\\arglist.py', 'Now', 'is', 'the', 'time', 'for', '12.3']
C:\>
In a real program, you would process the list argumentlist, converting the
strings, where necessary, to numbers. Example:
>>> int('12')
12
>>> float('12.3')
12.300000000000001
>>>
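Processing the argument list might look like the following sketch, written in Python 3. The function names convert and process are illustrative. Each argument after the script name is tried as an int, then as a float, and is left as a string if neither conversion works.

```python
import sys


def convert(arg):
    """Return arg as an int or a float when possible, else unchanged."""
    for cast in (int, float):
        try:
            return cast(arg)
        except ValueError:
            pass
    return arg


def process(argv):
    """Convert every argument after the script name."""
    return [convert(a) for a in argv[1:]]


if __name__ == "__main__":
    print(process(sys.argv))
```

Run as arglist.py Now is the time for 12.3, this would leave the words as strings and turn the last argument into a float.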
9.4.6
raw_input prompts the user and awaits a response. The response is read into a
string for subsequent processing by the program. Here's an example. In an IDE,
when input = raw_input("Tell me: ") is executed a dialog box appears and
the program stops until the user gives a response. That response is read into
input and the program continues.
>>> input = raw_input("Tell me: ")
>>> input
'Go Eagles!'
>>> type(input)
<type 'string'>
>>>
9.4.7
Copying Worksheets
9.5
Gotchyas
9.5.1
Forgetting Parentheses in Methods
Some things don't work the way you might think they should. Here's an example.
PythonWin 2.1.1 (#20, Jul 20 2001, 01:19:29) [MSC 32 bit (Intel)]
on win32.
Portions Copyright 1994-2001 Mark Hammond (MarkH@ActiveState.com)
- see 'Help/About PythonWin' for further copyright information.
>>> import win32com.client
>>> xl = win32com.client.Dispatch('Excel.Application')
>>> xl.Visible
0
>>> xl.Visible = 1
>>> xl.Workbooks.Add()
9.5.2
Capitalization
Excel's COM objects and members (properties and methods) have an official
capitalization scheme, and this matters. Usually, that is. Sometimes it seems
that you can get away with incorrect capitalization. Mostly you can't.
>>> xl.Name
u'Microsoft Excel'
>>> xl.Selection.Address
u'$A$1'
>>> xl.ActiveSheet.Range("B5:C8").Select()
>>> xl.Selection.Address
u'$B$5:$C$8'
>>> xl.Selection.address
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
File "D:\Python21\win32com\client\__init__.py", line 348, in __getattr__
raise AttributeError, attr
AttributeError: address
>>> xl.name
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
File "D:\Python21\win32com\client\__init__.py", line 348, in __getattr__
raise AttributeError, attr
AttributeError: name
>>> xl.Name
u'Microsoft Excel'
>>> xl.Activesheet.name
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
File "D:\Python21\win32com\client\__init__.py", line 348, in __getattr__
9.6
9.7
Exercises
9.7.1
The SEC (Securities and Exchange Commission) requires various reports to be
filed by American corporations. Stock market analysts are particularly keen on
following the 10-Q reports, which are filed quarterly and contain, among other
things, financial statements by the companies. The SEC filings, including the 10-Q reports, are available online at http://www.sec.gov. Write a Python program
that grabs data from 10-Q reports for 4-6 companies and loads these data into
a well-designed spreadsheet format for analysis and comparison purposes.
File: python-excel.tex
Chapter 10
Preliminaries
See in the Excel VBA editor Tools|References. Scroll down until you find something like "Microsoft DAO 3.6 Object Library." This tells you that you need
to dispatch "DAO.DBEngine.36". You can also see the References from Access.
Note the directions from Microsoft's online help:
On the Tools menu, click References. The References command
on the Tools menu is available only when a Module window is open
and active in Design view.
Also, you should first run the makepy utility. In PythonWin under the Tools
menu choose "COM Makepy utility" then select "Microsoft DAO 3.6 Object
Library". You only need to do this once.
10.2
The basic structure for connecting to an Access database is like that for connecting to Excel. The following code imports the win32com.client module,
dispatches the appropriate DAO.DBEngine (version 3.6), and opens an existing
database, called db1.mdb.
>>> import win32com.client
>>> engine = win32com.client.Dispatch("DAO.DBEngine.36")
>>> db = engine.OpenDatabase(r'c:\day\db1.mdb')
As usual, the database is an object with properties.
>>> db.Name
u'c:\\day\\db1.mdb'
>>> rs.Fields(0).Name
u'docid'
>>> for i in range(rs.Fields.Count):
...     print rs.Fields(i).Name
...
docid
position
wordid
>>>
>>> db.Close()
MoveLast is a recordset method. Given a recordset there is always a cursor
or record pointer pointing at some record in the recordset. MoveFirst moves
the cursor to the first record in the recordset. MoveLast moves it to the last
record. MoveNext and MovePrevious do the obvious things. Fields is a property of a recordset, and Count is a property of Fields. In the above interaction,
we see that Table1 has three fields (columns), called "docid," "position," and
"wordid."
Now we do another SELECT query.
>>> rs = db.OpenRecordset('select * from Table1')
>>> while not rs.EOF:
...     print rs.Fields('wordid').Value
...     rs.MoveNext()
...
3
4
What this does is tell us that "wordid" is 3 and 4 in the two records of this
recordset. EOF means "end of file." Basically, while not rs.EOF: prevents the
program from trying to access a nonexistent record in the recordset.
Here we explore the database further, looking at various properties.
>>> db = engine.OpenDatabase(r'c:\day\db1.mdb')
>>> for i in range(db.TableDefs.Count):
...     print i, db.TableDefs(i).Name
...
0 MSysAccessObjects
1 MSysACEs
2 MSysObjects
3 MSysQueries
4 MSysRelationships
5 Table1
>>> for i in range(db.TableDefs(5).Fields.Count):
...     print i, db.TableDefs(5).Fields(i).Name
...
0 docid
1 position
2 wordid
>>> rs = db.OpenRecordset('select * from Table1')
>>> rs.MoveLast()
>>> rs.RecordCount
2
>>> rs.MoveFirst()
>>> rs.GetRows()
((1,), (1,), (3,))
>>> rs.GetRows()
((1,), (2,), (4,))
>>> rs.GetRows()
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
[further error message]
>>> rs.Close()
Notice (above) that there are actually 6 tables in this database. Five of them
are Access housekeeping tables (you can access them). Only Table1 is "for
real." Note that for i in range(db.TableDefs(5).Fields.Count): does
here what while not rs.EOF: did above: it prevents an error due to running past
the end of the collection. RecordCount is new here. Very useful, but be sure you first
execute
>>> rs.MoveLast()
Finally, GetRows() is new. GetRows() retrieves one row (the one being pointed
to by the cursor) as a tuple of tuples. (The notation "(1,)" means a tuple with
one element, the integer 1.) GetRows(n) retrieves n rows as tuples of tuples,
starting with the current row. Notice that if you try to get a row beyond the
recordset you get an error.
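The output above suggests that GetRows groups values by field rather than by record: each inner tuple holds the values of one field. Assuming that layout also holds for GetRows(n), a short Python 3 sketch can regroup the result into one tuple per record. The function name to_records is illustrative, and the literal tuple is a stand-in for what GetRows(2) would presumably return for the two records shown.

```python
def to_records(getrows_result):
    """Regroup field-major tuples into one tuple per record."""
    return list(zip(*getrows_result))


# Assumed shape of rs.GetRows(2) on Table1:
# (docid values, position values, wordid values).
field_major = ((1, 1), (1, 2), (3, 4))

records = to_records(field_major)
print(records)
```

Each resulting tuple then reads across a record as (docid, position, wordid).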
10.3
Beyond SELECT
To run SQL commands other than SELECT commands you use a different
mechanism in DAO. You use Execute. Here's an example.
>>> db.Execute("delete * from Table1")
>>> rs = db.OpenRecordset("select * from Table1")
>>> rs.MoveLast()
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
[further error message]
>>> rs.RecordCount
0
>>> db.Execute("INSERT INTO Table1 VALUES(10, 11, 12)")
10.4
These interactions demonstrate use of strings to define the SQL queries, and
the use of SQL WHERE clauses.
>>> rs.Fields.Item(0).Name
u'docid'
>>> db.Name
u'c:\\day\\db1.mdb'
>>> rs = db.OpenRecordset("select * from Table1")
>>> rs.MoveLast()
>>> rs.RecordCount
1
>>> rs.GetRows()
((10,), (11,), (12,))
>>> db.Execute("INSERT INTO Table1 VALUES(20, 21, 22)")
>>> rs = db.OpenRecordset("select * from Table1")
>>> rs.MoveLast()
>>> rs.RecordCount
2
>>> rs.GetRows()
((20,), (21,), (22,))
>>> strSQL = "SELECT * FROM Table1 WHERE docid < 15"
>>> rs = db.OpenRecordset(strSQL)
>>> rs.MoveLast()
>>> rs.RecordCount
1
>>> rs.GetRows()
((10,), (11,), (12,))
>>>
Finally, this interaction with the spj-begin.mdb database illustrates construction of an SQL query string containing an element that has to be quoted.
Also, Access allows odd characters in its field names. "S#" is not permitted in
standard SQL. So, Access uses a square bracket notation to indicate that this
is indeed a valid field name (in Access). This mechanism is also needed when
field names have spaces in them.
>>> db.TableDefs(5).Fields(0).Type
4
>>> spdb = engine.OpenDatabase(r'c:\day\spj-begin.mdb')
>>> spdb.Name
u'c:\\day\\spj-begin.mdb'
>>> strSQLsp = "SELECT [S#] From s WHERE SNAME = 'Adams'"
>>> rssp = spdb.OpenRecordset(strSQLsp)
>>> rssp.MoveLast()
>>> rssp.RecordCount
1
>>> rssp.GetRows()
((u'S5',),)
>>>
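Building such query strings by hand invites quoting mistakes, so it can help to wrap the two quoting rules in small functions. This is a sketch in Python 3, and the helper names bracket, sql_text, and select_where are hypothetical. Doubling an embedded single quote is the usual SQL convention for text values. The sketch brackets the WHERE field as well, which is harmless for a plain name like SNAME.

```python
def bracket(field):
    """Wrap an Access field name in square brackets."""
    return "[" + field + "]"


def sql_text(value):
    """Quote a text value for SQL, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def select_where(field, table, where_field, value):
    """Build a one-condition SELECT statement as a string."""
    return "SELECT %s FROM %s WHERE %s = %s" % (
        bracket(field), table, bracket(where_field), sql_text(value))


strSQLsp = select_where("S#", "s", "SNAME", "Adams")
print(strSQLsp)
```

The resulting string can be passed to OpenRecordset in the same way as strSQLsp above.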
10.5
Gotchyas
If you are accessing a database from Python and then you open it up in Microsoft
Access, suddenly your Python DAO commands won't work. The problem is
that Access is a single user system.
Referential integrity considerations in Access may prevent deletion of a
record. If so, your DAO SQL DELETE command won't work; it'll do nothing.
Lesson: try things first in Access in SQL.
10.6
Microsoft's [15], especially page 730, has useful information on DAO. The help
file is DAO360.CHM and is a useful install. Helen Feddema's DAO Object Model:
The Definitive Reference [7] is excellent. Two terrific Web pages for Python and
DAO and Access:
http://starship.python.net/crew/bwilk/access.html
http://www.e-coli.net/pyado.html
File: python-dao.tex
Chapter 11
To use an example without loss of generality, suppose you have a Python module
file, startcom.py, at D:\athomepc\day, as follows:
import win32com.client
xl = win32com.client.Dispatch('Excel.Application')
If you try to import this module, you'll likely get an error message.
Python 2.1.1 (#20, Jul 20 2001, 01:19:29) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
>>> from startcom import *
Traceback (most recent call last):
File "<pyshell#0>", line 1, in ?
from startcom import *
ImportError: No module named startcom
>>>
The problem is that D:\athomepc\day is not in Python's search path. The
following code shows this.
>>> import sys
>>> sys.path
['D:\\PYTHON21\\Tools\\idle', 'D:\\Python21\\win32',
'D:\\Python21\\win32\\lib','D:\\Python21',
'D:\\Python21\\Pythonwin', 'D:\\PYTHON21\\DLLs',
'D:\\PYTHON21\\lib',
'D:\\PYTHON21\\lib\\plat-win',
'D:\\PYTHON21\\lib\\lib-tk']
>>>
The thing to do is to add (append) the path you want Python to look in to
the sys.path list. You can do it as follows.
>>> sys.path.append('d:\\athomepc\\day')
>>> sys.path
['D:\\PYTHON21\\Tools\\idle', 'D:\\Python21\\win32',
'D:\\Python21\\win32\\lib', 'D:\\Python21',
'D:\\Python21\\Pythonwin', 'D:\\PYTHON21\\DLLs',
'D:\\PYTHON21\\lib', 'D:\\PYTHON21\\lib\\plat-win',
'D:\\PYTHON21\\lib\\lib-tk', 'd:\\athomepc\\day']
>>>
Now your import will work just fine.
>>> from startcom import *
>>> xl.Name
u'Microsoft Excel'
>>>
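The same steps can be tried end to end without Excel. The following Python 3 sketch creates a throwaway directory containing a tiny module, adds the directory to sys.path only if it is not already there, and then imports the module. The module name startdemo and its contents are made up for the illustration; they stand in for startcom.py.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory holding a small module.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "startdemo.py"), "w") as f:
    f.write("greeting = 'Yo, world!'\n")

# Append the directory only if it is not already on the search path.
if module_dir not in sys.path:
    sys.path.append(module_dir)

# Make sure the import system notices the newly written file.
importlib.invalidate_caches()

import startdemo

print(startdemo.greeting)
```

Checking before appending keeps sys.path from accumulating duplicates when the code is run more than once in a session.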
11.2
Dispatching Excel
11.3
>>> db.Properties(0).Name
u'Name'
>>> db.Properties.Count
13
>>> for i in range(db.Properties.Count):
...     print i, db.Properties(i).Name
...
0 Name
1 Connect
2 Transactions
3 Updatable
4 CollatingOrder
5 QueryTimeout
6 Version
7 RecordsAffected
8 ReplicaID
9 DesignMasterID
10 Connection
11 AccessVersion
12 Build
>>> db.Properties(11).Name
u'AccessVersion'
>>> db.Properties(11).Value
u'08.50'
>>>
11.4
>>> xl.ActiveWorkbook.Worksheets('Sheet1').cells(4,2).FormulaR1C1 = \
"=SUM(R[-2]C:R[-1]C)"
Note: \ is Python's line-continuation character.
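In R1C1 notation, R[-2]C means "two rows up, same column," so the formula above sums the two cells directly above the cell that holds it. Formulas of this kind can be generated rather than typed. Here is a small Python 3 sketch; the function name sum_above is hypothetical.

```python
def sum_above(count):
    """R1C1 formula summing the `count` cells directly above the current cell."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return "=SUM(R[-%d]C:R[-1]C)" % count


formula = sum_above(2)
print(formula)
```

The resulting string would be assigned to a cell's FormulaR1C1 property in the same way as in the interaction above.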
11.5
...
Expr1000
File: python-quick-ref.tex