http://www.hs-augsburg.de/~hhoegl/mnp/www/lin...
Learn 32-bit assembly language programming (for 80386/486/Pentium/MMX CPUs) on a Linux platform. All you need to carry out these exercises is a PC (386 or up) with Linux loaded. It normally comes with gcc, the GNU C compiler.
S. K. Ghoshal SERC, IISc Bangalore 560 012 India.
Index
Copyright 1998. Permission granted only for educational use. For other types of usage, contact the author.
Introduction
Step 1 - The first program
Step 2 - There is an assembly equivalent for anything that executes
Step 3 - Assembly language needs no segmentation to work: appreciate the flat model
Step 4 - The C compilation roadmap
Step 5 - How the control flows
Step 6 - Public Symbols
Step 7 - Parameter Passing Rules
Step 8 - Look at the Stack Frame
Step 9 - Pass Two Parameters now
Step A - Call C program from Assembly
Step B - Allocate and access static variables
Step C - Static variables are by reference
Step D - Master address arithmetic
Step E - Static Variables defined in C program body
Step F - Automatic Variables
Step 10 - Towards recursion
1 von 23
2011-06-09 16:46
Introduction
Assembly language programs are required when one needs to control hardware directly, without any compiler-generated code coming between the programmer and the microprocessor. Programming in assembly language has the advantage of speed of execution as well as predictable latencies. In other words, segments of human-written assembly language code can reduce execution time down to the last possible clock cycle. Also, the time for program control to pass between two checkpoints (e.g. two labels in an assembly program fragment) can be predicted with a resolution of one clock cycle.

The disadvantage is that writing huge assembly language programs is tedious and error-prone. So for larger programs, one prefers to write in high-level languages. C is one such language, widely used by programmers who write system software. Most system software (e.g. the Unix operating system) is written mostly in C, with a few assembly language routines at the core. 95% of the code is in C. The C routines call assembly routines where they must. Sometimes assembly routines call C programs too. One example is kernel initialization, which is done by executing assembly language programs up to the point where the execution model required by a C program (that is, a program written in C and compiled using a C compiler) can be supported. From then onwards, the rest of the initialization is done in C. Therefore, at that point an assembly routine has to call a C routine.

Thus a person aspiring to write or understand the structure of large and complex system software must first understand how to write C-callable assembly language programs and vice versa. In this class, we shall learn assembly language programming. We focus on C-callable assembly language programs. We shall use 80X86-based IBM PCs (that includes the 80386/80486/Pentium/Pentium Pro/MMX/Pentium II microprocessors) running the Linux operating system. Once you complete this course, you will also have learned how compiler-translated code actually works on the PC platform. Many of your other doubts will be resolved as well.
Call this program caller.c. This small C program will be our driver program for many of the assembly routines that we write. And an assembly language callee like this:
        .globl ascallee
ascallee:
        push %ebp
        movl %esp, %ebp
        movl 8(%ebp), %eax
        shl %eax
        pop %ebp
        ret
Call this program callee.s. This assembly language function returns twice the value of the integer argument supplied to it. In the next step we shall see how this works. Compile by typing:
cc caller.c callee.s
That will print out 10, which is twice the value of 5.
that C function would have been translated into code similar to the assembly code we wrote ourselves. Ultimately, all high-level language code becomes assembly language code after compilation, which is then assembled into machine language, and that is what the microprocessor actually executes. It is just that there is only one way to translate an assembly program into an equivalent machine language program. On the other hand, there are many (possibly infinite) ways to translate a high-level language program into an assembly language program.
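For comparison, here is a minimal sketch of a C function that does the same job as the assembly callee. The name ascallee_in_c and the self-check are hypothetical (they are not part of the original files), and the exact instructions a compiler emits for this function will vary with the compiler and its options.

```c
#include <assert.h>

/* Hypothetical C counterpart of the assembly routine ascallee:
   returns twice its integer argument. */
int ascallee_in_c(int x)
{
    /* Multiply by two; a compiler may emit a shift or an add here. */
    return x * 2;
}

/* Small self-check mirroring the tutorial's run: 5 in, 10 out. */
void check_ascallee_in_c(void)
{
    assert(ascallee_in_c(5) == 10);
}
```

Compiling a file like this with the -S switch shows which of the many possible translations the compiler picked.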
implementors love this model while architects and purists frown at it. The architecture does not come in between the programmer and the hardware when you set up a flat memory model. You do your type of address arithmetic, point at anything you want and modify it the way you like. Thus there is really no point in demarcating code, data, stack etc. as segments, and we have not done that. There is no segmentation directive to the assembler. We use the memory model set up by the runtime system of the C compiler, which incidentally is a flat model.
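The "point at anything you want and modify it" idea can be sketched in C, which runs on the same flat model. The function and variable names below are illustrative assumptions, not part of the original programs.

```c
#include <assert.h>
#include <stddef.h>

static int static_counter = 1;          /* lives in the data area */

/* Overwrite one byte of any object through a plain byte pointer.
   No segment is named anywhere: an address is just a number. */
void poke_byte(void *base, size_t offset, unsigned char value)
{
    unsigned char *p = (unsigned char *)base;
    p[offset] = value;
}

/* Read one byte back the same way. */
unsigned char peek_byte(const void *base, size_t offset)
{
    return ((const unsigned char *)base)[offset];
}

/* The same two routines work on static data and on stack data alike. */
void check_flat_access(void)
{
    int automatic_counter = 2;          /* lives on the stack */

    poke_byte(&static_counter, 0, 0x7f);
    poke_byte(&automatic_counter, 0, 0x7f);
    assert(peek_byte(&static_counter, 0) == 0x7f);
    assert(peek_byte(&automatic_counter, 0) == 0x7f);
}
```

Nothing in poke_byte says whether the target is code, data or stack; that is the convenience (and the danger) of the flat model.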
End of search list.
/usr/lib/gcc-lib/i486-linux/2.7.0/cc1 /tmp/cca00941.i -fno-strength-reduce -quiet -dumpbase caller.c -version -o /tmp/cca00941.s
GNU C version 2.7.0 (i386 Linux/ELF) compiled by GNU C version 2.7.0.
/usr/i486-linux/bin/as -V -Qy -o /tmp/cca009411.o /tmp/cca00941.s
GNU assembler version cygnus/linux-2.5.2l.15 (i486-linux), using BFD version cygnus/linux-2.5.2l.11
/usr/i486-linux/bin/as -V -Qy -o /tmp/cca009412.o callee.s
GNU assembler version cygnus/linux-2.5.2l.15 (i486-linux), using BFD version cygnus/linux-2.5.2l.11
/usr/i486-linux/bin/ld -m elf_i386 -dynamic-linker /lib/ld-linux.so.1 /usr/lib/crt1.o /usr/lib/crti.o /usr/lib/crtbegin.o -L/usr/lib/gcc-lib/i486-linux/2.7.0 -L/usr/i486-linux/lib /tmp/cca009411.o /tmp/cca009412.o -lgcc -lc -lgcc /usr/lib/crtend.o /usr/lib/crtn.o
which basically means that any C compiler on a Unix system essentially goes through the following phases:

First cpp, the C preprocessor, strips comments, merges #include files, expands macros and produces a C program that is easier for an automaton (viz. a C compiler) to read. With gcc the output can be viewed using the -E switch (the traditional Unix cc used -P for this), and it is usually painful for a human being to read.

Then cc, the actual C compiler, processes the C program and produces an assembly language program as the output. Internally this stage consists of the lexical scanner, the parser and the code generator. The output assembly program (with a .s extension) can be obtained by using the -S switch while invoking cc. In that case, processing stops at this stage.

Thereafter, as (the Unix assembler) runs and produces an object file with a .o extension. If cc is invoked with the -c switch, then processing ends at this phase and the output .o file can be seen.

Then ld (the Unix link editor) runs. It combines all the object modules and the C runtime library (some kind of crt.o, as you just saw), and puts in some startup code and some epilogue at the end. It puts an object-format header on the file (ELF on the system shown above; older Unix systems used coff, the common object file format) and writes it as a.out with execute permissions set.

And when cc sees a file with a .s extension, it invokes as on that file and proceeds normally afterwards. And now let me tell you that the .globl keyword at the beginning of the assembly language program informs the assembler that the symbol ascallee should be exported outside the module callee.o so that ld can find it and link it.
properly
So remember that there can be only one main() function among the many modules that go into a.out. If ld finds more than one main function, it will howl. It knows that nobody can transfer control to your program at more than one place. And there must be at least one main function in all the modules combined. Otherwise, ld will refuse to put in the startup code and give a.out the execute permissions. There must be somewhere to begin from, if you want to execute something. No global symbol can be defined more than once. Be they variable names, labels or function names, they must be unique if they are to be public. And every public symbol that is referenced anywhere must be defined somewhere. Else ld will howl and refuse to proceed.
pushing the registers used within the callee onto the stack. So before control is passed back to the caller, the callee cleans the stack. The activation record is created by executing the instruction sequence:
        push %ebp
        mov %esp, %ebp
which makes the EBP register point to its old copy and thus fixes the activation record. The activation record is pulled down by executing:
pop %ebp
Thus in C, the convention is that each routine cleans up the mess on the stack that it had itself created. In C, values are always pushed. So if you want any variable to change within the caller, you must explicitly make the compiler generate its address by typing an ampersand before it, and use that address as the parameter. That is why the scanf routine and the swap function have ampersands in their respective calls. The return value is passed in the accumulator. In this case, the return value is 32 bits long and therefore is passed in the EAX register. That is why functions whose return value types are not explicitly specified are assumed to be integer functions in C, as that is the easiest to implement. This is also the reason that EAX need not be saved or restored inside the callee. It is meant to be destroyed.
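The pass-by-value rule and the ampersand idiom can be seen in a minimal C sketch. The function names swap_by_value and check_swap are illustrative assumptions; swap stands for the kind of swap function the text mentions.

```c
#include <assert.h>

/* Receives copies of the caller's values: the exchange is lost on return. */
void swap_by_value(int a, int b)
{
    int t = a;
    a = b;
    b = t;
    (void)a;    /* silence "set but not used" warnings */
    (void)b;
}

/* Receives two addresses (themselves pushed as values), so it can
   reach back and change the caller's variables. */
void swap(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

void check_swap(void)
{
    int x = 1, y = 2;

    swap_by_value(x, y);          /* caller's x and y are untouched */
    assert(x == 1 && y == 2);

    swap(&x, &y);                 /* ampersands: push the addresses */
    assert(x == 2 && y == 1);
}
```

In both calls only 32-bit values go on the stack; the difference is whether those values are the integers themselves or their addresses.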
The value 5 has been pushed by the caller (written in C). That takes 4 bytes, as this is a 32-bit machine. After that the microprocessor executed the call instruction and pushed the return address, which is a 32-bit address. After that, control came inside the callee, and the first thing we did there was to make the new EBP point to the place on the stack where the value of the old EBP has been kept. The callee accessed the parameter, put it in the EAX register, doubled it, pulled down the activation record and went back to the caller with 10 in EAX. The caller (you have not seen that part yet, but I shall show you soon in another example) added 4 to the value of ESP to clean the stack. Then it used (that is, in this case, printed) the return value.
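The sequence just described can be traced with a toy model of the stack written in C. This is only a sketch under stated assumptions: the array stack_mem, the variables esp and ebp, and all function names are hypothetical stand-ins for the real registers and memory, and the return address is a dummy value.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy stack: one array slot per 32-bit word. esp and ebp hold byte
   offsets into the array, and the stack grows towards lower offsets. */
static uint32_t stack_mem[STACK_WORDS];
static uint32_t esp = STACK_WORDS * 4;      /* empty stack */
static uint32_t ebp = 0;

static void push32(uint32_t v) { esp -= 4; stack_mem[esp / 4] = v; }
static uint32_t pop32(void) { uint32_t v = stack_mem[esp / 4]; esp += 4; return v; }
static uint32_t load32(uint32_t addr) { return stack_mem[addr / 4]; }

/* The callee: the same steps as ascallee, on the toy stack. */
static uint32_t model_callee(void)
{
    uint32_t eax;

    push32(ebp);                /* push %ebp          */
    ebp = esp;                  /* movl %esp, %ebp    */
    eax = load32(ebp + 8);      /* movl 8(%ebp), %eax */
    eax <<= 1;                  /* shl %eax           */
    ebp = pop32();              /* pop %ebp           */
    return eax;                 /* result travels in EAX */
}

/* The caller's side of the protocol. */
uint32_t model_call(uint32_t arg)
{
    uint32_t start = esp;
    uint32_t eax;

    push32(arg);                /* caller pushes the parameter        */
    push32(0xdeadbeefu);        /* call pushes a (dummy) return address */
    eax = model_callee();
    (void)pop32();              /* ret pops the return address        */
    esp += 4;                   /* caller cleans up: addl $4, %esp    */
    assert(esp == start);       /* the stack is balanced again        */
    return eax;
}
```

Inside model_callee the old EBP sits at offset 0 from the new EBP, the return address at offset 4 and the parameter at offset 8, which is why the assembly reads 8(%ebp).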
A caller with two parameters pushes the parameters the way we described. The following is its assembly language equivalent, actually produced using the -S switch:
        .file "caller.c"
        .version "01.01"
gcc2_compiled.:
        .section .rodata
Note that the caller is cleaning the stack by adding 8 (each parameter is 32 bits or 4 bytes long and there are two parameters) to the stack pointer immediately after the return from the callee. Note also the redundant move instruction just after the stack is cleaned. Compiler-generated code often has such redundant code. It also often has redundant jump instructions. A callee that accepts two parameters correctly is listed below:
        .globl ascallee
ascallee:
        push %ebp
        movl %esp, %ebp
/ get first parameter into eax
        movl 8(%ebp), %eax
/ subtract second parameter from eax
        subl 12(%ebp), %eax
        pop %ebp
        ret
As you can see, it subtracts the second parameter value from the first parameter value and returns the result.
Keep this in a file called ppar.c. An assembly language program (kept in callee.s) to drive this C function can be:
        .globl ascallee
ascallee:
        push %ebp
        movl %esp, %ebp
/ Push 67 into stack
        pushl $67
/ Push 83 into stack
        pushl $83
        call ppar
/clean the stack
        addl $8, %esp
        pop %ebp
        ret
So 67 will be the last (second, in this case) parameter and 83 will be the first parameter. And this in turn can be called by a main() function written in C and kept in caller.c:
    int ascallee(void);

    int main(void)
    {
        int iret;
        iret = ascallee();
        return 0;
    }
which is what we wanted. So remember, no matter whether the caller is written in C or assembly, it has to clean the stack. In the case of a C program, the compiler automatically puts the cleanup code in the proper place. If you are writing any caller in assembly that calls a C callee (and I mean even assembly routines that are C-callable), make sure you clean up the stack after the return from the call.
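To make the parameter order concrete, here is a sketch in C of both sides of this call. The body of ppar shown here is an assumption (the original ppar.c listing is not reproduced above), as are the variables last_first and last_second and the name ascallee_in_c; only the call with 83 as the first and 67 as the second parameter comes from the assembly listing.

```c
#include <assert.h>
#include <stdio.h>

/* Records what the callee received, so the order can be checked. */
static int last_first;
static int last_second;

/* Hypothetical stand-in for ppar.c: reports its two parameters. */
int ppar(int first, int second)
{
    last_first = first;
    last_second = second;
    printf("first=%d second=%d\n", first, second);
    return 0;
}

/* C counterpart of the assembly caller:
       pushl $67      (last parameter is pushed first)
       pushl $83      (first parameter is pushed last)
       call ppar
       addl $8, %esp  (the compiler emits this cleanup for us) */
int ascallee_in_c(void)
{
    return ppar(83, 67);
}

void check_parameter_order(void)
{
    ascallee_in_c();
    assert(last_first == 83);
    assert(last_second == 67);
}
```

In the C version the stack cleanup is invisible; in the assembly version it is the explicit addl $8, %esp after the call.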
and a caller:
    #include <stdio.h>

    int ascallee(void);

    int main(void)
    {
        int iret;
        iret = ascallee();
        printf("iret=%x\n", iret);
        return 0;
    }
produces a result:
iret=12345678
which goes to show that:
A .word directive reserves 16 bits. You need two such .word directives to fill up a 32-bit register.
Intel 80X86 follows a little-endian scheme. The MSW (Most Significant Word) stays at the higher address.
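The two points can be checked with a small C sketch that models memory as an array of bytes. It assumes the two words were 0x5678 followed by 0x1234, which is what the printed result implies; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 16-bit word little-endian, the way a .word directive
   lays it out for the 80X86: low byte at the lower address. */
void store_le16(unsigned char *p, uint16_t w)
{
    p[0] = (unsigned char)(w & 0xff);
    p[1] = (unsigned char)(w >> 8);
}

/* Read four consecutive bytes as one little-endian 32-bit value,
   the way movl reads memory into a register. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Two consecutive 16-bit words seen through one 32-bit load. */
uint32_t two_words_as_long(uint16_t first_word, uint16_t second_word)
{
    unsigned char mem[4];

    store_le16(mem, first_word);        /* lower address: becomes the LSW  */
    store_le16(mem + 2, second_word);   /* higher address: becomes the MSW */
    return load_le32(mem);
}

void check_two_words(void)
{
    assert(two_words_as_long(0x5678, 0x1234) == 0x12345678u);
}
```

Because the bytes are assembled with shifts rather than pointer casts, the sketch gives the same answer on any host, whatever its own byte order.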
that means statv1 is the address and one can do arithmetic on that address. And had we said:
movl statv1+1, %eax
The bytes are laid out (in order of increasing memory addresses) as: 78, 56, 34, 12, BC, 9A, F0, DE. The assembler lays them out this way because it knows that the CPU is little-endian. statv1 is the address of the byte 78. So looking at statv1+1, one will see 56. That will be the LSByte. So 3456 is the LSW and BC12 is the MSW. Thus the 32-bit word beginning at statv1+1 is BC123456, and that is what our program prints.
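The arithmetic above can be replayed in C on the same eight bytes. This is a sketch: the array statv_bytes and the function load_le32_at are hypothetical names, with the array standing in for the memory that begins at statv1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The eight bytes from the text, in order of increasing address. */
static const unsigned char statv_bytes[8] = {
    0x78, 0x56, 0x34, 0x12, 0xBC, 0x9A, 0xF0, 0xDE
};

/* Little-endian 32-bit load from base+offset, like movl base+offset, %eax. */
uint32_t load_le32_at(const unsigned char *base, size_t offset)
{
    const unsigned char *p = base + offset;

    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

void check_address_arithmetic(void)
{
    /* movl statv1, %eax */
    assert(load_le32_at(statv_bytes, 0) == 0x12345678u);
    /* movl statv1+1, %eax: bytes 56, 34, 12, BC */
    assert(load_le32_at(statv_bytes, 1) == 0xBC123456u);
}
```

Moving the offset by one byte shifts the whole window: the 78 drops out at the bottom and the BC from the next word comes in at the top.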
Note that no special keyword needs to be put before the int declaration. The very position of the declaration within the C program indicates that it is a global public integer variable. The assembly program that accesses that variable and returns it is:
        .extern intpub
        .globl ascallee
ascallee:
        push %ebp
        movl %esp, %ebp
        movl intpub, %eax
        pop %ebp
        ret
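For comparison, the same access written entirely in C might look like the sketch below. The variable name intpub comes from the assembly listing; the initial value 1998 and the function name ascallee_in_c are assumptions made for illustration.

```c
#include <assert.h>

/* Defined at file scope with no storage-class keyword, so the symbol
   is public. The value is illustrative only. */
int intpub = 1998;

/* In a separate C module this declaration alone would be written;
   it plays the role of ".extern intpub" in the assembly program. */
extern int intpub;

/* Returns the current value of the public variable. */
int ascallee_in_c(void)
{
    return intpub;      /* movl intpub, %eax */
}

void check_public_variable(void)
{
    intpub = 42;
    assert(ascallee_in_c() == 42);
}
```

The linker matches the reference to the definition by name, exactly as it does for the assembly version.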
has the same effect as a C program in which an automatic integer variable is defined, its value is set to 5, and 15 is added to it. Then the sum, accumulated in that storage location, is returned. With a caller like:
    #include <stdio.h>

    int ascallee(void);

    int main(void)
    {
        int iret;
        iret = ascallee();
        printf("iret=%d\n", iret);
        return 0;
    }
Note here that the type of the variable does not matter. What we have allocated will suffice for five integers, or five single-precision variables, or 20 characters, or two double-precision IEEE754 automatic variables followed by one single-precision IEEE754 automatic variable.
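The C program that the text describes, and the size arithmetic behind the last remark, can be sketched as follows. The names are hypothetical, and the figure of 20 bytes is inferred from the sizes listed in the text rather than taken from a listing shown here.

```c
#include <assert.h>
#include <stdint.h>

/* C counterpart of the assembly routine: one automatic integer,
   set to 5, then 15 added to it, then returned. */
int ascallee_in_c(void)
{
    int autov;          /* automatic: its storage is in the stack frame */

    autov = 5;
    autov += 15;
    return autov;
}

/* 20 raw bytes of stack space can be carved up in several ways.
   The last check assumes 8-byte doubles and 4-byte floats (IEEE754),
   as on the 80X86. */
void check_sizes(void)
{
    assert(5 * sizeof(int32_t) == 20);
    assert(5 * sizeof(float) == 20);
    assert(20 * sizeof(char) == 20);
    assert(2 * sizeof(double) + sizeof(float) == 20);
}
```

The stack itself records no types; only the instructions that read and write the space decide how the bytes are interpreted.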
The Factorial
See the C-callable assembly language program which computes and returns the factorial of its only argument:
        .globl ascallee
ascallee:
        push %ebp
        movl %esp, %ebp
/access parameter
        movl 8(%ebp), %eax
/is it zero
        orl %eax, %eax
/yes, return one
        jz retone
/no, go deeper
        push %edx
        push %ebx
        movl %eax, %ebx
        decl %eax
        push %eax
        call ascallee
        addl $4, %esp
/multiply the result by the saved argument; this destroys EDX
        mull %ebx
        pop %ebx
        pop %edx
        pop %ebp
        ret
retone:
/return one
        movl $1, %eax
        pop %ebp
        ret
Recall here that executing the multiplication instruction destroys the EDX register. That is why we push and pop it. We use the EBX register as temporary storage for the current value of the argument.
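For reference, a C function with the same recursive structure is sketched below. The name factorial is an assumption; the local variable saved plays the role that EBX plays in the assembly routine.

```c
#include <assert.h>

/* Recursive factorial with the same shape as the assembly routine. */
unsigned int factorial(unsigned int n)
{
    unsigned int saved;

    if (n == 0)                 /* orl %eax, %eax ; jz retone */
        return 1;

    saved = n;                  /* movl %eax, %ebx            */
    return saved * factorial(n - 1);
}

void check_factorial(void)
{
    assert(factorial(0) == 1);
    assert(factorial(5) == 120);
}
```

Like the assembly version, this keeps only the low 32 bits of the product, so results are exact only up to factorial(12).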
Fibonacci
Notice the automatic variable used for accumulating the temporary sum:
        .globl ascallee
ascallee:
        push %ebp
        movl %esp, %ebp
/access parameter
        movl 8(%ebp), %eax
/is it zero?
        orl %eax, %eax
/yes, return one
        jz retone
/is it one?
        cmpl $1, %eax
/yes, return one
        jz retone
/no, go deeper
/allocate automatic storage
        subl $4, %esp
        decl %eax
        push %eax
        call ascallee
        addl $4, %esp
        movl %eax, -4(%ebp)
/access parameter again
        movl 8(%ebp), %eax
        decl %eax
        decl %eax
        push %eax
        call ascallee
        addl $4, %esp
        addl -4(%ebp), %eax
/deallocate automatic storage
        addl $4, %esp
        pop %ebp
        ret
retone:
/return one
        movl $1, %eax
        pop %ebp
        ret
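A C function with the same structure is sketched below. The names fib and partial are assumptions; partial corresponds to the automatic storage at -4(%ebp). Note that, following the assembly comments, both 0 and 1 map to one.

```c
#include <assert.h>

/* Recursive Fibonacci in the shape of the assembly routine:
   fib(0) = fib(1) = 1, fib(n) = fib(n-1) + fib(n-2). */
int fib(int n)
{
    int partial;                /* automatic variable, like -4(%ebp) */

    if (n == 0 || n == 1)       /* both base cases return one */
        return 1;

    partial = fib(n - 1);       /* first recursive call, result saved */
    return fib(n - 2) + partial;
}

void check_fib(void)
{
    assert(fib(0) == 1);
    assert(fib(1) == 1);
    assert(fib(5) == 8);
}
```

The first result has to be parked in memory because EAX is overwritten by the second recursive call; a register would not survive it unless it were saved and restored.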
Ackermann
This C program computes the Ackermann function. The definition of the Ackermann function is also clear from this C program:
    #include <stdio.h>

    int ack(int m, int n);

    int main()
    {
        printf("%d\n", ack(2,3));
        return 0;
    }

    int ack(int m, int n)
    {
        if (m==0) return (n+1);
        if (n==0) return (ack(m-1,1));
        return(ack(m-1,ack(m,n-1)));
    }
And this C-callable assembly language program produces identical results for all parameter pairs:
        .globl ascallee
ascallee:
        push %ebp
        movl %esp, %ebp
/ now there are two parameters f(m,n)
/ n, the last parameter was pushed first
/ after that m was pushed, after that return address
/ and still after that EBP - these are all 32-bit integers
/therefore at EBP+8, there is m
/therefore at EBP+12, there is n
/access m
        movl 8(%ebp), %eax
/is it zero?
        orl %eax, %eax
/yes, return n plus one
        jz retnp1
/access n
        movl 12(%ebp), %eax
/is it zero?
        orl %eax, %eax
/yes, return f(m-1,1)
        jz retmn1
/else return f(m-1, f(m, n-1))
/access n
        movl 12(%ebp), %eax
        decl %eax
        push %eax
        movl 8(%ebp), %eax
        push %eax
/ (m, n-1) has been pushed into stack just now
        call ascallee
        addl $8, %esp
/push f(m, n-1) as n
        push %eax
/get m
        movl 8(%ebp), %eax
        decl %eax
        push %eax
/ (m-1,f(m, n-1)) has been pushed into stack just now
        call ascallee
        addl $8, %esp
/ f(m-1,f(m, n-1)) is in EAX
        pop %ebp
        ret
retnp1:
/return n plus one
        movl $1, %eax
        addl 12(%ebp), %eax
        pop %ebp
        ret
retmn1:
/return f(m-1,1)
        pushl $1
        movl 8(%ebp), %eax
        decl %eax
        push %eax
        call ascallee
        addl $8, %esp
        pop %ebp
        ret
From here
For further studies, read: Allen I. Holub, "Compiler Design in C", Prentice Hall, 1993. Visit the companion website (Allen Holub).
Get a lot of free software and documentation from this public ftp site: Sunsite Linux Archives.
Read these assorted articles from Ghoshal's website on this topic:
On MSDOS platforms:
ACKERMAN.80X86.ASSEMBLY has a C-callable 80X86 assembly language program that computes the Ackermann function. It uses the small memory model.
ACKERMAN.C is its C caller.
ACKERMAN.COMMENTS.ASSEMBLY is commented.
FACTORIAL.ASSEMBLY is a C-callable assembly language program to compute the factorial function.
FACTORIAL.C is its caller.
RULES.C.CALLS.ASSEMBLY describes the rules of writing C and assembly language programs that call each other.
resonance.article.5.ps has the fifth article of resonance. It describes 80X86 CPU assembly language programming.
On UNIX platforms:
unix.c.calls.assembly.ps has a series of experiments. They show how to write C programs and assembly language programs that call each other recursively. This postscript file supplements this html tutorial.
Read these other books also; see the list of recommended books.
Your Response
Send me email at ghoshal@serc.iisc.ernet.in if you have any question, suggestion or comment.