STRING OPERATION
If you learn to efficiently process strings and arrays, you can master the most common area of
code optimization. Studies have shown that most programs spend 90% of their running time
executing 10% of their code. No doubt the 10% occurs frequently in loops, and loops are
required when processing strings and arrays. In this chapter, we will show techniques for
string and array processing, with the goal of writing efficient code. We will begin with the
optimized string primitive instructions designed for moving, comparing, loading, and storing
blocks of data.
A character string (or character array) is used for descriptive data such as people's names and
page titles.
A string variable can be defined within single quotes, as 'Hello World', or within double quotes, as
"Hello World". The DB directive is the only sensible format for defining character data. An
example is as follows:
TEXT    DB 'Hello World!'
NAME    DB 'Ali DEMR'
SCHOOL  DB 'Karabuk University'
ARRAY
MESSAGE
The x86 instruction set has five groups of instructions for processing arrays of bytes, words,
and doublewords. Although they are called string primitives, they are not limited to character
arrays. Each instruction in the Table below implicitly uses SI, DI, or both registers to address
memory. References to the accumulator imply the use of AL, AX, or EAX, depending on the
instruction data size. String primitives execute efficiently because they automatically repeat
and increment array indexes.
String Primitive Instructions.
DESCRIPTION      BYTE          WORD          DOUBLEWORD    IMPLIED
                 INSTRUCTION   INSTRUCTION   INSTRUCTION   OPERANDS
Copy String      MOVSB         MOVSW         MOVSD         ES:DI, DS:SI
Load String      LODSB         LODSW         LODSD         AX, DS:SI
Store String     STOSB         STOSW         STOSD         ES:DI, AX
Compare String   CMPSB         CMPSW         CMPSD         DS:SI, ES:DI
Scan String      SCASB         SCASW         SCASD         ES:DI, AX
Example: Copy a String. In the following example, MOVSB moves 10 bytes from string1 to
string2. The repeat prefix first tests CX > 0 before executing the MOVSB instruction. If
CX = 0, the instruction is ignored and control passes to the next line in the program.
If CX > 0, CX is decremented and the instruction repeats:
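cld                      ; clear Direction flag (forward)
mov si,OFFSET string1    ; SI points to source
mov di,OFFSET string2    ; DI points to target
mov cx,10                ; set counter to 10
rep movsb                ; move 10 bytes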
Forgetting to set the Direction flag before a string primitive instruction can be a major
headache, since the SI and DI registers may not increment or decrement as intended.
You can use a repeat prefix with MOVSB, MOVSW, and MOVSD. The Direction flag
determines whether SI and DI are incremented (flag clear, after CLD) or decremented
(flag set, after STD). The size of the increment/decrement is shown in the following table:
INSTRUCTION   VALUE ADDED TO OR SUBTRACTED FROM SI AND DI
MOVSB         1
MOVSW         2
MOVSD         4
Example: Copy Doubleword Array. Suppose we want to copy 20 doubleword integers from
source to target. After the array is copied, SI and DI point one position (4 bytes) beyond
the end of each array:
.data
source DD 20 DUP(0FFFFFFFFh)
target DD 20 DUP(?)
.code
cld                      ; direction = forward
mov cx,20                ; set REP counter to the length of source
mov si,OFFSET source     ; SI points to source
mov di,OFFSET target     ; DI points to target
rep movsd                ; copy doublewords
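You can use a repeat prefix with CMPSB, CMPSW, and CMPSD. The Direction flag
determines the incrementing or decrementing of SI and DI.
To compare a single pair of doublewords, execute CMPSD once and follow it with a
conditional jump; JA, for example, jumps if the source is greater than the target.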
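A minimal sketch of this single comparison follows. The variables val1 and val2, their values, and the label L1 are hypothetical names chosen for illustration. It also assumes that ES addresses the same data segment as DS and that a 386 or later processor directive is in effect, since CMPSD is a 32-bit instruction:

```asm
.data
val1 DD 1234h            ; hypothetical source value
val2 DD 5678h            ; hypothetical target value
.code
mov si,OFFSET val1       ; DS:SI points to the source operand
mov di,OFFSET val2       ; ES:DI points to the target operand
cmpsd                    ; compare doublewords
ja  L1                   ; jump if source > target (unsigned)
; execution continues here, because 1234h is not above 5678h
L1:
```

CMPSD sets the flags as if the target were subtracted from the source, so an unsigned jump such as JA or JB is the natural choice when the values are treated as unsigned.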
To compare multiple doublewords, clear the Direction flag (forward direction), initialize ECX
as a counter, and use a repeat prefix with CMPSD:
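A minimal sketch of such a comparison, assuming the source and target arrays of 20 doublewords defined in the copy example earlier, ES addressing the same segment as DS, and a hypothetical label named mismatch:

```asm
mov si,OFFSET source     ; DS:SI points to the first array
mov di,OFFSET target     ; ES:DI points to the second array
cld                      ; direction = forward
mov cx,20                ; repetition counter = number of doublewords
repe cmpsd               ; repeat while equal and CX > 0
jne mismatch             ; hypothetical label: a pair of elements differed
```

The REPE prefix stops at the first pair that differs or when CX reaches 0. If it stops on a difference, SI and DI already point one doubleword past the pair that did not match.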
Karabük Üniversitesi Uzaktan Eğitim Araştırma ve Uygulama Merkezi
Mühendislik Fakültesi No: 215 Balıklarkayası Mevkii 78050 Karabük TÜRKİYE
Scan for a Matching Character. In the following example we search the string alpha, looking
for the letter F. If the letter is found, DI points one position beyond the matching character. If
the letter is not found, JNZ exits:
.data
alpha DB "ABCDEFGH"
.code
mov di,OFFSET alpha      ; DI points to the string
mov al,'F'               ; search for the letter F
mov cx,8                 ; set the search count to the length of alpha
cld                      ; direction = forward
repne scasb              ; repeat while not equal
jnz quit                 ; quit if letter not found
dec di                   ; found: back up DI
JNZ was added after the loop to test for the possibility that the loop stopped because CX = 0
and the character in AL was not found.
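STOSB, STOSW, and STOSD store the contents of AL, AX, or EAX in memory at the offset
addressed by ES:DI. Used with a repeat prefix, they are useful for filling all elements of a
string or array with a single value. For example, the following code initializes each byte in
string1 to 0FFh: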
.data
Count = 100              ; assembly-time constant, so it can size the array
string1 DB Count DUP(?)
.code
mov al,0FFh              ; value to be stored
mov di,OFFSET string1    ; DI points to target
mov cx,Count             ; character count
cld                      ; direction = forward
rep stosb                ; fill with contents of AL
Examples
.MODEL SMALL
.STACK 64
.DATA
SOURCE  DB '-COMPUTER-'
TARGET  DB 'ELECTRONIC'
TARGET2 DB '----------'
.CODE
MAIN PROC FAR
        MOV AX,@DATA        ; initialize DS and ES, which the
        MOV DS,AX           ; string instructions use implicitly
        MOV ES,AX
        CALL LEFTSIDE
        CALL RIGHTSIDE
        MOV AH,4CH          ; return to DOS
        INT 21H
MAIN ENDP
LEFTSIDE PROC NEAR
        CLD                 ; left to right
        MOV CX,10           ; copy 10 bytes
        LEA SI, SOURCE
        LEA DI, TARGET
        REP MOVSB
        RET
LEFTSIDE ENDP
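The listing calls RIGHTSIDE, but its body does not appear here. The following is only a sketch of what such a procedure might look like. It assumes RIGHTSIDE copies the 10 bytes of SOURCE into TARGET2 from right to left, which the unused TARGET2 variable and the procedure name suggest but the text does not confirm:

```asm
RIGHTSIDE PROC NEAR
        STD                 ; right to left (assumed behaviour)
        MOV CX,10           ; copy 10 bytes
        LEA SI, SOURCE+9    ; SI points to the last byte of SOURCE
        LEA DI, TARGET2+9   ; DI points to the last byte of TARGET2
        REP MOVSB
        CLD                 ; restore the forward direction for later code
        RET
RIGHTSIDE ENDP
```

With the Direction flag set, MOVSB decrements SI and DI after each byte, so both pointers must start at the end of their strings. The bytes are stored at the same positions as in the forward copy; only the order of copying changes.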
FINISH:
MOV AH,4CH
INT 21H