7. STRING OPERATIONS
If you learn to efficiently process strings and arrays, you can master the most common area of
code optimization. Studies have shown that most programs spend 90% of their running time
executing 10% of their code. No doubt the 10% occurs frequently in loops, and loops are
required when processing strings and arrays. In this chapter, we will show techniques for
string and array processing, with the goal of writing efficient code. We will begin with the
optimized string primitive instructions designed for moving, comparing, loading, and storing
blocks of data.
A character string (or character array) is used for descriptive data such as people's names and
page titles.
A string can be defined within single quotes, as in 'Hello World', or double quotes, as in
"Hello World". The DB directive is the only sensible format for defining character data. An
example is as follows:
TEXT     DB 'Hello World!'
NAME     DB 'Ali DEMR'
SCHOOL   DB 'Karabuk University'
ARRAY    DB 200 DUP ('*')
MESSAGE  DB ' Press a Key'

The x86 instruction set has five groups of instructions for processing arrays of bytes, words,
and doublewords. Although they are called string primitives, they are not limited to character
arrays. Each instruction in the Table below implicitly uses SI, DI, or both registers to address
memory. References to the accumulator imply the use of AL, AX, or EAX, depending on the
instruction data size. String primitives execute efficiently because they automatically repeat
and increment array indexes.
String Primitive Instructions.

DESCRIPTION      BYTE          WORD          DOUBLEWORD    IMPLIED
                 INSTRUCTION   INSTRUCTION   INSTRUCTION   OPERANDS

Copy String      MOVSB         MOVSW         MOVSD         ES:DI, DS:SI
Load String      LODSB         LODSW         LODSD         AL/AX/EAX, DS:SI
Store String     STOSB         STOSW         STOSD         ES:DI, AL/AX/EAX
Compare String   CMPSB         CMPSW         CMPSD         DS:SI, ES:DI
Scan String      SCASB         SCASW         SCASD         ES:DI, AL/AX/EAX

Using a Repeat Prefix


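By itself, a string primitive instruction processes only a single memory value or pair of values.
If you add a repeat prefix, the instruction repeats, using CX (ECX in 32-bit code) as a counter.
The repeat prefix permits you to process an entire array using a single instruction.
The following repeat prefixes are used:

REP            Repeat while CX > 0
REPE, REPZ     Repeat while the Zero flag is set and CX > 0
REPNE, REPNZ   Repeat while the Zero flag is clear and CX > 0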

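Assuming the Direction flag is clear (D = 0), CX > 0, and DS = ES, the following loop copies the
same bytes as a REP MOVSB instruction. Unlike REP MOVSB, it also overwrites AL:

Again:
MOV AL, [SI]
MOV [DI], AL
INC SI             ; if D = 0
INC DI             ; if D = 0
LOOP Again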

Example: Copy a String. In the following example, MOVSB moves 10 bytes from string1 to
string2. The repeat prefix first tests CX > 0 before executing the MOVSB instruction. If
CX = 0, the instruction is ignored and control passes to the next line in the program.
If CX > 0, the instruction executes, CX is decremented, and the test repeats:
cld                       ; clear direction flag
mov si,OFFSET string1     ; SI points to source
mov di,OFFSET string2     ; DI points to target
mov cx,10                 ; set counter to 10
rep movsb                 ; move 10 bytes

Karabük Üniversitesi Uzaktan Eğitim Araştırma ve Uygulama Merkezi
Mühendislik Fakültesi No: 215 Balıklarkayası Mevkii 78050 Karabük TÜRKİYE

The following example copies 20 characters from STRING1 to STRING2:

STRING1 DB 20 DUP ('*')
STRING2 DB 20 DUP (' ')
CLD
MOV CX,20
LEA DI,STRING2
LEA SI,STRING1
REP MOVSB
SI (or ESI) and DI (or EDI) are automatically incremented when MOVSB repeats. This
behavior is controlled by the CPU's Direction flag.
Direction Flag. String primitive instructions increment or decrement SI and DI based on the
state of the Direction flag (see Table 9-2): when the flag is clear they increment (addresses run
from low to high), and when it is set they decrement (high to low). The Direction flag can be
explicitly modified using the CLD and STD instructions:
CLD      ; clear Direction flag (forward direction)
STD      ; set Direction flag (reverse direction)

Forgetting to set the Direction flag before a string primitive instruction can be a major
headache, since the ESI and EDI registers may not increment or decrement as intended.
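One case where the direction matters is a copy whose source and target overlap. The following is a minimal sketch, assuming a 16-bit MASM-style assembler and a DOS environment; the buffer name buf and its contents are hypothetical. It shifts five bytes one position to the right, which only works if the copy runs from high to low addresses:

```asm
; Sketch: shift the first 5 bytes of buf one position to the right.
; buf is a hypothetical buffer used only for illustration.
.MODEL SMALL
.STACK 64
.DATA
buf DB 'ABCDE', '-'          ; becomes 'AABCDE' after the copy
.CODE
MAIN PROC FAR
mov ax,@DATA
mov ds,ax                    ; DS:SI addresses the source
mov es,ax                    ; ES:DI addresses the target
std                          ; direction = reverse
mov si,OFFSET buf+4          ; last source byte ('E')
mov di,OFFSET buf+5          ; last target byte
mov cx,5
rep movsb                    ; copies E, D, C, B, A in that order
cld                          ; restore the usual forward direction
mov ah,4Ch
int 21h
MAIN ENDP
END MAIN
```

With the Direction flag clear and SI, DI set to the first bytes instead, each byte would be overwritten before it is read and the buffer would end up as 'AAAAAA'.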

MOVSB, MOVSW, and MOVSD


The MOVSB, MOVSW, and MOVSD instructions copy data from the memory location
pointed to by ESI to the memory location pointed to by EDI. The two registers are either
incremented or decremented automatically (based on the value of the Direction flag).

You can use a repeat prefix with MOVSB, MOVSW, and MOVSD. The Direction flag
determines whether ESI and EDI will be incremented or decremented. The size of the
increment/decrement is shown in the following table:

INSTRUCTION   VALUE ADDED TO OR SUBTRACTED FROM ESI AND EDI
MOVSB         1
MOVSW         2
MOVSD         4


Example: Copy Doubleword Array. Suppose we want to copy 20 doubleword integers from
source to target. After the array is copied, ESI and EDI point one position (4 bytes) beyond
the end of each array:
.data
source DD 20 DUP(0FFFFFFFFh)
target DD 20 DUP(?)
.code
cld                       ; direction = forward
mov cx,20                 ; set REP counter to the length of source
mov si,OFFSET source      ; SI points to source
mov di,OFFSET target      ; DI points to target
rep movsd                 ; copy doublewords
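The repeat counter always counts elements of the instruction's size, not bytes. As a rough sketch under the assumption of MASM 6 or later (for the SIZEOF operator) and a DOS environment, the same 80-byte block can be copied as 40 words with MOVSW, which does not need a 32-bit processor:

```asm
; Sketch: copy the same 80 bytes as 40 words instead of 20 doublewords.
.MODEL SMALL
.STACK 64
.DATA
source DD 20 DUP(0FFFFFFFFh)       ; 20 doublewords = 80 bytes
target DD 20 DUP(?)
.CODE
MAIN PROC FAR
mov ax,@DATA
mov ds,ax                          ; MOVSW reads DS:SI
mov es,ax                          ; and writes ES:DI
cld                                ; direction = forward
mov si,OFFSET source
mov di,OFFSET target
mov cx,(SIZEOF source) / 2         ; 80 bytes / 2 bytes per word = 40
rep movsw                          ; SI and DI advance by 2 each time
; here SI = OFFSET source + 80 and DI = OFFSET target + 80
mov ah,4Ch
int 21h
MAIN ENDP
END MAIN
```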

CMPSB, CMPSW, and CMPSD


The CMPSB, CMPSW, and CMPSD instructions each compare a memory operand pointed to
by ESI to a memory operand pointed to by EDI. CMPSB compares bytes, CMPSW compares
words, and CMPSD compares doublewords.

You can use a repeat prefix with CMPSB, CMPSW, and CMPSD. The Direction flag
determines the incrementing or decrementing of ESI and EDI.

Example: Comparing Doublewords. Suppose you want to compare a pair of doublewords
using CMPSD. In the following example, source has a smaller value than target, so the JA
instruction will not jump to label L1.
.data
source DD 1234h
target DD 5678h
.code
mov si,OFFSET source
mov di,OFFSET target
cmpsd                     ; compare doublewords
ja L1                     ; jump if source > target

To compare multiple doublewords, clear the Direction flag (forward direction), initialize ECX
as a counter, and use a repeat prefix with CMPSD:
mov si,OFFSET source
mov di,OFFSET target
cld                       ; direction = forward
mov cx,LENGTHOF source    ; repetition counter
repe cmpsd                ; repeat while equal
The REPE prefix repeats the comparison, incrementing SI and DI automatically until CX
equals zero or a pair of doublewords is found to be different.
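After a repeated compare, the Zero flag tells why the loop stopped and SI tells where. The sketch below assumes a 16-bit MASM-style assembler and DOS; it uses CMPSB so that it runs without 32-bit instructions, but the same idea applies to CMPSD with 4-byte steps. The names str1 and str2 and the 0FFFFh "no difference" marker are hypothetical:

```asm
; Sketch: index of the first differing byte of two 10-byte strings.
.MODEL SMALL
.STACK 64
.DATA
str1 DB 'ASSEMBLERS'
str2 DB 'ASSEMBLING'
.CODE
MAIN PROC FAR
mov ax,@DATA
mov ds,ax                 ; CMPSB reads DS:SI
mov es,ax                 ; and ES:DI
cld                       ; direction = forward
mov si,OFFSET str1
mov di,OFFSET str2
mov cx,10                 ; repetition counter
repe cmpsb                ; repeat while equal
mov bx,0FFFFh             ; MOV leaves the flags from CMPSB intact
je done                   ; ZF = 1: all 10 bytes matched
mov bx,si                 ; SI is one past the differing byte
sub bx,OFFSET str1
dec bx                    ; BX = 7 for this data ('E' vs 'I')
done:
mov ah,4Ch
int 21h
MAIN ENDP
END MAIN
```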

SCASB, SCASW, and SCASD


The SCASB, SCASW, and SCASD instructions compare a value in AL/AX/EAX to a byte,
word, or doubleword, respectively, addressed by ES:DI. The instructions are useful when looking
for a single value in a string or array. Combined with the REPE (or REPZ) prefix, the string
or array is scanned while CX > 0 and the value in AL/AX/EAX matches each subsequent
value in memory. The REPNE prefix scans until either AL/AX/EAX matches a value in
memory or CX = 0.

Scan for a Matching Character. In the following example we search the string alpha, looking
for the letter F. If the letter is found, DI points one position beyond the matching character. If
the letter is not found, JNZ exits:
.data
alpha DB "ABCDEFGH"
.code
mov di,OFFSET alpha       ; DI points to the string
mov al,'F'                ; search for the letter F
mov cx,8                  ; set the search count to the length of alpha
cld                       ; direction = forward
repne scasb               ; repeat while not equal
jnz quit                  ; quit if letter not found
dec di                    ; found: back up DI

JNZ was added after the loop to test for the possibility that the loop stopped because CX = 0
and the character in AL was not found.
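A common use of REPNE SCASB is measuring a zero-terminated string. The following is a minimal sketch, assuming a 16-bit MASM-style assembler, DOS, and a string that really does contain a 0 byte within the segment; the name msg is hypothetical:

```asm
; Sketch: length of a zero-terminated string using REPNE SCASB.
.MODEL SMALL
.STACK 64
.DATA
msg DB 'Hello World',0    ; 11 characters plus the terminator
.CODE
MAIN PROC FAR
mov ax,@DATA
mov ds,ax
mov es,ax                 ; SCASB scans ES:DI
cld                       ; direction = forward
mov di,OFFSET msg
xor al,al                 ; search for the 0 terminator
mov cx,0FFFFh             ; largest possible count
repne scasb               ; stops after the 0 byte has been examined
not cx                    ; CX = number of bytes examined (length + 1)
dec cx                    ; CX = 11, the length without the terminator
mov ah,4Ch
int 21h
MAIN ENDP
END MAIN
```

Starting the counter at 0FFFFh means that CX, after NOT, holds exactly the number of bytes SCASB examined, so no separate length variable is needed.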

STOSB, STOSW, and STOSD


The STOSB, STOSW, and STOSD instructions store the contents of AL/AX/EAX,
respectively, in memory at the offset pointed to by DI. DI is incremented or decremented
based on the state of the Direction flag. When used with the REP prefix, these instructions are
useful for filling all elements of a string or array with a single value. For example, the
following code initializes each byte in string1 to 0FFh:
.data
Count = 100
string1 DB Count DUP(?)
.code
mov al,0FFh               ; value to be stored
mov di,OFFSET string1     ; DI points to target
mov cx,Count              ; character count
cld                       ; direction = forward
rep stosb                 ; fill with contents of AL
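The word form works the same way, except that the counter holds the number of words and DI moves by 2. A short sketch, assuming a 16-bit MASM-style assembler and DOS; the array name wordArray and the fill value are hypothetical:

```asm
; Sketch: fill a 50-word array with 1234h using REP STOSW.
.MODEL SMALL
.STACK 64
.DATA
Count = 50                    ; assembly-time constant, not a variable
wordArray DW Count DUP(?)
.CODE
MAIN PROC FAR
mov ax,@DATA
mov es,ax                     ; STOSW writes to ES:DI
mov ax,1234h                  ; value to be stored (loaded after ES is set)
mov di,OFFSET wordArray       ; DI points to target
mov cx,Count                  ; number of words, not bytes
cld                           ; direction = forward
rep stosw                     ; DI advances by 2 per store
mov ah,4Ch
int 21h
MAIN ENDP
END MAIN
```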

LODSB, LODSW and LODSD


The LODSB, LODSW, and LODSD instructions load a byte, word, or doubleword from memory at ESI
into AL/AX/EAX, respectively. ESI is incremented or decremented based on the state of the
Direction flag. The REP prefix is rarely used with LODS because each new value loaded into
the accumulator overwrites its previous contents. Instead, LODS is used to load a single
value. In the next example, LODSB substitutes for the following two instructions (assuming
the Direction flag is clear):
mov al,[si] ; move byte into AL
inc si ; point to next byte
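LODSB is most useful inside an ordinary loop, where each loaded value is processed before the next one is fetched. The following sketch, assuming a 16-bit MASM-style assembler and DOS, pairs LODSB with STOSB to convert a string to upper case in place; the names text1, TEXTLEN, nextch, and keep are hypothetical:

```asm
; Sketch: convert lowercase ASCII letters to upper case in place.
.MODEL SMALL
.STACK 64
.DATA
text1 DB 'string primitives'
TEXTLEN = ($ - text1)         ; length computed by the assembler
.CODE
MAIN PROC FAR
mov ax,@DATA
mov ds,ax                     ; LODSB reads DS:SI
mov es,ax                     ; STOSB writes ES:DI
cld                           ; direction = forward
mov si,OFFSET text1           ; source index
mov di,si                     ; destination index (same string)
mov cx,TEXTLEN                ; loop counter
nextch:
lodsb                         ; AL = [SI], then SI = SI + 1
cmp al,'a'
jb keep                       ; below 'a': not a lowercase letter
cmp al,'z'
ja keep                       ; above 'z': not a lowercase letter
sub al,20h                    ; 'a' - 'A' = 20h
keep:
stosb                         ; [DI] = AL, then DI = DI + 1
loop nextch
mov ah,4Ch
int 21h
MAIN ENDP
END MAIN
```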
Array Multiplication Example. The following program multiplies each element of a
doubleword array by a constant value. LODSD and STOSD work together:
; This program multiplies each element of an array
; of 32-bit integers by a constant value.
.data
array DD 1,2,3,4,5,6,7,8,9,10   ; test data
multiplier DD 10                ; test data
.code
cld                       ; direction = forward
mov si,OFFSET array       ; source index
mov di,si                 ; destination index
mov cx,LENGTHOF array     ; loop counter
L1: lodsd                 ; load [SI] into EAX
mul multiplier            ; multiply by a value
stosd                     ; store EAX into [DI]
loop L1


Examples
.MODEL SMALL
.STACK 64
.DATA
SOURCE  DB '-COMPUTER-'
TARGET  DB 'ELECTRONIC'
TARGET2 DB '----------'
.CODE
MAIN PROC FAR
MOV AX,@DATA        ; MOVSB reads DS:SI and writes ES:DI,
MOV DS,AX           ; so both segment registers must point
MOV ES,AX           ; to the data segment
CALL LEFTSIDE
CALL RIGHTSIDE
MOV AH,4CH
INT 21H
MAIN ENDP

LEFTSIDE PROC NEAR
CLD                 ; left to right
MOV CX,10
LEA SI,SOURCE
LEA DI,TARGET
REP MOVSB
RET
LEFTSIDE ENDP

RIGHTSIDE PROC NEAR
STD                 ; right to left
MOV CX,10
LEA SI,SOURCE+9     ; last byte of SOURCE
LEA DI,TARGET2+9    ; last byte of TARGET2
REP MOVSB
RET
RIGHTSIDE ENDP
END MAIN

If the two strings are the same, set BH = 1; otherwise set BH = 0:

NAME1 DB 'ASSEMBLERS'
NAME2 DB 'ASSEMBLERS'
CLD
MOV CX,10
LEA SI,NAME1
LEA DI,NAME2
REPE CMPSB          ; while equal, continue comparison
JNE NOTEQUAL
MOV BH,01
JMP FINISH
NOTEQUAL:
MOV BH,0
FINISH:
MOV AH,4CH
INT 21H

The following program finds the # character and replaces it with $:

.MODEL SMALL
.STACK 64
.DATA
TEXT DB 'LDA #305A'
.CODE
MAIN PROC FAR
MOV AX,@DATA
MOV DS,AX           ; DS is used by [DI-1] below
MOV ES,AX           ; SCASB scans ES:DI
CLD
MOV AL,'#'
MOV BH,'$'
MOV CX,9
LEA DI,TEXT
REPNE SCASB
JNE EXT             ; '#' not found
MOV BYTE PTR [DI-1],BH
EXT:
MOV AH,4CH
INT 21H
MAIN ENDP
END MAIN
