Analyzing MSOffice malware with OfficeMalScanner

Version: 1.0

Last Update: 30th July 2009

Author: Frank Boldewin / www.reconstructer.org

Analyzing MSOffice malware with OfficeMalScanner www.reconstructer.org Page 1 of 19


File: Analyzing MSOffice malware with OfficeMalScanner.pdf 30/07/2009

Table of Contents

1 ABSTRACT
2 INTRODUCTION TO OFFICEMALSCANNER
3 FEATURE OVERVIEW
4 PRACTICAL USAGE
5 MALHOST-SETUP
6 CONCLUSION
7 REFERENCES

1 Abstract
"If you know the enemy and know yourself, your victory will not
stand in doubt; if you know Heaven and know Earth, you may
make your victory complete."

Sun Tzu – Art of War

If we believe the statistics and trends reported by industry giants like
Symantec or McAfee, cybercrime is one of the fastest growing markets
today. We all know the classic bank robbery is history, and even attacks
against the “system”, meaning governments, telecommunication
infrastructure or energy plants, are easier than ever before with
powerful malicious tools and botnets in hand. Furthermore, browser-based
attacks serving trojans that steal everything that matters, be it banking
data, passwords, credit card information, confidential documents or mail
conversations, are the order of the day. Next to attacks with standard
exploit packs and well-known trojans, there are more and more targeted
attacks, also known as spear phishing. Infamous examples are the
espionage attacks against the German, UK and US governments in September
2007 [1] and the GhostNet case in March 2009 [2]. While ordinary attacks
against machines connected to the internet are often carried out as
so-called drive-by attacks, spear phishing is accomplished by sending
authentic-looking mails to a select few people. The mails contain
attachments pretending to be important presentations or whitepapers. If
victims open these attachments, they are likely to get owned by a
specially prepared exploit for standard tools like MSOffice or Adobe
Reader. If the target is important enough and the security measures at
the victim’s site are expected to be high, then so-called 0-days
(unknown exploitable bugs) are very likely to be deployed. Once installed
on the victim’s boxes, these trojans start stealing as much information
as possible. If the freshly installed malware uses a well-written,
proprietary rootkit, it becomes hard to detect the actions of the trojan
code. So, next to 0-day prevention measures like hardened OSes and
applications, it is important to have good forensic tools at hand that
give the targeted site a powerful weapon to analyze the malicious gift.
In the past, several people developed good tools to analyze PDF
documents, but there are still no good tools available for the MSOffice
document format. This paper introduces a new tool called
OfficeMalScanner, which aims to be a forensic tool for the MSOffice
document format.

2 Introduction to OfficeMalScanner
Some months ago, I had the need to analyze a malicious PowerPoint
document and started searching for good tools on the web. What I was
searching for was a document dissector, a shellcode fingerprint scanner
and a VB macro detacher. The free tools I found were:

• Officecat [3] - A command line utility that can be used to process
Microsoft Office documents for the presence of potential exploit
conditions in the file. Unfortunately, this tool only reports
whether the scanned file is malicious and which exploit was used
(CVE and MS numbers).
• STG Docfile Viewer [4] - This tool is a very basic format parser and
was not much help with my problem of finding out where the shellcode
and possibly its embedded encrypted executables are located.

My next step was to find documentation about the file format itself and
how to parse its structure. This time I had much more luck, as Microsoft
was kind enough to release some very detailed papers about the format
specs here [5] and here [6]. This helped me a lot in writing my own
forensic tool, OfficeMalScanner [7]. In the following pages, I will
describe in detail what can be done with this forensic utility. Be aware
that OfficeMalScanner only scans the older Office binary file formats.
Office 2007 and newer use an XML-based structure, and it is very easy to
look inside these files. You can open files with extensions like docx,
pptx, xlsx, docm, pptm and so forth with WinZip or WinRAR and then open
the contained files with a normal text editor to see what’s inside. The
only exceptions are files containing VB macros (docm, pptm and xlsm).
Next to the usual XML files, you should find a file called
vbaproject.bin. This file contains the compressed VB macrocode, which is
not XML but is stored in the old binary file format style. You can
extract this file and then scan it with OfficeMalScanner to uncompress
the VB macrocode data. However, this will be described in detail later in
this paper.
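
Since the newer formats are plain ZIP containers, pulling out the macro
storage can also be scripted. The following is only a minimal sketch and
not part of the OfficeMalScanner package: the function names are made
up, and the demo archive is a tiny in-memory stand-in for a real .docm
file.

```python
import io
import zipfile


def extract_vba_parts(container):
    """Return {member name: bytes} for every vbaProject.bin in an OOXML file.

    `container` may be a path or a binary file-like object.
    """
    parts = {}
    with zipfile.ZipFile(container) as archive:
        for name in archive.namelist():
            # Compare case-insensitively; the part usually sits below
            # word/, xl/ or ppt/ depending on the document type.
            if name.lower().endswith("vbaproject.bin"):
                parts[name] = archive.read(name)
    return parts


def build_demo_docm():
    """Build a tiny in-memory stand-in for a .docm file (demo data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<w:document/>")
        archive.writestr("word/vbaProject.bin", b"\xd0\xcf\x11\xe0demo")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, data in extract_vba_parts(build_demo_docm()).items():
        print(name, len(data), "bytes")
```

The extracted bytes can then be written to disk and handed to the
scanner like any other old-style binary file.
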
The last notable tool worth mentioning before we start comes from
Microsoft and was released only a few days ago. It is called OffVis [8]
and is a very nice MSOffice file format visualization utility. Even
though it is still in “Beta” status, I suggest you give it a try as well.

3 Feature Overview
Before I show you the practical usage of OfficeMalScanner, check out the
features below.

The “SCAN” feature scans the entire malicious file for generic shellcode
patterns. Here is a list of all currently implemented checks:

GetEIP (4 Methods)

CALL NEXT
NEXT: POP reg
-------------------------------------------
JMP [0xEB] 1ST
2ND: POP reg
1ST: CALL 2ND
-------------------------------------------
JMP [0xE9] 1ST
2ND: POP reg
1ST: CALL 2ND
-------------------------------------------
FLDZ
FSTENV [esp-0ch]
POP reg
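
To give an idea of how such a pattern check can work on raw bytes, here
is a minimal sketch covering two of the four GetEIP idioms. It is not
OfficeMalScanner's actual code; the byte encodings are the standard
32-bit x86 ones, and the names PATTERNS and find_geteip are invented for
this example.

```python
import re

# Byte-level forms of two of the GetEIP idioms listed above (x86, 32-bit).
PATTERNS = [
    # CALL $+5 (E8 00 00 00 00) directly followed by POP reg (58..5F).
    ("CALL NEXT / POP reg",
     re.compile(rb"\xE8\x00\x00\x00\x00[\x58-\x5F]")),
    # FLDZ (D9 EE), FNSTENV [ESP-0Ch] (D9 74 24 F4, optionally with the
    # 9B WAIT prefix that turns it into FSTENV), then POP reg (58..5F).
    ("FLDZ / FSTENV / POP reg",
     re.compile(rb"\xD9\xEE\x9B?\xD9\x74\x24\xF4[\x58-\x5F]")),
]


def find_geteip(data):
    """Return a sorted list of (offset, pattern name) for every match."""
    hits = []
    for name, pattern in PATTERNS:
        for match in pattern.finditer(data):
            hits.append((match.start(), name))
    return sorted(hits)


if __name__ == "__main__":
    # Hand-made test buffer: NOPs, CALL/POP EBX, NOPs, FLDZ/FNSTENV/POP EAX.
    sample = (b"\x90" * 4 + b"\xE8\x00\x00\x00\x00\x5B" +
              b"\x90" * 2 + b"\xD9\xEE\xD9\x74\x24\xF4\x58")
    for offset, name in find_geteip(sample):
        print(hex(offset), name)
```

Matching on fixed byte sequences keeps the check fast, but it also means
that a single inserted junk instruction defeats it, which is why several
variants of each idiom are needed.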

Find Kernel32 base (3 methods)

MOV reg, DWORD PTR FS:[30h]
---------------------------------------------
XOR reg_a,reg_a
MOV reg_a(low-byte), 30h
MOV reg_b, fs:[reg_a]
---------------------------------------------
PUSH 30h
POP reg_a
MOV reg_b, FS:[reg_a]

API Hashing

LOOP: LODSB
TEST al, al
JZ short OK
ROR EDI, 0Dh or ROR EDI, 07h
ADD EDI, EAX
JMP short LOOP
OK: CMP EDI, ...
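
For readers who want to reproduce the hash values such a loop compares
against, the loop can be re-implemented in a few lines. This is a sketch
under the assumption that EDI starts at zero and the upper bytes of EAX
are cleared, as is usual for this idiom; ror32 and api_hash are names
chosen for this example.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def api_hash(name, rotation=0x0D):
    """Mirror the LODSB / ROR EDI / ADD EDI, EAX loop for an ASCII name.

    The terminating zero byte is not hashed, because the loop exits on
    TEST al, al before the rotate.
    """
    acc = 0
    for byte in name.encode("ascii"):
        acc = ror32(acc, rotation)
        acc = (acc + byte) & 0xFFFFFFFF
    return acc


if __name__ == "__main__":
    for api in ("LoadLibraryA", "WinExec", "GetTempPathA"):
        print(f"{api:<14} ROR 0Dh: {api_hash(api):08X}"
              f"   ROR 07h: {api_hash(api, 0x07):08X}")
```

With a rotation of 0Dh, "LoadLibraryA" hashes to 0xEC0E4E8E, a constant
that shows up in many shellcodes and is therefore a useful search term
of its own.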

Indirect function call

PUSH DWORD PTR [EBP+val]
CALL [EBP+val]

Suspicious strings

UrlDownloadToFile
GetTempPath
GetWindowsDirectory
GetSystemDirectory
WinExec
IsBadReadPtr
IsBadWritePtr
CreateFile
CloseHandle
ReadFile
WriteFile
SetFilePointer
VirtualAlloc
GetProcAddr
LoadLibrary

Easy decryption trick

LODS(x)
XOR or ADD or SUB or ROL or ROR
STOS(x)

Embedded OLE Data (unencrypted)

Signature: \xD0\xCF\x11\xE0\xA1\xB1\x1a\xE1

Function Prolog

PUSH EBP
MOV EBP, ESP
SUB ESP, <value> or ADD ESP, <value>

PE-File Signature (unencrypted)

Offset 0x0 == MZ
Offset 0x3c == e_lfanew
Offset e_lfanew == PE
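
The three conditions above translate almost directly into code. The
sketch below is an illustration only, not the scanner's implementation;
the function name is_pe_at and the demo header are made up. It compares
against the full four-byte signature "PE\0\0".

```python
import struct


def is_pe_at(data, offset=0):
    """Return True if a plausible PE file starts at `offset` in `data`."""
    # Offset 0x0 must hold the DOS signature "MZ".
    if data[offset:offset + 2] != b"MZ":
        return False
    # Offset 0x3C holds e_lfanew, a little-endian 32-bit file offset.
    field = data[offset + 0x3C:offset + 0x40]
    if len(field) < 4:
        return False
    (e_lfanew,) = struct.unpack("<I", field)
    # e_lfanew must point at the PE signature.
    pe_start = offset + e_lfanew
    return data[pe_start:pe_start + 4] == b"PE\x00\x00"


def demo_header():
    """Build a minimal fake header that satisfies the three checks."""
    buf = bytearray(0x84)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x80)
    buf[0x80:0x84] = b"PE\x00\x00"
    return bytes(buf)


if __name__ == "__main__":
    blob = b"\x90" * 16 + demo_header()
    print(is_pe_at(blob, 0), is_pe_at(blob, 16))
```

Following e_lfanew instead of just searching for "MZ" removes most false
positives, since the two letters alone occur in random data all the time.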

The “BRUTE” feature is a simple buffer decryption that tries XOR and ADD
with every single-byte key from 0x00 to 0xFF. Every time a buffer is
decrypted, the scanner looks for an embedded OLE signature or a valid
PE-file. On a match, the embedded OLE- or PE-file is saved to disk.
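
The idea behind this kind of brute forcing can be sketched as follows.
This is a simplified illustration and not the tool's own code: it only
reports hits instead of saving files, all function names are invented,
and the demo data is a fake PE stub encrypted with XOR 0x85.

```python
import struct

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_pe(buf, pos):
    """MZ at pos, and e_lfanew at pos+0x3C pointing at the PE signature."""
    field = buf[pos + 0x3C:pos + 0x40]
    if buf[pos:pos + 2] != b"MZ" or len(field) < 4:
        return False
    pe = pos + struct.unpack("<I", field)[0]
    return buf[pe:pe + 4] == b"PE\x00\x00"


def find_all(buf, needle):
    """Yield every offset at which `needle` occurs in `buf`."""
    pos = buf.find(needle)
    while pos != -1:
        yield pos
        pos = buf.find(needle, pos + 1)


def brute(data):
    """Try every single-byte XOR and ADD key; return (op, key, offset, kind)."""
    tables = [("XOR", k, bytes(b ^ k for b in range(256)))
              for k in range(256)]
    # ADD 0 is the same as XOR 0 (no change), so ADD starts at 1.
    # Subtraction needs no extra pass: SUB k equals ADD (256 - k).
    tables += [("ADD", k, bytes((b + k) & 0xFF for b in range(256)))
               for k in range(1, 256)]
    hits = []
    for op, key, table in tables:
        plain = data.translate(table)  # apply the key to the whole buffer
        for pos in find_all(plain, OLE_MAGIC):
            hits.append((op, key, pos, "OLE"))
        for pos in find_all(plain, b"MZ"):
            if looks_like_pe(plain, pos):
                hits.append((op, key, pos, "PE"))
    return hits


def demo_blob():
    """32 zero bytes, then a fake PE stub XORed with 0x85, then padding."""
    stub = bytearray(0x44)
    stub[0:2] = b"MZ"
    struct.pack_into("<I", stub, 0x3C, 0x40)
    stub[0x40:0x44] = b"PE\x00\x00"
    return bytes(32) + bytes(b ^ 0x85 for b in stub) + bytes(8)


if __name__ == "__main__":
    for op, key, pos, kind in brute(demo_blob()):
        print(f"{kind} found at offset {pos:#x} with {op} key {key:#04x}")
```

Using a 256-byte translation table per key keeps each pass cheap, so
trying all keys on a document of a few megabytes remains practical.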

The “DEBUG” mode displays found signatures as disassembly for detected
code patterns and in hexview style for detected strings, OLE- and PE-files.

The malicious index rating can be used as a threshold value in automated
analysis. Every suspicious trace increases the malicious index counter
depending on its hazard potential.

INDEX          SCORING
Executables    4
Code           3
Strings        2
OLE            1
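
How such a rating could feed an automated decision is sketched below. The
weights are the ones from the scoring table; everything else is an
assumption made for illustration: that each finding simply adds its
weight, the function names, and the threshold of 10, which is an
arbitrary example value and not one defined by the tool.

```python
# Weights per finding category, taken from the scoring table above.
SCORING = {"Executables": 4, "Code": 3, "Strings": 2, "OLE": 1}


def malicious_index(findings):
    """Sum the weight of every finding (a list of category names)."""
    return sum(SCORING[kind] for kind in findings)


def is_suspicious(findings, threshold=10):
    """Compare the index with a caller-chosen threshold (example value)."""
    return malicious_index(findings) >= threshold


if __name__ == "__main__":
    # Hypothetical scan result: three code patterns, one string, one PE file.
    findings = ["Code", "Code", "Code", "Strings", "Executables"]
    print("malicious index:", malicious_index(findings))
    print("above threshold:", is_suspicious(findings))
```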

The “INFO” mode dumps OLE structures with their offsets and lengths and
saves any VB macro code it finds to disk.
To dump the VB macrocode to disk, I use the Microsoft OLE API and some
tricky parsing, as well as macro decompression using the
undocumented RtlDecompressBuffer() function from NTDLL.DLL.

If you want to scan an entire directory of malicious documents, use the
small Python script “SCANDIR.PY” supplied with the OfficeMalScanner
package.

4 Practical usage
After this short overview of the scanner’s features, we are now ready to
look at its practical usage. Figure 4.1 shows the “USAGE” screen you get
if you execute OfficeMalScanner without any parameters.

Figure 4.1:

As you can see, the usage of this tool is very easy. Next to the
description of “options” and “switches”, examples are given as well.

In Figure 4.2, we just used the “SCAN” option. The output tells us that
three different shellcode pattern types were found in the PowerPoint
binary. The findings all lie within a range of 0x300 bytes, which is a
strong indication that this document is malicious.

Figure 4.2:

To make sure that the scanner has not reported only false positives, we
can use the “DEBUG” switch. In Figure 4.3, we can clearly see that the
scanner was right in its findings.

Figure 4.3:

Now that we know that shellcode is included in this binary, we can check
for more, like encrypted embedded OLE data or PE-files. This is done
by using the “BRUTE” option as seen in Figure 4.4.

Figure 4.4:

As we can see, the scanner found one embedded OLE file as well as three
different PE-files, which were encrypted with the key XOR 0x85. After
detection, these files were dumped to disk. The dumped embedded OLE file
can now be re-scanned with OfficeMalScanner to find further malicious
traces. The dumped PE-files can be loaded into IDA Pro or a debugger for
a detailed analysis, or uploaded to online analysis platforms like
www.cwsandbox.org or www.virustotal.com.

Next to typical shellcode-based MSOffice exploits, there are also
malicious documents containing evil VB macrocode. To reveal such code we
can use the “INFO” option as seen in Figure 4.5.

Figure 4.5:

The VB macrocode is then dumped into a separate directory, where we can
inspect it with a normal text editor or a simple “type” command as
shown in Figure 4.6.

Figure 4.6:

A quick look reveals an interesting “Private Sub Shellcode()” function,
whose name speaks for itself. It is a fairly clear sign that this VB
macrocode is malicious. Furthermore, the start bytes 77 90 are 0x4D 0x5A
in hex and “MZ” in ASCII. This is another hint at a PE-file that might be
dropped and executed by this macrocode.
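
The conversion from decimal values to bytes is easy to check with a few
lines of code. The helper below is a small sketch written for this
purpose; it assumes the macro lists its bytes as decimal numbers
separated by commas or spaces, and only 77 and 90 are taken from the
sample above, while 144 and 0 are illustrative filler values.

```python
def decimal_bytes(text):
    """Convert decimal byte values separated by commas/whitespace to bytes."""
    return bytes(int(token) for token in text.replace(",", " ").split())


if __name__ == "__main__":
    # 77 and 90 come from the macro discussed above; 144 and 0 are
    # illustrative values only.
    data = decimal_bytes("77, 90, 144, 0")
    print(data[:2], data.hex())
    print("starts with MZ:", data.startswith(b"MZ"))
```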

Watch the flash video “VB macrocode debugging” to learn how to debug
such malicious code and drop the PE-file in a safe way.

5 MalHost-Setup

Another tool I have added to the OfficeMalScanner suite is called
MalHost-Setup. It is useful if you want to run the shellcode inside a
malicious file from a specific offset. Especially for MSOffice exploits,
I have seen shellcodes that work like the one shown in Figure 5.1.

Figure 5.1:

As you can see, the shellcode enumerates file handle values. It first
tries to detect a valid file handle using the GetFileSize function and,
if it finds one, checks the file size. If the size matches 0xec600, the
shellcode knows it has found the handle of its own host document, drops
an encrypted PE-file from some offset and executes it.
Usually an MSOffice executable like winword.exe or powerpnt.exe is
the host for such documents, but if we want to avoid running an MSOffice
product for testing, we can use MalHost-Setup.

If we execute the utility without any parameters, we get a usage screen
as shown in Figure 5.2.

Figure 5.2:

MalHost-Setup expects the malicious file as input-file, a name for the
output-file that gets created and the offset where the shellcode starts.
Furthermore, if we want to debug the shellcode, an optional “WAIT”
feature is included, which patches the beginning of the shellcode with
0xeb 0xfe (loop forever).
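
What such a patch amounts to can be shown with a short sketch. This is
not MalHost-Setup’s code; the function names and the sample bytes are
invented for illustration. 0xEB 0xFE encodes a short jump with a
displacement of -2, which lands on the jump itself.

```python
WAIT_LOOP = b"\xEB\xFE"  # JMP short to itself: an endless loop


def patch_wait(data, offset):
    """Return (patched copy, original bytes) with EB FE written at offset."""
    if offset < 0 or offset + len(WAIT_LOOP) > len(data):
        raise ValueError("offset outside of buffer")
    original = data[offset:offset + len(WAIT_LOOP)]
    patched = data[:offset] + WAIT_LOOP + data[offset + len(WAIT_LOOP):]
    return patched, original


def restore(data, offset, original):
    """Write the saved original bytes back over the wait loop."""
    return data[:offset] + original + data[offset + len(original):]


if __name__ == "__main__":
    # Made-up stand-in for shellcode: CALL $+5 ; POP EBX ; NOP ; NOP
    shellcode = bytes.fromhex("E8000000005B9090")
    patched, saved = patch_wait(shellcode, 0)
    print("original bytes:", saved.hex())
    print("patched buffer:", patched.hex())
    assert restore(patched, 0, saved) == shellcode
```

Keeping the two original bytes is essential, because they have to be
written back in the debugger before the shellcode can run as intended.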

To get a clue where the shellcode starts, the output of an earlier
OfficeMalScanner run often gives us a good orientation point. If
shellcode patterns like FS:30 (the find-kernel-base trick) were found,
chances are very high that the start of the shellcode is only some
instructions above the reported offset.
To disassemble the malicious file easily, the OfficeMalScanner package
supplies the analyst with a small tool called DisView. Figure 5.3 shows
how to use it.

Figure 5.3:

Once we have found the right offset of the shellcode-start, we are ready
to fire off MalHost-Setup as shown in Figure 5.4.

Figure 5.4:

MalHost-Setup then takes the malicious file, attaches it as an overlay
behind its execution engine and stores the result on disk as a separate
output file. If you execute outfile.exe now, be sure to start it in a
safe environment. Otherwise your system is likely to get infected if the
shellcode-start you selected was the right one. ;-)

As already mentioned above, you can use the “WAIT” option to patch the
shellcode-start. Be sure to write down the original bytes that
MalHost-Setup prints to the console so you can re-patch them in the
debugging session (see Figure 5.5). If you now start the outfile.exe
patched with 0xeb 0xfe, it loops forever and waits for a debugger to
attach. In OllyDbg, for example, just attach to outfile.exe, press the
“run” button and right after this the “pause” button, and you should be
at the 0xeb 0xfe loop. Now just re-patch the bytes to the original ones
and start your debugging session.

Figure 5.5:

6 Conclusion
With OfficeMalScanner, you have a tool to do forensics on MSOffice files
which might be malicious. Even though I tested the scanner successfully
with thousands of malicious samples, it should be clear that the bad guys
might still use heavier obfuscation tricks in the future to avoid generic
shellcode detection. So if you find malicious samples on the web which
OfficeMalScanner detects as “clean”, do not hesitate to send them to me
and I will try to find a generic way to detect them. Next to this, future
releases will contain more effective cryptanalysis tricks to detect
encrypted PE-files. Also keep in mind that this software was written in
C, hence my code might also contain exploitable bugs. So if you work with
this tool, make sure you only use it in a safe environment.
Suggestions and constructive reviews are always welcome.

7 References

[1] Chinese hacked into Pentagon
http://www.ft.com/cms/s/0/9dba9ba2-5a3b-11dc-9bcd-0000779fd2ac.html
[2] The GhostNet case
http://en.wikipedia.org/wiki/GhostNet
[3] Officecat
http://www.snort.org/vrt/vrt-resources/officecat
[4] STG Docfile Viewer
http://support.microsoft.com/?scid=kb%3Ben-us%3B139545&x=16&y=9
[5] Microsoft Office Binary (doc, xls, ppt) File Formats
http://www.microsoft.com/interop/docs/OfficeBinaryFormats.mspx
[6] Microsoft Office Word 97-2007 binary file format specification
http://download.microsoft.com/download/0/B/E/0BE8BDD7-E5E8-422A-ABFD-4342ED7AD886/WindowsCompoundBinaryFileFormatSpecification.pdf
[7] OfficeMalScanner
www.reconstructer.org/code/OfficeMalScanner.zip
[8] OffVis
http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=19a1a252-c3af-4474-b33c-158c6e85115e

Thanks to Bruce Dang, Elia Florio, Michael Hale Ligh and Michael Sandee
for suggestions and ideas.
