Tracks Inspector: Putting Digital Investigations in the Hands of Investigators


J. Henseler¹, J. Hofste², A. Post³

¹ Create-IT Applied Research, Amsterdam University of Applied Sciences, j.henseler@hva.nl
² Tracks Inspector, Fox-IT, Delft, The Netherlands, jop.hofste@fox-it.com
³ Tracks Inspector, Fox-IT, Delft, The Netherlands, post@fox-it.com

Abstract - With the pervasiveness of computers and mobile
devices, digital forensics becomes more important in law
enforcement. Investigators increasingly depend on the scarce
support of digital specialists which impedes efficiency of
criminal investigations. This paper describes the architecture
of Tracks Inspector, a commercially available product for
computer assisted discovery of digital evidence. Tracks
Inspector was designed to put digital investigation in the hands
of non-technical investigators. The design criteria aim to look
for "low hanging fruit" in the evidence without the help of
digital forensic experts. As a result we expect that backlogs
will be reduced and investigators can better explain to the
experts what they are looking for. Experts can then focus on
the challenging work. The architecture of Tracks Inspector is
scalable, robust, secure and supports cases with hundreds of
evidence units giving access to hundreds of users through a
simple web-based user interface.
Keywords - digital forensics, technology assisted review, early
case assessment, cross drive analysis

I. INTRODUCTION

Law enforcement today relies on digital forensics in a
greater variety of criminal investigations. With the
pervasiveness of computers and mobile devices in society, the
occurrences and volume of digital information in cases are
exploding. Investigators who are intrinsically involved in
collecting and assessing evidence must depend on specialists,
unfamiliar with their cases, to process digital information.
This impedes and even prevents prosecuting cases since there
are too few digital forensics specialists and labs to support
caseloads.
Investigators typically investigate the evidence looking for
events and information about persons. This process is
essentially a review task that is similar to electronic reviews in
E-Discovery projects that are described by the EDRM model
[7]. Other research has revealed that technology assisted
review (TAR) can greatly improve the precision and recall of
relevant items [9].
Digital forensic experts acknowledge that automation and
artificial intelligence can be a solution to deal with the
increasing complexity and volume of digital evidence [5].
Automation is a necessary part of the solution of maintaining
consistency, increasing efficiency and optimizing how digital
investigators spend their time. But although these new

ISBN: 978-0-9891305-7-8 2014 SDIWC

techniques can be helpful, they also have their limitations.
Ultimately, a combination of human and computer intelligence
will be required.
Existing TAR solutions focus on document review with
full-text search and retrieval solutions enhanced with
vector-space clustering and predictive coding technologies. However,
these solutions tend to ignore the remaining rich variety of
multi-media files that is found on modern computers as well
as other tracks that are left by users, e.g., visited web sites and
documents recently opened by the user. Such tracks can only
be examined by digital forensics experts with specialist tools.
Inspired by this problem we identify a number of research
questions in the next section. Then, after a short survey of
existing digital forensics tools, we propose Tracks Inspector¹
[10]. This is a commercial solution that enables investigators
without a technical background to easily investigate digital
evidence using a web browser. Tracks Inspector brings
simplicity, scalability and collaboration to the handling,
storage, processing, management and reporting of digital
evidence. While not intended to replace laboratory-quality
solutions such as FTK and EnCase, Tracks Inspector provides
a complementary solution to solve more cases and solve them
faster by reducing the workloads on digital specialists to only
the most complex cases.
II. RESEARCH QUESTIONS
Inspired by the problem outlined in the introduction, we would
like to design and implement a system which assists in the
review of digital evidence in such a way that it:
1. has an intuitive user interface with native language
   support,
2. supports collaboration in a team of investigators
   working on the same case,
3. assists in identification of user tracks on a computer,
   mobile phone or digital storage media,
4. is scalable to meet increasing number of users,
   number of evidence units and processing speed,
5. enables the investigation to start while a forensic
   image of the evidence is created, and
6. produces a report in a human digestible format.

¹ http://www.tracksinspector.com

III. EXISTING DIGITAL FORENSICS TOOLS


During the past ten years quite a number of digital forensic
tools have become available for investigating digital evidence.
In this section we discuss a small selection.
The SANS Investigative Forensic Toolkit is a forensic
application which supports a large variety of evidence images
and file systems. It is not one tool but it includes several
(external) software packages to analyze the data. It is
available² for download and runs on the Ubuntu operating
system. This toolkit is a powerful collection of tools but
requires a trained expert to use these tools and to understand
their limitations.
The Digital Investigation Framework is an open source
solution. It includes a file browser and supports most file
systems. Furthermore it reconstructs and analyzes the
Windows registry and has an attractive timeline analysis
feature that visualizes operations on the computer along a time
line [2].
XIRAF is a prototype forensic warehouse system developed
by the Netherlands Forensic Institute (NFI) [1]. The prototype
has been further developed and is now offered as a hosted
solution to law enforcement in the Netherlands. XIRAF uses an
XML database layer on top of an Oracle database for
storage and XQuery as the query language to retrieve
information. It provides a framework for feature extraction
and is accessible through a web-based interface. The user
interface allows users to construct powerful filters but is not
intuitive to use for non-technical users.
Forensic Toolkit (FTK) from AccessData [6] is a
commercially available forensic solution which processes and
renders a large variety of digital evidence. FTK is popular
with digital forensic experts. Version 4.x is completely
database driven and processing can be distributed over multiple
servers. AccessData ECA provides a web-based front end to the
traditional FTK back end for early case assessment. Unlike the
other tools mentioned earlier, FTK users can start analyzing a
case while processing is in progress. But processing does not
start until a complete forensic image of the evidence is
presented.

Guidance Software has developed EnCase [4], one of the
most well-known software packages in the forensic world.
This software can be used for both forensic acquisition of data
as well as the analysis of forensic images. It is designed for
experts and does not have a web-based interface.

² http://computer-forensics.sans.org/community/downloads

IV. TRACKS INSPECTOR

Tracks Inspector is a commercial solution developed by
Fox-IT. It is an appliance for capturing, processing and
browsing through digital evidence and associated meta data.
One of the unique features of Tracks Inspector is that it can
capture a forensic image of a hard disk and simultaneously
process the evidence on this disk and provide investigators
access to extracted data.
The interface of Tracks Inspector is intuitive with a look
and feel that is commonly found on modern internet websites.
The processing software recognizes email archives,
documents, pictures, video, audio and internet history files.
When a known operating system is detected, additional
features (e.g. list of installed software, user activity) are
extracted and processing starts with analyzing data in user
folders before analyzing other (system) folders.
The Tracks Inspector appliance is based on one or more
servers running Ubuntu. The framework is custom developed
but many components are based on open source programs for
mounting file systems, performance monitoring, extracting
meta data and converting content but also for storing
information in MySQL databases. In this section we
describe the global system architecture as well as the supported
media and file types.
A. Global system architecture
A schematic overview of the system is depicted in Figure 1.
The system consists of various processes that connect with
each other through remote procedure calls (RPC). All data
communication is based on serialized protobuf messages via
RPC so that daemons have the ability to call other
daemons that can exist on remote servers. With a few
exceptions, daemons can be run on multiple servers and use
multi-threading which results in a scalable architecture. There
is virtually no restriction on the number of servers in a Tracks
Inspector appliance. The processes depicted in Figure 1 can be
described as follows:

Figure 1: Tracks Inspector scalable architecture

The evidence monitor monitors evidence units
that are directly connected to that host or indirectly connected
via a remote evidence host. These can be physical devices like
compact discs, DVDs, hard disks, USB sticks and memory
cards. The evidence monitor can also monitor a designated
input folder in which forensic or raw images of physical
devices can be placed or a folder with logical files.
The evidence controller manages the connected evidence
units and assigns analysis tasks to processing units. This
information is available in the evidence database which
contains a list of tasks that need to be completed for a
particular evidence unit.
The evidence host is an optional process that is not required
to use Tracks Inspector. The evidence host typically runs on a
table top shuttle in a lab environment called the physical evidence
station and can be used to connect evidence using a standard
write blocker. The evidence host listens to local input slots and
connects the evidence to Tracks Inspector if it is connected
and if it can be mounted.
The processing host processes evidence nodes. Processing
an evidence node can either mean that a node must be
discovered or, for end nodes, that it needs to be analyzed.
Node discovery creates a hierarchical tree of sub nodes (e.g.
files in a file system, compressed folder or messages and
attachments in an email archive) which remain to be
processed. End node analysis extracts metadata and generates
conversion tasks for certain file formats (e.g. video files).
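The distinction between node discovery and end node analysis can be illustrated with a small sketch. This is not the Tracks Inspector implementation; the `Node` class and the `discover` and `end_nodes` functions are hypothetical names, and only plain folders are traversed (archives and email containers are left out).

```python
from __future__ import annotations

import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Node:
    """One evidence node; a node without children that is a file is an end node."""
    path: Path
    children: list[Node] = field(default_factory=list)


def discover(path: Path) -> Node:
    """Node discovery: recursively build the tree of sub nodes below a folder."""
    node = Node(path)
    if path.is_dir():
        for child in sorted(path.iterdir()):
            node.children.append(discover(child))
    return node


def end_nodes(node: Node) -> list[Node]:
    """Collect the end nodes (files) that still have to be analyzed."""
    if not node.children and node.path.is_file():
        return [node]
    found: list[Node] = []
    for child in node.children:
        found.extend(end_nodes(child))
    return found


def demo() -> list[str]:
    """Build a tiny folder tree and return the names of its end nodes."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "Users" / "alice").mkdir(parents=True)
        (root / "Users" / "alice" / "notes.txt").write_text("hello")
        (root / "boot.ini").write_text("x")
        return [n.path.name for n in end_nodes(discover(root))]
```

In a real system each end node would then be queued for metadata extraction and, where needed, conversion.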
The evidence db stores all information about the evidence
units. A case typically contains 10 to 100 or even more
evidence units. Each evidence unit has its own evidence
database which contains all processed and extracted
information. The system is scalable because evidence
databases can be controlled on different servers by different
evidence monitors.

The hashes host process controls hash lists. These lists can
be predefined, e.g. the NIST list, or they can be uploaded by
an administrator. All files that are hashed during processing are
compared with the known hashes in the hashes host. This can
be used for example for default system files or known hashes
of illegal materials. If matches are found, the nodes are tagged
with a predefined tag.
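The matching step can be sketched as follows. The paper does not state which hash algorithm is used, so SHA-1 is an assumption here, and the function names and tag names are hypothetical.

```python
import hashlib


def sha1_of(data: bytes) -> str:
    """Return the hexadecimal SHA-1 digest of a file's content (assumed algorithm)."""
    return hashlib.sha1(data).hexdigest()


def tag_known_files(
    files: dict[str, bytes],
    hash_lists: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Map each file name to the tags of every hash list that contains its digest."""
    tags: dict[str, list[str]] = {}
    for name, content in files.items():
        digest = sha1_of(content)
        matched = [tag for tag, hashes in hash_lists.items() if digest in hashes]
        if matched:
            tags[name] = matched
    return tags
```

Keeping each list as a set makes the lookup per file independent of the size of the list, which matters for large reference sets.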
The session host manages the user session at the front-end.
It bridges the gap between the front-end host and the evidence
database. The case and user information are cached at the
session host to reduce the number of RPC calls.
The case host and its associated database manage the cases
that are added by the users in the front-end. Case
administrators can link evidence units to a specific case.
The user host stores all front-end user information, like the
session and the user privileges and roles. User privileges can
differ per case.

The front-end consists of the lightweight web server Lighttpd³.
It is based on the Python web framework Django⁴. This
framework is combined with HTML5 technology and offers
an intuitive web interface that works in popular browsers,
including those on tablets and smartphones.
This architecture is designed for processing large amounts
of evidence data. Most of the functions in the architecture can
be executed on multiple servers. For example, a system can
have one evidence monitor and 100 processing units for fast
data processing. For a system with many users, it may be
useful to execute multiple front-end hosts etc. A few functions
such as the evidence controller, case host and user host only
have one instance.
B. Supported media
The main purpose of Tracks Inspector is to provide a quick
analysis of user data on digital evidence in order to find "low
hanging fruit". The system currently processes logical files
from four types of input media:
1. Physical devices
The evidence monitor or remote evidence host supports every
storage device that is connected to it and that is mountable.
For instance, an optical disc that is inserted in the CD / DVD
drive will automatically be processed. All connected USB
devices are read and monitored as well as hard disks coupled
through USB, IDE, (e)Sata etc.
2. Disk images
Forensic EnCase⁵ images as well as raw DD (Unix disk
dump) images are supported; these are low-level copies of the raw
data of physical disks. Types like ISO, DAA or other disk
image file formats are also supported, provided that they can
be mounted by the Ubuntu operating system that is hosting the
evidence monitor process. If mounting fails the image is not
processed.
3. Logical folders
The third input evidence type is a folder which can contain
logical files and subfolders. This folder should be accessible to
the evidence monitor as a local folder or as a folder that is
mounted from a remote file server.
4. Third party data
Tracks Inspector can also accept digital evidence that has
been exported (typically in an XML format) from supported
third party applications. Currently exports from the two
market leaders in mobile phone forensics are supported. The
first one is UFED, developed by Cellebrite⁶. The second one is
XRY, developed by MicroSystemation⁷. With this type of
input mobile phones can be added to cases as evidence units.

³ http://www.lighttpd.net/
⁴ https://www.djangoproject.com/
⁵ http://www.guidancesoftware.com/
⁶ http://www.cellebrite.com

Once a device, (forensic) image file, logical folder or third
party data has been detected, it becomes an evidence unit in the
system. A case administrator should assign this evidence unit
to a case and start processing. Processing always commences
with node discovery in which the complete folder tree of the
evidence unit in the logical file system is recursively
traversed. After this tree is completed, general information
about the operating system (if present), user accounts and/or
third party XML is parsed and added to the evidence unit
database. Then file processing starts.
C. Supported file types
Currently, Tracks Inspector has support for most popular
file types (cf. Table 1). File extensions can be misleading and
therefore Tracks Inspector detects file type not by extension
but by analyzing the file signature. This is done using the Linux
file command and its magic database, augmented with custom
rules to enhance the detection of certain complex MIME
types. The list of supported file types is continuously being
extended and Tracks Inspector relies on open source tools for
extracting full-text and metadata as well as for conversion.
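Signature-based detection can be illustrated with a minimal sketch that checks the leading bytes of a file against a few well-known signatures. It does not reproduce the magic database or the custom rules mentioned above; the `SIGNATURES` table and the `detect_mime` function are hypothetical.

```python
# A few well-known leading byte signatures and their MIME types.
SIGNATURES: list[tuple[bytes, str]] = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"PK\x03\x04", "application/zip"),
]


def detect_mime(data: bytes) -> str:
    """Return the MIME type implied by the file signature, ignoring any extension."""
    for signature, mime in SIGNATURES:
        if data.startswith(signature):
            return mime
    return "application/octet-stream"
```

A PDF renamed to holiday.jpg is still reported as application/pdf. Container formats show why additional rules are needed: a DOCX file starts with the ZIP signature, so telling it apart from a plain ZIP archive requires looking inside the container.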
Table 1: Tracks Inspector supported file types

Category                     File types
---------------------------  -------------------------------------------
Audio                        AAC, MP3, WAV, WAVE, WMA
Compressed / Archive Files   EAR, GZ, JAR, RAR, REV, WAR, XPI, ZIP
E-mail message               EML, MHT, MSG, OST, PST
Spreadsheet                  ODS, XLS, XLST, XLT, XLTX
Document                     DOC, DOCM, DOCX, DOT, DOTM, DOTX, ODT, OTT,
                             PDF, RTF, WPD, WP, WPn, WRI
Presentation                 POT, POTX, PPS, PPSX, PPT, PPTM, PPTX
Picture                      BMP, DIB, GIF, ICON, JIF, JFI, JFIF, JPE,
                             JPEG, JPG, PGA, PNG, SVG, TIF, TIFF, XGA
Video                        3G2, 3GP, ASF, AVI, F4A, F4B, F4P, F4V, FLV,
                             M4A, M4B, M4R, M4P, M4V, MK3D, MKA, MKS,
                             MKV, MP4, MOV, MPE, MPEG, MPG, QT, SWF, WM,
                             WMA, WMV
Misc                         IE History file, Windows Registry files,
                             Chrome / FF History files (SQLite),
                             Digital Business card (VCF, VCARD)

Documents
Most common Microsoft Office and OpenOffice document types for
word processing, spreadsheets and presentations are supported
and meta data, e.g. the author field, is extracted.

⁷ http://www.msab.com/xry/what-is-xry

E-mail archives
A variety of email archive formats is supported, for
example Microsoft Office Outlook PST/OST, MBOX and
Thunderbird. In addition to these email archive formats,
single email file formats such as MSG and EML are also
supported. Lotus Notes NSF email is currently not supported
and needs to be converted prior to presenting it to Tracks
Inspector.
Data archives
Tracks Inspector processes (compressed) data archives as if
they are folders. The archives are extracted and the contents of
the archive are analyzed as a folder with subfolders. The
archives are scanned for password protection and
cryptography usage. Multi-part archives are also supported as
well as archives in archives and archives in email attachments,
emails in archives etc.
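Treating archives as folders, including archives inside archives, can be sketched with the Python standard library. The sketch is limited to ZIP files and is not the product's extraction code; `list_archive` and `make_zip` are hypothetical helpers.

```python
import io
import zipfile


def list_archive(data: bytes, prefix: str = "") -> list[str]:
    """List the files in a ZIP archive as folder-like paths, descending into nested ZIPs."""
    paths: list[str] = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = prefix + info.filename
            if info.flag_bits & 0x1:
                # Bit 0 of the general purpose flags marks an encrypted member.
                paths.append(path + " [encrypted]")
                continue
            content = archive.read(info)
            if zipfile.is_zipfile(io.BytesIO(content)):
                # Treat the nested archive as a subfolder of the current one.
                paths.extend(list_archive(content, path + "/"))
            else:
                paths.append(path)
    return paths


def make_zip(members: dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP archive, used here only to create test input."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buffer.getvalue()
```

For an outer archive that holds a.txt and nested.zip (which in turn holds b.txt), the listing contains a.txt and nested.zip/b.txt, so the nested archive appears as a subfolder.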
Images, Video and Audio files
Most well-known MIME types for image formats are
supported. The EXIF (meta data) information is extracted as
well to provide additional context about the image. Most
popular video and audio types are also supported including
typical mobile phone formats.
Internet history
The Internet history of most modern browsers is
recognized and analyzed. This data is extracted from the
history files that are stored on disk. For Internet Explorer
[12] the data is extracted per day, per week and overall. With
this information a detailed time line can be generated to give
an overview of the browsing behavior of users. This
information is stored separately for each user account that was
encountered on the evidence. Similar information is extracted
from other browsers like Google Chrome and Mozilla Firefox.
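As an illustration of reading a SQLite-based history file, the sketch below queries a Chrome-style `urls` table. The table and column names and the timestamp format (microseconds since 1601-01-01) reflect the commonly documented Chrome layout and are assumptions here, not details taken from Tracks Inspector. The demo database is built in memory so the example runs without a real profile.

```python
import sqlite3
from datetime import datetime, timedelta

# Chrome-style timestamps count microseconds from this epoch (assumption).
WEBKIT_EPOCH = datetime(1601, 1, 1)


def webkit_to_datetime(microseconds: int) -> datetime:
    """Convert microseconds since 1601-01-01 to a naive UTC datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=microseconds)


def read_history(conn: sqlite3.Connection) -> list[tuple[datetime, str, str]]:
    """Return (last visit, url, title) rows ordered by time, ready for a timeline."""
    rows = conn.execute(
        "SELECT last_visit_time, url, title FROM urls ORDER BY last_visit_time"
    )
    return [(webkit_to_datetime(ts), url, title) for ts, url, title in rows]


def demo_connection() -> sqlite3.Connection:
    """Create an in-memory database that mimics the assumed history schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE urls (url TEXT, title TEXT, "
        "visit_count INTEGER, last_visit_time INTEGER)"
    )
    conn.execute(
        "INSERT INTO urls VALUES (?, ?, ?, ?)",
        ("http://www.tracksinspector.com", "Tracks Inspector", 3, 13000000000000000),
    )
    return conn
```

When working on real evidence, the history file should be opened from a copy or in read-only mode so that the original is not modified.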
V. USING TRACKS INSPECTOR
Tracks Inspector processes evidence units that have been
identified by the system and that have been assigned to a case
by a case administrator. This process has already been
outlined in the previous section. During file processing,
general file metadata (e.g. filename, folder path, date created,
modified, last accessed) and specific file metadata (e.g. Office
document author, photo Exif data, video duration, email
header fields) as well as full-text content for supported file
types is extracted and stored in the evidence unit database.
This data becomes immediately searchable using the standard
MySQL full-text indexing capability.
In addition to this extraction, files are also converted to a
HTML5 compatible format so that they can be viewed by
investigators in their web browser. This eliminates the need
for downloading native files and installing custom 3rd party
software on desktop computers, increases ease of use and
reduces the risk of accidentally leaving confidential data on
computers used by investigators.
As soon as evidence is being processed, Tracks Inspector
presents various dashboards to monitor progress and to start
with the analysis. Figure 2 illustrates the evidence dashboard
that is presented to a user when accessing a case. This
particular example shows 12 evidence units. The first one is an
iPhone, second one a MacBook etc. The evidence units are
represented by rectangular shapes which show evidence name,
a short description and the number of objects discovered per
category. The second evidence unit has a padlock icon
because Tracks Inspector detected encrypted data in the file
system during processing.

Figure 2: Tracks Inspector dashboard

Other dashboards in Tracks Inspector reveal particular
details about certain media, for instance a list of user
accounts with user login times, a list of installed programs or
the type of operating system. Particularly interesting dashboards
are the Case wide search, Case Identities and the Case
Analysis dashboards. The case wide search dashboard allows
users to search for keywords across all evidence units in the
case. The Case Identities dashboard displays identities that
have been extracted from document authors, user accounts,
mobile phone data etc. A summary of this dashboard and
underlying algorithms is described in [11]. The Case Analysis
dashboard provides a cross evidence-unit analysis of
identities. These three dashboards assist an investigator in
discovering interesting identities which in turn may help with
the prioritization of the evidence unit analysis.
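The idea behind a cross evidence-unit analysis can be sketched as an inverted index from identity to the evidence units in which it occurs. This is a deliberately simplified illustration, not the extraction and merging algorithm of [11]; the normalization (trimming and lower-casing) and the function name are assumptions.

```python
from collections import defaultdict


def shared_identities(units: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return the identities that occur in more than one evidence unit."""
    index: dict[str, set[str]] = defaultdict(set)
    for unit, identities in units.items():
        for identity in identities:
            # Naive normalization; real alias detection is far more involved.
            index[identity.strip().lower()].add(unit)
    return {name: found for name, found in index.items() if len(found) > 1}
```

An identity that appears on both a phone and a laptop links the two evidence units, which can help an investigator decide which units to review first.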
The concept of evidence correlation in digital forensics is
not new. Garfinkel [8] has proposed a method to find
similarities among evidence units using cross-drive analysis
based on forensic feature extraction. An important difference
with the approach followed here is that Garfinkel extracts
features from raw disk data without considering the logical
structure of the data on a disk. The advantage of this approach
is that feature extraction is robust and is not complicated by
file system, operating system or file format
interpretation. However, the drawback of this approach is that
it does not make use of valuable contextual information and
that it has to analyze all disk sectors even if they are not
allocated to logical files. Although this type of feature
extraction is useful in its own right, it is less suitable when the
aim is to assist a user who has to understand the context
from which certain features have been extracted.
VI. RESULTS AND CONCLUSIONS
At the beginning of 2012 a series of workshops was organized to
measure Tracks Inspector usability using the System Usability
Scale [3]. In total 36 non-technical investigators from Dutch
law enforcement and other government investigation
organizations were interviewed after they experienced Tracks
Inspector in a half day workshop. The system scored 88.1 out
of 100 points on the System Usability Scale which suggests
that Tracks Inspector is an easy to use application.
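For readers unfamiliar with the System Usability Scale, the standard scoring rule from [3] can be written down in a few lines: each of the ten items is answered on a scale from 1 to 5, odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5. The function name is hypothetical. Since the score of a single respondent is always a multiple of 2.5, the reported 88.1 is presumably an average over the participants.

```python
def sus_score(responses: list[int]) -> float:
    """Compute the SUS score (0 to 100) for one respondent's ten answers."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for item, response in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if item % 2 == 1 else (5 - response)
    return total * 2.5
```

For example, the responses 4, 2, 5, 1, 4, 2, 5, 1, 4, 2 give a raw sum of 34 and therefore a score of 85.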
After these workshops Tracks Inspector was tested by a
team of five non-technical investigators of Dutch law
enforcement. The investigation was related to a human
trafficking case. A total of 49 evidence units had been seized,
amounting to a total of 2 TB. A Tracks Inspector base system
consisting of a single server with dual hex-core processors,
96 GB RAM and 10 TB storage (6 x 2 TB SATA 7,200 rpm disks in
RAID5 mode) was installed in their office. While loading the
evidence, the five investigators were able to analyze all
evidence units concurrently in two weeks' time without any
specific training. Not only did they find relevant material to
support the allegation, they also found evidence for an
additional criminal complaint. The public prosecutor on the
case felt comfortable with the standard report that was
produced based on all files that had been tagged as relevant by
the investigators. This report has been added to the case
binder. The office of the public prosecutor also felt
comfortable with the system because investigators could
exclude privileged material using the mark-as-privileged
function in the interface.
At the end of 2012 a multi-server Tracks Inspector system
was tested in a national Dutch pilot project with a 100 user
license. The multi-server system consists of 5 servers with a
total of 40 TB of storage. The system is connected to the wide
area network which has well over 100 users in the region. This
test was positively concluded. The largest case currently on
the system consists of 80 evidence units. The scalability goals
in terms of number of users and processing speed of evidence
units are considered to have been met and the 100 user license was
procured early in 2013.
Since mid 2013 a stand-alone server solution has been in use by
the Police in De Pinte, Belgium and another one by the Police
in St. Maarten. Both customers lack the support of
digital forensic experts and were suffering from backlogs of up to
6 months and in some cases even longer. In both cases an ICT
administrator was trained in maintaining the server and in
using the physical evidence system for copying and processing
digital media. The investigators were trained in using the
system and in identifying its limitations and exceptions.
With Tracks Inspector, investigators were able to reduce the
existing case backlog in a matter of weeks and to solve
simple investigations in days that would normally take months
because of lacking support from digital forensics teams.
Although these results are not based on scientific
experiments, we at least have practical evidence which
indicates that Tracks Inspector is a user friendly application
that enables non-technical investigators to analyze digital
evidence. We have demonstrated that the base system can be
scaled easily to a multi-server system to increase processing
speed and to handle 20-30 concurrent users. The standard
report produced by the system is considered readable, not only
by the investigators but also by the public prosecutor.
VII. FUTURE WORK
Future work on Tracks Inspector will focus on identifying
more tracks that are left by users on a computer and on
automated analysis based on evidence unit correlation to assist
users in intelligent review and prioritization of evidence units.
Feature extraction from evidence units is essential for
meaningful correlation and will be an ongoing task. Version
1.4 already has built in support for identity extraction,
semi-automatic merging and alias detection which are presented as
a table formatted heat map. We plan on including more
sources for extracting identities, e.g., from cookies and
selected registry keys. Also other features will be considered
such as registration of USB devices, WiFi networks, common
user security identifiers (SID), multi-media analysis etc.

Future work will also focus on internet usage and attempt to
identify which features in addition to internet history URLs are
of additional value to non-technical investigators. Lastly, we
intend to include other visualization methods (inspired by
existing research) such as plotting events on a timeline and
plotting GPS coordinates on a map.
REFERENCES
[1] Alink, W., Bhoedjang, R., Boncz, P., de Vries, A., 2006. XIRAF -
XML-based indexing and querying for digital forensics. Digital
Investigation 3, 50-58.
[2] ArxSys, 2012. Open source digital investigation framework.
http://www.digital-forensic.org/. Last visited February 2013.
[3] Brooke, J., 1996. "SUS: a "quick and dirty" usability scale". In P. W.
Jordan, B. Thomas, B. A. Weerdmeester, & A. L. McClelland. Usability
Evaluation in Industry. London: Taylor and Francis.
[4] Bunting, S., Wei, W., 2006. EnCase computer forensics: the official
EnCE: EnCase certified examiner study guide. Sybex.
[5] Casey, E., 2012. Automation and artificial intelligence in digital
forensics. EAFS2012 Abstract published in
http://www.eafs2012.eu/sites/default/files/files/abstract book
eafs2012.pdf.
[6] AccessData, 2012. Forensic toolkit 4 whitepaper.
http://accessdata.com/downloads/media/FTK DataSheet web.pdf.
Last visited February 2013.
[7] Doe, R., 2010. The e-discovery reference model (EDRM). The review
stage.
[8] Garfinkel, S., 2006. Forensic feature extraction and cross-drive analysis.
Digital Investigation 3, 71-81.
[9] Grossman, M., Cormack, G., 2011. Technology-assisted review in
e-discovery can be more effective and more efficient than exhaustive
manual review. Rich. JL & Tech. 17, 11-16.
[10] Henseler, J., 2012. Swiping through digital evidence. EAFS2012
Abstract published in
http://www.eafs2012.eu/sites/default/files/files/abstract book
eafs2012.pdf.
[11] Henseler, J., Hofste, J., van Keulen, M., 2013. Computer assisted
extraction, merging and correlation of identities. Submitted to the ICAIL
2013.
[12] Wilson, C., 2009. NetAnalysis forensic internet history analysis.
