1. INTRODUCTION
2. HISTORICAL EXAMPLES
Finding images
To exercise our ability to test for steganographic content
automatically, we needed images that might contain hidden
messages. For analysis, we picked images from eBay auctions
(prompted by various news reports) [20, 21] and from discussion
groups in the Usenet archive.

To get images from eBay auctions, a Web crawler that could find
JPEG images was the obvious choice. Unfortunately, there were no
open-source, image-capable Web crawlers available when we started
our research. To get around this problem, we developed Crawl, a
simple, efficient Web crawler that makes a local copy of any JPEG
images it encounters on a Web page.
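
The general shape of such a crawler can be sketched in a few lines of Python. This is not Crawl's source code: it is a minimal illustration that walks an in-memory stand-in for a Web site rather than fetching pages over the network. The names `Page`, `crawl`, `demo_site`, and all parameter names are hypothetical. The page-exclusion pattern and the size bounds anticipate the two filtering features described next.

```python
import re
from typing import Dict, List, NamedTuple, Optional, Pattern, Set


class Page(NamedTuple):
    """Hypothetical stand-in for a fetched Web page."""
    links: List[str]        # URLs of pages linked from this page
    images: Dict[str, int]  # image URL -> size in bytes


JPEG_RE = re.compile(r"\.jpe?g$", re.IGNORECASE)


def crawl(site: Dict[str, Page], start: str,
          exclude: Optional[Pattern[str]] = None,
          min_bytes: int = 0,
          max_bytes: Optional[int] = None) -> List[str]:
    """Walk `site` depth-first from `start` and return the JPEG URLs
    whose size lies strictly between `min_bytes` and `max_bytes`."""
    seen: Set[str] = set()
    found: List[str] = []
    stack = [start]
    while stack:
        url = stack.pop()
        if url in seen or url not in site:
            continue
        seen.add(url)
        # Pages matching the exclusion pattern are neither scanned nor followed.
        if exclude is not None and exclude.search(url):
            continue
        page = site[url]
        for img, size in page.images.items():
            if not JPEG_RE.search(img):
                continue  # not a JPEG
            if size <= min_bytes:
                continue  # too small to be of interest
            if max_bytes is not None and size >= max_bytes:
                continue  # too large
            if img not in found:
                found.append(img)
        # Push links in reverse so the first link is explored first,
        # and fully, before its siblings (depth-first order).
        stack.extend(reversed(page.links))
    return found


# Made-up three-page site used only for illustration.
demo_site = {
    "/": Page(links=["/a", "/ads"], images={"/logo.gif": 5_000}),
    "/a": Page(links=["/"], images={"/a/big.jpg": 50_000, "/a/tiny.jpg": 2_000}),
    "/ads": Page(links=[], images={"/ads/banner.jpeg": 60_000}),
}

print(crawl(demo_site, "/", exclude=re.compile(r"^/ads"),
            min_bytes=20_000, max_bytes=400_000))
```

A real crawler would replace the `site` lookup with an HTTP fetch and HTML parsing, and would write each accepted image to disk instead of collecting URLs in a list.
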
Crawl performs a depth-first search and has two key features:
• Images and Web pages can be matched against regular
expressions; a match can be used to include or exclude Web pages
in the search.
• Minimum and maximum image size can be specified,
which lets us exclude images that are too small to contain
hidden messages. We restricted our search to images larger than
20 Kbytes but smaller than 400 Kbytes.

We downloaded more than two million images linked to eBay
auctions. To automate detection, Crawl uses stdout to report
successfully retrieved images to Stegdetect. After processing the
two million images with Stegdetect, we found that over 1 percent
of all images seemed to contain hidden content. JPHide was
detected most often (see Table 2). We augmented our study by
analyzing an additional
one million images from a Usenet archive. Most of these
detections are likely to be false positives. Stefan Axelsson
applied the base-rate fallacy to intrusion detection systems and
showed that a high percentage of false positives had a significant
effect on such a system's efficiency [27]. The situation is very
similar for Stegdetect.

We can calculate the true-positive rate, the probability that an
image detected by Stegdetect really has steganographic content,
as follows:

P(S|D) = P(S)·P(D|S) / [P(S)·P(D|S) + P(¬S)·P(D|¬S)]

where P(S) is the probability of steganographic content in images,
and P(¬S) is its complement. P(D|S) is the probability that we'll
detect an image that has steganographic content, and P(D|¬S) is
the false-positive rate. Conversely, P(¬D|S) = 1 - P(D|S) is the
false-negative rate.
To improve the true-positive rate, we must increase
the numerator or decrease the denominator. For a given
detection system, increasing the detection rate is not possible
without increasing the false-positive rate and vice versa. We
assume that P(S), the probability that an image contains
steganographic content, is extremely low compared to P(¬S), the
probability that an image contains no hidden message. As a result,
the false-positive rate P(D|¬S) is the dominating term in the
equation; reducing it is thus the best way to increase the
true-positive rate. Given these assumptions, the false-positive
rate also dominates the computational costs of verifying hidden
content. For a detection system to be practical, keeping the
false-positive rate as low as possible is important.
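
A small numerical sketch makes the base-rate effect concrete. The function below applies Bayes' theorem to the quantities defined above. The prior `P_S` and the detection rate `DETECTION` are invented values chosen only for illustration, not measurements from this study, and the name `true_positive_rate` is likewise hypothetical.

```python
def true_positive_rate(p_s: float, p_d_given_s: float,
                       p_d_given_not_s: float) -> float:
    """Return P(S|D), the probability that a flagged image really
    contains steganographic content, using Bayes' theorem."""
    hits = p_s * p_d_given_s                      # P(S) * P(D|S)
    false_alarms = (1.0 - p_s) * p_d_given_not_s  # P(not S) * P(D|not S)
    return hits / (hits + false_alarms)


# Hypothetical inputs: 1 image in 10,000 carries hidden content, and the
# detector flags 99 percent of the images that do.
P_S = 1e-4
DETECTION = 0.99

for fp in (0.05, 0.01, 0.001, 0.0001):
    rate = true_positive_rate(P_S, DETECTION, fp)
    print(f"false-positive rate {fp:<7} -> P(S|D) = {rate:.4f}")
```

Under these assumed inputs, a false-positive rate of 1 percent leaves P(S|D) just under 0.01, so roughly 99 of every 100 flagged images would be false alarms. Only when the false-positive rate drops to the same order of magnitude as P(S) does a flagged image become about as likely as not to carry hidden content.
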
7. Conclusion
Computer forensic professionals need to be aware of the
difficulties in identifying the use of steganography in any
investigation. As with many digital age technologies,
steganography techniques are becoming increasingly sophisticated
and difficult to detect reliably. Once their use is detected or
discovered, recovering the embedded content is becoming difficult
as well. Acquiring knowledge of current steganographic techniques,
along with their associated data types, can provide a critical
advantage to an investigator by adding valuable tools to their
forensic toolkit.

Finally, because relatively simple techniques can deny the
exploitation of a covert steganographic channel, companies may
wish to take precautionary measures. By enacting the measures
discussed in this paper, they can ensure their proprietary and
trade secret information is not being shoplifted inside the daily
podcast, shared in family photos, or distributed via the latest
YouTube video.
REFERENCES