
ACKNOWLEDGEMENT

We would like to place on record our deep sense of gratitude to Mr. Raju Pal, faculty, Jaypee
Institute of Information Technology, India, for his generous guidance, help and useful
suggestions.

We express our sincere gratitude to Mr. Vivek Goel, Dept. of Computer Science Engineering,
Jaypee Institute of Information Technology, India, for his stimulating guidance and continuous
encouragement throughout the course of the present work.

Signature(s) of Students

Disha Patil (9912103430)
Pratika Kochar (9912103412)
Priyanka Agrawal (9912103425)
Vivek Mishra (9912103414)

TABLE OF CONTENTS

S.No   Ch. No   Topic
1.              Abstract
2.              Table of Figures
3.              Abbreviations
4.              Introduction
5.              Background Study
6.              Requirement Analysis
7.     3.1      Software Requirements
8.     3.2      Hardware Requirements
9.     3.3      Functional Requirements
10.    3.4      Non-Functional Requirements
11.    3.5      UML Diagrams
12.    3.5.1    Use Case Diagram
13.    3.5.2    Class Diagram
14.    3.5.3    Sequence Diagram
15.    3.5.4    Data Flow Diagrams Level 0 and 1
16.             Detailed Design
17.             Implementation
18.             Testing
19.             Conclusion and Future Scope
20.             Gantt Chart Phase
21.             References

ABSTRACT

The Internet has grown rapidly in recent years, which has increased the demand for techniques
that can ensure information security. In this project, we propose a text steganography technique
that uses HTML documents as the cover medium to hide secret messages. Because HTML documents
are fundamental elements of the web and are very common on the Internet, the presence of a
secret message in one is unlikely to arouse suspicion. We have implemented our technique in
Java. The proposed technique also integrates cryptography with steganography by first
encrypting the secret message and then hiding the encrypted message in the HTML cover medium.
This integration provides an extra layer of security that helps ensure the safe delivery of the
message to the intended recipient.


LIST OF FIGURES

S. No   Figure
1.      Hiding Process
2.      Extracting Process
3.      Use Case Diagram
4.      Class Diagram
5.      Sequence Diagram
6.      DFD Level 0
7.      DFD Level 1
8.      Conversion of Secret key
9.      Decryption while running applet
10.     Running applet from webpage - Successful
11.     Running applet from webpage - Fail

ABBREVIATIONS

1. Applet: A Java program that runs inside a web browser
2. HTML: HyperText Markup Language
3. JAR: Java Archive
4. JNLP: Java Network Launch Protocol
5. JRE: Java Runtime Environment
6. jsoup: Java library for HTML parsing
7. XML: Extensible Markup Language


INTRODUCTION

With the expansion of the Internet, our lives are strongly influenced by it. Many daily affairs,
administrative activities, business transactions, educational systems and the like are now
carried out over the Internet. An immediate result of this development is that much software
previously offered offline is now delivered online. Examples include simulation programs (such
as simulations of physical experiments), administrative programs and Internet games.

Java Applets are programs written in Java that one may receive and execute over the Internet.
Although everyone has free access to the Internet, some owners of such software demand a price
before allowing its use. Moreover, some are not willing to see their programs installed or
executed on other web pages and instead want to confine the programs to their own sites.
Different methods have been proposed to that end.

As mentioned earlier, much Internet-based software takes the form of Java Applets. Such
software exists either as a class file or compressed in a "jar" file. Anyone may download these
files and place them on another web page, against the wishes of the original owners, who want
to confine the software to their own pages. Solutions are therefore needed to protect Java
Applets from being copied by other users.

In this project, the "Steganography in HTML web pages" method is used to protect the Java
Applet and make a copied applet fail to run on other pages. Steganography is a method of covert
data exchange that has received attention in recent years; its chief aim is to hide data within
a cover medium so that other individuals fail to realize the data exists. Most steganography
work has been done on images, video clips, music and sounds, and comparatively little on text
steganography. Recently, data security has been improved by combining steganography with other
methods. Steganography also has applications beyond covert data exchange, including copyright
protection and the prevention of e-document forgery.


BACKGROUND STUDY
Literature Survey

PAPER 1:
TITLE OF PAPER: Java Applets Copy Protection by Steganography
AUTHORS: Mohammad Shirali-Shahreza
YEAR OF PUBLICATION: 2006
PUBLISHING DETAILS: In the Proceedings of the International Conference on Intelligent
Information Hiding and Multimedia Signal Processing (IIH-MSP'06)

SUMMARY
This paper offers a way to copy-protect Java Applets by tying them to their host web page using
the "Steganography in HTML web pages" method. A special 8-character string such as "ABC123+D"
is first hidden in the host web page using a secret key. ID is an attribute that can appear in
any HTML tag and gives each element a unique identifier.
In this method, data is hidden in the ID attribute of tags. First, the bytes of the input data
are transformed by a conversion function into a combination of letters and digits so that they
can be used in the ID attribute. The same special 8-character string and the secret key are
embedded in the Java Applet, and the applet is placed in the HTML page carrying the hidden
data. When the applet is executed, it uses its secret key to extract the string hidden in the
HTML page according to an algorithm.
Because the method is new, especially in its use of HTML pages as the cover medium, and is easy
to apply, it is attractive and can be extended further. Instead of hiding the special
8-character string in the HTML page itself, it could be hidden in an image or other media in
the page. Since Java is used, the method does not rely on a particular platform. In general,
the method is new, easy to use, and offers high flexibility and potential for extension.
WEB LINK:
http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=4041744&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D4041744

PAPER 2:
TITLE OF PAPER: A New Method for Steganography in HTML Files
AUTHORS: M. Shirali-Shahreza
YEAR OF PUBLICATION: 2005
PUBLISHING DETAILS: Proc. of the International Conference on Computer, Information and System
Sciences, and Engineering (CISSE 2005)

SUMMARY
This paper introduces a method for exchanging information by steganography in HTML pages. The
main idea is to hide coded data in the ID attribute of the HTML document tags.
Since there are numerous HTML pages on the web, this method can be widely used. First, the
bytes of the input data are transformed by a conversion function into a combination of letters
and digits so that they can be used in the ID attribute of tags.
The tags are not chosen arbitrarily: the ASCII code of the secret key modulo 3 gives the gap
between tags, and only tags separated by that gap are used to hide data. The part of the web
application that uses this method needs a decoder algorithm that identifies the tags in which
data is hidden and extracts the information from them.
WEB LINK:
http://link.springer.com/chapter/10.1007%2F1-4020-5261-8_39

PAPER 3:
TITLE OF PAPER: A Novel Text Steganography Technique Based on HTML Documents
AUTHORS: Mohit Garg
YEAR OF PUBLICATION: 2011
PUBLISHING DETAILS: International Journal of Advanced Science and Technology, Vol. 35,
October 2011

SUMMARY
This paper proposes a text steganography technique that uses HTML documents as the cover medium
to hide secret messages. HTML documents are fundamental elements of the web and are very common
on the Internet, so the existence of a secret message in one is unlikely to arouse suspicion.
The technique was implemented using C#.NET.
The technique consists of three main components:
1. Key file generation
2. Hiding the message
3. Extracting the message
Key file generation: This is the most important component of the technique. The key file is
essentially a collection of key combinations stored in rows and columns. Each combination is an
attribute pair that is a candidate for hiding a bit. These combinations are derived from the
HTML document.

Hiding the message: To hide a message, first convert it to binary, that is, to a bit stream.
Then scan the HTML document to find the attribute combinations that can be used to hide a bit.

Extracting the message: The extraction component is quite simple. It first scans the document
to find the attribute pairs that hide a bit, using a procedure similar to the one used for
hiding. For each attribute pair found, it compares the positions of the two attributes
according to the key file. If the primary attribute lies before the secondary attribute (as
determined by their positions), a bit 1 is recorded; otherwise a bit 0 is recorded.
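
The extraction rule summarized above can be sketched in Java. This is only an illustration and not the C#.NET implementation from the paper: the class name AttributeOrderBits, both method names, and the representation of a tag as an ordered list of attribute names are assumptions made for this example. The hiding side is inferred from the idea of reordering attributes, which does not change how the page is rendered.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch (hypothetical names): one bit carried by the order of two attributes. */
final class AttributeOrderBits {

    /** Returns 1 if primary precedes secondary, 0 if it follows, -1 if the pair is absent. */
    static int extractBit(List<String> attributes, String primary, String secondary) {
        int p = attributes.indexOf(primary);
        int s = attributes.indexOf(secondary);
        if (p < 0 || s < 0) {
            return -1; // this tag does not contain the attribute pair
        }
        return p < s ? 1 : 0;
    }

    /** Returns a copy of the attribute list ordered so that it encodes the given bit. */
    static List<String> hideBit(List<String> attributes, String primary, String secondary, int bit) {
        List<String> out = new ArrayList<>(attributes);
        int p = out.indexOf(primary);
        int s = out.indexOf(secondary);
        if (p < 0 || s < 0) {
            throw new IllegalArgumentException("tag lacks the attribute pair");
        }
        boolean primaryFirst = p < s;
        if (primaryFirst != (bit == 1)) {
            out.set(p, secondary); // swap the two attributes to flip the encoded bit
            out.set(s, primary);
        }
        return out;
    }
}
```

In this sketch, a tag whose attributes appear in the order width, height carries a 1 for the pair (width, height), and swapping the two attributes turns it into a 0.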
WEB LINK :
http://www.sersc.org/journals/IJAST/vol35/11.pdf


REQUIREMENT ANALYSIS
3.1 SOFTWARE REQUIREMENTS

Java Development Kit 1.8 and Java Runtime Environment (Java 6 and above)

Any operating system capable of running Java applications is supported.

Front End: Java Swing, HTML webpage, Java Applets

Technology: NetBeans IDE 8.0

Platform: Windows Vista/7/8, Mac OS X, Solaris, Ubuntu, etc. (Java is platform independent)

Language: Java, HTML


3.2 HARDWARE REQUIREMENTS

This product requires a keyboard to enter the secret key. Other input devices can also be
used, provided they offer the same functionality as a keyboard.

3.3 FUNCTIONAL REQUIREMENTS

Step 1:
The first step is to enter the 8-character secret key into the tool. The key is encoded
(2-to-4 character conversion), and the encoded strings are used to generate the id values of
the HTML tag elements that will carry the key.

Step 2:
Once the HTML page that will contain the applet is identified, the next step is to locate the
id attributes of that page and update the id=value pairs with the values produced by the
encoding function.

Step 3:
The Java Applet identifies the right set of ids in the HTML page where it is loaded and
decodes their values in the right sequence, based on their relative positions.

3.4 NON-FUNCTIONAL REQUIREMENTS

1. Portability: The system is developed in a platform-independent language (Java), which
makes it portable.
2. Usability: The system is user friendly.
3. Adaptability: The system is adaptable to all operating systems.
4. Reliability: The product does not crash when the user enters invalid values. It shows an
appropriate message for each invalid input.


3.5 UML DIAGRAMS

3.5.1 USE CASE DIAGRAM

Figure 3

3.5.2 CLASS DIAGRAM

Figure 4

3.5.3 SEQUENCE DIAGRAM

Figure 5

3.5.4 DATA FLOW DIAGRAM

DFD Level 0

Figure 6

DFD Level 1

Figure 7

DETAILED DESIGN
Working of Program

1. Conversion of Secret Key

Figure 8

2. Decryption while running applet

Figure 9

3. Running the applet from the intended webpage street.html: successful run.

Figure 10

4. Running the applet from another (fake) webpage buck.html: applet failed.

Figure 11

IMPLEMENTATION
DATA STRUCTURES AND ALGORITHMS

Step 1: How to encode the secret key (2-to-4 character conversion)

Every byte has an ASCII code from 0 to 255.

To turn a byte into two characters, first save the last (units) digit of its ASCII code.

The remaining leading digits of the ASCII code form a number from 0 to 25, giving 26 values,
one for each letter of the English alphabet (A to Z).

Save the letter that corresponds to that number.

Do the same for the second byte.

Put the two saved letters and the two saved digits next to each other. The coded information
is now ready.

The function to encode 2 characters into 4 characters:

    // raf is assumed to be an open file handle declared elsewhere in the class
    // (for example a java.io.RandomAccessFile), used only to log the encoded groups.
    private String encode(char[] mako) throws
            FileNotFoundException, IOException {
        char obj1 = mako[0];
        char obj2 = mako[1];
        int h1 = (int) obj1;            // ASCII code of the first character
        int h2 = (int) obj2;            // ASCII code of the second character
        int y1 = h1 / 10;               // leading digits (0 to 25), mapped to a letter below
        int y2 = h2 / 10;
        int w1 = h1 % 10;               // last digit, stored as is
        int w2 = h2 % 10;
        int analogous_int1 = y1 + 64;   // shift into the letter range of the ASCII table
        int analogous_int2 = y2 + 64;
        char character1 = (char) analogous_int1; // letter obtained from the ASCII code
        char character2 = (char) analogous_int2;
        String msg = "" + character1 + character2 + w1 + w2;
        raf.writeBytes(msg + "\r\n");   // kept for reference while building the HTML document
        return msg;
    }
For decoding, this algorithm will be reversed.

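Decoding function of the applet:

    // Reverses encode(): cd is a 4-character group such as "GL75".
    // The signature and the declarations of h1, h2 and h3 are reconstructed from
    // how these names are used in the rest of the function.
    private String func(String cd) {
        char h1 = cd.charAt(0);                       // first letter
        char h2 = cd.charAt(1);                       // second letter
        int h3 = Integer.parseInt("" + cd.charAt(2)); // last digit of the first ASCII code
        int h4 = Integer.parseInt("" + cd.charAt(3)); // last digit of the second ASCII code
        int temp = (int) h1;
        int temp1 = (int) h2;
        int temp_integer = 64;
        temp -= temp_integer;
        temp *= 10;                     // leading digits of the first ASCII code
        temp1 -= temp_integer;
        temp1 *= 10;
        int ascii = temp + h3;          // complete ASCII code
        int ascii1 = temp1 + h4;
        char letter = (char) ascii;
        char letter1 = (char) ascii1;
        return "" + letter + letter1;
    }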
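
The conversion and its inverse can be checked with a small self-contained round trip. The sketch below follows the arithmetic of the functions above (integer division and remainder by 10, letter offset 64), but the class name KeyCodec and its method names are hypothetical, and the file logging is left out. With this offset a quotient of 0 maps to '@' rather than a letter, which only matters for character codes below 10.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch (hypothetical names) of the 2-to-4 character conversion and its inverse. */
final class KeyCodec {

    private static final int LETTER_OFFSET = 64; // same offset as the report's code

    /** Encodes two characters (codes 0..255) as two letters followed by two digits. */
    static String encodePair(char c1, char c2) {
        if (c1 > 255 || c2 > 255) {
            throw new IllegalArgumentException("characters must fit in one byte");
        }
        char letter1 = (char) (c1 / 10 + LETTER_OFFSET); // leading digits become a letter
        char letter2 = (char) (c2 / 10 + LETTER_OFFSET);
        return "" + letter1 + letter2 + (c1 % 10) + (c2 % 10); // last digits stay digits
    }

    /** Decodes a 4-character group produced by encodePair back into two characters. */
    static String decodePair(String code) {
        if (code.length() != 4) {
            throw new IllegalArgumentException("expected a 4-character group");
        }
        int tens1 = code.charAt(0) - LETTER_OFFSET;
        int tens2 = code.charAt(1) - LETTER_OFFSET;
        int units1 = code.charAt(2) - '0';
        int units2 = code.charAt(3) - '0';
        return "" + (char) (tens1 * 10 + units1) + (char) (tens2 * 10 + units2);
    }

    /** Encodes a key of even length two characters at a time. */
    static List<String> encodeKey(String key) {
        if (key.length() % 2 != 0) {
            throw new IllegalArgumentException("key length must be even");
        }
        List<String> groups = new ArrayList<>();
        for (int i = 0; i < key.length(); i += 2) {
            groups.add(encodePair(key.charAt(i), key.charAt(i + 1)));
        }
        return groups;
    }
}
```

For example, encodePair('M', '}') gives "GL75" (77 becomes G and 7, 125 becomes L and 5), and the key "ABC123+D" is split into the four groups FF56, FD79, EE01 and DF38.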

Step 2: Giving IDs to HTML elements based on the secret key conversion

After the encoded characters are produced, the ID for a tag is created by combining the object
name, the HTML page title and the four encoded characters.

First, take the first two letters of the object name.

Then take the first three letters of the HTML page title.

Now create the tag's ID by concatenating the first two letters of the object name, a zero
digit, the first three letters of the HTML page title and the four encoded characters. For
example, for the HTML page buck.html the id value created is
<div id=di0bucGL75>
where GL75 is the encoded version of two characters of the secret key.
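
The ID construction can be sketched as follows. The class and method names (TagIdBuilder, buildId, payloadOf) are hypothetical, and the page title is assumed to start with "buck" for the example above. The report does not say how tags with one-letter names such as p or a are handled; this sketch simply rejects them.

```java
/** Illustrative sketch (hypothetical names) of building and reading a carrier id. */
final class TagIdBuilder {

    /** Builds an id such as "di0bucGL75" from tag name "div", title "buck" and group "GL75". */
    static String buildId(String tagName, String pageTitle, String encoded) {
        if (tagName.length() < 2 || pageTitle.length() < 3 || encoded.length() != 4) {
            throw new IllegalArgumentException("need 2+ tag letters, 3+ title letters, 4-char code");
        }
        return tagName.substring(0, 2)    // first two letters of the object name
                + "0"                     // the zero digit
                + pageTitle.substring(0, 3) // first three letters of the page title
                + encoded;                // the four encoded characters
    }

    /** Returns the four encoded characters, which occupy positions 6..9 of the id. */
    static String payloadOf(String id) {
        return id.substring(6, 10);
    }
}
```

Because the prefix always has six characters, the encoded group always sits at positions 6 to 9, which is why the applet can recover it with substring(6,10).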

Step 3: Running the applet and extracting data in its init method

First, the applet extracts data from the HTML file from which it is called, using the jsoup
library:

    // needs java.net.URL, org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.select.Elements
    URL docBase = getDocumentBase(); // URL of the page that hosts the applet
    Document doc = Jsoup.parse(docBase.openStream(), "UTF-8", docBase.toString()); // parse the page
    Elements links = doc.select("[id]"); // select all HTML elements that have an id attribute

Starting from the element at index 0 in links, we then move to the 4th, 7th and 10th elements
and read the value of the id attribute of each.

These elements are chosen according to (ASCII code of secret key) modulo 3 = ID gap, which
here gives a gap of 2 elements between selected tags.

We then take substring(6,10) of each selected id value and decode it with the function
func(String), which converts the encoded data back to the original characters. The applet
compares the result with the secret key that was embedded in it at creation. If they match,
the start() method is called from init(); otherwise the destroy() method is called.
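
The selection and comparison logic can be sketched without the applet or jsoup by working on a plain list of id values in document order. The names AppletKeyCheck, extractKey and pageIsGenuine are hypothetical, and the step of 3 (a gap of 2 skipped elements) is taken from the description above as a fixed assumption rather than computed from the key.

```java
import java.util.List;

/** Illustrative sketch (hypothetical names) of the check performed in init(). */
final class AppletKeyCheck {

    /** Inverse of the 2-to-4 character conversion, with the same offset of 64. */
    static String decodePair(String code) {
        int tens1 = code.charAt(0) - 64;
        int tens2 = code.charAt(1) - 64;
        int units1 = code.charAt(2) - '0';
        int units2 = code.charAt(3) - '0';
        return "" + (char) (tens1 * 10 + units1) + (char) (tens2 * 10 + units2);
    }

    /** Reads ids at indices 0, step, 2*step, ... and decodes positions 6..9 of each. */
    static String extractKey(List<String> ids, int step, int groups) {
        StringBuilder key = new StringBuilder();
        for (int g = 0; g < groups; g++) {
            int index = g * step;
            if (index >= ids.size() || ids.get(index).length() < 10) {
                return null; // the page does not carry enough hidden groups
            }
            key.append(decodePair(ids.get(index).substring(6, 10)));
        }
        return key.toString();
    }

    /** True when the key hidden in the page matches the key embedded in the applet. */
    static boolean pageIsGenuine(List<String> ids, String expectedKey) {
        String found = extractKey(ids, 3, expectedKey.length() / 2);
        return expectedKey.equals(found);
    }
}
```

In an applet, the list would come from the id attributes of the selected elements, and the boolean result would decide whether start() or destroy() is called.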


TESTING
All test cases in prescribed format

Test Case ID                                  | Input                                     | Expected Output                                   | Status
Encryption Function (2-to-4 char conversion)  | Any value from keyboard                   | Encoded string of 2 letters followed by 2 digits  | Pass
Decryption Function (4-to-2 char conversion)  | Complete encoded string from part 1       | Input value entered from keyboard in part 1       | Pass
Applet Load on Real Webpage                   | Intended page URL                         | Applet runs                                       | Pass
Applet Load on Fake ID                        | Other page URL                            | Applet fails to load                              | Pass
False Input                                   | Entering more than 8 characters in the key | Applet throws exception                          | Pass
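
The "False Input" case can be illustrated with a small validation sketch. The class name SecretKeyValidator, the choice of IllegalArgumentException and the rejection of keys shorter than 8 characters are assumptions; the report only states that a key longer than 8 characters causes an exception.

```java
/** Illustrative sketch (hypothetical names) of rejecting an invalid secret key. */
final class SecretKeyValidator {

    static final int KEY_LENGTH = 8; // the report uses an 8-character secret key

    /** Returns the key unchanged if it is usable, otherwise throws. */
    static String requireValidKey(String key) {
        if (key == null || key.length() != KEY_LENGTH) {
            throw new IllegalArgumentException(
                    "secret key must be exactly " + KEY_LENGTH + " characters");
        }
        for (int i = 0; i < key.length(); i++) {
            if (key.charAt(i) > 255) { // the encoding assumes codes from 0 to 255
                throw new IllegalArgumentException("secret key characters must fit in one byte");
            }
        }
        return key;
    }
}
```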


CONCLUSION
Various methods have been proposed for text steganography. In this project, we proposed a
novel approach to text steganography that uses HTML tags and attributes to hide secret
messages. The basic idea of the proposed technique is to hide messages by changing the order
of attributes, since the ordering of attributes does not affect the appearance of HTML
documents. HTML documents are fundamental elements of the web. They are very common on the
Internet and are therefore less likely to make an intruder suspect the existence of a secret
message. Moreover, any HTML document has a considerable number of tags and attributes, so the
capacity for hiding secret messages is also high in the proposed technique.

FUTURE SCOPE

Instead of hiding the special 8-character string in the HTML page itself, we can hide it in
an image or other media in the HTML page.

At present our applet is run from the NetBeans IDE. The code could be modified to run
directly in the browser, which would make the project more user friendly.

Rather than hiding only an 8-character string, the range can be extended to hide strings of
12, 16 or more characters.


Gantt Chart Phase


1: Project initiation and Requirement gathering Phase

2: Planning, Estimating and Scheduling Phase

3: Modeling, Analysis and Design Phase

4: Coding and Unit testing Phase

5: Component integration and System testing


REFERENCES

M. Shirali-Shahreza, "A New Method for Steganography in HTML Files", Proc. of the Int. Joint
Conf. on Computer, Information, and Systems Sciences, and Engineering (CISSE 2005),
pp. 247-251, December 10-20, 2005.

M. Shirali-Shahreza, "Java Applets Copy Protection by Steganography", Proceedings of the 2006
International Conference on Intelligent Information Hiding and Multimedia Signal Processing,
2006.

Goodman D., Dynamic HTML: The Definitive Reference, Second Edition, O'Reilly & Associates
Inc., 2002.

Herbert Schildt, Java: The Complete Reference, Eighth Edition, Oracle Press & Tata
McGraw-Hill Edition, 2003.

The Java(TM) Tutorial - What Applets Can and Cannot Do:
http://docs.oracle.com/javase/tutorial/deployment/applet/security.html, 2015