
An Efficient Dynamic-Aware Public Auditing Scheme for Cloud Data Storage with Fair Arbitration
Cloud computing has made storage outsourcing a rising trend, which in turn has made secure remote data auditing a hot topic in the research literature. Recently, some research has considered the problem of secure and efficient public integrity auditing for shared dynamic data. Static and dynamic proof-of-storage schemes have been proposed for secure cloud storage scenarios. In this setting, a client outsources storage of her data to a server, which may, willingly or not, corrupt the data (e.g., due to hardware or software failures) or delete infrequently accessed parts to save space. Most existing schemes solve only part of this problem: the client may ask the server for a cryptographic proof of integrity, but what happens if this proof fails to verify? We argue that in such a case, both the client and the server should be able to go to an official court, each providing cryptographic evidence, so that a judge can resolve the dispute. We show that this property is stronger than what has been known as public verifiability, in the sense that official arbitration must handle a malicious client as well. We make this formal difference explicit, and then present multiple schemes that work generically with various static and dynamic storage solutions. We implement our schemes and show that they are very efficient, diminishing the validity of arguments against their use: the overhead of adding the ability to resolve such disputes in court is only about 2 ms and 80 bytes per update on the stored data, using standard desktop hardware. Finally, we note that disputes may arise in many other situations, such as when two parties exchange items (e.g., e-commerce) or agree on something (e.g., contract signing). We show that our official arbitration protocols extend easily to a general case, including dynamic authenticated data structures.
Efficient secure cloud storage protocols have been proposed since 2007, starting with the provable data possession (PDP) work of Ateniese et al. and the proof-of-retrievability (PoR) work of Juels and Kaliski, followed by many others. In these scenarios, the client outsources storage of her files to a server (e.g., Dropbox, Amazon S3, Google Drive, Microsoft SkyDrive) but does not necessarily fully trust the server. In particular, the client would like some warranty over the integrity of her files. To achieve this, the client stores some metadata M, and may later challenge the server on a random subset of blocks of her data, obtaining a cryptographic proof of integrity in return. If this proof verifies against the client's metadata, then with high probability the client's whole data is intact, and the client may rest in peace. The problem begins when the server's proof fails to verify. In this case, ideally, the client and the server should be able to resolve their dispute at a court: by obtaining cryptographic evidence from both parties, the judge should be able to reach the correct ruling. Previously, a related property named public verifiability has been proposed. Unfortunately, we show that public verifiability is not enough for the judge to rule, by showing that a dishonest client may frame an honest server in the publicly verifiable version of PDP (see Section 3). The main reason is that the public verifiability property was intended to let third parties perform verification, whereas in our model the client is assumed to be potentially malicious, since there are potential monetary gains behind framing an honest server. For example, Amazon may offer a warranty in case of data loss, and a malicious client may want to obtain this warranty regardless. Note that such a new business model for cloud storage with warranty is desirable, since it may attract enterprise customers. Thus, we formalize official arbitration, distinct from public verifiability, to ensure that the resulting protocol can be used by a judge officially.

Next, we observe that there is a generic and easy solution to this issue in the static setting, where the client's data does not change over time (even by the client herself). The basic idea is to obtain the server's signature on the metadata M once, when the file is uploaded. At that time, if the server's signature fails to verify, the client may assume that her files are not backed up. If the signature verifies and some dispute occurs later, the client can present this metadata M, together with the server's signature on it, to the judge. At this point, through the use of secure cloud storage proofs (i.e., the server's proof that the file is intact), the judge can arbitrate between the client and the server.
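The static-setting idea above can be sketched in a few lines of Java. This is an illustrative reduction, not the paper's exact protocol: the class name and the string verdicts are invented for the example, and a plain RSA signature stands in for whatever signature scheme the real system uses. The point is the judge's decision logic: a client claim is rejected unless the server's signature on M verifies, so a malicious client cannot frame the server with forged metadata.

```java
import java.security.*;

// Sketch of static-setting arbitration (illustrative names, not the paper's
// implementation): the server signs the client's metadata M at upload time,
// and the judge later verifies that signature before ruling on a dispute.
public class StaticArbitration {

    // Server side: sign the metadata M with the server's private key.
    public static byte[] signMetadata(PrivateKey serverKey, byte[] metadata)
            throws GeneralSecurityException {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initSign(serverKey);
        sig.update(metadata);
        return sig.sign();
    }

    // Judge side: rule against the server only if the server really signed M
    // (so a malicious client cannot frame it with forged metadata) and the
    // server's current storage proof fails against M.
    public static String arbitrate(PublicKey serverPub, byte[] metadata,
                                   byte[] signature, boolean proofVerifies)
            throws GeneralSecurityException {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initVerify(serverPub);
        sig.update(metadata);
        if (!sig.verify(signature)) {
            return "reject client claim";   // M was never signed by the server
        }
        return proofVerifies ? "server honest" : "server at fault";
    }

    public static void main(String[] args) throws Exception {
        KeyPair kp = KeyPairGenerator.getInstance("RSA").generateKeyPair();
        byte[] m = "metadata-M".getBytes();
        byte[] s = signMetadata(kp.getPrivate(), m);
        // A valid signature on M plus a failing storage proof: server at fault.
        System.out.println(arbitrate(kp.getPublic(), m, s, false));
    }
}
```

The `proofVerifies` flag abstracts away the storage proof itself; in a real scheme the judge would run the proof-of-storage verification against M instead of receiving a boolean.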
The development of cloud computing motivates enterprises and organizations to outsource their data to third-party cloud service providers (CSPs), alleviating the storage limitations of resource-constrained local devices. Recently, commercial cloud storage services, such as Amazon's Simple Storage Service (S3) online data backup service and practical cloud-based software such as Google Drive, Dropbox, Mozy, Bitcasa and Memopal [6], have been built for cloud applications. Since cloud servers may return invalid results in some cases, such as server hardware/software failure, human maintenance errors and malicious attacks [7], new forms of assurance of data integrity and accessibility are required to protect the security and privacy of cloud users' data.

Cloud computing has been envisioned as the next-generation information technology (IT) architecture for enterprises, due to its long list of advantages unprecedented in the history of IT: on-demand self-service, ubiquitous network access, location-independent resource pooling, rapid resource elasticity, usage-based pricing and transference of risk. As a disruptive technology with profound implications, cloud computing is transforming the very nature of how businesses use information technology. One fundamental aspect of this paradigm shift is that data is being centralized or outsourced to the cloud. From the users' perspective, including both individuals and IT enterprises, storing data remotely in the cloud in a flexible, on-demand manner brings appealing benefits: relief from the burden of storage management, universal data access with location independence, and avoidance of capital expenditure on hardware, software and personnel maintenance. While cloud computing makes these advantages more appealing than ever, it also brings new and challenging security threats to users' outsourced data. Since cloud service providers (CSPs) are separate administrative entities, data outsourcing actually relinquishes the user's ultimate control over the fate of their data. As a result, the correctness of the data in the cloud is put at risk for the following reasons. First, although the infrastructures under the cloud are much more powerful and reliable than personal computing devices, they still face a broad range of both internal and external threats to data integrity; examples of outages and security breaches of noteworthy cloud services appear from time to time. Second, there exist various motivations for a CSP to behave unfaithfully toward cloud users regarding their outsourced data: for example, a CSP might reclaim storage for monetary reasons by discarding data that is rarely accessed, or even hide data-loss incidents to maintain its reputation. In short, although outsourcing data to the cloud is economically attractive for long-term, large-scale storage, it does not immediately offer any guarantee of data integrity and availability. This problem, if not properly addressed, may impede the success of the cloud architecture.

ABSTRACT:
Cloud users no longer physically possess their data, so ensuring the integrity of their outsourced data becomes a challenging task. Recently proposed schemes such as provable data possession and proofs of retrievability are designed to address this problem, but they are designed to audit static archival data and therefore lack data dynamics support. Moreover, the threat models in these schemes usually assume an honest data owner and focus on detecting a dishonest cloud service provider, despite the fact that clients may also misbehave. This paper proposes a public auditing scheme with data dynamics support and fair arbitration of potential disputes. In particular, we design an index switcher to eliminate the limitation of index usage in tag computation in current schemes and achieve efficient handling of data dynamics. To address the fairness problem, so that no party can misbehave without being detected, we further extend existing threat models and adopt the signature-exchange idea to design fair arbitration protocols, so that any possible dispute can be settled fairly. The security analysis shows our scheme is provably secure, and the performance evaluation demonstrates that the overhead of data dynamics and dispute arbitration is reasonable.

SYSTEM ANALYSIS

EXISTING SYSTEM:

First of all, earlier auditing schemes usually require the CSP to generate a deterministic proof by accessing the whole data file in order to perform an integrity check.

Secondly, some auditing schemes provide only private verifiability, requiring the data owner, who holds the private key, to perform the auditing task herself; this may overburden the owner, whose computation capability is limited.

Thirdly, PDP and PoR are intended to audit static data that are seldom updated, so these schemes do not provide data dynamics support. Yet from a general perspective, data updates are a very common requirement in cloud applications.

DISADVANTAGES OF EXISTING SYSTEM:

Providing data dynamics support is the most challenging issue. Most existing auditing schemes embed a block's index i into its tag computation, which serves to authenticate challenged blocks. However, if we insert or delete a block, the indices of all subsequent blocks change, so the tags of all those blocks have to be re-computed. This is unacceptable because of its high computation overhead.

Current research usually assumes an honest data owner in its security model, which has an inborn inclination toward cloud users. In fact, not only the cloud but also cloud users may have motives to engage in deceitful behavior.

In the existing systems, there is no integrity auditing scheme that simultaneously provides public verifiability, efficient data dynamics and fair dispute arbitration.

The existing systems suffer from the limitation of index usage in tag computation.

In the existing systems, block update operations cause tag re-computation.

In the existing systems, both clients and the CSP may misbehave during auditing and data updates.

PROPOSED SYSTEM:

We address this problem by differentiating between the tag index (used for tag computation) and the block index (indicating block position), and rely on an index switcher to keep a mapping between them. Upon each update operation, we allocate a new tag index to the operated block and update the mapping between tag indices and block indices. Such a layer of indirection between block indices and tag indices enables block authentication while avoiding tag re-computation for the blocks after the operation position. As a result, the efficiency of handling data dynamics is greatly enhanced.
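The index-switcher idea can be illustrated with a minimal sketch (the class and method names below are invented for the example, and the structure is simplified to a plain list rather than the scheme's actual data structure): each block's tag index is fixed at tag-computation time, while its position may shift freely under inserts and deletes.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of an index switcher: tags are computed from a tag index
// that never changes, while the block's position is tracked separately, so
// inserting or deleting a block never forces tag re-computation.
public class IndexSwitcher {
    private final List<Integer> tagIndexOf = new ArrayList<>(); // position -> tag index
    private int nextTagIndex = 0;

    // Append a new block; it receives a fresh tag index.
    public void append() { tagIndexOf.add(nextTagIndex++); }

    // Insert a block at 'position': later blocks shift position,
    // but each keeps the tag index its tag was computed with.
    public void insert(int position) { tagIndexOf.add(position, nextTagIndex++); }

    // Delete the block at 'position'; no other block's tag index changes.
    public void delete(int position) { tagIndexOf.remove(position); }

    // The tag index to use when authenticating the block now at 'position'.
    public int tagIndex(int position) { return tagIndexOf.get(position); }
}
```

After three appends the mapping is positions 0,1,2 to tag indices 0,1,2; inserting at position 1 gives the new block tag index 3 and shifts the old blocks' positions without touching their tag indices, which is exactly what makes the update O(1) in tag computations.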

Furthermore, and importantly, in a public auditing scenario a data owner always delegates his auditing tasks to a TPA who is trusted by the owner but not necessarily by the cloud.

Our work also adopts the idea of signature exchange to ensure metadata correctness and protocol fairness, and we concentrate on combining efficient data dynamics support and fair dispute arbitration in a single auditing scheme.

To address the fairness problem in auditing, we introduce a third-party arbitrator (TPAR) into our threat model: a professional institute for conflict arbitration that is trusted and paid by both data owners and the CSP. Since a TPA can be viewed as a delegate of the data owner and is not necessarily trusted by the CSP, we differentiate between the roles of auditor and arbitrator. Moreover, we adopt the idea of signature exchange to ensure metadata correctness and provide dispute arbitration, so that any conflict about auditing or data updates can be fairly arbitrated.

Overall, this paper proposes a new auditing scheme that addresses data dynamics support, public verifiability and dispute arbitration simultaneously.
ADVANTAGES OF PROPOSED SYSTEM:

The proposed system solves the data dynamics problem in auditing by introducing an index switcher to keep a mapping between block indices and tag indices, eliminating the adverse effect of block indices on tag computation without incurring much overhead.

The proposed system extends the threat model of current research to provide dispute arbitration, which is of great significance and practicality for cloud data auditing, since most existing schemes assume an honest data owner in their threat models.

The proposed system provides a fairness guarantee and dispute arbitration, ensuring that neither the data owner nor the cloud can misbehave in the auditing process without a third-party arbitrator being able to identify the cheating party.
Algorithm:
Proof verification algorithm:
Proof verification checks one aspect of a product's fitness for purpose, with validation as the complementary aspect; together they form the overall checking process. A proof verification algorithm, for any valid input, produces the result required by the algorithm's specification.
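As a toy illustration of what verifying a storage proof means, the sketch below uses per-block SHA-256 digests as a stand-in for the scheme's real homomorphic tags (the class and method names are invented for the example): the client keeps a digest per block as metadata, later challenges a block, and checks the server's returned block against the stored digest.

```java
import java.security.MessageDigest;
import java.util.Arrays;

// Toy proof verification: a hash-based stand-in for real homomorphic tags.
// The client stores tag(block) as metadata at upload time; at audit time it
// recomputes the tag over the server's response and compares.
public class ProofVerification {

    // Metadata kept by the client for one block.
    public static byte[] tag(byte[] block) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(block);
    }

    // Verify the server's response for a challenged block against the stored tag.
    public static boolean verify(byte[] storedTag, byte[] returnedBlock) throws Exception {
        return Arrays.equals(storedTag, tag(returnedBlock));
    }
}
```

Real schemes use homomorphic tags precisely so the server can answer a challenge over many blocks with a single short proof instead of returning the blocks themselves; this sketch only shows the accept/reject decision.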
Asymmetric signature algorithm:
Asymmetric algorithms use different keys for encryption and decryption, and the decryption key cannot be derived from the encryption key. Asymmetric algorithms are important because they can be used to transmit encryption keys or other data securely, even when the parties have had no opportunity to agree on a secret key in private.
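The key-transport use case mentioned above can be sketched with the JDK's crypto API (the class name and parameter choices here are illustrative, not prescribed by the paper): a fresh AES key is encrypted under the receiver's RSA public key, and only the matching private key can recover it.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;
import java.security.PrivateKey;
import java.security.PublicKey;

// Sketch of asymmetric key transport: wrap a symmetric AES key under an RSA
// public key so it can be sent over an insecure channel.
public class KeyTransport {

    // Sender: encrypt the AES key bytes with the receiver's public key.
    public static byte[] wrap(PublicKey pub, SecretKey aesKey) throws Exception {
        Cipher c = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        c.init(Cipher.ENCRYPT_MODE, pub);
        return c.doFinal(aesKey.getEncoded());
    }

    // Receiver: recover the AES key with the matching private key.
    public static SecretKey unwrap(PrivateKey priv, byte[] wrapped) throws Exception {
        Cipher c = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        c.init(Cipher.DECRYPT_MODE, priv);
        return new SecretKeySpec(c.doFinal(wrapped), "AES");
    }
}
```

In practice RSA-OAEP padding and hybrid encryption (RSA only for the key, AES for the bulk data) would be preferred; PKCS#1 v1.5 is used here only to keep the sketch short.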
Attacks:
Network Attack:
A network attack can be defined as any method, process, or means used to maliciously attempt to compromise network security. The individuals performing network attacks are commonly referred to as network attackers, hackers, or crackers.

MODULE DESCRIPTION

1. CLIENT MODULE
In this module, a client makes use of the provider's resources to store, retrieve and share data with multiple users. A client can be either an individual or an enterprise. The client can check an uploaded file and either reject or upload it, and can view duplicate files and, based on this, delete the unwanted data.
2. CSP MODULE:
In this module, the CSP can view all user details, clients' upload details, and clients' activities regarding the secure client-side deduplication scheme in the cloud storage environment.

3. Data Upload and validation:

This module handles the efficient upload of encrypted data to the cloud. A semi-trusted proxy can transform an encryption of a message into another encryption of the same message without learning the message. Users upload their files to a selected location in the database, and every user's data is stored there in encrypted form; when a user later wants to download and view a file, it is decrypted using the secret key.
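The encrypt-before-upload, decrypt-after-download flow described above can be sketched as follows. This is a hypothetical helper, not the system's actual code: AES-GCM with a random IV prepended to the ciphertext is an assumption made for the example.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;

// Sketch of client-side encryption: files are stored in encrypted form and
// decrypted with the user's secret key on download (AES-GCM, random 12-byte
// IV stored in front of the ciphertext).
public class ClientCrypto {
    private static final SecureRandom RNG = new SecureRandom();

    // Encrypt before upload; output = IV || ciphertext+tag.
    public static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[12];
        RNG.nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ct = c.doFinal(plaintext);
        byte[] out = new byte[12 + ct.length];
        System.arraycopy(iv, 0, out, 0, 12);
        System.arraycopy(ct, 0, out, 12, ct.length);
        return out;
    }

    // Decrypt after download using the IV stored in the first 12 bytes.
    public static byte[] decrypt(SecretKey key, byte[] blob) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, blob, 0, 12));
        byte[] ct = new byte[blob.length - 12];
        System.arraycopy(blob, 12, ct, 0, ct.length);
        return c.doFinal(ct);
    }
}
```

GCM also authenticates the ciphertext, so a corrupted download fails to decrypt instead of silently yielding garbage.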

4. Data Arbitration:
The shared data are signed by a group of users, so disputes between the two parties are unavoidable to a certain degree, and an arbitrator for dispute settlement is indispensable for a fair auditing scheme. We extend the threat model of existing public schemes by differentiating between the auditor (TPAU) and the arbitrator (TPAR) and placing different trust assumptions on them. Because the TPAU is mainly a delegated party that checks clients' data integrity, and a potential dispute may occur between the TPAU and the CSP, the arbitrator should be an unbiased third party distinct from the TPAU.

As for the TPAR, we consider it honest-but-curious: it behaves honestly most of the time but is also curious about the content of the audited data, so privacy protection of that data should be considered. Note that, while privacy protection is beyond the scope of this paper, our scheme can adopt the random-mask technique proposed for privacy-preserving auditing, or ring signatures to protect the identity privacy of signers for data shared among a group of users.

5. Auditing Cloud server:

Public auditing schemes mainly focus on delegating auditing tasks to a third-party auditor (TPA) so that the overhead on clients is offloaded as much as possible. However, such models have not seriously considered the fairness problem, as they usually assume an honest owner against an untrusted CSP. Since the TPA acts on behalf of the owner, to what extent can the CSP trust the auditing result? What if the owner and the TPA collude against an honest CSP for financial compensation? In this sense, such models reduce the practicality and applicability of auditing schemes.

6. Clustering Data:
Compared to these schemes, our work is the first to combine public verifiability, data dynamics support and dispute arbitration simultaneously. Other works extend both PDPs and PoRs: some introduce mechanisms for data integrity auditing in the multi-server scenario, where data are encoded with network codes; some ensure possession of multiple replicas across distributed storage; some integrate forward error-correcting codes into PDP to provide robust data possession; and some utilize proxy re-signatures to provide efficient user revocation, where the shared data are signed by a group of users.

Architecture:

REVIEW OF LITERATURE

1. A Review of Secure Authorized Data Arbitration with Encrypted Data for Cloud Storage
Bhavanashri Shivaji Raut, Prof. H. A. Hingoliwala
Cloud storage systems are becoming increasingly popular with the continuous, exponential increase in the number of users and the size of data, and data deduplication is becoming more and more a necessity for cloud storage providers. Data deduplication is an important data compression technique for eliminating duplicate copies of repeating data; it has been widely used in cloud storage to reduce the amount of storage space and save bandwidth. The advantages of deduplication unfortunately come with a high cost in terms of new security and privacy challenges. The scheme proposed in this paper not only reduces the required cloud storage capacity but also improves the speed of data deduplication. To protect confident
2. Control Framework for Secure Cloud Computing
Harshit Srivastava, Sathish Alampalayam Kumar
The Information Technology (IT) industry impacts businesses of every size, and yet security issues continue to pose a big threat to it. The security and privacy issues persisting in cloud computing have proved to be an obstacle to its widespread adoption. In this paper, we look at these issues from a business perspective and at how they damage the reputation of big companies. We review the literature on the existing issues in cloud computing and how they are being tackled by cloud service providers (CSPs). We propose a governing-body framework that aims to solve these issues by establishing relationships among the CSPs, in which data about possible threats can be generated based on previous attacks on other CSPs. The governing body would be responsible for data center control, policy control, legal control, user awareness, performance evaluation, solution architecture and providing motivation for the entities involved.

The literature also differentiates cloud computing offerings by scope. In private clouds, services are provided exclusively to trusted users via a single-tenant operating environment; essentially, an organization's data centre delivers cloud computing services to clients who may or may not be on the premises. Public clouds are the opposite: services are offered to individuals and organizations who want to retain elasticity and accountability without absorbing the full costs of in-house infrastructure, and public cloud users are by default treated as untrustworthy. There are also hybrid clouds, combining both private and public cloud service offerings.

3. Ashish Bhagat, Ravi Kant Sahu

Cloud data security is a major concern for clients using the services provided by a cloud service provider, and security issues and conflicts can arise between the client and the provider. To resolve such issues, a third party can be used as an auditor. In this paper, we analyse various mechanisms for ensuring reliable data storage using cloud services. The cloud provides computing resources as a service rather than a product, delivering utilities to users over the Internet, and data owners remotely store their data in it. The main goal of the cloud computing concept is to secure and protect the data that is the property of users. The security of the cloud computing environment is an active research area that requires further development by both the academic and research communities. In the corporate world, a huge number of clients access and modify data; in the cloud, applications and services move to centralized, huge data centers, and the management of this data may not be trustworthy. Since the computing resources in the cloud environment are under the control of the service provider, a third-party auditor is used to ensure the integrity of the outsourced data. However, a third-party auditor may not only read but also modify the data, so a mechanism should be provided to solve this problem. We examine the contradiction between client and CSP and a new potential security scheme for solving it. The purpose of this paper is to bring greater clarity to the landscape of cloud data security and its solutions at the user level, using encryption algorithms that assure the data owner and the client that their data are intact.

4. Yunchuan Sun, Junsheng Zhang, Yongping Xiong, and Guangyu Zhu stated that:

Data security has consistently been a major issue in information technology. In the cloud computing environment it becomes particularly serious, because the data is located in different places, even across the globe. Data security and privacy protection are the two main factors behind users' concerns about cloud technology. Though many techniques on these topics have been investigated in both academia and industry, data security and privacy protection are becoming ever more important for the future development of cloud computing technology in government, industry, and business. Data security and privacy protection issues are relevant to both hardware and software in the cloud architecture. This study reviews security techniques and challenges, from both software and hardware aspects, for protecting data in the cloud, and aims at enhancing data security and privacy protection for a trustworthy cloud environment. In this paper, we make a comparative analysis of the existing research on the data security and privacy protection techniques used in cloud computing.

5. K. Ren, C. Wang, and Q. Wang, Security Challenges for the Public Cloud

6. Takabi, H., Joshi, J.B.D. and Ahn, G., Cloud Data Storage Security Using Arbitration Technique, stated:
A data partitioning function is used to make storage in the cloud easy. Partitioning happens in alphabetical order using an index method: the first two letters of the file name are retrieved and checked against the existing folders; if no folder with the same letters is present, one is created and the file is stored in it. The partitioned files are encrypted, that is, encoded with the public key, and stored in the cloud. Partitioning takes place automatically when the data is fed for storage in the cloud, and the original file is reconstructed when it needs to be accessed.
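The folder-selection step of this partitioning scheme can be sketched in a few lines (the class and method names are invented for the example, and lowercasing the prefix is an assumption the cited work does not specify): the first two letters of the file name pick the folder, which is created on first use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of alphabetical partitioning: the first two letters of the file
// name select (and if needed create) the target folder.
public class Partitioner {

    // Folder name derived from the file name's first two letters.
    public static String folderFor(String fileName) {
        String prefix = fileName.length() >= 2 ? fileName.substring(0, 2) : fileName;
        return prefix.toLowerCase();
    }

    // Create the folder under 'root' if absent, then return the storage path.
    public static Path storagePath(Path root, String fileName) throws IOException {
        Path dir = root.resolve(folderFor(fileName));
        Files.createDirectories(dir);   // no-op when the folder already exists
        return dir.resolve(fileName);
    }
}
```

Encryption of the partitioned file before storage (as the cited work describes) would happen after `storagePath` resolves the destination.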

SYSTEM SPECIFICATION
Hardware Requirements:

System       : Pentium IV 2.4 GHz
Hard Disk    : 40 GB
Monitor      : 14" Colour Monitor
Mouse        : Optical Mouse
RAM          : 512 MB

Software Requirements:

Operating System : Windows 7
Coding Language  : VB.net
Database         : SQL Server

DATABASE DESIGN
A database system is designed to store large volumes of data. A major purpose of a database system is to provide users with an abstract view of the data; that is, the system hides certain details of how the data are stored and maintained.
Database
A database is a collection of tables. Each table can be defined as a collection of fields. The fields of a table can have different constraints, such as unique, primary key and not null.
Table Name: User Login

FIELD NAME    DATA TYPE    SIZE    DESCRIPTION
UNAME         VARCHAR      20      USER NAME
PASSWORD      VARCHAR      20      USER PASSWORD

Table Name: File Info

FIELD NAME    DATA TYPE    SIZE    DESCRIPTION
FILEID        INT          -       FILE ID
FILENAME      VARCHAR      20      FILE NAME
FILEPATH      VARCHAR      20      FILE PATH
FILESIZE      VARCHAR      20      FILE SIZE
DATE          DATE         -       DATE

Table Name: User Details

FIELD NAME    DATA TYPE    SIZE    DESCRIPTION
UID           INT          -       USER ID
USERNAME      VARCHAR      20      USER NAME
EMAILID       VARCHAR      20      EMAIL ID
MOBNO         VARCHAR      20      MOBILE NO

Table Name: Client Details

FIELD NAME    DATA TYPE    SIZE    DESCRIPTION
ClientID      INT          -       CLIENT ID
ClientName    VARCHAR      20      CLIENT NAME
EMAILID       VARCHAR      20      EMAIL ID
MOBNO         VARCHAR      20      MOBILE NO

Table Name: Mode

FIELD NAME    DATA TYPE    SIZE    DESCRIPTION
Mode          BIT          -       MODE
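The table designs above could be created over JDBC roughly as follows. This is a hypothetical sketch: the Java table names (`UserLogin`, `FileInfo`) and the caller-supplied connection are assumptions for the example, while the column names and types follow the design.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Sketch of creating two of the tables above via JDBC; the Connection is
// supplied by the caller (e.g. from DriverManager with a SQL Server URL).
public class CreateTables {

    public static final String USER_LOGIN_DDL =
        "CREATE TABLE UserLogin (" +
        " UNAME VARCHAR(20) PRIMARY KEY," +     // USER NAME
        " PASSWORD VARCHAR(20) NOT NULL)";      // USER PASSWORD

    public static final String FILE_INFO_DDL =
        "CREATE TABLE FileInfo (" +
        " FILEID INT PRIMARY KEY," +            // FILE ID
        " FILENAME VARCHAR(20)," +              // FILE NAME
        " FILEPATH VARCHAR(20)," +              // FILE PATH
        " FILESIZE VARCHAR(20)," +              // FILE SIZE
        " DATE DATE)";                          // DATE

    // Execute both CREATE TABLE statements on the given connection.
    public static void createAll(Connection con) throws SQLException {
        try (Statement st = con.createStatement()) {
            st.executeUpdate(USER_LOGIN_DDL);
            st.executeUpdate(FILE_INFO_DDL);
        }
    }
}
```

The remaining tables (User Details, Client Details, Mode) would follow the same pattern.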

3. SOFTWARE PROFILE
Java Technology
Java technology is both a programming language and a platform.
The Java Programming Language
The Java programming language is a high-level language that can be
characterized by all of the following buzzwords:

Simple

Architecture neutral

Object oriented

Portable

Distributed

High performance

Interpreted

Multithreaded

Robust

Dynamic

Secure

With most programming languages, you either compile or interpret a program so that you can run it on your computer. The Java programming language is unusual in that a program is both compiled and interpreted. With the compiler, you first translate a program into an intermediate language called Java bytecodes, the platform-independent codes interpreted by the interpreter on the Java platform. The interpreter parses and runs each Java bytecode instruction on the computer. Compilation happens just once; interpretation occurs each time the program is executed. The following figure illustrates how this works.

FIGURE 2- WORKING OF JAVA


You can think of Java bytecodes as the machine-code instructions for the Java Virtual Machine (Java VM). Every Java interpreter, whether it is a development tool or a Web browser that can run applets, is an implementation of the Java VM. Java bytecodes help make "write once, run anywhere" possible. You can compile your program into bytecodes on any platform that has a Java compiler, and the bytecodes can then be run on any implementation of the Java VM. That means that as long as a computer has a Java VM, the same program written in the Java programming language can run on Windows 2000, a Solaris workstation, or an iMac.

The Java Platform


A platform is the hardware or software environment in which a program runs. We have already mentioned some of the most popular platforms, such as Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described as a combination of the operating system and hardware. The Java platform differs from most other platforms in that it is a software-only platform that runs on top of other, hardware-based platforms.
The Java platform has two components:

The Java Virtual Machine (Java VM)

The Java Application Programming Interface (Java API)


You have already been introduced to the Java VM; it is the base for the Java platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software components that provide many useful capabilities, such as graphical user interface (GUI) widgets. The Java API is grouped into libraries of related classes and interfaces; these libraries are known as packages. The next section, What Can Java Technology Do?, highlights what functionality some of the packages in the Java API provide.
The following figure depicts a program that is running on the Java platform. As the figure shows, the Java API and the virtual machine insulate the program from the hardware.

FIGURE 3- THE JAVA PLATFORM


Native code is code that, after compilation, runs on a specific hardware platform. As a platform-independent environment, the Java platform can be a bit slower than native code. However, smart compilers, well-tuned interpreters, and just-in-time bytecode compilers can bring performance close to that of native code without threatening portability.
What Can Java Technology Do?
The most common types of programs written in the Java programming language are applets and applications. If you have surfed the Web, you are probably already familiar with applets: an applet is a program that adheres to certain conventions that allow it to run within a Java-enabled browser.
However, the Java programming language is not just for writing cute, entertaining applets for the Web. The general-purpose, high-level Java programming language is also a powerful software platform, and using its generous API you can write many types of programs.
An application is a standalone program that runs directly on the Java platform. A special kind of application known as a server serves and supports clients on a network; examples of servers are Web servers, proxy servers, mail servers, and print servers. Another specialized program is a servlet, which can almost be thought of as an applet that runs on the server side. Java servlets are a popular choice for building interactive web applications, replacing the use of CGI scripts. Servlets are similar to applets in that they are runtime extensions of applications; instead of working in browsers, though, servlets run within Java Web servers, configuring or tailoring the server.
How does the API support all these kinds of programs? It does so with
packages of software components that provide a wide range of functionality. Every
full implementation of the Java platform gives you the following features:

The essentials: Objects, strings, threads, numbers, input and output, data structures, system properties, date and time, and so on.

Applets: The set of conventions used by applets.

Networking: URLs, TCP (Transmission Control Protocol), UDP (User Datagram Protocol) sockets, and IP (Internet Protocol) addresses.

Internationalization: Help for writing programs that can be localized for users worldwide. Programs can automatically adapt to specific locales and be displayed in the appropriate language.

Security: Both low level and high level, including electronic signatures, public and private key management, access control, and certificates.

Software components: Known as JavaBeans™, can plug into existing component architectures.

Object serialization: Allows lightweight persistence and communication via Remote Method Invocation (RMI).

Java Database Connectivity (JDBC™): Provides uniform access to a wide range of relational databases.
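Object serialization can be sketched in a few lines. The example below is a minimal round-trip through a byte array; the Message class and its field are made up for illustration and not taken from any project code:

```java
import java.io.*;

public class SerializationDemo {
    // A simple serializable value; the Message name is hypothetical.
    static class Message implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;
        Message(String text) { this.text = text; }
    }

    // Serialize an object to a byte array and read an equal copy back.
    static Message roundTrip(Message m) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
                out.writeObject(m);                     // lightweight persistence
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bos.toByteArray()))) {
                return (Message) in.readObject();       // reconstructed copy
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Message("hello")).text); // prints "hello"
    }
}
```

The same writeObject/readObject mechanism is what RMI uses under the hood to move argument and return values across the network.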
The Java platform also has APIs for 2D and 3D graphics, accessibility, servers,
collaboration, telephony, speech, animation, and more. The following figure
depicts what is included in the Java 2 SDK.

FIGURE 4 - JAVA 2 SDK

SWING

The Swing toolkit includes a rich set of components for building GUIs and adding interactivity to Java applications. Swing includes all the components you would expect from a modern toolkit: table controls, list controls, tree controls, buttons, and labels. Swing is far from a simple component toolkit, however. It includes rich undo support, a highly customizable text package, and integrated internationalization and accessibility support. To truly leverage the cross-platform capabilities of the Java platform, Swing supports numerous look and feels, including the ability to create your own look and feel. The ability to create a custom look and feel is made easier with Synth, a look and feel specifically designed to be customized. Swing wouldn't be a component toolkit without the basic user interface primitives such as drag and drop, event handling, customizable painting, and window management.
Swing is part of the Java Foundation Classes (JFC). The JFC also includes other features important to a GUI program, such as the ability to add rich graphics functionality and the ability to create a program that can work in different languages and by users with different input devices.

ODBC
Microsoft Open Database Connectivity (ODBC) is a standard programming
interface for application developers and database systems providers. Before ODBC
became a de facto standard for Windows programs to interface with database
systems, programmers had to use proprietary languages for each database they
wanted to connect to. Now, ODBC has made the choice of the database system
almost irrelevant from a coding perspective, which is as it should be. Application
developers have much more important things to worry about than the syntax that is
needed to port their program from one database to another when business needs
suddenly change.
Through the ODBC Administrator in Control Panel, you can specify the
particular database that is associated with a data source that an ODBC application
program is written to use. Think of an ODBC data source as a door with a name on
it. Each door will lead you to a particular database. For example, the data source
named Sales Figures might be a SQL Server database, whereas the Accounts

Payable data source could refer to an Access database. The physical database
referred to by a data source can reside anywhere on the LAN.
The ODBC system files are not installed on your system by Windows 95. Rather, they are installed when you set up a separate database application, such as SQL Server Client or Visual Basic 4.0. When the ODBC icon is installed in Control Panel, it uses a file called ODBCINST.DLL. It is also possible to administer your ODBC data sources through a stand-alone program called ODBCADM.EXE. There is a 16-bit and a 32-bit version of this program, and each maintains a separate list of ODBC data sources.

From a programming perspective, the beauty of ODBC is that the application can be written to use the same set of function calls to interface with any data source, regardless of the database vendor. The source code of the application doesn't change whether it talks to Oracle or SQL Server. We only mention these
two as an example. There are ODBC drivers available for several dozen popular
database systems. Even Excel spreadsheets and plain text files can be turned into
data sources. The operating system uses the Registry information written by ODBC
Administrator to determine which low-level ODBC drivers are needed to talk to
the data source (such as the interface to Oracle or SQL Server). The loading of the
ODBC drivers is transparent to the ODBC application program. In a client/server
environment, the ODBC API even handles many of the network issues for the
application programmer.
The advantages of this scheme are so numerous that you are probably thinking there must be some catch. The only disadvantage of ODBC is that it isn't as efficient as talking directly to the native database interface. Many detractors have charged that ODBC is too slow. Microsoft has always claimed that the critical factor in performance is the quality of the driver software that is used. In our humble opinion, this is true. The availability of good ODBC drivers has improved a great deal recently. And anyway, the criticism about performance is somewhat analogous to the claim that compilers would never match the speed of pure assembly language. Maybe not, but the compiler (or ODBC) gives you the opportunity to write cleaner programs, which means you finish sooner. Meanwhile, computers get faster every year.

JDBC
In an effort to set an independent database standard API for Java, Sun Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a generic SQL database access mechanism that provides a consistent interface to a variety of RDBMSs. This consistent interface is achieved through the use of plug-in database connectivity modules, or drivers. If a database vendor wishes to have JDBC support, he or she must provide the driver for each platform that the database and Java run on.
To gain wider acceptance of JDBC, Sun based JDBC's framework on ODBC. As you discovered earlier in this chapter, ODBC has widespread support on a variety of platforms. Basing JDBC on ODBC allowed vendors to bring JDBC drivers to market much faster than developing a completely new connectivity solution.

JDBC was announced in March of 1996. It was released for a 90-day public review that ended June 8, 1996. Because of user input, the final JDBC v1.0 specification was released soon after.
The remainder of this section will cover enough information about JDBC for
you to know what it is about and how to use it effectively. This is by no means a
complete overview of JDBC. That would fill an entire book.

JDBC Goals
Few software packages are designed without goals in mind. JDBC is no exception: its many goals drove the development of the API. These goals, in conjunction with early reviewer feedback, finalized the JDBC class library into a solid framework for building database applications in Java.
The goals that were set for JDBC are important. They will give you some insight as to why certain classes and functionalities behave the way they do. The design goals for JDBC are as follows:
1. SQL Level API
The designers felt that their main goal was to define a SQL interface for Java. Although not the lowest database interface level possible, it is at a low enough level for higher-level tools and APIs to be created. Conversely, it is at a high enough level for application programmers to use it confidently. Attaining this goal allows future tool vendors to generate JDBC code and to hide many of JDBC's complexities from the end user.

2. SQL Conformance
SQL syntax varies as you move from database vendor to database vendor. In an
effort to support a wide variety of vendors, JDBC will allow any query statement
to be passed through it to the underlying database driver. This allows the
connectivity module to handle non-standard functionality in a manner that is
suitable for its users.
3. JDBC must be implementable on top of common database interfaces
The JDBC SQL API must sit on top of other common SQL level APIs. This
goal allows JDBC to use existing ODBC level drivers by the use of a software
interface. This interface would translate JDBC calls to ODBC and vice versa.

4. Provide a Java interface that is consistent with the rest of the Java system
Because of Java's acceptance in the user community thus far, the designers felt that they should not stray from the current design of the core Java system.

5. Keep it simple
This goal probably appears in all software design goal listings. JDBC is no
exception. Sun felt that the design of JDBC should be very simple, allowing for
only one method of completing a task per mechanism. Allowing duplicate
functionality only serves to confuse the users of the API.

6. Use strong, static typing wherever possible
Strong typing allows more error checking to be done at compile time; also, fewer errors appear at runtime.

7. Keep the common cases simple
Because, more often than not, the usual SQL calls used by the programmer are simple SELECTs, INSERTs, DELETEs, and UPDATEs, these queries should be simple to perform with JDBC. However, more complex SQL statements should also be possible.
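The common-case shape of JDBC code can be sketched as below. The connection URL, credentials, table, and column names are all hypothetical; a real run also needs a vendor driver on the classpath, and without one the getConnection call fails into the catch block:

```java
import java.sql.*;

public class JdbcSketch {
    // Runs a simple SELECT and prints each row; connection details are made up.
    static void listUsers(String url, String user, String pass) {
        try (Connection con = DriverManager.getConnection(url, user, pass);
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT id, name FROM users")) {
            while (rs.next()) {
                System.out.println(rs.getInt("id") + " " + rs.getString("name"));
            }
        } catch (SQLException e) {
            // With no suitable driver registered, control lands here.
            System.out.println("JDBC error: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        listUsers("jdbc:somedb://localhost/sample", "guest", "guest");
    }
}
```

Note how the same Connection/Statement/ResultSet calls would work unchanged against any vendor's driver, which is exactly the consistency goal described above.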
Networking
TCP/IP stack
The TCP/IP stack is shorter than the OSI one:

FIGURE 5 - TCP/IP STACK


TCP is a connection-oriented protocol; UDP (User Datagram Protocol) is a
connectionless protocol.
IP datagrams
The IP layer provides a connectionless and unreliable delivery system. It considers each datagram independently of the others. Any association between datagrams must be supplied by the higher layers. The IP layer supplies a checksum that includes its own header. The header includes the source and destination addresses. The IP layer handles routing through an Internet. It is also responsible for breaking up large datagrams into smaller ones for transmission and reassembling them at the other end.
TCP
TCP supplies logic to give a reliable connection-oriented protocol above IP.
It provides a virtual circuit that two processes can use to communicate.
Internet addresses
In order to use a service, you must be able to find it. The Internet uses an address scheme for machines so that they can be located. The address is a 32-bit integer, the IP address. This encodes a network ID and further addressing. The network ID falls into various classes according to the size of the network address.
Network address
Class A uses 8 bits for the network address, with 24 bits left over for other addressing. Class B uses 16-bit network addressing. Class C uses 24-bit network addressing, and class D uses all 32.
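The class boundaries above can be expressed as a small helper. This sketch follows classic classful-addressing ranges for the first octet (the AddressClass name and method are illustrative, not from any project code):

```java
public class AddressClass {
    // Returns the classful network class (A-D) for a dotted-quad IPv4 address.
    static char classOf(String dottedQuad) {
        int first = Integer.parseInt(dottedQuad.split("\\.")[0]);
        if (first < 128) return 'A';   // leading bit 0:   8-bit network ID
        if (first < 192) return 'B';   // leading bits 10: 16-bit network ID
        if (first < 224) return 'C';   // leading bits 110: 24-bit network ID
        return 'D';                    // leading bits 1110
    }

    public static void main(String[] args) {
        System.out.println(classOf("10.0.0.1"));     // prints "A"
        System.out.println(classOf("192.168.1.1"));  // prints "C"
    }
}
```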
Subnet address
Internally, the UNIX network is divided into subnetworks. Building 11 is currently on one subnetwork and uses 10-bit addressing, allowing 1024 different hosts.

Host address
8 bits are finally used for host addresses within our subnet. This places a
limit of 256 machines that can be on the subnet.
Total address

FIGURE 6 - IP ADDRESSING
The 32-bit address is usually written as 4 integers separated by dots.
Port addresses
A service exists on a host and is identified by its port. This is a 16-bit number. To send a message to a server, you send it to the port for that service on the host that it is running on. This is not location transparency! Certain of these ports are "well known".
Sockets
A socket is a data structure maintained by the system to handle network
connections. A socket is created using the call socket. It returns an integer that is
like a file descriptor. In fact, under Windows, this handle can be used with Read
File and Write File functions.
#include <sys/types.h>
#include <sys/socket.h>

int socket(int family, int type, int protocol);


Here "family" will be AF_INET for IP communications, protocol will be zero, and
type will depend on whether TCP or UDP is used. Two processes wishing to
communicate over a network create a socket each. These are similar to two ends of
a pipe - but the actual pipe does not yet exist.
Create a server socket that listens for a client to connect
socket(int af, int type, int protocol)
This method creates the socket
bind(SOCKET s, const struct sockaddr FAR * name, int namelen)
Associates a local address with a socket. This routine is used on an unconnected
datagram or stream socket, before subsequent connects or listens. When a socket is
created with socket, it exists in a name space (address family), but it has no name
assigned. bind establishes the local association (host address/port number) of the
socket by assigning a local name to an unnamed socket. In the Internet address
family, a name consists of several components. For SOCK_DGRAM and
SOCK_STREAM, the name consists of three parts: a host address, the protocol
number (set implicitly to UDP or TCP, respectively), and a port number which
identifies the application. If an application does not care what address is assigned
to it, it may specify an Internet address equal to INADDR_ANY, a port equal to 0,
or both. If the Internet address is equal to INADDR_ANY, any appropriate
network interface will be used; this simplifies application programming in the
presence of multi-homed hosts. If the port is specified as 0, the Windows Sockets
implementation will assign a unique port to the application with a value between
1024 and 5000. The application may use getsockname after bind to learn the

address that has been assigned to it, but note that getsockname will not necessarily
fill in the Internet address until the socket is connected, since several Internet
addresses may be valid if the host is multi-homed. If no error occurs, bind returns
0. Otherwise, it returns SOCKET_ERROR, and a specific error code may be
retrieved by calling WSAGetLastError.
listen(SOCKET s, int backlog)
Establishes a socket to listen for an incoming connection. To accept connections, a socket is first created with socket, a backlog for incoming connections is specified with listen, and then the connections are accepted with accept. listen applies only to sockets that support connections, i.e. those of type SOCK_STREAM. The socket s is put into "passive" mode, where incoming connections are acknowledged and queued pending acceptance by the process. This function is typically used by servers that could have more than one connection request at a time: if a connection request arrives with the queue full, the client will receive an error with an indication of WSAECONNREFUSED. listen attempts to continue to function rationally when there are no available descriptors. It will accept connections until the queue is emptied. If descriptors become available, a later call to listen or accept will re-fill the queue to the current or most recent "backlog", if possible, and resume listening for incoming connections.
accept (SOCKET s, struct sockaddr FAR * addr, int FAR * addrlen)
This routine extracts the first connection on the queue of pending connections on s,
creates a new socket with the same properties as s and returns a handle to the new
socket. If no pending connections are present on the queue, and the socket is not
marked as non-blocking, accept blocks the caller until a connection is present. If
the socket is marked non-blocking and no pending connections are present on the

queue, accept returns an error as described below. The accepted socket may not be
used to accept more connections. The original socket remains open. The argument
addr is a result parameter that is filled in with the address of the connecting entity,
as known to the communications layer. The exact format of the addr parameter is
determined by the address family in which the communication is occurring.
The addrlen is a value-result parameter; it should initially contain the amount of
space pointed to by addr; on return it will contain the actual length (in bytes) of the
address returned. This call is used with connection-based socket types such as
SOCK_STREAM. If addr and/or addrlen are equal to NULL, then no information
about the remote address of the accepted socket is returned.
closesocket(SOCKET s)
Closes a socket.
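In Java, the socket/bind/listen sequence described above collapses into a single java.net.ServerSocket constructor. This minimal sketch binds to port 0, which, like the port-0 behaviour described for bind, asks the system to assign any free port (the class and method names are illustrative):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class ListenDemo {
    // Creates a listening TCP socket, reports the assigned port, and closes it.
    static int bindAnyPort() {
        // new ServerSocket(0) performs socket creation, bind, and listen in one call;
        // port 0 lets the OS pick a free port, as with bind's port-0 convention.
        try (ServerSocket server = new ServerSocket(0)) {
            return server.getLocalPort();   // the port the OS actually assigned
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("listening port was " + bindAnyPort());
    }
}
```

A real server would then call server.accept() in a loop, which plays the role of the accept function documented above.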
Making client connection with server
In order to create a socket that connects to another socket, you use most of the functions from the previous code, with the exception of a struct called HOSTENT.
HOSTENT:
This struct is used to tell the socket which computer and port to connect to. The struct can appear as LPHOSTENT, which is simply a pointer to a HOSTENT.
Client key function
Most of the functions used for the client to connect to the server are the same as for the server, with the exception of a few. The following covers the functions that differ on the client side.

gethostbyname(const char* FAR name)


gethostbyname returns a pointer to a hostent structure as described under
gethostbyaddr. The contents of this structure correspond to the hostname name.
The pointer which is returned points to a structure which is allocated by the
Windows Sockets implementation. The application must never attempt to modify
this structure or to free any of its components. Furthermore, only one copy of this
structure is allocated per thread, and so the application should copy any
information which it needs before issuing any other Windows Sockets API calls. A
gethostbyname implementation must not resolve IP address strings passed to it.
Such a request should be treated exactly as if an unknown host name were passed.
An application with an IP address string to resolve should use inet_addr to convert
the string to an IP address, then gethostbyaddr to obtain the hostent structure.
Part 2 - Send / receive
Up to this point we have managed to connect our client to the server. Clearly this is not going to be enough in a real-life application. In this section we are going to look in more detail at how to use the send/recv functions in order to get some communication going between the two applications.
This is actually not going to be difficult, because most of the hard work has been done setting up the server and the client app. Before going into the code, we will look at the two functions in more detail.
send(SOCKET s, const char FAR * buf, int len, int flags)
send is used on connected datagram or stream sockets and is used to write outgoing data on a socket. For datagram sockets, care must be taken not to exceed the maximum IP packet size of the underlying subnets, which is given by the iMaxUdpDg element in the WSAData structure returned by WSAStartup.

If the data is too long to pass atomically through the underlying protocol the error
WSAEMSGSIZE is returned, and no data is transmitted.
recv(SOCKET s, char FAR * buf, int len, int flags)
For sockets of type SOCK_STREAM, as much information as is currently
available up to the size of the buffer supplied is returned. If the socket has been
configured for in- line reception of out-of-band data (socket option
SO_OOBINLINE) and out-of-band data is unread, only out-of-band data will be
returned. The application may use the ioctlsocket SIOCATMARK to determine
whether any more out-of-band data remains to be read.
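A send/receive exchange between two applications can be sketched in Java with a throwaway echo server on the loopback interface. The EchoDemo name and the one-line protocol are made up for illustration; the server thread reads one line and writes it straight back:

```java
import java.io.*;
import java.net.*;

public class EchoDemo {
    // Sends one line to a throwaway echo server and returns the reply.
    static String echoOnce(String message) {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread t = new Thread(() -> {
                try (Socket s = server.accept()) {
                    BufferedReader r = new BufferedReader(
                        new InputStreamReader(s.getInputStream()));
                    String line = r.readLine();                          // recv
                    s.getOutputStream().write((line + "\n").getBytes()); // send back
                } catch (IOException ignored) { }
            });
            t.start();
            try (Socket client = new Socket("127.0.0.1", server.getLocalPort())) {
                client.getOutputStream().write((message + "\n").getBytes()); // send
                return new BufferedReader(new InputStreamReader(
                    client.getInputStream())).readLine();                    // recv
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("hello")); // prints "hello"
    }
}
```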
Part 3 - Read unknown size of data from client
As mentioned in part 2, we are going to expand on the way that we receive data. The problem we had before is that if we did not know the size of the data we were expecting, we would end up with problems.
In order to fix this, here we create a new function that receives a pointer to the client socket, and then reads one char at a time, placing each char into a vector until we find the '\n' character that signifies the end of the message.
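The char-at-a-time loop can be sketched in Java as follows (the LineReader name is illustrative; the same loop works on a socket's InputStream, though the test below uses an in-memory stream):

```java
import java.io.*;

public class LineReader {
    // Reads bytes one at a time until '\n', mirroring the char-at-a-time loop above.
    static String readLine(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = in.read()) != -1 && c != '\n') {
                sb.append((char) c);   // accumulate until the terminator
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream("hello\nworld".getBytes());
        System.out.println(readLine(in)); // prints "hello"
    }
}
```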

TABLE DESIGN
Tables are logical structures maintained by the database manager.
Tables are made up of columns and rows.
At the intersection of every column and row is a specific data item
called a value. A column is a set of values of the same type or one of
its subtypes. A row is a sequence of values arranged so that the nth
value is a value of the nth column of the table.

An application program can determine the order in which the rows are populated into the table, but the actual order of rows is determined by the database manager and typically cannot be controlled. Multidimensional clustering (MDC) provides some sense of clustering, but not actual ordering between the rows.
When designing tables, you need to be familiar with certain concepts,
determine the space requirements for tables and user data, and
determine whether you will take advantage of certain features, such as
space compression and optimistic locking.

TESTING

Software Testing is the process of executing a program or system with the intent of finding errors. Software testing is any activity aimed at evaluating an attribute or capability of a program or system and determining that it meets its required results.

Types of Testing Done


The development process involves various types of testing. Each test type
addresses a specific testing requirement. The most common types of testing
involved in the development process are:
Unit Test
System Test
Functional Test
Integration Test
Unit Testing
The first test in the development process is the unit test. The source code is normally divided into modules, which in turn are divided into smaller pieces called units. These units have specific behavior. The test done on these units of code is called a unit test. Unit testing depends upon the language in which the project is developed. Unit tests ensure that each unique path of the project performs accurately to the documented specifications and has clearly defined inputs and expected results. Unit testing is functional and reliability testing in an engineering environment: producing tests for the behavior of components (nodes and vertices) of a product to ensure their correct behavior prior to system integration.
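A unit test in its simplest form can be written with plain Java, without a test framework. The add method below is a made-up unit under test; the test gives it a defined input, checks the expected result, and fails loudly on a mismatch:

```java
public class UnitTestSketch {
    // The unit under test: a deliberately tiny, hypothetical method.
    static int add(int a, int b) { return a + b; }

    // A unit test: defined input, expected result, explicit failure on mismatch.
    static void testAdd() {
        int result = add(2, 3);
        if (result != 5) {
            throw new AssertionError("add(2, 3) returned " + result + ", expected 5");
        }
    }

    public static void main(String[] args) {
        testAdd();
        System.out.println("all unit tests passed"); // prints only if no test failed
    }
}
```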
System Testing

Several modules constitute a project. If the project is a long-term project, several developers write the modules. Once all the modules are integrated, several errors may arise. The testing done at this stage is called a system test. System testing ensures that the entire integrated software system meets requirements. It tests a configuration to ensure known and predictable results. System testing is based on process descriptions and flows, emphasizing pre-driven process links and integration points. It also covers testing a specific hardware/software installation; this is typically performed on a COTS (commercial off the shelf) system or any other system comprised of disparate parts where custom configurations and/or unique installations are the norm.

Functional Testing

Functional test can be defined as testing two or more modules together with
the intent of finding defects, demonstrating that defects are not present,
verifying that the module performs its intended functions as stated in the
specification and establishing confidence that a program does what it is
supposed to do.

Integration Testing
Integration testing is an additional step that is used when different sub-systems are being developed simultaneously by independent developers. It verifies that the parameters passed between sub-systems are being handled correctly. It is testing in which modules are combined and tested as a group; modules are typically code modules, individual applications, source and destination applications on a network, etc. Integration testing follows unit testing and precedes system testing. Beta testing comes after the product is code complete: betas are often widely distributed, or even distributed to the public at large, in the hope that users will buy the final product when it is released.

DATA FLOW DIAGRAM


A data flow diagram is one of the structured analysis tools. It is a way of expressing system requirements in graphical form. A data flow diagram clarifies system requirements and identifies the major transformations that happen in system design.

Symbols in Data Flow Diagram

A parallelogram defines the input given to the system.

A square defines a source or destination of system data.

An arrow identifies data flow in motion; it is a pipeline through which data flows in sequence.

A circle represents a process that transforms incoming data flow into outgoing data flow.

A rectangle is a data store, a permanent or temporary repository of data.

Level 1: User, LOGIN, Cloud Server, Download.

Level 2: Client, Request Send, Cloud Server, View Upload Files, Enter Authentication Code.

Level 3: CSP, File Name, Cloud Server, Auditing.

Level 4: Cloud Server, Person/Attacker, Join Group, Encryption, Decryption, Key.
