
International Journal of Computer Trends and Technology (IJCTT) - volume 4, Issue 5, May 2013

ISSN: 2231-2803 http://www.ijcttjournal.org Page 1496



Secured Search Data Preservation Using Ascent Plunge Method
K. Selvasheela, AP/CSE #1, N. P. Rajeswari, AP/CSE *2


ANNA UNIVERSITY CHENNAI, VEERAMMAL ENGINEERING COLLEGE, Dindigul (DT), Tamilnadu, INDIA
ANNA UNIVERSITY CHENNAI, VEERAMMAL ENGINEERING COLLEGE, Dindigul (DT), Tamilnadu, INDIA


1 K.Selvesheela.author@sheelaarunme@yahoo.co.in
2 N.P.Rajeswari.author@greatnpr@gmail.com
ABSTRACT:
The increasing ability to track and collect large amounts of data with current hardware and software technology has led to immense challenges and a consequent interest in developing data mining algorithms that preserve user security and privacy in large distributed systems. Secure data aggregation with a privacy-preserving feature is a demanding task. Privacy preservation is becoming a necessity for data generated for individual as well as organizational purposes. In this paper, we develop a scheme for secure multiparty data aggregation based on modular arithmetic. Specifically, we consider a scenario in which two or more parties owning confidential data need to share it with a third party for aggregation only, without revealing any unnecessary information. More generally, the server or aggregator must perform the aggregation without learning the content of the individual data. Our work is motivated by the need to protect both privileged information and confidentiality.
I. INTRODUCTION:
Data mining, the extraction of hidden predictive
information from large databases, is a powerful new
technology with great potential to help companies
focus on the most important information in their
data warehouses. Data mining tools predict future
trends and behaviors, allowing businesses to make
proactive, knowledge-driven decisions. Generally,
data mining (sometimes called data or knowledge
discovery) is the process of analyzing data from
different perspectives and summarizing it into
useful information - information that can be used to
increase revenue, cut costs, or both. Data mining
software is one of a number of analytical tools for
analyzing data. It allows users to analyze data from
many different dimensions or angles, categorize it,
and summarize the relationships identified.
Technically, data mining is the process of finding
correlations or patterns among dozens of fields in
large relational databases. Our work is motivated by
the need to both protect privileged information and
enable its use for research or other purposes.
However, data mining algorithms are typically
complex and, furthermore, the input usually
consists of massive data sets.
SCOPE OF THE PROJECT
Ascent plunge aims to minimize a target function in order to reach a local minimum. Here we propose a preliminary formulation of ascent plunge with data privacy preservation. The linear regression method is used to securely perform the ascent plunge method over vertically partitioned data. For multiple parties, secure set intersection cardinality has been proposed, which is defined as finding customers' necessary details without
accessing their private data. Data are usually assumed to be horizontally or vertically partitioned so that no single party holds the overall data. In horizontally partitioned data the parties hold the same attributes for different objects, while in vertically partitioned data the parties hold different attributes for the same set of objects. For horizontally partitioned data, the approach used is linear regression, while for vertically partitioned data the approach used is the least square approach.
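As a rough illustration of this section, the sketch below assumes that "ascent plunge" denotes what is commonly called gradient descent. It fits a one-variable linear model by repeatedly stepping against the gradient of the mean squared error. The data are split horizontally between two hypothetical parties, and each party contributes only the sums of its local gradient terms rather than its raw rows. The function names, learning rate and sample data are assumptions made for illustration, and exchanging partial sums in the clear is not by itself a secure protocol.

```python
def local_gradient(rows, w, b):
    """Return one party's gradient sums for squared error of y = w*x + b."""
    grad_w = sum(2 * (w * x + b - y) * x for x, y in rows)
    grad_b = sum(2 * (w * x + b - y) for x, y in rows)
    return grad_w, grad_b, len(rows)


def fit(parties, rate=0.05, steps=2000):
    """Gradient descent where each party only reports local gradient sums."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        parts = [local_gradient(rows, w, b) for rows in parties]
        total_rows = sum(p[2] for p in parts)
        # Combine the partial sums into the gradient of the mean squared error.
        w -= rate * sum(p[0] for p in parts) / total_rows
        b -= rate * sum(p[1] for p in parts) / total_rows
    return w, b


# Hypothetical horizontally partitioned data: same attributes, different objects.
party_a = [(1.0, 3.0), (2.0, 5.0)]
party_b = [(3.0, 7.0), (4.0, 9.0)]
w, b = fit([party_a, party_b])
print(round(w, 4), round(b, 4))
```

The sample points lie on the line y = 2x + 1, so the fitted slope and intercept should approach 2 and 1.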
PROBLEM DEFINITION:
Data mining is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. Knowledge discovery is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, it is the process of finding correlations or patterns among dozens of fields in large relational databases. Our work is motivated by the need to both protect privileged information and enable its use for research or other purposes. However, data mining algorithms are typically complex and, furthermore, the input usually consists of massive data sets. This work gives a preliminary formulation of ascent plunge with data seclusion preservation. We consider a scenario in which two parties owning confidential databases wish to run a data mining algorithm on the union of their databases without revealing any unnecessary information.
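The abstract names modular arithmetic as the basis of the aggregation scheme, but the paper does not spell out the protocol. One common construction that fits the description is additive secret sharing modulo a public number, sketched below as an assumption rather than as the authors' method. Each party splits its private value into random shares that sum to the value modulo `MODULUS`, so the aggregator sees only partial sums. The names `MODULUS`, `make_shares` and `secure_sum` are hypothetical.

```python
import secrets

# Assumed public modulus; it must exceed the largest possible total.
MODULUS = 2**61 - 1


def make_shares(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Simulate the exchange: party i hands its j-th share to party j."""
    n = len(private_values)
    dealt = [make_shares(v, n) for v in private_values]
    # Each party adds the shares it received and publishes only that partial sum.
    partials = [sum(dealt[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # The aggregator combines the partial sums without seeing any single input.
    return sum(partials) % MODULUS


print(secure_sum([120, 45, 300]))
```

Any single share is uniformly random, so a share on its own reveals nothing about the value it came from; only the combined total is meaningful.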
II. PROPOSED SYSTEM ARCHITECTURE:
In this system, a linear regression algorithm is used to retrieve a specific attribute of the dataset. The mined attribute is held securely and can be accessed only with the key, which is generated by the DSA algorithm. The ascent plunge method aims to minimize a target function in order to reach a minimum. We propose two approaches, the stochastic approach and the least square approach, under different assumptions. Four protocols are proposed for the two approaches; they serve as secure building blocks for both horizontally partitioned data and vertically partitioned data. These protocols allow us to determine a secure protocol for a given application.
III. MODULE DESCRIPTION:
The project mainly focuses on four modules, which are closely inter-related. The modules are described below:
1. Member Muster
2. Setting Entrance Authorization
3. Providing Sanctuary
4. Viewing Statistics
1. MEMBER MUSTER:
In the first phase, member registration is carried out: the member provides his personal details, and those details are stored in the database. Registration is a powerful way to manage user access and the user's data. The registration process is carried out using a validator. The validator mainly verifies the user's details and input values.
2. SETTING ENTRANCE AUTHORIZATION:
In this phase, the user's data may be accessed by a third party that has access permission. The admin sets the field that may be
accessed by the third party. He sets the permission that lets the third party, who may be his client, access that field of the user's data. The admin also keeps track of the user's activity in order to avoid any malfunction. Authorization is mainly carried out by entering the user id and password.
3. PROVIDING SANCTUARY:
To provide secured access to the data, the admin sets a key for each and every user. The key serves as the secondary field needed to access the fields of that particular user. The third party can access a particular field of the specified user only after providing the key generated by the admin. The key of the particular user is sent to his client. The key plays the major role in our concept, since it prevents anyone other than the third party from accessing the data. The key is transferred to the client through some other personal medium for future access. This prevents the data from being accessed without the prior knowledge of the user and the admin.
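The access rule described in modules 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the class `FieldVault` and its methods are hypothetical, the key is a random token rather than a DSA-derived value, and records live in memory instead of a database.

```python
import hmac
import secrets


class FieldVault:
    """Hypothetical admin-side store: one key per user, permissions per field."""

    def __init__(self):
        self._records = {}
        self._keys = {}
        self._allowed = {}

    def register(self, user_id, record):
        """Store a user's record and return the key the admin sends to the client."""
        self._records[user_id] = dict(record)
        self._allowed[user_id] = set()
        self._keys[user_id] = secrets.token_hex(16)
        return self._keys[user_id]

    def allow(self, user_id, field):
        """Admin grants third-party access to one field of one user."""
        self._allowed[user_id].add(field)

    def read(self, user_id, field, key):
        """Return the field only if the key matches and the field is shared."""
        expected = self._keys.get(user_id)
        if expected is None or not hmac.compare_digest(
            expected.encode(), key.encode()
        ):
            raise PermissionError("invalid key")
        if field not in self._allowed[user_id]:
            raise PermissionError("field not shared")
        return self._records[user_id][field]


vault = FieldVault()
client_key = vault.register("u1", {"age": 30, "salary": 5000})
vault.allow("u1", "age")
print(vault.read("u1", "age", client_key))
```

Comparing keys with `hmac.compare_digest` avoids leaking how many leading characters of a guessed key were correct.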
4. VIEWING STATISTICS:
The third party accesses the data of the user by entering the key sent by the admin. The data can be accessed under either of two partitioning methods: horizontal partitioning and vertical partitioning. In horizontal partitioning the attributes are the same and the objects differ, whereas in vertical partitioning the attributes differ and the objects are the same. The method used to access the data is the linear regression method. Models that depend linearly on their unknown parameters are easier to fit than models that are non-linearly related to their parameters, and the statistical properties of the resulting estimators are easier to determine.
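The two partitioning schemes can be made concrete with a small table. The sketch below uses hypothetical customer records and helper names; it only shows how the same data would be divided between two parties under each scheme.

```python
# Hypothetical full table that no single party is supposed to hold.
customers = [
    {"id": 1, "age": 34, "income": 52000},
    {"id": 2, "age": 41, "income": 61000},
    {"id": 3, "age": 29, "income": 48000},
    {"id": 4, "age": 55, "income": 75000},
]


def horizontal_split(rows, cut):
    """Same attributes, different objects: each party gets whole rows."""
    return rows[:cut], rows[cut:]


def vertical_split(rows, key, first_fields):
    """Same objects, different attributes: both parties keep the join key."""
    first = [{key: r[key], **{f: r[f] for f in first_fields}} for r in rows]
    second = [
        {f: v for f, v in r.items() if f == key or f not in first_fields}
        for r in rows
    ]
    return first, second


h_a, h_b = horizontal_split(customers, 2)
v_a, v_b = vertical_split(customers, "id", ["age"])
print(h_a[0], h_b[0])
print(v_a[0], v_b[0])
```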
IV. METHODS AND ALGORITHMS:
DIGITAL SIGNATURE ALGORITHM:
A digital signature is a form of cryptography, the practice of keeping communications private. Cryptography converts messages or data into a different form, such that no one can read them without having access to the key. The message may be converted using a code or cipher. It deals with encryption, decryption and authentication. A digital signature is represented in a computer as a string of binary digits. A digital signature is computed using a set of parameters and authenticates the integrity of the signed data and the identity of the signatory. An algorithm provides the capability to generate and verify signatures. Signature generation makes use of a private key to generate a digital signature. Signature verification makes use of a public key, which corresponds to, but is not the same as, the private key. Each user possesses a private and public key pair. Public keys are assumed to be known to the public in general. Private keys are never shared. Anyone can verify the signature of a user by employing that user's public key. Only the possessor of the user's private key can perform signature generation.
Digital signature use:
As organizations move away from paper documents with ink signatures or authenticity stamps, digital signatures can provide added assurance of the provenance, identity, and status of an electronic document, as well as evidence of its approval by a signatory. A digital signature scheme typically consists of three algorithms:
1. A key generation algorithm that selects a private
key uniformly at random from a set of possible
private keys. The algorithm outputs the private key
and a corresponding public key.
2. A signing algorithm that, given a message and a private key, produces a signature.
3. A signature verifying algorithm that, given a message, public
key and a signature, either accepts or rejects the
message's claim to authenticity.
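The three algorithms listed above can be sketched with a toy version of DSA. The domain parameters `P`, `Q` and `G` below are tiny values chosen for illustration and offer no security; a real deployment would use standardized parameter sizes and a vetted cryptographic library. The function names are assumptions, and the modular inverse via `pow(k, -1, Q)` needs Python 3.8 or later.

```python
import hashlib
import secrets

# Toy domain parameters (assumed for illustration; far too small for real use).
P = 607                       # prime modulus
Q = 101                       # prime divisor of P - 1 (606 = 6 * 101)
G = pow(2, (P - 1) // Q, P)   # element of order Q modulo P


def digest_int(message):
    """Hash the message with SHA-256 and read the digest as an integer."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def generate_keys():
    """Key generation: random private key x, public key y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def sign(message, x):
    """Signing: produce (r, s) from the message digest and the private key."""
    h = digest_int(message)
    while True:
        k = secrets.randbelow(Q - 1) + 1          # fresh per-signature secret
        r = pow(G, k, P) % Q
        s = (pow(k, -1, Q) * (h + x * r)) % Q
        if r != 0 and s != 0:
            return r, s


def verify(message, signature, y):
    """Verification: accept only if the recomputed value equals r."""
    r, s = signature
    if not (0 < r < Q and 0 < s < Q):
        return False
    w = pow(s, -1, Q)
    u1 = (digest_int(message) * w) % Q
    u2 = (r * w) % Q
    v = (pow(G, u1, P) * pow(y, u2, P)) % P % Q
    return v == r


private_key, public_key = generate_keys()
signature = sign(b"transfer record 42", private_key)
print(verify(b"transfer record 42", signature, public_key))
```

Because `Q` is so small here, a forged or altered message can verify by chance; that weakness comes from the toy parameters, not from the scheme.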


Figure 1. Method of Digital Signature Creation
There are two types of cryptography:
1. Secret key or Symmetric Cryptography
2. Public key or Asymmetric Cryptography
In symmetric cryptography the sender and receiver of a message know and use the same secret key: the sender uses it to encrypt the message, and the receiver uses the same key to decrypt it. Asymmetric (or public key) cryptography involves two related keys, one of which only the owner knows (the 'private key') and the other which anyone can know (the 'public key').




Figure 2. Flow of Digital Signature

SHA (SECURE HASH ALGORITHM):
The National Software Reference Library (NSRL)
Reference Data Set (RDS) is built on file signature
generation technology that is used primarily in
cryptography.
SHA-0: It was withdrawn shortly after
publication due to an undisclosed "significant
flaw" and replaced by the slightly revised version
SHA-1.
SHA-1: A 160-bit hash function which
resembles the earlier MD5 algorithm. This was
designed by the National Security Agency (NSA)
to be part of the Digital Signature Algorithm.
SHA-2: A family of two similar hash functions, with different block sizes, known as SHA-256 and SHA-512. They differ in the word size; SHA-256 uses 32-bit words and produces a 256-bit digest, whereas SHA-512 uses 64-bit words and produces a 512-bit digest.
A hash function is used in the signature generation
process to obtain a condensed version of data,
called a message digest (figure 1). The message
digest is then input to the digital signature algorithm
to generate the digital signature. The digital
signature is sent to the intended verifier along with
the message. The verifier of the message and signature
verifies the signature by using the sender's public
key.
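The condensing step can be seen directly with Python's standard `hashlib` module. The helper name `message_digest` is an assumption; the sketch only shows that each algorithm maps a message of any length to a fixed-length digest, which is what is then passed to the signature algorithm.

```python
import hashlib


def message_digest(message, algorithm="sha256"):
    """Return the hexadecimal message digest for the chosen hash algorithm."""
    return hashlib.new(algorithm, message).hexdigest()


message = b"Quarterly statistics for user 17"
for name in ("sha1", "sha256", "sha512"):
    digest = message_digest(message, name)
    # Each hexadecimal character encodes 4 bits of the digest.
    print(name, len(digest) * 4, "bits")
```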
Use of the SHA algorithm:
1. Enforce some reasonable minimum password requirements.
2. Change passwords frequently.
3. Use the strongest hash you can get - SHA-256 was suggested here.
4. Combine the password with a fixed salt (same for your whole database).
5. Combine the result of the previous step with a unique salt that is stored and attached to this record.
6. Run the hash algorithm multiple times - like 1000+ times. Ideally include a different salt each time with the previous hash. Speed is your enemy and multiple iterations reduce the
speed.
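Steps 3, 5 and 6 above can be sketched with the standard library. Instead of a hand-written hashing loop, this sketch swaps in PBKDF2-HMAC-SHA256 through `hashlib.pbkdf2_hmac`, which applies a salted hash repeatedly. The iteration count and function names are assumptions, and the database-wide fixed salt of step 4 is left out.

```python
import hashlib
import hmac
import os

# Assumed work factor; the list above asks for 1000 or more iterations.
ITERATIONS = 100_000


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Derive a slow, salted hash; return the salt so it can be stored."""
    if salt is None:
        salt = os.urandom(16)   # unique salt stored with this record
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def check_password(password, salt, expected, iterations=ITERATIONS):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


stored_salt, stored_hash = hash_password("correct horse")
print(check_password("correct horse", stored_salt, stored_hash))
print(check_password("wrong guess", stored_salt, stored_hash))
```

Only the salt and the derived hash are stored, so the plain-text password never needs to be kept or displayed.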

Figure 3. Process of Secure Hash Algorithm

Limitations of SHA algorithm:
1. Never store a plain text password (which means
you can never display or transmit it either.)
2. Never transmit the stored representation of a
password over an unsecured line (either plain text,
encoded or hashed).
3. Speed is your enemy.
4. Regularly reanalyze and improve your process as hardware and cryptanalysis improve.
5. Cryptography and process are a very small part of the solution.
STOCHASTIC APPROACH:
The Stochastic Approach for Link-Structure Analysis is based upon the theory of Markov chains and relies on the stochastic properties of random walks performed on our collection of pages. It follows the Meta algorithm.
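The Markov-chain idea behind this approach can be illustrated with a plain random walk over a small link graph. The sketch below is not the full link-structure algorithm named above; it only computes the long-run fraction of time a random surfer spends on each page by repeatedly applying the transition rule. The graph and the function name are hypothetical.

```python
def stationary_distribution(links, steps=200):
    """Approximate the long-run visit frequencies of a random walk on links."""
    pages = sorted(links)
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(steps):
        nxt = {page: 0.0 for page in pages}
        for page, outgoing in links.items():
            # The walker leaves a page along each outgoing link with equal chance.
            share = rank[page] / len(outgoing)
            for target in outgoing:
                nxt[target] += share
        rank = nxt
    return rank


# Hypothetical collection of three pages and their outgoing links.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
rank = stationary_distribution(links)
print({page: round(value, 3) for page, value in rank.items()})
```

For this graph the walk settles at 0.4 for pages A and C and 0.2 for page B.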
LINEAR REGRESSION METHOD:
Linear regression is a method of
organizing data. Sometimes it is appropriate to
show data as points on a graph, and then try to draw
a straight line through the data. Linear regression is
an algorithm for drawing such a line. Linear
regression typically uses the least squares method to
determine which line best fits the data. R-Squared
is a measure of how well the data points match the
resulting line. Linear regression attempts to model
the relationship between two variables by fitting a
linear equation to observed data. One variable is
considered to be an explanatory variable, and the
other is considered to be a dependent variable. For
example, a modeler might want to relate the weights
of individuals to their heights using a linear
regression model.
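The height and weight example can be worked through with the closed-form least squares fit for a single explanatory variable, together with R-squared. The sample measurements and function names below are made up for illustration.

```python
def fit_line(xs, ys):
    """Return slope and intercept of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys that the fitted line explains."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical heights (cm, explanatory) and weights (kg, dependent).
heights = [150, 160, 170, 180]
weights = [50, 58, 66, 74]
slope, intercept = fit_line(heights, weights)
print(round(slope, 3), round(intercept, 3))
print(round(r_squared(heights, weights, slope, intercept), 3))
```

These sample points lie exactly on a line, so the slope is 0.8 kg per cm, the intercept is -70 kg, and R-squared is 1; real measurements would scatter around the line and give a smaller R-squared.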
LEAST SQUARE APPROACH:
A "least square" is determined by squaring the distance between a data point and the regression line. The least squares approach minimizes the distance between a function and the data points that the function is trying to explain. It is used in regression analysis, often in nonlinear regression modeling in which a curve is fit to a set of data. A key attribute is the unique, distinguishing characteristic of the entity.
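For the straight-line case, the least squares criterion described in this section can be written out explicitly. The symbols a (slope), b (intercept) and the data points (x_i, y_i) are notation introduced here for illustration; setting both partial derivatives of the sum of squared distances to zero gives the familiar closed form.

```latex
\begin{aligned}
S(a, b) &= \sum_{i=1}^{n} \bigl(y_i - (a x_i + b)\bigr)^2, \\
\frac{\partial S}{\partial a} &= -2 \sum_{i=1}^{n} x_i \bigl(y_i - a x_i - b\bigr) = 0, \qquad
\frac{\partial S}{\partial b} = -2 \sum_{i=1}^{n} \bigl(y_i - a x_i - b\bigr) = 0, \\
a &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
b = \bar{y} - a\,\bar{x}.
\end{aligned}
```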
CONCLUSION:
The project fulfills the entire information requirement of the system and was developed with the users' requirements and satisfaction in view. The proposed Metric Preserving Transformation stores relative information at the server with respect to an object. The system resulted in regular and timely preparation of the required outputs. We have laid down the foundations for further research in the area of Privacy-Preserving Data Mining (PPDM). Although the work described in this paper is preliminary and conceptual in nature, it is a vital prerequisite for the development and deployment of such techniques. We showed that the protocols are correct and privacy preserving.
