Secured Search Data Preservation Using Ascent Plunge Method
ANNA UNIVERSITY CHENNAI, VEERAMMAL ENGINEERING COLLEGE, Dindigul (DT), Tamilnadu, INDIA
1 K. Selvesheela, author, sheelaarunme@yahoo.co.in
2 N. P. Rajeswari, author, greatnpr@gmail.com

ABSTRACT:
The increasing ability to track and collect large amounts of data with current hardware and software technology has led to immense challenges, and consequent interest, in the development of data mining algorithms that preserve user security and privacy in a large distributed system. Secure data aggregation with a privacy-preserving feature is a demanding task. Privacy preservation is becoming a necessity for data generated for individual as well as organizational purposes. In this paper, we develop a scheme for secure multiparty data aggregation based on modular arithmetic. Specifically, we consider a scenario in which two or more parties owning confidential data need to share it with a third party for aggregation only, without revealing any unnecessary information. More generally, the server or aggregator must perform the aggregation without acquiring the content of the individual data. Our work is motivated by the need to protect both privileged information and confidentiality.

I. INTRODUCTION:
Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information, that is, information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified.
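To make the aggregation scenario of the abstract concrete, the following is a minimal sketch, assuming additive secret sharing as the modular arithmetic technique. The paper does not spell out the exact scheme, so the modulus and the names `MODULUS`, `make_shares`, and `secure_sum` are illustrative assumptions rather than the authors' protocol. Each party splits its private value into random shares that sum to the value modulo a public modulus, so the aggregator only ever sees partial sums.

```python
import secrets

# Assumed public modulus; it must exceed the largest possible total.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split one private value into n_parties random shares summing to it (mod modulus)."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares add up to the value.
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_sum(private_values, modulus=MODULUS):
    """Simulate the exchange and return the total that the aggregator learns."""
    n = len(private_values)
    # Party i sends its j-th share to party j.
    shares = [make_shares(v, n, modulus) for v in private_values]
    # Party j adds the shares it received and forwards only that partial sum.
    partial_sums = [sum(shares[i][j] for i in range(n)) % modulus for j in range(n)]
    # The aggregator combines the partial sums; it never sees a private value.
    return sum(partial_sums) % modulus


print(secure_sum([12, 30, 7]))  # 49
```

Any single share, or any single partial sum, is uniformly distributed modulo `MODULUS`, which is why it reveals nothing about an individual input on its own. The sketch simulates all parties in one process and does not address colluding parties or network transport.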
Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Our work is motivated by the need both to protect privileged information and to enable its use for research or other purposes. However, data mining algorithms are typically complex and, furthermore, the input usually consists of massive data sets.

SCOPE OF THE PROJECT
Ascent plunge aims to minimize a target function in order to reach a local minimum. Here we propose a preliminary formulation of ascent plunge with data privacy preservation. The linear regression method is used to securely perform the ascent plunge method over vertically partitioned data. For multiple parties, secure set intersection cardinality has been proposed, which is defined as finding the necessary details of customers without accessing their private data. Data are usually assumed to be horizontally or vertically partitioned so that no single party holds the overall data. In horizontally partitioned data the parties hold the same attributes for different objects, while in vertically partitioned data the parties hold different attributes for the same set of objects. For horizontally partitioned data the approach used is linear regression, while for vertically partitioned data the approach used is the least square approach.

International Journal of Computer Trends and Technology (IJCTT) - Volume 4, Issue 5, May 2013

PROBLEM DEFINITION:
Data mining is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. Knowledge discovery is the process of analyzing data from different perspectives and summarizing it into useful information, that is, information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, it is the process of finding correlations or patterns among dozens of fields in large relational databases. Our work is motivated by the need both to protect privileged information and to enable its use for research or other purposes. However, data mining algorithms are typically complex and, furthermore, the input usually consists of massive data sets. We present a preliminary formulation of ascent plunge with data privacy preservation. We consider a scenario in which two parties owning confidential databases wish to run a data mining algorithm on the union of their databases without revealing any unnecessary information.

II. PROPOSED SYSTEM ARCHITECTURE:
In this system, a specific attribute of the dataset can be retrieved by using the linear regression algorithm.
The mining of an attribute is kept secure, and the attribute can be accessed only with the key, which is generated by the DSA algorithm. The ascent plunge method aims to minimize the target function in order to reach the minimum. We propose two approaches, the stochastic approach and the least square approach, under different assumptions. Four protocols are proposed for the two approaches; they serve as secure building blocks for both horizontally partitioned and vertically partitioned data. These protocols allow us to determine a secure protocol for the applications.

III. MODULE DESCRIPTION:
The project mainly focuses on four modules, which are closely inter-related. The modules are described below.
1. Member Muster
2. Setting Security Authorization
3. Provide Sanctuary
4. Viewing Statistics

1. MEMBER MUSTER:
In the first phase, member registration is carried out: the member provides his personal details, and those details are stored in the database. Registration is a powerful way to manage user access and the user's data. The registration process is carried out by a valuator, which mainly verifies the user's details and input values.

2. SETTING ENTRANCE AUTHORIZATION:
In this phase, the user's data may be accessed by a third party only if that party has access permission. The admin sets the fields that may be accessed by the third party, who may be the user's client, and sets the access permission for those fields. This also serves to maintain the user's activity in order to avoid any malfunction. The authorization process is mainly carried out by entering the user ID and password.

3. PROVIDING SANCTUARY:
In order to provide secured access to the data, the admin sets a key for each user. The key is the secondary field required to access the fields of that particular user. The third party can access a particular field of the specified user only after providing the key generated by the admin. The key of the particular user is sent to his client. The key plays the major role in our concept, since it prevents anyone other than the third party from accessing the data. The key is transferred to the client through some other personal medium for future access. This prevents access to the data without the prior knowledge of the user and the admin.

4. VIEWING STATISTICS:
The third party accesses the user's data by entering the key sent by the admin. The data can be accessed in either of two ways: horizontal partitioning or vertical partitioning. In horizontal partitioning the attributes are the same while the objects differ, whereas in vertical partitioning the attributes differ while the objects are the same. The method used to access the data is linear regression. Models that depend linearly on their unknown parameters are easier to fit than models that are non-linearly related to their parameters, and the statistical properties of the resulting estimators are easier to determine.

IV. METHODS AND ALGORITHMS:
DIGITAL SIGNATURE ALGORITHM:
A digital signature is a form of cryptography, the practice of keeping communications private.
Cryptography converts messages or data into a different form so that no one can read them without access to the key. The message may be converted using a code or cipher. Cryptography deals with encryption, decryption and authentication. A digital signature is represented in a computer as a string of binary digits. It is computed using a set of parameters and authenticates the integrity of the signed data and the identity of the signatory. An algorithm provides the capability to generate and verify signatures. Signature generation makes use of a private key to generate a digital signature. Signature verification makes use of a public key, which corresponds to, but is not the same as, the private key. Each user possesses a private and public key pair. Public keys are assumed to be known to the public in general. Private keys are never shared. Anyone can verify the signature of a user by employing that user's public key. Only the possessor of the user's private key can perform signature generation.

Digital signature use:
As organizations move away from paper documents with ink signatures or authenticity stamps, digital signatures can provide added assurance of the provenance, identity, and status of an electronic document, and of its approval by a signatory.

A digital signature scheme typically consists of three algorithms:
1. A key generation algorithm that selects a private key uniformly at random from a set of possible private keys. The algorithm outputs the private key and a corresponding public key.
2. A signing algorithm that, given a message and a private key, produces a signature.
3. A signature verifying algorithm that, given a message, a public key and a signature, either accepts or rejects the message's claim to authenticity.
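The three algorithms listed above can be sketched as follows. This is a toy illustration of textbook DSA, assuming tiny domain parameters (`P`, `Q`, `G`) chosen only so the arithmetic is easy to follow; they are far too small to be secure, and reducing the SHA-256 digest modulo `Q` is a simplification. The function names are hypothetical and not taken from the paper.

```python
import hashlib
import secrets

# Toy domain parameters (assumed for illustration, NOT secure):
# Q is prime, P is prime with Q dividing P - 1, and G has order Q modulo P.
P, Q, G = 607, 101, 64


def digest(message):
    """Hash the message with SHA-256 and reduce it modulo Q (toy simplification)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % Q


def generate_keys():
    """Key generation: private key uniform in 1..Q-1, public key G^x mod P."""
    private_key = secrets.randbelow(Q - 1) + 1
    public_key = pow(G, private_key, P)
    return private_key, public_key


def sign(message, private_key):
    """Signing: produce the pair (r, s) from the message and the private key."""
    while True:
        k = secrets.randbelow(Q - 1) + 1  # fresh per-signature secret
        r = pow(G, k, P) % Q
        s = pow(k, -1, Q) * (digest(message) + private_key * r) % Q
        if r != 0 and s != 0:  # retry in the rare degenerate cases
            return r, s


def verify(message, signature, public_key):
    """Verification: accept or reject using only the public key."""
    r, s = signature
    if not (0 < r < Q and 0 < s < Q):
        return False
    w = pow(s, -1, Q)
    u1 = digest(message) * w % Q
    u2 = r * w % Q
    v = (pow(G, u1, P) * pow(public_key, u2, P)) % P % Q
    return v == r


private_key, public_key = generate_keys()
signature = sign(b"aggregate report", private_key)
print(verify(b"aggregate report", signature, public_key))  # True
```

With parameters this small, a forged or altered message can pass verification by chance, so the sketch demonstrates the flow of key generation, signing and verification rather than real security. The modular inverse via `pow(k, -1, Q)` requires Python 3.8 or later.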
Figure 1. Method of digital signature creation

There are two types of cryptography:
1. Secret key or symmetric cryptography
2. Public key or asymmetric cryptography
In symmetric cryptography the sender and receiver of a message know and use the same secret key: the sender uses it to encrypt the message, and the receiver uses the same key to decrypt it. Asymmetric (or public key) cryptography involves two related keys, one of which only the owner knows (the 'private key') and the other which anyone can know (the 'public key').
Figure 2. Flow of digital signature
SHA (SECURE HASH ALGORITHM):
The National Software Reference Library (NSRL) Reference Data Set (RDS) is built on file signature generation technology that is used primarily in cryptography.
SHA-0: It was withdrawn shortly after publication due to an undisclosed "significant flaw" and replaced by the slightly revised version SHA-1.
SHA-1: A 160-bit hash function which resembles the earlier MD5 algorithm. This was designed by the National Security Agency (NSA) to be part of the Digital Signature Algorithm.
SHA-2: A family of two similar hash functions, with different block sizes, known as SHA-256 and SHA-512. They differ in the word size; SHA-256 uses 32-bit words whereas SHA-512 uses 64-bit words.
A hash function is used in the signature generation process to obtain a condensed version of the data, called a message digest (Figure 1). The message digest is then input to the digital signature algorithm to generate the digital signature. The digital signature is sent to the intended verifier along with the message. The verifier of the message and signature verifies the signature by using the sender's public key.

Use of the SHA algorithm:
1. Enforce some reasonable minimum password requirements.
2. Change passwords frequently.
3. Use the strongest hash you can get; SHA-256 was suggested here.
4. Combine the password with a fixed salt (the same for your whole database).
5. Combine the result of the previous step with a unique salt that is stored and attached to this record.
6. Run the hash algorithm multiple times, for example 1000+ times. Ideally include a different salt each time with the previous hash. Speed is your enemy, and multiple iterations reduce the ...
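Steps 3 to 6 of the list above can be sketched with the Python standard library. This is a minimal illustration, assuming PBKDF2 with HMAC-SHA-256 (`hashlib.pbkdf2_hmac`) as a stand-in for the hand-written "hash it many times" loop; that substitution, the iteration count, and the names `PEPPER`, `ITERATIONS`, `hash_password` and `check_password` are assumptions, not part of the paper.

```python
import hashlib
import hmac
import secrets

# Hypothetical fixed salt shared by the whole database (step 4).
PEPPER = b"application-wide-secret"
# Assumed work factor; more iterations make each guess slower (step 6).
ITERATIONS = 100_000


def hash_password(password, salt=None):
    """Return (salt, derived_hash); a fresh per-record salt is drawn if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)  # unique salt stored with the record (step 5)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), PEPPER + salt, ITERATIONS
    )
    return salt, derived


def check_password(password, salt, expected):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)


salt, stored = hash_password("correct horse")
print(check_password("correct horse", salt, stored))  # True
print(check_password("wrong guess", salt, stored))    # False
```

Only the salt and the derived hash are stored; the plain-text password is never kept, which matches the first limitation listed for the SHA algorithm.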
Limitations of the SHA algorithm:
1. Never store a plain-text password (which means you can never display or transmit it either).
2. Never transmit the stored representation of a password over an unsecured line (whether plain text, encoded or hashed).
3. Speed is your enemy.
4. Regularly reanalyze and improve your process as hardware and cryptanalysis improve.
5. Cryptography and process are a very small part of the solution.

STOCHASTIC APPROACH:
This is a stochastic approach for link-structure analysis. The approach is based upon the theory of Markov chains and relies on the stochastic properties of random walks performed on our collection of pages. It follows the Meta algorithm.

LINEAR REGRESSION METHOD:
Linear regression is a method of organizing data. Sometimes it is appropriate to show data as points on a graph and then try to draw a straight line through the data. Linear regression is an algorithm for drawing such a line. It typically uses the least squares method to determine which line best fits the data. R-squared is a measure of how well the data points match the resulting line. Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. One variable is considered to be an explanatory variable, and the other is considered to be a dependent variable. For example, a modeler might want to relate the weights of individuals to their heights using a linear regression model.

LEAST SQUARE APPROACH:
A "least square" is determined by squaring the distance between a data point and the regression line. The least squares approach minimizes the distance between a function and the data points that the function is trying to explain. It is used in regression analysis, often in nonlinear regression modeling in which a curve is fitted to a set of data. A key attribute is the unique, distinguishing characteristic of the entity.
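The linear regression and least squares descriptions above can be illustrated with a short sketch. It fits a straight line in closed form, computes R-squared, and then reaches the same line by iteratively minimizing the squared error. We assume here that "ascent plunge" corresponds to what is usually called gradient descent; that reading, the learning rate, the step count, the sample data and all function names are illustrative assumptions, and the sketch runs on plain, unpartitioned data with no privacy protection.

```python
def least_squares_fit(xs, ys):
    """Closed-form least squares line: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def gradient_descent_fit(xs, ys, learning_rate=0.05, steps=5000):
    """Minimize the mean squared error by repeated steps against the gradient."""
    n = len(xs)
    slope, intercept = 0.0, 0.0
    for _ in range(steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2.0 / n * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Made-up sample data lying exactly on the line y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
slope, intercept = least_squares_fit(xs, ys)
print(slope, intercept, r_squared(xs, ys, slope, intercept))  # 2.0 1.0 1.0
print(gradient_descent_fit(xs, ys))  # approximately (2.0, 1.0)
```

The closed-form fit is exact for a single explanatory variable, while the iterative version is the one that generalizes to settings where the data cannot be gathered in one place, which is the case the protocols in this paper target.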
CONCLUSION:
The project fulfills the entire information requirement of the system and was developed with a view to the requirements and satisfaction. The proposed Metric Preserving Transformation stores relative information at the server with respect to an object. The system resulted in regular and timely preparation of the required outputs. We have laid down the foundations for further research in the area of Privacy-Preserving Data Mining (PPDM). Although the work described in this paper is preliminary and conceptual in nature, it is a vital prerequisite for the development and deployment of some techniques. We showed that the protocols are correct and privacy preserving.