
INTRODUCTION

RAIN technology originated in a research project at the California Institute of Technology (Caltech), in collaboration with NASA's Jet Propulsion Laboratory and the Defense Advanced Research Projects Agency (DARPA). The name of the original research project was RAIN, which stands for Reliable Array of Independent Nodes. The main purpose of the RAIN project was to identify key software building blocks for creating reliable distributed applications using off-the-shelf hardware. The focus of the research was on high-performance, fault-tolerant and portable clustering technology for space-borne computing. Led by Caltech professor Shuki Bruck, the RAIN research team formed a company called Rainfinity in 1998. Rainfinity, located in Mountain View, Calif., is already shipping its first commercial software package derived from the RAIN technology, and company officials plan to release several other Internet-oriented applications.

The RAIN project was started four years ago at Caltech to create an alternative to the expensive, special-purpose computer systems used in space missions. The Caltech researchers wanted to put together a highly reliable and available computer system by distributing processing across many low-cost commercial hardware and software components. To tie these components together, the researchers created RAIN software, which has three components:

A storage component that stores data across distributed processors and retrieves it even if some of the processors fail.

A communications component that creates a redundant network between multiple processors and supports a single, uniform way of connecting to any of the processors.

A computing component that automatically recovers and restarts applications if a processor fails.

Fig 1. RAIN Testbed at Caltech. (C = Computer, S = Switch.)

The general hardware platform for the RAIN system is a heterogeneous cluster of computing and storage nodes connected through multiple network interfaces to a network of switches. A diagrammatic representation of a possible system configuration is shown in Figure 1. Our testbed at Caltech consists of 10 Pentium workstations running the Linux operating system, each with two network interfaces, connected via four eight-way Myrinet switches [7]. Note, however, that the RAIN software is not tied to a particular hardware platform, operating system, or network type.

Rainfinity is shipping its first product, Rainwall, which is software that runs on a cluster of PCs or workstations and creates a distributed Internet gateway for hosting applications such as a firewall. When Rainwall detects a hardware or software failure, it automatically shifts traffic to a healthy gateway without disruption of service.

Two important assumptions were made, and these two assumptions reflect the differences between RAIN and a number of existing solutions in both industry and academia:

The most general shared-nothing model is assumed. There is no shared storage accessible from all computing nodes; the only way for the computing nodes to share state is to communicate via a network. This differentiates RAIN technology from existing back-end server clustering solutions such as Sun Cluster, HP MC/ServiceGuard or Microsoft Cluster Server.

The distributed application is not an isolated system. The distributed protocols interact closely with existing networking protocols so that a RAIN cluster is able to interact with its environment. Specifically, technological modules were created to handle high-volume network-based transactions. This differentiates it from traditional distributed computing projects such as Beowulf.

In short, the RAIN project intended to marry distributed computing with networking protocols. It became obvious that RAIN technology was well-suited for Internet applications. During the RAIN project, key components were built to fulfill this vision. A patent was filed and granted for the RAIN technology. Rainfinity was spun off from Caltech in 1998, and the company has exclusive intellectual property rights to the RAIN technology. After the formation of the company, the RAIN technology has been further augmented, and additional patents have been filed.



BACKGROUND
Cloud computing provides on-demand services delivered via the Internet, and has many positive characteristics such as convenience, rapid deployment and cost-efficiency. However, we have shown [6] that such off-premises services cause clients to worry about the confidentiality, integrity and availability of their data. In previous work [7], we identified five deployment models of cloud services designed to ease users' security concerns: The Separation Model separates storage of data from processing of data, at different providers. The Availability Model ensures that there are at least two providers for each of the data storage and processing tasks, and defines a replication service to ensure that the data stored at the various storage providers remains consistent at all times. The Migration Model defines a cloud data migration service to migrate data from one storage provider to another.

The Tunnel Model defines a data tunneling service between a data processing service and a data storage service, introducing a layer of separation where a data processing service is oblivious of the location (or even identity) of a data storage service. The Cryptography Model extends the tunnel model by encrypting the content to be sent to the storage provider, thus ensuring that the stored data is not intelligible to the storage provider. By use of these deployment models, we have shown [1] that through duplication and separation of duty we can alleviate availability and integrity concerns, and to some extent also confidentiality concerns, by implementing encrypted storage. However, even with encrypted storage, we still have to trust the encryption provider with all our data. Furthermore, if the data needs to be processed in the cloud, the cloud processing provider in general also needs to have access. The main motivation for confidentiality control in the cloud is currently various privacy-related legislation forbidding the export of sensitive data out of a given jurisdiction, e.g. the privacy legislation in the EU [4]. The current solution to this problem has been to sidestep it by offering geolocalized cloud services, where a customer may request the cloud provider to ensure that the sensitive data is only stored and processed on systems that are physically located in a geographically defined area, e.g., within the borders of the European Union. However, this is rapidly becoming a moot point: cloud service providers typically run global operations, and although data might physically reside in one jurisdiction, it will in principle be accessible from anywhere in the world. Although misappropriation of data by cloud providers has not been documented, Jensen et al. [8] show that current cloud implementations may be vulnerable to attack, and the first examples of cloud compromises have surfaced [9]. Ristenpart et al. [10] demonstrate that even supposedly secret information, such as where a given virtual machine is running, may be inferred by an attacker, highlighting another attack path. Furthermore, insider malfeasors can be a challenge for any organization, and an incident at Google shows they are as vulnerable as anyone [11].

Krautheim [12] proposes to achieve cloud security through the introduction of Trusted Platform Modules (TPM) in all datacenter equipment. It is not clear, however, how the user could verify that a TPM is indeed present in any given cloud infrastructure. One might argue that the cloud provider could assert, and have an auditor confirm, that they are using a TPM, but this is really not much better than today's situation, where providers assert that they will treat your data properly, and all their certifications are a testament to their staying true to their word.

EXISTING PROBLEMS ON THE INTERNET:


1. Single points of failure: devices that have no inherent redundancy or backup. 2. Bottlenecks: devices that do not have enough processing power to handle the amount of traffic they receive. These two problems hinder the reliability and performance of the network.

ARCHITECTURE OF RAIN TECHNOLOGY


The RAIN technology incorporates a number of unique innovations as its core modules. The guiding concepts that shaped the architecture are as follows:

NETWORK-APPLICATIONS:
The architecture goals for clustering data network applications are different from those for clustering data storage applications. Similar goals apply in the telecom environment that provides the Internet backbone infrastructure, due to the nature of the applications and services being clustered.

Shared-Nothing:
The shared-storage cluster is the type most widely used for database and application servers that store persistent data on disks. This type of cluster typically focuses on the availability of the database or application service rather than on performance. Recovery from failure is generally slow, because restoring application access to disk-based data takes minutes or longer, not seconds. Telecom servers deployed at the edge of the network are often diskless, keeping data in memory for performance reasons, and can only tolerate very short failover times. Therefore, a new type of shared-nothing cluster with rapid failure detection and recovery is required. The only way for the nodes of a shared-nothing cluster to share state is to communicate via the network.

Scalability:
While the high-availability cluster focuses on recovery from unplanned and planned downtime, this new type of cluster must also be able to maximize I/O performance by load balancing across multiple computing nodes. Linear scalability with network throughput is important. In order to maximize the total throughput, load-balancing decisions must be made dynamically by measuring the current capacity of each computing node in real time. Static hashing does not guarantee an even distribution of traffic.
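To illustrate the difference between static hashing and dynamic, capacity-aware load balancing, here is a minimal Python sketch. The node names and the spare-capacity metric are invented for illustration; this is not Rainfinity's actual implementation.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]          # hypothetical cluster nodes

def static_hash_pick(client_ip: str) -> str:
    """Static hashing: the chosen node depends only on the client address."""
    digest = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return NODES[digest % len(NODES)]

def dynamic_pick(spare_capacity: dict) -> str:
    """Dynamic balancing: pick the node reporting the most spare capacity."""
    return max(spare_capacity, key=spare_capacity.get)

# Example: node-b is lightly loaded right now, so it gets the next session,
# regardless of how the client address happens to hash.
capacity = {"node-a": 0.2, "node-b": 0.7, "node-c": 0.4}
print(static_hash_pick("192.0.2.10"))   # fixed mapping, ignores current load
print(dynamic_pick(capacity))           # -> node-b
```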

Peer-to-Peer:
A dispatcher-based, master-slave cluster architecture limits scalability because the dispatcher is a potential bottleneck. A peer-to-peer cluster architecture is more suitable for latency-sensitive data network applications processing short-lived sessions. A hybrid architecture should be considered to offset the need for more control over resource management. For example, a cluster can assign multiple authoritative computing nodes that process traffic in round-robin order for each network interface that is clustered, to reduce the overhead of traffic forwarding.

Reliable transport ensures reliable communication between the nodes in the cluster. This transport has a built-in acknowledgement scheme that ensures reliable packet delivery. It transparently uses all available network links to reach the destination. When it fails to do so, it alerts the upper layer, thereby functioning as a failure detector. This module is portable to different computer platforms, operating systems and networking environments.

The consistent global state sharing protocol provides consistent group membership, optimized information distribution and distributed group decision making for a RAIN cluster. This module is at the core of a RAIN cluster. It enables efficient group communication among the computing nodes and ensures that they operate together without conflict.

Always On IP maintains pools of "always-available" virtual IPs. These virtual IPs are logical addresses that can move from one node to another for load sharing or fail-over. Usually a pool of virtual IPs is created for each subnet that the RAIN cluster is connected to; a pool can consist of one or more virtual IPs. Always On IP guarantees that all virtual IP addresses representing the cluster remain available as long as at least one node in the cluster is operational. In other words, when a physical node in the cluster fails, its virtual IP is taken over by another healthy node in the cluster.

Local and global fault monitors monitor, on a continuous or event-driven basis, the critical resources within and around the cluster: network connections, Rainfinity or other applications residing on the nodes, and remote nodes or applications. This monitoring is an integral part of the RAIN technology, guaranteeing the healthy operation of the cluster.
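To make the Always On IP idea concrete, the following is a hypothetical Python sketch of reassigning a pool of virtual IPs when nodes fail. The node names and addresses are invented, and the real module of course operates at the networking layer rather than on Python dictionaries.

```python
def assign_virtual_ips(virtual_ips, healthy_nodes):
    """Spread the virtual IP pool evenly over the currently healthy nodes."""
    if not healthy_nodes:
        raise RuntimeError("cluster down: no healthy node can host the virtual IPs")
    assignment = {}
    for i, vip in enumerate(virtual_ips):
        assignment[vip] = healthy_nodes[i % len(healthy_nodes)]
    return assignment

pool = ["10.0.0.100", "10.0.0.101"]      # virtual IPs for one subnet (example values)
nodes = ["node-a", "node-b", "node-c"]

print(assign_virtual_ips(pool, nodes))        # all nodes healthy: IPs spread out
print(assign_virtual_ips(pool, ["node-c"]))   # node-a and node-b failed:
                                              # node-c takes over both virtual IPs
```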

Fig 2. RAIN Architecture

Fig 3. RAIN Software Architecture

Secure and central management offers a browser-based management GUI for centralized monitoring and configuration of all nodes in the RAIN clusters. The central management GUI connects to any node in the cluster to obtain a single-system view of the entire cluster. It actively monitors the status, and can send operation and configuration commands to the entire cluster.

FEATURES OF RAIN TECHNOLOGY:


Features of the RAIN system include scalability, dynamic reconfigurability, and high availability. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. In addition to reliability, the RAIN architecture permits efficient use of network resources, such as multiple data paths and redundant storage, with graceful degradation in the presence of faults. The RAIN project led to many novel features for dealing with faults in nodes, networks, and data storage.

Communication:
As the network is frequently a single point of failure, RAIN provides fault tolerance in the network through the following mechanisms:

Bundled interfaces: Nodes are permitted to have multiple interface cards. This not only adds fault tolerance to the network, but also gives improved bandwidth.

Link monitoring: To correctly use multiple paths between nodes in the presence of faults, we have developed a link-state monitoring protocol that provides a consistent history of the link state at each endpoint.

Fault-tolerant interconnect topologies: Network partitioning is always a problem when a cluster of computers must act as a whole. We have designed network topologies that are resistant to partitioning as network elements fail.

Data storage: Fault tolerance in data storage over multiple disks is achieved through redundant storage schemes. Novel error-correcting codes have been developed for this purpose. These are array codes that encode and decode using simple XOR operations, whereas traditional RAID codes generally only allow mirroring or parity as options. Array codes exhibit optimality in the storage requirements as well as in the number of update operations needed. Although some of the original motivation for these codes comes from traditional RAID systems, the schemes apply equally well to partitioning data over disks on distinct nodes, or even to partitioning data over remote geographic locations.

GROUP MEMBERSHIP: Tolerating faults in an asynchronous distributed system is a challenging task. A reliable group membership service ensures that the processes in a group maintain a consistent view of the global membership. Problems in asynchronous distributed systems such as consensus, group membership, commit and atomic broadcast have been studied extensively, because a distributed application must solve them in order to work correctly in the presence of faults. In the RAIN system, the group membership protocol is the critical building block. Maintaining membership is especially difficult when the membership changes, either due to failures or due to voluntary joins and withdrawals. In fact, under the classical asynchronous model the group membership problem has been proven impossible to solve in the presence of any failures. The underlying reason for the impossibility is that, according to the classical definition of an asynchronous environment, processes in the system share no common clock and there is no bound on message delay. Under this definition it is impossible to implement a reliable failure detector, for no failure detector can distinguish between a crashed node and a very slow node. Since the establishment of this theoretical result, researchers have been striving to circumvent this impossibility: theorists have modified the problem specification, while practitioners have built a number of real systems that achieve a useful level of reliability in their particular environments.

Novel features:
The group membership protocol in the RAIN system differs from that of other systems in several respects. Firstly, it is based exclusively on unicast messages, a practical model given the nature of the Internet. With this model the total ordering of packets is not relevant, and compared to broadcast messages, unicast messages incur less CPU overhead. Secondly, the protocol does not require the system to freeze during reconfiguration. We do make the assumption that the mean time to failure of the system is greater than the convergence time of the protocol. With this assumption the RAIN system tolerates node and link failures, both permanent and transient. In general it is not possible to distinguish a slow node from a dead node in an asynchronous environment, so it is inevitable that a group membership protocol will sometimes exclude a live but slow node from the membership. Our protocol allows such a node to rejoin the cluster automatically.


MECHANISM
The key to this fault management service is a token-based group membership protocol. The protocol consists of two mechanisms, a token mechanism and a 911 mechanism. The two mechanisms are described in detail in the next two sections.

Token mechanism:
The nodes in the membership are ordered in a logical ring. A token is a message that is passed at a regular interval from one node to the next node in the ring. The reliable packet communication layer is used for the transmission of the token, and guarantees that the token will eventually reach its destination. The token carries the authoritative knowledge of the membership: when a node receives the token, it updates its local membership information according to the token. The token is also used for failure detection. There are two variants of the failure detection protocol in this token mechanism. The aggressive detection protocol achieves fast detection time but is more prone to incorrect decisions, i.e., it may temporarily exclude a live node when only a link to it has failed. The conservative detection protocol excludes a node only when communication with it has failed from all nodes in the connected component; it therefore has a slower detection time than the aggressive protocol.
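The following short Python sketch illustrates the token mechanism described above. The class and field names are invented for illustration, and the reliable transport layer is simplified away.

```python
class Token:
    def __init__(self, members):
        self.members = list(members)   # authoritative membership list
        self.seq = 0                   # sequence number, incremented on every hop

class Node:
    def __init__(self, name):
        self.name = name
        self.local_members = []        # this node's view of the membership
        self.last_token = None         # local copy, used later by the 911 mechanism

    def receive_token(self, token):
        # Adopt the authoritative membership carried by the token.
        self.local_members = list(token.members)
        self.last_token = Token(token.members)
        self.last_token.seq = token.seq
        # Pass the token to the next node in the logical ring.
        token.seq += 1
        ring = token.members
        nxt = ring[(ring.index(self.name) + 1) % len(ring)]
        return nxt                      # in the real system: send over reliable transport

# One trip around a three-node ring.
names = ["A", "B", "C"]
nodes = {n: Node(n) for n in names}
tok, holder = Token(names), "A"
for _ in range(3):
    holder = nodes[holder].receive_token(tok)
print(tok.seq, nodes["C"].local_members)   # -> 3 ['A', 'B', 'C']
```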

Fig: (a) Token movement with no link failure. (b) Token movement with one link failure and aggressive failure detection. (c) Token movement with one link failure and conservative failure detection.

Having described the token mechanism, a few questions remain. What if a node fails while it holds the token, so that the token is lost? Is it possible to add a new node to the system? How does the system recover from transient failures? All of these questions are answered by the 911 mechanism.

Token regeneration:
To deal with the token loss problem, a timeout is set on each node in the membership. If a node does not receive the token for a certain period of time, it enters the STARVING mode: it suspects that the token has been lost and sends out a 911 message to the next node in the ring. The 911 message is a request for the right to regenerate the token, and must be approved by all the live nodes in the membership. It is imperative to allow one and only one node to regenerate the token when regeneration is needed. To guarantee this mutual exclusivity, we utilize the sequence number on the token. Every time the token is passed from one node to another, the sequence number on it is increased by one. The primary function of the sequence number is to allow the receiving node to discard out-of-sequence tokens, but it also plays an important role in the token regeneration mechanism. Each node makes a local copy of the token every time it receives it. When a node needs to send a 911 message to request regeneration of the token, it adds to this message the sequence number from its last local copy of the token. This sequence number is compared with the sequence numbers on the local copies of the token held by the other live nodes, and the 911 request is denied by any node that possesses a more recent copy of the token. In the event that the token is lost, every live node sends out a 911 request after its STARVING timeout expires, and only the node with the latest copy of the token receives the right to regenerate it.
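A minimal, self-contained Python sketch of the regeneration decision (timeouts and message passing are elided, and the function name is invented):

```python
def grant_regeneration(requester_seq, my_last_seq):
    """A node's decision on a 911 (token regeneration) request.

    The request carries the sequence number of the requester's last local copy
    of the token.  A node denies the request if its own copy is more recent,
    so only the node holding the latest copy ends up allowed to regenerate.
    """
    return "DENY" if my_last_seq > requester_seq else "APPROVE"

# Token lost after sequence number 5.  Node A last saw seq 5, node B last saw seq 4.
print(grant_regeneration(requester_seq=4, my_last_seq=5))  # B asks A -> DENY
print(grant_regeneration(requester_seq=5, my_last_seq=4))  # A asks B -> APPROVE
```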

Dynamic scalability:
The 911 message is not only used as a token regeneration request, but also as a request to join the group. When a new node wishes to participate in the membership, it sends a 911 message to any node in the cluster. The receiving node notices that the originating node of this 911 is not a member of the distributed system, and therefore, treats it as a join request. The next time that it receives the token, it adds the new node to the membership, and sends the token to the new node. The new node becomes a part of the system.
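The same 911 message thus doubles as a join request. A hypothetical sketch of the receiving side (names are invented and reliable transport is elided):

```python
pending_joins = []   # nodes to be added the next time the token passes through

def handle_911(members, sender, my_last_seq, sender_seq):
    """Handle a 911 message at a node that is currently in the membership.

    If the sender is not a member, the 911 is treated as a join request and the
    sender is queued to be added to the token's member list.  Otherwise it is a
    token regeneration request, decided by comparing sequence numbers.
    """
    if sender not in members:
        pending_joins.append(sender)
        return "JOIN_PENDING"
    return "DENY" if my_last_seq > sender_seq else "APPROVE"

print(handle_911(["A", "B", "C"], "D", my_last_seq=7, sender_seq=0))  # -> JOIN_PENDING
print(pending_joins)                                                  # -> ['D']
```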


Link failures and transient failures:


The unification of the token regeneration request and the join request facilitates the treatment of link failures in the aggressive failure detection protocol. In the example in figure (b), node B has been removed from the membership because of the failure of the link between A and B. Node B does not receive the token for a while, so it enters the STARVING mode and sends a 911 message to node C. Node C notices that node B is not part of the membership and therefore treats the 911 as a join request. The ring is reconfigured and node B rejoins the membership. Transient failures are treated with the same mechanism: when a transient failure occurs, a node is removed from the membership; after the node recovers, it sends out a 911 message, which is treated as a join request, and the node is added back into the cluster. In the same fashion, wrong decisions made by a local failure detector can be corrected, guaranteeing that all non-faulty nodes in the primary connected component eventually stay in the primary membership. Putting together the token and 911 mechanisms, we have a reliable group membership protocol. Using this protocol it is easy to build the fault management service, and it is also possible to attach application-dependent synchronization information to the token.


COMMUNICATION TOPOLOGIES

Bundled Interfaces:


Nodes are permitted to have multiple interface cards, which increases fault tolerance and bandwidth.

Link Monitoring:
A link-state monitoring protocol provides a consistent history of the link state at each endpoint.

Fault-Tolerant interconnect Topologies:


Network partitioning is always a problem when a cluster of computers must act as a whole. The problem: given n switches connected to m nodes in a ring, what is the best way to minimize the possibility of partitioning the nodes when failures occur?

A naive approach: At first glance, the construction shown in the figure below may seem to solve our problem. In this construction we simply connect the compute nodes to the nearest switches in a regular fashion. With this approach we are relying entirely on the fault tolerance of the switching network. A ring is 1-fault tolerant for connectivity, so we can lose one switch without disruption, but a second switch failure can partition the switches and, with them, the computing nodes. This prompts the study of whether we can use the multiple connections of the compute nodes to make them more resistant to partitioning. In other words, we want a construction where the connectivity of the nodes is maintained even after the switch network has become partitioned.

FIG: NAIVE APPROACH

Diameter Solution: The intuitive idea driving this construction is to connect the compute nodes to the switching network in the most non-local way possible, i.e. to connect each compute node to switches that are maximally distant from each other. This idea can be applied to compute nodes of arbitrary degree dc, where each connection of a node is placed as far as possible from its other connections. We call this the diameter solution because maximally distant switches in a ring lie on opposite sides of the ring, so a compute node of degree 2 connected between them forms a diameter, hence the name. (The figure below shows the diameter construction for (a) n odd and (b) n even.) In the diameter solution we actually use switches that are one less than the diameter apart, which permits n compute nodes to be connected to n switches with each compute node connected to a unique pair of switches. The diameter construction with compute nodes of degree dc = 2 connected to a ring of n switches of degree ds = 4 can tolerate 3 faults of any kind (switch, link, or node) without partitioning the network. This construction is optimal in the sense that no construction connecting n computing nodes of degree dc = 2 to a ring of switches of degree ds = 4 can tolerate an arbitrary 4 faults without partitioning the nodes into sets of non-constant size.
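A small Python sketch of the pairing idea follows. It is a simplification under the assumption that the offset between a node's two switches is the ring diameter for odd n and one less than the diameter for even n, which makes every compute node's switch pair unique; it is not a verified reproduction of the published construction.

```python
def diameter_construction(n):
    """Connect compute node i to switch i and to a switch roughly a diameter away.

    Assumed offsets: (n - 1) // 2 for odd n (the ring diameter) and
    n // 2 - 1 for even n (one less than the diameter), so that all
    n switch pairs are distinct.
    """
    offset = (n - 1) // 2 if n % 2 else n // 2 - 1
    return {i: (i, (i + offset) % n) for i in range(n)}

pairs = diameter_construction(8)
print(pairs)                                  # node -> (switch, switch), offset 3
assert len({frozenset(p) for p in pairs.values()}) == len(pairs)  # all pairs distinct
```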


FIG: DIAMETER SOLUTION


DATA STORAGE

Erasure-correcting Code:


Erasure-correcting codes are mathematical means of representing data so that lost information can be recovered. With an (n, k) erasure-correcting code, we represent k symbols of original data with n symbols of encoded data. With an m-erasure-correcting code, the original data can be recovered even if m symbols of encoded data are lost. A code is said to be Maximum Distance Separable (MDS) if m = n - k. The only operations needed for encoding and decoding are exclusive-OR (XOR) operations.
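As a minimal illustration of the idea (a simple single-parity code, not the array codes actually developed for RAIN), the following Python sketch builds an (n, k) = (k+1, k) MDS code using only XOR: any single lost symbol can be recovered.

```python
from functools import reduce

def encode(data_symbols):
    """(k+1, k) single-parity code: append the XOR of all k data symbols."""
    parity = reduce(lambda a, b: a ^ b, data_symbols)
    return data_symbols + [parity]

def decode(received):
    """Recover the original k symbols; 'received' has exactly one symbol set to None."""
    missing = received.index(None)
    recovered = reduce(lambda a, b: a ^ b, (s for s in received if s is not None))
    symbols = received[:]
    symbols[missing] = recovered
    return symbols[:-1]                      # drop the parity symbol

data = [0x12, 0x34, 0x56]                    # k = 3 data symbols
coded = encode(data)                         # n = 4 encoded symbols
coded[1] = None                              # lose any single symbol
print(decode(coded))                         # -> [18, 52, 86], the original data
```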

Distributed Store/Retrieve Operations:


Suppose we have n nodes. For a store operation, we encode data of size d into n symbols, each of size d/k, and store one symbol per node. For a retrieve operation, we collect the symbols from any k nodes and decode them to obtain the original data.
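A toy Python sketch of this store/retrieve pattern, reusing the single-parity XOR code above so that n = k + 1 and the data can be rebuilt from any k of the n nodes. The node indices and helper names are invented; the real system uses more general array codes and real remote nodes.

```python
from functools import reduce

def xor_all(values):
    return reduce(lambda a, b: a ^ b, values)

def store(data, k):
    """Split data into k symbols, add one XOR parity symbol, one symbol per node."""
    size = len(data) // k
    symbols = [data[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(xor_all(col) for col in zip(*symbols))
    return symbols + [parity]                  # n = k + 1 "nodes"

def retrieve(nodes, available, k):
    """Rebuild the data from any k of the n nodes ('available' lists their indices)."""
    chosen = available[:k]
    if len(chosen) < k:
        raise RuntimeError("fewer than k nodes reachable")
    if all(i < k for i in chosen):             # all data symbols present
        return b"".join(nodes[i] for i in range(k))
    missing = next(i for i in range(k) if i not in chosen)
    rebuilt = bytes(xor_all(col) for col in zip(*(nodes[i] for i in chosen)))
    symbols = [nodes[i] if i in chosen else rebuilt for i in range(k)]
    return b"".join(symbols)

data = b"RAIN!!"                                  # length divisible by k for simplicity
nodes = store(data, k=3)                          # 4 symbols on 4 "nodes"
print(retrieve(nodes, available=[0, 2, 3], k=3))  # node 1 down -> b'RAIN!!'
```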

Features:
Original data can be recovered with up to n-k node failures, and the scheme provides dynamic reconfigurability.


AN EXAMPLE: GROUP MEMBERSHIP


Group membership is a critical building block of the RAIN system, since tolerating faults in a distributed system is difficult. The group membership service ensures that all processes maintain a consistent view of the global membership, in spite of link/node failures and dynamic reconfiguration. The key to fault detection is the token-based group membership protocol, with its token mechanism and 911 mechanism. The token carries the group membership list and a sequence number, and only unicast messages are used.



(Figures: token handling if there is a failure; token handling if there are multiple failures; dynamic scalability; the RAIN platform.)

SIMULATIONS
In this section we describe our efforts to simulate the performance of the protocol. The source code and detailed settings are available from the authors upon request.

Simulation environment
We implemented the protocol using the Nessi simulation framework [30], written in the Python programming language. The main motivation for selecting a Python-based framework was the flexibility and ease of use that the language offers, despite its obvious performance penalty compared to C- and C++-based simulation frameworks. The Nessi framework is not currently actively maintained and lacks a few basic elements such as support for the Internet Protocol (IP), the Address Resolution Protocol (ARP) and transport-layer protocols such as TCP and UDP. The framework offers a stack consisting of everything up to and including the data link layer of the OSI reference model, as well as some application-layer traffic generators. We therefore had to make some simplifying assumptions to cater for the fact that our protocol is designed to run on top of TCP/IP or UDP/IP.

Simulation implementation
Since our implementation aimed at demonstrating the delay and throughput of the protocol, we simplified the described protocol somewhat. The mixnet is modelled as a number of nodes connected through point-to-point (P2P) links. Each node contains three network interface cards and is randomly connected to other mixnet nodes through these cards. The mixnet does not set up a path or route information in the network; instead we specify a hopcount that determines the number of times a packet should be forwarded by the network. Forwarding is done by randomly selecting one of the network interface cards attached to the host and then forwarding over the data link layer to the host connected to the other end. When the hopcount reaches zero, the mixnet node forwards the packet to the IRC node. The IRC node is connected to the mixnet through an Ethernet bus, and forwards packets to the agents on another Ethernet bus. By utilising the MAC address as an application-level address, our implementation circumvents the problem of not having a network-layer protocol. The user and C&C are represented as traffic-generating sources attached to a mixnet node, whereas the agents are implemented only as traffic sinks. Hence, the protocol is only implemented in one direction.
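The forwarding rule is simple enough to sketch in a few lines of Python. The node names and topology are invented; the actual simulation uses the Nessi framework's link-layer objects rather than a plain dictionary.

```python
import random

def forward(node, packet, hopcount, topology):
    """Forward a packet 'hopcount' times over randomly chosen links,
    then hand it to the IRC node."""
    while hopcount > 0:
        node = random.choice(topology[node])   # pick one of the node's P2P peers
        hopcount -= 1
    return ("IRC", node, packet)               # delivered to the IRC node by 'node'

# Three mixnet nodes, each wired to the other two (a stand-in for the NICs per node).
topology = {"m1": ["m2", "m3"], "m2": ["m1", "m3"], "m3": ["m1", "m2"]}
print(forward("m1", b"payload", hopcount=4, topology=topology))
```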

Simulation setup
The default values of all simulation parameters are given in Table 2 and if not explicitly stated otherwise, these are the values used for all simulation runs. We have attempted to keep the values as realistic as possible, without exhausting the resource requirements of the PC running the simulations. This particularly includes memory allocation, which can be quite challenging in Python.


SECURITY ANALYSIS
Malicious users can either collect individual data segments or re-assemble the complete data to compromise the data confidentiality.

Compromise by a single data segment


Malicious users can randomly pick up individual data segments if they have excessive access privileges to the cloud storage services. Any individual data segment that is picked by a malicious user should disclose no information about the original data, according to the criteria of data segmentation. Therefore, it is not possible for malicious users to compromise the confidentiality from any single data segment.

Compromise by re-assembling
Malicious users can also try to re-assemble the original data. The re-assembling consists of two steps: 1. picking all related data segments, and 2. permuting the data segments into the right order. Picking all the related data segments requires a malicious user to have excessive access privileges to all the involved cloud storage services, and also requires the malicious user to pick up all the data segments from all the involved cloud storage services. Suppose that for each m_i ∈ M, where M is the set of all the cloud storage services, the set of all data segments stored by the cloud storage service m_i is N_i. The total number of data segments stored in M is TotalSegs, where

TotalSegs = \sum_{i=1}^{|M|} |N_i|    (12)


To be able to re-assemble the original data, a malicious user must be able to pick all the segments and permute them into the right order. If the malicious user does not know the number of segments the original data has been split into, the total number of possible re-assembled data items is NumOfAllReassembled, where (here P^{i}_{m} denotes the number of ordered selections of i segments out of m)

NumOfAllReassembled = \sum_{i=1}^{U} P^{i}_{TotalSegs}    (13)
                    = \sum_{i=1}^{U} P^{i}_{\sum_{j=1}^{|M|} |N_j|}    (14)

where U is the upper limit of the number of segments that a piece of data is likely to be split into. If the malicious user knows s, the number of segments the original data has been split into, the total number of possible re-assembled data items is NumOfReassembled, where

NumOfReassembled = \sum_{i=1}^{s} P^{i}_{TotalSegs}    (15)
                 = \sum_{i=1}^{s} P^{i}_{\sum_{j=1}^{|M|} |N_j|}    (16)

Both cases require a large amount of computation to brute-force search the complete space; hence it is not trivial for a malicious user to compromise the data confidentiality by re-assembling the original data. From a complexity point of view, assume that there are in total n pieces of data stored in the cloud, and let a malicious user try to illegally access a file which has been split into k pieces kept in the cloud. The malicious user must first re-assemble the whole file, which takes two steps. 1. Step 1: All k pieces must be correctly retrieved out of the n pieces. The probability of retrieving the correct pieces is as follows.

p_1 = \frac{1}{\binom{n}{k}} = \frac{k!}{n(n-1)\cdots(n-k+1)}

2. Step 2: Re-order the k pieces into the correct order, given the k pieces. The probability of putting all k pieces in the right order without any knowledge of the original data is

p_2 = \frac{1}{P^{k}_{k}} = \frac{1}{k!}
Hence, the probability of re-assembling the file correctly is

p = p_1 \cdot p_2 = \frac{k!}{n(n-1)\cdots(n-k+1)} \cdot \frac{1}{k!} = \frac{1}{n(n-1)\cdots(n-k+1)}
Assuming that there is a very large number of pieces in the cloud and that each file is split into small enough chunks, n and k are both large enough to ensure that the probability p is small enough to counter attacks. The cost for an attacker goes far beyond the computational complexity of re-assembling the k pieces. Due to the distributed character of the proposed storage system, the system contains a very large amount of data, distributed across various networks. An attacker attempting to re-assemble a file by brute force will need an extremely large storage space to keep all the retrieved data pieces (both the correct ones and the wrong ones), and also has to bear the cost of the network bandwidth needed to transfer such an amount of data across the network.
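A quick numeric illustration of the formula above (the values of n and k are arbitrary example values, not taken from the paper): for n = 1000 pieces in the cloud and a file split into k = 10 pieces, the probability of a correct blind re-assembly is vanishingly small.

```python
def reassembly_probability(n, k):
    """p = 1 / (n * (n-1) * ... * (n-k+1)), per the analysis above."""
    p = 1.0
    for i in range(k):
        p /= (n - i)
    return p

print(reassembly_probability(1000, 10))   # ≈ 1.0e-30
```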

Protocol analysis
As can be seen from Table 1, pains have been taken to avoid reusing commands that otherwise might have made it possible to replay a message from A to B as a message from B to C. This follows Principle 1 of Abadi and Needham [28], and to some extent also Principle 3, since it ensures that every message will only be handled by the type of actor for which it was intended. However, full adherence to Principle 3 may be difficult to achieve in a setting of anonymous communication. The explicit naming of commands is also in accordance with Principle 10, since it allows for unambiguous encoding of each message.


Table 1: Summary of all protocol commands.


Since data is sent encrypted from the C&C node to the Cloud Processing provider, it cannot be observed by an adversary; the adversary cannot determine the symmetric key used to encrypt the response, and thus cannot perform a suppress-replay attack to replace the result with a bogus one. We are assuming the C&C node has verified public keys for the providers, which means only the selected provider sees the data; but as long as the relationship between user and data is kept secret, it does not really matter exactly which Cloud Processing provider handles the data. By applying the Scyther tool [29], we find (unsurprisingly) that the assumption that data and session keys from the C&C node are kept confidential holds (see Listing 1), but unless the public key of the processing provider has been verified, we cannot assume that confidentiality is maintained. Since the Scyther tool does not support verifying privacy/anonymity claims, it cannot be used to verify the full protocol.


ADVANTAGES OF RAIN TECHNOLOGY


SCALABILITY: RAIN technology is the most scalable software cluster technology for the Internet marketplace today. There is no limit on the size of a RAIN cluster. Within a RAIN cluster there is no master-slave relationship or primary-secondary pairing: all nodes are active and can participate in load balancing, and any node can fail over to any other node.

FAULT TOLERANCE: A RAIN cluster can tolerate multiple node failures, as long as at least one node remains healthy. It employs highly efficient consistent state-sharing and decision-making protocols, so that the entire cluster can function as one system. A RAIN cluster is a true distributed computing system that is resilient to faults. It behaves well in the presence of node, link and application failures, as well as transient failures. When there are failures in the system, a RAIN cluster gracefully degrades its performance by excluding the failed node, but maintains the overall functionality.

ADDITION OF NEW NODES: New nodes can be added to the cluster on the fly to participate in load sharing, without taking down the cluster. With RAIN, online maintenance without downtime is possible: part of the cluster can be taken down for maintenance while the other part maintains the functionality. RAIN also allows online addition of new nodes to grow the cluster for higher performance and higher levels of fault tolerance. It is very simple to deploy and manage a RAIN cluster. RAIN technology addresses the scalability problem in the layer where it occurs, without the need to create additional layers in front. One element in the RAIN architecture is the management module, which allows the user to monitor and configure the entire cluster by connecting to any one of the nodes. The consistent state-sharing module helps propagate the configuration throughout the cluster.

PORTABILITY: This software-only technology is open and highly portable. It works with a variety of hardware and software environments; currently it has been ported to Solaris, NT and Linux. It also supports a heterogeneous environment, where the cluster can consist of nodes with different operating systems and different configurations. There is no distance limitation to RAIN technology: it supports clusters of geographically distributed nodes, and it can work with many different Internet applications. RAIN technology has been commercialized by Rainfinity, a start-up company focusing on clustered solutions for improving the performance and availability of Internet data centers [12]. With RAIN technology at its core, Rainfinity has created a family of Internet Reliability Software solutions that address the availability and performance requirements of the Internet infrastructure. Each solution is focused on critical elements or functions of the Internet infrastructure, such as firewalls, web servers, and traffic management, bringing the unlimited scalability and built-in reliability that mission-critical Internet environments require.


APPLICATIONS OF RAIN TECHNOLOGY


We consider several applications implemented on the RAIN platform based on the communication, fault management and data storage building blocks: a video server (RAIN Video), a web server (SNOW), and a distributed checkpointing system (RAINCheck).

1) High availability video server:


There has been considerable research in the areas of fault-tolerant Internet and multimedia servers; examples include the SunSCALR project at Sun Microsystems [15]. For the RAIN Video application, a collection of videos is written and encoded to all n nodes in the system with distributed store operations. Each node then runs a client application that attempts to display a video, as well as a server application that supplies encoded video data.

2) The RAIN Video system:
For each block of video data, a client performs a distributed retrieve operation to obtain encoded symbols from k of the servers. It then decodes the block of video data and displays it. If we break network connections or take down nodes, some of the servers may no longer be accessible. However, the videos continue to run without interruption provided that each client can access at least k servers. Snapshots of the demo are shown in the figure. The testbed consists of 10 computers, each with two Myrinet network interfaces, and four 8-way Myrinet network switches.

3) High availability web server:


SNOW stands for Strong Network of Web Servers. It is a proof-of-concept project that demonstrates the features of the RAIN system. The main purpose is to develop a highly available, fault-tolerant distributed web server cluster that minimizes the risk of downtime for mission-critical Internet and intranet applications. The SNOW project uses several key building blocks of the RAIN technology. First, the reliable communication layer is used to handle all of the messages passed between the servers in the SNOW system. Secondly, the token-based fault management module is used to establish the set of servers participating in the cluster.


Fig: A client node displaying a video in the RAIN Video system.

SNOW also uses the distributed state-sharing mechanism enabled by the RAIN system. The state information of the web servers, namely the queue of HTTP requests, is shared reliably and consistently among the SNOW nodes. High availability and performance are achieved without external load-balancing devices. The SNOW system is also readily scalable; in contrast, the commercially available Microsoft Wolfpack is only available for up to two nodes per cluster.

4) Distributed checkpointing mechanism:


A checkpoint and rollback/recovery mechanism [2] has been implemented on the RAIN platform based on the distributed store and retrieve operations. The scheme runs in conjunction with a leader election protocol [13], which ensures that there is a unique node designated as leader in every connected set of nodes. As each job executes, a checkpoint of its state is taken periodically; the state is encoded and written to all accessible nodes with a distributed store operation. If a node fails or becomes inaccessible, the leader reassigns the node's job to another node.


FUTURE SCOPE OF RAIN TECHNOLOGY


Development of APIs for using the various building blocks: we should standardize the packaging of the various components to make them more practical for use by outside groups. The implementation of a real distributed file system using the partitioning scheme developed here: in addition to making the building blocks more accessible to others, this would help in assessing the performance benefits and penalties of partitioning data in such a manner. The group communication protocols are being extended to address more challenging scenarios; for example, we are currently working on a hierarchical design that extends the scalability of the protocol.

This divide-and-conquer approach may be suitable for privacy-conscious home users and small businesses, but the ultimate holy grail is absolute confidentiality in the cloud, and thus a deliverance from trust. Only then can cloud computing deliver on the dream of computing power as a utility akin to power, water and gas. A further refinement of our approach that removes the necessity to trust the C&C node is therefore a natural challenge.

We have implemented a simple proof-of-concept prototype [43], so the next step will be to implement a large-scale prototype to gauge performance impacts on typical cloud applications. One particular challenge in this respect is to determine the optimal slicing strategy for arbitrary data. It is likely that a trade-off between security and efficiency will have to be made in order to capitalize on the advantages of the Cloud Computing paradigm. The prototype will be targeted toward a sensitive but unclassified application, representing a realistic use case.


CONCLUSION
The goal of the RAIN project has been to build a test-bed for various building blocks that address fault management, communication and storage in a distributed environment. The creation of such building blocks is important for the development of a fully functional distributed computing system. One of the fundamental driving ideas behind this work has been to consolidate the assumptions required to get around the difficult parts of distributed computing into several basic building blocks. We feel that the ability to provide basic, provably correct services is essential to building a real fault-tolerant system; in other words, difficult proofs should be confined to a few basic components of the system. Components built on top of these reliable components should then be easier to develop and easier to establish as correct in their own right. The building blocks that we consider important, and that are discussed in this paper, are those providing reliable communication, group membership and reliable storage. Simply put, RAIN allows for the grouping of an unlimited number of nodes, which can then function as one single giant node, sharing load or taking over if one or more of the nodes ceases to function correctly. The RAIN technology incorporates many important and unique innovations in its core elements, which deliver important advantages: unlimited scalability, high performance, built-in reliability, simple deployment and management, and the flexibility to integrate the software in a variety of hardware and software environments.


REFERENCES:

[1] Y. Amir et al., "The Totem Single-Ring Ordering and Membership Protocol," ACM Trans. Computer Systems, vol. 13, no. 4, 1995.
[2] E.N. Elnozahy and W. Zwaenepoel, "Manetho: Transparent Rollback-Recovery with Low Overhead, Limited Rollback, and Fast Output Commit," IEEE Trans. Computers, vol. 41, no. 5, May 1992.
[3] A. Beguelin et al., "Application Level Fault Tolerance in Heterogeneous Networks of Workstations," J. Parallel and Distributed Computing, vol. 43, no. 2, pp. 147-155, 1997.
[4] M. Ben-Or, "Another Advantage of Free Choice: Completely Asynchronous Agreement Protocols," Proc. Second ACM Symp. Principles of Distributed Computing, Aug. 1983.
[5] K.P. Birman and B.B. Glade, "Reliability Through Consistency," IEEE Software, vol. 12, no. 3, 1995.
[6] M. Blaum et al., "EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures," IEEE Trans. Computers, vol. 44, no. 2, Feb. 1995.
[7] N.J. Boden et al., "Myrinet: A Gigabit-per-Second Local Area Network," IEEE Micro, vol. 15, no. 1, 1995.
[8] D. Dolev and D. Malki, "The Transis Approach to High Availability Cluster Communication," Comm. ACM, vol. 39, no. 4, 1996.
[9] A. Singhai et al., "The SunSCALR Framework for Internet Servers," Proc. IEEE 28th Int'l Symp. Fault-Tolerant Computing, 1998.
[10] T.D. Chandra et al., "On the Impossibility of Group Membership," Proc. 15th ACM Symp. Principles of Distributed Computing, 1996.

