
CLOUD COMPUTING

ABSTRACT
Innovation is necessary to ride the inevitable tide of change.

Indeed, the success of IBM's transformation into an On Demand Business depends on striking the right balance of productivity, collaboration, and innovation to achieve sustained, organic top-line growth and bottom-line profitability. Enterprises strive to reduce computing costs. Many start by consolidating their IT operations and later introduce virtualization technologies. Cloud computing takes these steps to a new level, allowing an organization to further reduce costs through improved utilization, reduced administration and infrastructure costs, and faster deployment cycles. The cloud is a next-generation platform that provides dynamic resource pools, virtualization, and high availability.

Cloud computing describes both a platform and a type of application. A cloud computing platform dynamically provisions, configures, reconfigures, and deprovisions servers as needed. Cloud applications are applications that are extended to be accessible through the Internet; they rely on large data centers and powerful servers that host Web applications and Web services.

Cloud computing infrastructure accelerates and fosters the adoption of innovations. Enterprises are increasingly making innovation their highest priority. They realize they need to seek new ideas and unlock new sources of value. Driven by the pressure to cut costs and grow simultaneously, they realize that it is not possible to succeed simply by doing the same things better; they have to do new things that produce better results. Cloud computing enables innovation by relieving innovators of the need to find resources to develop, test, and make their innovations available to the user community. Innovators are free to focus on the innovation itself rather than the logistics of finding and managing the resources that enable it. Cloud computing helps leverage innovation as early as possible to deliver business value to IBM and its customers.

Fostering innovation requires unprecedented flexibility and responsiveness. The enterprise should provide an ecosystem where innovators are not hindered by excessive processes, rules, and resource constraints. In this context, a cloud computing service is a necessity: an automated framework that can deliver standardized services quickly and cheaply.

Cloud computing infrastructure also allows enterprises to make more efficient use of their IT hardware and software investments. It increases profitability by improving resource utilization: pooling resources into large clouds drives down costs and increases utilization by delivering resources only for as long as they are needed. Cloud computing allows individuals, teams, and organizations to streamline procurement processes and eliminate the need to duplicate computer administration skills related to setup, configuration, and support.

This paper introduces the value of implementing cloud computing. It defines clouds, explains the business benefits of cloud computing, and outlines cloud architecture and its major components. Readers will discover how a business can use cloud computing to foster innovation and reduce IT costs.

Audio CAPTCHA: Existing solutions assessment and a new implementation for VoIP telephony
ABSTRACT
Spam over Internet Telephony (SPIT) is a potential source of future annoyance in Voice over IP (VoIP) systems. A typical way to launch a SPIT attack is to use an automated procedure (i.e., a bot) that generates calls and plays unsolicited audio messages. A known way to protect against spam is a reverse Turing test called a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart). In this paper, we evaluate existing audio CAPTCHAs, as this format is the most suitable for VoIP systems. To do so, we first propose specific attributes and requirements that an audio CAPTCHA should meet in order to be effective. We then evaluate a set of popular audio CAPTCHAs against these requirements and demonstrate that no existing implementation is suitable enough for VoIP environments. Next, we design and implement a new audio CAPTCHA suitable for SIP-based VoIP telephony. Finally, the new CAPTCHA is tested against users and bots and shown to be effective.

Viruses and Antiviruses


Viruses: A virus is basically an executable file designed so that, first, it is able to infect documents; second, it can survive by replicating itself; and third, it can avoid detection. Computer viruses can be classified into several types.

File or program viruses: These infect program files, such as files with extensions like .EXE, .COM, .BIN, .DRV, and .SYS. Some file viruses just replicate, while others destroy the program being used at the time.

Boot sector viruses (MBR or Master Boot Record viruses): Boot sector viruses can be created without much difficulty and infect either the master boot record of the hard disk or the floppy drive.

Polymorphic viruses: These are the most difficult viruses to detect. They have the ability to mutate, meaning that they change the viral code, known as the signature, each time they spread or infect.

Antiviruses: The ideal solution to the threat of viruses is prevention: do not allow a virus to get into the system in the first place. This goal is in general difficult to achieve, although prevention can reduce the number of successful viral attacks. The next best approach is to be able to do the following: detection, identification, and removal. The basic techniques are:

Scanners: Scanners are programs that scan executable objects (files and boot sectors) for the presence of code sequences that are present in known viruses.

Monitors: Monitoring programs are memory-resident programs that constantly monitor some functions of the operating system.

Integrity checking: A program that can detect that other executable objects have been modified will be able to detect an infection. Such programs are usually called integrity checkers. A minimal sketch of this idea follows.
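The sketch below illustrates the integrity-checking idea in C. It is not from the original text: the FNV-1a hash is an arbitrary choice made only to keep the sketch self-contained (a real checker would use a cryptographic hash such as SHA-256 and keep its baseline database on protected media). The program prints a fingerprint for each file named on the command line; a mismatch against a previously recorded fingerprint signals that the executable has been modified.

    /* integrity_check.c -- a minimal sketch of an integrity checker. */
    #include <stdio.h>
    #include <stdint.h>

    /* FNV-1a: hash = (hash XOR byte) * prime, for every byte of the file */
    static uint64_t fnv1a_file(FILE *f)
    {
        uint64_t hash = 1469598103934665603ULL;   /* FNV offset basis */
        int c;
        while ((c = fgetc(f)) != EOF) {
            hash ^= (uint64_t)(unsigned char)c;
            hash *= 1099511628211ULL;             /* FNV prime */
        }
        return hash;
    }

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s file...\n", argv[0]);
            return 1;
        }
        for (int i = 1; i < argc; i++) {
            FILE *f = fopen(argv[i], "rb");
            if (!f) { perror(argv[i]); continue; }
            /* A real integrity checker would compare this value with the
             * hash recorded when the file was known to be clean; any
             * mismatch means the executable has been modified. */
            printf("%016llx  %s\n",
                   (unsigned long long)fnv1a_file(f), argv[i]);
            fclose(f);
        }
        return 0;
    }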

Cell Phone Virus and Security


ABSTRACT
As cell phones become part and parcel of our lives, the threats posed to them are also on the increase. Like the Internet, cell phones today are going online through technologies such as EDGE and GPRS. This online network of cell phones has exposed them to the high risks caused by malware: viruses, worms, and Trojans designed for the mobile phone environment. The security threat posed by this malware is severe enough that the time may soon come when hackers can infect mobile phones with malicious software that deletes personal data or runs up a victim's phone bill by making toll calls. Such attacks can overload mobile networks, eventually causing them to crash, and can enable the theft of financial data, which poses particular risks for smartphones. As mobile technology is comparatively new and still developing compared with Internet technology, antivirus companies, together with vendors of phones and mobile operating systems, have intensified research and development on this growing threat and are treating it from a more serious perspective.

Sixth Sense Technology


ABSTRACT
Sixth Sense Technology pairs a mini-projector with a camera and a cell phone, which acts as the computer and is connected to the cloud, where all the information on the web is stored. Sixth Sense can also obey hand gestures. The camera instantly recognizes objects around a person, and the micro-projector overlays information on any surface, including the object itself or the user's hand. The user can also access or manipulate the information using finger gestures: extend a hand in front of the projector and a number pad appears for making a call; draw a circle on the wrist and a watch appears showing the time; frame a square with the fingers, highlighting whatever is to be captured, and the system takes a photo, which can later be organized with the others using hand movements in the air.

The device has a huge number of applications. It is portable and easy to carry, as it can be worn around the neck. The drawing application lets the user draw on any surface by tracking the movement of the index finger. Maps can be projected anywhere, with zooming in or out controlled by gestures. The camera also lets the user take pictures of the scene being viewed and later arrange them on any surface. Some of the more practical uses are reading a newspaper while viewing videos in place of the photos on the page, or seeing live sports updates while reading. The device can also read an airline ticket and indicate the flight's arrival, departure, or delay time. For book lovers it is nothing less than a blessing: open any book and the device can show the book's Amazon rating, and on any page it can overlay additional information about the text, comments, and many more add-on features.

CLUSTER COMPUTING

ABSTRACT
A computer cluster is a group of loosely coupled computers that work together closely, so that in many respects they can be viewed as a single computer. Clusters are commonly connected through fast local area networks. Clusters are usually deployed to improve speed and/or reliability over that provided by a single computer, while typically being much more cost-effective than single computers of comparable speed or reliability. Cluster computing has emerged as a result of the convergence of several trends, including the availability of inexpensive high-performance microprocessors and high-speed networks and the development of standard software tools for high-performance distributed computing. Clusters have evolved to support applications ranging from e-commerce to high-performance database applications. Clustering has been available since the 1980s, when it was used in DEC's VMS systems. IBM's Sysplex is a cluster approach for mainframe systems. Microsoft, Sun Microsystems, and other leading hardware and software companies offer clustering packages that are said to offer scalability as well as availability. Cluster computing can also be used as a relatively low-cost form of parallel processing for scientific and other applications that lend themselves to parallel operations.

INTRODUCTION
Computing is an evolutionary process. Five generations of development history, with each generation improving on the previous one's technology, architecture, software, applications, and representative systems, make that clear. As part of this evolution, computing requirements driven by applications have always outpaced the available technology, so system designers have always needed to seek faster, more cost-effective computer systems. Parallel and distributed computing provides the best solution, by offering computing power that greatly exceeds the technological limitations of single-processor systems. Unfortunately, although the parallel and distributed computing concept has been with us for over three decades, the high cost of multiprocessor systems has blocked commercial success so far. Today, a wide range of applications are hungry for higher computing power, and even though single-processor PCs and workstations can now provide extremely fast processing, the even faster execution that multiple processors can achieve by working concurrently is still needed. Now, finally, costs are falling as well. Networked clusters of commodity PCs and workstations using off-the-shelf processors and communication platforms such as Myrinet, Fast Ethernet, and Gigabit Ethernet are becoming increasingly cost-effective and popular. This concept, known as cluster computing, will surely continue to flourish: clusters can provide enormous computing power that a pool of users can share or that can be collectively used to solve a single application. In addition, clusters do not incur a very high cost, a factor that led to the sad demise of massively parallel machines. Clusters, built using commodity-off-the-shelf (COTS) hardware components and free, or commonly used, software, are playing a major role in solving large-scale science, engineering, and commercial problems.
Cluster computing has emerged as a result of the convergence of several trends, including the availability of inexpensive high-performance microprocessors and high-speed networks, the development of standard software tools for high-performance distributed computing, and the increasing need for computing power in computational science and commercial applications.

CLUSTER HISTORY
The first commodity clustering product was ARCnet, developed by Datapoint in 1977. ARCnet wasn't a commercial success, and clustering didn't really take off until DEC released its VAXcluster product in the 1980s for the VAX/VMS operating system. The ARCnet and VAXcluster products not only supported parallel computing, but also shared file systems and peripheral devices. They were supposed to give you the advantage of parallel processing while maintaining data reliability and uniqueness. VAXcluster, now VMScluster, is still available on OpenVMS systems from HP running on Alpha and Itanium systems. The history of cluster computing is intimately tied to the evolution of networking technology: as networking technology has become cheaper and faster, cluster computers have become significantly more attractive.

How can applications be run faster? There are three ways to improve performance: work harder, work smarter, and get help. In computing terms these correspond to faster hardware, more efficient algorithms and software, and parallel processing. Rapid technical advances in areas such as VLSI and software technology, together with the demands of grand-challenge applications, have made parallel computing the main driving force of the current era of computing.

CLUSTERS
Extraordinary technological improvements over the past few years in areas such as microprocessors, memory, buses, networks, and software have made it possible to assemble groups of inexpensive personal computers and/or workstations into a cost-effective system that functions in concert and possesses tremendous processing power. Cluster computing is not new, but in company with other technical capabilities, particularly in the area of networking, this class of machine is becoming a high-performance platform for parallel and distributed applications. Scalable computing clusters, ranging from a cluster of (homogeneous or heterogeneous) PCs or workstations to SMPs (Symmetric Multi-Processors), are rapidly becoming the standard platforms for high-performance and large-scale computing. A cluster is a group of independent computer systems and thus forms a loosely coupled multiprocessor system, as shown in the figure.

However, the cluster computing concept also poses pressing research challenges. A cluster should be a single computing resource and provide a single system image; this is in contrast to a distributed system, where the nodes serve only as individual resources. It must provide scalability by letting the system scale up or down; the scaled-up system should provide more functionality or better performance, and the system's total computing power should increase proportionally to the increase in resources. The main motivation for a scalable system is to provide a flexible, cost-effective information-processing tool. Finally, the supporting operating system and communication mechanism must be efficient enough to remove the performance bottlenecks.

The concept of Beowulf clusters originated at the Center of Excellence in Space Data and Information Sciences (CESDIS), located at the NASA Goddard Space Flight Center in Maryland. The goal of building a Beowulf cluster is to create a cost-effective parallel computing system from commodity components to satisfy specific computational requirements for the earth and space sciences community. The first Beowulf cluster was built from 16 Intel DX4 processors connected by channel-bonded 10 Mbps Ethernet, and it ran the Linux operating system. It was an instant success, demonstrating the concept of using a commodity cluster as an alternative choice for high-performance computing (HPC). After the success of the first Beowulf cluster, several more were built by CESDIS using several generations and families of processors and networks. Beowulf is a concept of clustering commodity computers to form a parallel, virtual supercomputer. It is easy to build a unique Beowulf cluster from components that you consider most appropriate for your applications.
Such a system can provide a cost-effective way to gain features and benefits (fast and reliable services) that have historically been found only on more expensive proprietary shared-memory systems. The typical architecture of a cluster is shown in Figure 3. As the figure illustrates, numerous design choices exist for building a Beowulf cluster.

WHY CLUSTERS?
The question may arise why clusters are designed and built when perfectly good commercial supercomputers are available on the market. The answer is that the latter are expensive, while clusters are surprisingly powerful. The supercomputer has come to play a larger role in business applications: in areas from data mining to fault-tolerant performance, clustering technology has become increasingly important. Commercial products have their place, and there are perfectly good reasons to buy a commercially produced supercomputer if it is within our budget and our applications can keep the machine busy all the time. We will, however, also need a data center to keep it in, and a budget for the maintenance and upgrades required to keep the investment up to par. In practice, many who need to harness supercomputing power do not buy supercomputers because they cannot afford them, and upgrading them is practically impossible. Clusters, on the other hand, are a cheap and easy way to take off-the-shelf components and combine them into a single supercomputer. In some areas of research, clusters are actually faster than commercial supercomputers. Clusters also have the distinct advantage that they are simple to build using components available from hundreds of sources; we do not even have to use new equipment to build a cluster.

Price/Performance
The most obvious benefit of clusters, and the most compelling reason for the growth in their use, is that they have significantly reduced the cost of processing power. One indication of this phenomenon is the Gordon Bell Award for Price/Performance Achievement in Supercomputing, which in many of the last several years has been awarded to Beowulf-type clusters. One of the most recent entries, the Avalon cluster at Los Alamos National Laboratory, "demonstrates price/performance an order of magnitude superior to commercial machines of equivalent performance." This reduction in the cost of entry to high-performance computing (HPC) has been due to the commoditization of both hardware and software, particularly over the last 10 years. The prices of all computer components have dropped dramatically in that time. The components critical to the development of low-cost clusters are:
1. Processors - commodity processors are now capable of computational power previously reserved for supercomputers; witness Apple Computer's recent ad campaign touting the G4 Macintosh as a supercomputer.
2. Memory - the memory used by these processors has dropped in cost right along with the processors.
3. Networking components - the most recent group of products to experience commoditization and dramatic cost decreases is networking hardware. High-speed networks can now be assembled with these products for a fraction of the cost necessary only a few years ago.
4. Motherboards, buses, and other subsystems - all of these have become commodity products, allowing the assembly of affordable computers from off-the-shelf components.

COMPARING OLD AND NEW
Today, open standards-based HPC systems are being used to solve problems ranging from high-end, floating-point-intensive scientific and engineering problems to data-intensive tasks in industry. Some of the reasons why HPC clusters outperform RISC-based systems include:
Collaboration - Scientists can collaborate in real time across dispersed locations, bridging isolated islands of scientific research and discovery, when HPC clusters are based on open-source and building-block technology.
Scalability - HPC clusters can grow in overall capacity because processors and nodes can be added as demand increases.
Availability - Because single points of failure can be eliminated, if any one system component goes down, the system as a whole, or the solution (multiple systems), stays highly available.
Ease of technology refresh - Processor, memory, disk, or operating system (OS) technology can be easily updated, and new processors and nodes can be added or upgraded as needed.
Affordable service and support - Compared to proprietary systems, the total cost of ownership can be much lower. This includes service, support, and training.
No vendor lock-in - The age-old problem of proprietary vs. open systems that use industry-accepted standards is eliminated.
System manageability - The installation, configuration, and monitoring of key elements of proprietary systems is usually accomplished with proprietary technologies, complicating system management. The servers of an HPC cluster can be easily managed from a single point using readily available network infrastructure and enterprise management software.
Reusability of components - Commercial components can be reused, preserving the investment. For example, older nodes can be deployed as file/print servers, web servers, or other infrastructure servers.
Disaster recovery - Large SMPs are monolithic entities located in one facility. HPC systems can be collocated or geographically dispersed to make them less susceptible to disaster.

LOGICAL VIEW OF A CLUSTER
A Beowulf cluster uses a multi-computer architecture, as depicted in the figure. It features a parallel computing system that usually consists of one or more master nodes and one or more compute nodes, or cluster nodes, interconnected via widely available network interconnects. All of the nodes in a typical Beowulf cluster are commodity systems (PCs, workstations, or servers) running commodity software such as Linux.

The master node acts as a server for the Network File System (NFS) and as a gateway to the outside world. As an NFS server, the master node provides user file space and other common system software to the compute nodes via NFS. As a gateway, the master node allows users to gain access through it to the compute nodes. Usually, the master node is the only machine that is also connected to the outside world, using a second network interface card (NIC). The sole task of the compute nodes is to execute parallel jobs. In most cases, therefore, the compute nodes do not have keyboards, mice, video cards, or monitors. All access to the client nodes is provided via remote connections from the master node. Because compute nodes do not need to access machines outside the cluster, nor do machines outside the cluster need to access compute nodes directly, compute nodes commonly use private IP addresses, such as the 10.0.0.0/8 or 192.168.0.0/16 address ranges.

From a user's perspective, a Beowulf cluster appears as a Massively Parallel Processor (MPP) system. The most common method of using the system is to access the master node either directly or through Telnet or remote login from a personal workstation. Once on the master node, users can prepare and compile their parallel applications and spawn jobs on a desired number of compute nodes in the cluster. Applications must be written in a parallel style and use the message-passing programming model. Jobs of a parallel application are spawned on compute nodes, which work collaboratively until the application finishes. During execution, compute nodes use standard message-passing middleware, such as the Message Passing Interface (MPI) and Parallel Virtual Machine (PVM), to exchange information.

ARCHITECTURE
A cluster is a type of parallel or distributed processing system consisting of a collection of interconnected standalone computers cooperatively working together as a single, integrated computing resource. A node is a single or multiprocessor system with memory, I/O facilities, and an OS. A cluster generally comprises two or more computers (nodes) connected together, either in a single cabinet or physically separated and connected via a LAN. The nodes appear as a single system to users and applications and provide a cost-effective way to gain features and benefits.

Three principal features usually provided by cluster computing are availability, scalability, and simplification. Availability is provided by the cluster of computers operating as a single system: the cluster continues to provide services even when one of the individual computers is lost due to a hardware failure or other reason. Scalability is provided by the inherent ability of the overall system to allow new components, such as computers, to be added as the overall system's load increases. Simplification comes from the ability of the cluster to allow administrators to manage the entire group as a single system. This greatly simplifies the management of groups of systems and their applications.

The goal of cluster computing is to facilitate sharing a computing load over several systems without either the users of the system or the administrators needing to know that more than one system is involved. The Windows NT Server Edition of the Windows operating system is an example of a base operating system that has been modified to include an architecture that facilitates a cluster computing environment. Cluster computing has been employed for over fifteen years, but it is the recent demand for higher availability in small businesses that has caused an explosion in this field. Electronic databases and electronic malls have become essential to the daily operation of small businesses. Access to this critical information has created a large demand for the principal features of cluster computing.

There are some key concepts that must be understood when forming a cluster computing resource. Nodes or systems are the individual members of a cluster. They can be computers, servers, and other such hardware, although each node generally has memory and processing capabilities.
If one node becomes unavailable, the other nodes can carry the demand load so that applications or services are always available. There must be at least two nodes to compose a cluster; otherwise they are just called servers. The collection of software on each node that manages all cluster-specific activity is called the cluster service. The cluster service manages all of the resources, the canonical items in the system, and sees them as identical opaque objects. Resources can be such things as physical hardware devices, like disk drives and network cards, or logical items, like logical disk volumes, TCP/IP addresses, applications, and databases. When a resource is providing its service on a specific node, it is said to be online. A collection of resources to be managed as a single unit is called a group. Groups contain all of the resources necessary to run a specific application and, if need be, to connect to the service provided by the application in the case of client systems. These groups allow administrators to combine resources into larger logical units so that they can be managed as a unit. This, of course, means that all operations performed on a group affect all resources contained within that group.

Normally the development of a cluster computing system occurs in phases. The first phase involves establishing the underpinnings in the base operating system and building the foundation of the cluster components. These should focus on providing enhanced availability to key applications using storage that is accessible to two nodes. The following stages occur as demand increases and should allow much larger clusters to be formed. These larger clusters should have a true distribution of applications, higher-performance interconnects, widely distributed storage for easy accessibility, and load balancing. Cluster computing will become even more prevalent in the future because of the growing needs and demands of businesses, as well as the spread of the Internet.

Clustering Concepts
Clusters are in fact quite simple: they are a bunch of computers tied together with a network, working on a large problem that has been broken down into smaller pieces. There are a number of different strategies for tying them together, and a number of different software packages that can be used to make the software side of things work.

Parallelism
The name of the game in high-performance computing is parallelism. It is the quality that allows something to be done in parts that work independently, rather than as a task with so many interlocking dependencies that it cannot be further broken down. Parallelism operates at two levels: hardware parallelism and software parallelism.

Hardware Parallelism
At one level, hardware parallelism deals with the CPU of an individual system and how performance can be squeezed out of sub-components of the CPU that can speed up our code. At another level, there is the parallelism gained by having multiple systems working on a computational problem in a distributed fashion. Parallelism inside the CPU, or among multiple CPUs in the same system, is known as fine-grained; the parallelism of a collection of separate systems acting in concert is known as coarse-grained.

CPU-Level Parallelism
A computer's CPU is commonly pictured as a device that operates on one instruction after another in a straight line, always completing one step or instruction before a new one is started. But modern CPU architectures have an inherent ability to do more than one thing at once: the logic of the CPU chip divides the CPU into multiple execution units, and systems with multiple execution units allow the CPU to attempt to process more than one instruction at a time. Two hardware features of modern CPUs support multiple execution units: the cache, a small memory inside the CPU, and the pipeline, a small area of memory inside the CPU where instructions that are next in line to be executed are stored. Both the cache and the pipeline allow impressive increases in CPU performance.

System-Level Parallelism
It is the parallelism of multiple nodes coordinating to work on a problem in parallel that gives a cluster its power, and there are other levels at which even more parallelism can be introduced. For example, if each node in the cluster is a multi-CPU system, a fundamental degree of parallel processing is introduced at the node level. Having more than one network interface on each node introduces communication channels that may be used in parallel to communicate with other nodes in the cluster. Finally, using multiple disk drive controllers in each node creates parallel data paths that can be used to increase the performance of the I/O subsystem.
Software Parallelism
Software parallelism is the ability to find well-defined areas in a problem we want to solve that can be broken down into self-contained parts. These parts are the program elements that can be distributed and give us the speedup that we want out of a high-performance computing system. Before we can run a program on a parallel cluster, we have to ensure that the problem we are trying to solve is amenable to being done in a parallel fashion. Almost any problem that is composed of smaller subproblems that can be quantified can be broken down into smaller problems and run on a node of a cluster.

System-Level Middleware
System-level middleware offers a Single System Image (SSI) and high-availability infrastructure for processes, memory, storage, I/O, and networking. The single system image illusion can be implemented using hardware or software infrastructure; this unit focuses on SSI at the operating system or subsystem level. A modular architecture for SSI allows services provided by lower-level layers to be used in the implementation of higher-level services. This unit discusses design issues, architecture, and representative systems for job/resource management, network RAM, software RAID, single I/O space, and virtual networking. A number of operating systems have proposed SSI solutions, including MOSIX, Unixware, and Solaris-MC. It is important to discuss one or more such systems, as they help students to understand architecture and implementation issues.

Message Passing Primitives
Although new high-performance protocols are available for cluster computing, some instructors may want to provide students with a brief introduction to message-passing programs using the BSD Sockets interface to the Transmission Control Protocol/Internet Protocol (TCP/IP) before introducing more complicated parallel programming with distributed memory programming tools. If students have already had a course in data communications or computer networks, this unit can be skipped. Students should have access to a networked computer lab with the Sockets libraries enabled; Sockets usually come installed on Linux workstations. A minimal sketch of such a program follows.
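The sketch below, in C, shows the server side of the simplest possible BSD Sockets message exchange. It is illustrative only, not code from the original report, and the port number 5000 is an arbitrary choice. The server accepts a single TCP connection and echoes back whatever bytes it receives; a matching client would call socket(), connect(), send(), and recv() against the server's address.

    /* echo_server.c -- a minimal sketch of a TCP/IP server using BSD Sockets. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <arpa/inet.h>

    int main(void)
    {
        /* create a TCP socket and bind it to port 5000 on all interfaces */
        int srv = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(5000);
        if (srv < 0 || bind(srv, (struct sockaddr *)&addr, sizeof addr) < 0) {
            perror("socket/bind");
            return 1;
        }
        listen(srv, 1);                        /* queue at most one client   */

        int conn = accept(srv, NULL, NULL);    /* block until a client connects */
        if (conn < 0) { perror("accept"); return 1; }

        char buf[256];
        ssize_t n;
        while ((n = recv(conn, buf, sizeof buf, 0)) > 0)
            send(conn, buf, (size_t)n, 0);     /* echo the bytes straight back */

        close(conn);
        close(srv);
        return 0;
    }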

Parallel Programming Using MPI
An introduction to distributed memory programming using a standard tool such as the Message Passing Interface (MPI) [23] is basic to cluster computing. Current versions of MPI generally assume that programs will be written in C, C++, or Fortran; however, Java-based versions of MPI are becoming available. A minimal MPI program is sketched below.
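The following sketch, assuming an installed MPI implementation such as MPICH or Open MPI, shows the message-passing model in its simplest form: every non-root process sends a greeting to rank 0, which receives and prints them. It is an illustration, not code from the report.

    /* mpi_hello.c -- a minimal sketch of an MPI message-passing program. */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);                  /* start the MPI runtime    */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's id        */
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of procs    */

        if (rank != 0) {
            /* every compute process sends one message to the master (rank 0) */
            char msg[64];
            snprintf(msg, sizeof msg, "greetings from rank %d of %d", rank, size);
            MPI_Send(msg, (int)strlen(msg) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        } else {
            /* rank 0 collects one message from every other process */
            char msg[64];
            for (int src = 1; src < size; src++) {
                MPI_Recv(msg, sizeof msg, MPI_CHAR, src, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                printf("%s\n", msg);
            }
        }
        MPI_Finalize();                          /* shut the runtime down    */
        return 0;
    }

On a Beowulf-style cluster this would typically be compiled and launched from the master node, for example with mpicc mpi_hello.c -o mpi_hello followed by mpirun -np 8 ./mpi_hello, which spawns the job on the desired number of compute nodes.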
Application-Level Middleware
Application-level middleware is the layer of software between the operating system and applications. Middleware provides the various services required by an application to function correctly. A course in cluster programming can include some coverage of middleware tools such as CORBA, Remote Procedure Call, Java Remote Method Invocation (RMI), or Jini. Sun Microsystems has produced a number of Java-based technologies that can become units in a cluster programming course, including the Java Development Kit (JDK) product family, which consists of the essential tools and APIs for all developers writing in the Java programming language, through to APIs for telephony (JTAPI), database connectivity (JDBC), 2D and 3D graphics, security, and electronic commerce. These technologies enable Java to interoperate with many other devices, technologies, and software standards.

Single System Image
A single system image (SSI) is the illusion, created by software or hardware, that presents a collection of resources as one more powerful resource. SSI makes the cluster appear like a single machine to the user, to applications, and to the network. A cluster without an SSI is not a cluster. Every SSI has a boundary, and SSI support can exist at different levels within a system, one able to be built on another.

Single System Image Benefits
- Provides a simple, straightforward view of all system resources and activities from any node of the cluster
- Frees the end user from having to know where an application will run
- Frees the operator from having to know where a resource is located
- Lets the user work with familiar interfaces and commands and allows administrators to manage the entire cluster as a single entity
- Reduces the risk of operator errors, with the result that end users see improved reliability and higher availability of the system
- Allows centralized or decentralized system management and control, avoiding the need for skilled administrators for routine system administration
- Presents multiple, cooperating components of an application to the administrator as a single application
- Greatly simplifies system management
- Provides location-independent message communication
- Helps track the locations of all resources, so that system operators no longer need to be concerned with their physical location
- Provides transparent process migration and load balancing across nodes, improving system response time and performance

High-Speed Networks
The network is the most critical part of a cluster. Its capabilities and performance directly influence the applicability of the whole system for HPC. Options range from Local/Wide Area Networks (LAN/WAN) like Fast Ethernet and ATM to System Area Networks (SAN) like Myrinet and Memory Channel. For example, Fast Ethernet provides 100 Mbps over UTP or fiber-optic cable and uses the CSMA/CD MAC protocol.

COMPONENTS OF A CLUSTER COMPUTER
1. Multiple high-performance computers: a. PCs; b. Workstations; c. SMPs (CLUMPS)
2. State-of-the-art operating systems: a. Linux (Beowulf); b. Microsoft NT (Illinois HPVM); c. Sun Solaris (Berkeley NOW); d. HP UX (Illinois PANDA); e. OS gluing layers (Berkeley Glunix)
3. High-performance networks/switches: a. Ethernet (10 Mbps); b. Fast Ethernet (100 Mbps); c. Gigabit Ethernet (1 Gbps); d. Myrinet (1.2 Gbps); e. Digital Memory Channel; f. FDDI
4. Network interface cards: e.g., Myrinet NICs with user-level access support
5. Fast communication protocols and services: a. Active Messages (Berkeley); b. Fast Messages (Illinois); c. U-net (Cornell); d. XTP (Virginia)
6. Cluster middleware: a. Single System Image (SSI); b. System Availability (SA) infrastructure
7. Hardware: a. DEC Memory Channel, DSM (Alewife, DASH), SMP techniques
8. Operating system kernel/gluing layers: a. Solaris MC, Unixware, GLUnix
9. Applications and subsystems: a. Applications (system management and electronic forms); b. Runtime systems (software DSM, PFS, etc.); c. Resource management and scheduling software (RMS)
10. Parallel programming environments and tools: a. Threads (PCs, SMPs, NOW); b. MPI; c. PVM; d. Software DSMs (Shmem); e. Compilers; f. RAD (rapid application development) tools; g. Debuggers; h. Performance analysis tools; i. Visualization tools
11. Applications: a. Sequential; b. Parallel/distributed (cluster-aware applications)

CLUSTER CLASSIFICATIONS
Clusters are classified into several categories based on factors such as: 1) application target; 2) node ownership; 3) node hardware; 4) node operating system; 5) node configuration.
Clusters based on application target fall into two classes: High Performance (HP) clusters and High Availability (HA) clusters.
Clusters based on node ownership fall into two classes: dedicated clusters and non-dedicated clusters.
Clusters based on node hardware fall into three classes: Clusters of PCs (CoPs), Clusters of Workstations (COWs), and Clusters of SMPs (CLUMPs).
Clusters based on node operating system include: Linux clusters (e.g., Beowulf), Solaris clusters (e.g., Berkeley NOW), Digital VMS clusters, HP-UX clusters, and Microsoft Wolfpack clusters.
Clusters based on node configuration fall into two classes: homogeneous clusters, in which all nodes have similar architectures and run the same OS, and heterogeneous clusters, in which nodes have different architectures and run different OSs.

ISSUES TO BE CONSIDERED
Cluster networking: If you are mixing hardware that has different networking technologies, there will be large differences in the speed with which data is accessed and with which individual nodes can communicate. If it is within your budget, make sure that all of the machines you want to include in your cluster have similar networking capabilities and, if at all possible, network adapters from the same manufacturer.
Cluster software: You will have to build versions of the clustering software for each kind of system you include in your cluster.
Programming: Code will have to be written to support the lowest common denominator for data types supported by the least powerful node in the cluster. With mixed machines, the more powerful machines will have capabilities that the less powerful machines cannot match.
Timing: This is the most problematic aspect of heterogeneous clusters. Since these machines have different performance profiles, code will execute at different rates on the different kinds of nodes. This can cause serious bottlenecks if a process on one node is waiting for the results of a calculation on a slower node. A second kind of heterogeneous cluster is made from different machines in the same architectural family: e.g., a collection of Intel boxes where the machines are different generations, or machines of the same generation from different manufacturers.
Network selection: There are a number of different kinds of network topologies, including buses, cubes of various degrees, and grids/meshes. These topologies are implemented by one or more network interface cards (NICs) installed in the head node and compute nodes of the cluster.
Speed selection: No matter what topology you choose for your cluster, you will want the fastest network your budget allows. Fortunately, the availability of high-speed computers has also forced the development of high-speed networking systems. Examples are 10 Mbit Ethernet, 100 Mbit Ethernet, gigabit networking, and channel bonding.

FUTURE TRENDS - GRID COMPUTING
As computer networks become cheaper and faster, a new computing paradigm, called the Grid, has evolved. The Grid is a large system of computing resources that performs tasks and provides users with a single point of access, commonly based on a World Wide Web interface, to these distributed resources. Users treat the Grid as a single computational resource. Resource management software, frequently referred to as middleware, accepts jobs submitted by users and schedules them for execution on appropriate systems in the Grid, based upon resource management policies. Users can submit thousands of jobs at a time without being concerned about where they run. The Grid may scale from single systems to supercomputer-class compute farms that utilize thousands of processors. Depending on the type of application, the interconnection between the Grid parts can be performed using dedicated high-speed networks or the Internet. By providing scalable, secure, high-performance mechanisms for discovering and negotiating access to remote resources, the Grid promises to make it possible for scientific collaborations to share resources on an unprecedented scale, and for geographically distributed groups to work together in ways that were previously impossible. Several examples of new applications that benefit from Grid technology are: the coupling of advanced scientific instrumentation or desktop computers with remote supercomputers; collaborative design of complex systems via high-bandwidth access to shared resources; ultra-large virtual supercomputers constructed to solve problems too large to fit on any single computer; and rapid, large-scale parametric studies. Grid technology is currently under intensive development. Major Grid projects include NASA's Information Power Grid, two NSF Grid projects (the NCSA Alliance's Virtual Machine Room and NPACI), the European DataGrid Project, and the ASCI Distributed Resource Management project. The first Grid tools are already available for developers; the Globus Toolkit [20] is one such example and includes a set of services and software libraries to support Grids and Grid applications.

CONCLUSION
Clusters are promising. They address the parallel processing paradox and offer incremental growth that matches funding patterns. New trends in hardware and software technologies are likely to make clusters even more promising and to fill the SSI gap. Cluster-based supercomputers, such as Linux-based clusters, can now be seen everywhere.

REFERENCES
http://www.buyya.com
http://www.beowulf.org
http://www.clustercomp.org
http://www.sgi.com
http://www.thu.edu.tw/~sci/journal/v4/000407.pdf
http://www.dgs.monash.edu.au/~rajkumar/cluster
http://www.cfi.lu.lv/teor/pdf/LASC_short.pdf
http://www.webopedia.com
http://www.howstuffworks.com
