
Model Answers to Examination Papers for December 2008

School: Computing and Mathematical Sciences
Department: Computer Science
Level: Three
Title of Paper: Distributed Information Systems
Course Code: COMP1303

Model solutions are indicative of answers that would gain a good grade if clearly written with additional explanatory text. To gain a very good or excellent grade, a demonstration of a greater depth and breadth of knowledge is normally expected. In some cases alternative answers to a question may be accepted.

Model Answer for COMP1303 Distributed Information Systems December 2008 Page 1 of 11

1 a) Purpose of DNS - The main purpose of the DNS is to resolve human-friendly domain names (such as those found in URLs) into IP addresses (numerical, logical addresses).

Architecture of DNS - The Domain Name System (DNS) is a hierarchical system of name servers, each authoritative for one or more domains within the Domain Name Space. Name servers are organised with root servers at the top and servers lower down the hierarchy out to leaf (zone/domain) level. This delegation of responsibility results in a scalable and easy-to-modify distributed database.

[Diagram: the DNS hierarchical structure]


The following four types of server are used in the DNS:
- The Primary master holds its zone data in a stored database on the host that it runs on.
- The Secondary master gets its data from another master that is authoritative for the zone (i.e. a Primary master). It contacts an authoritative name server and pulls the zone data over, greatly reducing the administrative load.
- The Caching name server does not have a database of mappings between IP addresses and names at start-up. It knows of Primary and Secondary servers which can supply such information if required. Caching servers are used to reduce the load on Primary and Secondary servers.
- The Slave name server operates in a similar way to a Caching name server; however, it is less sophisticated than the other types of server and cannot follow redirections.
Queries can be either recursive or iterative. With an iterative query, the name server gives the best answer it already knows; this might mean referring the requester to a closer name server that it knows of, and there is no additional querying of other name servers. In contrast, with a recursive query, resolution is managed by a single name server, which must return the final answer. This may involve querying other name servers and following the referrals received, resulting in further queries being sent to other name servers.

[Diagram: recursive query resolution]


b) Two-Tier Architecture
The traditional client/server database architecture implements a two-tier approach. The application logic exists in either the user interface or within the database server, or both. The location of the partitioned processes gave rise to terms such as fat client and fat server, depending on where the bulk of the application processing sits. Most client/server systems implement the fat client approach. Regardless of the version of two-tier implemented, the system can face scalability, performance and flexibility problems. For larger systems it is better to adopt a three-tier or multi-tier approach.

Three-Tier Architecture
In response to the drawbacks of the two-tier network database architecture, a three-tier or multi-tier approach can be adopted. This approach uses an application server between the clients and the back-end database server. The application server contains the bulk of the application logic, with the remainder contained on the client. It accepts requests from the clients, obtains the desired data from the data sources, processes the results and sends them back to the client. There are many benefits to three-tier systems: they are more scalable and easier to control. The middle tier enables the system to handle more client connections, implement better security and provide easier maintenance. Connectivity to heterogeneous data sources is greatly simplified, as the required database drivers are contained at a single location, requiring fewer client licences. It is with three-tier and multi-tier architectures that the future of client/server database applications lies; true database independence is not possible without three-tier architectures. Any application example can be provided to illustrate the general discussion and to put the various points into context.
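The three-tier division of labour can be sketched with an in-memory SQLite database standing in for the back-end server; the table, column and account names are invented for illustration.

```python
import sqlite3

def setup_database() -> sqlite3.Connection:
    # Tier 3: the back-end database server (here: in-memory SQLite).
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
    db.execute("INSERT INTO accounts VALUES ('alice', 120), ('bob', 40)")
    return db

def app_server_overdrawn(db: sqlite3.Connection, limit: int) -> list:
    # Tier 2: the application server holds the logic and the database
    # driver; the client never builds SQL or talks to the database.
    rows = db.execute(
        "SELECT name FROM accounts WHERE balance < ?", (limit,)
    ).fetchall()
    return [name for (name,) in rows]

# Tier 1: a thin client simply calls the application server's interface.
db = setup_database()
print(app_server_overdrawn(db, 100))  # ['bob']
```

Swapping SQLite for another data source would change only tier 2, which is the database-independence point made above.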


c) Access transparency - Access to remote and local resources using the same operations; a VFS is a good example of access transparency within a DIS. It is important as it removes complexity: users and software have a single access method rather than potentially many different ones. It also improves scalability.

Location transparency - The location of resources can vary without the user needing to know the location in order to access the resource. This can be achieved using standard directory service techniques and is important in many ways; for example, a resource can change location without the need to inform all of its users.

Failure transparency - Failures are not visible to users but are hidden. This can be achieved through replicated services as well as other methods. It is important because failures in a DIS will likely be partial, and they can be made transparent with careful management.

Concurrency transparency - Users of a resource can access resources seemingly concurrently, while the resources themselves might maintain single write/read access at various points. This is achieved through read/write locking and exclusion; updates can be applied once locks are released. It is important because multiple accesses to resources will occur, and ideally processes should not be delayed while waiting for access to shared resources.

Implementation transparency - The implementation of a resource/object is not important to its functioning within a DIS. Instead its interfaces and behaviour are what matter, and are often all that a client requires. One method of achieving this is through the use of middleware and an IDL, whereby common access/location transparency is achieved and resources can be altered or swapped without the need to update user processes. It is important because it promotes flexibility, portability and scalability: processes can be altered without impacting on other parts of the DIS.


2 a) Clock drift makes synchronisation to a central clock problematic because communication takes an unknown amount of time. Clocks drift because there is inherent variability in the oscillations of the quartz crystals that provide the timing within computers. Thus two computers with identical clock settings would drift apart over time and become inconsistent. The synchronisation problem can be summarised as follows:
1. Client polls server for a new time value
2. Message is prepared (this takes time - measurable)
3. Remote request made (this takes time - measurable)
4. Message passes through networks (this takes time - not measurable)
5. Message is received (this takes time - measurable)
6. Message is responded to (this takes time - measurable)
7. Response is transmitted to client (this takes time - not measurable)
8. Response is received and processed (this takes time - measurable)
9. Client can update their time
Thus, a new time value is out of date once received, and furthermore we cannot tell by how much. Time is required to order events, and the same relative time is required to interleave transactions correctly. Incorrect timing can lead to incorrect application of operations and thus inconsistent states. Replicated services require consistent states, transactions, concurrency control, etc.


(b) i) A logical clock is a numerical value that increases monotonically with each event; this technique cannot be used to measure the time interval between events or the duration of events. A physical clock is a representation of real time (wall-clock time); this approach suffers from all of the issues of synchronisation, but it can be used to measure event duration (such as network round-trip time, or transaction duration). Below are some techniques that a student could show:

Cristian's method

[Diagram: the client sends "Time?" at T0 to the time server, which processes the request and replies "Time=T"; the reply arrives back at the client at T1.]

message_delay = (T1 - T0 - processing) / 2
new client time = T + message_delay
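The exchange can be sketched numerically; the clock readings below are invented for illustration.

```python
# Sketch of Cristian's method using the formulas above: the one-way
# message delay is estimated as half the round trip, less the server's
# processing time.
def cristian_sync(server_time, t0, t1, processing):
    message_delay = (t1 - t0 - processing) / 2
    return server_time + message_delay

# Client asks at t0 = 100.000 s and the reply arrives at t1 = 100.020 s;
# the server reports T = 250.000 s and spent 0.004 s processing.
print(cristian_sync(250.000, 100.000, 100.020, 0.004))  # ≈ 250.008
```

Note the estimate is only exact if the outbound and return delays happen to be equal, which is precisely the unmeasurable part of the problem described in part a).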

Berkeley method

[Diagram: the time server polls Client 1 and Client 2 with "Time?"; each replies with its own time (c1_time, c2_time) and the server returns an adjustment diff(i) to each client.]

Server: diff(i) = server_time - (ci_time + message_delay)
Client: ci_time = ci_time + diff(i)
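The adjustment exchange can be sketched directly from the two formulas above; the clock values are invented, and note that the full Berkeley algorithm actually has the server average all the polled values rather than impose its own time.

```python
# Server side: compute diff(i) = server_time - (ci_time + message_delay)
# for each polled client, per the simplified formulas above.
def berkeley_diffs(server_time, client_times, message_delay=0.0):
    return [server_time - (t + message_delay) for t in client_times]

# Client side: each client applies ci_time = ci_time + diff(i).
diffs = berkeley_diffs(10.0, [8.0, 11.0], message_delay=0.5)
print(diffs)  # [1.5, -1.5]
corrected = [t + d for t, d in zip([8.0, 11.0], diffs)]
print(corrected)  # [9.5, 9.5] -- both clients now agree
```

Sending each client a relative adjustment rather than an absolute time keeps the scheme insensitive to the reply's own transmission delay.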


Vector clock

[Diagram: three processes p1, p2, p3 shown against physical time. p1 has events a (1,0,0) and b (2,0,0) and sends m1 at b; p2 receives m1 at event c (2,1,0) and sends m2 at event d (2,2,0); p3 has event e (0,0,1) and receives m2 at event f (2,2,2).]

Timestamps are carried with messages, and the local vector clocks are updated with the new information.
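The update rules can be sketched for the three processes in the figure (indices 0, 1, 2 stand for p1, p2, p3):

```python
# Minimal vector clock sketch: tick own component on every local event;
# on receive, merge with elementwise max, then tick own component.
def local_event(clock, i):
    clock = clock.copy()
    clock[i] += 1
    return clock

def receive(clock, stamp, i):
    merged = [max(a, b) for a, b in zip(clock, stamp)]
    return local_event(merged, i)

p1 = local_event([0, 0, 0], 0)   # event a: (1,0,0)
p1 = local_event(p1, 0)          # event b: (2,0,0) -- m1 carries this stamp
p2 = receive([0, 0, 0], p1, 1)   # event c: (2,1,0)
print(p2)  # [2, 1, 0]
```

Comparing two such vectors componentwise recovers the happened-before relation, which a single Lamport counter cannot do.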

ii) Physical clocks are best employed where absolute timestamps are required (financial transactions, event logging, etc.), and for inter-event timing, activity-duration timing, performance measurement, etc. Logical clocks are best employed where event ordering is the only important consideration (such as in multicast/group communications, multi-entity transaction coordination, 2PC, etc.). An example for a physical clock could detail the timestamping of transactions in banking or stock trading. An example for a logical clock could involve the maintenance of event-order consistency when performing update transactions at replicated/distributed databases.


3 a) The concept of mutual exclusion mainly applies to concurrent programming, to avoid the simultaneous use of un-shareable resources by pieces of code called within critical sections. The concept is easy to understand so far as physical resources are concerned. A printer can print only one document at a time: in essence, for the duration of the printing, the printer grants exclusive rights to that document, and others must wait for it to finish before they can print. Critical sections of code are designed in a similar way. Consider the printer again: it is intuitive that while it is printing a page, it should not allow another user to interrupt and print onto the same page, as this would result in the page containing printing from multiple sources. A similar concept needs to apply to code and memory. If one thread is accessing and changing some data, and another thread then accesses the same data and manipulates it in some way, there is a danger that the data could become corrupt. A good example of this is the lost update in transactions, which is solved using exclusive locking. The key concept is that we have critical sections of code accessing shared data, and that the shared data must be protected so that other processes which read from or write to that data are excluded from running. Hence the name mutual exclusion.
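The lost-update example can be sketched with threads and a lock; the counter and thread counts are invented for illustration.

```python
import threading

counter = 0
lock = threading.Lock()

def deposit_many(times):
    global counter
    for _ in range(times):
        with lock:                 # critical section: mutual exclusion
            current = counter      # read shared data
            counter = current + 1  # modify and write it back

# Four threads each perform 10,000 read-modify-write updates.
threads = [threading.Thread(target=deposit_many, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000
```

Without the `with lock:` line, two threads could both read the same `current` value and one update would overwrite the other, which is exactly the lost update described above.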

b) The following functional requirements are important:
- The proportion of time that a master exists should be very high.
- A replacement must be found quickly if the current master is removed.
- Multi-master scenarios must be detected and resolved quickly.
- Spurious elections caused by falsely detecting master failure should be avoided.
- Normal-mode communication complexity must be low.
- Election-mode communication complexity must be low.
- Communications overhead (the mean total communication bandwidth required by the election algorithm) must be low.
With respect to non-functional requirements, the following are all important aspects for all election algorithms: scalability, robustness (fault tolerance), efficiency and low latency.
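One well-known algorithm aimed at these requirements is the bully algorithm; a minimal sketch (the process ids and liveness set are invented, and real implementations exchange ELECTION/COORDINATOR messages with timeouts rather than inspecting a shared set):

```python
# Bully-style election: a process noticing master failure challenges all
# higher-id processes; any live one takes over the election, so the
# highest-id live process ends up as the new master.
def bully_election(live, initiator):
    candidate = initiator
    while True:
        higher = {p for p in live if p > candidate}
        if not higher:
            return candidate      # nobody outranks us: announce COORDINATOR
        candidate = min(higher)   # a higher live process answers, takes over

print(bully_election({1, 3, 4, 6}, initiator=3))  # 6
```

Each takeover step corresponds to an election round, which is where the election-mode communication complexity discussed above comes from.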

c) Replication is useful but does introduce additional costs: for example, the cost of additional equipment, and of the configuration and management of the replication services and equipment. There is also a need for consistency control and thus additional communication (which also costs money as well as time). That said, there are many benefits to the use of replication: for example, enhanced availability through multiple servers, and better geographic dispersion (of users, of resources). It also facilitates the dispersion of processing workload (load sharing/distribution and parallel processing) and offers redundancy and thus better failure transparency.

Replication and consistency: the main problem with the use of replication is that it becomes harder to maintain consistency as the replication factor increases, and the communication cost of maintaining consistency also increases. Disconnected operation further complicates matters (invalidation policies, lazy update policies, etc.).


4 a) Kerberos keeps a database of its clients (Principals) and their private keys. Clients and services that require authentication must first register with Kerberos. The database holds a record for each principal containing: their name, private key, expiration date (of the principal) and administrative information. Other user information (real name, phone number, etc.) is stored within a Hesiod name server, so that sensitive information can be handled separately by Kerberos. An Authentication Server (AS) performs read-only operations on the Kerberos database. Its activities include the authentication of Principals and the generation of Session Keys. A Ticket Granting Server (TGS) is used by clients to obtain a ticket to access other servers. The client sends a request to the TGS containing the name of the server, a Ticket-Granting Ticket (TGT) and an Authenticator. The TGS checks the Authenticator and TGT and, if they are valid, generates a new random Session Key to be used between the Client and the new Server.

Operation of Kerberos

Mutual authentication - This refers to an authenticator containing a timestamp, encrypted in the shared client-server key. The authenticator is sent from C to S; S decrypts it, adds 1 to the timestamp, re-encrypts it and sends it back to C; C decrypts it and checks the value.

Ticket - A ticket contains the client and server ids, an expiry time, etc. The TGS creates tickets for the client to present to the server. The ticket is encrypted in the key of the server, so the client cannot read it but simply passes it on. It has a lifetime of typically 8 hours and identifies a sender/recipient pair. The encrypted ticket is thus {s,c,addr,timestamp,life,Ks,c}Ks.

Authenticator - The authenticator is used in mutual authentication, whereby the server increments the timestamp, re-encrypts it and returns it. An authenticator comprises the client name, the workstation's IP address and the current workstation time, all encrypted in the session key (from the ticket), i.e. {c,addr,timestamp}Ks,c. A new authenticator is generated by the client for each transaction, in the shared client-server key. Authenticator and ticket contents must match, thus proving the sender is the proper user of the accompanying ticket.

Prevention of replay - A combination of tickets, timestamps and one-use-only authenticators. The authenticator is used to prevent replay of a ticket. The client sends a timestamp to the server, encrypted under the server's key. The server decrypts the message and increments the timestamp, returning the message encrypted under the client's key. The client checks the increment to ensure the server was able to decrypt properly. Note that a nonce may be used in place of a timestamp.
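The timestamp-increment check can be sketched with a toy cipher; the XOR function below merely stands in for the real symmetric encryption Kerberos uses, and the key and timestamp values are invented.

```python
# Toy sketch of the mutual-authentication step: only a holder of the
# shared session key can decrypt the timestamp and return it incremented.
def enc(value, key):
    return value ^ key            # TOY cipher: XOR (stand-in only)

dec = enc                         # XOR is its own inverse

shared_key = 0b101101             # session key Ks,c from the ticket
t = 1_228_000_000                 # client's current timestamp

msg = enc(t, shared_key)                          # Client -> Server
reply = enc(dec(msg, shared_key) + 1, shared_key)  # Server -> Client
assert dec(reply, shared_key) == t + 1             # client checks increment
print("server proved knowledge of the shared key")
```

Because each authenticator carries a fresh timestamp and is accepted only once, an eavesdropper replaying `msg` later gains nothing.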

b) How it works - The digital signature method operates using public key encryption. For example, let us assume Alice wants to publish a document M in such a way that anyone can verify that it is from her. Alice computes a fixed-length digest of the document, Digest(M). Alice encrypts the digest with her private key, appends it to M and makes the resulting signed document (M, {Digest(M)}KApriv) available to the intended users. Bob obtains the signed document, extracts M and computes Digest(M). Bob uses Alice's public key to decrypt {Digest(M)}KApriv and compares it with his computed digest. If they match, Alice's signature is verified. The key points are the use of a digest function, the use of keys, and the use of RSA as a means for private key encryption. Secret keys can also be used; however, care must be taken in how the keys are distributed.

Problem it solves - This technique, or something similar, is an essential requirement for secure systems, as it is able to certify information: for example, to provide trustworthy statements binding users' identities to public keys, or binding access rights or roles to users' identities. An analogy is a banker's cheque, where a signature validates the cheque as authentic; the digital signature technique tries to ensure that information is authentic through the use of a signature. Digital signatures provide authentication (the message can be trusted) and integrity (the message can be checked to ensure it has not been altered).

Issues and problems - The distribution of keys in the RSA method needs to be secure. This can often be difficult, as it requires a chain of authoritative and trusted servers. There is potential for forgery if the secret (shared) key method is used. Also, non-repudiation is not guaranteed: a compromised key means compromised certificates. Finally, the method itself does not include timestamping, and thus messages could be replayed.
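The sign/verify steps above can be sketched with a SHA-256 digest and textbook RSA; the tiny primes are for illustration only (a real system would use a vetted library and keys of 2048 bits or more), and reducing the digest mod n is a concession to the toy modulus.

```python
import hashlib

# Textbook RSA key pair from tiny primes (illustration only).
p, q = 61, 53
n = p * q                 # modulus, 3233
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # Alice's public exponent
d = pow(e, -1, phi)       # Alice's private exponent (Python 3.8+)

def digest(msg):
    # Fixed-length digest, reduced mod n to fit the toy modulus.
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

def sign(msg):
    # Alice "encrypts" the digest with her PRIVATE key.
    return pow(digest(msg), d, n)

def verify(msg, signature):
    # Bob "decrypts" with Alice's PUBLIC key and compares digests.
    return pow(signature, e, n) == digest(msg)

doc = b"M: pay the bearer 100 pounds"
sig = sign(doc)
print(verify(doc, sig))   # True
```

Altering either the document or the signature makes the two digests disagree (except with negligible probability), which is the integrity guarantee described above.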

