More history
Early focus was on RPC environments, culminating in DCE (Distributed Computing Environment), which standardized many aspects of RPC
Then emphasis shifted to performance; many systems improved by a factor of 10 to 20
Today, RPC is often used in object-oriented systems employing CORBA or COM standards
and now SOAP & web services
Reliability & correctness issues are more evident than in the past.
Compilation stage
Server defines and exports a header file giving the interfaces it supports and the arguments expected, using an interface definition language (IDL)
Client includes this information and invokes server procedures through stubs
The stub provides an interface identical to the server version
It is responsible for building the request messages and interpreting the reply messages (i.e., for packing and unpacking)
It passes arguments by value, never by reference
It may limit the total size of arguments, in bytes
Binding stage
Occurs when client and server programs first start execution
Server registers its network address with a name directory, perhaps with other information
Client scans the directory to find an appropriate server
Depending on how the RPC protocol is implemented, the client may make a connection to the server, but this is not mandatory
Data in messages
We say that data is marshalled into a message and demarshalled from it
The representation needs to deal with byte-ordering issues (big-endian versus little-endian), strings (some CPUs require padding), alignment, etc.
The message must encode enough information for the other side to understand it, e.g., how to find the start of an object in the message
Both sides must agree in advance on the packed representation to be used (typically a standard representation)
Goal is to be as fast as possible on the most common architectures, yet also very general
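The byte-ordering concern above can be sketched in a few lines. This is a hypothetical illustration, not the code of any real stub compiler: it marshals a 32-bit integer into a buffer in big-endian ("network") order regardless of the host's own ordering, which is what XDR-style representations do for every field of every argument.

```c
#include <stdint.h>

/* Sketch: marshal a 32-bit value into a message buffer in big-endian
 * ("network") byte order, independent of the host's own byte order. */
static void marshal_u32(uint8_t *buf, uint32_t v) {
    buf[0] = (uint8_t)(v >> 24);   /* most significant byte first */
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)(v);
}

/* Sketch: demarshal it on the receiving side, again byte by byte. */
static uint32_t demarshal_u32(const uint8_t *buf) {
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

Because both sides agree on this packed representation in advance, a big-endian and a little-endian machine can exchange the same message.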
Request marshalling
Client (stub) builds a message containing the arguments and indicates what procedure to invoke
Due to the need for generality, data representation is a potentially costly issue!
Performs a send I/O operation to send the message
Performs a receive I/O operation to accept the reply
Unpacks the reply from the reply message
Returns the result to the client program
Client procedure calls the client stub in the normal way
Client stub builds a message, calls the local OS
Client's OS sends the message to the remote OS
Remote OS gives the message to the server stub
Server stub unpacks the parameters, calls the server
Server does the work, returns the result to the stub
Server stub packs it in a message, calls its local OS
Server's OS sends the message to the client's OS
Client's OS gives the message to the client stub
Stub unpacks the result, returns to the client
Client/server programs do not know what the stubs are doing; they think they are calling a library function declared in the header file
Need an RPC specification file (square.x): defines the procedure name, arguments & results
Run rpcgen square.x: generates square.h, square_clnt.c, square_xdr.c, square_svc.c
square_clnt.c & square_svc.c: stub routines for client & server
square_xdr.c: XDR (External Data Representation) code - takes care of data type conversions
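A square.x file along these lines is what rpcgen consumes; the structure names and the program number here are illustrative placeholders, not values mandated by the source.

```
/* square.x: hypothetical RPC specification for a squaring service */
struct square_in  { long arg1; };
struct square_out { long res1; };

program SQUARE_PROG {
    version SQUARE_VERS {
        square_out SQUAREPROC(square_in) = 1;   /* procedure # */
    } = 1;                                      /* version # */
} = 0x31230000;                                 /* program # */
```

From this one file, rpcgen emits the header, the client and server stubs, and the XDR conversion routines listed above.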
[Figure: a procedure call and the corresponding request message, tagged with the procedure #]
*Schroeder, M. and Burrows, M., Performance of Firefly RPC, 12th ACM Symposium on Operating Systems Principles, December 1989.
Typical optimizations?
Compile the stub inline to put arguments directly into the message
Keep two versions of the stub; if (at bind time) sender and destination are found to have the same data representation, use the host-specific representation
Use a special send-then-receive system call (requires an O/S extension)
Optimize the O/S kernel path itself to eliminate copying: treat RPC as the most important task the kernel will do
What about complex structures, pointers, big arrays? These will be very costly, and perhaps impractical, to pass as arguments
Most implementations limit the size and types of RPC arguments; very general systems are less limited but much more costly
Big messages
[Figure: the client sends a big request as a burst of packets to the server]
RPC semantics
Normally: the client sends a request, the server executes it and returns a response
What happens when the server or network fails? Does the client hang forever waiting for a reply?
Models a regular procedure call
RPC semantics
At most once: request is processed 0 or 1 times
Exactly once: request is always processed exactly 1 time
At least once: request is processed 1 or more times
... but exactly once is impossible because we can't distinguish packet loss from true failures! (Did the network fail to deliver the message, or did the server crash before sending the ack that my Paris ticket purchase was satisfied?) In both cases, the RPC protocol simply times out.
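"At most once" is usually implemented by having the server remember which request IDs it has already executed and replay the cached reply on a retransmission. The following is a minimal sketch under assumed names (handle_request, do_work, a tiny fixed table); it is not taken from any particular RPC implementation.

```c
#include <stdint.h>

/* Sketch of at-most-once on the server side: a small table of request IDs
 * already executed, with their cached replies. A retransmitted request
 * (same non-zero ID) is answered from the cache, never re-executed. */
#define TABLE_SIZE 64

static uint32_t seen_id[TABLE_SIZE];     /* 0 means "empty slot" */
static int32_t  cached_reply[TABLE_SIZE];
static int      executions;              /* counts real executions (demo only) */

static int32_t do_work(int32_t arg) {    /* the actual remote procedure */
    executions++;
    return arg * arg;
}

int32_t handle_request(uint32_t id, int32_t arg) {
    int slot = id % TABLE_SIZE;
    if (seen_id[slot] != id) {           /* first time we see this ID */
        cached_reply[slot] = do_work(arg);
        seen_id[slot] = id;
    }
    return cached_reply[slot];           /* duplicate: replay cached reply */
}
```

Note what this does not solve: if the reply itself is lost and the server then crashes, the client still cannot tell whether the work happened, which is exactly the impossibility the slide describes.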
Costs may be very high ... so RPC is actually not very transparent!
RPC example
LRPC
O/S and dest initially are idle
LRPC
Control passes directly to dest
Performance work is producing enormous gains: from the old 75 ms RPC to modern-day RPC, research has led to a factor-of-1000 performance improvement!
Multithreading debate
Three major options:
Single-threaded server: only does one thing at a time, uses send/recv system calls and blocks while waiting
Non-blocking single-threaded server with bookkeeping: on an I/O for a request, saves state in pending activities table, starts on another request
Multi-threaded server: internally concurrent; each request spawns a new thread to handle it
Upcalls: an event dispatch loop does a procedure call for each incoming event, as in X11 or PCs running Windows
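The middle option, the non-blocking server with bookkeeping, can be sketched as a pending-activities table. Everything here (struct pending, start_request, io_complete) is a hypothetical illustration of the idea of saving per-request state instead of blocking, not an API from any real server.

```c
/* Sketch: a single-threaded server that never blocks on I/O. Each
 * in-progress request parks its saved state in a pending-activities
 * table and is resumed when its I/O completes. */
enum state { FREE, WAITING_IO, DONE };

struct pending {
    enum state st;
    int request_id;
    int partial_result;   /* whatever state must be saved to resume later */
};

#define MAX_PENDING 16
static struct pending table[MAX_PENDING];

/* Start a request: save its state, "issue" the non-blocking I/O,
 * and return a slot handle so the server can move on to other work. */
int start_request(int request_id, int input) {
    for (int i = 0; i < MAX_PENDING; i++) {
        if (table[i].st == FREE) {
            table[i].st = WAITING_IO;
            table[i].request_id = request_id;
            table[i].partial_result = input;   /* saved, not yet computed */
            return i;
        }
    }
    return -1;                                 /* table full */
}

/* I/O completion: look up the saved state and finish the request. */
int io_complete(int slot, int io_result) {
    table[slot].partial_result += io_result;   /* resume where we left off */
    table[slot].st = DONE;
    return table[slot].partial_result;
}
```

The bookkeeping burden is the cost: the programmer, not the thread system, must explicitly save and restore every piece of per-request state.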
Multithreading
The idea is to support internal concurrency, as if each process were really multiple processes that share one address space
The thread scheduler uses timer interrupts and context switching to mimic a physical multiprocessor using the smaller number of CPUs actually available
Multithreaded RPC
Each incoming request is handled by spawning a new thread
The designer must implement appropriate mutual exclusion to guard against race conditions and other concurrency problems
Ideally, the server is more active because it can process new requests while waiting for its own RPCs to complete on other pending requests
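A thread-per-request server with the required mutual exclusion can be sketched with POSIX threads; the function names and the fixed thread limit here are illustrative assumptions, not part of the source.

```c
#include <pthread.h>
#include <stddef.h>

/* Sketch: one thread per incoming request, with a mutex guarding the
 * shared server state that the handlers update. Without the lock, the
 * increment below would be a classic race condition. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long requests_served = 0;

static void *handle_one_request(void *arg) {
    (void)arg;                       /* a real server would unpack a request */
    pthread_mutex_lock(&lock);       /* critical section: shared state */
    requests_served++;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Spawn one thread per request, then wait for all of them to finish. */
long serve_requests(int n) {
    pthread_t tid[64];
    if (n > 64) n = 64;              /* illustrative fixed limit */
    for (int i = 0; i < n; i++)
        pthread_create(&tid[i], NULL, handle_one_request, NULL);
    for (int i = 0; i < n; i++)
        pthread_join(tid[i], NULL);
    return requests_served;
}
```

The per-thread stacks and the joins hint at the costs discussed next: memory consumption and the difficulty of reasoning about interleavings.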
Negatives to multithreading
Users may have little experience with concurrency and will then make mistakes
Concurrency bugs are very hard to find due to non-reproducible scheduling orders
Reentrance & ordering issues can come as an undesired surprise: different data update behaviors
Threads need stacks, hence memory consumption can be very high
Deadlock remains a risk, now associated with concurrency control
Stacks for threads must be finite and can overflow, corrupting the address space
Eventually, the application becomes bloated and begins to thrash; performance drops, and clients may think the server has failed
Upcall model
Common in windowing systems
Each incoming event is encoded as a small descriptive data structure
The user registers event-handling procedures
The dispatch loop calls the procedures as new events arrive, waits for each call to finish, then dispatches the next event
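The upcall model above can be sketched as a handler table plus a dispatch loop. All names here (struct event, register_handler, dispatch_all, the demo on_click handler) are hypothetical; the point is the shape: one handler runs to completion before the next event is taken.

```c
#include <stddef.h>

/* Sketch of the upcall model: events are small descriptive structures,
 * handlers are registered per event type, and the dispatch loop upcalls
 * one handler at a time, serially. */
struct event {
    int type;       /* selects which registered handler to upcall */
    int payload;    /* small descriptive data */
};

typedef void (*handler_fn)(const struct event *);

#define MAX_TYPES 8
static handler_fn handlers[MAX_TYPES];

void register_handler(int type, handler_fn fn) {
    if (type >= 0 && type < MAX_TYPES)
        handlers[type] = fn;
}

/* Dispatch loop over a queue of already-arrived events: each upcall
 * returns before the next event is dispatched. */
void dispatch_all(const struct event *queue, int n) {
    for (int i = 0; i < n; i++) {
        handler_fn fn = handlers[queue[i].type];
        if (fn != NULL)
            fn(&queue[i]);
    }
}

/* Demo handler and the state it updates, for illustration only. */
static int clicks;
static void on_click(const struct event *e) { clicks += e->payload; }
```

Because dispatch is serial, the handlers need no locking, which is exactly the trade the upcall model makes against the multithreaded server.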