* The V Distributed System *
Author: D.R. Cheriton

V was an early distributed system built around a microkernel-like kernel, IPC, threads (called processes in V), and multicast.

Design Philosophy:
1) Fast communication is the most critical facility.
2) Protocols, not software, define the system.
3) A small kernel can support the basic protocols, IPC, and address spaces, while other functionality can be deferred to user-level servers.
Though the paper never says "microkernel", that is what the design appears to be.

IPC - The kernel has facilities for read/write operations that look like RPC from the client's perspective. From the server's perspective, requests are either handled as messages, with only one message handled at a time and other requests queued, or like RPC with concurrent request handling (perhaps one request per thread or process). IPC is heavily optimized because it is the basis for almost all system services. For example, the system is optimized to handle the 32-byte fixed-length messages that constitute > 50% of the message traffic. The VMTP transport protocol is connectionless and optimized for RPC-like calls. Another optimization is pre-initialized VMTP headers: when a fixed-length message is sent, the data is transferred from registers directly into the header and placed on a network queue. Interesting fact: "it is faster to access a copy of the block at a remote server that has the data in its RAM cache than to read it from a local disk."

Multicast - Groups of processes can be named, and the processes in a group can be on different machines. A process can belong to many groups. IPC can be used to send a message to a group of processes, and also to receive multiple replies. Uses: implementation of the naming system, clock sync, distributed scheduling, replication.

Processes - "Processes" are what we would call threads. Their creation is separate from the creation of an address space, and many processes can share an address space.
The kernel schedules on a per-process basis, not a per-VM basis. Interesting implementation: there is only one kernel stack per processor, not one per process: "The simplicity of the kernel operations means that a process does not need to maintain state on a kernel stack when it blocks as part of kernel operation."

Memory Management - Address spaces are composed entirely of files (or other UIO objects) mapped into regions. Page faults are handled similarly to mmap'ed files in Unix; the file is used as the backing store.

I/O - V uses the system-level UIO interface, which is block oriented and stateful. Because UIO is a system-level interface, an I/O runtime library is layered on top of it to implement the abstraction that applications expect.

Naming - V uses a three-level naming system with 1) character-string names, 2) object identifiers, and 3) entity, process, and group identifiers.

Character-string names are like file names: they are persistent and unique. These names are assigned locally by the object managers for the objects being named. Each object manager registers itself in the name handling group with a globally unique prefix. Clients locate the appropriate object manager for a name by multicasting to the name handling group, and only the object manager responsible for that part of the name space responds. For the cases where a queried name is invalid or its object manager is unavailable, there is a global name service that holds a database of the portions of the global name space that actually correspond to object managers. Both positive and, in particular, negative replies are cached to avoid unnecessary multicasting.

Object identifiers are transient identifiers that are returned once a character-string name has been mapped to an object. The identifier consists of a manager-id and a local-object-id. The manager-id is an IPC identifier used to communicate with the object manager. The local-object-id specifies the object relative to that object manager.
Since IPC identifiers are used, the "object manager" may be a process group implementing a replicated object manager, and there are ways of maintaining availability in the face of some object manager failures.

Entity, process, and group identifiers are low-level, host-independent communication endpoints. They are globally unique across the system to allow for migration.

The paper also describes other services built into V, and some applications.

Cheriton concludes that:
1) IPC speed was critical to the utility of the system.
2) Protocols and interfaces define the system.
3) A distributed kernel can be used as the base for distributed systems: "In fact, some of our students are disappointed that there are not more distributed systems issues in the servers and the commands for V".
4) "We now conjecture that for every fast design, there exists an acceptably elegant design with comparable performance."
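
Sketch of the Naming scheme - To make the name lookup described under Naming more concrete, here is a minimal Python sketch of how a client might resolve a character-string name into an object identifier. It is not V's actual code or API. The class names, the bracketed prefixes, and the use of list positions as local-object-ids are all assumptions made for illustration. The "multicast" is simulated with a loop over an in-memory list of managers, and the global name service is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ObjectId:
    manager_id: int       # IPC identifier used to reach the object manager
    local_object_id: int  # identifies the object relative to that manager


class ObjectManager:
    """Hypothetical object manager owning one globally unique name prefix."""

    def __init__(self, manager_id: int, prefix: str, names: List[str]):
        self.manager_id = manager_id
        self.prefix = prefix
        # Assumption: a local-object-id is just the position in the list.
        self.local_ids = {name: i for i, name in enumerate(names)}

    def handle_query(self, name: str) -> Optional[ObjectId]:
        # Stay silent unless the name is under this manager's prefix
        # and refers to an object this manager actually holds.
        if not name.startswith(self.prefix):
            return None
        local = self.local_ids.get(name[len(self.prefix):])
        if local is None:
            return None
        return ObjectId(self.manager_id, local)


class Client:
    """Resolves names by querying the name handling group, with a cache."""

    def __init__(self, name_handling_group: List[ObjectManager]):
        self.group = name_handling_group
        # Holds positive replies (an ObjectId) and negative replies (None).
        self.cache: Dict[str, Optional[ObjectId]] = {}
        self.multicasts = 0  # counts simulated multicasts, for illustration

    def resolve(self, name: str) -> Optional[ObjectId]:
        if name in self.cache:
            return self.cache[name]  # cached reply, no multicast needed
        self.multicasts += 1
        reply = None
        for manager in self.group:  # this loop stands in for one multicast
            reply = manager.handle_query(name)
            if reply is not None:
                break
        self.cache[name] = reply  # cache negative replies as well
        return reply


if __name__ == "__main__":
    storage = ObjectManager(7, "[storage]", ["notes.txt", "paper.tex"])
    printer = ObjectManager(9, "[printer]", ["queue"])
    client = Client([storage, printer])
    print(client.resolve("[storage]paper.tex"))
    print(client.resolve("[storage]missing"))
    print(client.multicasts)
```

Repeating either lookup is answered from the cache, so the multicast counter stays the same. Caching the negative reply is what keeps a repeatedly mistyped name from triggering a new multicast each time.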