Threads and Input/Output in the Sythesis Kernel
MassalinPu
Principle of Frugality: use the least powerful solution to a
given problem. Kernel code synthesis
and reduced synchronization are examples. "Simplification of the normal
case."
Kernel code synthesis: generate specialized (short and fast)
kernel routines for specific situations
ex. don't save
floating point registers on a context switch if the currently running thread
doesn't do any floating
point operations.
synthesize the context switch code at run-time appropriately for
efficiency.
Three methods:
1) Factoring invariants: bypasses redundant computations
2) Collapsing Layers: eliminates unnecessary procedure calls
and context switches
3) Executable Data Structures: shortens data structure
traversal time
Synthesis Kernel Components:
Quabjects (Objects): procedures + data.
ex. threads, i/o device servers
Quabjects composed of smaller building blocks-- monitors,
queues, schedulers. Some unusual
building blocks-- switches (direct interrupts to appropriate service routines),
pump (copies input to output), schedulers (use guages to collect data for
scheduling decisions).
Several types of queues: synchronous (blocks on full or
empty) or asynchronus (signals on full or empty). For each type, two implementations: dedicated or optimistic
depending on whether there are one/many producers/consumers.
Reduced Synchronization
Three methods:
1) Code Isolation - synthesize short code to manipulate data
structures without synchronization. ex. each thread modifies its own thread
table entry (TTE) without having to lock the thread table.
2) Procedure Chaining - serialize the executing of
conflicting threads (threads that want to access the same resources) instead of
having them try to run concurrently.
3) Optimistic Synchronization - assume interference between
threads is rare; make changes w/o locking, and then rollback if there was
interference
Optimistic Queues.
single/multiple producer, single/multiple consumer.
** WORTH LOOKING AT THE PAPER TO CHECK OUT THE CODE FOR
THESE! *
Pretty cool queue structures-- single producer single consumer can be done with no
synchronization primitives at all-- producer only updates head until after
elements have been inserted in queue, which prevents consumer from seeing them
too early.
multiple producer single consumer -- multiple producers use
compare and swap to determine which one gets to increment head first, and
consumer is protected from reading too early by have a separate bit-flag array
to let consumer know when an element has been inserted. great performance too-- 11 instructions if
compare and swap is successful, and 20 instructions if compare and swap needs
to be retried once.
Threads: abstraction of CPU
Kernel synthesized code goes into protected area to avoid
tampering.
Kernel code is run by the currently executing thread; not in
a thread in a separate kernel process.
Such a thread is said to be running in kernel mode. Once a thread switches to this mode, kernel
address space becomes accessible.
Each OS resource has a queue of threads that are waiting to access it. These threads are chained together, so no
explicit schedule or synchronization mechanism is necessary. Synchronization and priorities are
implemented by modifiying a threads position in the queue.
Context Switching:
1) only switch the part of the context being used (don't
save FP regs if they are not being used; saves 50% of context switch teim)
2) use executable data structures to minimize critical path
(ready to run threads are chained in an "executable" circular
queue). Jmp instr in context switch out
proc points to context switch in proc of next thread. Two different entry points for switch in proc-- one if address
space needs to change (thread in different process), and another one is no
address space change needs to take place (thread in same process).
Context switch only takes 11 microseconds!
Each thread has its own interrupt handling routings and
system calls synthesized code.
I/O Device Servers: abstraction of physical I/O devices