Threads and Input/Output in the Synthesis Kernel

Massalin and Pu

 

Principle of Frugality: use the least powerful solution to a given problem.  Kernel code synthesis and reduced synchronization are examples. "Simplification of the normal case."

 

Kernel code synthesis: generate specialized (short and fast) kernel routines for specific situations

ex.  don't save floating-point registers on a context switch if the currently running thread doesn't do any floating-point operations.  Synthesize the context-switch code at run time so each switch does only the work that thread actually needs.
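
A minimal C sketch of this idea (hypothetical names throughout): real Synthesis generates machine code at run time, while this approximation just pre-selects one of two hand-written save routines when the thread is created, so the "does this thread use floating point?" question is answered once instead of on every switch.

/* Sketch only, not Synthesis code: approximate kernel code synthesis by
 * choosing a specialized context-switch routine per thread at creation time.
 * All names (cpu_context, thread, switch_out_*) are hypothetical. */
#include <string.h>

struct cpu_context { long regs[16]; double fp_regs[8]; };

struct thread {
    struct cpu_context ctx;
    void (*switch_out)(struct thread *);   /* specialized save routine */
};

static void save_int_regs(struct thread *t) { (void)t; /* save integer registers */ }
static void save_fp_regs(struct thread *t)  { (void)t; /* save floating-point registers */ }

static void switch_out_no_fp(struct thread *t) { save_int_regs(t); }
static void switch_out_fp(struct thread *t)    { save_int_regs(t); save_fp_regs(t); }

/* The "synthesis" step: the invariant "does this thread use FP?" is factored
 * out here, so the per-switch path never tests it. */
void thread_create(struct thread *t, int uses_fp) {
    memset(&t->ctx, 0, sizeof t->ctx);
    t->switch_out = uses_fp ? switch_out_fp : switch_out_no_fp;
}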

 

Three methods:

1) Factoring invariants: bypasses redundant computations

2) Collapsing Layers: eliminates unnecessary procedure calls and context switches (see the sketch after this list)

3) Executable Data Structures: shortens data structure traversal time
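
A rough illustration of collapsing layers, in C with hypothetical names rather than synthesized machine code: the layered path pays one procedure call per layer, while the collapsed routine, which Synthesis would generate once the stream is known to be bound to one particular device, does the same work in a single flat procedure.

#include <stdio.h>

/* Generic, layered output path: one procedure call per layer (hypothetical). */
static void uart_put(int c)            { putchar(c); }     /* device driver layer  */
static void tty_write(int c)           { uart_put(c); }    /* terminal layer       */
static void stream_putc_layered(int c) { tty_write(c); }   /* user-visible stream  */

/* "Collapsed" path: once the stream is known to be bound to this one device,
 * the layers above can be folded into a single flat routine. */
static void stream_putc_collapsed(int c) { putchar(c); }

int main(void) {
    stream_putc_layered('a');
    stream_putc_collapsed('b');
    putchar('\n');
    return 0;
}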

 

Synthesis Kernel Components:
Quajects (objects): procedures + data.  ex. threads, I/O device servers

Quajects are composed of smaller building blocks-- monitors, queues, schedulers.  Some unusual building blocks-- switches (direct interrupts to the appropriate service routines), pumps (copy input to output), and schedulers that use gauges to collect data for scheduling decisions.
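
For a feel of "procedures + data", here is a minimal C sketch of a pump building block, with hypothetical names and with callbacks standing in for the code that Synthesis would synthesize for the specific input and output being connected.

/* Sketch only: a "pump" as a small bundle of procedures + data that copies
 * whatever is available from its input side to its output side. */
struct pump {
    int  (*get)(void *src);           /* returns the next byte, or -1 when empty */
    void (*put)(void *dst, int byte); /* delivers one byte to the output side    */
    void *src, *dst;                  /* the queues/devices being connected      */
};

/* Copy everything currently available from input to output. */
void pump_run(struct pump *p) {
    int b;
    while ((b = p->get(p->src)) != -1)
        p->put(p->dst, b);
}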

 

Several types of queues: synchronous (blocks when full or empty) or asynchronous (signals when full or empty).  For each type there are two implementations, dedicated or optimistic, depending on whether there is one producer/consumer or many.

 

Reduced Synchronization

 

Three methods:

1) Code Isolation - synthesize short code that manipulates data structures without synchronization. ex. each thread modifies its own thread table entry (TTE) without having to lock the thread table (see the sketch after this list).

2) Procedure Chaining - serialize the execution of conflicting threads (threads that want to access the same resources) instead of having them try to run concurrently. 

3) Optimistic Synchronization - assume interference between threads is rare; make changes without locking, then roll back if there was interference
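
A minimal sketch of code isolation (hypothetical names): the routines synthesized for a thread are handed a pointer to that thread's own thread table entry and only ever write through it, so these updates need no thread-table lock.

/* Sketch only: each slot of the shared thread table has exactly one writer,
 * the thread that owns it, so its updates need no locking. */
enum thread_state { RUNNING, BLOCKED, READY };

struct tte {                 /* one slot of the global thread table */
    enum thread_state state;
    long cpu_time;
};

struct tte thread_table[64]; /* shared table, but each slot has a single writer */

/* Called only by the thread that owns `self` -- no lock required. */
void charge_cpu_time(struct tte *self, long ticks) {
    self->cpu_time += ticks;
}

void mark_blocked(struct tte *self) {
    self->state = BLOCKED;
}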

 

Optimistic Queues: single/multiple producer, single/multiple consumer. 

** WORTH LOOKING AT THE PAPER TO CHECK OUT THE CODE FOR THESE! **

Pretty cool queue structures-- single producer/single consumer can be done with no synchronization primitives at all: the producer doesn't update the head index until after the element has been inserted in the queue, which prevents the consumer from seeing it too early.
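
A minimal C sketch of the single-producer/single-consumer case, with hypothetical names; the paper's version is hand-written assembly, and this one uses C11 atomics for the ordering the original got from the hardware. Following the convention above, the producer advances head and the consumer advances tail, and the element is stored before head moves.

/* Sketch of the single-producer/single-consumer queue (hypothetical names).
 * The structure is assumed to be zero-initialized before use. */
#include <stdatomic.h>
#include <stdbool.h>

#define QSIZE 64                      /* power of two */

struct spsc_queue {
    int elem[QSIZE];
    atomic_uint head;                 /* next free slot; written only by the producer  */
    atomic_uint tail;                 /* next slot to read; written only by the consumer */
};

bool spsc_put(struct spsc_queue *q, int v) {
    unsigned h = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned t = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (h - t == QSIZE)               /* full */
        return false;
    q->elem[h % QSIZE] = v;           /* insert the element first ...          */
    atomic_store_explicit(&q->head, h + 1, memory_order_release);
    return true;                      /* ... and only then publish it via head */
}

bool spsc_get(struct spsc_queue *q, int *v) {
    unsigned t = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned h = atomic_load_explicit(&q->head, memory_order_acquire);
    if (h == t)                       /* empty */
        return false;
    *v = q->elem[t % QSIZE];
    atomic_store_explicit(&q->tail, t + 1, memory_order_release);
    return true;
}

Each index has exactly one writer, which is why no locks or compare-and-swap operations are needed at all.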

Multiple producer/single consumer-- the producers use compare-and-swap to decide which one gets to increment the head first, and the consumer is protected from reading too early by a separate bit-flag array that tells it when an element has actually been inserted.  Great performance too-- 11 instructions if the compare-and-swap succeeds, and 20 instructions if it has to be retried once.
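
And a sketch of the multiple-producer/single-consumer scheme just described, again with hypothetical names and C11 atomics rather than the paper's assembly (so the 11/20 instruction counts do not carry over): producers race with compare-and-swap to claim a slot, and a per-slot flag array keeps the consumer from reading a slot that has been claimed but not yet filled.

/* Sketch of the multiple-producer/single-consumer queue (hypothetical names).
 * The structure is assumed to be zero-initialized before use. */
#include <stdatomic.h>
#include <stdbool.h>

#define QSIZE 64                      /* power of two */

struct mpsc_queue {
    int         elem[QSIZE];
    atomic_bool full[QSIZE];          /* per-slot flag: the slot holds valid data   */
    atomic_uint head;                 /* claimed by producers via compare-and-swap  */
    atomic_uint tail;                 /* advanced only by the single consumer       */
};

bool mpsc_put(struct mpsc_queue *q, int v) {
    unsigned h, t;
    do {
        t = atomic_load_explicit(&q->tail, memory_order_acquire);
        h = atomic_load_explicit(&q->head, memory_order_relaxed);
        if (h - t >= QSIZE)           /* conservative full test */
            return false;
    } while (!atomic_compare_exchange_weak_explicit(&q->head, &h, h + 1,
                  memory_order_acq_rel, memory_order_relaxed));
    q->elem[h % QSIZE] = v;           /* slot h now belongs to this producer alone */
    atomic_store_explicit(&q->full[h % QSIZE], true, memory_order_release);
    return true;
}

bool mpsc_get(struct mpsc_queue *q, int *v) {
    unsigned t = atomic_load_explicit(&q->tail, memory_order_relaxed);
    if (!atomic_load_explicit(&q->full[t % QSIZE], memory_order_acquire))
        return false;                 /* empty, or a slot was claimed but not yet filled */
    *v = q->elem[t % QSIZE];
    atomic_store_explicit(&q->full[t % QSIZE], false, memory_order_relaxed);
    atomic_store_explicit(&q->tail, t + 1, memory_order_release);
    return true;
}

The full test here is conservative (it may report full while the consumer is still draining), which keeps the sketch safe without trying to reproduce the paper's exact logic.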

 

Threads: abstraction of CPU

 

Kernel-synthesized code goes into a protected area to prevent tampering.

 

Kernel code is run by the currently executing thread, not by a thread in a separate kernel process.  Such a thread is said to be running in kernel mode.  Once a thread switches to this mode, the kernel address space becomes accessible. 


Each OS resource has a queue of threads that are waiting to access it.  These threads are chained together, so no explicit scheduler or synchronization mechanism is necessary.  Synchronization and priorities are implemented by modifying a thread's position in the queue.
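
A minimal sketch of the per-resource chain (hypothetical names, and ignoring how the chain itself is updated safely, which Synthesis covers with the reduced-synchronization techniques above): the front of the chain owns the resource, releasing it hands off to the next link, and priority is nothing more than where a thread gets linked in.

/* Sketch only: a resource's waiters form a chain; chain position is the policy. */
struct waiter {
    struct waiter *next;               /* next thread waiting for this resource */
    int            priority;           /* larger value = runs sooner            */
};

struct resource {
    struct waiter *chain;              /* front of the chain currently owns the resource */
};

/* Insert by priority; position in the chain is the scheduling decision. */
void resource_enqueue(struct resource *r, struct waiter *w) {
    struct waiter **pp = &r->chain;
    while (*pp && (*pp)->priority >= w->priority)
        pp = &(*pp)->next;
    w->next = *pp;
    *pp = w;
}

/* Release: drop the current owner and hand the resource to the next waiter. */
struct waiter *resource_release(struct resource *r) {
    if (r->chain)
        r->chain = r->chain->next;
    return r->chain;                   /* new owner, or null if nobody is waiting */
}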

 

Context Switching:

1) only switch the part of the context being used (don't save FP regs if they are not being used; saves 50% of context switch time)

2) use executable data structures to minimize the critical path (ready-to-run threads are chained in an "executable" circular queue).  A jmp instruction in the context-switch-out procedure points to the context-switch-in procedure of the next thread.  There are two entry points for the switch-in procedure-- one if the address space needs to change (next thread is in a different process), and another if no address space change is needed (next thread is in the same process).
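
A C approximation of the executable ready queue, with hypothetical names: in Synthesis the switch-out code literally ends in a jmp whose target is patched to the right switch-in entry point of the next thread; this sketch models that patched jmp as a function pointer chosen when the ready queue is linked, so the switch itself performs no test.

/* Sketch only: the ready queue is circular and every entry carries the
 * "jmp target" that its switch-out path will use. */
struct ready_tte {
    struct ready_tte *next;                       /* circular ready queue            */
    void *address_space;                          /* MMU context of this thread      */
    void (*switch_in_next)(struct ready_tte *);   /* chosen entry point of next TTE  */
};

static void load_registers(struct ready_tte *t)     { (void)t; /* restore CPU regs   */ }
static void load_address_space(struct ready_tte *t) { (void)t; /* switch MMU context */ }

static void switch_in_same_as(struct ready_tte *t) { load_registers(t); }
static void switch_in_new_as(struct ready_tte *t)  { load_address_space(t); load_registers(t); }

/* Linking a thread into the ready queue is when the entry points are chosen. */
void ready_queue_link(struct ready_tte *prev, struct ready_tte *t) {
    struct ready_tte *nxt = prev->next;
    t->next = nxt;
    prev->next = t;
    prev->switch_in_next = (t->address_space == prev->address_space)
                             ? switch_in_same_as : switch_in_new_as;
    t->switch_in_next    = (nxt->address_space == t->address_space)
                             ? switch_in_same_as : switch_in_new_as;
}

/* Switch-out tail: save state, then "jmp" straight into the next thread. */
void switch_out(struct ready_tte *cur) {
    /* ... save cur's registers here ... */
    cur->switch_in_next(cur->next);
}

Re-deriving the pointer when the queue is relinked plays the role of Synthesis rewriting the jmp target; the real kernel gets the same effect without even an indirect call.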

 

Context switch only takes 11 microseconds!

 

Each thread has its own synthesized interrupt-handling routines and system-call code.

 

I/O Device Servers: abstraction of physical I/O devices