Programming language issues to discuss


Supporting container classes of multiple types of objects without
having to cast returned value on lookup
( Array -> Array<ObjectType> -> Array<anyobject> )

--

Garbage collection: inappropriate for many systems programming tasks
because memory allocation/deallocation is performance-critical and
should be handled by the application.  Alternatives:
application-provided reference counting; garbage collection as an
audit mechanism, turned off in deployed applications.

Reference counting is often put down as insufficient because it does
not handle circular references.  However, circular references often
(mostly?)  fall into two categories:

- intra-class, like a LinkedList arranged in a circular structure

- superobject--component, where the component has an explicit
  register/deregister interface, so the back pointer need not be
  reference counted

--

Pros and cons of C++: big pro is that the language provides low-level
access as well as high-level abstractions; big con is the large number
of features, leading to misuse

Could misuse be avoided by more appropriate default behavior, favoring
safety/simpler semantics over performance?

- references default to const
- member functions default to virtual
- destructors default to virtual
- no default copy constructors
- others?

--

Actors (from Agha, Concurrent Object-Oriented Programming, CACM, Sept.
1990)

Actor model of concurrent programming: the behavior of an object is a
function of incoming communications.  Actors communicate by message
asynchronous message passing.  Behaviors include sending messages,
creating other actors, and replacing an actor's current state or
functions with new state or functions.

Every object is an actor with its own thread of execution, responding
to incoming messages with behavior expressed as functions rather than
imperative procedures.

Claimed advantages of actor model:

- Behavior replacement is a better abstraction than explicit, von
Neumann-style assignment to state, because the latter is too closely
tied to the characteristics of one hardware architecture.

- The state changes expressed in a program are only those relevant to
the logic of an algorithm, and are thus easier to reason about than a
slew of assignments to variables.

- Assignments to variables are more complicated than functional
components.

The first argument is reasonable; however, in systems programming you
often need to specify state changes explicitly, and in reality
hardware architectures are rarely so different that algorithms need to
be rewritten when porting code from one to another.

The second argument is less convincing; with appropriate invariants
and procedural decomposition it should be easy to see what state
changes occur in each of an object's methods.

The third argument seems to be just a religious argument in favor of
functional programming.

The main arguments against actors have to be performance, performance
and performance.  Function-oriented, continuation-style programming
results in creating far more actors (objects) than would occur in
imperative programs; message passing is implicit, hidden beneath a
layer of abstraction; every single object, down to integers, needs its
own message lookup table, thread of execution, etc.; lack of static
typing complicates optimization and type-checking.

--

Obliq is a statically-scoped, dynamically-typed, interpreted language
designed to support distributed programming in the style of network
objects (like Java remote objects).

Its main selling point is the ability to migrate running objects--both
functions and data.  Static scoping ensures that free identifiers
retain bindings to the originating site rather than to the new site,
ensuring safety.

Dynamic typing is supposed to shrink the size of the interpreter and
"avoid problems of compatibility of types at multiple sites."  Objects
are prototype-based rather than class-based.  The former seem better
suited to dynamically-typed languages.

Aliasing is used to map the methods and data from one object into
another object.  This offers a convenient way to implement surrogate
or proxy objects.

Issues:

- how useful is object migration, as opposed to shipping code to a
remote site and starting it there (as in Java)?

- do the benefits of dynamic typing outweigh the disadvantages
(discovering errors later, thwarting efforts at optimization)?

--

References on practical functional programming

Long thread on comp.lang.functional:
http://groups.google.com/groups?selm=9ko8b3%241j26%241%40news.uar.net

John Hughes, Why Functional Programming Matters:
http://www.cs.chalmers.se/~rjmh/Papers/whyfp.html

Antony Courtney et al, Genuinely Functional User Interfaces:
http://www.apocalypse.org/pub/u/antony/pubs/genuinely-functional-guis.pdf

Haskell, a modern "purely functional language": www.haskell.org

Erlang, a semi-functional language from Ericsson: www.erlang.org

--

Continuation-passing style, generators, coroutines and microthreads

A continuation is a function representing "the rest of the program" as
of the current point.  Returning a continuation from a function
preserves its state and allows the caller to resume the function
rather than start from scratch.

The most prevalent example of the usefulness of continuation-passing
style is a producer-consumer construct where one routine is generating
values that are being processed by another routine.  The most natural
way to code this is with a pair of threads, passing values from one to
the other using a pipe, but this can be prohibitively expensive.  All
the alternatives involving simple procedure calls force the producer
or the consumer or both to jump through hoops in order to preserve and
restore their state, obfuscating the overall program logic.

Coroutines are a set of routines, each of which can return a a value
without forgetting its internal state, and can be later resumed at the
point where it left off.  Generators are a specific coroutine
construct, in which one routine generates a stream of values,
returning control to the caller after each one.  Both coroutines and
generators can be implemented easily using continuations.

From most general to most specific:

  continuations > coroutines > generators

Coroutines can also be viewed as a set of microthreads (user-level,
nonpreemptive threads): basically, a set of execution contexts (aka
stacks) and the means to explicitly switch among them.  Coroutines and
generators can also be implemented using microthreads, and
microthreads can be implemented using continuations, so in some sense
they're all equivalent.

Issues:

- What are the implementation costs of all of these mechanisms, and
are they usable in systems programming languages like C++?

- Tradeoffs between allocating multiple stacks and using a more
flexible structure like a tree for execution context?

- Anecdotal evidence exists that continuation-passing style is hard
for programmers to learn.  It is a very general way of manipulating
program control, sort of like goto, and can be misused just as easily.
Is it better for languages to avoid first-class continuations in favor
of one of the more restrictive idioms?

Python supports generators natively:
http://www.python.org/peps/pep-0255.html

Stackless Python supports continuations natively (the "stackless"
refers to a Python implementation issue, i.e. decoupling the Python
interpreter stack from the state of the running program):
http://www.stackless.com/

This article shows how to implement simple generators in C using some
rather hacky macros:
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

POSIX routines setjmp and longjmp allow saving and restoring stack
context in Unix C programs.