Function oriented operating system

A bit of history

Most operating systems are process based. The process descends from the batch job: a program designed to run for a long period of time, with the OS ensuring that each long running process receives a fair share of CPU time. This model dates from a time when computers received large computational jobs from an operator, who would load a job into the computer, start its execution, wait for the computer to complete the job, and then load the next one.

Multitasking

Eventually people wanted computers to run several batch jobs simultaneously. There was even a proposal to offer computing as a public utility, where citizens could connect a terminal to a machine and run batch jobs completely unaware that they were sharing the physical computer with other remote users. This required a multitasking computer: periodically (on an external clock tick) the current batch job would be paused, another job swapped onto the CPU, and execution resumed from where that job was last paused. This context switching happened so quickly that users were unaware of it, and each appeared to have dedicated access to the computer to run whatever batch jobs they wanted.

Interactive programs

Soon a different class of program became popular: the interactive program and the server. Neither is a batch job; both run continuously while users interact with them. A text editor, for example, must respond promptly as users press keys at their terminals, and a server must run continuously, waiting for incoming connections and servicing them quickly. These programs differ from batch jobs in that they contain an infinite loop that repeatedly waits for some kind of input event to react to, and they do not exit until the user has finished using them. This is the event loop.

Event loops everywhere

Most modern programs immediately enter some form of event loop and run for hours, days, weeks or longer. That could be a GUI event loop, a loop over a socket waiting for connections, or even a loop waiting for data to be fed from an open file. These event loops need to be well behaved: they cannot simply poll a resource as fast as possible, but must somehow relinquish control back to the OS. Most tutorials on event loops start by putting a sleep() inside the loop, then quickly explain why this is a bad idea and move on to better methods such as a blocking read() or select(), which inform the OS under what conditions it needs to wake the process up again. These are complex and sensitive bits of code, and writing a good event loop is not a trivial task. Furthermore, if your program needs to combine two event loops, say to run a GUI and listen for incoming connections, things get tricky very fast.
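As a concrete illustration, here is a minimal sketch in C of the blocking pattern described above: one iteration of an event loop that uses select() to let the OS decide when to wake the program. A pipe stands in for a socket or terminal, and demo_event_loop is an illustrative name, not part of any real API.

```c
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* One iteration of a select()-based event loop. A pipe simulates the
   input source; a real program would loop until told to quit.
   Returns the number of bytes handled, or -1 on error. */
int demo_event_loop(void) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* Simulate an input event arriving (e.g. a keypress). */
    if (write(fds[1], "key", 3) != 3)
        return -1;

    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fds[0], &readable);

    /* select() blocks until fds[0] has data, relinquishing the CPU
       to the OS instead of busy-polling the descriptor. */
    if (select(fds[0] + 1, &readable, NULL, NULL, NULL) != 1)
        return -1;

    char buf[8];
    int n = (int)read(fds[0], buf, sizeof buf); /* handle the event */
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Note that even this toy version needs the fd_set bookkeeping and the nfds argument to be exactly right; combining it with a second event source means juggling multiple descriptors in the same select() call.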

Operating Systems work hard

It turns out that preemptively switching between programs running event loops requires significant effort. Because a program can be interrupted at any moment during its execution, programs have to be careful when they access shared resources. They must inform the OS when they want to access a shared resource so the OS can track what each program has locked and prevent other programs from accessing a resource while it is locked. Furthermore, the OS should not bother switching a program back into context if it knows that the program is waiting for data that has not yet arrived. The result is a really complex bookkeeping system of locks, mutexes, pipes, sockets and states that must be correctly coordinated by OS code. Unfortunately, the quality of the programs being run affects how much of this bookkeeping the OS has to do: poorly written programs need more management overhead from the OS to maintain the illusion that everything is running in parallel.

A new (old) way

I am experimenting with a different type of OS that has no concept of processes in the traditional sense. It is a function based OS where the primary unit of code is not a process but a function. As in any OS, only one piece of code runs at a time, but instead of the kernel switching contexts between multiple chunks of code, each function is allowed to execute without interruption until it ends (except in specific cases explained later). A function is invoked, executes some logic, and returns. As long as functions are of reasonable length and contain no event loops, the computer executes them fast enough to approximate multitasking.

In the kernel core there is a single event loop which is responsible for calling functions in sequence. The kernel decides which function to call based on the status of a series of queues or buffers. Every function is associated with a queue as either a producer or a consumer. If a queue has unread data in it, its consumer function is called. If a queue is not completely full, its producer function is called.
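To make the idea concrete, here is a hypothetical user-space sketch of that dispatch loop in C. The names (Queue, kernel_tick, run_demo) and the ring-buffer layout are my own illustration, not a real kernel API: a producer fills the queue with numbers, a consumer sums them, and the dispatcher decides which to call from the queue's fill level alone.

```c
#include <stddef.h>

#define QCAP 4

/* One queue with an attached producer and consumer function,
   scheduled purely by how full the queue is. */
typedef struct Queue {
    int buf[QCAP];                    /* ring buffer storage */
    size_t head, tail, count;
    void (*producer)(struct Queue *);
    void (*consumer)(struct Queue *);
} Queue;

static int next_value = 1;   /* data source for the producer */
static int consumed_sum = 0; /* sink for the consumer */

static void produce(Queue *q) {      /* runs to completion, no loop */
    q->buf[q->tail] = next_value++;
    q->tail = (q->tail + 1) % QCAP;
    q->count++;
}

static void consume(Queue *q) {
    consumed_sum += q->buf[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
}

/* The kernel core's event loop body: call the consumer when data is
   waiting, the producer when there is room. Each call runs
   uninterrupted from start to finish. */
static void kernel_tick(Queue *q) {
    if (q->count > 0)
        q->consumer(q);
    if (q->count < QCAP)
        q->producer(q);
}

/* Drive the dispatcher for a fixed number of ticks; returns the sum
   the consumer has accumulated. */
int run_demo(int ticks) {
    Queue q = { .head = 0, .tail = 0, .count = 0,
                .producer = produce, .consumer = consume };
    for (int i = 0; i < ticks; i++)
        kernel_tick(&q);
    return consumed_sum;
}
```

Notice that neither produce() nor consume() contains any locking or any loop: each does one unit of work and returns, leaving the scheduling decision entirely to the dispatcher.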

Because only one function runs at a time, far fewer locks are needed. Because producers and consumers are not run in parallel (except in specific cases explained later), there is no need to lock access to the queues. Because functions are not interrupted, there is no need to swap context (the state of the registers and the stack) in or out of storage. It is also not strictly necessary to have more than one stack, since each function returns before another is started.

Functions are also free to add new queues to the system for other functions to attach themselves to. For example, a USB driver may create a new queue when a mouse is hotplugged, even if no software is running yet to attach itself as a consumer of that data.
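A sketch of how such dynamic queue registration might look, with invented names (register_queue, attach_consumer, dispatch_once): the driver creates a queue and events accumulate, but the dispatcher simply skips any queue that has no consumer yet.

```c
#include <stddef.h>

#define MAXQ 8

typedef struct {
    const char *name;
    int pending;                 /* items waiting in this queue */
    void (*consumer)(int item);  /* NULL until software attaches */
} QueueSlot;

static QueueSlot qtable[MAXQ];
static size_t nqueues = 0;

/* Called by a driver, e.g. on USB mouse hotplug. */
size_t register_queue(const char *name) {
    qtable[nqueues] = (QueueSlot){ name, 0, NULL };
    return nqueues++;
}

void attach_consumer(size_t id, void (*fn)(int)) {
    qtable[id].consumer = fn;
}

/* One dispatch pass: only queues with both data and a consumer run.
   Returns how many consumer functions were called. */
int dispatch_once(void) {
    int calls = 0;
    for (size_t i = 0; i < nqueues; i++) {
        if (qtable[i].pending > 0 && qtable[i].consumer) {
            qtable[i].consumer(qtable[i].pending);
            qtable[i].pending = 0;
            calls++;
        }
    }
    return calls;
}

static int last_seen = 0;
static void mouse_handler(int n) { last_seen = n; }

/* Returns 1 if the queue sat idle until a consumer attached. */
int demo(void) {
    size_t id = register_queue("usb-mouse");
    qtable[id].pending = 3;        /* events arrive, no consumer yet */
    int before = dispatch_once();  /* nothing runs */
    attach_consumer(id, mouse_handler);
    int after = dispatch_once();   /* handler finally runs */
    return before == 0 && after == 1 && last_seen == 3;
}
```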

Specific cases explained later

So when are locks required? There are two cases: a) the producer or consumer is run from a hardware interrupt (e.g. a keypress), and b) the computer has multiple cores and the producer and consumer functions for the same queue are not run on the same core.

In both cases a lock is required on the queue to prevent a producer and consumer altering the state of the queue simultaneously. However, the lock is straightforward: it prevents either function from being started until the lock is released, rather than requiring the program code to contain logic about when it can and cannot access the queue.
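This per-queue lock could be sketched with a C11 atomic flag: the dispatcher takes the lock before invoking a producer or consumer and simply declines to start the function while the queue is busy, so the function bodies themselves contain no locking logic. The names here (LockedQueue, dispatch_with_lock) are illustrative, not part of any real kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag lock;  /* guards the queue across cores/interrupts */
    int count;
} LockedQueue;

/* Returns true if the function ran; false means the queue was busy
   (held by an interrupt handler or another core) and dispatch was
   skipped rather than blocked. */
bool dispatch_with_lock(LockedQueue *q, void (*fn)(LockedQueue *)) {
    if (atomic_flag_test_and_set(&q->lock))
        return false;        /* queue busy: don't start the function */
    fn(q);                   /* runs with exclusive queue access */
    atomic_flag_clear(&q->lock);
    return true;
}

static void push_one(LockedQueue *q) { q->count++; }

/* Returns 1 if dispatch ran when the lock was free and was skipped
   when the lock was held. */
int demo_lock(void) {
    LockedQueue q = { ATOMIC_FLAG_INIT, 0 };
    bool ran = dispatch_with_lock(&q, push_one);     /* lock free */
    /* Simulate the lock being held by an interrupt or another core: */
    atomic_flag_test_and_set(&q.lock);
    bool blocked = dispatch_with_lock(&q, push_one); /* skipped */
    return ran && !blocked && q.count == 1;
}
```

The key property is that the lock lives in the dispatcher, not in the producer or consumer code, which keeps the functions themselves as simple as in the single-core case.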