Pyli/Thread

From Jonathan Gardner's Tech Wiki

Introduction & Philosophy

The concepts of time (sleeping for a moment, doing something after a delay, doing something at every interval), threads, and I/O events are all unified.

The goal here is to avoid writing an Async control loop, at all, ever. You should be able to sit down and write a MUD just using threads and not thinking about blocking or non-blocking calls.

Even though it isn't necessary to write your own scheduler, it should be possible. You may also add your own special events.

Channels should handle the vast, vast majority of cases for inter-thread communication.

I don't like the idea of sharing memory, although I believe it can work.

All the complexity should be pushed onto the language developers, leaving the application developers free to write their app.


Rules of Microthreading

A thread is created in a lexical environment. It is created with a set of statements that are expected to be run in that environment.

A thread is given a fresh, new dynamic environment.

A thread, when created, doesn't automatically run.

To run a thread, it is called. It may be called in the following ways:

  • (thread): Executes until a sleep event, completion, block, or unhandled error.
  • (thread &(until finished)): Executes until completion or unhandled error. On block, the calling thread will indicate a block. If it is the main thread, then the interpreter will resume operation when it unblocks. Otherwise, it will be up to the scheduler.
  • (thread &(for 6 ms) &(for 100 cycles)): Executes for 6 ms or 100 cycles, whichever ends first, or a sleep, completion, block or unhandled error event.

The return value is a pair:

  • (&finished value): Returned when the thread is complete.
  • (&sleep value): Returned when sleep is called. Note that value doesn't have to be a number representing milliseconds. Also note that it is up to the scheduler to start the thread again. Note that generators will return any value.
  • (&timeout): The thread timed out (limited by &(for n ms)).
  • (&runout): The thread ran out of cycles (limited by &(for n cycles)).
  • (&error condition): The thread had an unhandled exception.
  • (&block block-object): The thread has blocked on a specific thing.
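The calling convention above can be sketched in Python, assuming a microthread is modeled as a generator. Everything here is a stand-in, not Pyli itself: `MicroThread`, `run`, and the string tags are hypothetical names, one "cycle" is taken to mean one resumption of the generator, and the time limit is left out.

```python
class MicroThread:
    """Hypothetical stand-in for a Pyli thread, built on a Python generator."""

    def __init__(self, fn):
        self._gen = fn()        # created, but not started
        self.done = False

    def run(self, cycles=None):
        """Run until sleep, block, completion, error, or cycle runout."""
        if self.done:
            raise RuntimeError("thread already finished")
        used = 0
        while cycles is None or used < cycles:
            used += 1
            try:
                event = next(self._gen)
            except StopIteration as stop:
                self.done = True
                return ("finished", stop.value)
            except Exception as exc:
                self.done = True
                return ("error", exc)
            if event is not None:       # ("sleep", v) or ("block", obj)
                return event
        return ("runout",)


def worker():
    yield                   # a bare yield marks one cycle of work
    yield ("sleep", 0.5)    # hand a value back to whoever ran us
    yield
    yield
    return 42


t = MicroThread(worker)
results = [t.run(cycles=2), t.run(cycles=2), t.run()]
```

The caller, not the thread, decides what to do with each pair. That is what lets the same mechanism serve schedulers and generators.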

Scheduler

A scheduler may also handle the running of threads. A scheduler runs in a loop until all the threads it is supposed to handle are complete (either normally or by exception). A scheduler typically takes the following into account:

  • Which threads are free to run at any time.
  • Which threads are blocked, and on what.
  • Which threads are scheduled to run at a specific time (returned sleep).
  • Which expressions are scheduled to run at a specific time (set with at).
  • Which expressions are scheduled to run at intervals (set with every).
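A minimal sketch of such a loop in Python, covering only the first and third items (threads free to run, and threads that returned a sleep). It assumes threads are generators that yield the number of seconds to sleep. `run_all`, `clock`, and `wait` are hypothetical names, and the clock is injected so the example is deterministic.

```python
import heapq
import itertools


def run_all(threads, clock, wait):
    """Drive generator-based threads until all of them complete.

    clock() returns the current time; wait(dt) idles for dt seconds.
    """
    ready = list(threads)       # free to run right now
    sleeping = []               # heap of (wake_time, seq, thread)
    seq = itertools.count()     # tie-breaker for equal wake times
    results = []
    while ready or sleeping:
        # Wake every sleeper whose time has come.
        while sleeping and sleeping[0][0] <= clock():
            ready.append(heapq.heappop(sleeping)[2])
        if not ready:
            # Really nothing to do: idle until the next wake-up.
            wait(sleeping[0][0] - clock())
            continue
        thread = ready.pop(0)
        try:
            delay = next(thread)    # thread yields seconds to sleep
        except StopIteration as stop:
            results.append(stop.value)
            continue
        heapq.heappush(sleeping, (clock() + delay, next(seq), thread))
    return results


# Fake clock: waiting just advances the time.
now = [0.0]

def ticker(name, delay, n):
    for _ in range(n):
        yield delay
    return name

results = run_all(
    [ticker("slow", 3, 2), ticker("fast", 1, 2)],
    clock=lambda: now[0],
    wait=lambda dt: now.__setitem__(0, now[0] + dt),
)
```

Note that the loop only idles when no thread is ready, which matches the rule that we never tell the OS we have nothing to do unless that is really the case.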

Blocking

A thread blocks when an operation it attempts cannot complete immediately. By design, Pyli itself never blocks on anything at the OS level: the interpreter only tells the OS it has nothing to do when every thread really is waiting, and then only until some block is cleared.

Operations that might block, such as puts and gets on a channel or reads and writes on a file handle, instead cause the thread to return a block-object. This block-object is callable. Calling it returns True while the block is still active (that is, while the put, get, write, or read still could not proceed without blocking) and False once it has cleared.

When the block is cleared, then the thread can be resumed in one of the ways that normal threads are resumed.

A block can be simulated by creation of a similar object or just a method or even a set of expressions.

(block expr ...)

The thread will not continue until the expressions (in block) or callable (in block-instance) evaluate to True.

Block instances may be put into a select or poll instance, to optimize the checking of unblocking in particular cases (like files.)

Block instances may also be connected to things like channels, so that on a put, all the getters will wake up right away.
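As a rough illustration of a callable block-object, here is a Python sketch. `Block`, `BoundedQueue`, `try_get`, and `try_put` are hypothetical names, and the queue is only a stand-in for a channel or file.

```python
from collections import deque


class Block:
    """Callable that answers: is the operation still blocked?"""

    def __init__(self, still_blocked):
        self._still_blocked = still_blocked

    def __call__(self):
        return self._still_blocked()


class BoundedQueue:
    """Hypothetical queue whose get and put never block the OS thread."""

    def __init__(self, capacity):
        self.items = deque()
        self.capacity = capacity

    def try_get(self):
        if not self.items:
            # Blocked for as long as the queue stays empty.
            return ("block", Block(lambda: not self.items))
        return ("ok", self.items.popleft())

    def try_put(self, item):
        if len(self.items) >= self.capacity:
            # Blocked for as long as the queue stays full.
            return ("block", Block(lambda: len(self.items) >= self.capacity))
        self.items.append(item)
        return ("ok", None)


q = BoundedQueue(capacity=1)
tag, block = q.try_get()    # queue is empty, so a get would block
before = block()            # still blocked
q.try_put("hello")
after = block()             # cleared by the put
value = q.try_get()
```

A scheduler can hold on to `block` and poll it, or hand a batch of such objects to something smarter, without knowing what kind of resource is behind each one.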

Running a thread that is already running

Here's the scenario:

  • thread a runs thread b
  • thread b runs thread a

The simple solution: You can't run a thread that is already running. It will signal a ThreadRunningError error.
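A small Python sketch of that rule, again assuming generator-based threads. The `running` flag and the `MicroThread` class are illustrative assumptions; only the name ThreadRunningError comes from the text above.

```python
class ThreadRunningError(Exception):
    """Signaled when a thread that is already running is run again."""


class MicroThread:
    def __init__(self, fn):
        self._gen = fn()
        self.running = False

    def run(self):
        if self.running:
            raise ThreadRunningError("thread is already running")
        self.running = True
        try:
            return next(self._gen, None)
        finally:
            self.running = False    # cleared however the run ends


log = []

def body_a():
    log.append("a runs b")
    b.run()
    yield

def body_b():
    try:
        a.run()                     # a is still on the call chain
    except ThreadRunningError:
        log.append("b could not run a")
    yield

a = MicroThread(body_a)
b = MicroThread(body_b)
a.run()
```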

Unhandled Signals

  • thread a runs thread b
  • thread b signals a condition but doesn't handle it.

The simple solution: There is a signal handler installed at every thread boundary that will prevent the signal from percolating up.

More generically: The dynamic environment is not shared with threads that are run. They get a fresh dynamic environment. The lexical environment, of course, is the environment the thread was created in.

Concerns with Preemption

Thread a runs thread b for 100 cycles. Thread b runs thread c for 500 cycles. What happens when the 100 cycles expire, and what happens when the 500 cycles expire?

The obvious approach: after 100 cycles (some spent in thread b, some in thread c), thread a resumes. When b is next resumed, it passes control back to thread c, which runs its remaining 400+ cycles.
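The arithmetic can be sketched in Python. This is a simplification: it assumes every cycle is charged against all active limits at once and ignores the cycles b spends on its own. `run_cycles` is a hypothetical helper, not part of Pyli.

```python
def run_cycles(limits):
    """Spend cycles until some limit reaches zero.

    limits is ordered outermost first, e.g. [b's limit, c's limit].
    Returns (cycles spent, index of the outermost expired limit).
    """
    spent = min(limits)
    for i in range(len(limits)):
        limits[i] -= spent
    return spent, limits.index(0)


limits = [100, 500]             # a gave b 100 cycles; b gave c 500
first = run_cycles(limits)      # b's limit expires first, so a resumes
remaining_for_c = limits[1]

limits[0] = 1000                # later, a resumes b with a fresh limit
second = run_cycles(limits)     # now c's limit expires, so b resumes
```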

Low-Level Design

There is one main thread and many non-main threads.

When the main thread completes, then execution finishes. The result of the main thread is the result of the program. If the main thread creates threads, then usually it will end up in a scheduler.

The non-main threads can only be awoken from another thread. That is, if they are run it is because some other thread explicitly woke them.

Switching to a Thread

In order to switch from one thread to another:

  1. Store the env and stack in the current, invoking thread. (Question: Should we instead put it on some sort of stack?)
  2. Put an item on the stack to handle completion. It will return control to the invoking thread with a ThreadComplete signal (non-error).
  3. Set the cycle countdown. The handler on countdown will return control to the invoking thread with a ThreadRunout signal (non-error).
  4. Set the timer. The handler on timeout will return control to the invoking thread with a ThreadTimeout signal (non-error).
  5. Set the sleep function to return control to the invoking thread with a ThreadSleep signal.
  6. Push the invoking thread on the parent thread stack.
  7. Set the new thread as the current thread.
  8. Set the env and stack from the last state of the new thread.
  9. Set the current thread as active, the old thread as inactive.
  10. Continue with the last status of the new thread.

Switching out of a Thread

There are several ways for a thread to end processing:

  1. Run out of cycles --- runout.
  2. Run out of time --- timeout.
  3. Run out of things to do --- completion.
  4. Explicitly run out to be resumed later --- a sleep.
  5. Hitting a blocking operation, can't resume until unblocked.

In order to switch out of a thread, back to the parent:

  1. Set the env, stack, and status of the thread to the current state, up to the top marker in the stack.
  2. Clear the stack beyond the top marker.
  3. Pop off the parent thread, set it as the current thread.
  4. Set the current thread as active, the old thread as inactive.
  5. Continue with the proper result, usually signaling a non-error condition.
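The bookkeeping in the two lists above can be sketched in Python. This covers only the parent thread stack and the saving and restoring of state. The handlers for completion, runout, timeout, and sleep are left out, and `Interpreter`, `switch_to`, and `switch_out` are hypothetical names.

```python
class Interpreter:
    """Tracks only which thread is current and who invoked whom."""

    def __init__(self):
        self.current = "main"
        self.parents = []                   # invoking threads, main first
        self.saved = {}                     # thread -> saved (env, stack)
        self.state = ("main-env", [])       # state of the current thread

    def switch_to(self, thread):
        self.saved[self.current] = self.state   # store the invoker's state
        self.parents.append(self.current)       # push the invoking thread
        self.current = thread                   # new thread becomes current
        # Restore the new thread's last state, or start it fresh.
        self.state = self.saved.get(thread, (thread + "-env", []))

    def switch_out(self, result):
        self.saved[self.current] = self.state   # store the child's state
        self.current = self.parents.pop()       # pop the parent thread
        self.state = self.saved[self.current]   # restore the parent's state
        return result                           # e.g. ("sleep", 1)


vm = Interpreter()
vm.switch_to("a")
vm.switch_to("b")
chain = list(vm.parents)                # who is waiting on b right now
result = vm.switch_out(("sleep", 1))    # b sleeps, control returns to a
```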

Counting Cycles and Time

To handle the counting of cycles, every thread object has two items:

  • The time when this thread stops execution, set if it was invoked the last time with a limit on time.
  • The cycle count when this thread stops execution, set if it was invoked the last time with a limit on cycles. (In the case of counting cycles, rather than working with long ints, just have the number wrap around through 0. This rules out extremely long cycle limits, but those aren't useful anyway.)

Every cycle, the current thread and all of its parent threads are checked to see if a cycle or time limit has been exceeded. If so, that thread is switched out.
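One way to compare against a counter that wraps through 0 is modular arithmetic, sketched here in Python. The 32-bit width is an assumption chosen for illustration, and the check is only valid for limits below half the counter range.

```python
BITS = 32                       # assumed counter width
MASK = (1 << BITS) - 1
HALF = 1 << (BITS - 1)


def deadline(now, limit):
    """Cycle count at which a thread started at `now` must stop."""
    return (now + limit) & MASK


def expired(now, stop_at):
    """True once the wrapping counter has reached or passed stop_at.

    Only valid when the original limit was smaller than HALF.
    """
    return ((now - stop_at) & MASK) < HALF


start = MASK - 5                    # counter is about to wrap
stop_at = deadline(start, 10)       # lands just past zero
checks = [
    expired(start, stop_at),        # just started
    expired(3, stop_at),            # wrapped, one cycle short
    expired(4, stop_at),            # exactly at the limit
    expired(5, stop_at),            # past the limit
]
```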

Blocking

As explained above, a blocked thread must return a block expression. So, when programming an operation that may block, write it like this:

  1. Check to see if the operation would block.
  2. If so, then return a block expression that will evaluate to true when the block is cleared.
  3. Do the operation.
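The three steps might look like this in Python, assuming generator-based threads where yielding a `("block", test)` pair stands in for returning a block expression. `blocking_get` and `consumer` are hypothetical names, and a plain list plays the part of a queue.

```python
def blocking_get(items):
    """Steps 1-3 for a get on a plain list used as a queue."""
    while not items:                            # 1. would the get block?
        yield ("block", lambda: not items)      # 2. hand back a block test
    return items.pop(0)                         # 3. do the operation


def consumer(items, out):
    # From the thread's point of view this line simply "blocks".
    value = yield from blocking_get(items)
    out.append(value)


items, out = [], []
thread = consumer(items, out)

tag, still_blocked = next(thread)   # nothing to get yet
first = still_blocked()
items.append("msg")                 # someone else puts a message
second = still_blocked()

try:
    next(thread)                    # resume: the get now succeeds
except StopIteration:
    pass
```

The check is repeated in a loop because another thread could have taken the message between the block clearing and this thread being resumed.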


Microthread Operations

Threads are implemented within the language itself. Only one thread is doing any work at any time. If you want true parallelization, then you have to have a separate OS process.

*this-thread*

Returns the current thread. See Pyli/Types/Thread

*child-threads*

A vector of threads that this thread has started.

*parent-threads*

A vector of threads that invoke this thread right now. The top is always the main thread.

(Thread expr ...)

Creates (AKA spawns) a new thread that will evaluate expr, and returns the thread.

The thread has to be run for it to start evaluating.

(thread.kill)

Kills the thread. This causes a KillError to be signalled in the thread, which may catch it and process it. If the thread is already handling that exception, a second call to kill on the same thread kills it right away, much like SIGKILL.
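The two-stage kill can be approximated in Python with a generator's `throw` and `close`. This is only an analogy: the `MicroThread` wrapper and its `_kill_pending` flag are assumptions, and the flag treats any second kill as happening while the first is still being handled.

```python
class KillError(Exception):
    """Signaled inside a thread when it is asked to die."""


class MicroThread:
    def __init__(self, gen):
        self._gen = gen
        self._kill_pending = False

    def run(self):
        return next(self._gen)

    def kill(self):
        if self._kill_pending:
            self._gen.close()           # second kill: stop right away
            return ("finished", None)
        self._kill_pending = True
        try:
            # First kill: let the thread see the error and clean up.
            return self._gen.throw(KillError())
        except (StopIteration, KillError):
            return ("finished", None)


log = []

def stubborn():
    try:
        yield "working"
    except KillError:
        log.append("cleaning up")
        yield "still cleaning"          # cleanup that takes a while
        log.append("never reached")


t = MicroThread(stubborn())
started = t.run()
first = t.kill()        # thread catches KillError and keeps cleaning
second = t.kill()       # no more patience
```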

(thread.restart)

Puts the thread back to its initial state. (Very useful for every.) It behaves first like a kill (unless the thread has finished or errored out.)

(thread [&(until finished)] [&(for n ms)] [&(for c cycles)])

Runs a thread given the parameters above.

Returns one of the following results:

  • (&block block-expr)
  • (&finished value)
  • (&error condition)
  • (&runout)
  • (&timeout)
  • (&sleep value)

(sleep value)

Yields the floor to the caller, passing back the value. Normally, a scheduler would interpret this as "sleep for value seconds". However, a generator will return a real value.

KillWarning

This is signaled from the point where the thread is killed or restarted. If it isn't handled, it isn't important.

Time

(time)

The current time in seconds as a float. This is the number of seconds since Jan 1, 1970.

(date)

The current date in a date-time format.

(sleep [seconds])

Sleep for seconds seconds.

If in a child thread, control is passed back to the scheduler and this thread is registered to run when the time expires. If this is the main thread, then the entire program sleeps, using unix sleep, until the time is expired.

(sleep-until time)

Sleep until a specified time. Very simple:

(defn &sleep-until &(time)
   &(sleep (- time (current-time))))

(at time expr ...)

At the time time, run the body (expr ...) in a separate thread. Return the thread. This will need access to the scheduler.

Registers with the scheduler to call a thread that runs the body after a certain amount of time.

(after seconds expr ...)

Like the above, but gives the number of seconds from now.

(every seconds expr ...)

Every seconds seconds, run the body (expr ...) in a separate thread. Return the thread. This needs access to the scheduler.

Registers with the scheduler to call a thread that runs the body every seconds seconds.
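A sketch of how a scheduler might keep track of at and every registrations, using a heap ordered by due time. `Timers`, `run_due`, and the explicit `start` and `now` arguments are hypothetical. Bodies are plain callables here rather than threads, and intervals are assumed to be positive.

```python
import heapq
import itertools


class Timers:
    def __init__(self):
        self._heap = []                 # (due, seq, fn, interval or None)
        self._seq = itertools.count()   # keeps heap entries comparable

    def at(self, when, fn):
        heapq.heappush(self._heap, (when, next(self._seq), fn, None))

    def every(self, interval, fn, start):
        due = start + interval
        heapq.heappush(self._heap, (due, next(self._seq), fn, interval))

    def run_due(self, now):
        """Run everything due at or before now; re-arm repeating entries."""
        while self._heap and self._heap[0][0] <= now:
            due, _, fn, interval = heapq.heappop(self._heap)
            fn()
            if interval is not None:
                entry = (due + interval, next(self._seq), fn, interval)
                heapq.heappush(self._heap, entry)


log = []
timers = Timers()
timers.at(5, lambda: log.append("at 5"))
timers.every(2, lambda: log.append("tick"), start=0)

for now in range(7):                    # fake clock: 0, 1, ..., 6
    timers.run_due(now)
```

In this sketch, after would just be at with the current time added to the delay.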

Scheduler

*default-scheduler*

The scheduler that everyone uses by default.

(scheduler.register thread)

Register a thread to be run as often as possible.

(scheduler.register-thread-at time thread)

Register a thread to run at a specific time. (Like unix at)

(scheduler.register-every interval thread)

Registers expressions to run at specific intervals. (like unix cron) Returns an

(scheduler.deregister thread)

Removes a thread from scheduling altogether.

Blocks

The block object has the following methods:

(block)

Tests to see if the block is still in effect. If the thread has been killed or restarted, it will return False.

(block.resume [same parameters as (thread ...)])

Resume the thread from the block point. If the thread has been killed or restarted, it will signal or return a value to that effect. Returns what (thread) would return.

(block.file-id)

Returns a pair (file-id, &read | &write), representing which file this is blocked on and which operation.

Channels

Channels allow inter-thread communication. A thread can wait to receive a message from a channel (listen). Doing so puts the thread to sleep until a message appears on the channel. A thread can send a message on a channel (say). Doing so awakens any listeners so that their code runs. A say may also cause a context switch, though not necessarily.

Channels behave like Qt's signal/slots.

  • (channel): Creates a new channel.
  • (say channel message): Send message on the channel.
  • (say-except channel exception): Causes an exception to be thrown for the listeners.
  • (listen channel): Waits for input on the channel.
  • (listen-for channel filter): Will wait until a message that (filter message) returns True on is sent.
  • (listen-catch channel filter): Will wait until an exception that (filter exception) returns True on is sent.
  • (listen-do channel fn): Sets up a listener on channel that will do fn when an event occurs. Returns the thread.
  • (listen-for-do channel filter fn): Sets up a listener on channel that does fn only for messages on which (filter message) returns True. Returns the thread.
  • (listen-catch-do channel filter fn): Sets up a listener on channel that does fn only for exceptions on which (filter exception) returns True. Returns the thread.
  • (poll channel ...): Multi-listening: Allow threads to listen for activity on several channels at once, with timeouts.

Notes:

  1. When something is said on a channel, ALL the listeners hear it.
  2. You can simulate the closing of a channel by (say-except close) or something like that. If the listeners are dumb enough to keep listening, then they are allowed to do that.

Channels and Transactions: How to handle multiple threads trying to access the same data: Create a transaction thread that will actually apply the changes requested. It accepts messages on the channel and these messages specify how to modify the data. Then create a reading channel to access the data from that area. (Remind you of smalltalk?)
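Both ideas, broadcast to all listeners and a single owner applying changes, can be sketched in Python. Listeners are plain callbacks here instead of threads, roughly in the spirit of listen-do and listen-for-do. `Channel`, `listen_do`, `accept`, and `make_counter` are hypothetical names.

```python
class Channel:
    def __init__(self):
        self._listeners = []            # (accept, fn) pairs

    def listen_do(self, fn, accept=lambda message: True):
        """Call fn for every message that accept(message) approves."""
        self._listeners.append((accept, fn))

    def say(self, message):
        # ALL matching listeners hear every message.
        for accept, fn in list(self._listeners):
            if accept(message):
                fn(message)


def make_counter(requests):
    """Owner of some data: changes arrive only as channel messages."""
    state = {"count": 0}

    def apply(change):
        state["count"] += change

    requests.listen_do(apply)
    return lambda: state["count"]       # read-only access for others


requests = Channel()
read = make_counter(requests)

heard = []
requests.listen_do(heard.append, accept=lambda m: m > 1)

requests.say(1)
requests.say(5)
total = read()
```

Because only `apply` ever touches `state`, no other listener can corrupt it. They can only ask for changes.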

Async I/O

You NEVER have to write async I/O in Pyli. All potentially blocking operations are non-blocking underneath, but from inside a thread they appear to block.

If you have a single thread that should both read and write at the same time, make two threads so that one will read and one will write.

Thinking of using libev to support this, or something like that.

Signals

I'm talking about OS signals, such as SIGHUP and friends.

These need handlers specified, which are really a function that is called when the signal arises. Need signal blockers. Or just make them exceptions or something. I don't know yet.

Scheduling

I want to allow programs to manipulate the scheduler. However, the default scheduler should be good enough for the vast majority of cases.

The scheduler actually has to balance the time between running threads and handling events. That is, it needs to allow threads that handle events to handle them quickly, but allow other threads to run as well.

  • (deschedule thread scheduler): Removes the thread from normal scheduling. Events such as channel writes and I/O will still be handled. Descheduled threads must either terminate or yield. Otherwise, the scheduler will not take control from them. The main thread is, by default, descheduled. Only descheduled threads should be running a scheduler.
  • (schedule thread scheduler): Put the thread into normal scheduling.