Language

Threading

Creating a thread

The following example starts five threads:

for (var Int i) 1 5
  thread
    console "Hello from thread " i eol
sleep 1

Variables ('i' in the sample) are automatically copied to the new thread.

There are several possible ways to share some variables between the original thread and the new one:

var Int i := 1
thread
  share i
  i := 12
sleep 1 # Danger: this is only a 99.99% valid synchronization
console "i = " i eol

var Int i
var Pointer:Int j :> i
thread
  j := 12
sleep 1 # Danger: this is only a 99.99% valid synchronization
console "i = " i eol

In both samples, the program will display 'i = 12'.
In the first case, we have used 'share' to tell Pliant that the 'i' variable must be passed by address instead of by value to the new thread. In the second case, we have created a pointer 'j' to 'i'; 'j' will be passed by value (copied), but what gets copied is the pointer, not what it points to, so the result is the same as using 'share'.

'share' allows compact writing, since:

share i

is the same as:

{ share i ; i }

so that the previous sample could be written as:

var Int i := 1
thread
  share i := 12
sleep 1 # Danger: this is only a 99.99% valid synchronization
console "i = " i eol

Protecting data access and low level synchronization

The previous sample is not guaranteed to work: under an incredible load on the server, it would theoretically be possible to have the 'console' instruction executed before the 'i := 12' one.

We can guarantee proper execution through:

var Int i := 1
var Link:Sem s :> new Sem
s request
thread
  share i := 12
  s release
s request ; s release
console "i = " i eol

A simpler way to write it would have been:

var Int i := 1
var Sem s
s request
thread
  share i := 12
  s release # Danger: 's' might be destroyed while we are still executing the end of the 'release' method
s request ; s release
console "i = " i eol

but then we might have a problem if the initial thread destroys the 's' semaphore variable before the called thread has finished executing. Creating a real Pliant object accessed by a link guarantees that the 's' object will continue to exist as long as either of the two threads needs it, because the link will be copied to the new thread, incrementing the reference count as a result.

A more compact, yet fully valid notation would be:

var Int i := 1
(ovar Sem s) request
thread
  share i := 12
  s release
s request ; s release
console "i = " i eol

Please keep in mind that:

ovar Sem s

is the equivalent of:

var Link:Sem s
if not exists:s
  s :> new Sem

Up to now, we have used semaphores only to synchronize on the end of a parallel execution. This is not their main usage, and it could have been achieved more easily through the 'parallel' high level control that we will see later.
The main usage of semaphores is protecting access to non-atomic data, and providing critical sections when several threads work on the same data.
Here is a simple example:

var Str txt
for (var Int i) 1 5
  thread
    ...
    txt := txt+string:i # Absolutely buggy: likely to corrupt Pliant process memory layout
    ...

Here is the valid version:

var Str txt
var Sem s
for (var Int i) 1 5
  thread
    ...
    s request
    txt := txt+string:i
    s release
    ...

Semaphores

API

Request a semaphore: provides exclusive access to the protected resource.

var Sem s
s request

Release it:

s release

Request a semaphore for readonly access: several threads can get readonly access to the protected resource at the same time, but no thread can get exclusive access while others hold readonly access. In other words, you should use 'request' when you will change the protected resource content, and 'rd_request' when you will just read it:

s rd_request

s rd_release

Please notice that if you acquired access through 'rd_request', you must release it through 'rd_release', not 'release'.
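
As an illustration of this pairing rule, here is a hedged sketch (the 'data' variable and the thread bodies are made up for the example; it only demonstrates how 'request'/'release' and 'rd_request'/'rd_release' pair up, not a complete synchronization protocol):

var Str data := "initial"
ovar Sem s
thread
  share data
  s request # writer: exclusive access
  data := "updated"
  s release
s rd_request # reader: shared access, other readers may hold it at the same time
console "data = " data eol
s rd_release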

It is possible to wait for the semaphore and provide a message, so that the execution monitor is able to display waiting threads in case of a deadlock:

s request "Wait for my app semaphore n°3"

which is a short version of:

part wait "Wait for my app semaphore n°3"
  s request

Same for read access:

s rd_request "Wait for my app semaphore n°3"

It is also possible to try to acquire a semaphore without blocking in case the resource is already assigned:

if s:nowait_request
  console "got it" eol
  s release
else
  console "not available" eol

Same for read access:

if s:nowait_rd_request
  ...

It is also possible to request a semaphore for a limited amount of time (specified in seconds):

if (s request 5)
  console "got it" eol
  s release
else
  console "still not available after 5 seconds" eol

Same for read access:

if (s rd_request 5)
  ...

Various kind of semaphores

Depending on the platform capabilities, the effective implementation of 'Sem' will be selected in module /pliant/language/schedule/sem.pli as either:

   •   SemFutex, based on the Linux kernel futex API.

   •   SemQueue, based on the thread stop and thread restart functions implemented on all platforms supported by Pliant.

   •   SemYield, based on the underlying kernel scheduler yield function. This implementation is the simplest, but the least satisfying if the resource is held for a long time, because waiting threads are continuously restarted.

Please notice that providing a semaphore implementation that works optimally in any usage pattern is close to impossible, because performance can vary drastically with any detail change, either in the Pliant semaphore implementation or in the underlying kernel scheduler.

Fast semaphores are intended to provide shorter acquire and release times, at the expense of wasting more time through continuously restarting the waiting threads if the resource is held for a long time or many threads are competing for the same resource.
Only the very basic 'request' and 'release' operations are provided; no 'rd_request' and others.

var FastSem s
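
As a hedged sketch (the 'counter' variable is made up for the example), a fast semaphore is used exactly like a plain one, but is best kept for very short critical sections:

var Int counter := 0
var FastSem s
s request
counter := counter+1 # very short critical section: where FastSem fits best
s release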

Nested semaphores allow one thread to acquire the same resource several times without deadlocking with itself. The number of 'release' calls must match the number of 'request' calls in the end. The current implementation has a quite long acquire time.

var NestedSem s
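
As a hedged sketch of the nesting behaviour (the inner request stands for any helper routine that also acquires the same semaphore):

var NestedSem s
s request
s request # same thread: does not deadlock with itself
s release
s release # the resource is freed only when releases match requests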

Resource allocation semaphores are not intended to provide exclusive access, but rather 'no more than n at once' regulation. As a result, they use a slightly different API:

var ResourceSem s
s configure 1000 # no more than 1000 at once
s request 50 # blocks until we can acquire 50
s release 50
if (s nowait_request 200) # try to acquire 200 immediately
  console "Succeeded to get 200 more"
  s release 200
s query (var Int current) (var Int maxi)
console "Currently " current " in use out of " maxi eol

Here is a usage scenario. Imagine that your database application provides reports that take a long time to compute. If too many are requested at once, the server might start crawling; then users not receiving the result after a decent amount of time would start reissuing the same request, making the situation even worse, until the server gets completely unusable even for answering quite simple queries. A resource semaphore is a simple way to prevent it:

(gvar ResourceSem big_queries) configure 4 # no more than 4 at once
...
if (big_queries nowait_request 1)
  # compute the big query
  big_queries release 1
else
  text "The server is currently under high load, please retry later"

High level parallel control

parallel
  for (var Int i) 1 5
    task
      console "Hello from task " i eol
    post
      console "Task " i " post process" eol

The general idea of the parallel control is to run several tasks in parallel, then sync in the end. A task queue is created, and while the parallel instruction body executes, each 'task' instruction will add an entry to the queue. The system will execute several tasks at once using several threads that will last just for the 'parallel' block lifetime.
'post' is optional and is a way to specify another block related to the task block it immediately follows. All the 'post' blocks will be executed sequentially and in order, as opposed to their corresponding task blocks that will be executed in parallel. Yet, all local variables are shared between corresponding 'task' and 'post' blocks. In other words, 'post' is a way to group the results produced by the various tasks without facing multithreading related issues.
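
For instance, here is a hedged sketch (the variable names are made up for the example) that sums squares, with the 'task' blocks computing in parallel and the 'post' blocks accumulating sequentially and in order:

var Int sum := 0
parallel
  for (var Int i) 1 5
    var Int square
    task
      square := i*i # executed in parallel
    post
      share sum
      sum := sum+square # executed sequentially and in order
console "sum = " sum eol

The RSA sample at the end of this section follows the same pattern on a larger scale.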

The 'parallel' instruction accepts several options:

parallel threads 4 mini 8 maxi 16 active true balance true
  ...

Most of the time, the best is to provide no option, or just 'threads' if the tasks are, as an example, performing network connections instead of heavy computations.
The 'threads' parameter forces the number of threads to run in parallel.
If more than 'maxi' tasks are pending in the queue because the main thread is much faster than the ones executing the tasks, then the main thread will pause until the number of pending tasks falls back to 'mini'.
If 'active' is true and too many tasks are pending in the queue, then the main thread will just pick and execute one instead of pausing.
If 'balance' is true, then the number of tasks to run in parallel will be automatically adjusted according to the global process load. If several 'parallel' instructions are simultaneously executed by several threads, then the number of threads will be reduced in order to limit waste through scheduling and cache misses.

Here is a real sample from /pliant/util/crypto/rsa.pli using parallel to speed up RSA ciphering:

Standard version:

var Intn in # clear message
var Intn n e # the two numbers of the RSA key
var Intn out := 0 # the ciphered message
var Intn f := 1
while in<>0
  var Intn in_i := in%n
  var Intn out_i := in_i^e%n
  out := out+out_i*f
  f := f*n
  in := in\n
# 'out' is now computed

Parallel version:

var Intn in # clear message
var Intn n e # the two numbers of the RSA key
var Intn out := 0 # the ciphered message
var Intn f := 1
parallel
  while in<>0
    var Intn out_i
    task
      share n e
      var Intn in_i := in%n
      out_i := ( in_i^e%n )*f
    post
      share out
      out := out+out_i
    f := f*n
    in := in\n
# 'out' is now computed