mirror of
https://github.com/LCPQ/quantum_package
synced 2024-11-03 20:54:00 +01:00
Parallelism.md
parent
491378c3b5
commit
2e2d2b17c8
@ -40,19 +40,68 @@ enable both multi-threaded and distributed support.

To initiate a parallel job, a `Newjob` message has to be sent to the scheduler,
with a job name (`state`) that will be checked at every connection to the
scheduler.

`new_job <state> <push_address_tcp> <push_address_inproc>`

The collector thread needs to open a `PULL` socket bound using both the TCP
and the inproc protocols, and those endpoints need to be given together with
the `Newjob` message.
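
As an illustration of the message layout, the following Python sketch builds the `new_job` string. The function name `new_job_message`, the whitespace check, and the endpoint values are hypothetical; only the field order is taken from the message format above.

```python
def new_job_message(state: str, push_address_tcp: str, push_address_inproc: str) -> str:
    """Format a new_job message with space-separated fields in the documented order."""
    fields = (state, push_address_tcp, push_address_inproc)
    # Assumption: fields are delimited by spaces, so no field may be empty
    # or contain whitespace.
    for value in fields:
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"invalid field: {value!r}")
    return " ".join(("new_job",) + fields)


# Hypothetical endpoints on which the collector's PULL socket is bound.
message = new_job_message("ao_integrals", "tcp://127.0.0.1:41280", "inproc://push")
print(message)
```

Both endpoints travel in the same message, so the scheduler can later hand out whichever one matches the transport a worker asks for.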

Now, a new `Queuing_system` instance is created. The `Queuing_system` contains

* A list of tasks
* A list of connected clients (empty at initialization)
* The subset of tasks still queued
* The subset of tasks currently running, and on which client they run
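
The four pieces of state listed above could be modelled as follows. This is a minimal in-memory sketch in Python, not the scheduler's actual implementation: the class name `QueuingSystem`, its field names, and the choice of integer task IDs starting at 1 are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class QueuingSystem:
    """Hypothetical model of the state held for one parallel job."""
    tasks: Dict[int, str] = field(default_factory=dict)    # task ID -> task string
    clients: Set[int] = field(default_factory=set)         # connected client IDs
    queued: List[int] = field(default_factory=list)        # task IDs still queued
    running: Dict[int, int] = field(default_factory=dict)  # task ID -> client ID

    def add_task(self, task: str) -> int:
        """Register a task string and put it in the queue; return its ID."""
        task_id = len(self.tasks) + 1  # assumption: IDs are 1, 2, 3, ...
        self.tasks[task_id] = task
        self.queued.append(task_id)
        return task_id


qs = QueuingSystem()
first_id = qs.add_task("1 2")
print(first_id, qs.queued, qs.clients)
```

Keeping `queued` and `running` as views over task IDs, rather than copies of the task strings, mirrors the description of both as subsets of the task list.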

The format of tasks is up to the user: it is just a string.
To add new tasks to the system, the `AddTask` message is sent:

* `add_task <state> <string>` : Adds a single task to the `Queuing_system`
* `add_task <state> range <i> <j>` : Builds a list of tasks (l) where
  i<l<j.
* `add_task <state> triangle <i>` : Builds a list of tasks (l,i) where
  1<l<i.
* `add_task <state> <msg>` : Builds a list of tasks (msg)
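
The expansion of the `range` and `triangle` forms can be sketched in Python as below. The function name `expand_add_task` is hypothetical, the strict bounds follow the inequalities exactly as written above, and rendering a pair (l,i) as the string `"l i"` is an assumption, since the text does not specify how pairs are serialized.

```python
from typing import List


def expand_add_task(payload: str) -> List[str]:
    """Expand the part of an add_task message that follows <state> into task strings."""
    words = payload.split()
    try:
        if len(words) == 3 and words[0] == "range":
            i, j = int(words[1]), int(words[2])
            # Tasks l with i < l < j (strict bounds, as stated in the text).
            return [str(l) for l in range(i + 1, j)]
        if len(words) == 2 and words[0] == "triangle":
            i = int(words[1])
            # Tasks (l, i) with 1 < l < i; the "l i" layout is an assumption.
            return [f"{l} {i}" for l in range(2, i)]
    except ValueError:
        pass  # non-integer arguments: treat the payload as a free-form task
    # Any other payload is a single free-form task string.
    return [payload]


print(expand_add_task("range 1 5"))
print(expand_add_task("triangle 4"))
```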

When workers connect to the `Queuing_system` with a `Connect` message, they
obtain as a reply the address of the `PULL` socket of the collector,
as well as a new Client ID:

`connect (tcp|inproc)`

Now the workers connect a `PUSH` socket to the `PULL` endpoint of the
collector (TCP or inproc).
Workers are now ready to fetch new tasks using `GetTask` messages:

`get_task <state> <client_id>`

and the `Queuing_system` now knows which task runs on which client.
The reply is the task, as a string, and the corresponding Task ID.
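
The bookkeeping behind `connect` and `get_task` can be sketched as follows. This Python model is illustrative only: the class `Scheduler`, its method names, the sequential client IDs, and the use of `None` for an empty queue are assumptions, and no real sockets are involved.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Optional, Tuple


class Scheduler:
    """Hypothetical in-memory stand-in for the scheduler side of one job."""

    def __init__(self, push_addresses: Dict[str, str], tasks: Iterable[str]) -> None:
        self.push_addresses = push_addresses  # "tcp" / "inproc" -> collector endpoint
        self.queued: Deque[Tuple[int, str]] = deque(enumerate(tasks, start=1))
        self.running: Dict[int, int] = {}     # task ID -> client ID
        self.n_clients = 0

    def connect(self, protocol: str) -> Tuple[int, str]:
        """Reply to `connect (tcp|inproc)`: a new client ID and the PULL address."""
        address = self.push_addresses[protocol]
        self.n_clients += 1                   # assumption: IDs are handed out in order
        return self.n_clients, address

    def get_task(self, client_id: int) -> Optional[Tuple[int, str]]:
        """Reply to `get_task`: (task ID, task string), or None if the queue is empty."""
        if not self.queued:
            return None
        task_id, task = self.queued.popleft()
        self.running[task_id] = client_id     # remember which client runs the task
        return task_id, task


scheduler = Scheduler({"tcp": "tcp://127.0.0.1:41280", "inproc": "inproc://push"},
                      ["1 2", "1 3"])
client_id, address = scheduler.connect("inproc")
print(address, scheduler.get_task(client_id))
```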

When the task is done, the worker pushes the results to the collector
and sends `qp_run` a `TaskDone` message carrying the corresponding Task ID
and its contribution to the control integer, which is accumulated in the
`Queuing_system` instance:

`task_done <state> <client_id> <task_id> <control>`

If the queue is empty, the reply to the `GetTask` message is a `Terminate` message
which tells the workers to terminate.

When a worker terminates, it sends a `Disconnect` message to the scheduler:

`disconnect <state> <client_id>`

If there are remaining running clients, the reply is `0`. For the last client,
the reply contains the control integer, and this allows the worker to inform the
collector that all tasks are done and all workers are disconnected.
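
The interplay between `task_done` and `disconnect` can be sketched as below. The class `ControlTracker` and its fields are hypothetical names, and the clients and running tasks are registered by hand here instead of through `connect` and `get_task`; the sketch only shows the accumulation of the control integer and the two possible `disconnect` replies.

```python
from typing import Dict, Set


class ControlTracker:
    """Hypothetical model of control-integer accumulation for one job."""

    def __init__(self) -> None:
        self.clients: Set[int] = set()     # connected client IDs
        self.running: Dict[int, int] = {}  # task ID -> client ID
        self.control = 0                   # accumulated control integer

    def task_done(self, client_id: int, task_id: int, control: int) -> None:
        """Handle `task_done`: retire the task and add its contribution."""
        if self.running.get(task_id) != client_id:
            raise ValueError(f"task {task_id} is not running on client {client_id}")
        del self.running[task_id]
        self.control += control

    def disconnect(self, client_id: int) -> int:
        """Handle `disconnect`: reply 0, or the control integer for the last client."""
        self.clients.remove(client_id)
        return self.control if not self.clients else 0


tracker = ControlTracker()
tracker.clients = {1, 2}
tracker.running = {10: 1, 11: 2}
tracker.task_done(1, 10, 3)
tracker.task_done(2, 11, 4)
print(tracker.disconnect(1), tracker.disconnect(2))
```

The worker that receives the non-zero reply is the one that forwards the control integer to the collector.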

Once the collector thread has finished pulling all the data, it can terminate.

Now, the main thread can send an `End_job` message to the scheduler to inform
it that the parallel task is done.