mirror of
https://github.com/LCPQ/quantum_package
synced 2024-12-22 20:35:19 +01:00
Parallelism.md
parent
491378c3b5
commit
2e2d2b17c8
@ -40,19 +40,68 @@ enable both multi-threaded and distributed support.

To initiate a parallel job, a `Newjob` message has to be sent to the scheduler,
with a job name (`state`) that will be checked at every connection to the
scheduler:

`new_job <state> <push_address_tcp> <push_address_inproc>`

The collector thread needs to open a `PULL` socket bound using both the TCP
and the inproc protocols, and those endpoints need to be given together with
the `Newjob` message.

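A minimal sketch of how such a message could be assembled, assuming the fields are whitespace-separated as the format line suggests. The helper name `new_job_message`, the job name and the endpoint addresses are hypothetical, and sending the string over a socket is left out.

```python
def new_job_message(state: str, push_address_tcp: str, push_address_inproc: str) -> str:
    """Build a `new_job` message from the job name and the collector's two endpoints."""
    fields = (state, push_address_tcp, push_address_inproc)
    # Assumption: fields are whitespace-separated, so none may be empty or contain spaces.
    for field in fields:
        if not field or any(ch.isspace() for ch in field):
            raise ValueError(f"invalid field: {field!r}")
    return " ".join(("new_job",) + fields)


# Hypothetical job name and endpoints, for illustration only.
message = new_job_message("ao_integrals", "tcp://127.0.0.1:41280", "inproc://ao_integrals")
```

Rejecting embedded whitespace up front keeps the message unambiguous for a receiver that splits on spaces.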

Now, a new `Queuing_system` instance is created. The `Queuing_system` contains:

* A list of tasks
* A list of connected clients (empty at initialization)
* The subset of tasks still queued
* The subset of tasks currently running, and on which client they run

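The four pieces of state listed above could be modelled as follows. This is only an in-memory sketch: the class name `QueuingSystem`, the field names, and the choice of integer task IDs starting at 1 are assumptions, not the actual implementation.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QueuingSystem:
    """In-memory model of the state held for one parallel job."""
    tasks: dict = field(default_factory=dict)     # task ID -> task string
    clients: set = field(default_factory=set)     # connected client IDs (empty at start)
    queued: deque = field(default_factory=deque)  # IDs of tasks still queued
    running: dict = field(default_factory=dict)   # task ID -> client ID running it

    def add_task(self, task: str) -> int:
        """Register a task string, queue it, and return its new task ID."""
        task_id = len(self.tasks) + 1
        self.tasks[task_id] = task
        self.queued.append(task_id)
        return task_id
```

Keeping `queued` and `running` as views over task IDs, rather than copies of the task strings, means a task is stored once and only its ID moves between the two subsets.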

The format of tasks is up to the user: it is just a string.
To add new tasks to the system, the `AddTask` message is sent:

* `add_task <state> <string>` : Adds a single task to the `Queuing_system`
* `add_task <state> range <i> <j>` : Builds a list of tasks (l) where
  i<l<j.
* `add_task <state> triangle <i>` : Builds a list of tasks (l,i) where
  1<l<i.

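The three forms of `add_task` can be illustrated by a function that expands the part of the message following `<state>` into individual task strings. This is a sketch under stated assumptions: the bounds are taken as strict, exactly as written above, a pair (l,i) is encoded as the string `"l i"`, and the function name `expand_add_task` is hypothetical.

```python
def expand_add_task(payload: str) -> list:
    """Expand the text after `<state>` in an `add_task` message into task strings.

    Raises ValueError if `range` or `triangle` is given non-integer bounds.
    """
    words = payload.split()
    if len(words) == 3 and words[0] == "range":
        i, j = int(words[1]), int(words[2])
        # Assumption: strict bounds i < l < j, read literally from the text.
        return [str(l) for l in range(i + 1, j)]
    if len(words) == 2 and words[0] == "triangle":
        i = int(words[1])
        # Pairs (l, i) with 1 < l < i, encoded here as "l i" (an assumption).
        return [f"{l} {i}" for l in range(2, i)]
    # Anything else is a single, opaque task string.
    return [payload]
```

For instance, `expand_add_task("range 1 5")` yields the tasks `"2"`, `"3"` and `"4"` under the strict-bounds reading.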

When workers connect to the `Queuing_system` with a `Connect` message, they
obtain as a reply the address of the `PULL` socket of the collector,
as well as a new Client ID:

`connect (tcp|inproc)`

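On the scheduler side, handling a `Connect` message amounts to picking the collector endpoint that matches the requested protocol and handing out a fresh Client ID. The sketch below assumes Client IDs are consecutive integers that are never reused; the class name `ConnectionTable` and the example addresses are hypothetical.

```python
import itertools


class ConnectionTable:
    """Tracks connected clients and the collector endpoints given with `new_job`."""

    def __init__(self, push_address_tcp: str, push_address_inproc: str):
        self.endpoints = {"tcp": push_address_tcp, "inproc": push_address_inproc}
        self.clients = set()
        self._next_id = itertools.count(1)  # assumption: IDs are never reused

    def connect(self, protocol: str):
        """Return (collector address, new client ID) for `connect tcp` or `connect inproc`."""
        if protocol not in self.endpoints:
            raise ValueError(f"unknown protocol: {protocol!r}")
        client_id = next(self._next_id)
        self.clients.add(client_id)
        return self.endpoints[protocol], client_id


# Hypothetical endpoints, for illustration only.
table = ConnectionTable("tcp://127.0.0.1:41280", "inproc://ao_integrals")
address, client_id = table.connect("tcp")
```

A monotonically increasing counter avoids handing a disconnected client's ID to a new worker while tasks recorded under the old ID may still be referenced.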

The workers then connect a `PUSH` socket to the `PULL` endpoint of the
collector (TCP or inproc).
Workers are now ready to fetch new tasks using `GetTask` messages:

`get_task <state> <client_id>`

The `Queuing_system` now knows which task runs on which client.
The reply is the task, as a string, and the corresponding Task ID.

When the task is done, the worker pushes the results to the collector
and sends to `qp_run` a `TaskDone` message with the corresponding Task ID
and its contribution to the control integer, which is accumulated in the
`Queuing_system` instance:

`task_done <state> <client_id> <task_id> <control>`

If the queue is empty, the reply to the `GetTask` message is a `Terminate` message,
which tells the workers to terminate.

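The bookkeeping behind `get_task`, `task_done` and the `Terminate` reply can be sketched in memory as follows. The class name `TaskDispatcher`, the use of `None` to stand for a `Terminate` reply, and the integer task IDs are assumptions made for illustration; the wire format of the replies is not modelled.

```python
from collections import deque


class TaskDispatcher:
    """Scheduler-side bookkeeping for `get_task` and `task_done` (in memory only)."""

    def __init__(self, tasks):
        self.tasks = dict(enumerate(tasks, start=1))  # task ID -> task string
        self.queued = deque(self.tasks)               # IDs of tasks still queued
        self.running = {}                             # task ID -> client ID
        self.control = 0                              # accumulated control integer

    def get_task(self, client_id):
        """Return (task_id, task), or None to stand for a `Terminate` reply."""
        if not self.queued:
            return None
        task_id = self.queued.popleft()
        self.running[task_id] = client_id  # remember which client runs the task
        return task_id, self.tasks[task_id]

    def task_done(self, client_id, task_id, control):
        """Mark a running task as finished and accumulate its control contribution."""
        if self.running.get(task_id) != client_id:
            raise ValueError(f"task {task_id} is not running on client {client_id}")
        del self.running[task_id]
        self.control += control


# One worker (client 1) processing a two-task job.
dispatcher = TaskDispatcher(["task a", "task b"])
while (reply := dispatcher.get_task(client_id=1)) is not None:
    task_id, task = reply
    dispatcher.task_done(1, task_id, control=1)
```

Checking in `task_done` that the task really is running on the reporting client is a design choice of this sketch: it catches a worker reporting a task it was never given.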

When a worker terminates, it sends a `Disconnect` message to the scheduler:

`disconnect <state> <client_id>`

If there are remaining running clients, the reply is `0`. For the last client,
the reply contains the control integer, and this allows the worker to inform the
collector that all tasks are done and all workers are disconnected.

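The rule for the `disconnect` reply can be written down in a few lines. This is a sketch only: the function name `disconnect` and the representation of the client list as a Python set are assumptions.

```python
def disconnect(clients: set, control: int, client_id: int) -> int:
    """Remove a client and return the reply to its `disconnect` message.

    The reply is 0 while other clients remain connected, and the
    accumulated control integer for the last client.
    """
    clients.remove(client_id)  # raises KeyError for an unknown client ID
    return 0 if clients else control


clients = {1, 2}
first_reply = disconnect(clients, control=5, client_id=1)  # another client remains
last_reply = disconnect(clients, control=5, client_id=2)   # last client to leave
```

Note that, in this sketch, a job whose control integer sums to 0 would give the last client the same reply as any other client.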

Once the collector thread has finished pulling all the data, it can terminate.

Now, the main thread can send an `End_job` message to the scheduler to inform
it that the parallel task is done.