# Extending DAPHNE with More Scheduling Knobs

This document focuses on how a DAPHNE developer may extend DAPHNE by adding new scheduling techniques.
## Guidelines

The DAPHNE developer should consider the following files when adding a new scheduling technique:

- `src/runtime/local/vectorized/LoadPartitioning.h`
- `src/api/cli/daphne.cpp`
### Adding the actual code of the technique

The first file, `LoadPartitioning.h`, contains the implementation of the currently supported scheduling techniques, i.e., the current version of DAPHNE uses self-scheduling techniques to partition the tasks. It also uses the self-scheduling principle for executing the tasks.
For more details, please visit Scheduler design for tasks and pipelines.
In this file, the developer should change two things:

- The enumeration `SelfSchedulingScheme`: the developer will have to add a name for the new technique, e.g., `MYTECH`.
- The function `getNextChunk()`: this function has a switch-case statement that selects the mathematical formula corresponding to the chosen scheduling method. The developer has to add a new case that handles the new technique:

  ```cpp
  uint64_t getNextChunk() {
      // ...
      switch (schedulingMethod) {
          // ...
          // Only the following case is what the developer has to add.
          // The rest remains the same.
          case MYTECH: { // the new technique
              chunkSize = FORMULA; // some formula to calculate the chunk size (partition size)
              break;
          }
          // ...
      }
      // ...
      return chunkSize;
  }
  ```
### Enabling the selection of the newly added technique

The second file, `daphne.cpp`, contains the code that parses the command-line arguments and passes them to the DAPHNE compiler and runtime. The developer has to register the new technique as a valid option; otherwise, the newly added technique cannot be selected.
There is a variable called `taskPartitioningScheme` of type `opt<SelfSchedulingScheme>`. The developer should extend the declaration of `opt<SelfSchedulingScheme>` as follows:
```cpp
opt<SelfSchedulingScheme> taskPartitioningScheme(
    cat(daphneOptions), desc("Choose task partitioning scheme:"),
    values(
        clEnumVal(STATIC, "Static (default)"),
        clEnumVal(SS, "Self-scheduling"),
        clEnumVal(GSS, "Guided self-scheduling"),
        clEnumVal(TSS, "Trapezoid self-scheduling"),
        clEnumVal(FAC2, "Factoring self-scheduling"),
        clEnumVal(TFSS, "Trapezoid Factoring self-scheduling"),
        clEnumVal(FISS, "Fixed-increase self-scheduling"),
        clEnumVal(VISS, "Variable-increase self-scheduling"),
        clEnumVal(PLS, "Performance loop-based self-scheduling"),
        clEnumVal(MSTATIC, "Modified version of Static, i.e., instead of n/p, it uses n/(4*p) where n is number of tasks and p is number of threads"),
        clEnumVal(MFSC, "Modified version of fixed size chunk self-scheduling, i.e., MFSC does not require profiling information as FSC does"),
        clEnumVal(PSS, "Probabilistic self-scheduling"),
        clEnumVal(MYTECH, "some meaningful description of the abbreviation of the new technique")
    )
);
```
### Usage of the new technique

DAPHNE developers may now pass the new technique as an option when they execute a DaphneDSL script:

```shell
daphne --vec --MYTECH --grain-size 10 --num-threads 4 --PERCPU --SEQPRI --pre-partition --pin-workers --hyperthreading --debug-mt my_script.daphne
```
In this example, DAPHNE will execute `my_script.daphne` with the following configuration:

- the vectorized engine is enabled due to `--vec`
- the DAPHNE runtime will use MYTECH for task partitioning due to `--MYTECH`
- the minimum partition size will be 10 due to `--grain-size 10`
- the vectorized engine will use 4 threads due to `--num-threads 4`
- work stealing will be used with a separate queue for each CPU due to `--PERCPU`
- the work-stealing victim selection will be sequential prioritized due to `--SEQPRI`
- the rows will be evenly distributed before the scheduling technique is applied due to `--pre-partition`
- the CPU workers will be pinned to CPU cores due to `--pin-workers`
- if the number of threads is not specified, the number of logical CPU cores will be used (instead of physical CPU cores) due to `--hyperthreading`
- debugging information related to the multithreading of vectorizable operations will be printed due to `--debug-mt`