Compute Graph Framework SDK Reference 5.10
    Execution

    The execution of the computational pipeline defined by a DAG has two additional aspects which are orthogonal to each other:

    Scheduling

    The scheduling strategy defines when each processing element - the passes from the nodes - should be executed. It must satisfy the constraints imposed by the DAG: the passes of a node which has data dependencies on other nodes can't be scheduled until the passes of those other nodes have been executed.

    The order of node passes which are not constrained by data dependencies expressed in the DAG can be freely decided by the scheduler. Those passes might run sequentially in either order or concurrently, as illustrated by the sketch below.
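    To make the dependency constraint concrete, the following is a minimal sketch of how a scheduler could derive one valid execution order from pass dependencies - a plain topological sort, shown for illustration only and not the CGF scheduler API. A pass becomes ready once all of its upstream passes have finished; ready passes may be executed in any order or dispatched concurrently.

    #include <cstddef>
    #include <queue>
    #include <vector>

    // Passes are identified by index; edges[i] lists the passes that depend
    // on pass i, and inDegree[i] counts the unfinished upstream passes of i.
    std::vector<int> validOrder(const std::vector<std::vector<int>>& edges,
                                std::vector<int> inDegree)
    {
        std::queue<int> ready;
        for (std::size_t i = 0; i < inDegree.size(); ++i)
            if (inDegree[i] == 0)
                ready.push(static_cast<int>(i)); // no data dependencies
        std::vector<int> order;
        while (!ready.empty())
        {
            int pass = ready.front();
            ready.pop();
            order.push_back(pass); // run the pass here (or dispatch concurrently)
            for (int dependent : edges[pass])
                if (--inDegree[dependent] == 0)
                    ready.push(dependent); // all upstream passes finished
        }
        return order; // one valid sequential schedule among many
    }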

    Example

    The following diagrams show possible schedules for the same DAG. For simplicity it is assumed that all passes take the same amount of time and that the passes in "node 1" must run sequentially.

    Figure: Example DAG

    Figure: Example Schedule A
    Figure: Example Schedule B

    Periodicity

    In complex applications the overall data pipeline might not be executed in a single repeated cycle. Parts of the DAG might be scheduled at different frequencies and run independently - potentially concurrently. In STM (see below) terminology, such concurrently running parts of the data pipeline are called epochs.

    For nodes with data dependencies that span such scheduling domains, the channels must be configured to match the different execution frequencies.

    • If the producing node is scheduled at a higher frequency than the consuming node, the channel defines the desired policy (see the FIFO sketch after this list), e.g.:
      • use a queue large enough to store N messages if the producing node runs N times for each single run of the consuming node, or
      • overwrite older messages so that the consuming node only receives the latest of the N messages when it runs.
    • If the producing node is scheduled at a lower frequency than the consuming node, the channel defines the desired policy (see the mailbox sketch after this list), e.g.:
      • the consuming node only receives a new message every N-th scheduling iteration, or
      • the consuming node reuses the previous message if no new message was provided by the producing node.
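    The two families of policies can be modeled with a bounded FIFO and a latest-only "mailbox". The sketch below is a simplified illustration under assumed semantics - class and method names are made up for the example and are not the CGF channel API.

    #include <cstddef>
    #include <deque>
    #include <optional>
    #include <utility>

    // Bounded FIFO: for a producer running N times per consumer run, where
    // all N messages should be delivered; the oldest is dropped on overflow.
    template <typename T>
    class BoundedFifo
    {
    public:
        explicit BoundedFifo(std::size_t capacity) : m_capacity(capacity) {}

        void push(T msg)
        {
            if (m_queue.size() == m_capacity)
                m_queue.pop_front(); // queue full: drop the oldest message
            m_queue.push_back(std::move(msg));
        }

        std::optional<T> pop()
        {
            if (m_queue.empty())
                return std::nullopt;
            T msg = std::move(m_queue.front());
            m_queue.pop_front();
            return msg;
        }

    private:
        std::size_t m_capacity;
        std::deque<T> m_queue;
    };

    // Mailbox: the consumer only ever sees the latest message; with reuse
    // enabled it keeps returning the previous message when no new one arrived.
    template <typename T>
    class Mailbox
    {
    public:
        void push(T msg) { m_latest = std::move(msg); }

        std::optional<T> read(bool reuseLast)
        {
            std::optional<T> result = m_latest;
            if (!reuseLast)
                m_latest.reset(); // consumed: empty until the next push
            return result;
        }

    private:
        std::optional<T> m_latest;
    };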

    System Task Manager (STM)

    CGF aims to support the implementation of custom schedulers to execute a CGF data pipeline. As of today only a single scheduler implementation is provided.

    System Task Manager (STM) is a static, centrally-monitored, OS-agnostic, non-preemptive user-space scheduler that manages the work across hardware engines. Based on the information from the DAG, a schedule configuration, and the worst-case execution time (WCET) of each pass, the STM compiler generates a static schedule. The global scheduling policy is then applied at runtime across the heterogeneous compute platform.

    One important aspect of STM is that it only ever schedules a single pass per hardware engine - never multiple passes concurrently. This aims to provide predictable performance by preventing the underlying OS from affecting the scheduling of concurrently running passes. The sketch below illustrates what such a per-engine schedule could look like.
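    As an illustration only (hypothetical types; the real schedule is produced by the STM compiler from the DAG, the schedule configuration, and the WCETs), a static schedule can be thought of as one non-overlapping timeline of passes per hardware engine:

    #include <cstddef>
    #include <cstdint>
    #include <map>
    #include <string>
    #include <vector>

    // Hypothetical representation: one timeline of passes per hardware engine.
    struct ScheduledPass
    {
        std::string name;
        std::uint64_t startUs; // statically assigned start time
        std::uint64_t wcetUs;  // worst-case execution time budget
    };

    using StaticSchedule = std::map<std::string, std::vector<ScheduledPass>>;

    // Entries on one engine never overlap, so at most one pass occupies an
    // engine at any point in time (timeline assumed sorted by startUs).
    bool hasNoOverlap(const std::vector<ScheduledPass>& timeline)
    {
        for (std::size_t i = 1; i < timeline.size(); ++i)
        {
            const ScheduledPass& prev = timeline[i - 1];
            if (prev.startUs + prev.wcetUs > timeline[i].startUs)
                return false;
        }
        return true;
    }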

    See the STM documentation for more details and the available tools. The driveworks_stm.deb package contains a PDF named Nvidia-STM-Userguide-for-DW<version>.pdf under /usr/local/driveworks-<version>/doc.

    Process Layout

    For scalability, the execution of nodes might be distributed across more than one SoC, and for fault isolation, distribution across multiple OS processes might be desired. Therefore a CGF application can define multiple OS processes and map each node to a specific process.

    Figure: Process Layout
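    As a schematic illustration (all names are made up; real CGF applications declare this mapping in their application description files, not in code), a process layout pins each process to a SoC and assigns each node to exactly one process:

    #include <map>
    #include <string>

    struct ProcessSpec
    {
        std::string machine; // SoC the process runs on, e.g. "machine0"
    };

    int main()
    {
        // Two OS processes, distributed across two SoCs for fault isolation.
        std::map<std::string, ProcessSpec> processes = {
            {"sensor_process",     {"machine0"}},
            {"perception_process", {"machine1"}},
        };

        // Each node is mapped to exactly one process.
        std::map<std::string, std::string> nodeToProcess = {
            {"cameraNode",    "sensor_process"},
            {"radarNode",     "sensor_process"},
            {"detectionNode", "perception_process"},
        };

        (void)processes;
        (void)nodeToProcess;
        return 0;
    }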