A Graphlet represents (a part of) a data pipeline described as a directed graph. Just as a graph can be composed of subgraphs, a graphlet can be composed not only of nodes but also of other graphlets. To enable this kind of hierarchical composition, a graphlet has an interface similar to that of a node:
Additionally, a graphlet is described by the following information:
A visualization of a node and a graphlet illustrating the concepts of ports and connections is shown in the graphical UI.
The source and destination endpoints of each connection are each identified by a port. In the case of a wire, the endpoint can be just a port name, referencing the port with that name in the containing graphlet. Otherwise, the endpoint consists of two parts separated by a dot (.):
Commonly, both endpoints of a connection are ports of components in the DAG. However, a connection can also describe a data flow where either the source or the destination endpoint lies outside the DAG, e.g. to connect to an external process. Such connections use an empty string to indicate the external endpoint.
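The three endpoint forms described above (a bare port name, a dotted two-part name, and an empty string for an external endpoint) can be told apart with a small parser. The following is a minimal sketch, not the actual implementation: the names `Endpoint` and `parse_endpoint` are hypothetical, and it assumes the two dot-separated parts are a component name followed by a port name.

```python
from typing import NamedTuple, Optional


class Endpoint(NamedTuple):
    """Hypothetical parsed endpoint; not part of the described system."""
    component: Optional[str]  # None: the port belongs to the containing graphlet
    port: str


def parse_endpoint(text: str) -> Optional[Endpoint]:
    """Parse an endpoint string; return None for an external endpoint."""
    if text == "":
        # An empty string marks an endpoint outside the DAG.
        return None
    # Assumption: the part before the last dot names the component,
    # the part after it names the port.
    component, dot, port = text.rpartition(".")
    if not dot:
        # No dot: a bare port name, as used by wires.
        return Endpoint(None, text)
    if not component or not port:
        raise ValueError(f"malformed endpoint: {text!r}")
    return Endpoint(component, port)
```

For example, `parse_endpoint("camera.image")` yields a component/port pair, `parse_endpoint("image")` refers to a port of the containing graphlet, and `parse_endpoint("")` signals an external endpoint.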
For scheduling, the data flow in the graph represents a data dependency. A topological order in which each node's passes should be executed can only be established for a directed acyclic graph. To break potential cycles within the graph, some connections can be marked as indirect connections. While these still represent a data flow, they do not represent a data dependency for scheduling.
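The effect of indirect connections on scheduling can be illustrated with Python's standard `graphlib` module. This is only a sketch under simplifying assumptions: the function `schedule` and the `(source, destination, indirect)` tuple layout are hypothetical, and components are reduced to plain names.

```python
from graphlib import CycleError, TopologicalSorter


def schedule(components, connections):
    """Return an execution order that respects all direct connections.

    `connections` is an iterable of (source, destination, indirect)
    tuples. Indirect connections carry data but are ignored here,
    so they cannot introduce a cycle into the dependency graph.
    """
    sorter = TopologicalSorter()
    for name in components:
        sorter.add(name)
    for source, destination, indirect in connections:
        if not indirect:
            # The destination depends on the data produced by the source.
            sorter.add(destination, source)
    # Raises graphlib.CycleError if the direct connections form a cycle.
    return list(sorter.static_order())


components = ["sensor", "filter", "controller"]
connections = [
    ("sensor", "filter", False),
    ("filter", "controller", False),
    ("controller", "filter", True),  # feedback loop, marked indirect
]
order = schedule(components, connections)
```

If the feedback connection were not marked as indirect, the graph would contain a cycle and `static_order()` would raise `CycleError` instead of producing an order.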
A wire is different from a connection since no actual data transport is implied yet. It solely exists to expose a port from a subcomponent to the public interface of the surrounding graphlet. An input port of a graphlet might be connected to one or multiple input ports of subcomponents.
While a connection only describes that data should flow between two endpoints, a Channel provides a specific kind of data transport mechanism (e.g. sockets, shared memory). A single channel can also provide the data transport for more than one connection, e.g. via UDP multicast. The exact type of a channel, as well as optional configuration options of the specified transport mechanism, are defined by parameters associated with each connection.
Importantly, regardless of the selected transport mechanism, all memory is owned by the channel to provide efficient communication.
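How several connections might end up sharing one channel can be sketched as follows. This is purely illustrative: the parameter keys `type` and `options`, the function `group_by_channel`, and the rule that connections with identical parameters share a channel are all assumptions, not taken from the actual design.

```python
from collections import defaultdict


def group_by_channel(connections):
    """Group connections that request the same transport configuration.

    `connections` is an iterable of (source, destination, params) tuples,
    where `params` is a dict with a hypothetical "type" key and an
    optional "options" dict. Assumption: identical parameters mean the
    connections can be served by a single channel.
    """
    channels = defaultdict(list)
    for source, destination, params in connections:
        options = tuple(sorted(params.get("options", {}).items()))
        channels[(params["type"], options)].append((source, destination))
    return dict(channels)


multicast = {"type": "udp_multicast", "options": {"group": "239.0.0.1", "port": 5000}}
channels = group_by_channel([
    ("camera.image", "detector.image", multicast),
    ("camera.image", "recorder.image", multicast),
    ("detector.objects", "planner.objects", {"type": "shared_memory"}),
])
```

Here the two multicast connections map to one channel, while the shared-memory connection gets a channel of its own.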