ditto.transformers

class ditto.transformers.CopyTransformer(target_dag, defaults)[source]

Bases: ditto.api.OperatorTransformer

Deep copies the source operator, clears the copy's connections to the source DAG and to any upstream/downstream tasks, and assigns it to the target_dag.

Note

If Airflow has trouble serializing the transformed operator, you can exclude some attributes from being deep copied by overriding the following attribute of the source operator:

>>> input_operator.shallow_copy_attrs += ('_dag',)
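The effect of listing an attribute in shallow_copy_attrs can be sketched without Airflow installed. The StandInOperator and Client classes below are hypothetical stand-ins written for this illustration, and the __deepcopy__ shown is an assumption about the general mechanism, not Airflow's actual implementation: attributes named in shallow_copy_attrs are shared by reference, everything else is deep copied.

```python
import copy


class Client:
    """Hypothetical resource that should not be duplicated by a deep copy."""


class StandInOperator:
    """Hypothetical stand-in for an operator (not Airflow's BaseOperator)."""

    # Attributes named here are shared by reference instead of deep copied.
    shallow_copy_attrs = ()

    def __init__(self, task_id, client, params):
        self.task_id = task_id
        self.client = client
        self.params = params

    def __deepcopy__(self, memo):
        cls = self.__class__
        result = cls.__new__(cls)
        memo[id(self)] = result
        for name, value in self.__dict__.items():
            if name in self.shallow_copy_attrs:
                setattr(result, name, value)  # share the same object
            else:
                setattr(result, name, copy.deepcopy(value, memo))
        return result


input_operator = StandInOperator("step", Client(), {"retries": 3})

# By default every attribute is deep copied, including the client.
full_copy = copy.deepcopy(input_operator)

# After excluding 'client', the copy shares the original client object.
input_operator.shallow_copy_attrs += ("client",)
partial_copy = copy.deepcopy(input_operator)
```

In this sketch, full_copy.client is a new object, while partial_copy.client is the same object as input_operator.client; the params dictionary is still copied in both cases.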
Parameters
  • target_dag (DAG) – the target to which the transformed operators must be added

  • defaults (TransformerDefaults) – the default configuration for this transformer

transform(input_operator, parent_fragment, upstream_fragments)[source]

The main method to implement for an OperatorTransformer

Parameters
  • input_operator – The source operator to be transformed

  • parent_fragment (DAGFragment) – A linked list of parent DAGFragments, up to the root. The transform operation can use this upstream parent chain, for example, to obtain information about the upstream transformations. A good example is an add-step operator transformation that needs to get the cluster ID from an upstream transformed create-cluster operator.

  • upstream_fragments (List[DAGFragment]) – All the upstream DAGFragments transformed so far, in level order, from the root to this node in the source DAG. This is a more exhaustive version of parent_fragment, and can also give you the parent’s siblings, etc.

Return type

DAGFragment

Returns

the DAGFragment of transformed Airflow operators
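The parent-chain lookup described for parent_fragment can be sketched as follows. StandInTask, StandInFragment, the parent and kind attributes, and the cluster ID value are all hypothetical names invented for this illustration; they are not ditto's DAGFragment API. The sketch only shows the idea of walking from a fragment towards the root until a create-cluster task is found.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StandInTask:
    """Hypothetical transformed task."""
    task_id: str
    kind: str
    cluster_id: Optional[str] = None


@dataclass
class StandInFragment:
    """Hypothetical fragment holding tasks and a link to its parent."""
    tasks: List[StandInTask]
    parent: Optional["StandInFragment"] = None


def find_upstream_cluster_id(fragment: Optional[StandInFragment]) -> Optional[str]:
    """Walk the parent chain towards the root; return the first cluster ID found."""
    current = fragment
    while current is not None:
        for task in current.tasks:
            if task.kind == "create-cluster":
                return task.cluster_id
        current = current.parent
    return None


# A two-level chain: root creates the cluster, the child waits on it.
root = StandInFragment([StandInTask("create", "create-cluster", "cluster-123")])
middle = StandInFragment([StandInTask("wait", "sensor")], parent=root)

cluster_id = find_upstream_cluster_id(middle)
```

A transformer for an add-step operator could run a lookup of this kind on its parent_fragment argument and return no result when the chain contains no create-cluster task.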

class ditto.transformers.IdentityTransformer(target_dag, defaults)[source]

Bases: ditto.api.OperatorTransformer

Reuses the source operator as-is as the transformed operator. It only clears the operator's connections to the source DAG and adds it to the target DAG.

Warning

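Use this transformer with caution. Because the source operator object itself is reused rather than copied, detaching it from the source DAG can have unexpected side effects on any code that still refers to it.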
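The difference between reusing and copying an operator can be illustrated with a minimal sketch. StandInOperator, identity_style and copy_style are hypothetical helpers written for this example, and dag_name is a simplified stand-in for an operator's link to its DAG; none of this is ditto or Airflow code.

```python
import copy


class StandInOperator:
    """Hypothetical operator that only records which DAG it belongs to."""

    def __init__(self, task_id, dag_name):
        self.task_id = task_id
        self.dag_name = dag_name


def identity_style(op, target_dag_name):
    """Reuse the same object: the caller's operator is modified in place."""
    op.dag_name = target_dag_name
    return op


def copy_style(op, target_dag_name):
    """Deep copy first: the caller's operator is left untouched."""
    clone = copy.deepcopy(op)
    clone.dag_name = target_dag_name
    return clone


source_a = StandInOperator("a", "source_dag")
moved = identity_style(source_a, "target_dag")

source_b = StandInOperator("b", "source_dag")
copied = copy_style(source_b, "target_dag")
```

Here moved and source_a are the same object, so source_a now reports the target DAG, whereas source_b still reports the source DAG after copy_style runs.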

Parameters
  • target_dag (DAG) – the target to which the transformed operators must be added

  • defaults (TransformerDefaults) – the default configuration for this transformer

transform(input_operator, parent_fragment, upstream_fragments)[source]

The main method to implement for an OperatorTransformer

Parameters
  • input_operator – The source operator to be transformed

  • parent_fragment (DAGFragment) – A linked list of parent DAGFragments, up to the root. The transform operation can use this upstream parent chain, for example, to obtain information about the upstream transformations. A good example is an add-step operator transformation that needs to get the cluster ID from an upstream transformed create-cluster operator.

  • upstream_fragments (List[DAGFragment]) – All the upstream DAGFragments transformed so far, in level order, from the root to this node in the source DAG. This is a more exhaustive version of parent_fragment, and can also give you the parent’s siblings, etc.

Return type

DAGFragment

Returns

the DAGFragment of transformed Airflow operators