EmrCreateJobFlowOperatorTransformer

class ditto.transformers.emr.EmrCreateJobFlowOperatorTransformer(target_dag, defaults)[source]

Bases: ditto.api.OperatorTransformer

Transforms the operator EmrCreateJobFlowOperator

Parameters
  • target_dag (DAG) – the target to which the transformed operators must be added

  • defaults (TransformerDefaults) – the default configuration for this transformer

get_cluster_name(src_operator)[source]

get the cluster name from the EMR job flow and it’s connection

transform(src_operator, parent_fragment, upstream_fragments=None)[source]

Copies the AzureHDInsightCreateClusterOperator from the given default_operator provided and attaches it to the target DAG after transferring airflow attributes from the source EmrCreateJobFlowOperator

Relies on the default op as it is not possible to translate an EMR cluster spec to an HDInsight cluster spec automatically, so it is best to accept that operator from the user itself

What it does do though is attach a AzureHDInsightClusterSensor to monitor the provisioning of this newly created cluster.

Since the HDInsight management client is idempotent, it does not matter if the cluster already exists and the operator simply moves on if that is the case.

Return type

DAGFragment