EmrHdiDagTransformerTemplate

class ditto.templates.EmrHdiDagTransformerTemplate(target_dag, transformer_defaults=None, operator_transformers=None, transformer_resolvers=None, subdag_transformers=None, **kwargs)[source]

Bases: ditto.ditto.AirflowDagTransformer

This is the de facto template for converting an EMR-based Airflow DAG to an Azure HDInsight-based Airflow DAG in ditto. If you encounter more complex patterns, you can always create your own template.

This class is easy to subclass.

See also

See examples/example_emr_job_flow_dag.py for an example. More examples are available in the unit tests at tests/test_dag_transformations.py.

Parameters
  • target_dag (DAG) – ditto allows you to provide a pre-fabricated Airflow DAG object so that you can set essential parameters such as its schedule_interval, params, and a unique dag_id outside of ditto itself, instead of ditto copying the attributes over from the source DAG. This gives you more flexibility.

  • transformer_defaults (Optional[TransformerDefaultsConf]) – allows you to pass a map from each transformer type to its default configuration. This is helpful for passing things like a default operator to use when the transformer cannot transform the source operator for some reason, or any other configuration required by the transformer.

  • transformer_resolvers (Optional[List[TransformerResolver]]) – resolvers to use to find the transformers for each kind of operator in the source DAG.

  • subdag_transformers (Optional[List[Type[SubDagTransformer]]]) – subdag transformers to use for converting matching subdags in the source DAG to transformed subdags.

  • debug_mode – when True, renders the intermediate results of the transformation using networkx and matplotlib so that you can debug your transformations easily.
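
To make the roles of transformer_resolvers and transformer_defaults more concrete, here is a minimal, self-contained sketch of the general idea. It does not use ditto or Airflow: every class and function in it (Operator, EmrStep, HdiStep, Placeholder, ClassResolver, transform_tasks, and so on) is a hypothetical stand-in invented for illustration, and the fallback behaviour shown is an assumption rather than ditto's actual implementation.

```python
from typing import Dict, List, Optional, Type


class Operator:
    """Toy stand-in for an Airflow operator (hypothetical)."""

    def __init__(self, task_id: str, convertible: bool = True):
        self.task_id = task_id
        self.convertible = convertible


class EmrStep(Operator):
    """Pretend source-side (EMR) operator."""


class HdiStep(Operator):
    """Pretend target-side (HDInsight) operator."""


class Placeholder(Operator):
    """Pretend default operator used when conversion fails."""


class Transformer:
    """Converts one source operator; returns None if it cannot."""

    def transform(self, op: Operator) -> Optional[Operator]:
        raise NotImplementedError


class EmrStepTransformer(Transformer):
    def transform(self, op: Operator) -> Optional[Operator]:
        if not op.convertible:
            return None  # signal "cannot transform"
        return HdiStep(op.task_id)


class ClassResolver:
    """Resolver that looks up a transformer by the operator's exact class."""

    def __init__(self, mapping: Dict[Type[Operator], Type[Transformer]]):
        self.mapping = mapping

    def resolve(self, op: Operator) -> Optional[Type[Transformer]]:
        return self.mapping.get(type(op))


def transform_tasks(
    source_ops: List[Operator],
    resolvers: List[ClassResolver],
    transformer_defaults: Optional[Dict[Type[Transformer], Type[Operator]]] = None,
) -> List[Operator]:
    defaults = transformer_defaults or {}
    result = []
    for op in source_ops:
        # The first resolver that returns a transformer wins.
        candidates = (r.resolve(op) for r in resolvers)
        transformer_cls = next((t for t in candidates if t is not None), None)
        if transformer_cls is None:
            result.append(op)  # assumption: unmatched operators pass through
            continue
        new_op = transformer_cls().transform(op)
        if new_op is None:
            # Fall back to the default operator configured for this transformer.
            default_cls = defaults.get(transformer_cls)
            new_op = default_cls(op.task_id) if default_cls else op
        result.append(new_op)
    return result


source = [EmrStep("a"), EmrStep("b", convertible=False), Operator("c")]
result = transform_tasks(
    source,
    resolvers=[ClassResolver({EmrStep: EmrStepTransformer})],
    transformer_defaults={EmrStepTransformer: Placeholder},
)
kinds = [type(op).__name__ for op in result]
print(kinds)
```

In this toy model the resolver decides which transformer handles each operator, while the defaults map only comes into play when the chosen transformer gives up. For real usage of the template itself, refer to the example file and unit tests listed under "See also".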