Tip | Interacting with partitioned datasets using the Python API

When a recipe has partitioned datasets as input or output, make sure your code reads and writes the intended partitions.

Reading and writing

Inside a recipe, you don’t need to specify the source or destination partitions in your code. Dataiku selects the partitions for both reading and writing.

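To read the input partitions (as defined by the partition dependencies), call get_dataframe() on the input dataset. It automatically returns the relevant partitions. To write, call a dataframe writer such as write_with_schema() on the output dataset. The rows go to the partition being built.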
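As a minimal sketch of this pattern, the helper below reads from an input dataset and writes to an output dataset without ever naming a partition. The helper name copy_partition, its transform argument, and the dataset names "orders" and "orders_prepared" are hypothetical. Only get_dataframe(), write_with_schema(), and dataiku.Dataset() are part of the Dataiku Python API.

```python
def copy_partition(input_dataset, output_dataset, transform=None):
    """Read the partitions Dataiku selected, optionally transform, then write.

    No partition identifier appears here: inside a recipe, Dataiku resolves
    the partitions to read from the partition dependencies, and the partition
    to write from the build request.
    """
    df = input_dataset.get_dataframe()
    if transform is not None:
        df = transform(df)
    output_dataset.write_with_schema(df)
    return df


def main():
    # Imported here because the dataiku package only exists inside Dataiku.
    import dataiku

    # "orders" and "orders_prepared" are hypothetical dataset names.
    source = dataiku.Dataset("orders")
    target = dataiku.Dataset("orders_prepared")
    copy_partition(source, target)
```

In a Python recipe, calling main() would process whichever partitions the current build targets. The same code runs unchanged for every partition.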

Other purposes

For purposes other than reading or writing dataframes, you can access the name of the partition being built (as well as other flow variables) through the Python dictionary dku_flow_variables. This dictionary is available as dataiku.dku_flow_variables, as described in the reference documentation.
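The lookup can be sketched as follows. The helper name target_partition_value and the dimension name "country" are hypothetical. The DKU_DST_ key prefix is an assumption about how output partition values are named, so check the exact keys in the reference documentation, or print the dictionary from inside a recipe, before relying on it.

```python
def target_partition_value(flow_variables, dimension):
    """Return the value of one output partition dimension.

    Assumes keys follow the pattern DKU_DST_<dimension name>; this pattern
    is an assumption to verify against the reference documentation.
    """
    key = "DKU_DST_" + dimension
    if key not in flow_variables:
        raise KeyError(
            "No flow variable %r; available keys: %s"
            % (key, sorted(flow_variables))
        )
    return flow_variables[key]


def main():
    # Imported here because the dataiku package only exists inside Dataiku.
    import dataiku

    # "country" is a hypothetical partition dimension name.
    country = target_partition_value(dataiku.dku_flow_variables, "country")
    print("Building partition for country:", country)
```

Passing the dictionary in as an argument keeps the lookup easy to exercise outside Dataiku with a plain dict.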

Note

dataset.get_write_partition() is deprecated.