Solution | Factories Electricity & CO2 Emissions Forecasting#

Overview#

Business case#

Demand for industrial products has risen considerably in the past two decades, along with energy consumption and CO2 emissions. Industrial companies are racing to reduce their CO2 emissions, both to adapt to a shift in market demand and to better manage their environmental risks and obligations. They also need to comply with requests from financial markets and with future regulatory requirements.

With this Solution, companies can create a unified and interactive view of their energy consumption across manufacturing sites and meters, and convert it to CO2 emissions using real-time carbon intensity data provided by electricityMap or RTE. Production planners can forecast the electricity consumption and CO2 emissions linked to their production plans, and adjust their geographical allocation choices to optimize their footprint.

Installation#

  1. From the Design homepage of a Dataiku instance connected to the internet, click + Dataiku Solutions.

  2. Search for and select Factories Electricity & CO2 Emissions Forecasting.

  3. If needed, change the folder into which the Solution will be installed, and click Install.

  4. Follow the modal to either install the technical prerequisites below or request an admin to do it for you.

Note

Alternatively, download the Solution’s .zip project file, and import it to your Dataiku instance as a new project.

Technical requirements#

To leverage this Solution, you must meet the following requirements:

nbformat>=4.2.0
plotly-express>=0.4.1
MarkupSafe<2.1.0
cloudpickle>=1.3,<1.6
flask>=1.0,<1.1
itsdangerous<2.1.0
Jinja2>=2.11,<2.12
lightgbm>=3.2,<3.3
scikit-learn>=0.20,<0.21
scikit-optimize>=0.7,<0.8
scipy>=1.2,<1.3
statsmodels==0.12.2
xgboost==0.82
gluonts==0.10.4
pmdarima==1.2.1
mxnet==1.8.0.post0
  • To convert electricity consumption for locations other than France, you can use the electricityMap API. You can get a token by completing the ElectricityMap Contact Form.

Data requirements#

The Solution depends on several input data sources:

  • Daily electricity consumption reports (each corresponding to a different fictional factory)

  • A dataset of site addresses with the postal address of all three fictional factories

  • A dataset containing the production history of the three factories

  • Two datasets to show the meter distribution in the factories (before and after data preparation steps)

  • A dataset representing three proposed production scenarios (this dataset is input to the 3 scenarios forecast Flow zone unlike the other datasets)

Note

This project is meant to be used as a template to guide development of your own analysis in Dataiku. You shouldn’t use the results of the model as actionable insights. Some data provided with the project may not be representative of actual data in a real-life project.

Workflow overview#

You can follow along with the Solution in the Dataiku gallery.

Dataiku screenshot of the final project Flow showing all Flow zones.

The project has the following high-level steps:

  1. Ingest and prepare the data.

  2. Convert electricity consumption to CO2 emissions.

  3. Compute the cost of electricity per factory.

  4. Train a time series model for forecasting energy consumption and CO2 emissions.

  5. Forecast the CO2 emissions of 3 different production plans.

  6. Understand factories’ CO2 emissions for the past, present, and future with dashboard visualizations.

Walkthrough#

Note

In addition to reading this document, it’s recommended to read the project wiki before beginning. The wiki gives a deeper technical understanding of how this Solution was created and explains Solution-specific vocabulary in more detail.

Ingesting and preparing data#

The first two Flow zones of the project are straightforward. The project begins in the Data Ingestion Flow zone by bringing in the initial datasets detailed in the previous Data requirements section.

Dataiku screenshot of the Flow zone dedicated to preparing the Electricity Consumption datasets.

Now that you have access to all input datasets, you can take the three electricity consumption reports, each corresponding to a fictional factory, into the Electricity Consumption preparation Flow zone. It first stacks data from all three factories into a single AllSites_DailyConsumption dataset, prepares the dates, extracts relevant components, and filters rows where electricity consumption is negative.

When adapting this project for your own use, you will need to update the date preparation and component extraction steps according to your own data. In this Flow zone, you then create three sub-branches to showcase how to create a ‘soft sensor’ meter that aggregates consumption from different meters. All data is then re-stacked into a finalized dataset containing electricity consumption values for all three factories.
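The stack, prepare, and soft-sensor steps described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the Solution's actual recipes: the row layout (`site`, `date`, `meter`, `kwh`), the sample values, and the helper names `stack`, `prepare`, and `soft_sensor` are all hypothetical, and the sketch assumes that negative readings are dropped.

```python
from collections import defaultdict
from datetime import date


def stack(*reports):
    """Concatenate per-factory reports into one list of rows."""
    return [dict(row) for report in reports for row in report]


def prepare(rows):
    """Parse dates, extract date components, and drop negative readings."""
    prepared = []
    for row in rows:
        if row["kwh"] < 0:
            continue  # assumption: a negative consumption is a faulty reading
        day = date.fromisoformat(row["date"])
        prepared.append({**row, "date": day, "month": day.month,
                         "weekday": day.isoweekday()})
    return prepared


def soft_sensor(rows, name, meters):
    """Sum the readings of several physical meters into one virtual meter."""
    totals = defaultdict(float)
    for row in rows:
        if row["meter"] in meters:
            totals[(row["site"], row["date"])] += row["kwh"]
    return [{"site": site, "date": day, "meter": name, "kwh": kwh,
             "month": day.month, "weekday": day.isoweekday()}
            for (site, day), kwh in sorted(totals.items())]


# Hypothetical sample data for two factories.
factory_a = [
    {"site": "factory_a", "date": "2021-01-04", "meter": "oven_1", "kwh": 40.0},
    {"site": "factory_a", "date": "2021-01-04", "meter": "oven_2", "kwh": 35.5},
    {"site": "factory_a", "date": "2021-01-05", "meter": "oven_1", "kwh": -3.0},
]
factory_b = [
    {"site": "factory_b", "date": "2021-01-04", "meter": "press", "kwh": 12.0},
]

daily = prepare(stack(factory_a, factory_b))
ovens = soft_sensor(daily, "ovens_total", {"oven_1", "oven_2"})
restacked = daily + ovens  # re-stack the virtual meter with the physical ones
```

Keeping the soft sensor as extra rows, rather than as a new column, means downstream steps can treat virtual and physical meters the same way.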

The costs of electricity consumption: CO2 emissions#

Now that you have a dataset with electricity consumption for all factories (AllSites_DailyConsumptionRestac), you can convert electricity consumption into its resulting CO2 emissions. Computation of CO2 emissions, however, also relies on knowing where the electricity was consumed.

Therefore, the CO2 conversion zone begins by passing the sites_addresses input dataset to the Geocoder Plugin to retrieve the latitude/longitude coordinates of the three factories. You then append this location data to the dataset containing the factories’ electricity consumption and, finally, use the CO2 Converter Plugin to convert electricity consumption to CO2 emissions. The resulting dataset, AllSites_DailyConsumption_CO2, contains three new columns:

  • co2_date_time

  • carbon_intensity (gCO₂eq/kWh)

  • co2_emission (kgCO₂eq)

Dataiku screenshot of the CO2 converter plugin used to convert electricity consumption to CO2 emissions.
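Given the units of the new columns, the conversion amounts to multiplying consumption in kWh by carbon intensity in gCO₂eq/kWh and dividing by 1,000 to obtain kgCO₂eq. The sketch below illustrates that arithmetic together with the coordinate enrichment. It is not the plugin's implementation: the coordinates, the carbon intensity values, the `kwh` field, and the function names are hypothetical placeholders.

```python
# Hypothetical coordinates, standing in for the geocoding output.
SITE_COORDINATES = {
    "factory_fr": (48.8566, 2.3522),
    "factory_de": (52.5200, 13.4050),
}

# Hypothetical carbon intensities in gCO2eq/kWh; real values vary by
# location and time and would come from a carbon intensity data provider.
CARBON_INTENSITY = {"factory_fr": 60.0, "factory_de": 400.0}


def co2_emission_kg(consumption_kwh, carbon_intensity_g_per_kwh):
    """Convert kWh to kgCO2eq: kWh * (g/kWh) gives grams, / 1000 gives kg."""
    return consumption_kwh * carbon_intensity_g_per_kwh / 1000.0


def add_co2_columns(rows):
    """Append coordinates, carbon intensity, and emissions to each row."""
    enriched = []
    for row in rows:
        latitude, longitude = SITE_COORDINATES[row["site"]]
        intensity = CARBON_INTENSITY[row["site"]]
        enriched.append({
            **row,
            "latitude": latitude,
            "longitude": longitude,
            "carbon_intensity": intensity,
            "co2_emission": co2_emission_kg(row["kwh"], intensity),
        })
    return enriched


consumption = [
    {"site": "factory_fr", "date": "2021-01-04", "kwh": 1000.0},
    {"site": "factory_de", "date": "2021-01-04", "kwh": 1000.0},
]
with_co2 = add_co2_columns(consumption)
```

With these placeholder intensities, the same 1,000 kWh yields different emissions at each site, which is why the location of consumption matters for the conversion.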

Smart production planning: Forecasting the future#

In the Time series forecasting Flow zone, we train a time series forecasting model using visual ML. Before training, however, we first filter the electricity consumption dataset to keep only the main meters of the 3 factories: process_main, main_a, and factory_main. While the prediction in this project is made at the site level, a prediction could also be made at the meter or workshop level.
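As a minimal sketch of that filter, assuming each row carries a `meter` field (a hypothetical layout), only the three main meters named above are kept. The sample rows and the helper name `keep_main_meters` are illustrative.

```python
# Main meter names taken from the text above.
MAIN_METERS = {"process_main", "main_a", "factory_main"}


def keep_main_meters(rows, main_meters=MAIN_METERS):
    """Keep only the rows measured by a site's main meter."""
    return [row for row in rows if row["meter"] in main_meters]


# Hypothetical sample rows mixing main meters and sub-meters.
rows = [
    {"site": "site_1", "meter": "process_main", "kwh": 900.0},
    {"site": "site_1", "meter": "oven_1", "kwh": 40.0},
    {"site": "site_2", "meter": "main_a", "kwh": 650.0},
    {"site": "site_3", "meter": "factory_main", "kwh": 720.0},
]
site_level = keep_main_meters(rows)
```

To forecast at the meter or workshop level instead, the same function could be called with a different set of meter names.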

After filtering, we bring in the Weekly_production_history dataset to be used as an external feature for forecasting so that the model can better decompose the input signal. We also use the Time Series Preparation Plugin to interpolate the data, which aligns this external feature on a fixed sampling rate. With our data sufficiently prepared, we’re ready to train a forecasting model.
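To illustrate what aligning a weekly feature on a daily sampling rate involves, the following sketch fills in the days between weekly production points. Linear interpolation is an assumption made for this example; the method actually configured in the Solution may differ. The function name and sample values are hypothetical.

```python
from datetime import date, timedelta


def interpolate_daily(weekly):
    """Linearly interpolate (date, value) points, sorted by date, to one value per day.

    Expects at least one point.
    """
    daily = {}
    for (d0, v0), (d1, v1) in zip(weekly, weekly[1:]):
        span = (d1 - d0).days
        for offset in range(span):
            # Weight the two surrounding weekly values by distance in days.
            daily[d0 + timedelta(days=offset)] = v0 + (v1 - v0) * offset / span
    last_day, last_value = weekly[-1]
    daily[last_day] = last_value
    return daily


# Hypothetical weekly production quantities.
weekly_production = [
    (date(2021, 1, 4), 700.0),
    (date(2021, 1, 11), 1400.0),
]
daily_production = interpolate_daily(weekly_production)
```

After this step, the production feature has one value per day and can be joined to the daily consumption series on the date.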

Dataiku screenshot of the Forecast plugin being used to train a time series forecast model.

In this Solution, we use the model to forecast the daily electricity consumption, and the resulting CO2 emissions, of three different production plans. This supports smarter decisions when selecting the production plan that will result in the lowest CO2 emissions. The 3 scenarios forecast Flow zone looks complex at first but is actually a repeating pattern. To begin, we upload a dataset containing three possible production scenarios, which is then resampled and split by scenario into three branches. Each branch employs the same four steps:

  1. Use the trained model to forecast future consumption values.

  2. Prepare the forecasted data to have all electricity consumption (past and future) in the same column.

  3. Enrich the prepared data with coordinates of the factories.

  4. Convert electricity consumption into CO2 emissions using the plugin.

After completion of these four steps in parallel, the three scenarios are stacked into a single dataset and aggregated by factory and scenario.
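Two parts of this pattern lend themselves to a short sketch: placing past and forecasted consumption in the same column (step 2), and the final stack and aggregation by factory and scenario. The field names (`observed_kwh`, `forecast_kwh`, `co2_kg`), the scenario names, and the sample values below are hypothetical, and the forecasting itself is out of scope here.

```python
from collections import defaultdict


def merge_history_and_forecast(rows):
    """Put observed and forecasted consumption in a single 'kwh' column."""
    merged = []
    for row in rows:
        observed = row.get("observed_kwh")
        # Use the observed value when it exists, else fall back to the forecast.
        kwh = observed if observed is not None else row["forecast_kwh"]
        merged.append({"site": row["site"], "date": row["date"], "kwh": kwh})
    return merged


def total_by_site_and_scenario(scenarios):
    """Stack all scenarios and sum emissions per (site, scenario) pair."""
    totals = defaultdict(float)
    for scenario, rows in scenarios.items():
        for row in rows:
            totals[(row["site"], scenario)] += row["co2_kg"]
    return dict(totals)


# Hypothetical branch output: one past day and one forecasted day.
branch = [
    {"site": "factory_fr", "date": "2021-01-04", "observed_kwh": 900.0, "forecast_kwh": None},
    {"site": "factory_fr", "date": "2021-01-05", "observed_kwh": None, "forecast_kwh": 950.0},
]
single_column = merge_history_and_forecast(branch)

# Hypothetical per-scenario emissions after the CO2 conversion step.
scenarios = {
    "plan_1": [
        {"site": "factory_fr", "co2_kg": 10.0},
        {"site": "factory_fr", "co2_kg": 20.0},
        {"site": "factory_de", "co2_kg": 50.0},
    ],
    "plan_2": [{"site": "factory_de", "co2_kg": 5.0}],
}
totals = total_by_site_and_scenario(scenarios)
```

The aggregated totals give one emissions figure per factory and scenario, which is the shape needed to compare production plans side by side.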

Reducing our carbon footprint: Past, present, and future emissions#

This Solution comes with five tabs in a pre-built dashboard meant to enable central energy managers to understand the global consumption and emissions of multiple sites by visualizing the resulting analytics of the Flow in a clear, interactive, and shareable manner.

  • Carbon intensity in Europe: Built by ElectricityMap, who enabled us to embed their interactive Carbon intensity map in the first tab of our dashboard. With this tab, you can confirm whether the location of your factory is covered by the ElectricityMap API.

  • Group consumptions and emissions: Includes several visualizations that show the total electricity consumption and CO2 emissions per site, as well as the impact of production in different countries. Additionally, an interactive sunburst chart can be used to understand the initial distribution of the meters in the 3 factories.

Dataiku screenshot of the tab in the dashboard containing visualizations showing the total electricity consumption and CO2 emissions per site.

  • Main consumers per factories: Drills down to the factory level to show the main consumers of each production site. A second interactive sunburst chart shows the updated distribution of the meters at each factory.

  • Machine learning evaluation: Shows the current quantity produced by each site, the result of the model, and the resulting prediction for each site.

  • Scenarios: Provides a summary of the 3 production plans, as well as a visualization comparing production in Germany vs. France.

Dataiku screenshot of the final tab of the dashboard with visualizations that summarize the three proposed production plans.

Reproducing these processes with minimal effort for your data#

The intent of this project is to show how you can use Dataiku to analyze global consumption and CO2 emissions in order to reduce the cumulative CO2 emissions of factories. By creating a single Solution that can benefit and influence the decisions of a variety of teams in an organization, you can develop and test smarter and more holistic production plans.

This documentation has provided several suggestions on how to derive value from this Solution. Ultimately, however, the “best” approach will depend on your specific needs and data. If you’re interested in adapting this project to the specific goals and needs of your organization, Dataiku offers roll-out and customization services on demand.