Solution | Factories Electricity & CO2 Emissions Forecasting#
Overview#
Business Case#
Demand for industrial products has risen considerably over the past two decades, along with energy consumption and CO2 emissions. Industrial companies are now racing to reduce their CO2 emissions in order to adapt to shifting market demand, manage their environmental risks and obligations, and comply with both financial market expectations and future regulatory requirements.
With this solution, companies can quickly create a unified and interactive view of their energy consumption across manufacturing sites and meters, and easily convert it to CO2 emissions using real-time carbon intensity data provided by electricityMap or RTE. Production planners can forecast the electricity consumption and CO2 emissions linked to their production plans and adjust their geographical allocation choices to optimize their footprint.
Installation#
The process to install this solution differs depending on whether you are using Dataiku Cloud or a self-managed instance.
Dataiku Cloud users should follow the instructions for installing solutions on cloud.
The Cloud Launchpad will automatically meet the technical requirements listed below, and add the Solution to your Dataiku instance.
Once the Solution has been added to your space, move ahead to Data Requirements.
After meeting the technical requirements below, self-managed users can install the Solution in one of two ways:
On your Dataiku instance connected to the internet, click + New Project > Dataiku Solutions > Search for Factories Electricity & CO2 Emissions Forecasting.
Alternatively, download the Solution’s .zip project file, and import it to your Dataiku instance as a new project.
Technical Requirements#
To leverage this solution, you must meet the following requirements:
Have access to a Dataiku 12.1+ instance.
A Python 3.6+ code environment named solution_factories-electricity-co2-forecasting with the following required packages:
nbformat>=4.2.0
plotly-express>=0.4.1
MarkupSafe<2.1.0
cloudpickle>=1.3,<1.6
flask>=1.0,<1.1
itsdangerous<2.1.0
Jinja2>=2.11,<2.12
lightgbm>=3.2,<3.3
scikit-learn>=0.20,<0.21
scikit-optimize>=0.7,<0.8
scipy>=1.2,<1.3
statsmodels==0.12.2
xgboost==0.82
gluonts==0.10.4
pmdarima==1.2.1
mxnet==1.8.0.post0
To convert electricity consumption to CO2 emissions for locations outside France, you can use the electricityMap API. You can get a token by completing the ElectricityMap Contact Form.
Data Requirements#
The solution depends on several input data sources:
Daily electricity consumption reports (each corresponding to a different fictional factory)
A dataset of site addresses with the postal address of all three fictional factories
A dataset containing the production history of the three factories
Two datasets to show the meter distribution in the factories (before and after data preparation steps)
A dataset representing three proposed production scenarios (unlike the other datasets, this one is an input to the 3 scenarios forecast Flow zone)
Note
This project is meant to be used as a template to guide development of your own analysis in Dataiku. The results of the model should not be used as actionable insights and some of the data provided with the project may not be representative of actual data in a real-life project.
Workflow Overview#
You can follow along with the solution in the Dataiku gallery.
The project has the following high level steps:
Ingest and prepare our data.
Convert electricity consumption to CO2 emissions.
Compute the cost of electricity per factory.
Train a time series model to be used in forecasting energy consumption and CO2 emissions.
Forecast the CO2 emissions of 3 different production plans.
Understand our factories’ CO2 emissions for the past, present, and future with Dashboard visualizations.
Walkthrough#
Note
In addition to reading this document, we recommend reading the project wiki before beginning, to gain a deeper technical understanding of how this Solution was created and to find more detailed explanations of Solution-specific vocabulary.
Ingesting and Preparing our Data#
The first two Flow zones of our project are fairly straightforward. We begin in the Data Ingestion Flow zone by bringing in the initial datasets detailed in the Data Requirements section above.
Now that we have access to all of our input datasets, we can take the 3 electricity consumption reports, each corresponding to a fictional factory, into the Electricity Consumption preparation Flow zone. We first stack the data from all 3 factories into a single AllSites_DailyConsumption dataset, prepare the dates, extract the relevant date components, and filter out rows where electricity consumption is negative.
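The stacking and cleaning described above can be sketched in plain Python. This is only an illustration of the logic, not the visual recipes used in the Flow: the factory names, column names (date, meter, consumption_kwh), and sample values are all assumptions, as is the choice to drop negative readings rather than flag them.

```python
from datetime import datetime

# Hypothetical daily reports, one list of rows per factory.
# Factory names, column names, and values are assumptions for illustration.
reports = {
    "factory_a": [
        {"date": "2021-03-01", "meter": "main_a", "consumption_kwh": 1200.0},
        {"date": "2021-03-02", "meter": "main_a", "consumption_kwh": -5.0},  # faulty reading
    ],
    "factory_b": [
        {"date": "2021-03-01", "meter": "process_main", "consumption_kwh": 950.0},
    ],
    "factory_c": [
        {"date": "2021-03-01", "meter": "factory_main", "consumption_kwh": 1430.5},
    ],
}


def stack_and_prepare(reports):
    """Stack per-factory rows, parse dates, add date components, drop negative readings."""
    stacked = []
    for site, rows in reports.items():
        for row in rows:
            if row["consumption_kwh"] < 0:
                continue  # assumption: negative consumption is invalid and is removed
            day = datetime.strptime(row["date"], "%Y-%m-%d").date()
            stacked.append({
                "site": site,
                "meter": row["meter"],
                "date": day,
                "year": day.year,
                "month": day.month,
                "day_of_week": day.isoweekday(),  # 1 = Monday ... 7 = Sunday
                "consumption_kwh": row["consumption_kwh"],
            })
    return stacked


all_sites_daily = stack_and_prepare(reports)
```

The date format string is the part most likely to change when you adapt this to your own reports.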
When adapting this project for your own use, the date preparation and component extraction steps will need to be updated to match your own data. In this Flow, we then create 3 sub-branches to showcase how to create a ‘soft sensor’ meter that aggregates consumption from different meters. All data is then restacked into a finalized dataset containing electricity consumption values for all 3 factories.
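A ‘soft sensor’ can be thought of as a virtual meter computed from physical ones. The sketch below assumes the aggregation is a simple sum per site and date; the meter names oven_1 and oven_2, the virtual meter name ovens_total, and the column names are hypothetical and do not come from the project.

```python
from collections import defaultdict


def soft_sensor(rows, source_meters, name):
    """Sum the readings of `source_meters` per (site, date) into one virtual meter."""
    totals = defaultdict(float)
    for row in rows:
        if row["meter"] in source_meters:
            totals[(row["site"], row["date"])] += row["consumption_kwh"]
    return [
        {"site": site, "date": date, "meter": name, "consumption_kwh": total}
        for (site, date), total in sorted(totals.items())
    ]


# Hypothetical sub-meter readings for one factory.
meter_rows = [
    {"site": "factory_a", "date": "2021-03-01", "meter": "oven_1", "consumption_kwh": 120.0},
    {"site": "factory_a", "date": "2021-03-01", "meter": "oven_2", "consumption_kwh": 180.0},
    {"site": "factory_a", "date": "2021-03-01", "meter": "lighting", "consumption_kwh": 40.0},
]

virtual_rows = soft_sensor(meter_rows, {"oven_1", "oven_2"}, name="ovens_total")

# Restack: physical meters plus the new virtual meter in one dataset.
restacked = meter_rows + virtual_rows
```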
The Costs of Electricity Consumption: CO2 Emissions#
Now that we have a dataset with electricity consumption for all factories (AllSites_DailyConsumptionRestac), we can convert that consumption into the resulting CO2 emissions. Computing CO2 emissions, however, also requires knowing where the electricity was consumed. Therefore, we begin in the CO2 conversion zone by passing the sites_addresses input dataset to the Geocoder Plugin to retrieve the latitude/longitude coordinates of our 3 factories. We then append this location data to the dataset containing our factories’ electricity consumption and, finally, use the CO2 Converter Plugin to convert electricity consumption to CO2. The resulting dataset, AllSites_DailyConsumption_CO2, contains 3 new columns:
co2_date_time
carbon_intensity (gCO₂eq/kWh)
co2_emission (kgCO₂eq)
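The units of these columns suggest how the last two relate: consumption in kWh multiplied by carbon intensity in gCO₂eq/kWh gives grams, which are divided by 1,000 to obtain kgCO₂eq. The sketch below illustrates that arithmetic only. It assumes consumption is expressed in kWh, it is not the plugin's actual code, and the function name and sample carbon intensity are made up for the example.

```python
def co2_emission_kg(consumption_kwh, carbon_intensity_g_per_kwh):
    """Convert electricity consumption to emissions in kgCO2eq.

    kWh * (gCO2eq / kWh) = gCO2eq, then / 1000 = kgCO2eq.
    """
    return consumption_kwh * carbon_intensity_g_per_kwh / 1000.0


# Illustrative values only: 1,200 kWh consumed on a grid at 60 gCO2eq/kWh.
emission = co2_emission_kg(1200.0, 60.0)  # 72.0 kgCO2eq
```

Because carbon intensity varies by location and time, the same consumption can yield very different emissions from one site to another, which is why the coordinates are needed before conversion.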
Smart Production Planning: Forecasting the Future#
In the Time series forecasting Flow zone, we train a time series forecasting model using the visual ML. Before training, however, we first filter the electricity consumption dataset to keep only the main meters of the 3 factories: process_main, main_a, and factory_main. While the prediction in this project is made at the site level, it could also be made at the meter or workshop level.
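As a rough illustration of this filter, the snippet below keeps only rows whose meter is one of the three main meters named above. The row structure, the factory names, and the oven_1 sub-meter are assumptions for the example.

```python
# Main meter names taken from the text above.
MAIN_METERS = {"process_main", "main_a", "factory_main"}


def keep_main_meters(rows, main_meters=MAIN_METERS):
    """Keep only rows measured by a main (site-level) meter."""
    return [row for row in rows if row["meter"] in main_meters]


rows = [
    {"site": "factory_a", "meter": "main_a", "consumption_kwh": 1200.0},
    {"site": "factory_a", "meter": "oven_1", "consumption_kwh": 300.0},  # hypothetical sub-meter
    {"site": "factory_b", "meter": "process_main", "consumption_kwh": 950.0},
]

site_level = keep_main_meters(rows)
```

To forecast at the meter or workshop level instead, you would change the set of meter names rather than the rest of the pipeline.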
After filtering, we bring in the Weekly_production_history dataset to be used as an external feature for forecasting, so that the model can better decompose the input signal. We also use the Time Series Preparation Plugin to interpolate the data, aligning this external feature on a fixed sampling rate. With our data sufficiently prepared, we are ready to train a forecasting model.
In this solution, we use the model to forecast the daily electricity consumption, and the resulting CO2 emissions, of three different production plans. This makes it possible to choose the production plan that results in the lowest CO2 emissions. The 3 scenarios forecast Flow zone looks complex at first but is actually a repeating pattern. To begin, we upload a dataset containing three possible production scenarios, which is then resampled and split by scenario into three branches. Each branch applies the same four steps:
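To make the alignment step concrete, here is a minimal sketch of resampling weekly production figures onto a daily grid. It assumes linear interpolation between consecutive weekly points; the method and settings actually used by the plugin in this project may differ, and the function name and sample values are hypothetical.

```python
from datetime import date, timedelta


def interpolate_daily(weekly):
    """Linearly interpolate (date, value) points, sorted by date, onto a daily grid."""
    if not weekly:
        return []
    daily = []
    for (d0, v0), (d1, v1) in zip(weekly, weekly[1:]):
        span = (d1 - d0).days
        for offset in range(span):
            # Value on the straight line between the two surrounding weekly points.
            daily.append((d0 + timedelta(days=offset), v0 + (v1 - v0) * offset / span))
    daily.append(weekly[-1])  # keep the final weekly point as-is
    return daily


# Illustrative weekly production quantities (units are arbitrary).
weekly_production = [
    (date(2021, 3, 1), 700.0),
    (date(2021, 3, 8), 1400.0),
]

daily_production = interpolate_daily(weekly_production)
```

The result has one value per day, so it can be joined to the daily consumption series as an external feature.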
Use the trained model to forecast future consumption values.
Prepare the forecasted data to have all electricity consumption (past and future) in the same column.
Enrich the prepared data with coordinates of the factories.
Convert electricity consumption into CO2 emissions using the plugin.
After completion of these four steps in parallel, the three scenarios are stacked into a single dataset and aggregated by factory and scenario.
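The final stack-and-aggregate step can be sketched as follows. The sketch assumes each branch outputs rows that carry a site, a scenario label, and the forecast consumption and emissions, and that the aggregation is a sum. The column names, scenario labels, and figures are illustrative and are not taken from the project.

```python
from collections import defaultdict


def aggregate_scenarios(scenario_rows):
    """Sum consumption and emissions per (site, scenario)."""
    totals = defaultdict(lambda: {"consumption_kwh": 0.0, "co2_emission_kg": 0.0})
    for row in scenario_rows:
        key = (row["site"], row["scenario"])
        totals[key]["consumption_kwh"] += row["consumption_kwh"]
        totals[key]["co2_emission_kg"] += row["co2_emission_kg"]
    return dict(totals)


# Hypothetical outputs of the three parallel branches.
scenario_1 = [
    {"site": "factory_a", "scenario": "scenario_1", "consumption_kwh": 1000.0, "co2_emission_kg": 60.0},
    {"site": "factory_a", "scenario": "scenario_1", "consumption_kwh": 1100.0, "co2_emission_kg": 66.0},
    {"site": "factory_b", "scenario": "scenario_1", "consumption_kwh": 500.0, "co2_emission_kg": 200.0},
]
scenario_2 = [
    {"site": "factory_a", "scenario": "scenario_2", "consumption_kwh": 400.0, "co2_emission_kg": 24.0},
    {"site": "factory_b", "scenario": "scenario_2", "consumption_kwh": 1200.0, "co2_emission_kg": 480.0},
]
scenario_3 = [
    {"site": "factory_b", "scenario": "scenario_3", "consumption_kwh": 800.0, "co2_emission_kg": 320.0},
]

# Stack the three branches, then aggregate by factory and scenario.
stacked = scenario_1 + scenario_2 + scenario_3
summary = aggregate_scenarios(stacked)
```

Comparing the totals for the same site across scenarios is what lets a planner see which allocation produces the smallest footprint.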
Reducing Our Carbon Footprint: Past, Present, and Future Emissions#
This solution comes with a pre-built dashboard of five tabs, designed to help central energy managers understand the global consumption and emissions of multiple sites by presenting the Flow's analytics in a clear, interactive, and shareable manner.
| Tab | Description |
|---|---|
| Carbon intensity in Europe | Was built by ElectricityMap, who enabled us to embed their interactive carbon intensity map in the first tab of our dashboard. With this tab, you can confirm whether the location of your factory is covered by the ElectricityMap API. |
| Group consumptions and emissions | Includes several visualizations that show the total electricity consumption and CO2 emissions per site, as well as the impact of production in different countries. Additionally, an interactive sunburst chart can be used to understand the initial distribution of the meters in the 3 factories. |
| Main consumers per factories | Drills down to the factory level to show the main consumers of each production site. A second interactive sunburst chart shows the updated distribution of the meters at each factory. |
| Machine learning evaluation | Shows the current quantity produced by each site, the result of the model, and the resulting prediction for each site. |
| Scenarios | Can be used to view a summary of the 3 production plans, as well as a visualization comparing production in Germany vs. France. |
Reproducing these Processes With Minimal Effort For Your Own Data#
The intent of this project is to show how Dataiku can be used to analyze global consumption and CO2 emissions in order to reduce the cumulative CO2 emissions of factories. By creating a single solution that can benefit and influence the decisions of a variety of teams in an organization, smarter and more holistic production plans can be developed and tested.
We’ve provided several suggestions on how to use electricity consumption and pricing data to compute and predict CO2 emissions. If you’re interested in adapting this project to the specific goals and needs of your organization, roll-out and customization services can be offered on demand.