Using an Ensemble Forecast

Note

This example is an extension of Using Lookup Tables. It assumes prior knowledge of goal programming and the lookup tables components of RTC-Tools. If you are a first-time user of RTC-Tools, see Filling a Reservoir.

Then biggest change to RTC-Tools when using an ensemble is the structure of the directory. The folder <examples directory>\ensemble contains a complete RTC-Tools ensemble optimization problem. An RTC-Tools ensemble directory has the following structure:

  • model: This folder contains the Modelica model. The Modelica model contains the physics of the RTC-Tools model.

  • src: This folder contains a Python file. This file contains the configuration of the model and is used to run the model.

  • input: This folder contains the model input data pertaining to each ensemble member:

    • ensemble.csv: a file where the names and probabilities of the ensemble members are defined

    • forecast1

      • the file timeseries_import.csv

      • the file initial_state.csv

    • forecast2

      • timeseries_import.csv

      • initial_state.csv

  • output: The folder where the output is saved:

    • forecast1

      • timeseries_export.csv

    • forecast2

      • timeseries_export.csv

The Model

Note

This example uses the same hydraulic model as the basic example. For a detailed explanation of the hydraulic model, see Filling a Reservoir.

In OpenModelica Connection Editor, the model looks like this:

../../_images/simple_storage_openmodelica.png

In text mode, the Modelica model is as follows (with annotation statements removed):

 1model Example
 2  Deltares.ChannelFlow.SimpleRouting.BoundaryConditions.Inflow inflow;
 3  Deltares.ChannelFlow.SimpleRouting.Storage.Storage storage(V(nominal=4e5, min=2e5, max=6e5));
 4  Deltares.ChannelFlow.SimpleRouting.BoundaryConditions.Terminal outfall;
 5  input Modelica.SIunits.VolumeFlowRate Q_in(fixed = true);
 6  input Modelica.SIunits.VolumeFlowRate Q_release(fixed = false, min = 0.0, max = 6.0);
 7equation
 8  connect(inflow.QOut, storage.QIn);
 9  connect(storage.QOut, outfall.QIn);
10  storage.Q_release = Q_release;
11  inflow.Q = Q_in;
12end Example;

The Optimization Problem

The python script consists of the following blocks:

  • Import of packages

  • Declaration of Goals

  • Declaration of the optimization problem class

    • Constructor

    • Set csv_ensemble_mode = True

    • Declaration of a pre() method

    • Specification of Goals

    • Declaration of a priority_completed() method

    • Declaration of a post() method

    • Additional configuration of the solver

  • A run statement

Importing Packages

For this example, the import block is as follows:

 1import numpy as np
 2from rtctools.optimization.collocated_integrated_optimization_problem import (
 3    CollocatedIntegratedOptimizationProblem,
 4)
 5from rtctools.optimization.control_tree_mixin import ControlTreeMixin
 6from rtctools.optimization.csv_lookup_table_mixin import CSVLookupTableMixin
 7from rtctools.optimization.csv_mixin import CSVMixin
 8from rtctools.optimization.goal_programming_mixin import GoalProgrammingMixin, StateGoal
 9from rtctools.optimization.modelica_mixin import ModelicaMixin
10from rtctools.util import run_optimization_problem

Declaring Goals

First, we have a high priority goal to keep the water volume within a minimum and maximum.

13class WaterVolumeRangeGoal(StateGoal):
14    def __init__(self, optimization_problem):
15        # Assign V_min and V_max the the target range
16        self.target_min = optimization_problem.get_timeseries("V_min")
17        self.target_max = optimization_problem.get_timeseries("V_max")
18        super().__init__(optimization_problem)
19
20    state = "storage.V"
21    priority = 1

We also want to save energy, so we define a goal to minimize Q_release. This goal has a lower priority.

24class MinimizeQreleaseGoal(StateGoal):
25    # GoalProgrammingMixin will try to minimize the following state
26    state = "Q_release"
27    # The lower the number returned by this function, the higher the priority.
28    priority = 2
29    # The penalty variable is taken to the order'th power.
30    order = 1

Optimization Problem

Next, we construct the class by declaring it and inheriting the desired parent classes.

33class Example(
34    GoalProgrammingMixin,
35    CSVMixin,
36    CSVLookupTableMixin,
37    ModelicaMixin,
38    ControlTreeMixin,
39    CollocatedIntegratedOptimizationProblem,
40):

We turn on ensemble mode by setting csv_ensemble_mode = True:

49    def pre(self):
50        # Do the standard preprocessing

The method pre() is already defined in RTC-Tools, but we would like to add a line to it to create a variable for storing intermediate results. To do this, we declare a new pre() method, call super().pre() to ensure that the original method runs unmodified, and add in a variable declaration to store our list of intermediate results. This variable is a dict, reflecting the need to store results from multiple ensemble.

Because the timeseries we set will be the same for both ensemble members, we also make sure that the timeseries we set are set for both ensemble members using for loops.

49    def pre(self):
50        # Do the standard preprocessing
51        super().pre()
52
53        # Create a dict of empty lists for storing intermediate results from
54        # each ensemble
55        self.intermediate_results = {
56            ensemble_member: [] for ensemble_member in range(self.ensemble_size)
57        }
58
59        # Cache lookup tables for convenience and code legibility
60        _lookup_tables = self.lookup_tables(ensemble_member=0)
61        self.lookup_storage_V = _lookup_tables["storage_V"]
62
63        # Non-varying goals can be implemented as a timeseries
64        for e_m in range(self.ensemble_size):
65            self.set_timeseries("H_min", np.ones_like(self.times()) * 0.44, ensemble_member=e_m)
66            self.set_timeseries("H_max", np.ones_like(self.times()) * 0.46, ensemble_member=e_m)
67
68            # Q_in is a varying input and is defined in each timeseries_import.csv
69            # However, if we set it again here, it will be added to the output files
70            self.set_timeseries(
71                "Q_in", self.get_timeseries("Q_in", ensemble_member=e_m), ensemble_member=e_m
72            )
73
74            # Convert our water level goals into volume goals
75            self.set_timeseries(
76                "V_max", self.lookup_storage_V(self.get_timeseries("H_max")), ensemble_member=e_m
77            )
78            self.set_timeseries(
79                "V_min", self.lookup_storage_V(self.get_timeseries("H_min")), ensemble_member=e_m
80            )

Now we pass in the goals:

82    def path_goals(self):
83        g = []
84        g.append(WaterVolumeRangeGoal(self))
85        g.append(MinimizeQreleaseGoal(self))
86        return g

In order to better demonstrate the way that ensembles are handled in RTC- Tools, we modify the control_tree_options() method. The default setting allows the control tree to split at every time, but we override this method and force it to split at a single timestep. See ensemble-results at the bottom of the page for more information.

88    def control_tree_options(self):
89        # We want to modify the control tree options, so we override the default
90        # control_tree_options method. We call super() to get the default options
91        options = super().control_tree_options()
92        # Change the branching_times list to only contain the fifth timestep
93        options["branching_times"] = [self.times()[5]]
94        return options

We define the priority_completed() method. We ensure that it stores the results from both ensemble members.

 96    def priority_completed(self, priority):
 97        # We want to print some information about our goal programming problem.
 98        # We store the useful numbers temporarily, and print information at the
 99        # end of our run.
100        for e_m in range(self.ensemble_size):
101            results = self.extract_results(e_m)
102            self.set_timeseries("V_storage", results["storage.V"], ensemble_member=e_m)
103
104            _max = self.get_timeseries("V_max", ensemble_member=e_m).values
105            _min = self.get_timeseries("V_min", ensemble_member=e_m).values
106            V_storage = self.get_timeseries("V_storage", ensemble_member=e_m).values
107
108            # A little bit of tolerance when checking for acceptance. This
109            # tolerance must be set greater than the tolerance of the solver.
110            tol = 10
111            _max += tol
112            _min -= tol
113            n_level_satisfied = sum(np.logical_and(_min <= V_storage, V_storage <= _max))
114            q_release_integral = sum(results["Q_release"])
115            self.intermediate_results[e_m].append((priority, n_level_satisfied, q_release_integral))

We output our intermediate results using the post() method:

117    def post(self):
118        super().post()
119        for e_m in range(self.ensemble_size):
120            print("\n\nResults for Ensemble Member {}:".format(e_m))
121            for priority, n_level_satisfied, q_release_integral in self.intermediate_results[e_m]:
122                print("\nAfter finishing goals of priority {}:".format(priority))
123                print(
124                    "Level goal satisfied at {} of {} time steps".format(
125                        n_level_satisfied, len(self.times())
126                    )
127                )
128                print("Integral of Q_release = {:.2f}".format(q_release_integral))

Finally, we want to apply some additional configuration, reducing the amount of information the solver outputs:

131    def solver_options(self):
132        options = super().solver_options()
133        # When mumps_scaling is not zero, errors occur. RTC-Tools does its own
134        # scaling, so mumps scaling is not critical. Proprietary HSL solvers
135        # do not exhibit this error.
136        solver = options["solver"]
137        options[solver]["mumps_scaling"] = 0
138        options[solver]["print_level"] = 1
139        return options

Run the Optimization Problem

To make our script run, at the bottom of our file we just have to call the run_optimization_problem() method we imported on the optimization problem class we just created.

143run_optimization_problem(Example)

The Whole Script

All together, the whole example script is as follows:

  1import numpy as np
  2from rtctools.optimization.collocated_integrated_optimization_problem import (
  3    CollocatedIntegratedOptimizationProblem,
  4)
  5from rtctools.optimization.control_tree_mixin import ControlTreeMixin
  6from rtctools.optimization.csv_lookup_table_mixin import CSVLookupTableMixin
  7from rtctools.optimization.csv_mixin import CSVMixin
  8from rtctools.optimization.goal_programming_mixin import GoalProgrammingMixin, StateGoal
  9from rtctools.optimization.modelica_mixin import ModelicaMixin
 10from rtctools.util import run_optimization_problem
 11
 12
 13class WaterVolumeRangeGoal(StateGoal):
 14    def __init__(self, optimization_problem):
 15        # Assign V_min and V_max the the target range
 16        self.target_min = optimization_problem.get_timeseries("V_min")
 17        self.target_max = optimization_problem.get_timeseries("V_max")
 18        super().__init__(optimization_problem)
 19
 20    state = "storage.V"
 21    priority = 1
 22
 23
 24class MinimizeQreleaseGoal(StateGoal):
 25    # GoalProgrammingMixin will try to minimize the following state
 26    state = "Q_release"
 27    # The lower the number returned by this function, the higher the priority.
 28    priority = 2
 29    # The penalty variable is taken to the order'th power.
 30    order = 1
 31
 32
 33class Example(
 34    GoalProgrammingMixin,
 35    CSVMixin,
 36    CSVLookupTableMixin,
 37    ModelicaMixin,
 38    ControlTreeMixin,
 39    CollocatedIntegratedOptimizationProblem,
 40):
 41    """
 42    An extention of the goal programming and lookuptable examples that
 43    demonstrates how to work with ensembles.
 44    """
 45
 46    # Overide default csv_ensemble_mode = False from CSVMixin before calling pre()
 47    csv_ensemble_mode = True
 48
 49    def pre(self):
 50        # Do the standard preprocessing
 51        super().pre()
 52
 53        # Create a dict of empty lists for storing intermediate results from
 54        # each ensemble
 55        self.intermediate_results = {
 56            ensemble_member: [] for ensemble_member in range(self.ensemble_size)
 57        }
 58
 59        # Cache lookup tables for convenience and code legibility
 60        _lookup_tables = self.lookup_tables(ensemble_member=0)
 61        self.lookup_storage_V = _lookup_tables["storage_V"]
 62
 63        # Non-varying goals can be implemented as a timeseries
 64        for e_m in range(self.ensemble_size):
 65            self.set_timeseries("H_min", np.ones_like(self.times()) * 0.44, ensemble_member=e_m)
 66            self.set_timeseries("H_max", np.ones_like(self.times()) * 0.46, ensemble_member=e_m)
 67
 68            # Q_in is a varying input and is defined in each timeseries_import.csv
 69            # However, if we set it again here, it will be added to the output files
 70            self.set_timeseries(
 71                "Q_in", self.get_timeseries("Q_in", ensemble_member=e_m), ensemble_member=e_m
 72            )
 73
 74            # Convert our water level goals into volume goals
 75            self.set_timeseries(
 76                "V_max", self.lookup_storage_V(self.get_timeseries("H_max")), ensemble_member=e_m
 77            )
 78            self.set_timeseries(
 79                "V_min", self.lookup_storage_V(self.get_timeseries("H_min")), ensemble_member=e_m
 80            )
 81
 82    def path_goals(self):
 83        g = []
 84        g.append(WaterVolumeRangeGoal(self))
 85        g.append(MinimizeQreleaseGoal(self))
 86        return g
 87
 88    def control_tree_options(self):
 89        # We want to modify the control tree options, so we override the default
 90        # control_tree_options method. We call super() to get the default options
 91        options = super().control_tree_options()
 92        # Change the branching_times list to only contain the fifth timestep
 93        options["branching_times"] = [self.times()[5]]
 94        return options
 95
 96    def priority_completed(self, priority):
 97        # We want to print some information about our goal programming problem.
 98        # We store the useful numbers temporarily, and print information at the
 99        # end of our run.
100        for e_m in range(self.ensemble_size):
101            results = self.extract_results(e_m)
102            self.set_timeseries("V_storage", results["storage.V"], ensemble_member=e_m)
103
104            _max = self.get_timeseries("V_max", ensemble_member=e_m).values
105            _min = self.get_timeseries("V_min", ensemble_member=e_m).values
106            V_storage = self.get_timeseries("V_storage", ensemble_member=e_m).values
107
108            # A little bit of tolerance when checking for acceptance. This
109            # tolerance must be set greater than the tolerance of the solver.
110            tol = 10
111            _max += tol
112            _min -= tol
113            n_level_satisfied = sum(np.logical_and(_min <= V_storage, V_storage <= _max))
114            q_release_integral = sum(results["Q_release"])
115            self.intermediate_results[e_m].append((priority, n_level_satisfied, q_release_integral))
116
117    def post(self):
118        super().post()
119        for e_m in range(self.ensemble_size):
120            print("\n\nResults for Ensemble Member {}:".format(e_m))
121            for priority, n_level_satisfied, q_release_integral in self.intermediate_results[e_m]:
122                print("\nAfter finishing goals of priority {}:".format(priority))
123                print(
124                    "Level goal satisfied at {} of {} time steps".format(
125                        n_level_satisfied, len(self.times())
126                    )
127                )
128                print("Integral of Q_release = {:.2f}".format(q_release_integral))
129
130    # Any solver options can be set here
131    def solver_options(self):
132        options = super().solver_options()
133        # When mumps_scaling is not zero, errors occur. RTC-Tools does its own
134        # scaling, so mumps scaling is not critical. Proprietary HSL solvers
135        # do not exhibit this error.
136        solver = options["solver"]
137        options[solver]["mumps_scaling"] = 0
138        options[solver]["print_level"] = 1
139        return options
140
141
142# Run
143run_optimization_problem(Example)

Running the Optimization Problem

Following the execution of the optimization problem, the post() method should print out the following lines:

Results for Ensemble Member 0:

After finishing goals of priority 1:
Level goal satisfied at 10 of 12 time steps
Integral of Q_release = 17.34

After finishing goals of priority 2:
Level goal satisfied at 9 of 12 time steps
Integral of Q_release = 17.12


Results for Ensemble Member 1:

After finishing goals of priority 1:
Level goal satisfied at 10 of 12 time steps
Integral of Q_release = 20.82

After finishing goals of priority 2:
Level goal satisfied at 9 of 12 time steps
Integral of Q_release = 20.60

This is the same output as the output for Mixed Integer Optimization: Pumps and Orifices, except now the output for each ensemble is printed.

Extracting Results

The results from the run are found in output/forecast1/timeseries_export.csv and output/forecast2/timeseries_export.csv. Any CSV-reading software can import it, but this is how results can be plotted using the python library matplotlib:

from datetime import datetime

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
from pylab import get_cmap

forecast_names = ["forecast1", "forecast2"]
dir_template = "../../../examples/ensemble/reference_output/{}/timeseries_export.csv"

# Import Data
forcasts = {}
for forecast in forecast_names:
    data_path = dir_template.format(forecast)
    forcasts[forecast] = np.recfromcsv(data_path, encoding=None)

# Get times as datetime objects
times = [datetime.strptime(x, "%Y-%m-%d %H:%M:%S") for x in forcasts[forecast_names[0]]["time"]]

n_subplots = 3
fig, axarr = plt.subplots(n_subplots, sharex=True, figsize=(8, 4 * n_subplots))
axarr[0].set_title("Water Volume and Discharge")
cmaps = ["Blues", "Greens"]
shades = [0.5, 0.8]

# Upper Subplot
axarr[0].set_ylabel("Water Volume in Storage [m³]")
axarr[0].ticklabel_format(style="sci", axis="y", scilimits=(0, 0))

# Lower Subplots
axarr[1].set_ylabel("Flow Rate [m³/s]")
axarr[2].set_ylabel("Flow Rate [m³/s]")

# Plot Ensemble Members
for idx, forecast in enumerate(forecast_names):
    # Upper Subplot
    results = forcasts[forecast]
    if idx == 0:
        axarr[0].plot(times, results["v_max"], label="Max", linewidth=2, color="r", linestyle="--")
        axarr[0].plot(times, results["v_min"], label="Min", linewidth=2, color="g", linestyle="--")
    axarr[0].plot(
        times,
        results["v_storage"],
        label=forecast + ":Volume",
        linewidth=2,
        color=get_cmap(cmaps[idx])(shades[1]),
    )

    # middle subplot
    # put dots to see at which times the decision variables are defined:
    axarr[1].scatter(times, results["q_in"], linewidth=1, color=get_cmap(cmaps[idx])(shades[0]))
    axarr[1].scatter(
        times, results["q_release"], linewidth=1, color=get_cmap(cmaps[idx])(shades[1])
    )

    # add horizontal lines to the left of these dots, to indicate that the value is attained over
    # an entire timestep:
    axarr[1].step(
        times,
        results["q_in"],
        linewidth=2,
        where="pre",
        label="{}:Inflow".format(forecast),
        color=get_cmap(cmaps[idx])(shades[0]),
    )
    axarr[1].step(
        times,
        results["q_release"],
        linewidth=1,
        where="pre",
        label="{}:Release".format(forecast),
        color=get_cmap(cmaps[idx])(shades[1]),
    )

    # Lower Subplot
    axarr[2].plot(
        times,
        results["q_in"],
        label="{}:Inflow".format(forecast),
        linewidth=2,
        color=get_cmap(cmaps[idx])(shades[0]),
    )
    axarr[2].plot(
        times,
        results["q_release"],
        label="{}:Release".format(forecast),
        linewidth=2,
        color=get_cmap(cmaps[idx])(shades[1]),
    )


# Format bottom axis labels
axarr[-1].xaxis.set_major_formatter(mdates.DateFormatter("%m/%d"))

# Shrink margins
fig.tight_layout()

# Shrink each axis and put a legend to the right of the axis
for i in range(len(axarr)):
    box = axarr[i].get_position()
    axarr[i].set_position([box.x0, box.y0, box.width * 0.75, box.height])
    axarr[i].legend(loc="center left", bbox_to_anchor=(1, 0.5), frameon=False)

plt.autoscale(enable=True, axis="x", tight=True)

# Output Plot
plt.show()

(Source code, svg, png)

../../_images/ensemble_results.svg

This plots the inflow and release values in two different ways: the middle graph depicts them in the same way you have seen until now (and this is how they truly should be interpreted). The lower plot shows the same points, but connected with line segments. This makes it easier to see where the flows diverge.

Observations

Note that in the results plotted above, the control tree follows a single path and does not branch until it arrives at the first branching time. Up until the branching point, RTC-Tools is making decisions that enhance the flexibility of the system so that it can respond as ideally as possible to whichever future emerges. In the case of two forecasts, this means taking a path that falls between the two possible futures. This will cause the water level to diverge from the ideal levels as time progresses. While this appears to be suboptimal, it is preferable to simply gambling on one of the forecasts coming true and ignoring the other. Once the branching time is reached, RTC-Tools is allowed to optimize for each individual branch separately. Immediately, RTC-Tools applies the corrective control needed to get the water levels into the acceptable range. If the operator simply picks a forecast to use and guesses wrong, the corrective control will have to be much more drastic and potentially catastrophic.