Aggregation
Speed up large problems with time series aggregation techniques.
This notebook introduces:
- Resampling: Reduce time resolution (e.g., hourly → 4-hourly)
- Clustering: Identify typical periods (e.g., 8 representative days); see Further Reading for the entry point
- Two-stage optimization: Size with reduced data, dispatch at full resolution
- Speed vs. accuracy trade-offs: When to use each technique
Setup
In [1]:
import timeit
import pandas as pd
import plotly.express as px
import xarray as xr
import flixopt as fx
fx.CONFIG.notebook()
Out[1]:
flixopt.config.CONFIG
Load Time Series Data
We use real-world district heating data at 15-minute resolution (one month):
In [2]:
# Load time series data (15-min resolution)
data = pd.read_csv('data/Zeitreihen2020.csv', index_col=0, parse_dates=True).sort_index()
data = data['2020-01-01':'2020-01-31 23:45:00'] # One month
data.index.name = 'time' # Rename index for consistency
timesteps = data.index
# Extract profiles
electricity_demand = data['P_Netz/MW'].to_numpy()
heat_demand = data['Q_Netz/MW'].to_numpy()
electricity_price = data['Strompr.€/MWh'].to_numpy()
gas_price = data['Gaspr.€/MWh'].to_numpy()
print(f'Timesteps: {len(timesteps)} ({len(timesteps) / 96:.0f} days at 15-min resolution)')
print(f'Heat demand: {heat_demand.min():.1f} - {heat_demand.max():.1f} MW')
print(f'Electricity price: {electricity_price.min():.1f} - {electricity_price.max():.1f} €/MWh')
Timesteps: 2976 (31 days at 15-min resolution)
Heat demand: 122.2 - 266.2 MW
Electricity price: -3.3 - 72.6 €/MWh
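The timestep count can be cross-checked against a gap-free 15-minute index. The sketch below uses only pandas and rebuilds the index one would expect for January 2020; the names `expected`, `steps_per_day`, and `n_days` are illustrative and not part of the notebook. Comparing `expected` with `data.index` (for example with `expected.difference(data.index)`) would reveal missing timestamps.

```python
import pandas as pd

# Expected gap-free index for January 2020 at 15-minute resolution
expected = pd.date_range('2020-01-01 00:00', '2020-01-31 23:45', freq='15min')

steps_per_day = 24 * 4  # four 15-minute steps per hour
n_days = len(expected) / steps_per_day

print(f'{len(expected)} timesteps = {n_days:.0f} days')
```

If the loaded data has the same length as `expected`, the division by 96 in the cell above is a valid way to count days.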
In [3]:
# Visualize first week
profiles = xr.Dataset(
    {
        'Heat Demand [MW]': xr.DataArray(heat_demand[:672], dims=['time'], coords={'time': timesteps[:672]}),
        'Electricity Price [€/MWh]': xr.DataArray(
            electricity_price[:672], dims=['time'], coords={'time': timesteps[:672]}
        ),
    }
)
df = profiles.to_dataframe().reset_index().melt(id_vars='time', var_name='variable', value_name='value')
fig = px.line(df, x='time', y='value', facet_col='variable', height=300)
fig.update_yaxes(matches=None, showticklabels=True)
fig.for_each_annotation(lambda a: a.update(text=a.text.split('=')[-1]))
fig
Build the Base FlowSystem
A typical district heating system with investment decisions:
In [4]:
def build_system(timesteps, heat_demand, electricity_demand, electricity_price, gas_price):
    """Build a district heating system with CHP, boiler, and storage (with investment options)."""
    fs = fx.FlowSystem(timesteps)
    fs.add_elements(
        # Buses
        fx.Bus('Electricity'),
        fx.Bus('Heat'),
        fx.Bus('Gas'),
        fx.Bus('Coal'),
        # Effects
        fx.Effect('costs', '€', 'Total Costs', is_standard=True, is_objective=True),
        fx.Effect('CO2', 'kg', 'CO2 Emissions'),
        # CHP with investment optimization
        fx.linear_converters.CHP(
            'CHP',
            thermal_efficiency=0.58,
            electrical_efficiency=0.22,
            electrical_flow=fx.Flow('P_el', bus='Electricity', size=200),
            thermal_flow=fx.Flow(
                'Q_th',
                bus='Heat',
                size=fx.InvestParameters(
                    minimum_size=100,
                    maximum_size=300,
                    effects_of_investment_per_size={'costs': 10},
                ),
            ),
            fuel_flow=fx.Flow('Q_fu', bus='Coal'),
        ),
        # Gas Boiler with investment optimization
        fx.linear_converters.Boiler(
            'Boiler',
            thermal_efficiency=0.85,
            thermal_flow=fx.Flow(
                'Q_th',
                bus='Heat',
                size=fx.InvestParameters(
                    minimum_size=0,
                    maximum_size=150,
                    effects_of_investment_per_size={'costs': 5},
                ),
            ),
            fuel_flow=fx.Flow('Q_fu', bus='Gas'),
        ),
        # Thermal Storage with investment optimization
        fx.Storage(
            'Storage',
            capacity_in_flow_hours=fx.InvestParameters(
                minimum_size=0,
                maximum_size=1000,
                effects_of_investment_per_size={'costs': 0.5},
            ),
            initial_charge_state=0,
            eta_charge=1,
            eta_discharge=1,
            relative_loss_per_hour=0.001,
            charging=fx.Flow('Charge', size=137, bus='Heat'),
            discharging=fx.Flow('Discharge', size=158, bus='Heat'),
        ),
        # Fuel sources
        fx.Source(
            'GasGrid',
            outputs=[fx.Flow('Q_Gas', bus='Gas', size=1000, effects_per_flow_hour={'costs': gas_price, 'CO2': 0.3})],
        ),
        fx.Source(
            'CoalSupply',
            outputs=[fx.Flow('Q_Coal', bus='Coal', size=1000, effects_per_flow_hour={'costs': 4.6, 'CO2': 0.3})],
        ),
        # Electricity grid connection
        fx.Source(
            'GridBuy',
            outputs=[
                fx.Flow(
                    'P_el',
                    bus='Electricity',
                    size=1000,
                    effects_per_flow_hour={'costs': electricity_price + 0.5, 'CO2': 0.3},
                )
            ],
        ),
        fx.Sink(
            'GridSell',
            inputs=[fx.Flow('P_el', bus='Electricity', size=1000, effects_per_flow_hour=-(electricity_price - 0.5))],
        ),
        # Demands
        fx.Sink('HeatDemand', inputs=[fx.Flow('Q_th', bus='Heat', size=1, fixed_relative_profile=heat_demand)]),
        fx.Sink(
            'ElecDemand', inputs=[fx.Flow('P_el', bus='Electricity', size=1, fixed_relative_profile=electricity_demand)]
        ),
    )
    return fs
flow_system = build_system(timesteps, heat_demand, electricity_demand, electricity_price, gas_price)
print(f'System: {len(timesteps)} timesteps')
System: 2976 timesteps
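To get a feel for the parameters in `build_system`, the following back-of-the-envelope sketch works through the CHP and storage numbers. It assumes the usual linear relations (heat output = thermal efficiency × fuel input, electricity output = electrical efficiency × fuel input) and a storage charge state that decays by a fixed fraction per hour. These relations are assumptions made for illustration, not a statement of how flixopt formulates the components internally.

```python
# Assumed linear CHP model: Q_th = eta_th * Q_fu and P_el = eta_el * Q_fu
eta_th = 0.58  # thermal efficiency used in build_system
eta_el = 0.22  # electrical efficiency used in build_system

q_th = 300.0          # MW, upper investment bound of the CHP heat output
q_fu = q_th / eta_th  # MW, fuel input needed for that heat output
p_el = eta_el * q_fu  # MW, electricity co-produced at the same operating point

print(f'Fuel input: {q_fu:.1f} MW, electricity: {p_el:.1f} MW')

# Assumed storage loss model: the charge state shrinks by (1 - loss) each hour
loss_per_hour = 0.001
remaining_after_day = (1 - loss_per_hour) ** 24
print(f'Charge remaining after 24 h idle: {remaining_after_day:.1%}')
```

Under these assumptions the co-produced electricity at the maximum heat output stays below the fixed `size=200` of the electrical flow, so the heat-side investment bound is the limiting one.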
Technique 1: Resampling
Reduce time resolution to speed up optimization:
In [5]:
solver = fx.solvers.HighsSolver(mip_gap=0.01)
# Resample from 15min to 4h resolution
fs_resampled = flow_system.transform.resample('4h')
reduction = (1 - len(fs_resampled.timesteps) / len(flow_system.timesteps)) * 100
print(f'Resampled: {len(flow_system.timesteps)} → {len(fs_resampled.timesteps)} timesteps ({reduction:.0f}% reduction)')
Resampled: 2976 → 186 timesteps (94% reduction)
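The reduction follows directly from the ratio of the two step lengths: 4 h contain sixteen 15-minute steps, so 2976 / 16 = 186. The sketch below reproduces that arithmetic for several target resolutions with plain pandas on a synthetic series. It only illustrates the timestep counts and does not call flixopt; `profile` and `counts` are illustrative names.

```python
import pandas as pd

# Synthetic 15-minute series with the same length as the notebook data
index = pd.date_range('2020-01-01', periods=2976, freq='15min')
profile = pd.Series(range(len(index)), index=index, dtype=float)

counts = {}
for freq in ['1h', '2h', '4h', '1D']:
    n = len(profile.resample(freq).mean())  # one value per resampled interval
    counts[freq] = n
    reduction = (1 - n / len(profile)) * 100
    print(f'{freq:>3}: {n:4d} timesteps ({reduction:.0f}% reduction)')
```

Coarser resolutions shrink the model further, at the cost of averaging away short-lived peaks and price spikes.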
In [6]:
# Optimize resampled system
start = timeit.default_timer()
fs_resampled.optimize(solver)
time_resampled = timeit.default_timer() - start
print(f'Resampled: {time_resampled:.1f}s, {fs_resampled.solution["costs"].item():,.0f} €')
Resampled: 2.2s, 2,186,508 €
Technique 2: Two-Stage Optimization
- Stage 1: Size components with resampled data (fast)
- Stage 2: Fix sizes and optimize dispatch at full resolution
In [7]:
# Stage 1: Sizing with resampled data
start = timeit.default_timer()
fs_sizing = flow_system.transform.resample('4h')
fs_sizing.optimize(solver)
time_stage1 = timeit.default_timer() - start
sizes = {k: float(v.item()) for k, v in fs_sizing.statistics.sizes.items()}
print(
f'Stage 1 (sizing): {time_stage1:.1f}s → CHP {sizes["CHP(Q_th)"]:.0f}, Boiler {sizes["Boiler(Q_th)"]:.0f}, Storage {sizes["Storage"]:.0f}'
)
Stage 1 (sizing): 2.5s → CHP 300, Boiler 0, Storage 1000
In [8]:
# Stage 2: Dispatch at full resolution with fixed sizes
start = timeit.default_timer()
fs_dispatch = flow_system.transform.fix_sizes(fs_sizing.statistics.sizes)
fs_dispatch.optimize(solver)
time_stage2 = timeit.default_timer() - start
print(
f'Stage 2 (dispatch): {time_stage2:.1f}s, {fs_dispatch.solution["costs"].item():,.0f} € (total: {time_stage1 + time_stage2:.1f}s)'
)
Stage 2 (dispatch): 6.7s, 2,135,566 € (total: 9.2s)
Technique 3: Full Optimization (Baseline)
For comparison, solve the full problem:
In [9]:
start = timeit.default_timer()
fs_full = flow_system.copy()
fs_full.optimize(solver)
time_full = timeit.default_timer() - start
print(f'Full optimization: {time_full:.1f}s, {fs_full.solution["costs"].item():,.0f} €')
Full optimization: 12.8s, 2,135,566 €
Compare Results
In [10]:
# Collect results
results = {
    'Full (baseline)': {
        'Time [s]': time_full,
        'Cost [€]': fs_full.solution['costs'].item(),
        'CHP Size [MW]': fs_full.statistics.sizes['CHP(Q_th)'].item(),
        'Boiler Size [MW]': fs_full.statistics.sizes['Boiler(Q_th)'].item(),
        'Storage Size [MWh]': fs_full.statistics.sizes['Storage'].item(),
    },
    'Resampled (4h)': {
        'Time [s]': time_resampled,
        'Cost [€]': fs_resampled.solution['costs'].item(),
        'CHP Size [MW]': fs_resampled.statistics.sizes['CHP(Q_th)'].item(),
        'Boiler Size [MW]': fs_resampled.statistics.sizes['Boiler(Q_th)'].item(),
        'Storage Size [MWh]': fs_resampled.statistics.sizes['Storage'].item(),
    },
    'Two-Stage': {
        'Time [s]': time_stage1 + time_stage2,
        'Cost [€]': fs_dispatch.solution['costs'].item(),
        'CHP Size [MW]': fs_dispatch.statistics.sizes['CHP(Q_th)'].item(),
        'Boiler Size [MW]': fs_dispatch.statistics.sizes['Boiler(Q_th)'].item(),
        'Storage Size [MWh]': fs_dispatch.statistics.sizes['Storage'].item(),
    },
}
comparison = pd.DataFrame(results).T
# Add relative metrics
baseline_cost = comparison.loc['Full (baseline)', 'Cost [€]']
baseline_time = comparison.loc['Full (baseline)', 'Time [s]']
comparison['Cost Gap [%]'] = ((comparison['Cost [€]'] - baseline_cost) / baseline_cost * 100).round(2)
comparison['Speedup'] = (baseline_time / comparison['Time [s]']).round(1)
comparison.style.format(
    {
        'Time [s]': '{:.2f}',
        'Cost [€]': '{:,.0f}',
        'CHP Size [MW]': '{:.1f}',
        'Boiler Size [MW]': '{:.1f}',
        'Storage Size [MWh]': '{:.0f}',
        'Cost Gap [%]': '{:.2f}',
        'Speedup': '{:.1f}x',
    }
)
Out[10]:
| | Time [s] | Cost [€] | CHP Size [MW] | Boiler Size [MW] | Storage Size [MWh] | Cost Gap [%] | Speedup |
|---|---|---|---|---|---|---|---|
| Full (baseline) | 12.77 | 2,135,566 | 300.0 | 0.0 | 1000 | 0.00 | 1.0x |
| Resampled (4h) | 2.24 | 2,186,508 | 300.0 | 0.0 | 1000 | 2.39 | 5.7x |
| Two-Stage | 9.23 | 2,135,566 | 300.0 | 0.0 | 1000 | -0.00 | 1.4x |
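The relative metrics in the table can be recomputed by hand from the time and cost columns. The sketch below does this with the printed values and, as a rule of thumb that is an assumption of this example rather than a flixopt feature, flags whether each cost gap lies within the 1% relative MIP gap the solver was configured with (`mip_gap=0.01`). Gaps smaller than that tolerance cannot be distinguished from solver slack.

```python
# Values copied from the comparison table
results = {
    'Full (baseline)': {'time': 12.77, 'cost': 2_135_566},
    'Resampled (4h)': {'time': 2.24, 'cost': 2_186_508},
    'Two-Stage': {'time': 9.23, 'cost': 2_135_566},
}
baseline = results['Full (baseline)']
mip_gap_pct = 1.0  # solver tolerance in percent, from mip_gap=0.01

summary = {}
for name, r in results.items():
    gap = (r['cost'] - baseline['cost']) / baseline['cost'] * 100  # cost gap in %
    speedup = baseline['time'] / r['time']
    summary[name] = (round(gap, 2), round(speedup, 1))
    flag = 'within' if abs(gap) <= mip_gap_pct else 'outside'
    print(f'{name:16s} gap {gap:5.2f}%  speedup {speedup:.1f}x  ({flag} solver tolerance)')
```

In this run all three approaches chose the same sizes, so the 2.39% gap of the resampled model stems from the coarser dispatch representation rather than from different investment decisions.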
Visual Comparison: Heat Balance
In [11]:
# Full optimization heat balance
fs_full.statistics.plot.balance('Heat')
In [12]:
# Two-stage optimization heat balance
fs_dispatch.statistics.plot.balance('Heat')
Energy Flow Sankey (Full Optimization)
A Sankey diagram visualizes the total energy flows:
In [13]:
fs_full.statistics.plot.sankey.flows()
When to Use Each Technique
| Technique | Best For | Trade-off |
|---|---|---|
| Full optimization | Final results, small problems | Slowest, most accurate |
| Resampling | Quick screening, trend analysis | Fast, loses temporal detail |
| Two-stage | Investment decisions, large problems | Good balance of speed and accuracy |
| Clustering | Long time series with typical periods, when extreme periods must be preserved | Requires the tsam package |
Resampling Options
# Different resolutions
fs_2h = flow_system.transform.resample('2h') # 2-hourly
fs_4h = flow_system.transform.resample('4h') # 4-hourly
fs_daily = flow_system.transform.resample('1D') # Daily
# Different aggregation methods
fs_mean = flow_system.transform.resample('4h', method='mean') # Default
fs_max = flow_system.transform.resample('4h', method='max') # Preserve peaks
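The difference between the two aggregation methods can be seen on a small synthetic profile. The sketch below uses plain pandas rather than flixopt, and the demand values are made up for illustration: a flat 150 MW load with a 30-minute peak of 260 MW. Mean aggregation keeps the energy but flattens the peak, while max aggregation keeps the peak but overstates the energy.

```python
import pandas as pd

# One synthetic day at 15-minute resolution (values are illustrative)
index = pd.date_range('2020-01-01', periods=96, freq='15min')
demand = pd.Series(150.0, index=index)
demand.iloc[32:34] = 260.0  # 30-minute peak starting at 08:00

mean_4h = demand.resample('4h').mean()
max_4h = demand.resample('4h').max()

print(f'Original peak:              {demand.max():.1f} MW')
print(f'Peak after mean resampling: {mean_4h.max():.1f} MW')
print(f'Peak after max resampling:  {max_4h.max():.1f} MW')

# Energy = power * step length (0.25 h for the original, 4 h after resampling)
energy_original = demand.sum() * 0.25
energy_mean = mean_4h.sum() * 4
energy_max = max_4h.sum() * 4
print(f'Energy: original {energy_original:.0f}, mean {energy_mean:.0f}, max {energy_max:.0f} MWh')
```

For sizing studies this suggests checking whether capacities derived from mean-resampled data still cover the full-resolution peaks, which is what the dispatch stage of the two-stage workflow effectively tests.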
Two-Stage Workflow
# Stage 1: Sizing
fs_sizing = flow_system.transform.resample('4h')
fs_sizing.optimize(solver)
# Stage 2: Dispatch
fs_dispatch = flow_system.transform.fix_sizes(fs_sizing.statistics.sizes)
fs_dispatch.optimize(solver)
Summary
You learned how to:
- Use transform.resample() to reduce time resolution
- Apply two-stage optimization for large investment problems
- Use transform.fix_sizes() to lock in investment decisions
- Compare speed vs. accuracy trade-offs
Key Takeaways
- Start fast: Use resampling for initial exploration
- Iterate: Refine with two-stage optimization
- Validate: Run full optimization for final results
- Monitor: Check cost gaps to ensure acceptable accuracy
Next Steps
- 08b-Rolling Horizon: For operational problems without investment decisions, decompose time into sequential segments
Further Reading
- For clustering with typical periods, see transform.cluster() (requires the tsam package)
- For time selection, see transform.sel() and transform.isel()