Aggregation¶
Speed up large problems with time series aggregation techniques.
This notebook introduces:
- Resampling: Reduce time resolution (e.g., hourly → 4-hourly)
- Two-stage optimization: Size with reduced data, dispatch at full resolution
- Speed vs. accuracy trade-offs: When to use each technique
Setup¶
In [1]:
import timeit
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import flixopt as fx
fx.CONFIG.notebook()
Out[1]:
flixopt.config.CONFIG
Create the FlowSystem¶
We use a district heating system with real-world time series data (one month at hourly resolution):
In [2]:
from data.generate_example_systems import create_district_heating_system
flow_system = create_district_heating_system()
flow_system.connect_and_transform() # Align all data as xarray
timesteps = flow_system.timesteps
print(f'Loaded FlowSystem: {len(timesteps)} timesteps ({len(timesteps) / 24:.0f} days at hourly resolution)')
print(f'Components: {list(flow_system.components.keys())}')
Loaded FlowSystem: 744 timesteps (31 days at hourly resolution) Components: ['CHP', 'Boiler', 'Storage', 'GasGrid', 'CoalSupply', 'GridBuy', 'GridSell', 'HeatDemand', 'ElecDemand']
In [3]:
# Visualize first week of data
heat_demand = flow_system.components['HeatDemand'].inputs[0].fixed_relative_profile
electricity_price = flow_system.components['GridBuy'].outputs[0].effects_per_flow_hour['costs']
fig = make_subplots(rows=2, cols=1, shared_xaxes=True, vertical_spacing=0.1)
fig.add_trace(go.Scatter(x=timesteps[:168], y=heat_demand.values[:168], name='Heat Demand'), row=1, col=1)
fig.add_trace(go.Scatter(x=timesteps[:168], y=electricity_price.values[:168], name='Electricity Price'), row=2, col=1)
fig.update_layout(height=400, title='First Week of Data')
fig.update_yaxes(title_text='Heat Demand [MW]', row=1, col=1)
fig.update_yaxes(title_text='El. Price [€/MWh]', row=2, col=1)
fig.show()
Technique 1: Resampling¶
Reduce time resolution to speed up optimization:
In [4]:
solver = fx.solvers.HighsSolver(mip_gap=0.01)
# Resample from 1h to 4h resolution
fs_resampled = flow_system.transform.resample('4h')
reduction = (1 - len(fs_resampled.timesteps) / len(flow_system.timesteps)) * 100
print(f'Resampled: {len(flow_system.timesteps)} → {len(fs_resampled.timesteps)} timesteps ({reduction:.0f}% reduction)')
Resampled: 744 → 186 timesteps (75% reduction)
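The timestep reduction itself is ordinary time series resampling, which can be previewed with plain pandas on synthetic data (the profile here is random, not the notebook's district heating demand):

```python
import numpy as np
import pandas as pd

# One month of hourly data, like the system above (values are synthetic)
idx = pd.date_range('2024-01-01', periods=744, freq='h')
hourly = pd.Series(np.random.default_rng(0).uniform(20, 80, size=744), index=idx)

# Averaging into 4-hour bins reduces 744 timesteps to 186
four_hourly = hourly.resample('4h').mean()
print(len(hourly), '->', len(four_hourly))  # 744 -> 186
```

Fewer timesteps mean fewer variables and constraints in the optimization model, which is where the speedup comes from.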
In [5]:
# Optimize resampled system
start = timeit.default_timer()
fs_resampled.optimize(solver)
time_resampled = timeit.default_timer() - start
print(f'Resampled: {time_resampled:.1f}s, {fs_resampled.solution["costs"].item():,.0f} €')
Resampled: 6.0s, -135,955 €
Technique 2: Two-Stage Optimization¶
- Stage 1: Size components with resampled data (fast)
- Stage 2: Fix sizes and optimize dispatch at full resolution
In [6]:
# Stage 1: Sizing with resampled data
start = timeit.default_timer()
fs_sizing = flow_system.transform.resample('4h')
fs_sizing.optimize(solver)
time_stage1 = timeit.default_timer() - start
sizes = {k: float(v.item()) for k, v in fs_sizing.stats.sizes.items()}
print(
f'Stage 1 (sizing): {time_stage1:.1f}s → CHP {sizes["CHP(Q_th)"]:.0f}, Boiler {sizes["Boiler(Q_th)"]:.0f}, Storage {sizes["Storage"]:.0f}'
)
Stage 1 (sizing): 6.2s → CHP 157, Boiler 0, Storage 1000
In [7]:
# Stage 2: Dispatch at full resolution with fixed sizes
start = timeit.default_timer()
fs_dispatch = flow_system.transform.fix_sizes(fs_sizing.stats.sizes)
fs_dispatch.name = 'Two-Stage'
fs_dispatch.optimize(solver)
time_stage2 = timeit.default_timer() - start
print(
f'Stage 2 (dispatch): {time_stage2:.1f}s, {fs_dispatch.solution["costs"].item():,.0f} € (total: {time_stage1 + time_stage2:.1f}s)'
)
Stage 2 (dispatch): 7.4s, -148,776 € (total: 13.6s)
Technique 3: Full Optimization (Baseline)¶
For comparison, solve the full problem:
In [8]:
start = timeit.default_timer()
fs_full = flow_system.copy()
fs_full.name = 'Full Optimization'
fs_full.optimize(solver)
time_full = timeit.default_timer() - start
print(f'Full optimization: {time_full:.1f}s, {fs_full.solution["costs"].item():,.0f} €')
Full optimization: 16.7s, -148,912 €
Compare Results¶
In [9]:
# Collect results
results = {
'Full (baseline)': {
'Time [s]': time_full,
'Cost [€]': fs_full.solution['costs'].item(),
'CHP Size [MW]': fs_full.stats.sizes['CHP(Q_th)'].item(),
'Boiler Size [MW]': fs_full.stats.sizes['Boiler(Q_th)'].item(),
'Storage Size [MWh]': fs_full.stats.sizes['Storage'].item(),
},
'Resampled (4h)': {
'Time [s]': time_resampled,
'Cost [€]': fs_resampled.solution['costs'].item(),
'CHP Size [MW]': fs_resampled.stats.sizes['CHP(Q_th)'].item(),
'Boiler Size [MW]': fs_resampled.stats.sizes['Boiler(Q_th)'].item(),
'Storage Size [MWh]': fs_resampled.stats.sizes['Storage'].item(),
},
'Two-Stage': {
'Time [s]': time_stage1 + time_stage2,
'Cost [€]': fs_dispatch.solution['costs'].item(),
'CHP Size [MW]': fs_dispatch.stats.sizes['CHP(Q_th)'].item(),
'Boiler Size [MW]': fs_dispatch.stats.sizes['Boiler(Q_th)'].item(),
'Storage Size [MWh]': fs_dispatch.stats.sizes['Storage'].item(),
},
}
comparison = pd.DataFrame(results).T
# Add relative metrics
baseline_cost = comparison.loc['Full (baseline)', 'Cost [€]']
baseline_time = comparison.loc['Full (baseline)', 'Time [s]']
comparison['Cost Gap [%]'] = ((comparison['Cost [€]'] - baseline_cost) / baseline_cost * 100).round(2)
comparison['Speedup'] = (baseline_time / comparison['Time [s]']).round(1)
comparison.style.format(
{
'Time [s]': '{:.2f}',
'Cost [€]': '{:,.0f}',
'CHP Size [MW]': '{:.1f}',
'Boiler Size [MW]': '{:.1f}',
'Storage Size [MWh]': '{:.0f}',
'Cost Gap [%]': '{:.2f}',
'Speedup': '{:.1f}x',
}
)
Out[9]:
| | Time [s] | Cost [€] | CHP Size [MW] | Boiler Size [MW] | Storage Size [MWh] | Cost Gap [%] | Speedup |
|---|---|---|---|---|---|---|---|
| Full (baseline) | 16.69 | -148,912 | 165.7 | 0.0 | 1000 | -0.00 | 1.0x |
| Resampled (4h) | 5.97 | -135,955 | 156.6 | 0.0 | 1000 | -8.70 | 2.8x |
| Two-Stage | 13.62 | -148,776 | 156.6 | 0.0 | 1000 | -0.09 | 1.2x |
Visual Comparison: Heat Balance¶
Compare the full optimization with the two-stage approach side-by-side:
In [10]:
# Side-by-side comparison of full optimization vs two-stage
comp = fx.Comparison([fs_full, fs_dispatch])
comp.stats.plot.balance('Heat')
Out[10]:
Energy Flow Sankey (Full Optimization)¶
A Sankey diagram visualizes the total energy flows:
In [11]:
fs_full.stats.plot.sankey.flows()
Out[11]:
When to Use Each Technique¶
| Technique | Best For | Trade-off |
|---|---|---|
| Full optimization | Final results, small problems | Slowest, most accurate |
| Resampling | Quick screening, trend analysis | Fast, loses temporal detail |
| Two-stage | Investment decisions, large problems | Good balance of speed and accuracy |
| Clustering | Preserves extreme periods | Requires tsam package |
Resampling Options¶
# Different resolutions
fs_2h = flow_system.transform.resample('2h') # 2-hourly
fs_4h = flow_system.transform.resample('4h') # 4-hourly
fs_daily = flow_system.transform.resample('1D') # Daily
# Different aggregation methods
fs_mean = flow_system.transform.resample('4h', method='mean') # Default
fs_max = flow_system.transform.resample('4h', method='max') # Preserve peaks
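Why the aggregation method matters can be seen with plain pandas, independent of flixopt: averaging erodes the short peaks that drive capacity sizing, while taking the maximum keeps them (toy numbers for illustration):

```python
import pandas as pd

# A spiky hourly profile: mostly 50 MW with one 100 MW peak hour
idx = pd.date_range('2024-01-01', periods=8, freq='h')
demand = pd.Series([50, 50, 100, 50, 50, 50, 50, 50], index=idx, dtype=float)

mean_4h = demand.resample('4h').mean()  # smooths the peak: [62.5, 50.0]
max_4h = demand.resample('4h').max()    # preserves it:     [100.0, 50.0]
print(mean_4h.tolist(), max_4h.tolist())
```

Mean-resampled demand would let the sizing stage pick a capacity well below the true peak, so max-resampling can be the safer choice when peaks matter.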
Two-Stage Workflow¶
# Stage 1: Sizing
fs_sizing = flow_system.transform.resample('4h')
fs_sizing.optimize(solver)
# Stage 2: Dispatch
fs_dispatch = flow_system.transform.fix_sizes(fs_sizing.stats.sizes)
fs_dispatch.optimize(solver)
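The trade-off behind this workflow can be sketched library-agnostically in plain numpy (a toy single-generator model with synthetic demand, not flixopt): sizing against averaged data can undershoot the true hourly peak, which is exactly what the full-resolution dispatch stage exposes.

```python
import numpy as np

# Synthetic hourly demand for one month
rng = np.random.default_rng(1)
hourly_demand = rng.uniform(30, 90, size=744)

# "Stage 1": size capacity against the peak of 4-hourly averages
demand_4h = hourly_demand.reshape(-1, 4).mean(axis=1)
capacity = demand_4h.max()

# "Stage 2": check dispatch at full resolution with the size fixed
unserved = np.clip(hourly_demand - capacity, 0, None).sum()
print(f'capacity from coarse data: {capacity:.1f} MW, '
      f'hourly peak: {hourly_demand.max():.1f} MW, '
      f'unserved energy: {unserved:.1f} MWh')
```

Here the coarse capacity sits below the hourly peak, leaving some demand unserved; in the flixopt workflow above, the storage and the second-stage dispatch absorb most of this gap, which is why the two-stage cost lands so close to the full baseline.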
Summary¶
You learned how to:
- Use transform.resample() to reduce time resolution
- Apply two-stage optimization for large investment problems
- Use transform.fix_sizes() to lock in investment decisions
- Compare speed vs. accuracy trade-offs
Key Takeaways¶
- Start fast: Use resampling for initial exploration
- Iterate: Refine with two-stage optimization
- Validate: Run full optimization for final results
- Monitor: Check cost gaps to ensure acceptable accuracy
Next Steps¶
- 08b-Rolling Horizon: For operational problems, decompose time into sequential segments
- 08c-Clustering: Use typical periods with the tsam package
Further Reading¶
- For clustering with typical periods, see transform.cluster() (requires the tsam package)
- For time selection, see transform.sel() and transform.isel()