Aggregation¶
Speed up large problems with time series aggregation techniques.
This notebook introduces:
- Resampling: Reduce time resolution (e.g., hourly → 4-hourly)
- Two-stage optimization: Size with reduced data, dispatch at full resolution
- Speed vs. accuracy trade-offs: When to use each technique
Setup¶
In [1]:
import timeit
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import flixopt as fx
fx.CONFIG.notebook()
Out[1]:
flixopt.config.CONFIG
Create the FlowSystem¶
We use a district heating system with real-world time series data (one month at hourly resolution):
In [2]:
from data.generate_example_systems import create_district_heating_system
flow_system = create_district_heating_system()
flow_system.connect_and_transform() # Align all data as xarray
timesteps = flow_system.timesteps
print(f'Loaded FlowSystem: {len(timesteps)} timesteps ({len(timesteps) / 24:.0f} days at hourly resolution)')
print(f'Components: {list(flow_system.components.keys())}')
Loaded FlowSystem: 744 timesteps (31 days at hourly resolution) Components: ['CHP', 'Boiler', 'Storage', 'GasGrid', 'CoalSupply', 'GridBuy', 'GridSell', 'HeatDemand', 'ElecDemand']
In [3]:
# Visualize first week of data
heat_demand = flow_system.components['HeatDemand'].inputs[0].fixed_relative_profile
electricity_price = flow_system.components['GridBuy'].outputs[0].effects_per_flow_hour['costs']
fig = make_subplots(rows=2, cols=1, shared_xaxes=True, vertical_spacing=0.1)
fig.add_trace(go.Scatter(x=timesteps[:168], y=heat_demand.values[:168], name='Heat Demand'), row=1, col=1)
fig.add_trace(go.Scatter(x=timesteps[:168], y=electricity_price.values[:168], name='Electricity Price'), row=2, col=1)
fig.update_layout(height=400, title='First Week of Data')
fig.update_yaxes(title_text='Heat Demand [MW]', row=1, col=1)
fig.update_yaxes(title_text='El. Price [€/MWh]', row=2, col=1)
fig.show()
Technique 1: Resampling¶
Reduce time resolution to speed up optimization:
In [4]:
solver = fx.solvers.HighsSolver(mip_gap=0.01)
# Resample from 1h to 4h resolution
fs_resampled = flow_system.transform.resample('4h')
reduction = (1 - len(fs_resampled.timesteps) / len(flow_system.timesteps)) * 100
print(f'Resampled: {len(flow_system.timesteps)} → {len(fs_resampled.timesteps)} timesteps ({reduction:.0f}% reduction)')
Resampled: 744 → 186 timesteps (75% reduction)
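The timestep reduction itself is ordinary time series resampling, which can be previewed with plain pandas on synthetic data (the profile here is random, not the notebook's district heating demand):

```python
import numpy as np
import pandas as pd

# One month of hourly data, like the system above (values are synthetic)
idx = pd.date_range('2024-01-01', periods=744, freq='h')
hourly = pd.Series(np.random.default_rng(0).uniform(20, 80, size=744), index=idx)

# Averaging into 4-hour bins reduces 744 timesteps to 186
four_hourly = hourly.resample('4h').mean()
print(len(hourly), '->', len(four_hourly))  # 744 -> 186
```

Fewer timesteps mean fewer variables and constraints in the optimization model, which is where the speedup comes from.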
In [5]:
# Optimize resampled system
start = timeit.default_timer()
fs_resampled.optimize(solver)
time_resampled = timeit.default_timer() - start
print(f'Resampled: {time_resampled:.1f}s, {fs_resampled.solution["costs"].item():,.0f} €')
Resampled: 6.0s, -135,955 €
Technique 2: Two-Stage Optimization¶
- Stage 1: Size components with resampled data (fast)
- Stage 2: Fix sizes and optimize dispatch at full resolution
In [6]:
# Stage 1: Sizing with resampled data
start = timeit.default_timer()
fs_sizing = flow_system.transform.resample('4h')
fs_sizing.optimize(solver)
time_stage1 = timeit.default_timer() - start
sizes = {k: float(v.item()) for k, v in fs_sizing.stats.sizes.items()}
print(
f'Stage 1 (sizing): {time_stage1:.1f}s → CHP {sizes["CHP(Q_th)"]:.0f}, Boiler {sizes["Boiler(Q_th)"]:.0f}, Storage {sizes["Storage"]:.0f}'
)
Stage 1 (sizing): 6.2s → CHP 157, Boiler 0, Storage 1000
In [7]:
# Stage 2: Dispatch at full resolution with fixed sizes
start = timeit.default_timer()
fs_dispatch = flow_system.transform.fix_sizes(fs_sizing.stats.sizes)
fs_dispatch.name = 'Two-Stage'
fs_dispatch.optimize(solver)
time_stage2 = timeit.default_timer() - start
print(
f'Stage 2 (dispatch): {time_stage2:.1f}s, {fs_dispatch.solution["costs"].item():,.0f} € (total: {time_stage1 + time_stage2:.1f}s)'
)
Stage 2 (dispatch): 7.4s, -148,776 € (total: 13.6s)
Technique 3: Full Optimization (Baseline)¶
For comparison, solve the full problem:
In [8]:
start = timeit.default_timer()
fs_full = flow_system.copy()
fs_full.name = 'Full Optimization'
fs_full.optimize(solver)
time_full = timeit.default_timer() - start
print(f'Full optimization: {time_full:.1f}s, {fs_full.solution["costs"].item():,.0f} €')
Full optimization: 16.7s, -148,912 €
Compare Results¶
In [9]:
# Collect results
results = {
'Full (baseline)': {
'Time [s]': time_full,
'Cost [€]': fs_full.solution['costs'].item(),
'CHP Size [MW]': fs_full.stats.sizes['CHP(Q_th)'].item(),
'Boiler Size [MW]': fs_full.stats.sizes['Boiler(Q_th)'].item(),
'Storage Size [MWh]': fs_full.stats.sizes['Storage'].item(),
},
'Resampled (4h)': {
'Time [s]': time_resampled,
'Cost [€]': fs_resampled.solution['costs'].item(),
'CHP Size [MW]': fs_resampled.stats.sizes['CHP(Q_th)'].item(),
'Boiler Size [MW]': fs_resampled.stats.sizes['Boiler(Q_th)'].item(),
'Storage Size [MWh]': fs_resampled.stats.sizes['Storage'].item(),
},
'Two-Stage': {
'Time [s]': time_stage1 + time_stage2,
'Cost [€]': fs_dispatch.solution['costs'].item(),
'CHP Size [MW]': fs_dispatch.stats.sizes['CHP(Q_th)'].item(),
'Boiler Size [MW]': fs_dispatch.stats.sizes['Boiler(Q_th)'].item(),
'Storage Size [MWh]': fs_dispatch.stats.sizes['Storage'].item(),
},
}
comparison = pd.DataFrame(results).T
# Add relative metrics
baseline_cost = comparison.loc['Full (baseline)', 'Cost [€]']
baseline_time = comparison.loc['Full (baseline)', 'Time [s]']
comparison['Cost Gap [%]'] = ((comparison['Cost [€]'] - baseline_cost) / baseline_cost * 100).round(2)
comparison['Speedup'] = (baseline_time / comparison['Time [s]']).round(1)
comparison.style.format(
{
'Time [s]': '{:.2f}',
'Cost [€]': '{:,.0f}',
'CHP Size [MW]': '{:.1f}',
'Boiler Size [MW]': '{:.1f}',
'Storage Size [MWh]': '{:.0f}',
'Cost Gap [%]': '{:.2f}',
'Speedup': '{:.1f}x',
}
)
Out[9]:
| | Time [s] | Cost [€] | CHP Size [MW] | Boiler Size [MW] | Storage Size [MWh] | Cost Gap [%] | Speedup |
|---|---|---|---|---|---|---|---|
| Full (baseline) | 16.69 | -148,912 | 165.7 | 0.0 | 1000 | -0.00 | 1.0x |
| Resampled (4h) | 5.97 | -135,955 | 156.6 | 0.0 | 1000 | -8.70 | 2.8x |
| Two-Stage | 13.62 | -148,776 | 156.6 | 0.0 | 1000 | -0.09 | 1.2x |
Visual Comparison: Heat Balance¶
Compare the full optimization with the two-stage approach side-by-side:
In [10]:
# Side-by-side comparison of full optimization vs two-stage
comp = fx.Comparison([fs_full, fs_dispatch])
comp.stats.plot.balance('Heat')
Out[10]:
Energy Flow Sankey (Full Optimization)¶
A Sankey diagram visualizes the total energy flows:
In [11]:
fs_full.stats.plot.sankey.flows()
Out[11]:
When to Use Each Technique¶
| Technique | Best For | Trade-off |
|---|---|---|
| Full optimization | Final results, small problems | Slowest, most accurate |
| Resampling | Quick screening, trend analysis | Fast, loses temporal detail |
| Two-stage | Investment decisions, large problems | Good balance of speed and accuracy |
| Clustering | Preserves extreme periods | Requires tsam package |
Resampling Options¶
# Different resolutions
fs_2h = flow_system.transform.resample('2h') # 2-hourly
fs_4h = flow_system.transform.resample('4h') # 4-hourly
fs_daily = flow_system.transform.resample('1D') # Daily
# Different aggregation methods
fs_mean = flow_system.transform.resample('4h', method='mean') # Default
fs_max = flow_system.transform.resample('4h', method='max') # Preserve peaks
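Why the aggregation method matters can be seen with plain pandas, independent of flixopt: averaging erodes the short peaks that drive capacity sizing, while taking the maximum keeps them (toy numbers for illustration):

```python
import pandas as pd

# A spiky hourly profile: mostly 50 MW with one 100 MW peak hour
idx = pd.date_range('2024-01-01', periods=8, freq='h')
demand = pd.Series([50, 50, 100, 50, 50, 50, 50, 50], index=idx, dtype=float)

mean_4h = demand.resample('4h').mean()  # smooths the peak: [62.5, 50.0]
max_4h = demand.resample('4h').max()    # preserves it:     [100.0, 50.0]
print(mean_4h.tolist(), max_4h.tolist())
```

Mean-resampled demand would let the sizing stage pick a capacity well below the true peak, so max-resampling can be the safer choice when peaks matter.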
Two-Stage Workflow¶
# Stage 1: Sizing
fs_sizing = flow_system.transform.resample('4h')
fs_sizing.optimize(solver)
# Stage 2: Dispatch
fs_dispatch = flow_system.transform.fix_sizes(fs_sizing.stats.sizes)
fs_dispatch.optimize(solver)
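The trade-off behind this workflow can be sketched library-agnostically in plain numpy (a toy single-generator model with synthetic demand, not flixopt): sizing against averaged data can undershoot the true hourly peak, which is exactly what the full-resolution dispatch stage exposes.

```python
import numpy as np

# Synthetic hourly demand for one month
rng = np.random.default_rng(1)
hourly_demand = rng.uniform(30, 90, size=744)

# "Stage 1": size capacity against the peak of 4-hourly averages
demand_4h = hourly_demand.reshape(-1, 4).mean(axis=1)
capacity = demand_4h.max()

# "Stage 2": check dispatch at full resolution with the size fixed
unserved = np.clip(hourly_demand - capacity, 0, None).sum()
print(f'capacity from coarse data: {capacity:.1f} MW, '
      f'hourly peak: {hourly_demand.max():.1f} MW, '
      f'unserved energy: {unserved:.1f} MWh')
```

Here the coarse capacity sits below the hourly peak, leaving some demand unserved; in the flixopt workflow above, the storage and the second-stage dispatch absorb most of this gap, which is why the two-stage cost lands so close to the full baseline.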
Summary¶
You learned how to:
- Use transform.resample() to reduce time resolution
- Apply two-stage optimization for large investment problems
- Use transform.fix_sizes() to lock in investment decisions
- Compare speed vs. accuracy trade-offs
Key Takeaways¶
- Start fast: Use resampling for initial exploration
- Iterate: Refine with two-stage optimization
- Validate: Run full optimization for final results
- Monitor: Check cost gaps to ensure acceptable accuracy
Next Steps¶
- 08b-Rolling Horizon: For operational problems, decompose time into sequential segments
- 08c-Clustering: Use typical periods with the tsam package
Further Reading¶
- For clustering with typical periods, see transform.cluster() (requires the tsam package)
- For time selection, see transform.sel() and transform.isel()