flixopt.transform_accessor ¶
Transform accessor for FlowSystem.
This module provides the TransformAccessor class that enables transformations on FlowSystem like clustering, selection, and resampling.
Classes¶
TransformAccessor ¶
Accessor for transformation methods on FlowSystem.
This class provides transformations that create new FlowSystem instances with modified structure or data, accessible via flow_system.transform.
Examples:
Time series aggregation (8 typical days):
>>> reduced_fs = flow_system.transform.cluster(n_clusters=8, cluster_duration='1D')
>>> reduced_fs.optimize(solver)
>>> expanded_fs = reduced_fs.transform.expand()
Initialize the accessor with a reference to the FlowSystem.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
flow_system | FlowSystem | The FlowSystem to transform. | required |
Functions¶
sel ¶
sel(time: str | slice | list[str] | Timestamp | DatetimeIndex | None = None, period: int | slice | list[int] | Index | None = None, scenario: str | slice | list[str] | Index | None = None) -> FlowSystem
Select a subset of the FlowSystem by label.
Creates a new FlowSystem with data selected along the specified dimensions. The returned FlowSystem has no solution (it must be re-optimized).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
time | str | slice | list[str] | Timestamp | DatetimeIndex | None | Time selection (e.g., slice('2023-01-01', '2023-12-31'), '2023-06-15') | None |
period | int | slice | list[int] | Index | None | Period selection (e.g., slice(2023, 2024), or list of periods) | None |
scenario | str | slice | list[str] | Index | None | Scenario selection (e.g., 'scenario1', or list of scenarios) | None |
Returns:
| Name | Type | Description |
|---|---|---|
FlowSystem | FlowSystem | New FlowSystem with selected data (no solution). |
Examples:
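A minimal illustrative sketch of label-based selection (the dates, the 'high' scenario label, and the variable names are hypothetical):

>>> # Select June 2023 by label slice
>>> fs_june = flow_system.transform.sel(time=slice('2023-06-01', '2023-06-30'))
>>> # Select a single scenario
>>> fs_high = flow_system.transform.sel(scenario='high')
>>> fs_june.optimize(solver)  # the selection carries no solution, so re-optimize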
isel ¶
isel(time: int | slice | list[int] | None = None, period: int | slice | list[int] | None = None, scenario: int | slice | list[int] | None = None) -> FlowSystem
Select a subset of the FlowSystem by integer indices.
Creates a new FlowSystem with data selected along the specified dimensions. The returned FlowSystem has no solution (it must be re-optimized).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
time | int | slice | list[int] | None | Time selection by integer index (e.g., slice(0, 100), 50, or [0, 5, 10]) | None |
period | int | slice | list[int] | None | Period selection by integer index | None |
scenario | int | slice | list[int] | None | Scenario selection by integer index | None |
Returns:
| Name | Type | Description |
|---|---|---|
FlowSystem | FlowSystem | New FlowSystem with selected data (no solution). |
Examples:
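An illustrative sketch using integer positions (the indices and variable names are hypothetical):

>>> # First 168 timesteps (one week of hourly data)
>>> fs_week = flow_system.transform.isel(time=slice(0, 168))
>>> # First period and first scenario
>>> fs_first = flow_system.transform.isel(period=0, scenario=0)
>>> fs_week.optimize(solver)  # the selection carries no solution, so re-optimize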
resample ¶
resample(time: str, method: Literal['mean', 'sum', 'max', 'min', 'first', 'last', 'std', 'var', 'median', 'count'] = 'mean', hours_of_last_timestep: int | float | None = None, hours_of_previous_timesteps: int | float | ndarray | None = None, fill_gaps: Literal['ffill', 'bfill', 'interpolate'] | None = None, **kwargs: Any) -> FlowSystem
Resample data along the time dimension to create a new FlowSystem.
Creates a new FlowSystem with resampled time series data. The returned FlowSystem has no solution (it must be re-optimized).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
time | str | Resampling frequency (e.g., '3h', '2D', '1M') | required |
method | Literal['mean', 'sum', 'max', 'min', 'first', 'last', 'std', 'var', 'median', 'count'] | Resampling method. Recommended: 'mean', 'first', 'last', 'max', 'min' | 'mean' |
hours_of_last_timestep | int | float | None | Duration of the last timestep after resampling. If None, computed from the last time interval. | None |
hours_of_previous_timesteps | int | float | ndarray | None | Duration of previous timesteps after resampling. If None, computed from the first time interval. Can be a scalar or array. | None |
fill_gaps | Literal['ffill', 'bfill', 'interpolate'] | None | Strategy for filling gaps (NaN values) that arise when resampling irregular timesteps to regular intervals. Options: 'ffill' (forward fill), 'bfill' (backward fill), 'interpolate' (linear interpolation). If None (default), raises an error when gaps are detected. | None |
**kwargs | Any | Additional arguments passed to xarray.resample() | {} |
Returns:
| Name | Type | Description |
|---|---|---|
FlowSystem | FlowSystem | New resampled FlowSystem (no solution). |
Raises:
| Type | Description |
|---|---|
ValueError | If resampling creates gaps and fill_gaps is not specified. |
Examples:
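A minimal sketch (the frequencies and variable names are illustrative):

>>> # Resample to 4-hour means
>>> fs_4h = flow_system.transform.resample('4h', method='mean')
>>> # Irregular timesteps: forward-fill any gaps created by resampling
>>> fs_2h = flow_system.transform.resample('2h', fill_gaps='ffill')
>>> fs_4h.optimize(solver)  # the resampled system carries no solution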
fix_sizes ¶
fix_sizes(sizes: Dataset | dict[str, float] | None = None, decimal_rounding: int | None = 5) -> FlowSystem
Create a new FlowSystem with investment sizes fixed to specified values.
This is useful for two-stage optimization workflows:
1. Solve a sizing problem (possibly resampled for speed)
2. Fix sizes and solve dispatch at full resolution
The returned FlowSystem has InvestParameters with fixed_size set, making those sizes mandatory rather than decision variables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
sizes | Dataset | dict[str, float] | None | The sizes to fix. Can be: - None: Uses sizes from this FlowSystem's solution (must be solved) - xr.Dataset: Dataset with size variables (e.g., from statistics.sizes) - dict: Mapping of component names to sizes (e.g., {'Boiler(Q_fu)': 100}) | None |
decimal_rounding | int | None | Number of decimal places to round sizes to. Rounding helps avoid numerical infeasibility. Set to None to disable. | 5 |
Returns:
| Name | Type | Description |
|---|---|---|
FlowSystem | FlowSystem | New FlowSystem with fixed sizes (no solution). |
Raises:
| Type | Description |
|---|---|
ValueError | If no sizes provided and FlowSystem has no solution. |
KeyError | If a specified size doesn't match any InvestParameters. |
Examples:
Two-stage optimization:
>>> # Stage 1: Size with resampled data
>>> fs_sizing = flow_system.transform.resample('2h')
>>> fs_sizing.optimize(solver)
>>>
>>> # Stage 2: Fix sizes and optimize at full resolution
>>> fs_dispatch = flow_system.transform.fix_sizes(fs_sizing.stats.sizes)
>>> fs_dispatch.optimize(solver)
Using a dict:
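A sketch passing sizes explicitly (the 'Boiler(Q_fu)' key mirrors the parameter docs above; 'Storage' and both values are hypothetical):

>>> fs_dispatch = flow_system.transform.fix_sizes({'Boiler(Q_fu)': 100, 'Storage': 250})
>>> fs_dispatch.optimize(solver)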
clustering_data ¶
clustering_data(period: Any | None = None, scenario: Any | None = None) -> Dataset
Get the time-varying data that would be used for clustering.
This method extracts only the data arrays that vary over time, which is the data that clustering algorithms use to identify typical periods. Constant arrays (same value for all timesteps) are excluded since they don't contribute to pattern identification.
Use this to inspect or pre-process the data before clustering, or to understand which variables influence the clustering result.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
period | Any | None | Optional period label to select. If None and the FlowSystem has multiple periods, returns data for all periods. | None |
scenario | Any | None | Optional scenario label to select. If None and the FlowSystem has multiple scenarios, returns data for all scenarios. | None |
Returns:
| Type | Description |
|---|---|
Dataset | xr.Dataset containing only time-varying data arrays, such as demand profiles, price profiles, and other time series that vary over the time dimension.
Examples:
Inspect clustering input data:
>>> data = flow_system.transform.clustering_data()
>>> print(f'Variables used for clustering: {list(data.data_vars)}')
>>> data['HeatDemand(Q)|fixed_relative_profile'].plot()
Get data for a specific period/scenario:
>>> data_2024 = flow_system.transform.clustering_data(period=2024)
>>> data_high = flow_system.transform.clustering_data(scenario='high')
Convert to DataFrame for external tools:
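Since clustering_data() returns an xr.Dataset, the standard xarray conversion applies; a sketch:

>>> df = flow_system.transform.clustering_data().to_dataframe()
>>> df.describe()  # summary statistics per variable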
cluster ¶
cluster(n_clusters: int, cluster_duration: str | float, data_vars: list[str] | None = None, cluster: ClusterConfig | None = None, extremes: ExtremeConfig | None = None, segments: SegmentConfig | None = None, preserve_column_means: bool = True, rescale_exclude_columns: list[str] | None = None, round_decimals: int | None = None, numerical_tolerance: float = 1e-13, **tsam_kwargs: Any) -> FlowSystem
Create a FlowSystem with reduced timesteps using typical clusters.
This method creates a new FlowSystem optimized for sizing studies by reducing the number of timesteps to only the typical (representative) clusters identified through time series aggregation using the tsam package.
The method:
1. Performs time series clustering using tsam (hierarchical by default)
2. Extracts only the typical clusters (not all original timesteps)
3. Applies timestep weighting for accurate cost representation
4. Handles storage states between clusters based on each Storage's cluster_mode
Use this for initial sizing optimization, then use fix_sizes() to re-optimize at full resolution for accurate dispatch results.
To reuse an existing clustering on different data, use apply_clustering() instead.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
n_clusters | int | Number of clusters (typical periods) to extract (e.g., 8 typical days). | required |
cluster_duration | str | float | Duration of each cluster. Can be a pandas-style string ('1D', '24h', '6h') or a numeric value in hours. | required |
data_vars | list[str] | None | Optional list of variable names to use for clustering. If specified, only these variables are used to determine cluster assignments, but the clustering is then applied to ALL time-varying data in the FlowSystem. Use clustering_data() to inspect the available variables. | None |
cluster | ClusterConfig | None | Optional tsam ClusterConfig controlling the clustering algorithm and its settings. | None |
extremes | ExtremeConfig | None | Optional tsam ExtremeConfig for preserving extreme periods (e.g., peak demand). | None |
segments | SegmentConfig | None | Optional tsam SegmentConfig for intra-period segmentation. | None |
preserve_column_means | bool | Rescale typical periods so each column's weighted mean matches the original data's mean. Ensures total energy/load is preserved when weights represent occurrence counts. Default is True. | True |
rescale_exclude_columns | list[str] | None | Column names to exclude from rescaling when preserve_column_means is True. | None |
round_decimals | int | None | Round output values to this many decimal places. If None (default), no rounding is applied. | None |
numerical_tolerance | float | Tolerance for numerical precision issues. Controls when warnings are raised for aggregated values exceeding original time series bounds. Default is 1e-13. | 1e-13 |
**tsam_kwargs | Any | Additional keyword arguments passed to the underlying tsam aggregation. | {} |
Returns:
| Type | Description |
|---|---|
FlowSystem | A new FlowSystem with reduced timesteps (only typical clusters). Clustering metadata is stored on the result, enabling later expansion via expand() or reuse via apply_clustering().
Raises:
| Type | Description |
|---|---|
ValueError | If timestep sizes are inconsistent. |
ValueError | If cluster_duration is not a multiple of timestep size. |
Examples:
Basic clustering with peak preservation:
>>> from tsam import ExtremeConfig
>>> fs_clustered = flow_system.transform.cluster(
... n_clusters=8,
... cluster_duration='1D',
... extremes=ExtremeConfig(
... method='new_cluster',
... max_value=['HeatDemand(Q_th)|fixed_relative_profile'],
... ),
... )
>>> fs_clustered.optimize(solver)
Clustering based on specific variables only:
>>> # See available variables for clustering
>>> print(flow_system.transform.clustering_data().data_vars)
>>>
>>> # Cluster based only on demand profile
>>> fs_clustered = flow_system.transform.cluster(
... n_clusters=8,
... cluster_duration='1D',
... data_vars=['HeatDemand(Q)|fixed_relative_profile'],
... )
Note
- This is best suited for initial sizing, not final dispatch optimization
- Use extremes to ensure peak demand clusters are captured
- A 5-10% safety margin on sizes is recommended for the dispatch stage
- For seasonal storage (e.g., hydrogen, thermal storage), set Storage.cluster_mode='intercluster' or 'intercluster_cyclic'
apply_clustering ¶
apply_clustering(clustering: Clustering) -> FlowSystem
Apply an existing clustering to this FlowSystem.
This method applies a previously computed clustering (from another FlowSystem) to the current FlowSystem's data. The clustering structure (cluster assignments, number of clusters, etc.) is preserved while the time series data is aggregated according to the existing cluster assignments.
Use this to:
- Compare different scenarios with identical cluster assignments
- Apply a reference clustering to new data
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
clustering | Clustering | A Clustering object from a previously clustered FlowSystem. | required |
Returns:
| Type | Description |
|---|---|
FlowSystem | A new FlowSystem with reduced timesteps (only typical clusters). Clustering metadata is stored on the result, enabling later expansion via expand().
Raises:
| Type | Description |
|---|---|
ValueError | If the clustering dimensions don't match this FlowSystem's periods/scenarios. |
Examples:
Apply clustering from one FlowSystem to another:
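An illustrative sketch; how the Clustering object is retrieved from the reference FlowSystem is not shown in this docstring, so ref_clustering stands in for it (all other names are hypothetical too):

>>> fs_ref = reference_flow_system.transform.cluster(n_clusters=8, cluster_duration='1D')
>>> # ref_clustering: the Clustering produced by the call above
>>> fs_applied = other_flow_system.transform.apply_clustering(ref_clustering)
>>> fs_applied.optimize(solver)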
expand ¶
expand() -> FlowSystem
Expand a clustered FlowSystem back to full original timesteps.
After solving a FlowSystem created with cluster(), this method disaggregates the FlowSystem by:
1. Expanding all time series data from typical clusters to full timesteps
2. Expanding the solution by mapping each typical cluster back to all original clusters it represents
For FlowSystems with periods and/or scenarios, each (period, scenario) combination is expanded using its own cluster assignment.
This enables using all existing solution accessors (statistics, plot, etc.) with full time resolution, where both the data and solution are consistently expanded from the typical clusters.
Returns:
| Name | Type | Description |
|---|---|---|
FlowSystem | FlowSystem | A new FlowSystem with full timesteps and expanded solution. |
Raises:
| Type | Description |
|---|---|
ValueError | If the FlowSystem was not created with cluster(). |
ValueError | If the FlowSystem has no solution. |
Examples:
Two-stage optimization with expansion:
>>> # Stage 1: Size with reduced timesteps
>>> fs_reduced = flow_system.transform.cluster(
... n_clusters=8,
... cluster_duration='1D',
... )
>>> fs_reduced.optimize(solver)
>>>
>>> # Expand to full resolution FlowSystem
>>> fs_expanded = fs_reduced.transform.expand()
>>>
>>> # Use all existing accessors with full timesteps
>>> fs_expanded.stats.flow_rates # Full 8760 timesteps
>>> fs_expanded.stats.plot.balance('HeatBus') # Full resolution plots
>>> fs_expanded.stats.plot.heatmap('Boiler(Q_th)|flow_rate')
Note
The expanded FlowSystem repeats each typical cluster's values for all original periods assigned to that cluster. Both input data and solution are consistently expanded, so they match. This is an approximation: the actual dispatch at full resolution would differ due to intra-cluster variations in the time series data.
For accurate dispatch results, use fix_sizes() to fix the sizes from the reduced optimization and re-optimize at full resolution.
Segmented Systems Variable Handling:
For systems clustered with SegmentConfig, special handling is applied to time-varying solution variables. Variables without a time dimension are unaffected by segment expansion. This includes:
- Investment: {component}|size, {component}|exists
- Storage boundaries: {storage}|SOC_boundary
- Aggregated totals: {flow}|total_flow_hours, {flow}|active_hours
- Effect totals: {effect}, {effect}(temporal), {effect}(periodic)
Time-varying variables are categorized and handled as follows:
- State variables - Interpolated within segments:
  - {storage}|charge_state: Linear interpolation between segment boundary values to show the charge trajectory during charge/discharge.
- Segment totals - Divided by segment duration. These variables represent values summed over the segment; division converts them back to hourly rates for correct plotting and analysis:
  - {effect}(temporal)|per_timestep: Per-timestep effect contributions
  - {flow}->{effect}(temporal): Flow contributions (includes both effects_per_flow_hour and effects_per_startup)
  - {component}->{effect}(temporal): Component-level contributions
  - {source}(temporal)->{target}(temporal): Effect-to-effect shares
- Rate/average variables - Expanded as-is. These represent average values within the segment; tsam already provides properly averaged values, so no correction is needed:
  - {flow}|flow_rate: Average flow rate during segment
  - {storage}|netto_discharge: Net discharge rate (discharge - charge)
- Binary status variables - Constant within segment. These cannot be meaningfully interpolated; the status indicates the dominant state during the segment:
  - {flow}|status: On/off status (0 or 1), repeated for all timesteps
- Binary event variables (segmented systems only) - Placed at the first timestep of each segment, with zeros elsewhere. This preserves the total count of events while providing a reasonable temporal placement. For non-segmented systems, the timing within the cluster is preserved by normal expansion (no special handling needed):
  - {flow}|startup: Startup event
  - {flow}|shutdown: Shutdown event