Storing Outputs#
This documentation describes the functionality of the output processing system used to store simulation outputs either in a database or as CSV files. It covers the configuration, data buffering, conversion, and storage mechanisms of the main class as well as the support class for database maintenance.
WriteOutput Class#
The WriteOutput
class is responsible for:
Collecting simulation output messages.
Buffering data until either a time-based interval or a memory limit is reached.
Converting raw data into structured pandas DataFrames.
Storing data into a database and/or exporting it as CSV files.
Managing performance trade-offs by allowing configuration of buffer sizes and storage intervals.
This class is part of the assume.common.outputs
module. For more details, refer to assume.common.outputs.WriteOutput
.
Data Processing Workflow#
Message Handling:
The method
handle_output_message
receives asynchronous messages containing simulation outputs.Data is buffered in lists corresponding to the intended database tables based on content type.
Buffer Size Management:
The memory usage of buffered data is monitored continuously.
If memory usage exceeds the
outputs_buffer_size_mb
threshold, buffered data is flushed to storage.
Data Conversion:
Helper methods like
convert_rl_params
andconvert_market_results
transform raw data into DataFrames suitable for storage.
Data Storage:
The method
store_dfs
flushes buffered DataFrames to storage based on time intervals (save_frequency_hours
) or buffer size limits.
Performance Considerations#
Buffering Strategy: Increasing
outputs_buffer_size_mb
can improve performance by reducing write frequency. Default: 300 MB.Write Intervals: Setting
save_frequency_hours
facilitates real-time observation with Grafana dashboards. Disabling this by setting it tonull
improves performance but prevents real-time monitoring.Optimal Configuration: Increase
outputs_buffer_size_mb
and disablesave_frequency_hours
for maximum performance. Ensure sufficient memory is available.
Note
When storing results as CSV files, save_frequency
is automatically disabled, meaning data is only saved at the end of the simulation or if the buffer size limit is reached. This prevents real-time observation through Grafana dashboards.
Database Tables Overview#
The simulation outputs are stored in various tables. Below is a breakdown of tables, their purpose, and what columns they contain. When the outputs are stored in CSV files, the data is organized in a similar structure, with each table represented as a separate CSV file.
Table Name |
Purpose |
Columns |
---|---|---|
market_meta |
Market-related data. |
simulation, market_id, time, node, product_start, product_end, only_hours, price, max_price, min_price, supply_volume, supply_volume_energy, demand_volume, demand_volume_energy |
market_dispatch |
Dispatch data for markets. |
simulation, market_id, datetime, unit_id, power |
unit_dispatch |
Unit-level dispatch data. |
simulation, unit, power, heat, soc, energy_generation_costs, energy_cashflow, total_costs |
market_orders |
Market order details. |
simulation, market_id, node, bid_id, unit_id, parent_bid_id, bid_type, end_time, price, volume, accepted_price, accepted_volume, min_acceptance_ratio |
power_plant_meta |
Power plant metadata. |
simulation, unit_id, unit_type (“power_plant”), max_power, min_power, emission_factor, efficiency |
storage_meta |
Storage unit metadata. |
simulation, unit_id, unit_type (“storage”), max_soc, min_soc, max_power_charge, max_power_discharge, min_power_charge, min_power_discharge, efficiency_charge, efficiency_discharge |
rl_params |
Reinforcement learning parameters. |
simulation, unit, datetime, evaluation_mode, episode, profit, reward, regret, actions, exploration_noise, critic_loss, total_grad_norm, max_grad_norm, learning_rate |
rl_meta |
Reinforcement learning metadata. |
simulation, episode, eval_episode, learning_mode, evaluation_mode |
Note
The columns for all unit_meta like power_plant_meta and storage_meta are not listed here. Their structure is dictated by the as_dict
method in the respective unit classes.
DatabaseMaintenance Class#
The DatabaseMaintenance
class provides utility functions for managing simulation data stored in a database. It includes:
Retrieving unique simulation IDs from the database.
Deleting specific simulations.
Performing bulk deletion of simulation data with optional exclusions.
This class is part of the assume.common.outputs
module. For more details, refer to assume.common.outputs.DatabaseMaintenance
.
Usage Example#
Below is an example YAML configuration snippet for a simulation configuring the save frequency and buffer size:
example_simulation:
simulation_id: example_simulation
start: 2025-01-01 00:00:00
end: 2025-01-02 00:00:00
save_frequency_hours: 24 # Time interval for saving data in hours
outputs_buffer_size_mb: 300 # Buffer size in MB
Example usage of the DatabaseMaintenance
class:
from assume.common import DatabaseMaintenance
# Select to store the simulation results in a local database or in TimescaleDB.
# When using TimescaleDB, ensure Docker is installed and the Grafana dashboard is accessible.
data_format = "timescale" # Options: "local_db" or "timescale"
if data_format == "local_db":
db_uri = "sqlite:///./examples/local_db/assume_db.db"
elif data_format == "timescale":
db_uri = "postgresql://assume:assume@localhost:5432/assume"
maintenance = DatabaseMaintenance(db_uri=db_uri)
# 1. Retrieve unique simulation IDs:
unique_ids = maintenance.get_unique_simulation_ids()
print("Unique simulation IDs:", unique_ids)
# 2. Delete specific simulations:
maintenance.delete_simulations(["example_01", "example_02"])
# 3. Delete all simulations except a few:
maintenance.delete_all_simulations(exclude=["example_01"])