dynabench.dataset
Module for loading the data.
Functions
download_equation | Download a dataset and unpack it to the right place.
Classes
DynabenchIterator | Iterator for the Dynabench dataset.
DynabenchSimulationIterator |
EquationMovingWindowIterator | Iterator for arbitrary equations generated using the dynabench solver.
- class dynabench.dataset.DynabenchIterator(split: str = 'train', equation: str = 'wave', structure: str = 'cloud', resolution: str = 'low', base_path: str = 'data', lookback: int = 1, rollout: int = 1, download: bool = False, *args, **kwargs)[source]
Bases:
object
Iterator for the Dynabench dataset. It iterates over each simulation in the dataset by moving a window over the simulation data. The window size is set by the lookback and rollout parameters, which give the number of timesteps used as input and as output, respectively.
- Parameters:
split (str) – The split of the dataset to use. Can be “train”, “val” or “test”.
equation (str) – The equation to use. Can be “advection”, “burgers”, “gasdynamics”, “kuramotosivashinsky”, “reactiondiffusion” or “wave”.
structure (str) – The structure of the dataset. Can be “cloud” or “grid”.
resolution (str) – The resolution of the dataset. Can be “low”, “medium”, “high” or “full”. Low resolution corresponds to 225 points in total (arranged in a 15x15 grid for the grid structure). Medium resolution corresponds to 484 points in total (arranged in a 22x22 grid for the grid structure). High resolution corresponds to 900 points in total (arranged in a 30x30 grid for the grid structure). Full resolution uses the full 64x64 simulation grid on which the simulations were solved numerically.
base_path (str) – Location where the data is stored. Defaults to “data”.
lookback (int) – Number of timesteps to use for the input data. Defaults to 1.
rollout (int) – Number of timesteps to use for the target data. Defaults to 1.
download (bool) – Whether to download the data. Defaults to False.
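The windowing controlled by lookback and rollout can be illustrated without the dataset. The following is a minimal sketch, not the library's implementation: it assumes a stride of one timestep and a target window that starts immediately after the input window. The helper name window_ranges is hypothetical.

```python
def window_ranges(num_steps, lookback, rollout):
    """Return (input_range, target_range) index pairs for a stride-1 moving window.

    Hypothetical helper for illustration only; it is not part of dynabench.
    """
    if lookback < 1 or rollout < 1:
        raise ValueError("lookback and rollout must be at least 1")
    windows = []
    # The last valid start index leaves room for lookback + rollout timesteps.
    for start in range(num_steps - lookback - rollout + 1):
        split = start + lookback
        windows.append((range(start, split), range(split, split + rollout)))
    return windows


windows = window_ranges(num_steps=10, lookback=3, rollout=2)
print(len(windows))                              # 6
print(list(windows[0][0]), list(windows[0][1]))  # [0, 1, 2] [3, 4]
```

Under these assumptions a simulation with T timesteps yields T - lookback - rollout + 1 windows, so larger lookback or rollout values reduce the number of samples per simulation.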
- class dynabench.dataset.DynabenchSimulationIterator(split: str = 'train', equation: str = 'wave', structure: str = 'cloud', resolution: str = 'low', base_path: str = 'data', download: bool = False, *args, **kwargs)[source]
Bases:
object
- class dynabench.dataset.EquationMovingWindowIterator(data_path: str, lookback: int, rollout: int)[source]
Bases:
object
Iterator for arbitrary equations generated using the dynabench solver. Each sample returned by the __getitem__ method is a tuple of (data_input, data_target, points), where data_input is the input data of shape (L, F, H, W), data_target is the target data of shape (R, F, H, W), and points are the points in the grid of shape (H, W, 2). In this context L corresponds to the lookback parameter and R corresponds to the rollout parameter. H and W are the height and width of the grid, respectively. F is the number of variables in the equation system.
- Parameters:
data_path (str) – Path to the data file in h5 format.
lookback (int) – Number of time steps to look back. This corresponds to the L parameter.
rollout (int) – Number of time steps to predict. This corresponds to the R parameter.
- get_full_simulation_data()[source]
This method returns the full simulation data from the data file, along with the points in the grid.
- Returns:
The data and the points. The data has shape (T, F, H, W) and the points have shape (H, W, 2), where T is the number of time steps, F is the number of variables, H and W are the height and width of the grid, respectively.
- Return type:
np.ndarray, np.ndarray
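To make the shapes above concrete, here is a small sketch that slices one (data_input, data_target) pair out of an array shaped like the output of get_full_simulation_data. It uses synthetic NumPy arrays rather than a real h5 file, and the helper split_window and the unit-square coordinates are assumptions made for illustration, not the library's code.

```python
import numpy as np


def split_window(data, start, lookback, rollout):
    """Slice one (input, target) pair out of data with shape (T, F, H, W).

    Hypothetical helper for illustration only; it is not part of dynabench.
    """
    stop = start + lookback + rollout
    if start < 0 or stop > data.shape[0]:
        raise IndexError("window does not fit inside the simulation")
    data_input = data[start:start + lookback]   # shape (L, F, H, W)
    data_target = data[start + lookback:stop]   # shape (R, F, H, W)
    return data_input, data_target


# Synthetic stand-in for a simulation: T=20 timesteps, F=1 variable, 15x15 grid.
data = np.zeros((20, 1, 15, 15))

# Grid coordinates with shape (H, W, 2); the unit square is an assumption.
ys, xs = np.meshgrid(np.linspace(0, 1, 15), np.linspace(0, 1, 15), indexing="ij")
points = np.stack([xs, ys], axis=-1)

x, y = split_window(data, start=0, lookback=4, rollout=2)
print(x.shape, y.shape, points.shape)  # (4, 1, 15, 15) (2, 1, 15, 15) (15, 15, 2)
```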
- dynabench.dataset.download_equation(equation: str, structure: str, resolution: str, data_dir: str = 'data', tmp_dir: str = 'tmp')[source]
Download a dataset and unpack it to the right place.
- Parameters:
equation (str) – Name of the equation to download.
structure (str) – Description of how the observation points are structured. Can be “cloud” or “grid”.
resolution (str) – Resolution of the dataset. Can be “low”, “medium”, or “high”.
data_dir (str) – Directory where the dataset should be saved. Defaults to “data”.
tmp_dir (str) – Directory where the temporary files should be saved. Defaults to “tmp”. This directory will be deleted after the dataset is unpacked.
- Return type:
None