dsipts.data_management package

Submodules

dsipts.data_management.monash module

class dsipts.data_management.monash.Monash(filename: str, baseUrl: str = 'https://forecastingdata.org/', rebuild: bool = False)

Bases: object

Class for downloading the datasets listed at https://forecastingdata.org/

Parameters:

filename (str) – name of the class, used for saving
baseUrl (str, optional) – URL of the source page. Defaults to ‘https://forecastingdata.org/’.
rebuild (bool, optional) – if True, the table is loaded from the webpage; otherwise it is loaded from the saved file. Defaults to False.

download_dataset(path: str, id: int, rebuild=False) → None

Download a specific dataset.

Parameters:

path (str) – path in which to save the data
id (int) – id of the dataset
rebuild (bool, optional) – if true the dataset will be re-downloaded. Defaults to False.
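The effect of a rebuild flag like the one above can be illustrated with a small caching helper. This is only a sketch of the general pattern, not the code used by download_dataset: the helper name fetch_if_needed and the stand-in fetcher are hypothetical, and no real HTTP request is made.

```python
import os
import tempfile
from typing import Callable


def fetch_if_needed(target: str, fetch: Callable[[], bytes], rebuild: bool = False) -> bool:
    """Write fetch() to target unless a cached copy exists and rebuild is False.

    Returns True when the fetch actually ran. Hypothetical helper for illustration.
    """
    if os.path.exists(target) and not rebuild:
        return False  # a cached file is present and no rebuild was requested
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    with open(target, "wb") as handle:
        handle.write(fetch())
    return True


# Demonstration with a stand-in fetcher instead of a real download.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "dataset.zip")
    first = fetch_if_needed(target, lambda: b"payload")                 # nothing cached yet
    second = fetch_if_needed(target, lambda: b"payload")                # cached copy reused
    forced = fetch_if_needed(target, lambda: b"payload", rebuild=True)  # forced refresh
```

Under these assumptions, only the first and the forced call invoke the fetcher; the second call reuses the file already on disk.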

generate_dataset(id: int) → None | DataFrame

Parse the id-th dataset into a convenient format and return it as a pandas DataFrame.

Parameters: id (int) – id of the dataset
Returns: dataframe
Return type: None or pd.DataFrame

load(filename: str) → None

Load a Monash structure.

Parameters: filename (str) – filename to load

save(filename: str) → None

Save the Monash structure.

Parameters: filename (str) – name of the file to generate
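The methods above might be combined as in the following sketch. It uses only the signatures documented on this page. The wrapper name monash_workflow and its arguments are hypothetical, and whether an explicit save call is needed after construction is an assumption.

```python
def monash_workflow(table_name: str, data_dir: str, dataset_id: int, rebuild: bool = False):
    """Download one dataset and return it as a DataFrame (or None).

    Hypothetical wrapper for illustration; not part of dsipts.
    """
    # Imported lazily so the sketch can be defined without dsipts installed.
    from dsipts.data_management.monash import Monash

    monash = Monash(filename=table_name, rebuild=rebuild)
    monash.download_dataset(path=data_dir, id=dataset_id)
    frame = monash.generate_dataset(dataset_id)  # may be None, per the signature
    monash.save(table_name)  # assumption: persist the structure for later load()
    return frame
```

Calling the wrapper requires dsipts and, presumably, network access; valid values for dataset_id depend on the table built from the source page.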

dsipts.data_management.monash.convert_tsf_to_dataframe(full_file_path_and_name: str, replace_missing_vals_with: str = 'NaN', value_column_name: str = 'series_value') → DataFrame

Convert a .tsf file to a pandas DataFrame. This function was copied from the repo (see https://forecastingdata.org/).

Parameters:

full_file_path_and_name (str) – full path and name of the file
replace_missing_vals_with (str, optional) – value used to replace missing values. Defaults to “NaN”.
value_column_name (str, optional) – name of the column holding the series values. Defaults to “series_value”.

Raises:

Exception – see https://forecastingdata.org/ for more information

Returns:

the selected time series

Return type:

pd.DataFrame
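To give an idea of what such a conversion involves, here is a minimal standard-library sketch of a .tsf parser. It is not the function documented above. The file layout it assumes (comment lines starting with #, @attribute and other @ header lines, an @data marker, then colon-separated rows whose last field is a comma-separated series with ? for missing values) is an assumption, and parse_tsf is a hypothetical name.

```python
from typing import Dict, List, Tuple


def parse_tsf(text: str, missing: float = float("nan"),
              value_column_name: str = "series_value") -> Tuple[Dict[str, str], List[dict]]:
    """Parse .tsf-style text into (metadata, rows). Hypothetical helper."""
    attributes: List[str] = []
    metadata: Dict[str, str] = {}
    rows: List[dict] = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not in_data:
            if line.lower() == "@data":
                in_data = True
            elif line.lower().startswith("@attribute"):
                attributes.append(line.split()[1])  # keep the attribute name only
            elif line.startswith("@"):
                key, _, value = line[1:].partition(" ")
                metadata[key] = value
            continue
        parts = line.split(":")
        if len(parts) != len(attributes) + 1:
            raise ValueError(f"malformed data row: {line!r}")
        row = dict(zip(attributes, parts[:-1]))
        row[value_column_name] = [
            missing if token == "?" else float(token)
            for token in parts[-1].split(",")
        ]
        rows.append(row)
    return metadata, rows


sample = """
# demo file
@relation demo
@attribute series_name string
@attribute start_timestamp date
@frequency daily
@data
T1:2020-01-01 00-00-00:1.0,2.0,?
T2:2020-01-02 00-00-00:3.5,4.5
"""
metadata, rows = parse_tsf(sample)
```

Each row is a plain dict keyed by attribute name plus the value column, so the list could be passed to pandas.DataFrame if a DataFrame is wanted.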

dsipts.data_management.monash.get_freq(freq) → str

Get the pandas frequency corresponding to the reported frequency string. The mapping probably does not cover every possible frequency.

Parameters: freq (str) – string coming from
Returns: pandas frequency format
Return type: str
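A mapping of this kind can be sketched as a small lookup table. The input strings and the table below are assumptions chosen for illustration; they are not the actual mapping used by get_freq, and to_pandas_freq is a hypothetical name.

```python
# Hypothetical table from dataset frequency strings to pandas offset aliases.
_FREQ_ALIASES = {
    "minutely": "min",
    "10_minutes": "10min",
    "half_hourly": "30min",
    "hourly": "h",  # spelled "H" in older pandas releases
    "daily": "D",
    "weekly": "W",
}


def to_pandas_freq(freq: str) -> str:
    """Return the pandas alias for a frequency string, or raise ValueError."""
    try:
        return _FREQ_ALIASES[freq.strip().lower()]
    except KeyError:
        raise ValueError(f"unsupported frequency: {freq!r}") from None
```

Raising on unknown input, rather than returning a default, makes an incomplete table visible as soon as an uncovered frequency appears.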

dsipts.data_management.public_datasets module

dsipts.data_management.public_datasets.build_venice(path: str, url='https://www.comune.venezia.it/it/content/archivio-storico-livello-marea-venezia-1') → None

dsipts.data_management.public_datasets.read_public_dataset(path: str, dataset: str) → Tuple[DataFrame, List[str]]

Returns the chosen public dataset. Please download the datasets from https://drive.google.com/drive/folders/1ZOYpTUa82_jCcxIdTmyr0LXQfvaM9vIy or ask agobbi@fbk.eu.

Parameters:

path (str) – path to data
dataset (str) – name of the dataset, one of ‘electricity’, ‘etth1’, ‘etth2’, ‘ettm1’, ‘ettm2’, ‘exchange_rate’, ‘illness’, ‘traffic’, ‘weather’

Returns:

A DataFrame in which the target variable is y and the time index is time, and the list of the covariates

Return type:

Tuple[pd.DataFrame,List[str]]
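A call to read_public_dataset might be wrapped as in the sketch below, which checks the dataset name against the values listed above before delegating. The wrapper load_public_dataset and the constant PUBLIC_DATASETS are hypothetical; only the read_public_dataset signature comes from this page.

```python
PUBLIC_DATASETS = (
    "electricity", "etth1", "etth2", "ettm1", "ettm2",
    "exchange_rate", "illness", "traffic", "weather",
)


def load_public_dataset(path: str, dataset: str):
    """Validate the dataset name, then delegate to read_public_dataset.

    Hypothetical wrapper for illustration; not part of dsipts.
    """
    if dataset not in PUBLIC_DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {PUBLIC_DATASETS}")
    # Imported lazily so the sketch can be defined without dsipts installed.
    from dsipts.data_management.public_datasets import read_public_dataset

    frame, covariates = read_public_dataset(path=path, dataset=dataset)
    return frame, covariates
```

The path argument is expected to point at the folder holding the previously downloaded files; its exact layout is not specified here.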
