general_function
Auxiliary functions
- build_non_existing_dirs(file_path: str)
Build non-existing directories for a given file path.
- Parameters:
file_path (str) – The file path.
- Returns:
True if directories were created successfully.
- Return type:
bool
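As a rough illustration of the behavior described above, the following is a minimal sketch built on the standard library. The name `ensure_parent_dirs` is hypothetical, and the sketch assumes that only the parent directories of `file_path` are created, never the file itself.

```python
import os
import tempfile


def ensure_parent_dirs(file_path: str) -> bool:
    """Hypothetical stand-in: create the missing parent directories of file_path."""
    parent = os.path.dirname(file_path)
    if parent:
        # exist_ok=True makes the call a no-op when the directories already exist.
        os.makedirs(parent, exist_ok=True)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "a", "b", "out.csv")
    created = ensure_parent_dirs(target)
    parent_exists = os.path.isdir(os.path.dirname(target))
    file_exists = os.path.exists(target)
```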
- camel_to_snake(camel_str: str) -> str
Convert a camelCase string to snake_case.
- Parameters:
camel_str (str) – The camelCase string.
- Returns:
The snake_case string.
- Return type:
str
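One common way to perform this conversion is a regular expression that inserts an underscore before each uppercase letter. The sketch below uses the hypothetical name `camel_to_snake_sketch`; how the library itself treats acronyms or digits is not stated here, so this is only an approximation.

```python
import re


def camel_to_snake_sketch(camel_str: str) -> str:
    """Hypothetical stand-in for a camelCase to snake_case conversion."""
    # Insert "_" before every uppercase letter that is not at the start,
    # then lowercase the whole string.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", camel_str).lower()


result = camel_to_snake_sketch("switchFolderPath")
print(result)  # switch_folder_path
```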
- convert_list_to_string(list_data: list) -> str
Convert a list to a comma-separated string.
- Parameters:
list_data (list) – The list to convert.
- Returns:
The comma-separated string.
- Return type:
str
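The conversion can be pictured as a simple join. In this sketch the name `list_to_string_sketch` is hypothetical, and the use of a bare comma (without a trailing space) as the separator is an assumption.

```python
def list_to_string_sketch(list_data: list) -> str:
    """Hypothetical stand-in: join list items into a comma-separated string."""
    # str() is applied so that non-string items (numbers, etc.) can be joined.
    return ",".join(str(item) for item in list_data)


joined = list_to_string_sketch(["a", 1, 2.5])
print(joined)  # a,1,2.5
```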
- dict_to_duckdb(data: dict[str, DataFrame], file_path: str)
Save a dictionary of Polars DataFrames as a DuckDB file.
- Parameters:
data (dict[str, pl.DataFrame]) – The dictionary of Polars DataFrames.
file_path (str) – The DuckDB file path.
- dict_to_gpkg(data: dict, file_path: str, srid: int = 2056)
Save a dictionary of Polars DataFrames as a GeoPackage file.
- Parameters:
data (dict) – The dictionary of Polars DataFrames.
file_path (str) – The GeoPackage file path.
srid (int, optional) – The spatial reference identifier (SRID). Defaults to 2056 (SWISS_SRID).
- dictionary_key_filtering(dictionary: dict, key_list: list) -> dict
Filter a dictionary by a list of keys.
- Parameters:
dictionary (dict) – The dictionary to filter.
key_list (list) – The list of keys to keep.
- Returns:
The filtered dictionary.
- Return type:
dict
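Key filtering of this kind is usually a one-line dictionary comprehension. The sketch below uses the hypothetical name `filter_keys_sketch` and assumes that keys listed in `key_list` but absent from the dictionary are silently ignored.

```python
def filter_keys_sketch(dictionary: dict, key_list: list) -> dict:
    """Hypothetical stand-in: keep only the entries whose key is in key_list."""
    wanted = set(key_list)  # set lookup keeps the filter fast for long key lists
    return {key: value for key, value in dictionary.items() if key in wanted}


filtered = filter_keys_sketch({"a": 1, "b": 2, "c": 3}, ["a", "c", "missing"])
print(filtered)  # {'a': 1, 'c': 3}
```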
- download_from_switch(switch_folder_path: str, switch_link: str, switch_pass: str, local_folder_path: str = '.cache', download_anyway: bool = False)
Download files from a SWITCH directory to a local folder.
- Parameters:
switch_folder_path (str) – The SWITCH folder path.
switch_link (str) – The public link to the SWITCH folder.
switch_pass (str) – The password for the SWITCH folder.
local_folder_path (str, optional) – The local folder path. Defaults to “.cache”.
download_anyway (bool, optional) – Whether to download files even if they already exist locally. Defaults to False.
- duckdb_to_dict(file_path: str) -> dict
Load a DuckDB file into a dictionary of Polars DataFrames.
- Parameters:
file_path (str) – The DuckDB file path.
- Returns:
The dictionary of Polars DataFrames.
- Return type:
dict
- extract_archive(file_name: str, extracted_folder: str | None = None, force_extraction: bool = False) -> None
Extract an archive file to a specified folder.
- Parameters:
file_name (str) – The name of the archive file.
extracted_folder (Optional[str], optional) – The folder to extract the files to. Defaults to None.
force_extraction (bool, optional) – Whether to force extraction even if the folder already exists. Defaults to False.
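The interplay between `extracted_folder` and `force_extraction` might look like the sketch below, which relies on `shutil.unpack_archive` from the standard library. The name `extract_archive_sketch` is hypothetical, and the default target (the archive path without its extension) is an assumption, since the default behavior for `None` is not documented here.

```python
import os
import shutil
import tempfile
import zipfile


def extract_archive_sketch(file_name, extracted_folder=None, force_extraction=False) -> None:
    """Hypothetical stand-in: extract an archive unless the target folder exists."""
    if extracted_folder is None:
        # Assumption: default to the archive path without its extension.
        extracted_folder = os.path.splitext(file_name)[0]
    if os.path.isdir(extracted_folder) and not force_extraction:
        return  # Folder already present and extraction not forced: do nothing.
    shutil.unpack_archive(file_name, extracted_folder)


with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "data.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("hello.txt", "hi")
    extract_archive_sketch(archive)
    extracted = os.path.isfile(os.path.join(tmp, "data", "hello.txt"))
```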
- generate_log(name: str, log_level: str = 'info') -> Logger
Generate a logger with the specified name and log level.
- Parameters:
name (str) – The name of the logger.
log_level (str, optional) – The log level. Defaults to “info”.
- Returns:
The generated logger.
- Return type:
logging.Logger
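A logger factory with a string log level can be sketched with the standard `logging` module. The name `generate_log_sketch`, the stream handler, and the message format are all assumptions made for illustration; the library may configure its loggers differently.

```python
import logging


def generate_log_sketch(name: str, log_level: str = "info") -> logging.Logger:
    """Hypothetical stand-in: build a named logger at the requested level."""
    logger = logging.getLogger(name)
    # Map "info", "debug", ... onto logging.INFO, logging.DEBUG, ...
    logger.setLevel(getattr(logging, log_level.upper()))
    if not logger.handlers:
        # Attach a handler only once so repeated calls do not duplicate output.
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


log = generate_log_sketch("general_function_demo", log_level="debug")
log.debug("logger ready")
```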
- generate_uuid(base_value: str, base_uuid: UUID | None = None, added_string: str = '') -> str
Generate a UUID based on a base value, base UUID, and an optional added string.
- Parameters:
base_value (str) – The base value for generating the UUID.
base_uuid (uuid.UUID, optional) – The base UUID for generating the UUID.
added_string (str, optional) – The optional added string. Defaults to “”.
- Returns:
The generated UUID.
- Return type:
str
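The signature (a base value plus an optional base UUID) suggests a name-based, deterministic UUID, although this page does not confirm it. The sketch below illustrates that idea with `uuid.uuid5`; the name `generate_uuid_sketch`, the fallback namespace, and the way `added_string` is combined with `base_value` are assumptions.

```python
import uuid


def generate_uuid_sketch(base_value: str, base_uuid=None, added_string: str = "") -> str:
    """Hypothetical stand-in: derive a deterministic UUID from a base value."""
    # Assumption: fall back to a fixed namespace when no base UUID is given.
    namespace = base_uuid if base_uuid is not None else uuid.NAMESPACE_DNS
    # Assumption: the added string is simply prepended to the base value.
    return str(uuid.uuid5(namespace, added_string + base_value))


first = generate_uuid_sketch("node_1", added_string="grid_")
second = generate_uuid_sketch("node_1", added_string="grid_")
other = generate_uuid_sketch("node_2", added_string="grid_")
```

With a name-based scheme, the same inputs always yield the same identifier, which is convenient when identifiers must be stable across runs.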
- initialize_output_file(file_path: str)
Initialize an output file by creating necessary directories and removing the file if it already exists.
- Parameters:
file_path (str) – The path of the file to initialize.
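The two steps described above (create the directories, then remove a stale file) can be sketched as follows. The name `initialize_output_file_sketch` is hypothetical, and this is only one plausible reading of the description.

```python
import os
import tempfile


def initialize_output_file_sketch(file_path: str) -> None:
    """Hypothetical stand-in: prepare a clean location for an output file."""
    parent = os.path.dirname(file_path)
    if parent:
        os.makedirs(parent, exist_ok=True)  # create missing directories
    if os.path.isfile(file_path):
        os.remove(file_path)  # drop a leftover file from a previous run


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "out", "result.duckdb")
    initialize_output_file_sketch(target)
    dir_ready = os.path.isdir(os.path.dirname(target))
    with open(target, "w") as fh:
        fh.write("stale")
    initialize_output_file_sketch(target)
    stale_removed = not os.path.exists(target)
```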
- modify_string(string: str, format_str: dict) -> str
Modify a string by replacing substrings according to a format dictionary. The substrings to replace may be regular expressions. Replacements are applied in the order of the dictionary keys.
- Parameters:
string (str) – Input string.
format_str (dict) – Dictionary containing the substrings to be replaced and their replacements.
- Returns:
Modified string.
- Return type:
str
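Because replacements are applied in dictionary order, the same rules can give different results when reordered. The sketch below illustrates this with `re.sub`; the name `modify_string_sketch` is hypothetical and the loop is an assumed implementation, not necessarily the library's.

```python
import re


def modify_string_sketch(string: str, format_str: dict) -> str:
    """Hypothetical stand-in: apply regex replacements in dictionary order."""
    for pattern, replacement in format_str.items():
        string = re.sub(pattern, replacement, string)
    return string


# Regex patterns: collapse whitespace, then mask digits.
cleaned = modify_string_sketch("Line  A 12", {r"\s+": "_", r"\d+": "N"})
print(cleaned)  # Line_A_N

# Order matters: each rule sees the output of the previous one.
order_a = modify_string_sketch("ab", {"a": "b", "b": "c"})  # "ab" -> "bb" -> "cc"
order_b = modify_string_sketch("ab", {"b": "c", "a": "b"})  # "ab" -> "ac" -> "bc"
```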
- pl_to_dict(df: DataFrame) -> dict
Convert a Polars DataFrame with two columns into a dictionary. The first column is assumed to contain the keys and the second column the values. The keys must be unique; null values are filtered out.
- Parameters:
df (pl.DataFrame) – Polars DataFrame with two columns.
- Returns:
Dictionary representation of the DataFrame.
- Return type:
dict
- Raises:
ValueError – If the DataFrame does not have exactly two columns or if the keys are not unique.
- pl_to_dict_with_tuple(df: DataFrame) -> dict
Convert a Polars DataFrame with two columns into a dictionary where the first column contains tuples as keys and the second column contains the values.
- Parameters:
df (pl.DataFrame) – Polars DataFrame with two columns.
- Returns:
Dictionary representation of the DataFrame with tuples as keys.
- Return type:
dict
- Raises:
ValueError – If the DataFrame does not have exactly two columns.
Example:
>>> import polars as pl
>>> data = {'key': [[1, 2], [3, 4], [5, 6]], 'value': [10, 20, 30]}
>>> df = pl.DataFrame(data)
>>> pl_to_dict_with_tuple(df)
{(1, 2): 10, (3, 4): 20, (5, 6): 30}
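The reason for converting the key column to tuples is that Python lists are not hashable and therefore cannot serve as dictionary keys. The following plain-Python sketch shows the idea without Polars; the variable names are illustrative only.

```python
keys = [[1, 2], [3, 4], [5, 6]]
values = [10, 20, 30]

# A list cannot be used as a dictionary key because it is unhashable.
try:
    {keys[0]: values[0]}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

# Converting each list to a tuple yields a valid, hashable key.
tuple_dict = {tuple(key): value for key, value in zip(keys, values)}
print(tuple_dict)  # {(1, 2): 10, (3, 4): 20, (5, 6): 30}
```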
- scan_folder(folder_name: str, extension: str | list[str] | None = None, file_names: str | None = None) -> list[str]
Scan a folder and return a list of file paths with specified extensions or names.
- Parameters:
folder_name (str) – The folder to scan.
extension (Optional[Union[str, list[str]]], optional) – The file extensions to filter by. Defaults to None.
file_names (Optional[str], optional) – The file names to filter by. Defaults to None.
- Returns:
List of file paths.
- Return type:
list[str]
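A folder scan with optional filters could be sketched with `os.walk`. The name `scan_folder_sketch` is hypothetical, and several details are assumptions: the scan is recursive, `extension` is matched as a file-name suffix, and `file_names` is matched exactly.

```python
import os
import tempfile


def scan_folder_sketch(folder_name, extension=None, file_names=None) -> list[str]:
    """Hypothetical stand-in: list files under a folder, optionally filtered."""
    if isinstance(extension, str):
        extension = [extension]  # accept a single extension or a list
    matches = []
    for root, _dirs, files in os.walk(folder_name):
        for name in files:
            if extension is not None and not name.endswith(tuple(extension)):
                continue
            if file_names is not None and name != file_names:
                continue
            matches.append(os.path.join(root, name))
    return sorted(matches)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.csv", "b.txt", os.path.join("sub", "c.csv")):
        with open(os.path.join(tmp, rel), "w") as fh:
            fh.write("x")
    csv_files = [os.path.relpath(p, tmp) for p in scan_folder_sketch(tmp, extension=".csv")]
    all_files = scan_folder_sketch(tmp)
```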
- scan_switch_directory(oc: Client, local_folder_path: str, switch_folder_path: str, download_anyway: bool) -> list[str]
Scan a directory on the SWITCH server and return a list of file paths.
- Parameters:
oc (owncloud.Client) – The ownCloud client.
local_folder_path (str) – The local folder path.
switch_folder_path (str) – The SWITCH folder path.
download_anyway (bool) – Whether to download files even if they already exist locally.
- Returns:
List of file paths.
- Return type:
list[str]
- snake_to_camel(snake_str: str) -> str
Convert a snake_case string to CamelCase.
- Parameters:
snake_str (str) – The snake_case string.
- Returns:
The CamelCase string.
- Return type:
str
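The reverse conversion can be sketched by capitalizing each underscore-separated part. The name `snake_to_camel_sketch` is hypothetical, and the sketch assumes that "CamelCase" means the first letter is also uppercased; the library may instead keep the first letter lowercase.

```python
def snake_to_camel_sketch(snake_str: str) -> str:
    """Hypothetical stand-in for a snake_case to CamelCase conversion."""
    # Capitalize every part and join them without a separator.
    return "".join(part.capitalize() for part in snake_str.split("_"))


result = snake_to_camel_sketch("switch_folder_path")
print(result)  # SwitchFolderPath
```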
- table_to_gpkg(table: DataFrame, gpkg_file_name: str, layer_name: str, srid: int = 2056)
Save a Polars DataFrame as a GeoPackage file. As GeoPackage does not support list columns, the list columns are joined into a single string separated with a comma.
- Parameters:
table (pl.DataFrame) – The Polars DataFrame.
gpkg_file_name (str) – The GeoPackage file name.
layer_name (str) – The layer name.
srid (int, optional) – The spatial reference identifier (SRID). Defaults to 2056 (SWISS_SRID).