polars_function
- cast_boolean(col: Expr) → Expr
Cast a column to boolean based on predefined replacements.
- Parameters:
col (pl.Expr) – The column to cast.
- Returns:
The casted boolean column.
- Return type:
pl.Expr
- cast_float(float_str: Expr) → Expr
Cast a string column to float, modifying the string format as needed.
- Parameters:
float_str (pl.Expr) – The string column to cast.
- Returns:
The casted float column.
- Return type:
pl.Expr
- cast_to_utc_timestamp(timestamp: Expr, initial_time_zone: str = 'Europe/Zurich') → Expr
Convert a timestamp column to UTC from the specified initial time zone.
- Parameters:
timestamp (pl.Expr) – The timestamp column to convert.
initial_time_zone (str, optional) – The initial time zone of the timestamps. Defaults to “Europe/Zurich”.
- Returns:
The timestamp column converted to UTC.
- Return type:
pl.Expr
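The conversion described above can be illustrated on a single value with the standard library. This is only a scalar sketch, assuming the input timestamps are naive local times in `initial_time_zone`; the helper name `to_utc` is hypothetical and is not the library's Polars implementation.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_utc(naive: datetime, initial_time_zone: str = "Europe/Zurich") -> datetime:
    """Interpret a naive datetime as local time in `initial_time_zone`, then convert it to UTC."""
    # Attach the source time zone without shifting the wall-clock value.
    localized = naive.replace(tzinfo=ZoneInfo(initial_time_zone))
    # Shift the instant to UTC; daylight saving time is handled by zoneinfo.
    return localized.astimezone(timezone.utc)


winter = to_utc(datetime(2024, 1, 15, 12, 0))  # Zurich is on CET (UTC+1) in January
summer = to_utc(datetime(2024, 7, 15, 12, 0))  # Zurich is on CEST (UTC+2) in July
print(winter.isoformat())
print(summer.isoformat())
```

The same wall-clock time maps to different UTC hours in winter and summer, which is why the initial time zone has to be given by name rather than as a fixed offset.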
- concat_list_of_list(col_list: Expr) → Expr
Concatenate a column of lists into a single list containing the sublists.
- Parameters:
col_list (pl.Expr) – The column of lists to concatenate.
- Returns:
The concatenated list column.
- Return type:
pl.Expr
- cum_count_duplicates(cols_names: str | list[str]) → Expr
Calculate the cumulative count of duplicate values in the specified column(s), assigning strictly positive values to half of the counts and strictly negative values to the other half.
- Parameters:
cols_names (Union[str, list[str]]) – The name(s) of the column(s) to check for duplicates.
- Returns:
A Polars expression giving the cumulative count of duplicates.
- Return type:
pl.Expr
Example:
>>> df = pl.DataFrame({"a": [1, 1, 2, 3, 4, 4, 4]})
>>> df.with_columns(
...     cum_count_duplicates(cols_names="a").alias("cum_count")
... )
shape: (7, 2)
┌─────┬───────────┐
│ a   ┆ cum_count │
│ --- ┆ ---       │
│ i64 ┆ i64       │
╞═════╪═══════════╡
│ 1   ┆ -1        │
│ 1   ┆ 1         │
│ 2   ┆ 1         │
│ 3   ┆ 1         │
│ 4   ┆ -1        │
│ 4   ┆ 1         │
│ 4   ┆ 2         │
└─────┴───────────┘
- digitize_col(col: Expr, min: float, max: float, nb_state: int) → Expr
Digitize a column into discrete states based on the specified number of states.
- Parameters:
col (pl.Expr) – The column to digitize.
min (float) – The minimum value of the column.
max (float) – The maximum value of the column.
nb_state (int) – The number of discrete states.
- Returns:
The digitized column.
- Return type:
pl.Expr
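The reference does not say how the states are laid out. The scalar sketch below assumes equal-width bins over `[min, max]`, state indices starting at 0, and out-of-range values clipped to the first or last state. All three choices, and the helper name `digitize`, are assumptions made for illustration.

```python
def digitize(value: float, min_val: float, max_val: float, nb_state: int) -> int:
    """Map `value` to a state index in [0, nb_state - 1] using equal-width bins."""
    if nb_state < 1 or max_val <= min_val:
        raise ValueError("need nb_state >= 1 and max_val > min_val")
    width = (max_val - min_val) / nb_state
    # Floor division gives the index of the bin that contains the value.
    index = int((value - min_val) // width)
    # Clip so that max_val itself and out-of-range values fall in a valid state.
    return min(max(index, 0), nb_state - 1)


states = [digitize(v, 0.0, 10.0, 5) for v in (-3.0, 0.0, 5.0, 9.99, 10.0)]
print(states)
```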
- generate_random_uuid(col: Expr) → Expr
Generate a random UUID.
- Returns:
The generated UUID.
- Return type:
pl.Expr
- generate_uuid_col(col: Expr, base_uuid: UUID | None = None, added_string: str = '') → Expr
Generate UUIDs for a column based on a base UUID and an optional added string.
- Parameters:
col (pl.Expr) – The column to generate UUIDs for.
base_uuid (uuid.UUID, optional) – The base UUID for generating the UUIDs.
added_string (str, optional) – The optional added string. Defaults to “”.
- Returns:
The column with generated UUIDs.
- Return type:
pl.Expr
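One common way to derive a UUID from a value plus a base UUID is a name-based (version 5) UUID. The sketch below assumes that approach. The helper name `make_uuid`, the concatenation of the value with `added_string`, and the fallback namespace used when `base_uuid` is `None` are all assumptions, not confirmed library behavior.

```python
import uuid
from typing import Optional


def make_uuid(value: object, base_uuid: Optional[uuid.UUID] = None, added_string: str = "") -> str:
    """Derive a deterministic UUID from a value, a base UUID, and an optional suffix."""
    # Assumption: fall back to a fixed namespace when no base UUID is supplied.
    namespace = base_uuid if base_uuid is not None else uuid.NAMESPACE_DNS
    # uuid5 hashes the namespace and name, so equal inputs always give the same UUID.
    return str(uuid.uuid5(namespace, f"{value}{added_string}"))


first = make_uuid("line_42")
again = make_uuid("line_42")
suffixed = make_uuid("line_42", added_string="_node")
print(first, suffixed)
```

Because the result is deterministic, re-running an import with the same base UUID reproduces the same identifiers, while a different `added_string` yields a separate family of UUIDs from the same column.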
- get_meta_data_string(metadata: Expr) → Expr
Convert metadata to a JSON string, excluding keys with None values.
- Parameters:
metadata (pl.Expr) – The metadata column.
- Returns:
The metadata column as JSON strings.
- Return type:
pl.Expr
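For a single record, the filtering and serialization described above can be sketched as follows. The helper name `metadata_to_json` and the sample record are hypothetical, and the sketch assumes only top-level keys are filtered.

```python
import json


def metadata_to_json(metadata: dict) -> str:
    """Serialize a metadata mapping to JSON, dropping keys whose value is None."""
    cleaned = {key: value for key, value in metadata.items() if value is not None}
    return json.dumps(cleaned)


record = {"owner": "grid-ops", "comment": None, "phases": 3}
print(metadata_to_json(record))  # {"owner": "grid-ops", "phases": 3}
```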
- get_transfo_admittance(rated_v: Expr, rated_s: Expr, oc_current_ratio: Expr) → Expr
Get the transformer admittance based on the open-circuit test.
- Parameters:
rated_v (pl.Expr) – The rated voltage column, which indicates the side of the transformer the parameters are associated with.
rated_s (pl.Expr) – The rated power column [VA].
oc_current_ratio (pl.Expr) – The ratio between the current measured when the transformer secondary is open and the rated current [%].
- Returns:
The transformer admittance column [Siemens].
- Return type:
pl.Expr
- get_transfo_conductance(rated_v: Expr, iron_losses: Expr) → Expr
Get the transformer conductance based on the iron losses measurement.
- Parameters:
rated_v (pl.Expr) – The rated voltage column, which indicates the side of the transformer the parameters are associated with.
iron_losses (pl.Expr) – The iron losses column [W].
- Returns:
The transformer conductance column [Siemens].
- Return type:
pl.Expr
- get_transfo_imaginary_component(module: Expr, real: Expr) → Expr
Get the transformer imaginary component based on the modulus and the real component.
- Parameters:
module (pl.Expr) – The modulus (magnitude) column [Ohm or Siemens].
real (pl.Expr) – The real component column [Ohm or Siemens].
- Returns:
The transformer imaginary component column [Ohm or Siemens].
- Return type:
pl.Expr
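The shunt-branch quantities above (admittance, conductance, and the imaginary component) can be worked through with the usual textbook open-circuit relations. The formulas below are an assumption about what these functions compute, the scalar helper names are hypothetical, and the input values are made up for illustration.

```python
import math


def admittance(rated_v: float, rated_s: float, oc_current_ratio: float) -> float:
    """|Y| = (i0 / 100) * S / V^2, with i0 the no-load current in percent."""
    return oc_current_ratio / 100 * rated_s / rated_v**2


def conductance(rated_v: float, iron_losses: float) -> float:
    """G = P_fe / V^2, with P_fe the iron losses in watts."""
    return iron_losses / rated_v**2


def imaginary_component(module: float, real: float) -> float:
    """Imaginary part of a complex quantity from its modulus and real part."""
    return math.sqrt(module**2 - real**2)


# Illustrative values: 400 V side, 100 kVA, 2 % no-load current, 200 W iron losses.
y = admittance(400.0, 100_000.0, 2.0)   # about 0.0125 S
g = conductance(400.0, 200.0)           # about 0.00125 S
b = imaginary_component(y, g)           # susceptance magnitude, slightly below y
print(y, g, b)
```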
- get_transfo_impedance(rated_v: Expr, rated_s: Expr, voltage_ratio: Expr) → Expr
Get the transformer impedance (or resistance if real part) based on the short-circuit tests.
- Parameters:
rated_v (pl.Expr) – The rated voltage column, which indicates the side of the transformer the parameters are associated with.
rated_s (pl.Expr) – The rated power column [VA].
voltage_ratio (pl.Expr) – The ratio between the input voltage applied to obtain the rated current when the transformer secondary is short-circuited and the rated voltage [%].
- Returns:
The transformer impedance column [Ohm].
- Return type:
pl.Expr
- get_transfo_resistance(rated_v: Expr, rated_s: Expr, copper_losses: Expr) → Expr
Get the transformer resistance based on the copper losses measurement.
- Parameters:
rated_v (pl.Expr) – The rated voltage column, which indicates the side of the transformer the parameters are associated with.
rated_s (pl.Expr) – The rated power column [VA].
copper_losses (pl.Expr) – The copper losses column [W].
- Returns:
The transformer resistance column [Ohm].
- Return type:
pl.Expr
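The series-branch quantities from the short-circuit test follow the same pattern. The sketch below uses the standard textbook relations, which are assumed rather than confirmed to match the library. The scalar helper names are hypothetical and the numbers are illustrative only.

```python
import math


def impedance(rated_v: float, rated_s: float, voltage_ratio: float) -> float:
    """|Z| = (u_k / 100) * V^2 / S, with u_k the short-circuit voltage in percent."""
    return voltage_ratio / 100 * rated_v**2 / rated_s


def resistance(rated_v: float, rated_s: float, copper_losses: float) -> float:
    """R = P_cu * V^2 / S^2, with P_cu the copper losses in watts at rated current."""
    return copper_losses * rated_v**2 / rated_s**2


# Illustrative values: 400 V side, 100 kVA, 4 % short-circuit voltage, 1 kW copper losses.
z = impedance(400.0, 100_000.0, 4.0)        # about 0.064 Ohm
r = resistance(400.0, 100_000.0, 1_000.0)   # about 0.016 Ohm
x = math.sqrt(z**2 - r**2)                  # reactance from modulus and real part
print(z, r, x)
```

Since the rated current is S / V, the resistance formula is simply the copper losses divided by the square of the rated current.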
- linear_interpolation_for_bound(x_col: Expr, y_col: Expr) → Expr
Perform linear interpolation for boundary values in a column.
- Parameters:
x_col (pl.Expr) – The x-axis column.
y_col (pl.Expr) – The y-axis column to interpolate.
- Returns:
The interpolated y-axis column.
- Return type:
pl.Expr
- linear_interpolation_using_cols(df: DataFrame, x_col: str, y_col: list[str] | str) → DataFrame
Perform linear interpolation on specified columns of a DataFrame.
- Parameters:
df (pl.DataFrame) – The DataFrame containing the data.
x_col (str) – The name of the x-axis column.
y_col (Union[list[str], str]) – The name(s) of the y-axis column(s) to interpolate.
- Returns:
The DataFrame with interpolated y-axis columns.
- Return type:
pl.DataFrame
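As a rough illustration of interpolating a y column against an x column, the plain-Python sketch below fills missing y values from the nearest known neighbors on each side. It assumes missing values are represented as `None` and that values without a neighbor on both sides are left untouched. The helper name `interpolate_missing` is hypothetical and the library's actual handling may differ.

```python
from typing import List, Optional


def interpolate_missing(x: List[float], y: List[Optional[float]]) -> List[Optional[float]]:
    """Fill None entries of y by linear interpolation over x between known neighbors."""
    known = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    result = list(y)
    for i, (xi, yi) in enumerate(zip(x, y)):
        if yi is not None:
            continue
        left = [point for point in known if point[0] < xi]
        right = [point for point in known if point[0] > xi]
        if left and right:
            x0, y0 = max(left, key=lambda point: point[0])   # closest known point below
            x1, y1 = min(right, key=lambda point: point[0])  # closest known point above
            result[i] = y0 + (y1 - y0) * (xi - x0) / (x1 - x0)
    return result


filled = interpolate_missing([0.0, 1.0, 3.0, 4.0], [0.0, None, 6.0, None])
print(filled)  # [0.0, 2.0, 6.0, None]
```

The last value stays `None` because it has no known point to its right, so it would need extrapolation rather than interpolation.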
- modify_string_col(string_col: Expr, format_str: dict) → Expr
Modify string columns based on a given format dictionary.
- Parameters:
string_col (pl.Expr) – The string column to modify.
format_str (dict) – The format dictionary containing the string modifications.
- Returns:
The modified string column.
- Return type:
pl.Expr
- parse_date(date_str: str | None, default_date: datetime) → datetime
Parse a date string and return a datetime object.
- Parameters:
date_str (str, optional) – The date string to parse.
default_date (datetime) – The default date to return if the date string is None.
- Returns:
The parsed datetime object.
- Return type:
datetime
- Raises:
ValueError – If the date format is not recognized.
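The documented contract (return the default for `None`, raise `ValueError` for an unrecognized format) can be sketched as below. The list of accepted formats is not given in this reference, so `CANDIDATE_FORMATS` and the name `parse_date_sketch` are purely hypothetical.

```python
from datetime import datetime
from typing import Optional

# Hypothetical formats; the formats the library actually accepts are not documented here.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%Y-%m-%d %H:%M:%S")


def parse_date_sketch(date_str: Optional[str], default_date: datetime) -> datetime:
    """Return default_date for None, otherwise try each candidate format in turn."""
    if date_str is None:
        return default_date
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(date_str, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"Date format not recognized: {date_str!r}")


fallback = datetime(2000, 1, 1)
print(parse_date_sketch(None, fallback))
print(parse_date_sketch("2024-03-01", fallback))
print(parse_date_sketch("01.03.2024", fallback))
```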
- parse_timestamp(timestamp_str: Expr, item: str | None, keep_string_format: bool = False, convert_to_utc: bool = False, initial_time_zone: str = 'Europe/Zurich') → Expr
Parse a timestamp column based on a given item.
- Parameters:
timestamp_str (pl.Expr) – The timestamp column.
item (str, optional) – The item to parse.
keep_string_format (bool, optional) – Whether to keep the string format. Defaults to False.
convert_to_utc (bool, optional) – Whether to convert the timestamp to UTC. Defaults to False.
initial_time_zone (str, optional) – The initial time zone of the timestamps. Defaults to “Europe/Zurich”.
- Returns:
The parsed timestamp column.
- Return type:
pl.Expr
- Raises:
ValueError – If the timestamp format is not recognized.