
Output

Meteole returns forecasts as a pandas.DataFrame, a standard format that integrates easily into modeling pipelines.
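As a minimal sketch of downstream use, the snippet below builds a synthetic stand-in for an output frame and derives the time each row is valid for. It assumes the `run` and `forecast_horizon` column names produced by the function listed further down; the `t_2m` column and all values are made up for illustration and do not come from a real request.

```python
import pandas as pd

# Synthetic stand-in for a Meteole output frame (values are made up).
forecast = pd.DataFrame(
    {
        "latitude": [48.0, 48.0],
        "longitude": [2.0, 2.0],
        "run": pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:00"]),
        "forecast_horizon": pd.to_timedelta([1, 2], unit="h"),
        "t_2m": [275.3, 275.9],
    }
)

# The moment each row is valid for: model run time plus lead time.
forecast["valid_time"] = forecast["run"] + forecast["forecast_horizon"]

# Keep only the rows forecasting at most one hour ahead.
short_range = forecast[forecast["forecast_horizon"] <= pd.Timedelta(hours=1)]
print(short_range)
```

Because the horizon is stored as a timedelta, it can be added to the run timestamp and compared against a threshold without any string parsing.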

Processing Indicator Names in Output DataFrame

Indicator names are reprocessed by Meteole for the following reasons:

  • Handling unknown column names:

When the source column is named "unknown", Meteole generates a name from the first letter of each word in the indicator name.

  • Concatenation of height or pressure values:

To simplify the output, mainly when retrieving multiple indicators, any height (in meters) or pressure (in hPa) value present in the data is appended to the indicator's base name as a suffix such as _2m or _850hpa, and the original height or pressure column is dropped.
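The two rules above can be sketched as a standalone function. This is an illustration only: `derive_column_name` is a hypothetical helper, not part of Meteole's API, and the coverage identifiers are invented. Unlike the library code, which reads the level from the DataFrame, this sketch takes the height or pressure as an argument.

```python
import re


def derive_column_name(indicator_column, coverage_id, height=None, pressure=None):
    """Hypothetical helper mirroring the naming rules described above."""
    if indicator_column == "unknown":
        # Take the first letter of each word in the part before the "__".
        family = coverage_id.split("__")[0]
        base_name = "".join(word[0] for word in family.split("_")).lower()
    else:
        # Strip everything from the first digit onward (e.g. "t2m" -> "t").
        base_name = re.sub(r"\d.*", "", indicator_column)

    if height is not None:
        suffix = f"_{int(height)}m"
    elif pressure is not None:
        suffix = f"_{int(pressure)}hpa"
    else:
        suffix = ""
    return f"{base_name}{suffix}"


# Invented coverage identifiers, used only to exercise the rules.
print(derive_column_name("unknown", "WIND_SPEED_GUST__EXAMPLE_LEVEL", height=10))
print(derive_column_name("t2m", "TEMPERATURE__EXAMPLE_LEVEL", height=2))
print(derive_column_name("t", "TEMPERATURE__EXAMPLE_LEVEL", pressure=850))
```

Under these assumptions the three calls print `wsg_10m`, `t_2m`, and `t_850hpa`.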

For further details, refer to the function _get_data_single_forecast:

(Protected) Return the forecast's data for a given time and indicator.

Parameters:

Name              Type          Description                                               Default
coverage_id       str           the indicator.                                            required
height            int | None    height in meters                                          required
pressure          int | None    pressure in hPa                                           required
forecast_horizon  timedelta     the forecast horizon (how much time ahead?)               required
lat               tuple         minimum and maximum latitude                              required
long              tuple         minimum and maximum longitude                             required
temp_dir          str | None    Directory to store the temporary file. Defaults to None.  None

Returns:

Type         Description
DataFrame    The forecast for the specified time.

Source code in meteole/forecast.py
def _get_data_single_forecast(
    self,
    coverage_id: str,
    forecast_horizon: dt.timedelta,
    pressure: int | None,
    height: int | None,
    lat: tuple,
    long: tuple,
    temp_dir: str | None = None,
) -> pd.DataFrame:
    """(Protected)
    Return the forecast's data for a given time and indicator.

    Args:
        coverage_id (str): the indicator.
        height (int | None): height in meters
        pressure (int | None): pressure in hPa
        forecast_horizon (dt.timedelta): the forecast horizon (how much time ahead?)
        lat (tuple): minimum and maximum latitude
        long (tuple): minimum and maximum longitude
        temp_dir (str | None): Directory to store the temporary file. Defaults to None.

    Returns:
        pd.DataFrame: The forecast for the specified time.
    """

    grib_binary: bytes = self._get_coverage_file(
        coverage_id=coverage_id,
        height=height,
        pressure=pressure,
        forecast_horizon_in_seconds=int(forecast_horizon.total_seconds()),
        lat=lat,
        long=long,
    )

    df: pd.DataFrame = self._grib_bytes_to_df(grib_binary, temp_dir=temp_dir)

    # Drop and rename columns
    df.drop(columns=["surface", "valid_time"], errors="ignore", inplace=True)
    df.rename(
        columns={
            "time": "run",
            "step": "forecast_horizon",
        },
        inplace=True,
    )
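    # Whatever column is not a known coordinate is the indicator itself
    known_columns = {"latitude", "longitude", "run", "forecast_horizon", "heightAboveGround", "isobaricInhPa"}
    indicator_column = (set(df.columns) - known_columns).pop()

    if indicator_column == "unknown":
        # Build a name from the first letter of each word before the "__" in coverage_id
        base_name = "".join([word[0] for word in coverage_id.split("__")[0].split("_")]).lower()
    else:
        # Strip everything from the first digit onward (e.g. "t2m" -> "t")
        base_name = re.sub(r"\d.*", "", indicator_column)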

    # Fold the level (height in meters or pressure in hPa) into the column name
    if "heightAboveGround" in df.columns:
        suffix = f"_{int(df['heightAboveGround'].iloc[0])}m"
    elif "isobaricInhPa" in df.columns:
        suffix = f"_{int(df['isobaricInhPa'].iloc[0])}hpa"
    else:
        suffix = ""

    new_indicator_column = f"{base_name}{suffix}"
    df.rename(columns={indicator_column: new_indicator_column}, inplace=True)

    df.drop(
        columns=["isobaricInhPa", "heightAboveGround", "meanSea", "potentialVorticity"],
        errors="ignore",
        inplace=True,
    )

    return df
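
Because each returned frame carries a single indicator column with the level folded into its name, frames for several levels can be joined on the shared coordinate columns without name clashes. The sketch below uses synthetic frames and plain pandas. `make_frame` is a hypothetical helper and the values are made up, so treat it as an illustration of the idea rather than Meteole's own merging logic.

```python
import pandas as pd

# Coordinate columns shared by every output frame.
KEYS = ["latitude", "longitude", "run", "forecast_horizon"]


def make_frame(column, values):
    """Hypothetical helper: build a synthetic output frame with one indicator."""
    return pd.DataFrame(
        {
            "latitude": [48.0, 48.1],
            "longitude": [2.0, 2.0],
            "run": pd.to_datetime(["2024-01-01", "2024-01-01"]),
            "forecast_horizon": pd.to_timedelta([1, 1], unit="h"),
            column: values,
        }
    )


t_2m = make_frame("t_2m", [275.3, 275.1])
t_10m = make_frame("t_10m", [274.8, 274.6])

# Join on the coordinates; the level-specific names keep the columns distinct.
combined = t_2m.merge(t_10m, on=KEYS, how="inner")
print(list(combined.columns))
```

Had both frames kept a generic indicator name plus a separate height column, the same join would need extra renaming or a pivot step.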