TimeSeriesDataFrame.convert_frequency
- TimeSeriesDataFrame.convert_frequency(freq: str | DateOffset, agg_numeric: str = 'mean', agg_categorical: str = 'first', num_cpus: int = -1, chunk_size: int = 100, **kwargs) -> TimeSeriesDataFrame
- Convert each time series in the data frame to the given frequency.
- This method is useful for two purposes:
  - Converting an irregularly sampled time series to a regular time index.
  - Aggregating time series data by downsampling (e.g., converting daily sales into weekly sales).
- The standard df.groupby(...).resample(...) can be extremely slow for large datasets, so this method parallelizes the operation across multiple CPU cores.
- Note: This method assumes that the index of the TimeSeriesDataFrame is sorted by [item_id, timestamp]. If the index is not sorted, the method logs a warning and may produce an incorrect result.
- Parameters:
- freq (Union[str, pd.DateOffset]) – Frequency to which the data should be converted. See pandas frequency aliases for supported values. 
- agg_numeric ({"max", "min", "sum", "mean", "median", "first", "last"}, default = "mean") – Aggregation method applied to numeric columns. 
- agg_categorical ({"first", "last"}, default = "first") – Aggregation method applied to categorical columns. 
- num_cpus (int, default = -1) – Number of CPU cores used when resampling in parallel. Set to -1 to use all cores. 
- chunk_size (int, default = 100) – Number of time series in a chunk assigned to each parallel worker. 
- **kwargs – Additional keyword arguments that will be passed to pandas.DataFrameGroupBy.resample.
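To make the roles of freq, agg_numeric and agg_categorical concrete, the per-series resampling can be approximated in plain pandas. This is a minimal sketch under stated assumptions, not the library's implementation: the helper name resample_per_item and the toy "store" column are hypothetical, the loop runs sequentially (no num_cpus or chunk_size handling), and the input is assumed to be a regular DataFrame with an [item_id, timestamp] MultiIndex.

```python
import pandas as pd


def resample_per_item(df, freq, agg_numeric="mean", agg_categorical="first"):
    """Resample each item separately (hypothetical helper, plain pandas)."""
    # Choose an aggregation for every column based on its dtype.
    aggregation = {
        col: agg_numeric if pd.api.types.is_numeric_dtype(df[col]) else agg_categorical
        for col in df.columns
    }
    pieces = {}
    for item_id, group in df.groupby(level="item_id", sort=False):
        # Drop the item_id level so the group is indexed by timestamp only.
        single_series = group.droplevel("item_id")
        pieces[item_id] = single_series.resample(freq).agg(aggregation)
    # Stack the per-item results back under an [item_id, timestamp] index.
    return pd.concat(pieces, names=["item_id", "timestamp"])


# Toy data (assumed for illustration): item 0 has a gap on 2019-01-02.
index = pd.MultiIndex.from_tuples(
    [
        (0, pd.Timestamp("2019-01-01")),
        (0, pd.Timestamp("2019-01-03")),
        (1, pd.Timestamp("2019-02-04")),
        (1, pd.Timestamp("2019-02-05")),
    ],
    names=["item_id", "timestamp"],
)
df = pd.DataFrame(
    {"target": [1.0, 2.0, 3.0, 4.0], "store": ["a", "a", "b", "b"]},
    index=index,
)

# The gap on 2019-01-02 becomes a row whose target is NaN.
daily = resample_per_item(df, "D")
print(daily)
```

Each item is resampled only between its own first and last timestamp, so items with different date ranges do not pad each other.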
 
- Returns:
- ts_df – A new time series dataframe with time series resampled at the new frequency. The output may contain missing values, represented by NaN, if the original data has no information for a given period.
- Return type:
- TimeSeriesDataFrame
- Examples

  Convert irregularly sampled time series data to a regular index

  >>> ts_df
                      target
  item_id timestamp
  0       2019-01-01     NaN
          2019-01-03     1.0
          2019-01-06     2.0
          2019-01-07     NaN
  1       2019-02-04     3.0
          2019-02-07     4.0
  >>> ts_df.convert_frequency(freq="D")
                      target
  item_id timestamp
  0       2019-01-01     NaN
          2019-01-02     NaN
          2019-01-03     1.0
          2019-01-04     NaN
          2019-01-05     NaN
          2019-01-06     2.0
          2019-01-07     NaN
  1       2019-02-04     3.0
          2019-02-05     NaN
          2019-02-06     NaN
          2019-02-07     4.0

  Downsample quarterly data to yearly frequency

  >>> ts_df
                      target
  item_id timestamp
  0       2020-03-31     1.0
          2020-06-30     2.0
          2020-09-30     3.0
          2020-12-31     4.0
          2021-03-31     5.0
          2021-06-30     6.0
          2021-09-30     7.0
          2021-12-31     8.0
  >>> ts_df.convert_frequency("YE")
                      target
  item_id timestamp
  0       2020-12-31     2.5
          2021-12-31     6.5
  >>> ts_df.convert_frequency("YE", agg_numeric="sum")
                      target
  item_id timestamp
  0       2020-12-31    10.0
          2021-12-31    26.0
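Because the method assumes an index sorted by [item_id, timestamp], it can help to check the ordering before converting. The following is a minimal sketch in plain pandas, assuming a DataFrame with that two-level MultiIndex. The helper name ensure_sorted is hypothetical and is not part of the library.

```python
import pandas as pd


def ensure_sorted(df: pd.DataFrame) -> pd.DataFrame:
    """Return df ordered by its index levels (hypothetical helper)."""
    # A MultiIndex is monotonic increasing when it is sorted level by level.
    if df.index.is_monotonic_increasing:
        return df
    return df.sort_index()


# Toy data (assumed for illustration) with rows deliberately out of order.
index = pd.MultiIndex.from_tuples(
    [
        (1, pd.Timestamp("2019-02-07")),
        (0, pd.Timestamp("2019-01-03")),
        (1, pd.Timestamp("2019-02-04")),
        (0, pd.Timestamp("2019-01-01")),
    ],
    names=["item_id", "timestamp"],
)
unsorted_df = pd.DataFrame({"target": [4.0, 1.0, 3.0, 0.5]}, index=index)

sorted_df = ensure_sorted(unsorted_df)
print(sorted_df)
```

The check is cheap compared with resampling, and sorting only when needed avoids copying data that is already in order.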