Replies: 2 comments
Yes. Are you interested in making a PR? Here is what Claude Code & I came up with:

    import xarray as xr
    import numpy as np

    ds = xr.tutorial.open_dataset("air_temperature")
    da = ds.air
    grouped = da.groupby("time.month")
    idxs = grouped.encoded.group_indices

    def pad_and_stack_vectorized(arrays, fillvalue=-1):
        # Convert to object array first to handle ragged arrays
        arrays = np.array(arrays, dtype=object)
        # Get lengths and find max
        lengths = np.array([len(arr) for arr in arrays])
        max_len = lengths.max()
        # Pre-allocate output array
        result = np.full((len(arrays), max_len), fillvalue, dtype=int)
        # Create row and column indices for all valid positions
        row_idx = np.repeat(np.arange(len(arrays)), lengths)
        col_idx = np.concatenate([np.arange(length) for length in lengths])
        # Concatenate all values and assign vectorized
        values = np.concatenate(arrays)
        result[row_idx, col_idx] = values
        return result

    stacked_idxs = pad_and_stack_vectorized(idxs)
    # Padded slots hold -1, which indexes the last element; overwrite them with NaN
    result = da.data[stacked_idxs, :, :]
    result[stacked_idxs == -1] = np.nan
    result
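
To see what the padding step does in isolation, here is a minimal sketch on toy input. The index lists and data values are made up for illustration, and `pad_and_stack` is a compact restatement of the helper above so the snippet runs without the tutorial dataset.

```python
import numpy as np

def pad_and_stack(index_lists, fillvalue=-1):
    """Pad ragged integer index lists into one 2D array (restated for a standalone demo)."""
    lengths = np.array([len(idx) for idx in index_lists])
    out = np.full((len(index_lists), lengths.max()), fillvalue, dtype=int)
    # Row/column coordinates of every real (non-padded) entry
    rows = np.repeat(np.arange(len(index_lists)), lengths)
    cols = np.concatenate([np.arange(n) for n in lengths])
    out[rows, cols] = np.concatenate([np.asarray(idx, dtype=int) for idx in index_lists])
    return out

# Toy group indices (made up): three groups of unequal size
toy_idxs = [[0, 3], [1, 4, 5], [2]]
padded = pad_and_stack(toy_idxs)

# Toy 1D data; fancy indexing returns a copy, so masking it is safe
data = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
stacked = data[padded]
stacked[padded == -1] = np.nan  # padded slots picked up data[-1]; blank them out

print(padded)
print(stacked)
```

Each row of `stacked` holds one group's values, with NaN in the slots that exist only because another group was longer.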
@dcherian Sorry for the late reply. Thanks for the code with an example dataset! I could make a PR, since I've had to write a convoluted workaround for this several times now, though I would need a bit more direction on how to implement these functions.
Suppose I have a dataset whose first index, `time`, is a datetime dimension with uneven spacing of `60s` or more between points. I want to resample to a certain interval, say 1800 s. Furthermore, I want to keep the points within each resample window as separate values. This means I want to reshape the time dimension to (N, 30), where N is the number of 1800 s intervals, and there are at most 30 points per 1800 s interval. I tried to mess around with `resample(...).apply`, but was unable to get it to work. The best solution I came up with is the following:

Before, `dataset` equals:

After, `resampled_dataset` equals:

This seems to work, but it would be nice if it could be done by something like:

    dataset.resample(time='1800s').stack()
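
For readers who want something concrete, the (N, 30) reshaping described in the question can be sketched in plain NumPy. This is not the poster's original workaround, and `stack_by_window` is a hypothetical helper rather than an xarray API. It assumes the timestamps are sorted and that windows are aligned to multiples of the window length since the Unix epoch.

```python
import numpy as np

def stack_by_window(times, values, window_s=1800, max_per_window=30):
    """Reshape 1D samples into (n_windows, max_per_window), padded with NaN.

    Hypothetical helper: assumes `times` is sorted and windows are aligned
    to multiples of `window_s` seconds since the Unix epoch.
    """
    secs = np.asarray(times, dtype="datetime64[s]").astype(np.int64)
    values = np.asarray(values, dtype=float)
    # Window index of each sample, counted from the first occupied window
    window = secs // window_s
    window = window - window[0]
    n_windows = int(window[-1]) + 1
    # Slot of each sample within its window: 0, 1, 2, ...
    first_in_window = np.searchsorted(window, window, side="left")
    slot = np.arange(len(window)) - first_in_window
    if slot.max() >= max_per_window:
        raise ValueError("a window holds more points than max_per_window")
    out = np.full((n_windows, max_per_window), np.nan)
    out[window, slot] = values
    return out

# Made-up, unevenly spaced timestamps (at least 60 s apart)
times = np.array(
    ["2024-01-01T00:00:00", "2024-01-01T00:01:00", "2024-01-01T00:05:30",
     "2024-01-01T00:31:00", "2024-01-01T01:10:00"],
    dtype="datetime64[s]",
)
values = np.arange(5.0)
stacked = stack_by_window(times, values)
print(stacked.shape)  # (3, 30)
```

Windows that contain no samples come out as all-NaN rows. The resulting array could then be wrapped in an `xr.DataArray` with a window dimension and a within-window dimension.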