libosf.core cannot deal with gpslocation #2

@Mq89

It seems that libosf.core cannot handle channels of datatype gpslocation.
Running the "to_csv" example with

    python3 ./to_csv.py -i example.osf -c GPS.Location

produces the following stack trace:

Traceback (most recent call last):
  File "[REDACTED]/python-osf/examples/./to_csv.py", line 58, in <module>
    main(sys.argv[1:])
  File "[REDACTED]/python-osf/examples/./to_csv.py", line 35, in main
    df = samples.make_column_based()
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[REDACTED]/python-osf/src/libosf/core.py", line 149, in make_column_based
    df = DataFrame(data=frame_data)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[REDACTED]/python-osf/venv/lib/python3.11/site-packages/pandas/core/frame.py", line 767, in __init__
    mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[REDACTED]/python-osf/venv/lib/python3.11/site-packages/pandas/core/internals/construction.py", line 503, in dict_to_mgr
    return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[REDACTED]/python-osf/venv/lib/python3.11/site-packages/pandas/core/internals/construction.py", line 114, in arrays_to_mgr
    index = _extract_index(arrays)
            ^^^^^^^^^^^^^^^^^^^^^^
  File "[REDACTED]/python-osf/venv/lib/python3.11/site-packages/pandas/core/internals/construction.py", line 677, in _extract_index
    raise ValueError("All arrays must be of the same length")
ValueError: All arrays must be of the same length

I have narrowed it down to get_samples, which returns a tuple of three arrays with different lengths (362, 266, and 362 in my case) when they should all have the same length. The loop starting at L209 (for blob in blob_array) extends result_timestamps by exactly one element per iteration, while result_values and result_indexes are sometimes extended by more than one element, so the lengths diverge.

def get_samples(self, name_list: list[str], as_class=False):
    """
    """
    ch_info = convert_channels_to_array(self.channels())
    ch_info_array = []
    blob_array = []
    ch_list = [ch for ch in self.channels() if ch.name in name_list]
    ch_filter_list = np.array([ch.index for ch in ch_list], dtype=np.uint16)
    index_start = self._magic_header['header_size'] + self._magic_header['magic_length']
    self._file.seek(index_start)
    buffer_bytes = self._file.read(-1)
    data_buffer = np.frombuffer(buffer_bytes, dtype='B').view(dtype=np.uint8)
    bytes_size = data_buffer.shape[0]
    index = 0
    while index < bytes_size:
        blob, index, chi = read_sample_blob(data_buffer, ch_info, index, ch_filter_list)
        if blob.shape[0] != 0:
            blob_array.append(blob)
            ch_info_array.append(chi)
    index = 0
    result_indexes = []
    result_values = []
    result_timestamps = []
    for blob in blob_array:
        values, timestamps = decode_datablob(blob, ch_info_array[index])
        result_values.extend(values)
        result_timestamps.extend(timestamps)
        result_indexes.extend([ch_info_array[index][0]] * len(values))
        index = index + 1
    if as_class:
        return RawData((result_values, result_timestamps, result_indexes), ch_list)
    else:
        return result_values, result_timestamps, result_indexes
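
To pin down which blobs cause the divergence, a per-blob length check could be run on the (values, timestamps) pairs before they are flattened into the result lists. The following is only a rough sketch: find_length_mismatches is a hypothetical helper that is not part of libosf, and the sample data is made up to stand in for what decode_datablob returns (it is assumed here, not confirmed, that a gpslocation blob yields several values for one timestamp).

```python
from typing import Sequence


def find_length_mismatches(
    decoded: Sequence[tuple[Sequence, Sequence]],
) -> list[tuple[int, int, int]]:
    """Return (blob_position, n_values, n_timestamps) for every blob
    whose decoded values and timestamps differ in length."""
    mismatches = []
    for position, (values, timestamps) in enumerate(decoded):
        if len(values) != len(timestamps):
            mismatches.append((position, len(values), len(timestamps)))
    return mismatches


# Made-up stand-in for a list of decode_datablob results (assumption):
# the second blob carries three values but only one timestamp.
decoded = [
    ([1.0], [100]),
    ([48.1, 11.5, 520.0], [200]),
    ([2.0], [300]),
]

mismatches = find_length_mismatches(decoded)
print(mismatches)  # [(1, 3, 1)]
```

Collecting the decode_datablob results in a list inside get_samples and passing them to such a check would show whether only the gpslocation blobs are affected.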

Labels

bug: Something isn't working
