Adding frame.py, functionality for window extraction and coherence analysis by judewerth · Pull Request #196 · dj-sciops/utah_organoids

judewerth · 2025-12-08T19:27:55Z

Here is my draft for a new pipeline for coherence analysis. It hasn't been tested yet. @MilagrosMarin, when you come back, I would appreciate a coding meeting where we can talk about how to improve, test, and integrate this code into the main repository.
New DataJoint Pipeline.pdf

MilagrosMarin

@judewerth, thank you for providing this pipeline expansion for coherence analyses. The overall pipeline design seems well thought out and aligned with the rest of the pipelines. @ttngu207 and I have reviewed this PR, and we are impressed with your learning and progress on the pipeline design. I have left a few comments on our discussion today. Please feel free to test this new pipeline in the database, but run only 1–2 jobs at a time to avoid overwhelming the worker with untested code.

This PR's feedback is mainly focused on improving the clarity, usability, and integration of your proposed changes to the pipeline:

We suggest more meaningful names for the new schema, some tables, and some attributes, to be more related and intuitive to the analysis being performed.
Review the total number of primary keys in the downstream pipeline, which can hurt query usability. Consider reducing the number of primary key attributes in downstream tables (e.g., by optimizing the computation to use ephys.LFP instead of ephys.LFP.Trace in Coherence table).
Ensure foreign key references between tables (e.g., linking burst detection parameters to active timeframes) are properly defined.
Verify the new function for channel-to-electrode mapping is compatible with the existing channel mapping functionality in the pipeline.

Overall, this is a valuable addition to the pipeline, and I am happy to discuss these suggestions further in our meeting tomorrow and provide any additional guidance as needed. Thanks!

MilagrosMarin · 2025-12-09T21:11:42Z

+    # correctly map electrode indices
+    electrode_ids = lookup[channel_ids]
+
+    return electrode_ids


Please ensure that this helper function, map_channel_to_electrode, is compatible with the channel mapping in the ephys pipeline.

MilagrosMarin · 2025-12-09T21:12:58Z

+
+
+# Set up schema (connects to database and manages table creation)
+schema = dj.schema(DB_PREFIX + "frame")


Let's consider a more meaningful name for the schema instead of frame; what about coherence or a similar option?

MilagrosMarin · 2025-12-09T21:16:17Z

+
+    definition = """
+    -> ActiveTimeFrames
+    burst_param_idx: int # Reference to BurstDetectionParamset


It seems that the foreign key reference to BurstDetectionParamset is missing here; burst_param_idx is defined as a plain attribute instead.

MilagrosMarin · 2025-12-09T21:34:58Z

+
+# Define Manual Tables
+@schema
+class AnalysisBoundaries(dj.Manual):


What about AnalysisBlock or AnalysisSegment?

MilagrosMarin · 2025-12-09T21:35:41Z

+
+    definition = """
+    -> AnalysisBoundaries
+    start_time: datetime # Start of active time frame


Let's use explicit attribute names, e.g., active_start_time, active_end_time.

MilagrosMarin · 2025-12-09T21:47:25Z

+
+    definition = """
+    -> ActiveTimeFrames
+    -> ephys.LFP.Trace


Should this foreign key be ephys.LFP.Trace? The electrode attribute in Synchrony is unnecessary, since the foreign key to LFP.Trace here in Coherence already indicates electrode information. Moreover, referencing ephys.LFP.Trace here (with electrode details) would make the primary key too long, reducing query efficiency. Could you optimize the code to reference ephys.LFP here instead, and add individual electrode entries in the Synchrony table?

judewerth · 2025-12-13T02:33:49Z

@MilagrosMarin @ttngu207
I've edited frame.py to only include the active frame detection portion of the code (please confirm this change made it to the pull request).

Additionally, I propose a strategy to process the MUA data to make this analysis possible. We focus on the first 2 days (already processed), 2 days before drug treatments, and drug treatments.

However, I'm interested to hear your thoughts. If we agree to move forward with this, I can begin inserting MUA Sessions.

Next steps:

I need access to debug the file directly (see below)

Confirm code works properly
Strategy to process MUA sessions

It takes ~9 seconds to process a minute of recording per organoid

If we were to process every file with 15 workers, a conservative estimate is 50 days. Pretty long.

However, if we were to process only the first 2 days, the 2 days before drug treatments, and 1 day of drug treatments, it would only take about 10 days.
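The estimates above come down to simple throughput arithmetic. Below is a minimal sketch, assuming processing time scales linearly with recording length and divides evenly across workers. The function name and its parameters are hypothetical; only the ~9 seconds per recorded minute figure comes from this comment.

```python
def estimate_processing_hours(recording_minutes, n_organoids, workers,
                              seconds_per_recorded_minute=9.0):
    """Rough wall-clock hours, assuming linear scaling and perfectly parallel workers."""
    total_seconds = recording_minutes * n_organoids * seconds_per_recorded_minute
    return total_seconds / workers / 3600.0


# One full day (1,440 minutes) of recording for a single organoid on one worker.
one_day = estimate_processing_hours(recording_minutes=24 * 60, n_organoids=1, workers=1)
print(f"{one_day:.1f} hours")  # 3.6 hours
```

Plugging in the actual number of recorded minutes and organoids would give a figure to compare against the 50-day and 10-day estimates; real throughput will likely be lower because workers are rarely perfectly parallel.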

judewerth · 2025-12-19T20:20:29Z

Hello @MilagrosMarin and @ttngu207,
This is the GitHub request accompanying an email sent out to the whole team. I've included my PowerPoint presentation here as well. GBM Analysis - DataJoint Integration.pdf

In summary, these are my proposed additions to the pipeline in preparation for the methods paper that will include the GBM analysis (my previous results came from independent code; these changes make it possible to produce those results through the pipeline).

What I would appreciate from y'all:

What are your thoughts on the new analysis? Do you believe there are better ways to characterize the recorded neural activity?
How does the architecture look? Do you think there is a better way to integrate the new analysis?
What steps need to be taken to push these changes to the main repository (all the changes are on the frame.py file but they would need to be incorporated into the ephys, analysis and mua files)?
A plan regarding processing the MUA data (via mua.EphysSession).

Best,
Jude

judewerth · 2026-01-05T19:58:00Z

@MilagrosMarin @ttngu207

Hello,

I would like to start inserting the MUA data, which is necessary to run the frame pipeline. I've added a new notebook, based on the CREATE_new_session notebook, to explain and streamline this process (CREATE_new_MUA_session).

Along with this push, I've inserted two days' worth of data (organoid 15) into the MUAEphysSession table. If possible, please set the number of workers to whatever you believe is best. Let me know how this process goes; if all goes well, I will add new organoids.

Best,
Jude

MilagrosMarin · 2026-01-07T23:20:20Z

Hi @judewerth,

Regarding the operational status of the new MUAEphysSession entries for organoid_id=O15: there are 5,760 sessions in total for this organoid. Of these, 3,607 have entries in the MUASpikes table with results already processed, and 2,150 are error jobs for this table, all with the error message: "FileNotFoundError: No valid full-path found (from [PosixPath('/home/jovyan/s3/inbox'), PosixPath('/home/jovyan/efs/outbox')]) for O13-16_raw/xxx". You can also check these errors in the DataJoint platform. Please confirm whether the files associated with these sessions were correctly uploaded to the cloud.
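For context, the error message indicates that a relative file path could not be resolved under any of the configured root directories. The sketch below illustrates that kind of lookup with plain pathlib. It is not the pipeline's actual implementation; the function names are hypothetical and only the wording of the error message is taken from the job log.

```python
from pathlib import Path


def find_under_roots(root_dirs, relative_path):
    """Return the first existing root/relative_path, or raise FileNotFoundError."""
    for root in root_dirs:
        candidate = Path(root) / relative_path
        if candidate.exists():
            return candidate
    raise FileNotFoundError(
        f"No valid full-path found (from {list(root_dirs)}) for {relative_path}"
    )


def missing_files(root_dirs, relative_paths):
    """List the relative paths that cannot be resolved under any root."""
    missing = []
    for rel in relative_paths:
        try:
            find_under_roots(root_dirs, rel)
        except FileNotFoundError:
            missing.append(rel)
    return missing
```

Calling missing_files with the two roots from the error message and the file paths of the failed sessions would show which files are absent from both locations.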

judewerth · 2026-01-07T23:24:59Z

Hi Milagros,

Thank you for the message. The insert code in the notebook relies on ephys.RawFile to extract file times. I believe all MUA entries have corresponding files in ephys.RawFile, and I verified that a handful of them are present on our Desktop. I've provided a screenshot in issue #202.

Best,
Jude

judewerth · 2026-01-10T00:49:17Z

Hi @MilagrosMarin @ttngu207,

Like we discussed in the meeting, I will be using this pipeline manually to analyze the first 2 days. I've just pushed a new version including STTFA (spike LFP coupling) and STTC (single unit functional connectivity) along with a notebook showcasing these tables (temporary).

While I'm doing that, please keep me updated regarding the MUA processing error in the backend, as well as how I can help get this code moved from frame.py to the correct schemas (analysis, ephys, and mua).

Best,
Jude

MilagrosMarin

Initial Review

Thanks for this comprehensive addition! I've added some inline comments on issues that need to be addressed before merging.

Summary of Issues:

Critical bugs: Missing parentheses on .pop calls
Missing dependencies: Several packages not in pyproject.toml
Hardcoded configuration: S3 store config should not be in schema module
README changes: Removes important documentation

Happy to discuss any of these points!

MilagrosMarin · 2026-01-23T16:30:25Z

+                f"Multiple Port IDs found for the {key} - cannot determine the port ID"
+            )
+        port_id = port_id.pop
+


🐛 Bug: Missing parentheses on .pop() call

This assigns the method object itself to port_id, not the result of calling it.

port_id = port_id.pop    # ❌ Returns method object
port_id = port_id.pop()  # ✅ Returns the value

This will cause a runtime error when port_id is used later.
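A small standalone illustration of the difference, assuming port_id is a single-element set at this point (the set contents here are hypothetical):

```python
port_ids = {"A"}  # hypothetical single-element set of port IDs

not_called = port_ids.pop  # bound method object; the set is left untouched
assert callable(not_called)
assert port_ids == {"A"}

port_id = port_ids.pop()  # removes and returns the only element
assert port_id == "A"
assert port_ids == set()
```

Without the parentheses, any later code that treats the result as a port ID receives a method object instead of the value.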

MilagrosMarin · 2026-01-23T16:31:33Z

+                f"Multiple Probes found for the {key} - cannot determine the probe name"
+            )
+        probe_name = probe_name.pop
+


Bug: Missing parentheses on .pop() call

Same issue as line 156 - should be probe_name.pop() with parentheses.

Also, probe_name is fetched but never used in this method. Is this intentional or should it be used in the commented-out EphysSessionProbe.insert1() below?

MilagrosMarin · 2026-01-23T18:26:24Z

+import plotly.io as pio
+import neo
+import quantities as pq
+from elephant.spike_train_correlation import spike_time_tiling_coefficient


Missing dependencies in pyproject.toml

The following packages are imported but not listed in pyproject.toml:

bottleneck (line 11)

specparam (line 16)

neo (line 21)

quantities (line 22)

elephant (line 23)

Please add these to the dependencies list to ensure the package installs correctly.
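One quick way to see which of these are absent from a given environment is to ask the import system directly. This is a minimal sketch using only the standard library; the helper name is hypothetical. Import names do not always match the distribution names that go into pyproject.toml, so the result is a starting point rather than the final dependency list.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level modules that cannot be found in the current environment."""
    return [name for name in module_names if find_spec(name) is None]


# Import names listed in the review comment above.
required = ["bottleneck", "specparam", "neo", "quantities", "elephant"]
print(missing_modules(required))
```

Running this in a fresh environment created from pyproject.toml should print an empty list once the dependencies have been added.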

MilagrosMarin · 2026-01-23T18:26:47Z

+
+    # create lookup to convert
+    lookup = np.empty(32, dtype=int)
+    lookup[channel_mapping] = electrode_mapping


Hardcoded electrode count

The value 32 is hardcoded here, which assumes all probes have 32 electrodes. Consider using the actual electrode count from probe.ElectrodeConfig.Electrode to make this more robust:

num_electrodes = len(probe.ElectrodeConfig.Electrode & electrode_config_key)
lookup = np.empty(num_electrodes, dtype=int)

This would handle probes with different electrode counts.
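An alternative that avoids a fixed table size altogether is to build the lookup from the mapping itself. The sketch below swaps the NumPy array for a plain dict, so it is a different technique from the one in the PR. The function name is hypothetical, and channel_mapping and electrode_mapping are assumed to be equal-length sequences as in the snippet under review.

```python
def map_channel_to_electrode_sketch(channel_ids, channel_mapping, electrode_mapping):
    """Translate channel indices to electrode indices without a fixed-size table."""
    if len(channel_mapping) != len(electrode_mapping):
        raise ValueError("channel_mapping and electrode_mapping must have equal length")
    lookup = dict(zip(channel_mapping, electrode_mapping))
    unknown = [ch for ch in channel_ids if ch not in lookup]
    if unknown:
        raise KeyError(f"Channels without an electrode mapping: {unknown}")
    return [lookup[ch] for ch in channel_ids]


# Hypothetical 4-channel probe: channel i is wired to electrode 3 - i.
electrodes = map_channel_to_electrode_sketch([0, 3], [0, 1, 2, 3], [3, 2, 1, 0])
print(electrodes)  # [3, 0]
```

Unlike an uninitialized np.empty array, the dict raises an error for unmapped channels instead of silently returning arbitrary values.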

MilagrosMarin · 2026-01-23T18:27:08Z

 # Mac
 .DS_Store

+# Documentation


Question: Why ignore README.md?

Adding README.md to .gitignore is unusual - the README is typically an important tracked file. Could you clarify the reason for this?

If the intention is to have a local-only README, consider using a different filename like README.local.md instead.

MilagrosMarin · 2026-01-23T18:27:52Z

@@ -0,0 +1,5910 @@
+{


Suggestion: Clear notebook outputs before merge

This notebook has 5910 lines with embedded cell outputs. Consider clearing outputs before merging to:

Reduce file size

Avoid tracking output data in version control

Make diffs cleaner for future changes

You can clear outputs with: jupyter nbconvert --clear-output --inplace notebooks/test_worker.ipynb
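For reference, clearing outputs amounts to emptying the outputs list and resetting execution_count on every code cell. The sketch below shows this on an in-memory nbformat-4 notebook dict using only the standard library. The helper name and the sample notebook are hypothetical; the nbconvert command above remains the simpler option for real files.

```python
import json


def clear_outputs(notebook):
    """Strip outputs and execution counts from code cells of an nbformat-4 notebook dict."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


# Tiny in-memory notebook standing in for a real .ipynb file (hypothetical content).
nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "source": ["1 + 1"],
         "execution_count": 3,
         "outputs": [{"output_type": "execute_result", "data": {"text/plain": ["2"]},
                      "execution_count": 3, "metadata": {}}]},
    ],
}
cleaned = clear_outputs(nb)
print(json.dumps(cleaned["cells"][1]["outputs"]))  # []
```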

MilagrosMarin · 2026-03-20T15:47:27Z

Hi @judewerth, I suggest closing this PR as it's been superseded by PR #211, where you reorganized the tables from frame.py into their logical homes (analysis.py, mua.py, culture.py, frame.py) — which is what you were asking about in the Jan 10 comment here and what we discussed and reviewed together.

Two tables from this PR are not in #211: ImpedanceFile/ImpedanceMeasurements and STTC. Are those planned for a follow-up PR?

Also, regarding the MUA processing errors for O15 (2,150 FileNotFoundError jobs from Jan 7) — is that resolved, or does it still need attention?

Thanks for the original draft and the PowerPoint — they were very helpful for understanding the design evolution into #211.

MilagrosMarin requested changes Dec 9, 2025

MilagrosMarin mentioned this pull request Dec 10, 2025

chore(MUATracePlot): temporarily limited key_source for testing/validation #197

Merged

ttngu207 previously approved these changes Dec 22, 2025

judewerth dismissed ttngu207’s stale review via 02ebfe0 January 5, 2026 19:52

MilagrosMarin reviewed Jan 23, 2026

MilagrosMarin mentioned this pull request Mar 20, 2026

Integrate tools for final analysis #211

Open

judewerth closed this Mar 25, 2026

judewerth force-pushed the main branch from 9efa46d to 12cf115 Compare March 25, 2026 16:52


