
skeliner==0.3.0: Mesh Preprocessing #32

Open
huangziwei wants to merge 349 commits into main from mesh


Conversation

@huangziwei
Collaborator

@huangziwei huangziwei commented Mar 16, 2026

TL;DR

  • viewer module: replaces the vtk-based microviewer for mesh viewing and also makes interactive mesh processing possible; without vtk, first-time startup is much faster.
    • can be invoked from the CLI with skeliner view --port 1234 or in a Jupyter notebook with sk.plot.view(port=1234)
  • pre module for mesh preprocessing (can be invoked manually, step by step, via sk.pre.*, or automatically via sk.preprocess(mesh)); outputs a MeshComponents dataclass, which includes Soma, Organelles, Neurites and Discarded.
  • sk.skeletonize(mesh, components=components) uses the MeshComponents as input.
  • Added and modified many dataclasses to hold the new processed data; changed io to save and load them.
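
For orientation, the four-way split described above can be pictured with a toy container. This is a hypothetical sketch, not skeliner's actual MeshComponents definition: the class name, the field types (plain face-index lists) and the helper method are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


# Hypothetical stand-in for the MeshComponents container described above.
# The real skeliner dataclasses are not shown in this PR text; plain
# face-index lists are used here only to illustrate the four-way split.
@dataclass
class MeshComponentsSketch:
    soma: list[int] = field(default_factory=list)
    organelles: list[list[int]] = field(default_factory=list)
    neurites: list[list[int]] = field(default_factory=list)
    discarded: list[list[int]] = field(default_factory=list)

    def faces_to_skeletonize(self) -> list[int]:
        """Collect neurite faces only; organelles and discarded pieces are skipped."""
        return sorted(f for piece in self.neurites for f in piece)


components = MeshComponentsSketch(
    soma=[0, 1, 2],
    organelles=[[3, 4]],
    neurites=[[5, 6], [7]],
    discarded=[[8]],
)
print(components.faces_to_skeletonize())
```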

In Depth

The raw mesh we get directly from CAVE/CloudVolume might contain quite a few artifacts, some of which can cause problems for the downstream skeletonization. So far skeliner has skipped the whole preprocessing step for the sake of speed, but sometimes we really need to clean them up beforehand. So what are the problems?

Parallel patches due to chunk merging failure

Screenshot 2026-04-09 at 13 54 40

A mesh is composed of many chunks. I don't fully understand how the mesh is made, but what I think happens is that the mesh is created at the per-chunk level, then stitched up into one cell. The stitching works most of the time, but sometimes it fails and creates artifacts, which I termed "parallel patches".

Screenshot 2026-04-09 at 14 06 36

It can happen at all chunk boundaries, along all 3 axes. Admittedly, most of the time they are harmless, but sometimes they break the continuity of a neurite, which creates (one type of) gaps; sometimes they merge wrongly, which creates "fusions" (connections between vertices, edges, or faces that shouldn't exist, sometimes non-manifold, sometimes manifold; more on this later), which in turn create (one type of) dendritic loop; and sometimes they create both gaps AND fusions at the same time (image above: green is gaps, blue is fusions).

pre.find_parallel_patches and pre.remove_parellel_patches fix this problem: we need to find them and remove the harmful ones, which opens up the mesh with contours of various sizes, then we stitch the contours back together. The stitching mostly works, but it's still not perfect: we also need to consider the cases where the parallels are between the surface mesh and internal organelles, which we might not want to mess with yet, so we filter out the cases with little overlap, which might miss some that should be stitched. Trade-offs everywhere.

This is the first step one should do before any other preprocessing, because removing and adding faces will change the mesh statistics.
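
To make "opens up the mesh with contours" concrete: once faces are removed, the contour around the hole consists of the edges that are left with only one adjacent triangle. The snippet below is a minimal sketch of that standard boundary-edge test on a toy tetrahedron. It is not taken from skeliner, and the function name boundary_edges is made up for this example.

```python
from collections import Counter


def boundary_edges(faces):
    """Return edges used by exactly one triangle (the open contours of a mesh)."""
    counts = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            # Store each edge with sorted endpoints so (u, v) and (v, u) match.
            counts[(min(u, v), max(u, v))] += 1
    return sorted(edge for edge, n in counts.items() if n == 1)


# A closed tetrahedron: every edge is shared by two faces, so no contour.
tetra = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
closed = boundary_edges(tetra)

# Removing one face leaves a triangular contour around the hole.
opened = boundary_edges(tetra[1:])
print(closed, opened)
```

Stitching then amounts to adding new faces between such contours until no boundary edges remain; how skeliner pairs up and fills the contours is not covered by this sketch.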

Organelles

The raw mesh also includes organelles which, if not removed or ignored during skeletonization, might create (one type of) "phantom branches" (a branch in the skeleton that captures an organelle instead of a real branching; this can also happen without organelles, due to geodesic binning not aligning with the neurite progression direction, but that's another topic).

Screenshot 2026-04-09 at 14 18 22

There are two types of organelles, which I termed "isolated" and "pockets". "Isolated" are the organelles as we intuitively picture them: disconnected components that are not directly connected to the surface mesh and float "within" it. The "pockets", on the other hand, took me quite a while to understand and to detect: they are not technically "within" the mesh, but part of the surface. Imagine the pockets of your clothes: if you pull them out, they are just part of the surface, but they can be tucked in and "look" like they are "inside" the surface mesh. The pockets are found via their "mouths" (find_pocket_mouths), which is where the mesh starts to "fold in" and shows a certain statistic (the outward dots of the faces transition from positive to negative). With this characteristic we can find most pockets, but not all, and we can find the majority of the faces belonging to a pocket, but still not all. This is because sometimes the outward-dot transition is just not obvious (e.g., the nucleus/soma never has a sharp transition, so we can never detect the nucleus this way; sometimes the transition is just barely missed by the threshold), and also because the pockets sometimes fold many times and create many layers of boundaries that we don't know how to cut. (After writing this, the latest commits 0e444f1 and a926f58 kind of fix this pos/neg boundary problem, which caused incomplete coverage; now the remaining problematic pockets are mostly the ones whose mouths we can't detect in the first place.)

pre.find_organelles and pre.remove_organelles can find both types and remove them. But most of the time you don't want to remove them: just keep them until break_up_mesh at the last step, where you can save them as Organelles, and skeletonization will ignore them by default.
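
The positive-to-negative transition can be illustrated on a two-triangle toy surface that folds back on itself. This is only a sketch under a loud assumption: "outward dot" is interpreted here as the dot product between a face normal and the vector from a reference interior point to the face centroid. The actual definition and thresholds used by find_pocket_mouths are not given in this description, and all function names below are hypothetical.

```python
def sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def outward_dots(vertices, faces, center):
    """Assumed 'outward dot': face normal . (face centroid - center)."""
    dots = []
    for a, b, c in faces:
        pa, pb, pc = vertices[a], vertices[b], vertices[c]
        normal = cross(sub(pb, pa), sub(pc, pa))
        centroid = tuple((pa[i] + pb[i] + pc[i]) / 3 for i in range(3))
        dots.append(dot(normal, sub(centroid, center)))
    return dots


def sign_flip_edges(faces, dots):
    """Edges shared by one face with a positive dot and one with a negative dot."""
    edge_faces = {}
    for fi, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_faces.setdefault((min(u, v), max(u, v)), []).append(fi)
    return sorted(e for e, fs in edge_faces.items()
                  if len(fs) == 2 and dots[fs[0]] * dots[fs[1]] < 0)


# Face 0 faces away from the origin; face 1 folds back toward it.
vertices = [(1, 0, 0), (1, 1, 0), (1, 0, 1), (0.9, 0, 0)]
faces = [(0, 1, 2), (2, 1, 3)]
dots = outward_dots(vertices, faces, center=(0, 0, 0))
mouth_edges = sign_flip_edges(faces, dots)
print(dots, mouth_edges)
```

A real detector would have to deal with the weak transitions mentioned above, for example by thresholding the dot values rather than using a strict sign test.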

Nucleus and Soma

The soma is supposed to be the biggest "bulb" in the raw mesh. Sometimes you don't even have a soma, and sometimes you have an incomplete one. If there's a soma, there will be a nucleus. But the nucleus is actually a "pocket" that doesn't fit the pocket organelle statistics. What it does create, though, is double contours in z-sections:

output

And this becomes the clue for finding the nucleus center and bounds, which leads us to the soma via BFS ring cutoffs (which means using the nucleus center to seed the first ring, growing it by a few more rings, then cutting them back a bit if they go too far, or extending them a bit if they haven't grown enough). Yet again this method is not perfect, because it relies on z-cross-section contours: if a very thick neurite extends horizontally, it will be mistaken for the outer soma contour (this can be fixed if the neurite is thin, but not if it's thick). Another problem is that if the soma has too many very big "holes" (not really holes in the watertight sense, but "holes" in the pocket-mouth sense), then we don't even have enough contours to form the nucleus chains along the z axis, resulting in incomplete soma detection.

The soma can be found via sk.pre.find_soma_via_ring_cutoff(mesh, organelles=organelles, mesh_stats=mesh_stats) and visualized with sk.plot.vis2d.diagnose_soma(mesh=mesh, soma=soma, organelles=organelles, figsize=(8,8)).
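
The ring-growing part of this idea can be sketched as a plain breadth-first search over vertex adjacency, where ring k holds the vertices k hops away from the seed. This is an illustration only: the function name bfs_rings and the max_rings parameter are stand-ins, and the real cutoff and extension criteria in find_soma_via_ring_cutoff are not reproduced here.

```python
def bfs_rings(adjacency, seeds, max_rings):
    """Grow rings outward from seed vertices; ring k holds vertices k hops away."""
    visited = set(seeds)
    rings = [sorted(visited)]
    frontier = rings[0]
    for _ in range(max_rings):
        next_ring = set()
        for v in frontier:
            for w in adjacency[v]:
                if w not in visited:
                    visited.add(w)
                    next_ring.add(w)
        if not next_ring:
            break  # nothing left to grow into
        rings.append(sorted(next_ring))
        frontier = rings[-1]
    return rings


# Toy path graph 0-1-2-3-4-5, seeded in the middle and capped at two rings.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
rings = bfs_rings(adjacency, seeds=[2], max_rings=2)
print(rings)
```

Here the cap on the number of rings plays the role of the cutoff; on a real mesh the decision to stop, trim, or extend would depend on geometry rather than on a fixed count.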

output

So far, with the 160+ cells I use for testing, soma detection (or the lack of it) is 100% accurate (if we ignore the fact that it sometimes overshoots or undershoots), but I am sure that with more data things will start breaking again.

Even with a mostly good soma, there can still be patches here and there that are not included in the first pass of running find_soma*. Those patches will be merged back at the last step, break_up_mesh (more on it later).

Fusions, Disconnected components, and Gaps

Two big problems with the raw mesh: things that should be connected are not (because of gaps), and things that should not be connected are (because of fusions). And they can happen at the same time: removing the fusions will create disconnected components that have gaps with the main mesh at the other end.

Example of fusions: "wrong" connections between two neurites that are very close but should not be connected:

Screenshot 2026-04-09 at 14 54 42

We can detect them because those connections are abnormal: a vertex connected to more than six edges, an edge shared by more than 2 faces, a face connected to more than 3 other faces via edges, etc.
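
One of these checks, the edge shared by more than 2 faces, is the usual non-manifold edge test and is easy to sketch. The snippet below is a minimal illustration rather than skeliner's find_fusions: the function name is hypothetical, and the vertex-valence and face-neighbor criteria are left out.

```python
from collections import defaultdict


def nonmanifold_edges(faces):
    """Map each edge shared by more than two triangles to the faces using it."""
    edge_faces = defaultdict(list)
    for fi, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_faces[(min(u, v), max(u, v))].append(fi)
    return {edge: fs for edge, fs in edge_faces.items() if len(fs) > 2}


# Three triangles fanning out from the same edge (0, 1): a non-manifold edge.
fan = [(0, 1, 2), (0, 1, 3), (0, 1, 4)]
suspicious = nonmanifold_edges(fan)

# Two triangles sharing an edge are perfectly normal.
clean = nonmanifold_edges([(0, 1, 2), (1, 3, 2)])
print(suspicious, clean)
```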

Screenshot 2026-04-09 at 14 59 18

Without knowing about the fusions, this branch will not be considered a disconnected component, and we will miss a gap. In other words, if we remove the fusions after removing the gaps, this branch will become disconnected: it was not recognized as a disconnected component in the first place, so it was never bridged.

The correct order for the fusions/gaps problem is: find_fusions > find_disconnected > find_gaps > remove_gaps > remove_fusions.

The sad truth is that the method we use to bridge (or stitch) the gaps (and also the parallel gaps created by removing parallel patches) will sometimes misfire and create fusions again, so removing fusions will make the bridged gaps reappear. It happens rarely, but it does happen. So far, with the 160+ cells I have, it mostly happens to small pieces that can be discarded, but it's almost certain that worse cases will turn up. I can only fix it by diagnosing bad results, which sadly I don't have yet.
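
Why fusions have to be known before looking for disconnected components can be shown on a toy strip of triangles. The sketch below labels edge-connected face components with a small union-find and optionally ignores a set of faces flagged as fusions. It is not skeliner code: face_components and its skip parameter are names made up for this illustration.

```python
def face_components(faces, skip=frozenset()):
    """Group faces connected through shared edges, ignoring faces in `skip`."""
    parent = {i: i for i in range(len(faces)) if i not in skip}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_owner = {}  # edge -> first face seen with that edge
    for fi, (a, b, c) in enumerate(faces):
        if fi in skip:
            continue
        for u, v in ((a, b), (b, c), (c, a)):
            edge = (min(u, v), max(u, v))
            if edge in first_owner:
                parent[find(fi)] = find(first_owner[edge])
            else:
                first_owner[edge] = fi

    groups = {}
    for fi in parent:
        groups.setdefault(find(fi), []).append(fi)
    return sorted(sorted(g) for g in groups.values())


# Faces 0-1 form the main mesh, faces 3-4 a branch, and face 2 is a fusion
# that wrongly connects the two.
faces = [(0, 1, 2), (1, 3, 2), (2, 3, 4), (3, 5, 4), (4, 5, 6)]
with_fusion = face_components(faces)
without_fusion = face_components(faces, skip={2})
print(with_fusion, without_fusion)
```

With the fusion face counted, everything looks like one component and the branch is never flagged; with it ignored, the branch shows up as a separate component whose gap to the main mesh can then be searched for and bridged.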

Breaking up the mesh into components for skeletonization

Screenshot 2026-04-09 at 15 07 16

After "fixing" all the anomalies, we can do one clean-up step, break_up_mesh, which breaks the mesh up by the identified components (Soma and Organelles). This step creates disconnected pieces: pieces that are patches around the soma are merged into the soma; pieces that are mostly enclosed by organelles are merged into organelles; of the rest, pieces that are long and big, or branching, are classified as neurites, and fragments (small, not connected to anything, hard to classify by thresholds and statistics) are assigned to Discarded.

The old sk.skeletonize(mesh) is now adapted to the new preprocessing output. It can also be called as sk.skeletonize(mesh, components=components), which skips all the post-processing steps and skeletonizes only the Neurites (ignoring the Organelles and Discarded), then stitches the result back to the precomputed Soma.

Everything can be done with one call: mesh_cleaned, components = sk.preprocess(mesh, compact=True). The default params (not many, and not very informative anyway) are already good for almost all the cells I tested. I'd call this a completely automatic preprocessing pipeline.

Where are we now and where are we going next?

This brings us to the end of this PR, which marks skeliner v0.3.0.

The raw mesh is a big pile of mess that requires a lot of compute to clean up, and we still can't make it perfectly clean after all the trouble. There are cells with dendritic loops that do not come from fusions (there's nothing wrong with the mesh itself; the chunk merging failure happened to yield a perfectly normal yet definitely wrong mesh), there are still gaps that we can't perfectly bridge, and there are still phantom branches in the skeleton after removing the organelles. All these preprocessing steps can do is "just" reduce the occurrence of dendritic loops, gaps and phantom branches.

v0.4.0 or some v0.3.* will focus on improving the skeletonization pipeline. Given that we now have a good estimate of the soma location (which can be enabled even without preprocessed components: just set soma_init_guess to nucleus and it will use the nucleus center as the soma seed instead of the extreme tips we used before), we can actually do multi-source geodesic binning from the soma or, with components, from the soma-connected tips of each neurite. We should then be able to get better "neurite progression aligned / perpendicular-to-centerline" bins and also get the centerline radius of the bins for "free". But I am speculating now. Hopefully this can fix the phantom branches and the radius estimation problem in one shot.
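
As a rough picture of what multi-source geodesic binning could look like, the sketch below runs Dijkstra from several seed vertices at once on a weighted graph and then groups vertices by distance intervals. It is speculative in the same way as the paragraph above: none of this exists in skeliner as described here, and the function names, the graph format and the bin width are assumptions.

```python
import heapq


def multi_source_distances(graph, sources):
    """Dijkstra started from all sources at once: distance to the nearest source."""
    dist = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def bin_by_distance(dist, width):
    """Group vertices into bins of equal geodesic width."""
    bins = {}
    for v, d in dist.items():
        bins.setdefault(int(d // width), []).append(v)
    return {k: sorted(vs) for k, vs in sorted(bins.items())}


# Toy path 0-1-2-3-4 with unit edge lengths, seeded from both ends.
graph = {
    0: [(1, 1.0)],
    1: [(0, 1.0), (2, 1.0)],
    2: [(1, 1.0), (3, 1.0)],
    3: [(2, 1.0), (4, 1.0)],
    4: [(3, 1.0)],
}
dist = multi_source_distances(graph, sources=[0, 4])
bins = bin_by_distance(dist, width=1.0)
print(bins)
```

On a real mesh the graph would be the vertex adjacency with edge lengths as weights, and the seeds would be the soma surface or the soma-connected tip of each neurite.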

@huangziwei huangziwei force-pushed the mesh branch 2 times, most recently from bb0f648 to 7a02fad on March 20, 2026 18:12
@huangziwei huangziwei changed the title from "[WIP] Mesh Preprocessing" to "Mesh Preprocessing" on Mar 20, 2026
@huangziwei huangziwei changed the title from "Mesh Preprocessing" to "skeliner v0.3.0: Mesh Preprocessing" on Apr 10, 2026
@huangziwei huangziwei changed the title from "skeliner v0.3.0: Mesh Preprocessing" to "skeliner==0.3.0: Mesh Preprocessing" on Apr 10, 2026
@huangziwei
Collaborator Author

This PR is currently on hiatus because I am sick and tired of it. There's also a (logic, not code) bug in find_gaps that is blocking progress. Will come back to it after May 1st.
