feat: tiling #5
base: feature/tissue-masks
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| defaults: | ||
| - /dataset/processed/ftn@_here_ | ||
| - _self_ | ||
|
|
||
||
| tissue_mask_uri: mlflow-artifacts:/86/04778b10de254572b69ce0a101c1eee4/artifacts/tissue_masks # TODO update URI | ||
| qc_mask_uri: mlflow-artifacts:/86/c8edfb2541e84b44b1a28be3540c1a35/artifacts # TODO update URI | ||
Comment on lines +5 to +6

Suggested change
| - tissue_mask_uri: mlflow-artifacts:/86/04778b10de254572b69ce0a101c1eee4/artifacts/tissue_masks # TODO update URI |
| - qc_mask_uri: mlflow-artifacts:/86/c8edfb2541e84b44b1a28be3540c1a35/artifacts # TODO update URI |
| + tissue_mask_uri: OVERRIDE_ME_TISSUE_MASK_URI |
| + qc_mask_uri: OVERRIDE_ME_QC_MASK_URI |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| defaults: | ||
| - /dataset/processed/ikem@_here_ | ||
| - _self_ | ||
|
|
||
| tissue_mask_uri: mlflow-artifacts:/86/13359cdd5d1a47ddabc352b9aa0d7635/artifacts/tissue_masks # TODO update URI | ||
| qc_mask_uri: mlflow-artifacts:/86/98443fe2b67445d5a56598bff15b7f27/artifacts # TODO update URI | ||
Comment on lines +5 to +6
Adames4 marked this conversation as resolved.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| defaults: | ||
| - /dataset/processed/knl_patos@_here_ | ||
| - _self_ | ||
|
|
||
||
| tissue_mask_uri: mlflow-artifacts:/86/8ef6d6f0c9af4f35a087596960f675aa/artifacts/tissue_masks # TODO update URI | ||
| qc_mask_uri: mlflow-artifacts:/86/75fc3e53112f4634ae5238777d87e88c/artifacts # TODO update URI | ||
Comment on lines +5 to +6

Suggested change
| - tissue_mask_uri: mlflow-artifacts:/86/8ef6d6f0c9af4f35a087596960f675aa/artifacts/tissue_masks # TODO update URI |
| - qc_mask_uri: mlflow-artifacts:/86/75fc3e53112f4634ae5238777d87e88c/artifacts # TODO update URI |
| + tissue_mask_uri: ${oc.env:TISSUE_MASK_URI,} |
| + qc_mask_uri: ${oc.env:QC_MASK_URI,} |
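
If the mask URIs come from the environment (or from `OVERRIDE_ME_...` placeholders), a missing value would only surface when the artifact download fails. A minimal sketch of a fail-fast check, assuming the variable names from the suggestion on this file; `resolve_mask_uri` is a hypothetical helper and is not part of this PR:

```python
import os
from collections.abc import Mapping


def resolve_mask_uri(env_var: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the URI stored in `env_var`, or raise if it is unset or a placeholder."""
    uri = environ.get(env_var, "").strip()
    # Treat both an empty value and an OVERRIDE_ME_* placeholder as "not configured".
    if not uri or uri.startswith("OVERRIDE_ME"):
        raise ValueError(f"{env_var} is not set; export it or override the config value")
    return uri


# Hypothetical environment used only for illustration.
example_env = {"TISSUE_MASK_URI": "mlflow-artifacts:/86/abc/artifacts/tissue_masks"}
print(resolve_mask_uri("TISSUE_MASK_URI", example_env))
```

Calling the helper for a variable that is absent raises `ValueError` before any slide is processed.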
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/ftn@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 0.17 # level 0 | ||
| tile_extent: 320 | ||
| stride: 320 | ||
|
|
||
| splits: | ||
| train: 0.7 | ||
| test_preliminary: 0.15 | ||
| test_final: 0.15 |
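
The `splits` block assigns a fraction of the data to each subset. The splitting code itself is not shown in this diff, so the following is only a sketch of one way such fractions could be applied: it checks that they sum to 1 and partitions a shuffled list of slide names. `split_slides` and the slide names are hypothetical, and the real pipeline may split at patient level or stratify.

```python
import math
import random


def split_slides(slides: list[str], fractions: dict[str, float], seed: int = 0) -> dict[str, list[str]]:
    """Partition `slides` into named subsets according to `fractions`."""
    if not math.isclose(sum(fractions.values()), 1.0):
        raise ValueError(f"split fractions must sum to 1, got {sum(fractions.values())}")

    shuffled = slides[:]
    random.Random(seed).shuffle(shuffled)  # deterministic for a fixed seed

    result: dict[str, list[str]] = {}
    start, cumulative = 0, 0.0
    for name, fraction in fractions.items():
        cumulative += fraction
        end = round(cumulative * len(shuffled))  # cumulative bounds avoid drift
        result[name] = shuffled[start:end]
        start = end
    return result


slides = [f"slide_{i:02d}" for i in range(20)]
parts = split_slides(slides, {"train": 0.7, "test_preliminary": 0.15, "test_final": 0.15})
print({name: len(items) for name, items in parts.items()})
```

With `train: 0.0`, as in the `knl_patos` configs, the same function returns an empty training subset and divides everything between the two test subsets.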
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/ftn@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 0.17 # level 0 | ||
| tile_extent: 430 # 75 / 0.17 ≈ 430 | ||
| stride: 430 | ||
|
|
||
| splits: | ||
| train: 0.7 | ||
| test_preliminary: 0.15 | ||
| test_final: 0.15 |
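
The inline comment derives `tile_extent` from a physical size and the resolution. A quick check of that arithmetic, assuming `75` is a tile side in micrometres and `mpp` is micrometres per pixel (neither unit is stated in the diff):

```python
def tile_extent_px(physical_um: float, mpp: float) -> int:
    """Number of pixels needed to cover `physical_um` micrometres at `mpp` µm/px."""
    return round(physical_um / mpp)


def covered_um(tile_extent: int, mpp: float) -> float:
    """Physical side length covered by a tile of `tile_extent` pixels."""
    return tile_extent * mpp


print(tile_extent_px(75, 0.17))        # 441
print(round(covered_um(430, 0.17), 1))  # 73.1
```

Under these assumptions, 75 / 0.17 comes out at about 441 px, while 430 px covers about 73.1 µm. The configured value of 430 would match a level-0 resolution closer to 0.1745 µm/px, so the `0.17` in the comment may simply be rounded.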
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/ftn@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 0.52 # level 1 | ||
| tile_extent: 224 | ||
| stride: 112 | ||
|
|
||
| splits: | ||
| train: 0.7 | ||
| test_preliminary: 0.15 | ||
| test_final: 0.15 |
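
Here `stride` is half of `tile_extent`, so neighbouring tiles overlap by 50%, whereas the level-0 configs use `stride == tile_extent` and produce non-overlapping tiles. A minimal sketch of the resulting grid, assuming tiles are anchored at the top-left corner and partial tiles at the border are dropped (the tiling code in this PR may handle edges differently); `tile_origins` is a hypothetical helper:

```python
from itertools import product


def tile_origins(width: int, height: int, tile_extent: int, stride: int) -> list[tuple[int, int]]:
    """Top-left (x, y) coordinates of all tiles that fit fully inside the image."""
    xs = range(0, max(width - tile_extent, 0) + 1, stride)
    ys = range(0, max(height - tile_extent, 0) + 1, stride)
    return list(product(xs, ys))


# A 448 x 224 px region: overlapping vs. non-overlapping tiling.
print(tile_origins(448, 224, tile_extent=224, stride=112))  # [(0, 0), (112, 0), (224, 0)]
print(tile_origins(448, 224, tile_extent=224, stride=224))  # [(0, 0), (224, 0)]
```

Halving the stride roughly doubles the tile count along each axis, so about four times as many tiles for a large slide.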
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/ftn@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 1.55 # level 2 | ||
| tile_extent: 224 | ||
| stride: 112 | ||
|
|
||
| splits: | ||
| train: 0.7 | ||
| test_preliminary: 0.15 | ||
| test_final: 0.15 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/ikem@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 0.17 # level 0 | ||
| tile_extent: 320 | ||
| stride: 320 | ||
|
|
||
| splits: | ||
| train: 0.7 | ||
| test_preliminary: 0.15 | ||
| test_final: 0.15 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/ikem@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 0.17 # level 0 | ||
| tile_extent: 430 # 75 / 0.17 ≈ 430 | ||
| stride: 430 | ||
|
|
||
| splits: | ||
| train: 0.7 | ||
| test_preliminary: 0.15 | ||
| test_final: 0.15 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/ikem@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 0.52 # level 1 | ||
| tile_extent: 224 | ||
| stride: 112 | ||
|
|
||
| splits: | ||
| train: 0.7 | ||
| test_preliminary: 0.15 | ||
| test_final: 0.15 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/ikem@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 1.55 # level 2 | ||
| tile_extent: 224 | ||
| stride: 112 | ||
|
|
||
| splits: | ||
| train: 0.7 | ||
| test_preliminary: 0.15 | ||
| test_final: 0.15 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/knl_patos@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 0.17 # level 0 | ||
| tile_extent: 320 | ||
| stride: 320 | ||
|
|
||
| splits: | ||
| train: 0.0 | ||
| test_preliminary: 0.5 | ||
| test_final: 0.5 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/knl_patos@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 0.17 # level 0 | ||
| tile_extent: 430 # 75 / 0.17 ≈ 430 | ||
| stride: 430 | ||
|
|
||
| splits: | ||
| train: 0.0 | ||
| test_preliminary: 0.5 | ||
| test_final: 0.5 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/knl_patos@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 0.52 # level 1 | ||
| tile_extent: 224 | ||
| stride: 112 | ||
|
|
||
| splits: | ||
| train: 0.0 | ||
| test_preliminary: 0.5 | ||
| test_final: 0.5 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # @package _global_ | ||
|
|
||
| defaults: | ||
| - /dataset/processed_w_masks/knl_patos@dataset | ||
| - _self_ | ||
|
|
||
| mpp: 1.55 # level 2 | ||
| tile_extent: 224 | ||
| stride: 112 | ||
|
|
||
| splits: | ||
| train: 0.0 | ||
| test_preliminary: 0.5 | ||
| test_final: 0.5 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| # @package _global_ | ||
|
|
||
| output_dir: ${project_dir}/quality_control/${dataset.institution} | ||
|
|
||
| request_timeout: 18000 | ||
| max_concurrent: 5 | ||
|
|
||
| qc_parameters: | ||
| mask_level: 3 | ||
| sample_level: 1 | ||
| check_residual: True | ||
| check_folding: False | ||
| check_focus: True | ||
| wb_correction: True | ||
|
|
||
|
|
||
| metadata: | ||
| run_name: "🎭 QC Masks: ${dataset.institution}" | ||
| description: Quality control masks for ${dataset.institution} institution | ||
| hyperparams: | ||
| mask_level: ${qc_parameters.mask_level} | ||
| sample_level: ${qc_parameters.sample_level} | ||
| check_residual: ${qc_parameters.check_residual} | ||
| check_folding: ${qc_parameters.check_folding} | ||
| check_focus: ${qc_parameters.check_focus} | ||
| wb_correction: ${qc_parameters.wb_correction} |
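
`request_timeout` and `max_concurrent` are forwarded to `client.qc.check_slides` in the script below. How the `rationai` client enforces them is not visible in this PR; the sketch below only illustrates the general pattern of bounded concurrency with a per-request timeout using `asyncio`. `run_bounded` and `fake_check` are hypothetical names.

```python
import asyncio


async def run_bounded(items, worker, max_concurrent: int, timeout: float):
    """Run `worker(item)` for every item with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item):
        async with semaphore:
            try:
                # Each request gets its own timeout budget.
                return item, await asyncio.wait_for(worker(item), timeout), None
            except Exception as exc:  # collect failures instead of aborting the batch
                return item, None, exc

    return await asyncio.gather(*(guarded(item) for item in items))


async def demo():
    active = peak = 0

    async def fake_check(slide: str) -> str:
        nonlocal active, peak
        active += 1
        peak = max(peak, active)  # track how many requests run at once
        await asyncio.sleep(0.01)
        active -= 1
        return f"{slide}: ok"

    slides = [f"slide_{i}" for i in range(8)]
    results = await run_bounded(slides, fake_check, max_concurrent=5, timeout=1.0)
    return peak, results


peak, results = asyncio.run(demo())
print(peak, len(results))
```

Returning the exception alongside the item mirrors the way the script logs failed slides to `qc_errors.log` rather than stopping on the first error.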
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # @package _global_ | ||
|
|
||
| mpp: ??? | ||
| tile_extent: ??? | ||
| stride: ??? | ||
| tissue_threshold: 0.5 | ||
|
|
||
| splits: | ||
| train: ??? | ||
| test_preliminary: ??? | ||
| test_final: ??? | ||
|
|
||
| metadata: | ||
| run_name: "🧱 Tiling: ${dataset.institution} ${tile_extent}" | ||
| description: Tile extraction for ${dataset.institution} institution with tile extent ${tile_extent} | ||
| hyperparams: | ||
| mpp: ${mpp} | ||
| tile_extent: ${tile_extent} | ||
| stride: ${stride} | ||
| tissue_threshold: ${tissue_threshold} |
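
`tissue_threshold: 0.5` presumably keeps only tiles in which at least half of the area is tissue. The filtering code is not part of this excerpt, so this is a sketch under that assumption, using a binary mask stored as nested lists; `tissue_fraction` and `keep_tile` are hypothetical helpers, and whether the real comparison is `>=` or `>` is not known.

```python
def tissue_fraction(mask_tile: list[list[int]]) -> float:
    """Fraction of non-zero (tissue) pixels in a binary mask tile."""
    total = sum(len(row) for row in mask_tile)
    tissue = sum(1 for row in mask_tile for pixel in row if pixel)
    return tissue / total if total else 0.0


def keep_tile(mask_tile: list[list[int]], tissue_threshold: float = 0.5) -> bool:
    """Keep the tile when its tissue coverage reaches the threshold (assumed inclusive)."""
    return tissue_fraction(mask_tile) >= tissue_threshold


mostly_tissue = [[1, 1], [1, 0]]      # 75% tissue
mostly_background = [[0, 0], [1, 0]]  # 25% tissue
print(keep_tile(mostly_tissue), keep_tile(mostly_background))  # True False
```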
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,122 @@ | ||
| # credits: https://gitlab.ics.muni.cz/rationai/digital-pathology/pathology/lymph-nodes/-/blob/develop/preprocessing/qc.py?ref_type=heads | ||
|
|
||
| import asyncio | ||
| from collections.abc import Generator | ||
| from pathlib import Path | ||
| from typing import TypedDict | ||
|
|
||
| import hydra | ||
| import mlflow.artifacts | ||
| import pandas as pd | ||
| import rationai | ||
| from omegaconf import DictConfig | ||
| from rationai.mlkit import autolog, with_cli_args | ||
| from rationai.mlkit.lightning.loggers import MLFlowLogger | ||
| from rationai.types import SlideCheckConfig | ||
| from tqdm.asyncio import tqdm | ||
|
|
||
|
|
||
| class QCParameters(TypedDict): | ||
| mask_level: int | ||
| sample_level: int | ||
| check_residual: bool | ||
| check_folding: bool | ||
| check_focus: bool | ||
| wb_correction: bool | ||
|
|
||
|
|
||
| def get_qc_masks(qc_parameters: QCParameters) -> Generator[tuple[str, str], None, None]: | ||
| if qc_parameters["check_focus"]: | ||
| yield ("Piqe_focus_score_piqe_median", "blur_per_tile") | ||
| yield ("Piqe_piqe_median_activity_mask", "blur_per_pixel") | ||
|
|
||
| if qc_parameters["check_residual"]: | ||
| yield ("ResidualArtifactsAndCoverage_cov_percent_heatmap", "artifacts_per_tile") | ||
| yield ("ResidualArtifactsAndCoverage_coverage_mask", "artifacts_per_pixel") | ||
|
|
||
| if qc_parameters["check_folding"]: | ||
| yield ("FoldingFunction_folding_test", "folds_per_pixel") | ||
|
|
||
|
|
||
| def organize_masks(output_path: Path, subdir: str, mask_prefix: str) -> None: | ||
| prefix_dir = output_path / subdir | ||
| prefix_dir.mkdir(parents=True, exist_ok=True) | ||
|
|
||
| # Materialize the glob into a list first, because the loop renames files in the directory being scanned. | ||
| for file in list(output_path.glob(f"{mask_prefix}_*.tiff")): | ||
| slide_name = file.name.removeprefix(f"{mask_prefix}_") | ||
| destination = prefix_dir / slide_name | ||
| file.rename(destination) | ||
|
|
||
|
|
||
| async def qc_main( | ||
| output_path: str, | ||
| slides: list[str], | ||
| logger: MLFlowLogger, | ||
| request_timeout: int, | ||
| max_concurrent: int, | ||
| qc_parameters: QCParameters, | ||
| ) -> None: | ||
| async with rationai.AsyncClient() as client: # type: ignore[attr-defined] | ||
| async for result in tqdm( | ||
| client.qc.check_slides( | ||
| slides, | ||
| output_path, | ||
| config=SlideCheckConfig(**qc_parameters), | ||
| timeout=request_timeout, | ||
| max_concurrent=max_concurrent, | ||
| ), | ||
| total=len(slides), | ||
| ): | ||
| if not result.success: | ||
| with open(Path(output_path) / "qc_errors.log", "a") as log_file: | ||
| log_file.write( | ||
| f"Failed to process {result.wsi_path}: {result.error}\n" | ||
| ) | ||
|
|
||
| # Organize generated masks into subdirectories | ||
| for prefix, artifact_name in get_qc_masks(qc_parameters): | ||
| organize_masks(Path(output_path), artifact_name, prefix) | ||
|
|
||
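| # Merge generated csv files, skipping a merged file left over from a previous run | ||
| merged_csv = Path(output_path, "qc_metrics.csv") | ||
| csvs = [f for f in Path(output_path).glob("*.csv") if f != merged_csv] | ||
| if csvs: # pd.concat raises ValueError on an empty list | ||
| pd.concat([pd.read_csv(f) for f in csvs]).to_csv(merged_csv, index=False) | ||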
|
|
||
| # Remove individual csv files | ||
| for f in csvs: | ||
| f.unlink() | ||
|
|
||
| logger.log_artifacts(local_dir=output_path) | ||
|
|
||
|
|
||
| def download_dataset(uri: str) -> pd.DataFrame: | ||
| path = mlflow.artifacts.download_artifacts(artifact_uri=uri) | ||
| df = pd.read_csv(path) | ||
| return df | ||
|
|
||
|
|
||
| @with_cli_args(["+preprocessing=quality_control"]) | ||
| @hydra.main(config_path="../configs", config_name="preprocessing", version_base=None) | ||
| @autolog | ||
| def main(config: DictConfig, logger: MLFlowLogger) -> None: | ||
| dataset = download_dataset(config.dataset.uri) | ||
|
|
||
| output_path = Path(config.output_dir) | ||
| output_path.mkdir(parents=True, exist_ok=True) | ||
|
|
||
| asyncio.run( | ||
| qc_main( | ||
| output_path=output_path.absolute().as_posix(), | ||
| slides=dataset["path"].to_list(), | ||
| logger=logger, | ||
| request_timeout=config.request_timeout, | ||
| max_concurrent=config.max_concurrent, | ||
| qc_parameters=config.qc_parameters, | ||
| ) | ||
| ) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| main() |
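
To make the effect of `organize_masks` concrete, here is a small standalone run against a temporary directory. The function body is a copy of the one in the diff, written with `str.removeprefix`; the file names are hypothetical and assume the QC service writes masks as `<prefix>_<slide>.tiff`, which is what the glob pattern implies.

```python
import tempfile
from pathlib import Path


def organize_masks(output_path: Path, subdir: str, mask_prefix: str) -> None:
    """Move `<mask_prefix>_<slide>.tiff` files into `subdir/<slide>.tiff`."""
    prefix_dir = output_path / subdir
    prefix_dir.mkdir(parents=True, exist_ok=True)
    # Snapshot the matches before renaming anything in the same directory.
    for file in list(output_path.glob(f"{mask_prefix}_*.tiff")):
        file.rename(prefix_dir / file.name.removeprefix(f"{mask_prefix}_"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "FoldingFunction_folding_test_slide_01.tiff").touch()
    (root / "Piqe_focus_score_piqe_median_slide_01.tiff").touch()

    organize_masks(root, "folds_per_pixel", "FoldingFunction_folding_test")
    moved = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.tiff"))

print(moved)
```

Only the files matching the given prefix are moved; the focus-score mask stays in place until `organize_masks` is called with its own prefix.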
The artifact URIs are hardcoded and marked with a TODO. They should be updated with the final URIs before merging. For better maintainability, consider whether they could be passed in through a more dynamic configuration mechanism rather than being hardcoded in multiple files.

Waiting for #3, #4