Adds fq/lint for early validation of FASTQs #67

adamrtalbot · 2024-11-02T12:54:21Z

Validating FASTQs early prevents running the pipeline on invalid FASTQ files, which makes the pipeline more efficient at achieving its ultimate objective of checking FASTQ validity.

It adds 3 more parameters:

- ~~`--skip_linting` which enables the linting of FASTQs~~ [update March 25] Replaced with `--skip_tools 'fq'`
- `--fq_lint_args` which is a string of arguments to pass to the linting tool
- `--continue_with_lint_fail` which is a boolean to determine whether to continue if the linting fails

Between these three options, the user has a high degree of control over how the pipeline lints, which should cover most use cases.
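For illustration only, here is a minimal Python sketch of how `--fq_lint_args` and `--continue_with_lint_fail` might interact. It is not the pipeline's Nextflow code: `build_lint_command` and `after_lint` are hypothetical names, `--example-flag value` is a placeholder rather than a real fq option, and placing options before the FASTQ paths is an assumption to check against `fq lint --help`.

```python
import shlex


def build_lint_command(reads, fq_lint_args=""):
    """Build an argv list for linting one sample (hypothetical helper).

    Assumption: user-supplied options go before the FASTQ paths.
    """
    # shlex.split honours shell-style quoting in the user-supplied string.
    return ["fq", "lint", *shlex.split(fq_lint_args), *reads]


def after_lint(exit_code, continue_with_lint_fail=False):
    """Decide what the run does once linting has finished (assumed behaviour)."""
    if exit_code == 0:
        return "continue"
    # Assumption: a lint failure stops the run unless the flag is set.
    return "continue" if continue_with_lint_fail else "stop"


cmd = build_lint_command(
    ["sample_1.fastq.gz", "sample_2.fastq.gz"],
    "--example-flag value",  # placeholder, not a real fq option
)
print(cmd)
print(after_lint(1), after_lint(1, continue_with_lint_fail=True))
```

Skipping the step entirely is handled separately through `--skip_tools`, so it is left out of this sketch.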

Implements tests for all cases using the rnaseq minimal test dataset, which has invalid sequencing names 🙄.

Closes #31

PR checklist


FranBonath · 2025-03-25T12:58:27Z

Hej, @adamrtalbot, thanks for your PR :). Just to let you know that we decided in the last seqinspector meeting on a defined list of modules to add to version 1. So while this is great, we will only implement it in a version after the first release. It's basically just to keep the first release simple.

pontushojer

Hi, just had a minor comment on this PR.

pontushojer · 2025-07-16T13:09:01Z

nextflow_schema.json

                    "type": "string",
                    "description": "Comma-separated string of tools to skip",
-                    "pattern": "^((fastqc|fastqscreen|seqfu_stats|seqtk_sample)?,?)*(?<!,)$"
+                    "pattern": "^((fq|fastqc|fastqscreen|seqfu_stats|seqtk_sample)?,?)*(?<!,)$"
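To see what the updated `pattern` accepts, a quick check can be run outside the pipeline. This is a sketch in Python's `re`, under the assumption that the schema validator treats the pattern the same way for these inputs; `is_valid_skip_tools` is a hypothetical helper name.

```python
import re

# Pattern copied from the "+" line of the nextflow_schema.json diff.
SKIP_TOOLS_PATTERN = re.compile(
    r"^((fq|fastqc|fastqscreen|seqfu_stats|seqtk_sample)?,?)*(?<!,)$"
)


def is_valid_skip_tools(value):
    """Return True if the string would pass the skip_tools pattern."""
    return SKIP_TOOLS_PATTERN.match(value) is not None


for value in ["fq", "fq,fastqc", "", "fq,", "fq_lint", "multiqc"]:
    print(repr(value), is_valid_skip_tools(value))
```

The pattern accepts `fq`, `fq,fastqc` and the empty string, and rejects a trailing comma (`fq,`) as well as names that are not in the list, such as `fq_lint`. Renaming the tool token therefore means changing the schema and the workflow together.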


pontushojer · 2025-07-16T13:09:30Z

workflows/seqinspector.nf

+    //
+    // MODULE: Run FQ_LINT to catch early errors
+    //
+    if ( !("fq" in skip_tools) ) {


Suggested change

-    if ( !("fq" in skip_tools) ) {
+    if ( !("fq_lint" in skip_tools) ) {
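The effect of the rename on users can be sketched as follows. This is Python rather than Nextflow, and it assumes `skip_tools` is the comma-separated `--skip_tools` string split into exact tool names before the membership check (the splitting is not shown in this PR); `parse_skip_tools` and `should_run` are hypothetical names.

```python
def parse_skip_tools(value):
    """Split a comma-separated --skip_tools string into a set of tool names."""
    return {token.strip() for token in value.split(",") if token.strip()}


def should_run(tool, skip_tools):
    """A tool runs unless its exact name was listed in skip_tools."""
    return tool not in skip_tools


# With the suggested token, linting is skipped only when "fq_lint" is listed.
print(should_run("fq_lint", parse_skip_tools("fq_lint,fastqc")))
# The old token "fq" would no longer switch linting off.
print(should_run("fq_lint", parse_skip_tools("fq")))
```

Because the match is on the whole token, anyone already passing `--skip_tools fq` would need to switch to `fq_lint` once the name changes.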

pontushojer · 2025-07-16T15:25:23Z

Just had an idea regarding the --continue_with_lint_fail option. With this enabled, it would be great to have a section in the MultiQC report(s) listing which samples failed linting. This could be a separate PR though.

Thinking further on this option, have you considered reversing the logic here so that the pipeline would continue by default even if some samples fail linting? To me, the main purpose of this pipeline is to identify which samples are bad (failed lint, contamination, low quality, etc.) and which are good for continued analysis. Stopping everything early because of one failed sample would go against this.

adamrtalbot · 2025-07-16T15:59:22Z

> Just had an idea regarding the --continue_with_lint_fail option. With this enabled, it would be great to have a section in the MultiQC report(s) listing which samples failed linting. This could be a separate PR though.
>
> Thinking further on this option, have you considered reversing the logic here so that the pipeline would continue by default even if some samples fail linting? To me, the main purpose of this pipeline is to identify which samples are bad (failed lint, contamination, low quality, etc.) and which are good for continued analysis. Stopping everything early because of one failed sample would go against this.

Based on @FranBonath's comment here I've stopped any further development on this feature, but yes, I think "keep going and report on all samples" is a good strategy for handling FQ linting.
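As a rough sketch of the "keep going and report on all samples" idea, the Python below splits samples by lint outcome and builds a small status table. Nothing here exists in the pipeline: the function names, the sample names and the tab-separated layout are all assumptions, and how such a table would reach a MultiQC report is left open.

```python
def split_by_lint(results):
    """Separate sample names into those that passed and those that failed."""
    passed = sorted(name for name, ok in results.items() if ok)
    failed = sorted(name for name, ok in results.items() if not ok)
    return passed, failed


def lint_report(results):
    """Render one tab-separated row per sample with its lint status."""
    lines = ["sample\tlint_status"]
    for name in sorted(results):
        lines.append(f"{name}\t{'pass' if results[name] else 'fail'}")
    return "\n".join(lines)


# Hypothetical lint outcomes: True means the sample passed linting.
results = {"sample_A": True, "sample_B": False, "sample_C": True}
passed, failed = split_by_lint(results)
print("continue with:", passed)
print(lint_report(results))
```

Only the samples in `passed` would move on to the remaining QC steps, while every sample still appears in the table.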

mahesh-panchal and others added 30 commits March 19, 2024 10:09

Update assets/schema_input.json

56e01a0

grop instead of project in a single place

6c94332

Co-authored-by: Adrien Coulier <adrien.coulier@medsci.uu.se>

Updated test profile input

4779844

Update assets/schema_input.json

28e0137

Co-authored-by: Karthik Nair <35717861+KarNair@users.noreply.github.com>

Merge pull request nf-core#2 from mahesh-panchal/input_workflow

d169eef

Input workflow

Generate reports per lane, group and rundir

a31040e

Improve formatting

0da5870

Improve output sorting

d233d8f

Use group instead of project

307e43c

Fix output channel

627cf94

Fix linting

c9ba028

Give credits back to NGI

2cfc91d

Fix file names

fbfb02d

Set up tests

8a19929

point test conf upstream

eba628e

project -> group

23f69d1

project -> group

1ebf3f1

make lane non-compulsory

b0bf471

remove unused file

9e0eca3

revamp nf-test, run once for each sequencing platform

98e60bf

Update modules and subworkflows

d95c660

Fix multiqc extra files

e0527cc

Issue with the previous implementation was that sometimes MULTIQC_PER_LANE would execute before the extra files were collected into `ch_multiqc_extra_files`, causing `null` to be added to the list of files passed to multiqc.

Merge branch 'dev' into nf-core-template-merge-2.14.1

ba72067

Add test snapshots

42159a1

Remove unused module configuration

ef61f9f

Update usage docs and restore example samplesheet

51b01e9

Update output docs

7c7f31f

Update changelog

1c4f6e0

Update samplesheet in readme file

211bfaf

Merge pull request nf-core#15 from nf-core/nf-core-template-merge-2.14.1

e93baf9

Important! Template update for nf-core/tools v2.14.1

Aratz and others added 21 commits January 17, 2025 14:16

Merge pull request nf-core#72 from nf-core/nf-core-template-merge-3.1…

4ed43fb

….2.dev0 Important! Template update for nf-core/tools v3.1.2.dev0

Merge pull request nf-core#68 from Aratz/tool_selector

ec9570d

Add skip tools parameter for tool selection

Merge remote-tracking branch 'origin/dev' into nf-core-template-merge…

5d4686e

…-3.1.2

Merge branch 'nf-core-template-merge-3.1.2' into nf-core-template-mer…

b259a87

…ge-3.2.0

Merge pull request nf-core#74 from nf-core/nf-core-template-merge-3.2.0

826a811

Important! Template update for nf-core/tools v3.2.0

Set up nft-utils in tests

734a412

Exclude fastqc/*_fastqc.html files

5053098

Update changelog

0372247

Merge pull request nf-core#75 from Aratz/nft-utils

80f0095

Set up nft-utils in tests

use parameter-supplied fastqscreen reference instead of hardcoded exa…

f79897c

…mple

Merge branch 'dev' into seqinspector-params

0c6fc55

typo

4bcca15

add assertions

0ba1652

Revert "add assertions"

2fd5f7c

This reverts commit 0ba1652.

Merge pull request nf-core#77 from kedhammar/seqinspector-params

487f081

Replace hard-coded path to fastqscreen example csv with parameter-supplied one

Added missing citations to citation tool

a99d0d6

TODOs removed

6c4bcde

CHANGELOG.md updated

7ca4053

Merge pull request nf-core#96 from Patricie34/dev

cd97466

Added missing citations to citation tool

Merge branch 'dev' into add_fq_lint_to_pipeline

b073192

Update syntax of fq skipping to use the skip_tools parameter

ad0817d

pontushojer reviewed Jul 16, 2025


pontushojer mentioned this pull request Jul 16, 2025

Add a paired-end test #55

Closed

maxulysse modified the milestones: 1.0.0, 1.1.0 Nov 6, 2025

maxulysse force-pushed the dev branch from be0c80c to a4ba47c Compare January 13, 2026 14:31


16 participants

Suggested change (nextflow_schema.json)

-                    "pattern": "^((fq|fastqc|fastqscreen|seqfu_stats|seqtk_sample)?,?)*(?<!,)$"
+                    "pattern": "^((fq_lint|fastqc|fastqscreen|seqfu_stats|seqtk_sample)?,?)*(?<!,)$"

