[Bug]: validate_if with regex not working as expected #86

@nick-youngblut

Guidelines

  • I agree to follow this project's Contributing Guidelines.

Project Version

0.1.2

Platform and OS Version

macOS 13.3

Existing Issues

No response

What happened?

My validation function:

#' validate whether the table column contains nucleotide strings
is_nucleotide = function(val, col_name){
  msg = glue::glue('"{x}" column is a nucleotide sequence', x={{col_name}})
  validate_if(val, grepl('^[ACGTURYKMSWBHDV]+$', {{col_name}}, perl=TRUE), 
              description = msg) 
}

The validation workflow:

report = data_validation_report()
read.delim(infile) %>%
  validate(name = "Verifying samples table") %>%
  is_nucleotide("TARGET_COLUMN") %>%
  add_results(report)

render_semantic_report_ui(get_results(report))

Example values in the TARGET_COLUMN of the data.frame:

"ATTCGTCC" "GCCTAATG" "GAGTCAAA" "AGACGTGG" "GACGGGAG" "AGTAAAGA"

If I use ^.+$, the validation passes, but the validation does not pass when using ^[A-Z]+$.

All of the string values in the table column consist only of the characters A, T, G, and C, so I don't see why ^[A-Z]+$ and ^[ACGTURYKMSWBHDV]+$ are failing.
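As a sanity check outside data.validator, the sketch below applies the three patterns with base R grepl() to the example values listed above. It also applies them to the literal string "TARGET_COLUMN". That second comparison is only an assumption about what might be going on (the pattern being matched against the column name rather than the column values); it has not been confirmed against the package internals. The names seqs, patterns, values_ok, and name_ok are illustrative.

```r
# Example values copied from the report.
seqs <- c("ATTCGTCC", "GCCTAATG", "GAGTCAAA", "AGACGTGG", "GACGGGAG", "AGTAAAGA")

# The three patterns mentioned in the report.
patterns <- c(
  any_char = "^.+$",
  upper    = "^[A-Z]+$",
  iupac    = "^[ACGTURYKMSWBHDV]+$"
)

# Does each pattern match every example value?
values_ok <- vapply(
  patterns,
  function(p) all(grepl(p, seqs, perl = TRUE)),
  logical(1)
)

# Assumption for illustration: what happens if the pattern is matched
# against the column name itself instead of the column values?
name_ok <- vapply(
  patterns,
  function(p) grepl(p, "TARGET_COLUMN", perl = TRUE),
  logical(1)
)

print(values_ok)
print(name_ok)
```

In base R all three patterns match every example value, so the regular expressions themselves look fine. Against the literal column name, only ^.+$ matches, because "TARGET_COLUMN" contains an underscore and letters outside the nucleotide set. That is the same pass/fail pattern described above, which may point at how {{col_name}} is evaluated inside validate_if when col_name is a string.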

Steps to reproduce

See above

Expected behavior

See above

Attachments

No response

Screenshots or Videos

No response

Additional Information

No response

Labels

bug (Something isn't working)
