Hi, I have a question regarding bcftools filtering behavior.
I’m currently testing test_filter.vcf (10 samples, 20 bi-allelic SNP sites) and would like to verify whether my filtering conditions work as intended.
test_filter_vcf.txt
Here is a summary of the SNP categories:
My goal is to keep only sites that satisfy all of the following conditions:
(1) QUAL >= 30 (2) F_MISSING < 0.1 (3) FORMAT/DP >= 10 (4) AC > 0
According to the bcftools filtering manual (https://samtools.github.io/bcftools/howtos/filtering.html), the expression -i 'FMT/DP>10 & FMT/GQ>20' selects sites where both conditions are satisfied within the same sample.
So I used the following command:
bcftools view -m2 -M2 --types snps \
  -i "QUAL>=30 & F_MISSING<0.1 & FORMAT/DP>=10 & AC>0" \
  test_filter.vcf -Ov -o filter_set1.vcf
However, after running this command, I noticed that site 15 is still present in filter_set1.vcf, even though Sample 1 (S1) has DP = 5 at that site, so it should not have passed the filter.
Is there something wrong with my command?
How should I modify it to correctly enforce the per-sample DP ≥ 10 requirement?
Any advice would be greatly appreciated. Thank you!
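To make the intended per-sample rule concrete, here is a minimal sketch that checks it with plain awk, independently of bcftools expression semantics. The function names `sites_failing_dp` and `demo_vcf` and the two inline records are hypothetical and made up for illustration; they are not taken from test_filter.vcf. The sketch assumes an uncompressed, tab-separated VCF with DP listed in the FORMAT column, and it treats a missing DP as failing.

```shell
# Hypothetical helper: print CHROM and POS of every site where at least one
# sample has DP below the threshold given as $1 (missing DP counts as failing).
sites_failing_dp() {
  awk -v min="$1" -F'\t' '
    /^#/ { next }                               # skip header lines
    {
      # Locate the position of DP within the FORMAT column (column 9).
      n = split($9, keys, ":")
      idx = 0
      for (i = 1; i <= n; i++) if (keys[i] == "DP") idx = i
      if (idx == 0) next                        # no DP field at this site

      # Sample columns start at column 10; flag the site if any sample fails.
      fail = 0
      for (s = 10; s <= NF; s++) {
        split($s, vals, ":")
        dp = vals[idx]
        if (dp == "" || dp == "." || dp + 0 < min) fail = 1
      }
      if (fail) print $1 "\t" $2
    }'
}

# Made-up two-sample VCF: the site at position 200 has DP = 5 in S1.
demo_vcf() {
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:12\t1/1:15\n'
  printf 'chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t0/1:5\t0/0:20\n'
}

demo_vcf | sites_failing_dp 10    # prints: chr1 200 (tab-separated)
```

Running `sites_failing_dp 10 < filter_set1.vcf` should print nothing if every retained site meets DP >= 10 in every sample. In bcftools itself, a FORMAT comparison under -i appears to be satisfied as soon as at least one sample passes, which would explain why site 15 is retained. Excluding on the opposite condition in a separate step, for example -e 'FMT/DP<10', is one commonly suggested way to require the threshold of every sample, but treat that as an assumption to confirm against the manual.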