This page lists frequently asked questions:
1. Why does usearch stop with an out-of-memory error?
A: The 32-bit usearch only works with small data sets whose memory requirement is below 4 GB. If the memory requirement exceeds 4 GB, the 64-bit usearch should be used, or users can split their input files into smaller ones. We suggest dividing large input fastq files into smaller ones and then merging all the output ARG abundance matrices to obtain the final results. On Linux, split -n 10 divides an input file into ten parts; users can then rename the parts so that they are suitable for ublastx_stage_one. At present, users need to buy the 64-bit usearch to remove the memory limitation. In the version 2 update, we added an option -s for big data to solve the out-of-memory problem.
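The splitting step can also be done on record boundaries. GNU split -n 10 divides a file by byte count, so a four-line fastq record may be cut in the middle. The following is a minimal Python sketch, not part of the pipeline, that distributes whole records round-robin across a chosen number of part files. The function name split_fastq and the output naming scheme (prefix.partN.fastq) are assumptions made for illustration, and the sketch assumes uncompressed fastq with exactly four lines per record.

```python
from contextlib import ExitStack
from itertools import islice


def split_fastq(path, n_parts, prefix):
    """Write the four-line FASTQ records of `path` round-robin into n_parts files.

    Hypothetical helper: output names follow `<prefix>.part<N>.fastq`.
    Returns the list of output file names.
    """
    names = [f"{prefix}.part{i + 1}.fastq" for i in range(n_parts)]
    with ExitStack() as stack:
        source = stack.enter_context(open(path))
        outputs = [stack.enter_context(open(name, "w")) for name in names]
        index = 0
        while True:
            # Read one record: header, sequence, separator, quality.
            record = list(islice(source, 4))
            if not record:
                break
            if len(record) != 4:
                raise ValueError("truncated FASTQ record at end of file")
            outputs[index % n_parts].writelines(record)
            index += 1
    return names
```

Because records are assigned by their position in the file, running the same split on both mate files of a sample sends read i of each mate file to the same part number, provided the two files list the reads in the same order.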
2. Where can the CARD and ARDB databases be downloaded?
A: Users should download the original sequence files from the CARD and ARDB websites. The links are CARD and ARDB, respectively.
3. Can Ublastx process single-end fastq files?
A: Currently, Ublastx can only process paired-end metagenomic sequences, but there is a workaround for single-end sequences. Because Ublastx does not currently use the pairing relationship and processes the reads of each file separately, users can split the single-end sequences into two files and treat the two files as a pair; the pipeline will then run without any problem.
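One possible way to build such a pseudo-pair is sketched below in Python. It alternates whole four-line records between two output files, so neither file contains a broken record. The function name make_pseudo_pair and the output paths passed to it are hypothetical; choose file names that match what the pipeline expects for paired input. The sketch assumes uncompressed fastq.

```python
from itertools import islice


def make_pseudo_pair(single_end_path, out_1, out_2):
    """Alternate FASTQ records from one single-end file into two files.

    Hypothetical helper for illustration. Returns the number of records
    written to (out_1, out_2).
    """
    counts = [0, 0]
    with open(single_end_path) as source, \
            open(out_1, "w") as first, open(out_2, "w") as second:
        targets = (first, second)
        while True:
            # One record is four lines: header, sequence, separator, quality.
            record = list(islice(source, 4))
            if not record:
                break
            if len(record) != 4:
                raise ValueError("truncated FASTQ record at end of file")
            which = sum(counts) % 2  # 0 for even-numbered records, 1 for odd
            targets[which].writelines(record)
            counts[which] += 1
    return tuple(counts)
```

With an odd number of reads the first file receives one more record than the second, which should be acceptable only because the reads are processed separately rather than as true pairs.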
4. What are the output files of Ublastx stage one?
A: Some new users may be confused about the contents of each file generated by the stage one pipeline:
extracted.fa                  Final extracted ARGs-like reads (FASTA format)
meta_data_online.txt          Metadata output for stage two
STAS_1.16s                    Output of the 16S read search for 1.fastq (BLAST m6 tabular format)
STAS_2.16s                    Output of the 16S read search for 2.fastq (BLAST m6 tabular format)
STAS_1.us                     Output of the SARG database search for 1.fastq
STAS_2.us                     Output of the SARG database search for 2.fastq
STAS.16s_1v6.us               Output of the 16S hypervariable region (HVR) database search for 1.fq (currently only V6 is supported)
STAS.16s_2v6.us               Output of the 16S hypervariable region (HVR) database search for 2.fq (currently only V6 is supported)
STAS.extract_1.fa             ARGs-like sequences extracted from 1.fastq
STAS.extract_2.fa             ARGs-like sequences extracted from 2.fastq
STAS.16s_hyperout.txt         Extracted 16S hypervariable region reads (FASTA format)
STAS.16s_hvr_community.txt    Microbial community structure derived by assigning the extracted HVR sequences; the quantification is the absolute abundance in that sample, and fragment sequences are counted by the ratio of the fragment
STAS.16s_hvr_normal.copy.txt  Average copy number calculated from the community information and the amplicon CopyRighter database
ublastx_bash_Mon-Feb-1-16:20:59-2016.sh  Shell file containing all the commands run in the pipeline
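To inspect one of the tabular search outputs listed above, a short Python sketch such as the following may help. It assumes the file uses the standard 12-column, tab-separated BLAST tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, e-value, bit score); check a few lines of the actual file before relying on this. The names Hit, read_m6, and best_hits are hypothetical and not part of the pipeline.

```python
import csv
from collections import namedtuple

# Field names for the assumed 12-column BLAST tabular layout.
Hit = namedtuple(
    "Hit",
    "query subject identity length mismatches gap_opens "
    "q_start q_end s_start s_end evalue bitscore",
)


def read_m6(path):
    """Yield one Hit per non-empty, non-comment line of a tabular file."""
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row or row[0].startswith("#"):
                continue
            yield Hit(
                row[0], row[1], float(row[2]),
                int(row[3]), int(row[4]), int(row[5]),
                int(row[6]), int(row[7]), int(row[8]), int(row[9]),
                float(row[10]), float(row[11]),
            )


def best_hits(path):
    """Return a dict mapping each query to its highest-bit-score Hit."""
    best = {}
    for hit in read_m6(path):
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best
```

For example, best_hits("STAS_1.16s") would return one entry per read that had at least one hit, if that file follows the assumed layout.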