search_index.json
[
["index.html", "Osburn Lab Protocols 1 | About", " Osburn Lab Protocols Created: 2019-10-16 Last updated: 2021-01-20 1 | About This is a collection of protocols for the Osburn Lab 😎 Figure 1.1: Osburn Lab Spring 2019 "],
["data-access.html", " 2 | Data Access + Storage 2.1 MacOS Users 2.2 Windows Users", " 2 | Data Access + Storage Created by: Caitlin Casar on 2019-10-16 Last updated: 2019-10-16 The Osburn Lab data is backed up to RDSS at Northwestern. You can access this data if you have permissions using the directions here. If you do not currently have access, you will need to be added as a user by Caitlin or Maggie. 2.1 MacOS Users Open Finder and navigate to Go > Connect to Server… Add this server address: smb://resfiles.northwestern.edu/OSBURN_LAB Add your netID and password. Finder will automatically open OSBURNLAB in your system Volumes. To navigate to the OSBURNLAB dirctory in your terminal: cd /Volumes/OSBURN_LAB 2.2 Windows Users Open windows file explore and add this server address: resfiles.northwestern.edu You will be prompted for your user name (netID) and password. The contents of OSBURNLAB is displayed in the window. "],
["version-control.html", " 3 | Version Control", " 3 | Version Control Created by: Caitlin Casar on 2019-10-16 Last updated: 2019-10-16 If you’re writing code, it’s very important to implement version control with Git. This guide will get you started! First you’ll need to install Git. Next, create an account on Github. If you want access to provate repositories (i.e. if you need to backup unpublished data or code), be sure to set up a student account. Now, you’ll need to set your credentials in Git. Open up your terminal. #set your user name on github git config --global user.name "John Doe" #set your user email on github git config --global user.email johndoe@example.com Now, go to Github and create a repository for your code. If you want this repository to be private, change the repo settings on Githib. Click on the settings button. Then set the reposotory to private. Then clone this repository to your computer. You may be prompted to enter your Github password. #change directories to the desired location for your repository cd ~/Desktop #clone your repository using the URL git clone https://github.com/OsburnLab/Protocols Now you can add your code files to this cloned repository. When you’re done editing your code, push it up to the Github server. #add your new files to the queue git add . #commit your changes and add a short description git commit -a -m "short description here" #push your changes to the Github server git push "],
["create-a-protocol.html", " 4 | Create a Protocol", " 4 | Create a Protocol Created by: Caitlin Casar on 2019-10-16 Last updated: 2019-10-16 Wanna share a cool protocol with your lab mates in this bookdown document? Follow this guide to learn how! First, you’ll need to clone this repository using Git in your terminal. #change directories to a desirable location cd ~/Desktop git clone https://github.com/OsburnLab/Protocols Next, open the bookdown-demo.Rproj file in RStudio. Then, select File > New File > R Markdown… Give this file a name in the format ‘number-name.Rmd’, where number is in sequential order with the other .Rmd files. Add an H1 element chapter title to the file. # | Chapter Title Next, add some paragraph content below this. # | Chapter Title Here is some paragraph content. To add a code chunk, equations, or figures, check out this link. When you’re done editing the R markdown file, render the book. bookdown::render_book("index.rmd", "bookdown::gitbook") Now it’s time to update your changes online! #change directories to the protocols folder cd ~/Desktop/Protocols #add all new files you created git add. #commit all of your changes and add a short description about your update git commit -a -m "short description here" #push your changes to the github server git push #update the rendered html file on the github-hosted page cd _book git add . git commit -a -m "short description here" git push "],
["qiime2-workflow.html", " 5 | Qiime2 workflow 5.1 Import data 5.2 Demultiplexing 5.3 Denoising and ASV generation 5.4 Taxonomy 5.5 Taxa barplots and diversity analyses in Qiime2", " 5 | Qiime2 workflow Created by: Matt Selensky on 2019-11-19 Last updated: 2019-11-30 Workflow for 16S amplicon sequence analysis in Qiime2 5.1 Import data This protocol is designed for the processing of 16S rRNA gene amplicon data using 515F/806R primers. Qiime2 requires us to convert the raw data the sequencing center sends us into Qiime-zipped artifacts, or a .qza extension. We must first import our data (paired-end .fastq files) into this format: qiime tools import \\ --type EMPPairedEndSequences \\ --input-path emp-paired-end-sequences \\ --output-path emp-paired-end-sequences.qza The above function requires three .fastq files in the folder emp-paired-end-sequences. One file is for the forward reads, one is for the reverse reads, and another is for the barcodes. They must be named forward.fastq.gz, reverse.fastq.gz, and barcodes.fastq.gz, respectively. qiime tools import will yield a single output, emp-paired-end-sequences.qza, that will contain all of the barcoded reads from every single sample submitted to the sequencing center. 5.2 Demultiplexing At the sequencing center, DNA sequences were given a barcode specific to each sample so we can track which sample our reads in the emp-paired-end-sequences.qza file originated from. We do this by demultiplexing our sequences. In Qiime2, we need to create a metadata file that contains the barcodes used for each sample. Check out this example from the Qiime2 documentation of how this metadata file should be formatted. The sequencing center should send a mapping file from which you can obtain the barcodes for each sample. Be sure to save the metadata file as a .tsv. Specify the barcodes and other associated metadata only for the samples you are interested in analyzing. 
As is often the case in our lab, sequencing data is sent back as a mix of samples from different projects. Only including your samples in the metadata file will subset the large .qza file and will significantly cut down on computation time. Because we are demultiplexing EMP paired-end sequences, we should use the demux emp-paired command. The column which contains the barcode in the metadata file for each sample must be specified using the argument barcodes-column: qiime demux emp-paired \\ --m-barcodes-file sample-metadata.tsv \\ --m-barcodes-column barcode-sequence \\ --i-seqs emp-paired-end-sequences.qza \\ --o-per-sample-sequences demux.qza Note: if you have reverse complement sequences, you must pass the argument --p-rev-comp-mapping-barcodes to your demux command to account for this. You can look at this on the Qiime2 viewer by producing a Qiime-zipped visualization file, .qzv, from your now-demultiplexed .qza output: qiime demux summarize \\ --i-data demux.qza \\ --o-visualization demux.qzv From the interactive quality plot in demux.qzv, we can see the distribution of quality scores for each sequenced base. If analyzing paired-end data, you will see two plots: one for the forward read, and one for the reverse read. We use this visualization to inform how we will trim and truncate the ends of the reads in the next denoising step using dada2 denoise-paired. 5.3 Denoising and ASV generation We will use the DADA2 algorithm to denoise our data and generate amplicon sequence variants (ASVs). DADA2 is a robust way to filter out noisy sequences, correct errors in marginal sequences, remove chimeras, remove singletons, join denoised paired-end reads, and dereplicate sequences. Previously, each of these functions would require separate commands, but DADA2 does it all-in-one. Therefore, this is a particularly computationally intense process. 
One should consider running dada2 on a computer that can handle it (perhaps by accessing Northwestern’s high-performance computing cluster, Quest). qiime dada2 denoise-paired \\ --i-demultiplexed-seqs demux.qza \\ --p-trim-left-f 13 \\ --p-trim-left-r 13 \\ --p-trunc-len-f 150 \\ --p-trunc-len-r 150 \\ --o-table table.qza \\ --o-representative-sequences rep-seqs.qza \\ --o-denoising-stats denoising-stats.qza By inspecting the interactive demux.qzv file produced in the previous step on the Qiime2 viewer, we observe that sequence quality scores are lower than average until base #14 in both the forward and reverse reads. We will want to trim these low-quality sequences from our data. Use the argument p-trim to specify the number of nucleotides that should be trimmed from the left end of the forward (left-f) and reverse (left-r) reads, which we define as 13 here. Similarly, the p-trunc-len argument is used to trim the right ends of, or truncate, our reads. Since we have paired-end data, our amplicons are 150 nucleotides long. We define p-trunc-len for the forward and reverse reads as 150 because we do not observe a drop off in quality scoring on their right ends in our demux.qzv file. The output rep-seqs.qza contains a list of the ASVs found across all samples, and will be used in the next step of our processing workflow: assigning taxonomy. 5.4 Taxonomy At this point, our ASVs lack any meaningful identification - we don’t know whether ASV ‘A’ comes from the bacterium E. coli or the archaeon S. solfataricus! We determine who is present in our samples by assigning taxonomic IDs to each “query” sequence (from rep-seqs.qza). We do this by comparing query sequences to a database of known reference sequences (Silva is an excellent choice for our purposes). A major advantage of using Qiime2 is that it contains the classify-sklearn algorithm, which uses machine learning via Naive Bayes to classify sequences. 
As is the case with other machine learning applications, the classifier must be trained. Classifier training is required for every new reference database/amplicon pair, and would be the most resource-intensive step in our workflow by far. Luckily for us, Silva is routinely used to classify sequences coming from 16S rRNA gene amplification using the 515F/806R primers, and the Qiime2 developers provide a free, pre-trained Silva classifier in their documentation for just that! Download this classifier - you will need it! At the time this was written, I used silva-132-99-515-806-nb-classifier-2018.qza, the latest version of the pre-trained Silva classifier. Even without the extra training step, it is highly recommended to run classify-sklearn on a high-performance computing cluster, as it is very memory intensive and slow (budget several hours or even a day for this to complete!). Please refer to our Quest tutorial to get started on how to submit jobs on Northwestern’s cluster. qiime feature-classifier classify-sklearn \\ --i-classifier classifier.qza \\ --i-reads rep-seqs.qza \\ --o-classification taxonomy.qza If you so choose, you can visualize the resultant taxonomy file on the Qiime 2 viewer to verify that classification was successful: qiime metadata tabulate \\ --m-input-file taxonomy.qza \\ --o-visualization taxonomy.qzv 5.5 Taxa barplots and diversity analyses in Qiime2 You can quickly visualize the community composition of your samples via the taxa barplot command. This requires your clustered feature table and taxonomy.qza from the previous step as inputs. In the Qiime 2 viewer, you can export the data that feeds the taxa barplot as .csv files specific to each level of taxonomic classification. Use those files in R to produce publication-quality figures. 
qiime taxa barplot \\ --i-table table.qza \\ --i-taxonomy taxonomy.qza \\ --m-metadata-file metadata.tsv \\ --o-visualization taxa-bar-plots.qzv In fact, we don’t really want to use Qiime2 for making any sort of figure for presentations or publications, but it does have a few handy tools to quickly visualize your data to inform which types of figures you want to make. Let’s start with the built-in diversity analyses offered by Qiime2. Many diversity analyses compute diversity by incorporating phylogeny. That means we have to generate a phylogenetic tree of how our sequences are related to each other! Both rooted and unrooted trees are outputs of the align-to-tree-mafft-fasttree command. UniFrac and Faith’s Phylogenetic Diversity require the use of a rooted tree. qiime phylogeny align-to-tree-mafft-fasttree \\ --i-sequences rep-seqs.qza \\ --o-alignment aligned-rep-seqs.qza \\ --o-masked-alignment masked-aligned-rep-seqs.qza \\ --o-tree unrooted-tree.qza \\ --o-rooted-tree rooted-tree.qza Additionally, the analyses we are about to perform will be subsampling our data to estimate diversity. This rarefaction is done so we can compare diversity across samples of different sizes, thereby minimizing bias. We need to know the sequencing depth we should take so we don’t miss out on too many rare sequences (by choosing too low of a depth) or too many samples themselves (by choosing too high of a depth). By making an alpha rarefaction plot that we can visualize on the Qiime2 viewer, we can make an informed decision: qiime diversity alpha-rarefaction \\ --i-table table.qza \\ --i-phylogeny rooted-tree.qza \\ --p-max-depth 20000 \\ --m-metadata-file metadata.tsv \\ --o-visualization alpha-rarefaction.qzv From this visualization, you should choose a sequencing depth at which the observed OTUs from most samples level off, without excluding too many samples. In our example, we will choose a depth of 6500. 
After determining the degree of rarefaction, we can compute core diversity metrics in Qiime2: qiime diversity core-metrics-phylogenetic \\ --i-phylogeny rooted-tree.qza \\ --i-table table.qza \\ --p-sampling-depth 6500 \\ --m-metadata-file metadata.tsv \\ --output-dir core-metrics-results The core-metrics-results folder will contain both alpha and beta diversity metrics. For each metric, you can determine diversity significance using diversity alpha-group-significance or diversity beta-group-significance: qiime diversity alpha-group-significance \\ --i-alpha-diversity core-metrics-results/alpha-div-metric-of-interest.qza \\ --m-metadata-file metadata.tsv \\ --o-visualization core-metrics-results/metric-group-significance.qzv qiime diversity beta-group-significance \\ --i-distance-matrix core-metrics-results/distance_matrix.qza \\ --m-metadata-file metadata.tsv \\ --m-metadata-column comparisonofinterest \\ --o-visualization core-metrics-results/unweighted-unifrac-comparisonofinterest-significance.qzv Beta diversity visualizations can be viewed via Qiime2 View’s Emperor, which offers an interactive three-dimensional platform to explore relationships in your data. "],
["quest-tutorial.html", " 6 | Quest tutorial 6.1 Getting acquainted with Quest 6.2 Using Qiime2 on Quest 6.3 Best practices in a shared computing environment 6.4 Interactive jobs on Quest 6.5 Batch jobs on Quest 6.6 A note on partitions 6.7 More information", " 6 | Quest tutorial Created By: Matt Selensky on 2019-11-19 Last updated: 2019-11-30 Getting an allocation on Quest You may find that you are unable to process the large volume of sequencing data on your personal computer. Thankfully, Northwestern IT offers free access to its high-performance computing cluster, Quest, to students, postdocs, and faculty. To use Quest, you first need to apply for an allocation granted by IT. Please visit this webpage to learn more about the application process. 6.1 Getting acquainted with Quest Once you obtain an allocation, you can start using Quest for any manner of processing needs. Quest is remotely accessed from your personal computer by way of a secure shell in the command line. If you use Windows, download GitBASH to be able to more easily interact with the Unix command line in Quest. Note - this is not necessary if you use Mac or *nix. Before logging in to Quest, I recommend downloading Cyberduck, a FTP/SFTP client that facilitates the transfer of files between your personal computer and Quest. See this page for instructions on how to correctly download and install Cyberduck. To log in to Quest, enter the following into the command line (or GitBASH): ssh -X netid@quest.it.northwestern.edu You will be prompted to enter your netID password (don’t worry, it is normal to not see the characters as you type!). In the command line, you can navigate Quest via Unix commands. For example, use cd .. to move up the file directory, then cd /projects/<allocation-id> to enter your project directory. Your project allocation ID will be a unique string given to you by Northwestern IT. 
It should be noted that your home directory (/home/<net-id>) is regularly backed up (up to 80 GB), but your project directory is not. 6.2 Using Qiime2 on Quest The Qiime2 software is currently available as a Docker image on DockerHub. On Quest, you can download this image via Singularity. Navigate to your project directory on Quest and run the following command: singularity pull --name qiime2-core2018-8.simg docker://qiime2/core:2018.8 This will install Qiime2 in the folder you’re currently in (which is hopefully your project directory). To use Qiime2, you will have to call the Singularity container in which it resides (/projects/<allocation-id>/qiime2-core2018-8.simg) every time you run a Qiime2 command. Let’s check that it installed correctly by running a help command: singularity exec /projects/<allocation-id>/qiime2-core2018-8.simg qiime --help If you received a bunch of “help” text as an output, congratulations, Qiime2 installed correctly and is ready to be used! Before you do anything, let’s lay some ground rules first. 6.3 Best practices in a shared computing environment Quest is used by hundreds of people on campus doing Very Important Things, so following a few guidelines is in all of our best interests. First and foremost, never move or delete files in any folder that isn’t yours that you somehow have access to. IT will find out about it, and you will be hearing from them if you do (and rightly so). With that out of the way, feel free to store files in your home and/or project directories. Though your project directory likely has more storage, your home directory is regularly backed up (up to 80 GB). I recommend storing programming scripts or other such files in your home directory for this reason. In Quest, you shouldn’t ever run jobs on the main head node or login node. This will slow Quest’s performance for everyone. You should instead submit every script as an “interactive” or “batch” job on designated compute nodes, following standard Slurm commands. 
6.4 Interactive jobs on Quest Interactive jobs are best used for short jobs. If you submit an interactive job, your command line will be tied up for the time it takes to process your submission. If you exit the command line, your job submission will be terminated. srun --account=<allocation-id> --time=<hh:mm:ss> --partition=<queue_name> --mem=<memory per node>G --pty bash -l Running the above command will do several things. The --account argument should be your allocation ID, which helps IT “bill” the number of compute hours to the right project. --time specifies how long you would like to have a node from which you can submit jobs. --partition is defined by the requested amount of --time and is used to determine how long your allocation request will be queued (see below). Finally, --mem is the amount of memory requested for the job. Submitting srun will bring you to an “allocation queue,” where you will eventually be given resources to run your Qiime2 commands. Running the same qiime --help command above in an interactive job will look something like: srun --account=a12345 --time=01:00:00 --partition=short --mem=18G --pty bash -l module load singularity singularity exec /projects/<allocation-id>/qiime2-core2018-8.simg qiime --help 6.5 Batch jobs on Quest It is generally more efficient to submit scripts on Quest as batch jobs. This allows you to disconnect from Quest without prematurely stopping your submission. This is helpful if you have multi-day commands such as classifier training using sklearn in Qiime2! A batch job submission script should have the following structure (save it with a .sh file extension and upload it to Quest). To run the same help command, write the following script and save it with a .sh file extension: #!/bin/bash #SBATCH -A a12345 # Allocation #SBATCH -p short # Queue #SBATCH -t 04:00:00 # Walltime/duration of the job #SBATCH -N 1 # Number of Nodes #SBATCH --mem=18G # Memory per node in GB needed for a job. 
Also see --mem-per-cpu #SBATCH --ntasks-per-node=6 # Number of Cores (Processors) #SBATCH --mail-user=<my_email> # Designate email address for job communications #SBATCH --mail-type=<event> # Event options are job BEGIN, END, NONE, FAIL, REQUEUE #SBATCH --job-name="help" # Name of job # unload any modules that carried over from your command line session module purge module load singularity singularity exec /projects/a12345/qiime2-core2018-8.simg qiime --help If this script is called qiime2-help.sh, simply navigate to the folder in Quest where it is stored and enter into the command line: sbatch qiime2-help.sh 6.6 A note on partitions Quest has several “partitions,” which are defined by how long you expect your job to take to run. Shorter jobs have shorter queues, so it would behoove you to choose the shortest partition possible. Keep in mind, however, that your job will terminate if it runs past the time you allotted to it! Visit this webpage to learn about the different partitions and their associated maximum walltimes. 6.7 More information For more information on Quest, visit the Quest User Guide, which is excellently documented by Northwestern IT. Happy Questing! "],
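To keep the allocation ID, partition, walltime, and memory consistent across batch scripts, the #SBATCH header can be generated from a few parameters instead of retyped each time. A minimal sketch; make_sbatch_header is an invented helper, and a12345 is the example allocation ID used above:

```shell
# Emit a Slurm batch-script header from allocation, partition, walltime,
# and memory (GB), matching the structure of the script above.
make_sbatch_header() {
  cat <<EOF
#!/bin/bash
#SBATCH -A $1        # Allocation
#SBATCH -p $2        # Queue
#SBATCH -t $3        # Walltime/duration of the job
#SBATCH -N 1         # Number of Nodes
#SBATCH --mem=${4}G  # Memory per node
EOF
}

make_sbatch_header a12345 short 04:00:00 18
```

Redirect the output into a new .sh file, append your module and singularity lines, and submit with sbatch as shown above.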
["DAPI.html", " 7 | DAPI + Cell Counting 7.1 Sample prep 7.2 Operating the Microscope 7.3 Cell Counting 7.4 Troubleshooting", " 7 | DAPI + Cell Counting Created By: Caitlin Casar on 2020-01-27 Last updated: 2020-02-13 7.1 Sample prep Prepare your cells by fixation for 30 minutes at room temperature. Add 0.1 ml 4% paraformaldehyde/PBS solution to a 2ml tube Add 0.9ml cell suspension Figure 7.1: Fix your sample. Use sterile foreceps to place a glass microfiber Whatman filter on the filter mount. Wet the filter with filtered DI water and vacuum so that it is damp. Use sterile forceps to place a polycarbonate Whatman filter on top of the glass microfiber filter. Wet the filter with filtered DI water and vacuum dry to flatten evenly. (Beware of the “taco” effect - the surface tension of the water on the polycarbonate filter may cause it to fold up on itself and it is very difficult to re-flatten). Place the filter column over the stack filters and clamp down. Add a few ml of filtered DI water to the column, then add your fixed cell suspension avoiding the sides of the column. The water helps to evenly disperse the cells across the filter. If you accidentally drop your cells on the side of the column, use filtered DI water to wash them down. Figure 7.2: Add your sample to the filter column. Vacuum the fluids through. Open the vacuum line on the flask before the next step or it may draw a vacuum and pull your dye through. Figure 7.3: Vacuum your sample through the filter. Dim the lights and close the curtain, DAPI is extremely photo-sensitive. Drip filtered DAPI down the side of the filter column so as not to disturb your cells on the filter. Add enough to completely cover the entire filter. Figure 7.4: Dye your sample with DAPI. Set a timer for ten minutes. After the timer goes off, vacuum the dye to dry the filter. Figure 7.5: Vacuum the dye through the filter. Add a drop of immersion oil to a glass slide. Use sterile forceps to place the filter onto the drop of oil. 
Add a drop of immersion oil on top of the filter. Do not touch the filter with the oil dropper or it may contaminate the oil. Place a glass coverslip over the filter, taking care to push out the bubbles. Figure 7.6: Prepare your slide. Add a drop of immersion oil to the top of the glass coverslip. You are now ready to place the slide on the microscope stage for use with the 100x oil objective. 7.2 Operating the Microscope Remove the dust cover and turn on the power source, then power on the microscope. Power on the X-Cite lamp. Once the lamp is turned on, do not power off for at least 30 minutes. Open the Zen software. If you open the software before the microscope has been powered on, the software will try to communicate with the scope and get confused and will need to be rebooted. Turn on the DAPI RL (reflected light). Carefully lower the objective onto the drop of oil on your coverslip. When the objective makes contact with the oil, you will see a flash of light. Continue to lower the objective until you encounter the focal plane. The focal plane tends to be far down in the Z-direction. If you go too far in the Z-direction you will crack your slide and potentially damage the objective lens, so take your time! To view the image on the software, pull the eyepiece rod halfway out to direct 50% light to your eye and 50% to the camera. Pull the rod all the way out to direct light 100% to the camera. Click the “Live” button on the Locate tab in the left panel and set the camera exposure. To acquire an image, click the “snap” button. If you want to save the images in their proprietary format, right click the image on the right panel and save. If you want them in jpg format, switch to the processing tab in the left panel and export the images to your file folder. You can add a scale bar with the graphics tool. When you’re finished, turn off the reflected light and remove your slide. Use lens paper to clean the oil off of the objective. 
Power off the software, then the microscope, then the power supply. Turn off the X-Cite lamp. Replace the dust cover on the microscope. 7.3 Cell Counting Bacteria densities (with the proper dilution) should be at least 30 organisms per field. Count at least 10 fields (to achieve a final count of 300 bacterial cells). Calculate final bacterial densities using the following equation (from Wetzel and Likens, 1991). Bacteria ml-1 = (membrane conversion factor * N * D) Membrane conversion factor = Filtration area/area of micrometer field N = Total number of bacteria counted/number of micrometer fields counted D = Dilution factor; volume of sample stained/total volume of sample available Figure 7.7: Glowing cells on a filter. Figure 7.8: Count a total of at least 300 cells, calculate average # cells/frame Figure 7.9: #cells/frame x #frames/filter = total cell count/volume 7.4 Troubleshooting This is a step by step guide to troubleshooting issues that arise during microscopy. Many things can go wrong and it’s important to approach it as a process of elimination. 1. Is something wrong with the microscope? The microscope kit includes a DAPI test slide. Can you see the specimen on the slide? Yes! There is no technical issue with the microscope. Proceed to step 2. No! Check that the eye piece shutter is open and the light path is directed 100% to the eye piece (not to the camera). Additionally, check that the Apotome filter is not blocking the light path. Once you’ve checked for these issues, if you still can’t see the specimen it may be time to call a technician. 2. Is something wrong with the antifade reagent? The antifade reagent has a shelf life. When precipitates form in the solution, it alters the refraction index of the solution and can obscure your image. Prepare an overnight culture of E. coli - do not use your own sample or an old culture of E. coli. Why? 
Because your sample may have issues (for example, your cell density may be too low), and because unhealthy cells do not stain as well as healthy cells. Check that you can see cells by mounting 15 microliters of the culture on a slide and imaging via DIC. If you cannot see cells via DIC, your culture is not turbid enough. Wait until the culture becomes more turbid, then repeat this step before moving on to the next step. Next, stain 100 microliters of the culture and mount the filter with Citifluor, then repeat but instead replace the Citifluor with immersion oil. Make sure you do not have two coverslips stacked on top of your filter and be sure to add immersion oil on top of the coverslip - failing to do this will result in a blurry image! Can you see cells on the filter without Citifluor but cannot see cells on the filter with Citifluor? Yes! The Citifluor is the issue. No! If you can’t see cells on either filter, the Citifluor is not the issue. Proceed to step 3. 3. Is something wrong with the dye? The working solution of DAPI in the fridge has a shelf life of ~3 months due to exposure to oxygen and light. Prepare a fresh solution of DAPI from the freezer stock. The freezer stock concentration is 1 mg/mL. Dilute 15 microliters of the stock solution in a 15 mL Falcon tube by topping up to 15 mL with filter-sterilized milliQ water for a final concentration of 1 microgram/mL. Stain 100 microliters of E. coli with both the old and new dye. Can you see cells on the filter dyed with new dye but cannot see cells on the filter dyed with old dye? Yes! The dye is the issue. No! If you cannot see cells on either filter, the dye is not the issue. Proceed to step 4. 4. What else could be wrong? The most common issues are with the scope, Citifluor, and dye. If these are not the issues, there are a few other things to check - inspect the filter tower components for defects or filter residue on the frit that might be preventing a seal. 
Try increasing/decreasing the excitation lamp intensity - the higher your intensity, the quicker the dye will bleach and prevent you from seeing cells. Test a batch of dye that works in another lab - maybe our freezer stock has degraded (this happens within a couple of years!). "],
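The Wetzel & Likens cell-density calculation in section 7.3 can be scripted so every count is reduced the same way. A worked sketch; the function name and all of the numbers below are hypothetical examples, not lab constants:

```shell
# Bacteria per ml = (filtration area / field area) * (cells counted /
# fields counted) * dilution factor, per the counting equation above.
cell_density() {
  awk -v fa="$1" -v field="$2" -v cells="$3" -v nf="$4" -v d="$5" \
    'BEGIN { printf "%.0f\n", (fa / field) * (cells / nf) * d }'
}

# e.g. 201 mm^2 filtration area, 0.01 mm^2 micrometer field, 350 cells
# counted over 10 fields, dilution factor 10:
cell_density 201 0.01 350 10 10   # prints 7035000
```

Substitute the real filter and micrometer dimensions for your setup; only the formula itself comes from the protocol.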
["submit-sequence-data.html", " 8 | Submit Sequence Data 8.1 New Submission 8.2 1 Subission Type 8.3 2 Submitter 8.4 3 Sequencing Technology 8.5 4 Sequences 8.6 5 Sequence Processing 8.7 6 Source Info 8.8 7 BioProject Info 8.9 8 BioSample Type 8.10 9 BioSample Attributes 8.11 10 References 8.12 11 Review & Submit 8.13 Correcting Submission", " 8 | Submit Sequence Data Created by: Caitlin Casar on 2020-10-08 Last updated: 2020-10-08 When you publish a manuscript with DNA sequence data, you will need to submit your data to the appropriate data repository. For 16S rRNA sequence data, we have submitted to the GenBank on the National Center for Biotechnology Information (NCBI) website. 8.1 New Submission Click here to navigate to the NCBI submission portal. You will need to create an account and log in. Click the blue 16S rRNA button and select GenBank from the dropdown menu. Click the blue submit button. Then, click the blue New Submission button. 8.2 1 Subission Type Fill out the Submission Type form, then click the blue Continue button. Fill out the general information form - select No for the Bioproject and Biosample sections, and set the release date sometime in the distant future if the data is not yet published. Click the blue Continue button. 8.3 2 Submitter Next, fill out the Subitter form with your contact information then click the blue Continue button. 8.4 3 Sequencing Technology Next, fill out the Sequencing Technology form with the appropriate sequencing platform and assembler. Click the blue Continue button. Select Upload a file using Excel or text format... from the Attributes options. Then, click on the Download Excel link to get the attribute submission template. 8.5 4 Sequences Fill out the Sequences form. Set the release date to sometime in the distant future if your data is not yet published. If you removed chimeric sequences, select yes from the Chimera check options and enter the program you used. 
Select the appropriate options for the rest of the form, then click the Choose File button to upload your sequence data. Note that these files should not exceed ~250MB and should be < 1 million reads. If you are submitting a very large dataset, you should split the data into chunks and submit each chunk separately. If you do this, be sure to contact GenBank support to communicate that these data comprise a single dataset. Note that the submission form mentions submitting OTUs - I was able to submit the unbinned reads with a written explanation of why I wanted to submit reads instead of OTUs. Once your file has finished uploading, click the blue Continue button. You will see a yellow bar at the top indicating the progress of the data processing. This may take ~10 minutes. 8.6 5 Sequence Processing Your sequences will now be screened; check the box and click the blue Continue button. 8.7 6 Source Info Next, fill out the Source Information form. If you did not already create a BioProject or BioSample, select no for both sections and click the blue Continue button. If the file has reads from multiple samples, select the Batch/Multiple BioSamples option. 8.8 7 BioProject Info Fill out the BioProject info, then click the blue Continue button. 8.9 8 BioSample Type Select the appropriate biosample type, then click the blue Continue button. 8.10 9 BioSample Attributes Fill out the BioSample attributes form, then click the blue Continue button. 8.11 10 References Fill out the References form, then click the blue Continue button. 8.12 11 Review & Submit Double-check your submission and make any necessary corrections, then click the blue Submit button. 8.13 Correcting Submission Once your submission has been processed, you may get an automated email saying there were errors that need to be corrected before the submission can be accepted. 
The email will have a link for you to follow to the submission that needs correcting; there you will find links to download HTML reports with details about the errors. Your submission will be automatically checked for chimeric sequences using NCBI’s algorithm. If chimeric sequences are found, you will need to remove these, then re-upload and re-submit the data. Be sure to also remove these reads from your mapping file. I think this is where you also get prompted for a mapping file if you didn’t submit one with your sequence data. There will be a template to download called biosample_assignment.tsv that has all of your sequence IDs in a tab-delimited file with a blank column for biosample_accession. You can add the appropriate accession numbers to each sequence ID quickly in R. Below is a script I wrote for the DeMMO fluid community data submission mapping file. This script assigns accession numbers to each sequence ID and removes chimeric sequences identified by the GenBank chimera checker listed in the seq error report: pacman::p_load(tidyverse) files <- list.files("orig_data", full.names = T, pattern = "biosample_assignment") chimeric_seqs <- list.files("orig_data", full.names = T, pattern = "chimericSeqs") metadata <- read_csv("orig_data/metadata.csv") %>% select(sample_id, site) %>% mutate(biosample_accession = recode(site, D1 = "SAMN12684770", D2 = "SAMN12684830", D3 = "SAMN12684819", D4 = "SAMN12684826", D5 = "SAMN12684828", D6 = "SAMN12684862", DuselD = "SAMN12768991", ambient.control = "SAMN12768991")) assign_biosample <- function(file){ samples <- read_delim(file, delim="\\t") %>% select(-biosample_accession) %>% mutate(sample_id = str_remove(Sequence_ID, "_.*")) %>% left_join(metadata) %>% select(Sequence_ID, biosample_accession) if(any(str_extract(chimeric_seqs, '\\\\d') %in% str_extract(file, '\\\\d'))){ chimeras <- chimeric_seqs[which(str_extract(chimeric_seqs, '\\\\d') %in% str_extract(file, '\\\\d'))] %>% read_delim(delim="\\t", col_names = F) samples 
<- samples %>% filter(!Sequence_ID %in% chimeras$X1) } samples %>% write_delim(paste0("Osburn2020_", str_extract(file, "\\\\d"), "_biosample_assignment.tsv"), delim = "\\t") } lapply(files, assign_biosample) Now that your data is submitted, check your email regularly for any communications from the GenBank submission team - failure to respond promptly to these emails may result in your submission being removed. "],
["metabolic.html", " 9 | Functional Gene Annotation with METABOLIC", " 9 | Functional Gene Annotation with METABOLIC Created by: Caitlin Casar on 2021-01-20 Last updated: 2021-01-20 This is a protocol for annotating your metagenome or binned genome data with METABOLIC. First you will need to install METABOLIC on Quest. Save the following to a .sh file to your home directory and run as a batch job on Quest: module load anaconda3 # create env conda create -n metabolic source activate metabolic # conda install the required tools conda install sambamba conda install bamtools conda install coverm # installs perl 5.32 conda install gtdbtk conda install diamond conda install bowtie2 conda install R=3.6.0 # conda install R dependencies conda install r-tidyverse=1.3.0 conda install r-diagram conda install r-ggthemes conda install r-ggalluvial conda install r-ggraph conda install r-openxlsx conda install r-pdftools # conda install perl dependencies conda install perl-data-dumper # downgrades perl to 5.26.2 conda install perl-excel-writer-xlsx conda install perl-posix conda install perl-getopt-long conda install perl-statistics-descriptive conda install perl-bioperl # get the one pesky perl dependency not available through conda conda install perl-app-cpanminus env PERL5LIB="" PERL_LOCAL_LIB_ROOT="" PERL_MM_OPT="" PERL_MB_OPT="" cpanm Array::Split #conda install gdown conda install gdown # conda install the perl package to solve the first (and so far only) error conda install perl-parallel-forkmanager #clone github repo git clone https://github.com/AnantharamanLab/METABOLIC.git #run setup script - make sure you are working from your metabolic conda environment or this script will fail cd METABOLIC sh run_to_setup.sh #test installation - this is optional but good for troubleshooting installation errors perl /home/cpc7770/METABOLIC/METABOLIC-G.pl -test true perl /home/cpc7770/METABOLIC/METABOLIC-C.pl -test true Now you’re ready to run METABOLIC on your metagenome data. 
If you want to run it on unbinned metagenome data, follow this example batch script. Note that for METABOLIC-G.pl you only need to specify the input and output directories. You can run it on either nucleotide files with a .fasta extension or amino acid files with a .faa extension. module load anaconda3 source activate metabolic #run on metagenomes - note that the file extensions MUST be '.fasta', '.fa' is not acceptable. perl /home/cpc7770/METABOLIC/METABOLIC-G.pl -in-gn /projects/p30777/metagenome_data/contigs -o /projects/p30777/metagenome_data/metabolic_metagenome_annotations Wanna run it on binned genomes? Use the METABOLIC-C.pl script. Note that for this script you need to specify the paired read files. You can run the following as a batch job on Quest: #run on binned genomes #rename from .fa to .fasta #cd /projects/p30777/metagenome_data/genomes #find . -name "*.fa" -exec rename .fa .fasta {} + #run on genomes - note that the file extensions MUST be '.fasta', '.fa' is not acceptable. #loop over directories for path in /projects/p30777/metagenome_data/genomes/*; do [ -d "${path}" ] || continue # if not a directory, skip dirname="$(basename "${path}")" echo annotating $dirname genomes... perl ~/METABOLIC/METABOLIC-C.pl -in-gn $path -r /projects/p30777/metagenome_data/reads/$dirname -o /projects/p30777/metagenome_data/metabolic_genome_annotations/$dirname done echo done! "]
]