NW-PaGe/measles

Measles Virus (MeV) Washington-Focused Build

Build Overview

  • Build Name: Measles Virus Washington-Focused Build

  • Pathogen/Strain: Measles virus (MeV)

  • Scope: Full-genome sequences representing all strains (A-H)

  • Purpose: This repository contains the Nextstrain build for Measles virus (MeV). It includes sequences from Washington State, with contextual sequences from North America and the rest of the world added through a tiered subsampling scheme. Full genomes are curated to infer strain, assess patterns of epidemiological linkage, and explore sources of introduction.

  • Nextstrain Build/s Location/s: https://nextstrain.org/groups/wadoh/measles/wa/genome/

Pathogen Epidemiology

  • Overview:

    • MeV is an RNA virus in the family Paramyxoviridae.
    • Currently, B3 and D8 are the only genotypes circulating globally. Genotype A strains are vaccine strains.
    • Transmission occurs through contact, either directly or through aerosolized nasal secretions.
    • Taxonomic designations include clades (A-H) and subclades (numbered).

  • Geographic Distribution and Seasonality

    • MeV is distributed globally, with the highest case numbers in areas with low vaccination coverage. Children of school age are at higher risk.
    • In temperate regions, higher transmission can occur based on school attendance patterns, while agricultural patterns can drive transmission in less developed areas.
  • Public health importance

    • Why are genomic data useful for this pathogen:
      • Full-genome data allows for outbreak identification and investigation.
      • Most genotypes have been declared inactive. Genomic surveillance can help detect emergence of novel genotypes, or reemergence of previously circulating genotypes.
      • Identification of genotypes outside of clade A can rule out vaccination-related infections.
      • Full genomes can assist in monitoring the effectiveness of established PCR-based diagnostic assays.
      • Vaccine escape has not been observed, but should be monitored.
  • Additional Resources

Scientific Decisions

Nextstrain builds are designed for specific purposes and not all types of builds for a particular pathogen will answer the same questions. The following are critical decisions that were made during the development of this build that should be kept in mind when analyzing the data and using this build.

  • Nomenclature: The nomenclature used in this build to designate clade names is determined by the Global Measles and Rubella Laboratory Network.
  • Subsampling: This build incorporates all known sequences from Washington State, a maximum of 2,000 additional sequences from North America and a maximum of 2,000 additional samples of global origin.
  • Root selection: The root sequence is not specified, but inferred by augur ancestral.
  • Reference selection: Ichinose B95a strain (GenBank accession NC_001498) was the reference for full-genome alignment.
  • Inclusion/Exclusion: Strains isolated from subacute sclerosing panencephalitis (SSPE) cases are excluded: they are hypermutated, which prevents strain designation, and they are not typically shed, making them highly atypical overall. Vaccine reference strains (genotype A) are force-included after all other subsampling procedures.
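
The subsampling and inclusion rules above can be illustrated with a minimal Python sketch. This is not the build's implementation (the real workflow is driven by the config file and Augur). The record fields `strain`, `division`, and `region`, the function name `tiered_subsample`, and the use of plain random sampling within each tier are all assumptions made for illustration.

```python
import random


def tiered_subsample(records, focal_division="Washington",
                     regional_name="North America",
                     max_regional=2000, max_global=2000,
                     force_include=(), seed=0):
    """Return the set of strain names kept by a three-tier scheme (illustrative only)."""
    rng = random.Random(seed)

    # Tier 1: keep every sequence from the focal state.
    focal = [r for r in records if r["division"] == focal_division]

    # Tier 2: up to max_regional additional sequences from the surrounding region.
    regional_pool = [r for r in records
                     if r["region"] == regional_name
                     and r["division"] != focal_division]
    regional = rng.sample(regional_pool, min(max_regional, len(regional_pool)))

    # Tier 3: up to max_global additional sequences.
    # Assumption: "global" here means everything outside the region.
    global_pool = [r for r in records if r["region"] != regional_name]
    global_ = rng.sample(global_pool, min(max_global, len(global_pool)))

    kept = {r["strain"] for r in focal + regional + global_}

    # Force-included strains (e.g. vaccine references) are added after
    # all other subsampling, so the caps above never drop them.
    kept.update(force_include)
    return kept
```

Because the force-include step runs last, the vaccine reference strains appear in the result regardless of the per-tier caps.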

Data Sources & Inputs

This build pulls data from the Nextstrain dataset, which is curated from NCBI and Pathoplexus. Note that the Pathoplexus data include restricted data, whose use must comply with the Terms of Use.

  • Sequence Data: All sequence data originate from Pathoplexus and NCBI.
  • Metadata: All metadata originate from Pathoplexus and NCBI.
  • Expected Inputs:
    • measles/phylogenetic/data/sequences.fasta.zst (containing viral genome sequences)
    • measles/phylogenetic/data/metadata.tsv.zst (with relevant sample information)

Setup & Dependencies

Installation

Ensure that you have Nextstrain installed.

To check that Nextstrain is installed:

nextstrain check-setup

Clone the repository:

git clone https://github.com/DOH-DAH0303/measles.git
cd measles/phylogenetic/

Run the Washington-Focused Build

Make sure you are located in the build folder phylogenetic before running the build command:

nextstrain build . --configfile build-configs/state_focused/public-config.yaml

When you run the build with the above command, Nextstrain uses Snakemake as the workflow manager. The Snakefile defines, step by step, how the raw input data (sequences and metadata) are processed. Nextstrain builds are powered by Augur (for phylogenetics) and Auspice (for visualization); Snakemake automates the execution of these steps based on file dependencies.
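
To make "based on file dependencies" concrete, the following Python sketch mimics the basic decision a workflow manager makes: a step is rerun only when its output is missing or older than one of its inputs. It is a simplified illustration and not Snakemake's actual algorithm; the function name `needs_rerun` is hypothetical.

```python
from pathlib import Path


def needs_rerun(inputs, output):
    """Return True if `output` is missing or older than any input.

    Raises FileNotFoundError if an input does not exist, since a step
    cannot run without its inputs.
    """
    out = Path(output)
    if not out.exists():
        return True
    out_mtime = out.stat().st_mtime
    # Any input modified after the output was written makes it stale.
    return any(Path(i).stat().st_mtime > out_mtime for i in inputs)
```

Chaining this check across steps, where one step's output is the next step's input, is what lets a rerun skip work that is already up to date.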

Alternative configuration files can be specified to customize the workflow. In this case, --configfile build-configs/state_focused/public-config.yaml tweaks the workflow so that samples are pulled preferentially from Washington State, then North America, then globally, with the number of samples from each tier specified in public-config.yaml.

Run the Build with Test Data (Optional)

An alternative configuration file is present for running the phylogenetic workflow on a smaller example dataset. In this case, --configfile build-configs/ci/config.yaml tweaks the workflow so that the dataset located in phylogenetic/example_data is copied to phylogenetic/data, bypassing the default steps of downloading and decompressing the full dataset provided by Nextstrain.

nextstrain build . --configfile build-configs/ci/config.yaml

Expected Outputs

The file structure of the phylogenetic/ directory is as follows, with * marking the folders that are the build's expected outputs.

.
├── README.md
├── Snakefile
├── auspice*
├── build-configs
├── data
├── defaults
├── example_data
├── results*
└── rules

More details on the file structure of this build can be found here

After successfully running the build there will be two output folders containing the build results.

  • The auspice/ folder contains measles_genome.json. This is the final result, viewable in Auspice.
  • The results/ folder contains the genome folder, which holds intermediate outputs from the respective workflows.
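
As a quick sanity check on the output, the tips in the Auspice JSON can be listed with a few lines of Python. This sketch assumes the file follows the Auspice v2 dataset layout, with a top-level `tree` object whose nodes carry a `name` and an optional `children` list; the function names `tip_names` and `load_tips` are hypothetical.

```python
import json


def tip_names(node):
    """Return the names of all leaf nodes beneath `node`.

    Uses an explicit stack instead of recursion so deep trees are safe.
    """
    tips, stack = [], [node]
    while stack:
        current = stack.pop()
        children = current.get("children", [])
        if children:
            stack.extend(children)
        else:
            tips.append(current["name"])
    return tips


def load_tips(path="auspice/measles_genome.json"):
    """Read an Auspice dataset JSON and return its tip names."""
    with open(path) as handle:
        dataset = json.load(handle)
    return tip_names(dataset["tree"])
```

Calling `len(load_tips())` from the phylogenetic/ folder after a successful build gives the number of sequences in the final tree.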

Visualize Results

  • Option 1: Open auspice.us in a web browser, and drop in measles_genome.json as input.

  • Option 2: Run nextstrain view . from your measles/phylogenetic/ folder.

  • To learn more about how to make epidemiologic inferences from phylogenetic trees, see The Applied Genomic Epidemiology Handbook.

Customization for Local Adaptation

This build can be customized for use by other states. This is configurable by editing a single file, measles/phylogenetic/build-configs/state_focused/public-config.yaml. To change the focal state, change the division in the config file under [custom_subsample] -> [genome] -> [samples] -> [state] -> [query]. Replace "Washington" with your state of interest.
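
The same change can be sketched programmatically. The Python example below works on the config after it has been loaded into a dictionary (for instance with a YAML parser) and follows the key path described above. The function name `set_focal_state` and the example query string are assumptions for illustration; check the actual query in public-config.yaml before relying on this.

```python
import copy


def set_focal_state(config, new_state, old_state="Washington"):
    """Return a copy of `config` whose focal-state query targets `new_state`."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    state = updated["custom_subsample"]["genome"]["samples"]["state"]
    state["query"] = state["query"].replace(old_state, new_state)
    return updated


# Hypothetical config fragment; the real query string may differ.
example_config = {
    "custom_subsample": {
        "genome": {
            "samples": {
                "state": {"query": "division == 'Washington'"},
            }
        }
    }
}

oregon_config = set_focal_state(example_config, "Oregon")
```

Returning a modified copy keeps the original configuration available for comparison.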

Contributing

Please submit any questions to our Discussions page. Software issues and requests can be logged as a GitHub issue.

License

This project is licensed under a modified GPL-3.0 License. You may use, modify, and distribute this work, but commercial use is strictly prohibited without prior written permission.

Acknowledgements

We gratefully acknowledge the contributions of the AMD teams (Microbiology, MEP, Bioinformatics, DIQA), Washington State Public Health Laboratories (WA PHL), and our colleagues at the Washington State Department of Health, whose expertise and dedication made this work possible. We also extend our sincere thanks to the Nextstrain development team for their ongoing collaboration and support.
