This repository is a template for a reproducible data analysis project or paper. The default example uses R, Quarto, Git, and GitHub, but the structure is workflow-first so projects can add Python, Julia, shell scripts, or other tools without major reorganization.
This template also includes lightweight guidance for AI-supported work. The goal is not to make AI do the project for you. The goal is to make AI tools useful for coding, documentation, review, and troubleshooting while keeping the project transparent, reproducible, and human-reviewed.
The default example uses R, Quarto, GitHub, and a reference manager that can handle BibTeX. Zotero with the Better BibTeX plugin is a good choice.
It is also useful to have a word processor installed, such as MS Word or LibreOffice. To produce PDF output, you need a TeX distribution. TinyTeX is one option; see the Quarto PDF instructions.
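Before rendering to PDF, it can help to check whether a TeX engine is visible to R. The sketch below uses only base R. It assumes that finding any one of pdflatex, xelatex, or lualatex on the search path is a reasonable sign that a TeX distribution is installed. The helper name `find_tex_engines` is hypothetical and not part of the template.

```r
# Hypothetical helper (not part of the template): report which common TeX
# engines are visible on the PATH of the current R session.
find_tex_engines <- function(engines = c("pdflatex", "xelatex", "lualatex")) {
  paths <- Sys.which(engines)  # "" for engines that are not found
  names(paths)[nzchar(paths)]
}

found <- find_tex_engines()
if (length(found) == 0) {
  message("No TeX engine found; PDF output will likely fail until one is installed.")
} else {
  message("TeX engine(s) found: ", paste(found, collapse = ", "))
}
```

If nothing is found, the Quarto PDF instructions mentioned above describe how to install TinyTeX.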
The example files use these R packages: broom, dplyr, ggplot2, here, knitr, readxl, skimr, and tidyr. Install them before running the example workflow:

    install.packages(c("broom", "dplyr", "ggplot2", "here", "knitr",
                       "readxl", "skimr", "tidyr"))

The template comes with a folder structure and example files to show the kinds of content you would place in each folder. See the folder-specific readme files for more detail.
- ai/: AI workflow notes, prompt templates, review checklists, and a short AI-use log. See ai/readme-ai.md.
- assets/: static non-code materials such as references, CSL files, PDFs, and manually created figures. See assets/readme-assets.md.
- code/: analysis code organized by workflow stage. See code/readme-code.md.
- data/: raw, processed, private, and large data folders. See data/readme-data.md.
- products/: final or near-final deliverables such as reports, manuscripts, presentations, posters, and apps. See products/readme-products.md.
- results/: outputs generated by code, such as figures, tables, and model summaries. See results/readme-results.md.
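After creating a project from the template, a quick check that the top-level folders are all present can catch accidental deletions or renames. The following is a minimal base R sketch. The helper name `missing_folders` is hypothetical, and the check assumes it is run from the project root.

```r
# Top-level folders listed in this readme.
expected_folders <- c("ai", "assets", "code", "data", "products", "results")

# Hypothetical helper: return the expected folders that are missing under `root`.
missing_folders <- function(root = ".", folders = expected_folders) {
  folders[!dir.exists(file.path(root, folders))]
}

missing <- missing_folders()
if (length(missing) > 0) {
  message("Missing folders: ", paste(missing, collapse = ", "))
} else {
  message("All expected top-level folders are present.")
}
```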
Important project-level files:
- readme.md: this project overview.
- usage.md: practical instructions for running and reproducing the project.
- AGENTS.md: extra instructions for AI coding assistants and collaborators using AI tools.
- project-metadata.yml: concise metadata about the project, software, data, and AI-use policy.
Use descriptive file and folder names. In general:
- use lower-case names;
- separate words with hyphens (-);
- avoid spaces, underscores, and CamelCase unless a standard file name or file extension requires otherwise.
For example, this template has code/analysis/statistical-analysis.R.
Readme files are named by folder context, such as readme-code.md,
readme-data.md, and readme-exploration.md.
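The naming rules above can be checked mechanically. The sketch below is one possible check, not part of the template. The helper name `follows_naming_convention` is hypothetical, and the second and third example paths are made up for illustration. The check looks only at the file name without its extension, so an upper-case extension such as .R passes.

```r
# Hypothetical helper: TRUE when a file's base name (without extension) uses
# only lower-case letters, digits, and single hyphens between words.
follows_naming_convention <- function(paths) {
  stems <- tools::file_path_sans_ext(basename(paths))
  grepl("^[a-z0-9]+(-[a-z0-9]+)*$", stems)
}

follows_naming_convention(c(
  "code/analysis/statistical-analysis.R",  # TRUE
  "code/analysis/Statistical_Analysis.R",  # FALSE: upper case and underscore
  "data/raw data.xlsx"                     # FALSE: space
))
```

Standard file names such as AGENTS.md will be flagged by this check even though the rules allow them, so treat the result as a prompt for review and not as a hard failure.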
Document the software and package setup your project needs. The default example uses manually installed R packages because that is approachable for students and short projects.
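One low-effort way to document a manual setup is to save the R version and the versions of loaded packages to a text file. The sketch below uses only functions from base R and the utils package. The helper name `record_session_info` and the file name session-info.txt are assumptions, not part of the template.

```r
# Hypothetical helper: save the R version and loaded package versions to a
# plain text file that can be committed alongside the project.
record_session_info <- function(path = "session-info.txt") {
  writeLines(utils::capture.output(utils::sessionInfo()), path)
  invisible(path)
}

record_session_info()
```

Running this at the end of an analysis script, after all packages are loaded, gives the most complete record.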
For R projects, renv can help
manage R packages and improve long-term reproducibility. This template does not
enable renv by default because it adds complexity for new users and classroom
settings.
If you decide to use renv, commit the lockfile (renv.lock) and the files
needed to activate the environment, but do not commit the local package library.
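For orientation, a typical renv workflow uses three functions from the renv package. The sketch below wraps each in a small function so that nothing runs until you call it. The wrapper names are hypothetical, and every call assumes renv is installed.

```r
# A sketch of a typical renv workflow. Nothing here runs until you call the
# functions, and each one assumes the renv package is installed.

# Run once per project: creates renv.lock, .Rprofile, and the renv/ folder.
start_renv <- function() renv::init()

# Run after installing or updating packages: records versions in renv.lock.
update_lockfile <- function() renv::snapshot()

# Run on a new machine or fresh clone: installs the versions in renv.lock.
restore_packages <- function() renv::restore()
```

The renv/ folder created by `renv::init()` normally includes its own .gitignore that excludes the local package library; check that file before your first commit.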
For Python, Julia, or other languages, document the chosen environment manager
in this readme or in project-metadata.yml. Examples include virtual
environments, Conda, Poetry, Julia project files, or containers. These are
optional; use them when they solve a real project need.
This is a GitHub template repository. The best way to start a new project is to create a repository from this template.
For the example project, run the code manually. See usage.md for the run
order, what each code file does, and how to render products.
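Once the manual run order is familiar, a small driver function is one optional way to repeat it. This is a sketch, not something the template provides. The helper name `run_in_order` is hypothetical, and the first path in the example call is a placeholder; take the real run order from usage.md.

```r
# Hypothetical helper: source R scripts one at a time, in the order given,
# and stop early if a file is missing.
run_in_order <- function(scripts) {
  for (script in scripts) {
    if (!file.exists(script)) stop("Script not found: ", script)
    message("Running ", script)
    source(script, local = new.env())  # each script gets a fresh environment
  }
  invisible(scripts)
}

# Example call (the first path is a placeholder):
# run_in_order(c("code/processing/your-processing-script.R",
#                "code/analysis/statistical-analysis.R"))
```

Sourcing each script in a fresh environment means later scripts cannot silently depend on objects left in memory by earlier ones; they have to read saved files, which is closer to how a clean rerun behaves.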
AI tools can help explain code, draft first-pass code, improve documentation, suggest checks, and review for reproducibility problems. They should not be treated as the final authority on scientific claims, model choice, data privacy, citation accuracy, or interpretation of results.
When using AI tools:
- Point the tool to readme.md, usage.md, AGENTS.md, project-metadata.yml, and data/readme-data.md.
- Do not paste sensitive, private, regulated, or identifiable data into external AI tools unless the project owner has explicitly approved that workflow.
- Ask for small, reviewable changes.
- Rerun affected scripts or rerender affected products after meaningful changes.
- Add a short entry to ai/ai-use-log.md for meaningful AI-assisted work.
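Logging is easier to keep up when it takes one line. The sketch below appends a dated entry to the AI-use log using base R. The helper name `log_ai_use` and the entry format are assumptions; match whatever format ai/ai-use-log.md already uses.

```r
# Hypothetical helper: append a dated, one-line entry to the AI-use log.
# The entry format is an assumption; adapt it to the existing log file.
log_ai_use <- function(tool, summary, path = "ai/ai-use-log.md") {
  entry <- sprintf("- %s | %s | %s", format(Sys.Date(), "%Y-%m-%d"), tool, summary)
  cat(entry, "\n", sep = "", file = path, append = TRUE)
  invisible(entry)
}

# Example call (tool name and summary are placeholders):
# log_ai_use("assistant name", "Drafted plotting code; reviewed and reran by hand.")
```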
GitHub Actions and other automated workflows can be useful for advanced users. They are intentionally not enabled by default in this template because many users will be new to Git and GitHub.