One-command setup for reproducible research projects (local + GitHub).

sharears/mkproj

mkproj

mkproj workflow

One-command setup for reproducible research projects (local + GitHub)


TL;DR

mkproj lets you start a clean, reproducible research project with one command. It creates a standardized project structure, initializes Git safely, and synchronizes with GitHub without uploading data. Run mkproj <project_name> and start working—no Git setup, no folder chaos.



Why this workflow matters

1️⃣ Reproducibility starts at project creation

Many reproducibility problems begin before analysis even starts—unclear folder structures, mixed data and code, or missing documentation. This workflow enforces a clean separation between data, code, results, and logs from day one, making reproducibility a default rather than an afterthought.

2️⃣ It removes Git friction for researchers

Git errors, merge conflicts, and accidental data uploads are common pain points, especially for new lab members. mkproj automates best practices (correct .gitignore, safe merges, license handling), allowing users to focus on scientific reasoning instead of version-control mechanics.

3️⃣ It scales from a single project to an entire lab

By standardizing how projects are created, this workflow creates a shared structure and mental model across the lab. Students, postdocs, and collaborators can move between projects without relearning conventions, simplifying onboarding and long-term maintenance.


What this tool is (and is not)

✅ What it does

  • standardizes project setup
  • enforces good Git hygiene
  • keeps data out of GitHub
  • works with private or public repositories
  • supports later transfer to a lab organization

❌ What it does NOT do

  • upload raw data
  • track results or generated files
  • modify existing projects
  • rewrite Git history

GitHub is used only for code, documentation, and logs.


Repository contents

mkproj/
├── README.md
├── mkproj.sh    # project creation logic
└── install.sh   # installs mkproj into your shell
  • mkproj.sh contains the mkproj function
  • install.sh installs that function into ~/.zshrc or ~/.bashrc

Requirements

Required

  • Git

    git --version
  • GitHub CLI (gh)

    gh --version

Not required

  • GitHub Desktop
  • SSH keys (HTTPS is used)
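
The two required tools can also be checked in one step. The helper below is a minimal sketch and not part of mkproj; the name check_requirements is hypothetical. It only tests whether each command is on the PATH.

```shell
# Hypothetical helper (not shipped with mkproj): report which tools are on the PATH.
check_requirements() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Exit status is 0 only when every tool was found.
    return "$missing"
}

# Usage:
#   check_requirements git gh
```

A non-zero exit status means at least one tool still needs to be installed.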

Platform compatibility

This workflow runs on macOS, Linux, and Windows, provided bash or zsh is available (the installer writes to ~/.bashrc or ~/.zshrc).

Supported platforms

  • macOS

    • Default shell: zsh
    • Fully supported
  • Linux (Ubuntu, Debian, Fedora, CentOS, Arch, HPC systems)

    • Default shell: bash
    • Fully supported
  • Windows (with a Linux-like shell)

    • WSL (Windows Subsystem for Linux) — recommended
    • Git Bash — supported

Not supported

  • Windows PowerShell
  • Windows CMD

Installation (one-time)

Step 1: Clone this repository

git clone https://github.com/<your-username>/mkproj.git
cd mkproj

Step 2: Run the installer

chmod +x install.sh
./install.sh

This appends the mkproj function to your shell configuration file:

  • ~/.zshrc (zsh)
  • ~/.bashrc (bash)
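
For readers curious how an installer of this kind can choose the right file, here is a minimal sketch. It is an illustration under assumptions, not the contents of install.sh: the function names pick_rc_file and install_line are hypothetical, and the sketch adds a single line to the config file rather than the full function body, so that re-running it does not create duplicates.

```shell
# Hypothetical sketch: map a shell path (such as the value of $SHELL) to its config file.
pick_rc_file() {
    case "${1##*/}" in
        zsh)  echo "$HOME/.zshrc" ;;
        bash) echo "$HOME/.bashrc" ;;
        *)    echo "unsupported shell: $1" >&2; return 1 ;;
    esac
}

# Append a line to a config file only if that exact line is not already present.
install_line() {
    # $1: config file, $2: line to add
    touch "$1"
    grep -qxF "$2" "$1" || echo "$2" >> "$1"
}

# Usage (the path to mkproj.sh is an assumption):
#   rc="$(pick_rc_file "$SHELL")" && install_line "$rc" "source $HOME/mkproj/mkproj.sh"
```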

Step 3: Reload your shell

source ~/.zshrc   # or ~/.bashrc

First-time GitHub authentication (one-time)

Run:

gh auth login

When prompted, choose:

  • GitHub.com
  • HTTPS
  • Authenticate Git with your GitHub credentials → Yes
  • Login with a web browser

Verify:

gh auth status
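
Because gh auth status exits with a non-zero status when no account is logged in, it can serve as a guard in scripts. The wrapper below is a sketch under assumptions: require_gh_auth is a hypothetical name, and the GH_BIN variable is introduced here only so the check can be pointed at a different binary.

```shell
# Hypothetical guard: run the given command only when `gh auth status` succeeds.
# GH_BIN is an assumption of this sketch; it defaults to the real gh binary.
require_gh_auth() {
    if "${GH_BIN:-gh}" auth status >/dev/null 2>&1; then
        "$@"
    else
        echo "Not logged in to GitHub. Run: gh auth login" >&2
        return 1
    fi
}

# Usage:
#   require_gh_auth git push
```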

Everyday usage

To start a new project:

mkproj my_new_project

That’s it.


Project folder structure (generated by mkproj)

my_new_project/
├── README.md
├── .gitignore
├── data/
│   ├── raw/
│   ├── intermediate/
│   └── processed/
├── scripts/
├── notebooks/
├── metadata/
├── results/
└── logs/
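
The tree above can be reproduced by hand with a few lines of shell. This is a minimal sketch of the scaffolding step only; scaffold_project is a hypothetical name, mkproj.sh itself may work differently, and the sketch leaves out the .gitignore, Git initialization, and GitHub sync.

```shell
# Hypothetical sketch of the scaffolding step; not the actual mkproj implementation.
scaffold_project() {
    # $1: project name (created as a new directory under the current one)
    local name="$1"
    if [ -z "$name" ]; then
        echo "usage: scaffold_project <project_name>" >&2
        return 2
    fi
    # Never touch an existing path, in line with "does not modify existing projects".
    if [ -e "$name" ]; then
        echo "refusing to touch existing path: $name" >&2
        return 1
    fi
    mkdir -p "$name"/data/raw "$name"/data/intermediate "$name"/data/processed \
             "$name"/scripts "$name"/notebooks "$name"/metadata \
             "$name"/results "$name"/logs
    printf '# %s\n' "$name" > "$name/README.md"
    : > "$name/logs/run_log.md"   # start with an empty run log
}
```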

Folder descriptions

  • README.md — project overview and usage

  • .gitignore — prevents tracking large or derived files

  • data/ — local data only (never uploaded)

    • raw/ — original input data
    • intermediate/ — temporary processing files
    • processed/ — cleaned, analysis-ready data
  • scripts/ — reusable analysis and processing code

  • notebooks/ — exploratory Jupyter notebooks

  • metadata/ — data dictionaries and mappings

  • results/ — derived outputs (not tracked)

  • logs/ — daily analysis notes (run_log.md)
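
Keeping the run log current is easier with a small helper. The one below is a sketch, not part of mkproj: log_note is a hypothetical name, and the entry layout (a dated heading followed by the note) is an assumption about how run_log.md might be organized.

```shell
# Hypothetical helper: append a dated entry to logs/run_log.md.
log_note() {
    # $1: project directory; remaining arguments: the note text
    local project="$1"
    shift
    mkdir -p "$project/logs"
    printf '## %s\n\n%s\n\n' "$(date +%Y-%m-%d)" "$*" >> "$project/logs/run_log.md"
}

# Usage:
#   log_note my_new_project "Re-ran cleaning script on updated raw tables"
```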


Recommended daily workflow

git status
git add scripts notebooks metadata logs README.md
git commit -m "Describe what changed today"
git push

Do not commit data or results; the generated .gitignore is there to keep both out of the repository.
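
To confirm that data and results really are excluded, git check-ignore can be asked directly. The sketch below is illustrative: the two rules shown are an assumption about what a suitable .gitignore might contain, not a copy of the file mkproj generates, and the function names are hypothetical.

```shell
# Hypothetical sketch: write a minimal .gitignore that excludes data and results.
# The real file generated by mkproj may contain different or additional rules.
write_gitignore() {
    # $1: project directory
    cat > "$1/.gitignore" <<'EOF'
# Local-only data and derived outputs
data/
results/
EOF
}

# Succeeds when Git would ignore the given path inside the project.
is_ignored() {
    # $1: project directory (a Git repository), $2: path relative to it
    git -C "$1" check-ignore -q "$2"
}

# Usage:
#   is_ignored my_new_project data/raw/samples.csv && echo "ignored, safe to leave in place"
```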


Lab ownership and collaboration

  • Projects can start under a personal GitHub account
  • Repositories can later be transferred to a lab organization
  • Commit history and license are preserved
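
A transfer can be started from the repository settings on GitHub or, as sketched below, through the GitHub CLI and the REST transfer endpoint. This is not a mkproj feature. The function name transfer_repo and the DRY_RUN switch are hypothetical, and the account needs permission to create repositories in the target organization.

```shell
# Hypothetical sketch: print (or run) the gh call that requests a repository transfer.
transfer_repo() {
    # $1: current owner, $2: repository name, $3: target organization
    # Rebuild the positional parameters as the full gh command line.
    set -- gh api --method POST "repos/$1/$2/transfer" -f "new_owner=$3"
    # DRY_RUN defaults to 1, so nothing is sent unless it is set to 0.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage (names are placeholders):
#   transfer_repo alice my_new_project my-lab             # prints the command
#   DRY_RUN=0 transfer_repo alice my_new_project my-lab   # sends the request
```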

Safety guarantees

  • no force-push
  • no data uploads
  • no history rewriting
  • license-safe merges
  • works offline; a network connection is needed only for the GitHub sync

Who should use this?

  • computational research labs
  • graduate students and postdocs
  • undergraduate research groups
  • instructors teaching reproducible research

License

MIT License — free to use, modify, and share.


Acknowledgements

This project was developed with the assistance of ChatGPT (OpenAI) for workflow design and documentation refinement. The overview image at the top was generated by Gemini (Google).

