helpers is a public collection of Python utilities, configuration patterns, and developer tooling extracted from real production work at Causify.
It exists for one reason:
Make engineering repeatable.
Turn "tribal knowledge" into composable primitives: less glue code, fewer one-off scripts, and more reliable systems.
This repo is useful in three common contexts:
- Platform/product engineering: predictable building blocks for I/O, debugging, system operations, Git/Docker, dates/time, dataframes, and more.
- Repo hygiene at scale: the things large repos need to stay healthy, such as linting, import validation, CI utilities, and pre-commit workflows.
- LLM/agentic workflows: lightweight wrappers for completions, structured outputs, caching modes, and cost tracking.
flowchart LR
subgraph Lib["Python Library (helpers/)"]
direction TB
Core["`**Core Helpers**
hdbg · hio · hsystem · hgit · hdocker · hdatetime`"]
Data["`**Data Helpers**
hpandas and related modules`"]
LLM["`**LLM and Agentic Helpers**
hllm · hllm_cost · hllm_cli · hchatgpt`"]
end
subgraph Config["Configuration Patterns (config_root/)"]
Conf["`Env-aware config objects and builders`"]
end
subgraph Tooling["Tooling and Automation"]
direction TB
Scripts["`Dev Scripts
dev_scripts_helpers/`"]
Import["`Import Hygiene
import_check/`"]
Linters["`Linting Framework
linters/ · linters2/`"]
Tasks["`Repo Tasks
tasks.py · invoke.yaml`"]
end
subgraph Docs["Documentation and Examples"]
direction TB
Human["`Human Docs
docs/`"]
Mk["`MkDocs Site
docs_mkdocs/`"]
NB["`Tutorial Notebooks
helpers/notebooks/`"]
end
subgraph CI["Continuous Integration and Hygiene"]
direction TB
GH["`GitHub Workflows
.github/`"]
PC["`Pre-commit and Scanning
.pre-commit-config.yaml · semgrep`"]
end
Lib --> Config
Config --> Tooling
Tooling --> CI
Docs -.-> CI
Most engineering orgs accumulate dozens of tiny scripts and helper snippets. Over time that becomes:
- duplicated logic across repos,
- inconsistent behavior,
- hard-to-test workflows,
- brittle operations.
helpers is the opposite: small, well-scoped primitives that are easy to discover and safe to reuse. The code is intentionally "boring", because boring tools are what make fast teams.
These examples are deliberately small. The repo is optimized for everyday leverage.
import helpers.hdbg as hdbg
hdbg.dassert_eq(2 + 2, 4)
hdbg.dassert(10 > 3, "Math still works")

import helpers.hsystem as hsystem
out = hsystem.system_to_string("python --version")
print(out)

import helpers.hio as hio
payload = {"hello": "world"}
hio.to_json("tmp.json", payload)
loaded = hio.from_json("tmp.json")
print(loaded)

import pandas as pd
import helpers.hpandas as hpandas
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(hpandas.convert_df_to_json_string(df))

helpers.hllm supports providers like OpenAI and OpenRouter (via openrouter/<model>). It also supports caching modes and structured output.
from pydantic import BaseModel
import helpers.hllm as hllm
import helpers.hllm_cost as hllmcost
class Summary(BaseModel):
    summary: str
    risks: list[str]
# Track cost for a specific provider/model.
tracker = hllmcost.LLMCostTracker(provider_name="openai", model="gpt-4o-mini")
result = hllm.get_structured_completion(
    "Summarize key risks of shipping without observability.",
    response_format=Summary,
    system_prompt="Be concise and practical.",
    model="gpt-4o-mini",
    cache_mode="DISABLE_CACHE",
    print_cost=True,
    cost_tracker=tracker,
)
print(result.summary)
print("Cost so far ($):", tracker.get_current_cost())

The main Python library lives in helpers/. Many modules follow the h<name>.py naming pattern (e.g., hdbg.py, hio.py, hsystem.py) to make discovery fast and avoid collisions.
A few high-signal examples:
- helpers/hdbg.py: defensive checks and assertions
- helpers/hsystem.py: safe system command execution
- helpers/hio.py: practical file I/O helpers
- helpers/hgit.py: git utilities
- helpers/hdocker.py: docker helpers
- helpers/henv.py: environment utilities
- helpers/hdatetime.py: date/time utilities
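The h<name>.py naming convention also makes modules easy to enumerate programmatically. The following is a minimal sketch, not part of the repo: the function name `find_helper_modules` and the regular expression are assumptions made for illustration, and the demo runs against a temporary directory rather than the real helpers/ tree.

```python
import re
import tempfile
from pathlib import Path

# Matches file names such as hdbg.py or hio.py: an "h" prefix followed by a
# lowercase name. This regex is an illustrative assumption, not a repo rule.
_HELPER_MODULE_RE = re.compile(r"^h[a-z0-9_]+\.py$")


def find_helper_modules(root: Path) -> list[str]:
    """Return sorted module names under `root` that follow the h<name>.py pattern."""
    return sorted(
        path.stem
        for path in root.iterdir()
        if path.is_file() and _HELPER_MODULE_RE.match(path.name)
    )


# Demonstrate on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    root = Path(tmp_dir)
    for name in ("hdbg.py", "hio.py", "hsystem.py", "README.md", "tasks.py"):
        (root / name).touch()
    modules = find_helper_modules(root)
    print(modules)
```

Pointing `find_helper_modules` at a checkout's helpers/ directory should list the available modules, assuming the directory layout described above.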
helpers/hpandas.py re-exports a suite of pandas helpers implemented across hpandas_* modules (conversion, compare, display, cleaning, assertions, IO, etc.). This makes the most common "pandas hygiene" tasks consistent across projects.
This repo includes lightweight building blocks for LLM workflows:
- helpers/hllm.py: completions + structured outputs + caching modes
- helpers/hllm_cost.py: cost tracking and provider cost utilities (incl. LLMCostTracker)
- helpers/hllm_cli.py: CLI-oriented LLM workflows
- helpers/hchatgpt.py / helpers/hchatgpt_instructions.py: utilities around assistants/instructions workflows
Tutorial notebooks live under helpers/notebooks/ (see hllm.tutorial.py).
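To make the idea of cost tracking concrete, here is a small self-contained sketch of how per-call costs can be accumulated from token counts. It is not the LLMCostTracker implementation: the class `SimpleCostTracker`, its fields, and the prices are all hypothetical and chosen only so the arithmetic is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class SimpleCostTracker:
    """Accumulate the dollar cost of LLM calls from token counts.

    Hypothetical stand-in for illustration; not the LLMCostTracker API.
    """

    # Prices in dollars per 1,000,000 tokens, supplied by the caller.
    input_price_per_mtok: float
    output_price_per_mtok: float
    total_cost: float = 0.0

    def add_call(self, input_tokens: int, output_tokens: int) -> float:
        """Record one call and return its cost in dollars."""
        cost = (
            input_tokens * self.input_price_per_mtok
            + output_tokens * self.output_price_per_mtok
        ) / 1_000_000
        self.total_cost += cost
        return cost


# Placeholder prices, not real provider pricing.
demo_tracker = SimpleCostTracker(input_price_per_mtok=1.0, output_price_per_mtok=2.0)
first_cost = demo_tracker.add_call(input_tokens=1_000, output_tokens=500)
second_cost = demo_tracker.add_call(input_tokens=2_000, output_tokens=1_000)
print(f"Cost so far ($): {demo_tracker.total_cost:.6f}")
```

Keeping the running total on a tracker object, rather than in a module-level variable, lets several providers or models be tracked side by side.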
config_root/ holds reusable configuration patterns: config objects and builders, plus environment-aware configuration.
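As an illustration of what an environment-aware config builder can look like, the sketch below builds an immutable settings object from environment variables with per-environment defaults. Everything in it is an assumption made for the example: `AppConfig`, `build_config`, the `APP_ENV` and `APP_LOG_LEVEL` variables, and the default values are hypothetical and are not taken from config_root/.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    """Immutable settings object; all field names here are hypothetical."""

    env: str
    log_level: str
    cache_enabled: bool


# Per-environment defaults (illustrative values only).
_DEFAULTS = {
    "dev": {"log_level": "DEBUG", "cache_enabled": False},
    "prod": {"log_level": "INFO", "cache_enabled": True},
}


def build_config(environ=None) -> AppConfig:
    """Build a config from environment variables, falling back to defaults."""
    environ = os.environ if environ is None else environ
    env = environ.get("APP_ENV", "dev")
    if env not in _DEFAULTS:
        raise ValueError(f"Unknown environment: {env!r}")
    defaults = _DEFAULTS[env]
    # An explicit APP_LOG_LEVEL overrides the per-environment default.
    log_level = environ.get("APP_LOG_LEVEL", defaults["log_level"])
    return AppConfig(
        env=env,
        log_level=log_level,
        cache_enabled=defaults["cache_enabled"],
    )


# Pass a plain dict instead of os.environ to keep the demo deterministic.
dev_config = build_config({})
prod_config = build_config({"APP_ENV": "prod", "APP_LOG_LEVEL": "WARNING"})
print(dev_config)
print(prod_config)
```

Accepting the environment mapping as a parameter keeps the builder easy to test, since no global state has to be patched.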
This repo also ships "how we keep repos healthy" primitives:
- dev_scripts_helpers/: automation scripts for common workflows
- import_check/: import hygiene validation
- linters/, linters2/: lint framework and configs
- .github/: CI workflows and checks
- .claude/: Claude Code configuration and hooks
- CLAUDE.md: architecture overview and development patterns for Claude Code
- conftest.py: pytest configuration and shared test fixtures
- instr.md: development instructions and task specifications
- main_pytest.py: main pytest runner and test execution controller
- tasks.py: entry point for the pyinvoke task automation system
- pre-commit and scanning configs (.pre-commit-config.yaml, .semgrepignore, etc.)
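One possible form of import hygiene validation is a static scan that flags imports of disallowed packages. The sketch below uses the standard-library `ast` module; the function `find_forbidden_imports` and the forbidden set are hypothetical, and the actual checks performed by import_check/ may differ.

```python
import ast


def find_forbidden_imports(source: str, forbidden: set[str]) -> list[tuple[int, str]]:
    """Return (line number, module) pairs for imports of forbidden top-level packages."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 restricts the check to absolute imports.
            modules = [node.module]
        else:
            continue
        for module in modules:
            # Compare on the top-level package, e.g. "os.path" -> "os".
            if module.split(".")[0] in forbidden:
                violations.append((node.lineno, module))
    return sorted(violations)


example_source = """\
import json
import pickle
from subprocess import run
from os.path import join
"""
violations = find_forbidden_imports(example_source, {"pickle", "subprocess"})
print(violations)
```

Because the scan parses source text without importing it, it can run in CI or a pre-commit hook without executing any project code.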
- Small modules > big frameworks: helpers should be composable and easy to understand.
- Safety by default: defensive checks, predictable failures, sensible defaults.
- Fast discovery: naming conventions encourage "find the right tool quickly."
- Reproducibility: if it matters, it should be runnable in CI and documented.
- Low dependency footprint: dependencies are added only when they buy real leverage.
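The "safety by default" principle can be illustrated with a small defensive check that fails fast and reports both values. This is a sketch only: `check_eq` is a hypothetical function written for this example and is not the implementation of `hdbg.dassert_eq`.

```python
def check_eq(actual, expected, msg: str = "") -> None:
    """Raise AssertionError with both values in the message if they differ."""
    if actual != expected:
        details = f"{actual!r} != {expected!r}"
        raise AssertionError(f"{msg}: {details}" if msg else details)


check_eq(2 + 2, 4)  # Passes silently.

try:
    check_eq(2 + 2, 5, "Arithmetic check failed")
except AssertionError as exc:
    error_message = str(exc)
    print(error_message)
```

Unlike a bare `assert` statement, an explicit `raise` is not stripped when Python runs with the `-O` flag, so the check stays active in optimized runs.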
git submodule add https://github.com/causify-ai/helpers.git helpers_root
git submodule update --init --recursive

python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -e .

If contributing, enable pre-commit hooks:
pip install pre-commit
pre-commit install

Run tests:
pytest -q
# or
python main_pytest.py

Run pre-commit:
pre-commit run -a

List repo automation tasks (if using invoke):
pip install invoke
invoke -l

Human docs live under docs/. A browsable site is maintained under docs_mkdocs/.
Serve MkDocs locally:
pip install mkdocs mkdocs-material
cd docs_mkdocs
mkdocs serve

- Keep changes small and reviewable.
- Add tests when behavior changes.
- Prefer reusable utilities over one-off scripts.
- Keep backwards compatibility in mind for downstream consumers.
This repo includes secret scanning and standard hygiene checks. Do not commit secrets. If you suspect a leak, rotate the affected credentials and open an incident.
See LICENSE.