misc feature requests #15

@ofrei

CLI syntax refinements

  • Consider console-command aliases without the .py suffix and with the input
    format as a subcommand:
    • project_payload.py --input-format bfile ... -> project_payload bfile ...
    • prepare_variants.py --input-format bim ... -> prepare_variants bim ...
    • The same question applies to prepare_variants_sharded.py.
  • Decide singular/plural spelling before adding aliases. Current repo/docs use
    project_payload.py; the idea was phrased as project_payloads.py /
    project_payloads.
  • If implemented, preserve the existing .py entry points as compatibility
    aliases unless a deliberate breaking-release plan says otherwise.
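
A minimal sketch of how a suffix-free alias could forward to the existing entry point, assuming the alias only rewrites the leading subcommand into the current --input-format flag and passes everything else through untouched. The function name translate_alias and the allowed-format sets are hypothetical; only the format names bfile and bim come from the examples above.

```python
# Hypothetical helper: rewrite `project_payload bfile ...` into the argument
# vector the existing .py entry point already accepts.
def translate_alias(argv, allowed_formats):
    """Turn ['bfile', ...] into ['--input-format', 'bfile', ...]."""
    if not argv or argv[0] not in allowed_formats:
        raise ValueError(
            "expected an input format as the first argument, one of: "
            + ", ".join(sorted(allowed_formats))
        )
    # Everything after the subcommand is forwarded unchanged, so the
    # existing flags keep working behind the alias.
    return ["--input-format", argv[0], *argv[1:]]


if __name__ == "__main__":
    # Assumed format set for illustration; the real list lives in the tools.
    print(translate_alias(["bim", "--some-flag", "value"], {"bim"}))
```

Because the alias only produces an argument vector, the existing .py scripts remain the single implementation and stay usable as compatibility entry points.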

Payload input naming

  • Revisit project_payload.py --input ...bim / --input ...pvar. For payload
    projection, a .bim or .pvar path is being used as a PLINK payload prefix
    locator, which reads strangely because the actual payload is the file set
    (.bed/.bim/.fam or .pgen/.pvar/.psam).
  • Same issue exists in the underlying apply tools:
    apply_vmap_to_bfile.py and apply_vmap_to_pfile.py.
  • Possible direction: expose payload-prefix terminology at the wrapper level
    while keeping compatibility with existing --input / --source-prefix
    flags.
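
To make the prefix-versus-file distinction concrete, here is a rough sketch of resolving a .bim or .pvar locator into the payload prefix and the file set it stands for. The helper name payload_file_set is hypothetical, and the sketch assumes uncompressed files that share one prefix; it is not how the existing tools resolve paths.

```python
from pathlib import Path

# File sets named in the issue text: a .bim locator stands for a PLINK 1
# payload, a .pvar locator for a PLINK 2 payload.
PAYLOAD_MEMBERS = {
    ".bim": (".bed", ".bim", ".fam"),
    ".pvar": (".pgen", ".pvar", ".psam"),
}


def payload_file_set(locator):
    """Return (prefix, member paths) for a .bim/.pvar locator path."""
    path = Path(locator)
    members = PAYLOAD_MEMBERS.get(path.suffix)
    if members is None:
        raise ValueError(f"not a .bim or .pvar locator: {locator}")
    # Strip only the final suffix so dotted names like study.v2.bim
    # keep their full prefix (study.v2).
    prefix = path.with_suffix("")
    # Append suffixes as strings; with_suffix would clobber ".v2".
    return prefix, [Path(str(prefix) + suffix) for suffix in members]
```

A wrapper-level --payload-prefix style option (name not decided here) could accept the prefix directly and derive the same file set, while --input keeps accepting the locator form.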

derived_from contract

  • Tighten the normative contract for derived_from in .vmap metadata.
  • Consider allowing project_payload.py --input to be omitted when the source
    payload path can be resolved from the mapping object's derived_from.
  • Sumstats are trickier because payload resolution may require both raw input
    and metadata. Consider whether derived_from should be a copyable single
    string encoding a pair such as input=<path>;metadata=<path>, or whether the
    object metadata should use structured fields.
  • Clarify how derived_from propagates through preparation, sharded
    preparation, matching, intersection, and projection.
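
For the single-string option, a minimal sketch of parsing and formatting the input=<path>;metadata=<path> form is shown below. The function names are hypothetical and the grammar (semicolon-separated key=value pairs, no escaping) is an assumption taken from the example above, not an agreed contract.

```python
def parse_derived_from(value):
    """Parse 'input=<path>;metadata=<path>' into a dict of fields."""
    fields = {}
    for part in value.split(";"):
        # Split on the first '=' only, so paths may themselves contain '='.
        key, sep, path = part.partition("=")
        if not sep or not key or not path:
            raise ValueError(f"malformed derived_from component: {part!r}")
        if key in fields:
            raise ValueError(f"duplicate derived_from key: {key!r}")
        fields[key] = path
    return fields


def format_derived_from(fields):
    """Inverse of parse_derived_from for values without ';' in them."""
    return ";".join(f"{key}={path}" for key, path in fields.items())
```

One trade-off this makes visible: a path containing ';' cannot round-trip without an escaping rule, which is an argument for structured metadata fields, while the single string stays easy to copy between commands.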

Combined sumstats prepare/project wrapper

  • Consider a wrapper for the common summary-statistics sequence:
    prepare_variants.py followed by project_payload.py.
  • An optional --target could insert a constraining step between prepare and
    project:
    prepare raw sumstats -> optionally match/intersect to target -> project
    cleaned or raw sumstats output.
  • Define whether this is only a convenience wrapper around existing retained
    artifacts or whether it owns its own retained intermediate namespace,
    --resume, and --force behavior.
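
One way to frame the open question is to write down the step plan and a per-step decision rule. The sketch below is purely illustrative: the step labels, function names, and the assumed semantics (--force always reruns, --resume reuses an existing retained artifact, neither flag with an existing artifact is an error) are hypothetical and would need to be settled by the design.

```python
def plan_steps(target=None):
    """Ordered step labels for the combined wrapper."""
    steps = ["prepare"]
    if target is not None:
        # Only constrain to a target when one was requested.
        steps.append("match_or_intersect")
    steps.append("project")
    return steps


def decide_step(output_exists, resume=False, force=False):
    """Return 'run', 'skip', or 'error' for one step (assumed semantics)."""
    if force:
        return "run"      # overwrite whatever is retained
    if output_exists and resume:
        return "skip"     # reuse the retained intermediate
    if output_exists:
        return "error"    # refuse to clobber without an explicit flag
    return "run"
```

If the wrapper is only a convenience layer, decide_step would defer to whatever the underlying tools already do with their own retained artifacts; if it owns its namespace, this rule would live in the wrapper itself.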
