Anthony Maio

Independent AI safety researcher. Former Staff Software Engineer (20 years shipping enterprise systems). Took a sabbatical to work on problems I think matter more.

Focus areas: scalable oversight, agentic system evaluation, audit-shielding detection, inter-agent coordination security.


Research

Scalable oversight & verification failure

How weak verifiers (humans, smaller models) fail to catch persuasive but wrong reasoning. The CMED benchmark + HDCS swarm architecture.

Agentic safety & audit-shielding

How systems behave differently under benchmark-shaped prompts vs. realistic high-trust contexts. Model organisms of misalignment.

Cognitive architecture & continuity

Long-horizon agent coherence, memory systems, epistemic stress detection.

Inter-agent communication

Protocol design targeting tokenization economics. Efficiency + detectability for coordination channels.
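As a rough illustration of what "tokenization economics" means here (a hypothetical sketch, not SLIPCore's actual wire format): the same message costs far fewer tokens when agents share a compact schema, which is also what makes such channels both efficient and harder to monitor.

```python
# Hypothetical sketch of the tokenization-economics tradeoff:
# a compact, pre-agreed message format costs far fewer tokens than verbose
# JSON, which also changes what a monitor watching the channel has to detect.
# (Illustrative only; field names and encoding are not SLIPCore's real format.)
import json

def rough_token_count(text: str) -> int:
    """Crude proxy for a BPE tokenizer: roughly 4 characters per token."""
    return max(1, len(text) // 4)

verbose_msg = json.dumps({
    "sender": "planner",
    "recipient": "executor",
    "intent": "request_tool_call",
    "tool": "search",
    "arguments": {"query": "quarterly revenue 2024"},
})

# Compact form: field positions carry meaning because both agents share the schema.
compact_msg = "p>e|tc|search|quarterly revenue 2024"

print(f"verbose JSON: ~{rough_token_count(verbose_msg)} tokens")
print(f"compact form: ~{rough_token_count(compact_msg)} tokens")
```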


Current work

  • Red-team → blue-team pipelines for agentic deployments (prompt evolution + heterogeneous verification)
  • Protocol security for semantic quantization channels
  • Reproducible oversight failure evaluations (CMED-style trap suites)
  • EAP contribution to Bloom

Contact

Building agentic systems and want help with evaluation harnesses, oversight swarms, or agent communication protocols? I'm currently looking for a full-time role where I can bring 20 years of shipping production code to innovative AI use cases or research.

anthony@making-minds.ai

Pinned repositories

  1. stop-llm-bullshit

     Three system prompts to stop LLMs from agreeing with you when you're wrong

  2. slipcore (Python)

     SLIPCore: Streamlined Interagent Protocol for LLM agent communication

  3. argos-swarm (Python)

     Automated LLM red/blue teaming using evolutionary robustness for multi-turn social engineering

  4. pv-eat (Python)

     PV-EAT: Activation-Measured Adversarial Testing for Audit-Shielding Detection

  5. slipstream-control-plane (JavaScript)

     Source code for the Slipstream Control Plane

  6. slipstream-governance-env (Jupyter Notebook)

     OpenEnv RL environment for training AI agents to use inter-agent protocols safely