Optional IV-LLM + readme explanation #24

Merged
rahulbshrestha merged 12 commits into main from iv_refactored
Apr 7, 2026
Conversation

akhkim (Collaborator) commented Apr 6, 2026

No description provided.

Copilot AI review requested due to automatic review settings April 6, 2026 14:56

Copilot AI left a comment

Pull request overview

Adds an optional “IV‑LLM” instrument-discovery pipeline (gated by --iv_llm) and documents it, alongside supporting tooling/models and some test/script updates.

Changes:

  • Introduces an IV discovery tool + IV‑LLM subpackage (prompts/agents/critics/utils) and wires it into the CAIS workflow when IV is selected.
  • Adds CLI/script flags (--iv_llm) and README documentation for the optional IV discovery stage.
  • Updates dataset analysis/cleaning plumbing and adds/updates tests.

Reviewed changes

Copilot reviewed 36 out of 37 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| tests/cais/test_e2e_iv_new.py | New bulk E2E test for IV-related queries (logs LLM outputs). |
| tests/cais/methods/test_diff_in_diff.py | New unit test scaffold for DiD output structure. |
| run_cais.py | Minor formatting/indentation fix. |
| run_cais_new.py | Adds --iv_llm flag and attempts to pass it to CausalAgent. |
| README.md | Documents optional IV discovery stage and adds --iv_llm usage to run instructions. |
| pyproject.toml | Adds extra Sphinx/doc tooling dependencies. |
| cais/utils/llm_helpers.py | Updates LangChain BaseChatModel import and adds a generic invoke_llm helper. |
| cais/utils/agent.py | Adds an alternative LangChain agent implementation (LCEL/ReAct) and custom output parser. |
| cais/tools/iv_discovery_tool.py | New LangChain tool wrapper around IV discovery pipeline. |
| cais/tools/dataset_analyzer_tool.py | Plumbs use_iv_pipeline flag into dataset analysis component call. |
| cais/tools/data_analyzer.py | Removes deprecated DataAnalyzer implementation. |
| cais/tools/controls_selector_tool.py | Enables LangChain @tool decorator usage for the controls selector tool. |
| cais/models.py | Adds Pydantic models for IV discovery tool input/output. |
| cais/iv_llm/src/variable_utils.py | Utilities for extracting/mapping candidate variable names to dataset columns. |
| cais/iv_llm/src/prompts/prompt_loader.py | Loader/formatter for IV‑LLM prompt templates. |
| cais/iv_llm/src/prompts/independence_critic.txt | Prompt template for independence critic. |
| cais/iv_llm/src/prompts/hypothesizer.txt | Prompt template for IV hypothesizer. |
| cais/iv_llm/src/prompts/exclusion_critic.txt | Prompt template for exclusion critic. |
| cais/iv_llm/src/prompts/confounder_miner.txt | Prompt template for confounder miner. |
| cais/iv_llm/src/prompts/__init__.py | Declares prompts package. |
| cais/iv_llm/src/llm/client.py | Thin adapter around CAIS get_llm_client for IV‑LLM code. |
| cais/iv_llm/src/llm/__init__.py | Declares llm package. |
| cais/iv_llm/src/experiments/iv_co_scientist.py | Experimental end-to-end IV “co-scientist” pipeline runner. |
| cais/iv_llm/src/critics/independence_critic.py | Independence critic implementation using prompts + LLM calls. |
| cais/iv_llm/src/critics/exclusion_critic.py | Exclusion critic implementation using prompts + LLM calls. |
| cais/iv_llm/src/critics/__init__.py | Declares critics package. |
| cais/iv_llm/src/agents/hypothesizer.py | Hypothesizer agent for proposing IV candidates. |
| cais/iv_llm/src/agents/confounder_miner.py | Confounder miner agent for proposing confounders. |
| cais/iv_llm/src/agents/__init__.py | Declares agents package. |
| cais/iv_llm/src/__init__.py | Declares IV‑LLM src package. |
| cais/iv_llm/__init__.py | Adds package-level logging setup for IV‑LLM to a JSONL file. |
| cais/components/iv_discovery.py | Component wrapper that orchestrates hypothesizer/critics to validate IVs. |
| cais/components/dataset_cleaner.py | Tightens cleaning script path handling and safer placeholder replacement. |
| cais/components/dataset_analyzer.py | Adds use_iv_pipeline option to IV detection (IV‑LLM first, fallback otherwise). |
| cais/cli.py | Adds --iv_llm flag for single/batch CLI and passes through to analysis. |
| cais/agent.py | Wires IV discovery into the agent workflow when IV is selected + flag enabled. |
| cais/__init__.py | Exposes iv_discovery_tool at package level (via cais.tools). |

Comments suppressed due to low confidence (2)

run_cais_new.py:172

  • CausalAgent.run_analysis(...) in cais/agent.py only accepts (query, llm_method_selection=...) and does not accept dataset_path, dataset_description, or use_decision_tree. This call will raise TypeError unless the API is updated. Consider constructing the agent with dataset_path/dataset_description and using llm_method_selection, or call the module-level run_causal_analysis(...) function instead.

cais/__init__.py:32

  • cais.tools currently doesn't export iv_discovery_tool (it's not imported in cais/tools/__init__.py), so this import will fail when importing cais. Either import it directly from cais.tools.iv_discovery_tool here, or add it to cais/tools/__init__.py (and __all__) to keep this package-level import working.

# Import tools
from cais.tools import (
    input_parser_tool,
    dataset_analyzer_tool,
    query_interpreter_tool,
    iv_discovery_tool,
    method_selector_tool,
    method_validator_tool,
    method_executor_tool,
    explanation_generator_tool,
    output_formatter_tool
)

Comment on lines +6 to +8
# Import the function to test
from cais.methods.diff_in_diff import estimate_effect

Copilot AI Apr 6, 2026

cais.methods.diff_in_diff (and the patched helper paths below) don't exist in this repo; the DiD implementation appears to live under cais/methods/difference_in_differences/. As written, this test will fail at import time and the patch targets will also be invalid. Please update the import and patch module paths to the actual DiD estimator module used by CAIS.

Comment on lines 163 to 167
  print('Starting run!')

- cais = CausalAgent()
+ cais = CausalAgent(use_iv_pipeline=args.iv_llm)

  cais.run_analysis(

Copilot AI Apr 6, 2026

CausalAgent requires a dataset_path argument in its constructor, so instantiating it here without one will raise a TypeError. Either pass dataset_path when constructing the agent (and adjust the rest of the call sites accordingly) or switch this script to use run_causal_analysis(...) which accepts dataset_path per call.
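
The failure described here can be reproduced with a small stand-in class. This is only a sketch: the `CausalAgent` below is hypothetical, not the class in cais/agent.py, and its signature is an assumption taken from this comment (a required `dataset_path`) plus the `use_iv_pipeline` flag this PR adds. `build_agent` is likewise an illustrative name.

```python
class CausalAgent:
    """Hypothetical stand-in; the signature is assumed from the review comment."""

    def __init__(self, dataset_path: str, use_iv_pipeline: bool = False):
        self.dataset_path = dataset_path
        self.use_iv_pipeline = use_iv_pipeline


def build_agent(dataset_path: str, iv_llm: bool) -> CausalAgent:
    # Pass the required path together with the new flag.
    return CausalAgent(dataset_path, use_iv_pipeline=iv_llm)


def constructor_fails_without_path() -> bool:
    # Mirrors the call in the diff: the flag is passed, the path is not.
    try:
        CausalAgent(use_iv_pipeline=True)  # type: ignore[call-arg]
    except TypeError:
        return True
    return False
```

If the real constructor matches this assumption, the script would need the dataset path at construction time, for example `build_agent(args.dataset_path, args.iv_llm)`, where `args.dataset_path` is also a hypothetical argument name.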

Comment on lines +14 to +18
class TestE2EIVNewPipeline(unittest.TestCase):
    def test_iv_llm_pipeline_bulk(self):
        """Run several queries from the CSV data and log LLM outputs."""
        csv_path = os.path.join(ROOT, "data", "checked_real_data - Final.csv")
        df = pd.read_csv(csv_path)

Copilot AI Apr 6, 2026

This E2E test makes real LLM calls and will be flaky/slow in CI unless it is skipped when credentials are missing. Other E2E tests (e.g. test_e2e_did.py) load dotenv and SkipTest when OPENAI_API_KEY is not set; please apply the same pattern here.
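
A minimal sketch of the skip pattern, assuming the key is read from the process environment. The contents of test_e2e_did.py are not shown in this review, so this is not a copy of that file; a project that keeps its key in a .env file would load it before the check. The test body is a placeholder.

```python
import os
import unittest


class TestE2EIVNewPipeline(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Skip every test in the class when no API key is configured.
        if not os.environ.get("OPENAI_API_KEY"):
            raise unittest.SkipTest("OPENAI_API_KEY not set; skipping live LLM test")

    def test_iv_llm_pipeline_bulk(self):
        # Placeholder: the real test would run the IV queries here.
        self.assertTrue(os.environ["OPENAI_API_KEY"])
```

Raising `unittest.SkipTest` in `setUpClass` reports the class as skipped rather than failed, so CI runs without credentials stay green.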

Comment on lines +17 to +18
csv_path = os.path.join(ROOT, "data", "checked_real_data - Final.csv")
df = pd.read_csv(csv_path)

Copilot AI Apr 6, 2026

This test assumes data/checked_real_data - Final.csv exists but doesn't check for it (or skip/fail with a clear message) before calling pd.read_csv. Please add an existence check (and likely SkipTest in CI) so failures are actionable and not just a pandas FileNotFoundError.
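
One way to make the failure actionable is a small guard that runs before `pd.read_csv`. The helper name `require_file` is hypothetical and only illustrates the check.

```python
import os
import unittest


def require_file(path: str) -> str:
    """Return path if it is an existing file, otherwise skip the calling test."""
    if not os.path.isfile(path):
        raise unittest.SkipTest(f"required dataset not found: {path}")
    return path
```

Inside the test this would read `df = pd.read_csv(require_file(csv_path))`. Swapping `SkipTest` for `self.fail(...)` would turn a missing file into a hard failure with the same message, if that is preferred for local runs.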

Comment on lines +28 to +32
results_log = []
output_file = os.path.join(os.path.dirname(__file__), "llm_outputs.json")

print(f"--- Running Bulk E2E Test (5 queries) ---")

Copilot AI Apr 6, 2026

Writing llm_outputs.json into the tracked tests/ directory makes the test non-hermetic and will dirty the working tree in CI/local runs. Prefer writing to a temporary directory (e.g., tempfile.TemporaryDirectory()/NamedTemporaryFile) or gating output behind an opt-in env var.
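
A sketch of the temporary-directory option, using only the standard library. `write_llm_outputs` is a hypothetical helper name; the file name is the one from the diff.

```python
import json
import os
import tempfile


def write_llm_outputs(results_log, out_dir):
    """Write the collected results as JSON into out_dir and return the file path."""
    output_file = os.path.join(out_dir, "llm_outputs.json")
    with open(output_file, "w", encoding="utf-8") as fh:
        json.dump(results_log, fh, indent=2)
    return output_file


def run_with_temp_output(results_log):
    # The directory and everything in it are removed when the block exits,
    # so nothing is left behind under the tracked tests/ tree.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = write_llm_outputs(results_log, tmp_dir)
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
```

If the logged outputs are needed for later inspection, the output directory could instead come from an opt-in environment variable, as the comment suggests.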

cais/agent.py Outdated
Comment on lines 50 to 58

# Set up basic logging
os.makedirs('./logs/', exist_ok=True)
logging.basicConfig(
    filename='./logs/agent_debug.log',
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

Copilot AI Apr 6, 2026

Configuring logging.basicConfig(...) at module import time changes global logging configuration for any consumer importing cais.agent, and forces file I/O to ./logs/agent_debug.log. Consider moving this configuration behind a CLI entry point / if __name__ == "__main__" guard, or only adding a module-level logger handler without calling basicConfig.

Suggested change
- # Set up basic logging
- os.makedirs('./logs/', exist_ok=True)
- logging.basicConfig(
-     filename='./logs/agent_debug.log',
-     level=logging.INFO,
-     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
- )
- logger = logging.getLogger(__name__)
+ logger = logging.getLogger(__name__)
+ if not logger.handlers:
+     logger.addHandler(logging.NullHandler())
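
The suggestion keeps the library silent by default. If the file log is still wanted for CLI runs, one possible arrangement is sketched below. `configure_file_logging` is a hypothetical helper that an entry point would call explicitly, and the logger name "cais.agent" is assumed for illustration (the module itself uses `__name__`).

```python
import logging
import os

# Library side: a logger with a NullHandler does no I/O on import.
logger = logging.getLogger("cais.agent")  # name assumed for illustration
logger.addHandler(logging.NullHandler())


def configure_file_logging(log_dir: str, level: int = logging.INFO) -> logging.Handler:
    """Hypothetical helper: attach a file handler only when an entry point asks for it."""
    os.makedirs(log_dir, exist_ok=True)
    handler = logging.FileHandler(
        os.path.join(log_dir, "agent_debug.log"), encoding="utf-8"
    )
    handler.setLevel(level)
    handler.setFormatter(
        logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(level)
    return handler
```

A script such as run_cais.py could call `configure_file_logging("./logs")` under its `if __name__ == "__main__":` guard, which leaves other consumers of the package free to configure logging themselves.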

Comment on lines +83 to +110
def parse(self, text: str) -> Union[List[AgentAction], AgentFinish]:
    includes_answer = FINAL_ANSWER_ACTION in text
    print('-------------------')
    print(text)
    print('-------------------')
    # Grab every Action / Action Input block
    pattern = (
        r"Action\s*\d*\s*:[\s]*(.*?)\s*"
        r"Action\s*\d*\s*Input\s*\d*\s*:[\s]*(.*?)(?=(?:Action\s*\d*\s*:|$))"
    )
    matches = list(re.finditer(pattern, text, re.DOTALL))

    # If we found tool calls…
    if matches:
        if includes_answer:
            # both a final answer *and* tool calls is ambiguous
            raise OutputParserException(
                f"{FINAL_ANSWER_AND_PARSABLE_ACTION_ERROR_MESSAGE}: {text}"
            )

        actions: List[AgentAction] = []
        for m in matches:
            tool_name = m.group(1).strip()
            tool_input = m.group(2).strip().strip('"')
            print('\n--------------------------')
            print(tool_input)
            print('--------------------------')
            actions.append(AgentAction(tool_name, json.loads(tool_input), text))

Copilot AI Apr 6, 2026

ReActMultiInputOutputParser.parse prints the full raw LLM output (and parsed tool inputs) to stdout. This is noisy in normal runs and can inadvertently leak prompt/response contents into logs. Prefer using the module logger at DEBUG level (or removing the prints entirely) so output is controllable.
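
A reduced sketch of the same extraction loop with the prints replaced by DEBUG-level logging. It reuses the regular expression from the diff but leaves out the LangChain types, so `extract_actions` and its tuple return value are illustrative only, not the parser's actual interface.

```python
import json
import logging
import re
from typing import Any, Dict, List, Tuple

logger = logging.getLogger(__name__)

# Same Action / Action Input pattern as the parser under review.
ACTION_PATTERN = re.compile(
    r"Action\s*\d*\s*:[\s]*(.*?)\s*"
    r"Action\s*\d*\s*Input\s*\d*\s*:[\s]*(.*?)(?=(?:Action\s*\d*\s*:|$))",
    re.DOTALL,
)


def extract_actions(text: str) -> List[Tuple[str, Dict[str, Any]]]:
    """Return (tool_name, tool_input) pairs, logging at DEBUG instead of printing."""
    logger.debug("raw LLM output: %s", text)
    actions = []
    for m in ACTION_PATTERN.finditer(text):
        tool_name = m.group(1).strip()
        tool_input = m.group(2).strip().strip('"')
        logger.debug("tool=%s input=%s", tool_name, tool_input)
        actions.append((tool_name, json.loads(tool_input)))
    return actions
```

With this shape the raw model output only appears when the logger's level is set to DEBUG, so normal runs stay quiet.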

Comment on lines +263 to +271
executor = AgentExecutor(
    agent=agent,
    tools=agent_tools,
    memory=memory,  # Pass the memory object
    verbose=True,
    callbacks=[ConsoleCallbackHandler()],  # Optional: for console debugging
    handle_parsing_errors=True,  # Let AE handle parsing errors
    max_retries = 100
)

Copilot AI Apr 6, 2026

AgentExecutor is instantiated with max_retries=100, but AgentExecutor typically uses max_iterations / max_execution_time (and may not accept max_retries). If this kwarg isn't supported by the installed LangChain version, this will raise TypeError at runtime. Please verify the correct parameter name for limiting agent loops and update accordingly.

Comment on lines +24 to +38
# Configure a file handler on the "cais.iv_llm" logger so that every child
# logger (agents, critics, etc.) automatically writes to the IV-LLM log file.
_logger = logging.getLogger(__name__)  # "cais.iv_llm"
if not _logger.handlers:
    _logger.setLevel(logging.DEBUG)
    _logger.propagate = True  # still propagate to root for console output
    try:
        _log_path = _get_log_path()
        _log_path.parent.mkdir(parents=True, exist_ok=True)
        _handler = logging.FileHandler(str(_log_path), encoding="utf-8")
        _handler.setLevel(logging.INFO)
        _handler.setFormatter(logging.Formatter("%(message)s"))
        _logger.addHandler(_handler)
    except Exception:
        pass
\ No newline at end of file

Copilot AI Apr 6, 2026

This module configures a FileHandler and creates directories at import time. Since cais.iv_llm will be imported whenever the IV pipeline modules are imported, this can cause unexpected file I/O (and can fail in read-only environments) even if the user isn't explicitly trying to write logs. Consider moving logger/file-handler setup behind an explicit initialization function or an opt-in env flag.

Suggested change
- # Configure a file handler on the "cais.iv_llm" logger so that every child
- # logger (agents, critics, etc.) automatically writes to the IV-LLM log file.
- _logger = logging.getLogger(__name__)  # "cais.iv_llm"
- if not _logger.handlers:
-     _logger.setLevel(logging.DEBUG)
-     _logger.propagate = True  # still propagate to root for console output
-     try:
-         _log_path = _get_log_path()
-         _log_path.parent.mkdir(parents=True, exist_ok=True)
-         _handler = logging.FileHandler(str(_log_path), encoding="utf-8")
-         _handler.setLevel(logging.INFO)
-         _handler.setFormatter(logging.Formatter("%(message)s"))
-         _logger.addHandler(_handler)
-     except Exception:
-         pass
+ def _should_enable_file_logging() -> bool:
+     return os.getenv("IV_LLM_ENABLE_FILE_LOGGING", "").strip().lower() in {
+         "1",
+         "true",
+         "yes",
+         "on",
+     }
+ # Configure the "cais.iv_llm" logger itself at import time, but defer any
+ # file-system work until explicit initialization or an opt-in env flag.
+ _logger = logging.getLogger(__name__)  # "cais.iv_llm"
+ _logger.setLevel(logging.DEBUG)
+ _logger.propagate = True  # still propagate to root for console output
+ def initialize_file_logging(force: bool = False) -> bool:
+     """Attach the IV-LLM file handler if explicitly enabled.
+     Returns True if file logging is enabled after this call, otherwise False.
+     """
+     if not force and not _should_enable_file_logging():
+         return False
+     for existing_handler in _logger.handlers:
+         if getattr(existing_handler, "_iv_llm_file_handler", False):
+             return True
+     try:
+         log_path = _get_log_path()
+         log_path.parent.mkdir(parents=True, exist_ok=True)
+         handler = logging.FileHandler(str(log_path), encoding="utf-8")
+         handler.setLevel(logging.INFO)
+         handler.setFormatter(logging.Formatter("%(message)s"))
+         handler._iv_llm_file_handler = True
+         _logger.addHandler(handler)
+         return True
+     except Exception:
+         return False
+ if _should_enable_file_logging():
+     initialize_file_logging()

rahulbshrestha merged commit 14a65b7 into main on Apr 7, 2026
0 of 2 checks passed