Optional IV-LLM + readme explanation #24
Pull request overview
Adds an optional “IV‑LLM” instrument-discovery pipeline (gated by --iv_llm) and documents it, alongside supporting tooling/models and some test/script updates.
Changes:
- Introduces an IV discovery tool + IV-LLM subpackage (prompts/agents/critics/utils) and wires it into the CAIS workflow when IV is selected.
- Adds CLI/script flags (`--iv_llm`) and README documentation for the optional IV discovery stage.
- Updates dataset analysis/cleaning plumbing and adds/updates tests.
Reviewed changes
Copilot reviewed 36 out of 37 changed files in this pull request and generated 9 comments.
Summary per file:
| File | Description |
|---|---|
| tests/cais/test_e2e_iv_new.py | New bulk E2E test for IV-related queries (logs LLM outputs). |
| tests/cais/methods/test_diff_in_diff.py | New unit test scaffold for DiD output structure. |
| run_cais.py | Minor formatting/indentation fix. |
| run_cais_new.py | Adds --iv_llm flag and attempts to pass it to CausalAgent. |
| README.md | Documents optional IV discovery stage and adds --iv_llm usage to run instructions. |
| pyproject.toml | Adds extra Sphinx/doc tooling dependencies. |
| cais/utils/llm_helpers.py | Updates LangChain BaseChatModel import and adds a generic invoke_llm helper. |
| cais/utils/agent.py | Adds an alternative LangChain agent implementation (LCEL/ReAct) and custom output parser. |
| cais/tools/iv_discovery_tool.py | New LangChain tool wrapper around IV discovery pipeline. |
| cais/tools/dataset_analyzer_tool.py | Plumbs use_iv_pipeline flag into dataset analysis component call. |
| cais/tools/data_analyzer.py | Removes deprecated DataAnalyzer implementation. |
| cais/tools/controls_selector_tool.py | Enables LangChain @tool decorator usage for the controls selector tool. |
| cais/models.py | Adds Pydantic models for IV discovery tool input/output. |
| cais/iv_llm/src/variable_utils.py | Utilities for extracting/mapping candidate variable names to dataset columns. |
| cais/iv_llm/src/prompts/prompt_loader.py | Loader/formatter for IV‑LLM prompt templates. |
| cais/iv_llm/src/prompts/independence_critic.txt | Prompt template for independence critic. |
| cais/iv_llm/src/prompts/hypothesizer.txt | Prompt template for IV hypothesizer. |
| cais/iv_llm/src/prompts/exclusion_critic.txt | Prompt template for exclusion critic. |
| cais/iv_llm/src/prompts/confounder_miner.txt | Prompt template for confounder miner. |
| cais/iv_llm/src/prompts/__init__.py | Declares prompts package. |
| cais/iv_llm/src/llm/client.py | Thin adapter around CAIS get_llm_client for IV‑LLM code. |
| cais/iv_llm/src/llm/__init__.py | Declares llm package. |
| cais/iv_llm/src/experiments/iv_co_scientist.py | Experimental end-to-end IV “co-scientist” pipeline runner. |
| cais/iv_llm/src/critics/independence_critic.py | Independence critic implementation using prompts + LLM calls. |
| cais/iv_llm/src/critics/exclusion_critic.py | Exclusion critic implementation using prompts + LLM calls. |
| cais/iv_llm/src/critics/__init__.py | Declares critics package. |
| cais/iv_llm/src/agents/hypothesizer.py | Hypothesizer agent for proposing IV candidates. |
| cais/iv_llm/src/agents/confounder_miner.py | Confounder miner agent for proposing confounders. |
| cais/iv_llm/src/agents/__init__.py | Declares agents package. |
| cais/iv_llm/src/__init__.py | Declares IV-LLM src package. |
| cais/iv_llm/__init__.py | Adds package-level logging setup for IV-LLM to a jsonl file. |
| cais/components/iv_discovery.py | Component wrapper that orchestrates hypothesizer/critics to validate IVs. |
| cais/components/dataset_cleaner.py | Tightens cleaning script path handling and safer placeholder replacement. |
| cais/components/dataset_analyzer.py | Adds use_iv_pipeline option to IV detection (IV‑LLM first, fallback otherwise). |
| cais/cli.py | Adds --iv_llm flag for single/batch CLI and passes through to analysis. |
| cais/agent.py | Wires IV discovery into the agent workflow when IV is selected + flag enabled. |
| cais/__init__.py | Exposes iv_discovery_tool at package level (via cais.tools). |
Comments suppressed due to low confidence (2)
run_cais_new.py:172
`CausalAgent.run_analysis(...)` in `cais/agent.py` only accepts `(query, llm_method_selection=...)` and does not accept `dataset_path`, `dataset_description`, or `use_decision_tree`. This call will raise a `TypeError` unless the API is updated. Consider constructing the agent with `dataset_path`/`dataset_description` and using `llm_method_selection`, or call the module-level `run_causal_analysis(...)` function instead.
cais/__init__.py:32
`cais.tools` currently doesn't export `iv_discovery_tool` (it's not imported in `cais/tools/__init__.py`), so this import will fail when importing `cais`. Either import it directly from `cais.tools.iv_discovery_tool` here, or add it to `cais/tools/__init__.py` (and `__all__`) to keep this package-level import working.
```python
# Import tools
from cais.tools import (
    input_parser_tool,
    dataset_analyzer_tool,
    query_interpreter_tool,
    iv_discovery_tool,
    method_selector_tool,
    method_validator_tool,
    method_executor_tool,
    explanation_generator_tool,
    output_formatter_tool
)
```
```python
# Import the function to test
from cais.methods.diff_in_diff import estimate_effect
```

`cais.methods.diff_in_diff` (and the patched helper paths below) don't exist in this repo; the DiD implementation appears to live under `cais/methods/difference_in_differences/`. As written, this test will fail at import time and the patch targets will also be invalid. Please update the import and patch module paths to the actual DiD estimator module used by CAIS.
```diff
  print('Starting run!')

- cais = CausalAgent()
+ cais = CausalAgent(use_iv_pipeline=args.iv_llm)

  cais.run_analysis(
```

`CausalAgent` requires a `dataset_path` argument in its constructor, so instantiating it here without one will raise a `TypeError`. Either pass `dataset_path` when constructing the agent (and adjust the rest of the call sites accordingly) or switch this script to use `run_causal_analysis(...)`, which accepts `dataset_path` per call.
```python
class TestE2EIVNewPipeline(unittest.TestCase):
    def test_iv_llm_pipeline_bulk(self):
        """Run several queries from the CSV data and log LLM outputs."""
        csv_path = os.path.join(ROOT, "data", "checked_real_data - Final.csv")
        df = pd.read_csv(csv_path)
```

This E2E test makes real LLM calls and will be flaky/slow in CI unless it is skipped when credentials are missing. Other E2E tests (e.g. `test_e2e_did.py`) load dotenv and raise `SkipTest` when `OPENAI_API_KEY` is not set; please apply the same pattern here.
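A minimal sketch of that skip pattern (the helper name `require_openai_key` is hypothetical; a real test would typically load a `.env` file via python-dotenv before checking):

```python
import os
import unittest


def require_openai_key() -> None:
    """Skip the calling test when no OPENAI_API_KEY is configured.

    Illustrative helper only: unittest treats a raised SkipTest inside a
    test method exactly like self.skipTest(...), so the run is marked as
    skipped instead of failing with a live-API error.
    """
    if not os.environ.get("OPENAI_API_KEY"):
        raise unittest.SkipTest("OPENAI_API_KEY not set; skipping live-LLM E2E test")
```

Calling `require_openai_key()` at the top of `test_iv_llm_pipeline_bulk` would make the test a clean skip in CI environments without credentials.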
```python
csv_path = os.path.join(ROOT, "data", "checked_real_data - Final.csv")
df = pd.read_csv(csv_path)
```

This test assumes `data/checked_real_data - Final.csv` exists but doesn't check for it (or skip/fail with a clear message) before calling `pd.read_csv`. Please add an existence check (and likely `SkipTest` in CI) so failures are actionable and not just a pandas `FileNotFoundError`.
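One way to make the failure actionable, sketched with an illustrative test class (names and path are assumptions, not the repo's real test):

```python
import os
import unittest


class TestFixtureGuard(unittest.TestCase):
    # Guard a data-dependent test so a missing CSV yields a clear skip
    # message rather than an opaque pandas FileNotFoundError.
    CSV_PATH = os.path.join("data", "checked_real_data - Final.csv")

    def test_fixture_present(self):
        if not os.path.exists(self.CSV_PATH):
            self.skipTest(f"missing test fixture: {self.CSV_PATH}")
        # ...pd.read_csv(self.CSV_PATH) and the real assertions would go here...
```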
```python
results_log = []
output_file = os.path.join(os.path.dirname(__file__), "llm_outputs.json")

print(f"--- Running Bulk E2E Test (5 queries) ---")
```

Writing `llm_outputs.json` into the tracked `tests/` directory makes the test non-hermetic and will dirty the working tree in CI/local runs. Prefer writing to a temporary directory (e.g., `tempfile.TemporaryDirectory()`/`NamedTemporaryFile`) or gating output behind an opt-in env var.
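A minimal sketch of the temporary-directory approach (variable names are illustrative):

```python
import json
import os
import tempfile

# Write per-run results into a temporary directory so repeated test runs
# never dirty the tracked tests/ directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    output_file = os.path.join(tmp_dir, "llm_outputs.json")
    results_log = [{"query": "example query", "output": "example output"}]
    with open(output_file, "w", encoding="utf-8") as fh:
        json.dump(results_log, fh, indent=2)
    # Anything that needs the file must read it inside the context...
    with open(output_file, encoding="utf-8") as fh:
        round_tripped = json.load(fh)
# ...because the directory and its contents are deleted on exit.
```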
cais/agent.py (outdated)

```python
# Set up basic logging
os.makedirs('./logs/', exist_ok=True)
logging.basicConfig(
    filename='./logs/agent_debug.log',
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
```

Configuring `logging.basicConfig(...)` at module import time changes global logging configuration for any consumer importing `cais.agent`, and forces file I/O to `./logs/agent_debug.log`. Consider moving this configuration behind a CLI entry point / `if __name__ == "__main__"` guard, or only adding a module-level logger handler without calling `basicConfig`.
Suggested change:

```diff
- # Set up basic logging
- os.makedirs('./logs/', exist_ok=True)
- logging.basicConfig(
-     filename='./logs/agent_debug.log',
-     level=logging.INFO,
-     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
- )
- logger = logging.getLogger(__name__)
+ logger = logging.getLogger(__name__)
+ if not logger.handlers:
+     logger.addHandler(logging.NullHandler())
```
```python
def parse(self, text: str) -> Union[List[AgentAction], AgentFinish]:
    includes_answer = FINAL_ANSWER_ACTION in text
    print('-------------------')
    print(text)
    print('-------------------')
    # Grab every Action / Action Input block
    pattern = (
        r"Action\s*\d*\s*:[\s]*(.*?)\s*"
        r"Action\s*\d*\s*Input\s*\d*\s*:[\s]*(.*?)(?=(?:Action\s*\d*\s*:|$))"
    )
    matches = list(re.finditer(pattern, text, re.DOTALL))

    # If we found tool calls…
    if matches:
        if includes_answer:
            # both a final answer *and* tool calls is ambiguous
            raise OutputParserException(
                f"{FINAL_ANSWER_AND_PARSABLE_ACTION_ERROR_MESSAGE}: {text}"
            )

        actions: List[AgentAction] = []
        for m in matches:
            tool_name = m.group(1).strip()
            tool_input = m.group(2).strip().strip('"')
            print('\n--------------------------')
            print(tool_input)
            print('--------------------------')
            actions.append(AgentAction(tool_name, json.loads(tool_input), text))
```

`ReActMultiInputOutputParser.parse` prints the full raw LLM output (and parsed tool inputs) to stdout. This is noisy in normal runs and can inadvertently leak prompt/response contents into logs. Prefer using the module logger at DEBUG level (or removing the prints entirely) so output is controllable.
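A hedged sketch of routing this output through a module logger instead (the logger name and helper function are illustrative, not the repo's actual code):

```python
import logging

# Module-level logger; the name mirrors where the parser lives and is
# illustrative only.
logger = logging.getLogger("cais.utils.agent")


def log_raw_llm_output(text: str) -> None:
    # Emitted only when DEBUG logging is enabled for this logger, rather
    # than unconditionally printing full prompt/response contents to stdout.
    logger.debug("raw LLM output:\n%s", text)
```

The parser's `print(text)` calls would then become `log_raw_llm_output(text)`, leaving verbosity under the consumer's logging configuration.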
```python
executor = AgentExecutor(
    agent=agent,
    tools=agent_tools,
    memory=memory,  # Pass the memory object
    verbose=True,
    callbacks=[ConsoleCallbackHandler()],  # Optional: for console debugging
    handle_parsing_errors=True,  # Let AE handle parsing errors
    max_retries=100
)
```

`AgentExecutor` is instantiated with `max_retries=100`, but `AgentExecutor` typically uses `max_iterations` / `max_execution_time` (and may not accept `max_retries`). If this kwarg isn't supported by the installed LangChain version, this will raise a `TypeError` at runtime. Please verify the correct parameter name for limiting agent loops and update accordingly.
```python
# Configure a file handler on the "cais.iv_llm" logger so that every child
# logger (agents, critics, etc.) automatically writes to the IV-LLM log file.
_logger = logging.getLogger(__name__)  # "cais.iv_llm"
if not _logger.handlers:
    _logger.setLevel(logging.DEBUG)
    _logger.propagate = True  # still propagate to root for console output
    try:
        _log_path = _get_log_path()
        _log_path.parent.mkdir(parents=True, exist_ok=True)
        _handler = logging.FileHandler(str(_log_path), encoding="utf-8")
        _handler.setLevel(logging.INFO)
        _handler.setFormatter(logging.Formatter("%(message)s"))
        _logger.addHandler(_handler)
    except Exception:
        pass
```

This module configures a `FileHandler` and creates directories at import time. Since `cais.iv_llm` will be imported whenever the IV pipeline modules are imported, this can cause unexpected file I/O (and can fail in read-only environments) even if the user isn't explicitly trying to write logs. Consider moving logger/file-handler setup behind an explicit initialization function or an opt-in env flag.
Suggested change:

```diff
- # Configure a file handler on the "cais.iv_llm" logger so that every child
- # logger (agents, critics, etc.) automatically writes to the IV-LLM log file.
- _logger = logging.getLogger(__name__)  # "cais.iv_llm"
- if not _logger.handlers:
-     _logger.setLevel(logging.DEBUG)
-     _logger.propagate = True  # still propagate to root for console output
-     try:
-         _log_path = _get_log_path()
-         _log_path.parent.mkdir(parents=True, exist_ok=True)
-         _handler = logging.FileHandler(str(_log_path), encoding="utf-8")
-         _handler.setLevel(logging.INFO)
-         _handler.setFormatter(logging.Formatter("%(message)s"))
-         _logger.addHandler(_handler)
-     except Exception:
-         pass
+ def _should_enable_file_logging() -> bool:
+     return os.getenv("IV_LLM_ENABLE_FILE_LOGGING", "").strip().lower() in {
+         "1",
+         "true",
+         "yes",
+         "on",
+     }
+
+ # Configure the "cais.iv_llm" logger itself at import time, but defer any
+ # file-system work until explicit initialization or an opt-in env flag.
+ _logger = logging.getLogger(__name__)  # "cais.iv_llm"
+ _logger.setLevel(logging.DEBUG)
+ _logger.propagate = True  # still propagate to root for console output
+
+ def initialize_file_logging(force: bool = False) -> bool:
+     """Attach the IV-LLM file handler if explicitly enabled.
+
+     Returns True if file logging is enabled after this call, otherwise False.
+     """
+     if not force and not _should_enable_file_logging():
+         return False
+     for existing_handler in _logger.handlers:
+         if getattr(existing_handler, "_iv_llm_file_handler", False):
+             return True
+     try:
+         log_path = _get_log_path()
+         log_path.parent.mkdir(parents=True, exist_ok=True)
+         handler = logging.FileHandler(str(log_path), encoding="utf-8")
+         handler.setLevel(logging.INFO)
+         handler.setFormatter(logging.Formatter("%(message)s"))
+         handler._iv_llm_file_handler = True
+         _logger.addHandler(handler)
+         return True
+     except Exception:
+         return False
+
+ if _should_enable_file_logging():
+     initialize_file_logging()
```