Bump inspect-ai from 0.3.220 to 0.3.223#83

Merged
amrit110 merged 1 commit into main from dependabot/uv/inspect-ai-0.3.223 on May 20, 2026

dependabot bot commented on behalf of github on May 19, 2026

Bumps inspect-ai from 0.3.220 to 0.3.223.
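
To confirm that an environment actually picked up this bump, a minimal sketch along the following lines may help. It uses only the standard library; the helper names (`parse_version`, `is_at_least`, `installed_version`) are hypothetical and not part of inspect-ai, and the parser assumes plain dotted numeric versions such as 0.3.223 (it would raise on pre-release or dev suffixes).

```python
from __future__ import annotations

from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a plain dotted release string such as '0.3.223' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_at_least(installed: str, required: str) -> bool:
    """Compare two plain dotted versions numerically rather than as strings."""
    return parse_version(installed) >= parse_version(required)


def installed_version(package: str = "inspect-ai") -> str | None:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("inspect-ai is not installed in this environment")
    else:
        status = "ok" if is_at_least(found, "0.3.223") else "older than 0.3.223"
        print(found, status)
```

Comparing integer tuples rather than raw strings matters here because a string comparison would order "0.3.1000" before "0.3.223".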

Changelog

Sourced from inspect-ai's changelog.

0.3.223 (18 May 2026)

  • Config: Add inspect log export-config command to export a run config from an existing log file.
  • Anthropic: Skip thinking blocks when placing lookback cache_control.
  • AsyncFilesystem: Add get_file() and exists() methods.
  • Inspect View: Fix regression where switching task tabs would reload log, causing latency.

0.3.222 (16 May 2026)

  • Scanners: Declare Scanner import in a way that's compatible with pyright type checking.

0.3.221 (16 May 2026)

  • OpenAI: Add GPT 5.5 as computer use model and exclude 'chat' and 'instant' models from computer use.
  • OpenAI Compatible: Parse OpenRouter-style reasoning_details in OpenAI-compatible responses.
  • Anthropic: Capture extra_body fields from Message response.
  • OpenRouter: Enable Anthropic prompt caching by default for openrouter/anthropic/* models.
  • VLLM: Preserve dotted vLLM server arg keys.
  • Bedrock: Drop unsupported sampling params for Claude 4.7+.
  • Bedrock: Route top_k correctly for Nova models.
  • SageMaker: Add prompt_logprobs support in chat mode via GenerateConfig, parse prompt logprobs from completion mode responses, enabling perplexity() and target_perplexity() scorers end-to-end.
  • Model API: --adaptive-connections is now enabled by default (defaults to 100 per model connection).
  • Model API: Cache lookup of openai and anthropic packages at sample initialization.
  • Model API: Remove semaphore around calls to count_tokens() (they are already retried and gated by max_samples).
  • Model Info: Cache model info database lookup results so that failed lookups don't repeat fuzzy model name search.
  • Limits: Added suspend_token_limit() context manager for suspending token tracking and limit enforcement within a scope.
  • Datasets: hf_dataset retries transient Hugging Face errors (rate limits, timeouts, Hub-unreachable cache misses) up to 3 times (5 in CI) with exponential backoff. Pass retry=False to disable.
  • Datasets: Reject sample ids that collide under str() coercion.
  • Datasets: Treat NaN from a HuggingFace dataset the same way None is treated (converted to "").
  • Datasets: Use HuggingFace revision in cache key for downloaded datasets.
  • Datasets: Propagate hf_dataset(..., shuffle=True) to EvalDataset.shuffled.
  • Tool Calling: Raise a ToolError if there is a null byte in command input.
  • Scoring: Store and aggregate results for cancelled eval runs.
  • Scoring: match(numeric=True) no longer matches digit-substrings (e.g. target 5 against 25); now correctly handles negative, decimal, and scientific-notation targets, and recognises unicode-formatted numbers (unicode minus, vulgar fractions like ½, Chinese numerals, fullwidth digits) in both targets and model output.
  • Scoring: match(numeric=True, location="exact") is now strict — values like "5 some text" no longer match target "5".
  • Analysis: Use score reducer in evals_df() column name when there are multiple reducers.
  • Hooks: Cache list of registered hooks (invalidate cache on registry_add()).
  • Config: Add --run-config option to inspect eval for single-file run configuration.
  • Eval Set: Run Inspect Scout scanners over each task's logs as part of eval_set (CLI --scanner / ScannerConfig). Scans incrementally as logs land, reuses prior results across resumes, and renders progress alongside the existing eval view.
  • Eval Set: Fail fast with "No inspect tasks were found at the specified paths." when a task spec resolves to nothing (e.g. uninstalled package); previously crashed with IndexError inside resolve_tasks after passing an empty task list to eval.
  • Eval Set: Add score_display argument to eval_set() function.
  • Eval Log: Preflight ETag check on S3 conditional write (required for S3 backends that don't implement conditional writes).
  • Eval Log: Make log_file_info() robust to non-standard filenames; added log_file_info_async() / log_files_from_ls_async() so view-server header reads don't block the event loop.
  • Imports: Delay importing heavier dependencies (e.g. s3fs, boto3, numpy, rich.markdown) for faster imports of inspect_ai module.
  • Logging: INSPECT_PY_LOGGER_FORMAT env var (rich/plain/json) for non-TTY-friendly single-line console logs.
  • Docker Compose: accept depends_on / pull_policy / privileged / shm_size / ulimits in ComposeService.
  • Task Display: Honor terminal COLUMNS and LINES for dumb terminals.
  • Validation: Reject unknown GenerateConfig fields with an error.
  • Memory: Log condensing no longer retains unchanged JSON copies in long evals.
  • Memory: Don't retain message lists in buffer DB (memory leak on long agentic samples).

... (truncated)
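
Of the changes listed for 0.3.221, the `match(numeric=True)` fix may shift scores on existing evals, so it is worth seeing what the digit-substring pitfall looks like. The following is an illustrative sketch only, not inspect-ai's implementation: the function names and the regular expression are assumptions, and it handles ASCII numbers only (none of the unicode forms mentioned in the changelog).

```python
import re

# ASCII integers, decimals, and scientific notation, with an optional sign.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def substring_match(answer: str, target: str) -> bool:
    """The pitfall: the target's characters appear anywhere in the answer."""
    return target in answer


def whole_number_match(answer: str, target: str) -> bool:
    """Compare the target against each complete number found in the answer."""
    wanted = float(target)
    return any(float(token) == wanted for token in NUMBER.findall(answer))


if __name__ == "__main__":
    answer = "The answer is 25"
    print(substring_match(answer, "5"))     # target 5 is found inside 25
    print(whole_number_match(answer, "5"))  # 25 is compared as a whole number
```

A scorer that behaves like `substring_match` would mark target 5 correct against an answer of 25; comparing whole parsed numbers avoids that, which is the behaviour the changelog entry describes.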

Commits
  • 90a7b1c Update CHANGELOG for version 0.3.223
  • df27f82 Bump to latest (#3970)
  • 60c6326 AsyncFilesystem: add get_file and exists (#3964)
  • 644f46f Merge branch 'kaifronsdal-fix/cache-control-skip-thinking'
  • 61ad3ec changelog / lint
  • c7a73dc Merge branch 'main' into fix/cache-control-skip-thinking
  • c4a8f2f Add inspect log export-config command (#3959)
  • a634db7 Skip thinking blocks when placing lookback cache_control
  • eea5a68 Mount transcript search API from Scout when Scout is installed (#3947)
  • 7b0f474 docs: document running vLLM solver and judge on separate servers (#3957)
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [inspect-ai](https://github.com/UKGovernmentBEIS/inspect_ai) from 0.3.220 to 0.3.223.
- [Changelog](https://github.com/UKGovernmentBEIS/inspect_ai/blob/main/CHANGELOG.md)
- [Commits](UKGovernmentBEIS/inspect_ai@0.3.220...0.3.223)

---
updated-dependencies:
- dependency-name: inspect-ai
  dependency-version: 0.3.223
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot bot added the dependencies and python:uv labels on May 19, 2026
amrit110 merged commit a2ae98e into main on May 20, 2026
1 check passed
amrit110 deleted the dependabot/uv/inspect-ai-0.3.223 branch on May 20, 2026 01:05

Labels

  • dependencies: Pull requests that update a dependency file
  • python:uv: Pull requests that update python:uv code
