Releases: dgunning/edgartools

v5.29.0

12 Apr 18:29

What's New

  • exact parameter for FactQuery.by_date_range() — New exact=True option matches facts with period dates exactly equal to the specified date, instead of the default <=/>= range behavior (#767)
  • Company.reit_subtype property — Distinguishes equity REITs from mortgage REITs by checking for mortgage-related XBRL concepts
  • Filing agent fingerprinting — Detect the filing agent (Donnelley, EDGAR Online, Workiva, Toppan Merrill) from HTML structure patterns
  • Agent-aware TOC parsing — Section detection now uses agent-specific strategies for the top 4 filing agents
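
The difference between the default range behavior and exact=True for by_date_range() can be illustrated with a small standalone sketch. This is not edgartools code: the Fact dataclass and the filter_by_date_range helper are hypothetical, and the default semantics shown (period start on or after the given start, period end on or before the given end) are an assumption based on the "<=/>=" wording above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:  # hypothetical stand-in for an XBRL fact
    concept: str
    period_start: date
    period_end: date


def filter_by_date_range(facts, start, end, exact=False):
    """Hypothetical helper mirroring the described exact/range semantics."""
    if exact:
        # Keep only facts whose period dates equal the given dates.
        return [f for f in facts if f.period_start == start and f.period_end == end]
    # Assumed default: keep facts whose period lies within [start, end].
    return [f for f in facts if f.period_start >= start and f.period_end <= end]


facts = [
    Fact("Revenue", date(2024, 1, 1), date(2024, 12, 31)),   # full year
    Fact("Revenue", date(2024, 10, 1), date(2024, 12, 31)),  # fourth quarter
]
in_range = filter_by_date_range(facts, date(2024, 1, 1), date(2024, 12, 31))
exact_only = filter_by_date_range(facts, date(2024, 1, 1), date(2024, 12, 31), exact=True)
print(len(in_range), len(exact_only))
```

In this sketch the range filter returns both facts, while the exact filter returns only the full-year fact.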

Bug Fixes

  • Extra newlines in viewer.search() output (#768)
  • business_category misclassifications across 4 patterns (#774)
  • YTD periods missing fiscal_period classification (#771)
  • 61 cash flow gaap_mappings defaulting to section totals
  • Duplicate facts in XBRL DataFrame (#769)
  • period_of_report triggering network calls for local storage users

Performance

  • Cache parsed lxml tree — Eliminate redundant HTML parsing by caching across document operations
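
The general idea behind caching a parsed tree can be sketched as follows. This is only an illustration, not the library's implementation: it uses the standard library's xml.etree.ElementTree as a stand-in for lxml, and the CachedDocument class and its methods are hypothetical.

```python
import xml.etree.ElementTree as ET
from functools import cached_property


class CachedDocument:  # hypothetical, not an edgartools class
    parse_count = 0  # counts how often the markup is actually parsed

    def __init__(self, markup: str):
        self._markup = markup

    @cached_property
    def tree(self) -> ET.Element:
        # Runs once per instance; later accesses reuse the stored tree.
        CachedDocument.parse_count += 1
        return ET.fromstring(self._markup)

    def headings(self):
        return [el.text for el in self.tree.iter("h1")]

    def paragraphs(self):
        return [el.text for el in self.tree.iter("p")]


doc = CachedDocument("<html><body><h1>Item 1</h1><p>Business</p></body></html>")
doc.headings()
doc.paragraphs()
print(CachedDocument.parse_count)
```

Both operations share one parse, so the counter stays at 1 no matter how many methods read the tree.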

Full Changelog: https://github.com/dgunning/edgartools/blob/main/CHANGELOG.md

v5.28.5

10 Apr 02:02

Bug Fixes

  • Fix HTML in to_dataframe() for disclosure TextBlock concepts (#762) — Disclosure/notes statements (e.g., segment tables) contained raw HTML markup in DataFrame cells. Now sanitized to plain text.
  • Add DividendsEquity standard concept for equity statement dividends (#763) — GOOGL's dividend concept (AdjustmentsToAdditionalPaidInCapitalDividendsInExcessOfRetainedEarnings) had no standard_concept mapping on the equity statement. Added DividendsEquity to the equity vocabulary.
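
Reducing an HTML-laden cell to plain text, as described for #762, can be sketched with the standard library alone. The strip_html helper below is hypothetical and is not how edgartools performs the sanitization; it only shows the kind of transformation involved.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes and discards all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(cell: str) -> str:
    collector = _TextCollector()
    collector.feed(cell)
    collector.close()
    # Collapse whitespace runs left behind by removed tags.
    return " ".join(" ".join(collector.parts).split())


raw = "<table><tr><td><b>Segment</b></td><td>Revenue &amp; other</td></tr></table>"
print(strip_html(raw))
```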

Documentation

  • Equity statement data layers guide — New guide explaining why face statement totals, component breakdowns, and disclosure note values differ in XBRL equity data.

Thanks to @BaraVaq for the detailed bug reports.

v5.28.4

05 Apr 18:50

Bug Fixes

  • Remove incorrect StockRepurchasesEquity mapping for tax withholding concept (#760) — Removed an incorrect gaap_mappings.json entry that mapped AdjustmentsRelatedToTaxWithholdingForShareBasedCompensation to StockRepurchasesEquity (confidence 0.364). Tax withholding on RSU vests is not a stock repurchase; the misclassification caused duplicate rows in the statement of equity for companies like AAPL.

  • Add Q/YTD/FY period labels to equity and comprehensive income statements (#759) — Added StatementOfEquity and ComprehensiveIncome to the allow-list in rendering.py, so column headers now show friendly period labels (e.g., "Q1", "YTD") instead of raw date ranges, consistent with other financial statements.

v5.28.3

03 Apr 19:26

Fixed

  • Wrong quarter labels for non-calendar fiscal years — Quarter labels in financial statement columns now use the company's fiscal year end month instead of hardcoded calendar months. Affects companies like AAPL (Sep FY), WMT (Jan FY), NKE (May FY) (#752)

  • Period-type suffixes always present on DataFrame columns — to_dataframe() now always includes period-type suffixes (Q1/Q2/Q3/Q4/YTD/FY) on all duration columns, not just when end dates collide (#753)

  • Incorrect Q4 fiscal year label for Jan-Mar FYE companies — Companies with fiscal years ending in January through March (e.g., NVDA, WMT, HD, CRM) now receive the correct Q4 label rather than a label belonging to the following calendar year (#754)

  • Capex extraction broken by label regex — Capital expenditure extraction now uses XBRL concept names instead of fragile label regex matching, fixing NVDA capex from $101M (wrong) to $6,042M (correct) (#756)
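
Deriving a quarter label from the fiscal year end month rather than from calendar months, as in #752, can be sketched as follows. The fiscal_quarter function is hypothetical and assumes fiscal quarters are consecutive three-month blocks ending in the fiscal year end month; it ignores 52/53-week calendars and is not the library's code.

```python
def fiscal_quarter(period_end_month: int, fye_month: int) -> str:
    """Return 'Q1'..'Q4' for a quarter ending in period_end_month."""
    # Months elapsed since the fiscal year end, in 1..12
    # (12 means the period ends in the fiscal year end month itself).
    months_into_year = (period_end_month - fye_month - 1) % 12 + 1
    return f"Q{(months_into_year + 2) // 3}"


print(fiscal_quarter(12, 9))   # quarter ending December, September fiscal year end
print(fiscal_quarter(12, 12))  # quarter ending December, calendar fiscal year
print(fiscal_quarter(1, 1))    # quarter ending January, January fiscal year end
```

Under these assumptions a December quarter is Q1 for a September fiscal year end but Q4 for a calendar-year company, which is the kind of mismatch hardcoded calendar months would produce.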

v5.28.2

02 Apr 16:13

What's Changed

Bug Fixes

  • business_category misclassifications — Fix ETFs, SPACs, commodity trusts, and BDCs being misclassified (#561)
  • to_dataframe() — Now includes both quarterly and YTD columns instead of dropping one (#743)
  • 13F values — Normalize to dollars across all periods (#749)
  • obj() routing — SC 13D/G forms now correctly route to Schedule13D/13G parsers (#748)
  • find_ticker() — Fix wrong company ticker returned for CIK 1506307 (#745)
  • download_filings — Support download in Jupyter notebooks (#744)
  • reverse_name — Replace with improved implementation
  • Punctuation normalization — Fix handling of digits and percent signs

New Features

  • FDUS investment parser — Add support for FDUS BDC investment parsing (#747)

Documentation

  • Improve SEC Viewer guide with images, ConceptGraph section, and nav entry

v5.28.1

31 Mar 14:27

Bug Fixes

  • TOC section detection for split-link filings — Filings where TOC item labels and descriptive titles link to different anchors (e.g., TSLA 10-K) now validate anchor targets against expected section headings, picking the correct anchor (#742)

  • Non-accrual extraction false positives — Footnotes that explicitly deny non-accrual status (e.g., "there were no investments on non-accrual status") are no longer treated as positive matches. Replaced naive substring matching with two-stage negation-then-affirmation classification

  • Non-accrual period resolution — extract_nonaccrual() now uses filing.period_of_report as anchor for period selection instead of picking the max instant date, which could resolve to filing dates or DEI dates instead of balance sheet dates
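
The two-stage negation-then-affirmation idea from the false-positive fix can be sketched as below. The patterns and the is_nonaccrual function are illustrative assumptions only; the actual patterns used by the library are not given in these notes and are likely broader.

```python
import re

# Illustrative patterns only; a real classifier would need a broader set.
NEGATION = re.compile(r"\b(no|none of the|not)\b[^.]*\bnon-accrual\b", re.IGNORECASE)
AFFIRMATION = re.compile(r"\bnon-accrual\b", re.IGNORECASE)


def is_nonaccrual(footnote: str) -> bool:
    # Stage 1: an explicit denial wins over any mention of the term.
    if NEGATION.search(footnote):
        return False
    # Stage 2: only then treat a mention as a positive match.
    return bool(AFFIRMATION.search(footnote))


denial = "As of year end, there were no investments on non-accrual status."
positive = "This investment was on non-accrual status as of period end."
print(is_nonaccrual(denial), is_nonaccrual(positive))
```

A naive substring check would flag both footnotes; checking for a denial first rejects the first one.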

Full Changelog: v5.28.0...v5.28.1

v5.28.0

30 Mar 17:05

What's New

SEC Interactive Data Viewer

New FilingViewer class provides access to the SEC's XBRL interactive data viewer for any filing — structured period headers, numeric values, scaling information, and concept-level metadata via MetaLinks.json parsing.

ConceptGraph — XBRL Knowledge Graph

New ConceptGraph builds a traversable graph of XBRL concepts and their relationships, enabling structured navigation across the taxonomy hierarchy.
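
What "traversable" means for a concept hierarchy can be shown with a toy structure. TinyConceptGraph is hypothetical and does not reflect the ConceptGraph API, which these notes do not describe; the concept names are used only as sample nodes.

```python
from collections import defaultdict


class TinyConceptGraph:  # hypothetical; not the edgartools ConceptGraph API
    def __init__(self):
        self.children = defaultdict(list)
        self.parent = {}

    def add_edge(self, parent: str, child: str) -> None:
        self.children[parent].append(child)
        self.parent[child] = parent

    def descendants(self, concept: str):
        # Depth-first walk down the hierarchy.
        for child in self.children.get(concept, []):
            yield child
            yield from self.descendants(child)

    def ancestors(self, concept: str):
        # Walk parent links up to the root.
        while concept in self.parent:
            concept = self.parent[concept]
            yield concept


graph = TinyConceptGraph()
graph.add_edge("Assets", "AssetsCurrent")
graph.add_edge("AssetsCurrent", "CashAndCashEquivalentsAtCarryingValue")
graph.add_edge("Assets", "AssetsNoncurrent")
print(list(graph.descendants("Assets")))
print(list(graph.ancestors("CashAndCashEquivalentsAtCarryingValue")))
```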

BDC Non-Accrual Extraction

New extract_nonaccrual() extracts non-accrual investment data from BDC filings using three layered strategies: XBRL footnotes (investment-level detail), custom XBRL concepts, and standard us-gaap fallback.
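
A layered extraction of this kind is commonly structured as an ordered chain of strategies. The sketch below assumes the layers are tried in order and the first non-empty result wins, which the description implies but does not state; every name in it is hypothetical and the filing is modeled as a plain dict.

```python
from typing import Callable, Optional, Sequence

Strategy = Callable[[dict], Optional[list]]


def first_successful(filing: dict, strategies: Sequence[Strategy]) -> Optional[list]:
    for strategy in strategies:
        result = strategy(filing)
        if result:  # None or empty means "fall through to the next layer"
            return result
    return None


# Hypothetical layers, ordered from most to least detailed.
def from_footnotes(filing: dict) -> Optional[list]:
    return filing.get("footnotes")


def from_custom_concepts(filing: dict) -> Optional[list]:
    return filing.get("custom_concepts")


def from_us_gaap(filing: dict) -> Optional[list]:
    return filing.get("us_gaap")


LAYERS = [from_footnotes, from_custom_concepts, from_us_gaap]
filing = {"footnotes": [], "us_gaap": ["total non-accrual amount"]}
print(first_successful(filing, LAYERS))
```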

LLM Integration

  • to_markdown() on Statement, Note, Notes, RenderedStatement, and StatementLineItem — GFM tables optimized for LLM context windows
  • to_context() expanded with detail parameter (minimal/standard/full) across all financial objects
  • compare_context() for LLM-based cross-validation of parsed values against SEC viewer output
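
For readers unfamiliar with GFM tables, the output shape that to_markdown() targets can be approximated with a few lines of plain Python. The to_gfm_table helper is hypothetical and unrelated to the library's renderer, and the figures are made-up placeholders.

```python
def to_gfm_table(headers, rows):
    def fmt(cells):
        # Escape pipes so cell text cannot break the table layout.
        return "| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |"

    lines = [fmt(headers), "| " + " | ".join("---" for _ in headers) + " |"]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)


# Placeholder values, not real financial data.
table = to_gfm_table(["Line item", "FY2024"], [["Revenue", 1000], ["Net income", 250]])
print(table)
```

A table in this form stays compact in a prompt while keeping row and column structure explicit.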

Fixes

  • Text extraction quality — Abbreviations like U.S., D.C., e.g. no longer get split into U. S., D. C. in iXBRL filings (AAPL 10-K had 79 occurrences)
  • LinkBlock.get_text() — Missing f-string prefix caused literal {self.tag} in plain-text output (#740)
  • TOC part metadata — Correct part context propagation for 10-K table of contents (#737)
  • Code quality — 533 ruff issues resolved across the codebase
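
The class of bug behind the LinkBlock.get_text() fix (#740) is easy to reproduce in isolation. The Tag class below is a hypothetical stand-in, not the library's LinkBlock.

```python
class Tag:  # hypothetical stand-in for illustration
    def __init__(self, tag: str):
        self.tag = tag

    def buggy_text(self) -> str:
        return "<{self.tag}>"   # plain string: braces are emitted literally

    def fixed_text(self) -> str:
        return f"<{self.tag}>"  # f-string: the attribute is interpolated


link = Tag("a")
print(link.buggy_text())
print(link.fixed_text())
```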

Documentation

  • SEC Viewer guide with full API reference
  • BDC guide updated with non-accrual analysis section
  • AI integration docs expanded with to_context(), to_markdown(), and compare_context() coverage

Full Changelog: v5.27.0...v5.28.0

v5.27.0

29 Mar 00:01

6 New Data Objects

  • SixK — Form 6-K (Report of Foreign Private Issuer) with cover page metadata, exhibit access, and press release filtering
  • RegistrationS1 — S-1/F-1 registration statements with cover page extraction and prospectus section access
  • DraftRegistrationStatement — DRS/DRS-A confidential draft registration statements
  • XmlFiling — Generic XML+XSLT SEC forms (X-17A-5, TA-1, TA-2, SBSE, ATS-N-C, CFPORTAL, etc.)
  • FundFeeNotice — 24F-2NT annual fund fee notices
  • Prospectus497K — 497K fund summary prospectus filings

Improvements

  • EightK — New content_type property (earnings, cybersecurity, restructuring, etc.), is_amendment, get_exhibit()/get_exhibits(), and context-aware to_context()
  • F-1/F-1A and F-3/F-3A/F-3ASR foreign form support added to RegistrationS1 and RegistrationS3

Bug Fixes

  • 8-K section boundary now captures full body text, not just headings (#733)
  • gaap_mappings: PaymentsToDevelopSoftware and PaymentsForSoftware corrected (#739)
  • Infinite recursion in html() for XML-primary S-1/S-3 filings fixed
  • MunicipalAdvisorForm assert narrowed to MA-I only

Documentation

  • New data object guides for Form 6-K, S-1, DRS, EFFECT, 24F-2NT, and XML filings
  • F-3 foreign shelf forms added to S-3 guide

pip install edgartools==5.27.0

v5.26.1

26 Mar 10:38

Fixed

  • MCP outputSchema validation error — Claude Desktop rejected every MCP tool call with "outputSchema defined but no structured output returned." Removed invalid outputSchema from tool definitions, restoring full MCP functionality (#735)
  • edgar_notes next-steps reference — The next-steps text referenced a non-existent tool name; it now points to a valid tool
  • edgar_screen state filter silently dropped — The state filter was discarded on exchange-only queries, returning unfiltered results
  • edgar_compare growth metrics broken — The growth calculation failed because the time_series fetch was insufficient

Improved

Full Changelog: v5.26.0...v5.26.1

v5.26.0

25 Mar 10:17

What's New

Features

  • S-3 shelf registration data object — New RegistrationS3 class with fee table extraction, shelf lifecycle tracking, prospectus section access, and auto-shelf detection for well-known seasoned issuers (#728)

  • CORRESP/UPLOAD correspondence support — New Correspondence and CorrespondenceThread classes parse SEC correspondence filings with automatic classification and metadata extraction. Filing.correspondence() works on any filing type to find related SEC review threads

  • Point-in-Time mode for EntityFacts — EntityFacts.to_dataframe() now accepts a pit_mode parameter for lookahead-bias-free backtesting (#697)

  • TTM unification on EntityFacts — Unified TTM access with caching and fiscal year quarter labels (PR #721, @ghedo44)
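
The lookahead-bias problem that a point-in-time mode addresses can be illustrated generically: a backtest dated in March should not see a restatement filed in August. The sketch below is not the pit_mode implementation, whose exact semantics are not given here; the as_of helper is hypothetical and the values are made-up placeholders.

```python
from datetime import date

# (period_end, filed, value) tuples; values are made-up placeholders.
facts = [
    (date(2023, 12, 31), date(2024, 2, 1), 100.0),  # original report
    (date(2023, 12, 31), date(2024, 8, 1), 95.0),   # later restatement
    (date(2024, 3, 31), date(2024, 5, 1), 30.0),
]


def as_of(facts, cutoff: date) -> dict:
    """Latest value per period using only facts filed on or before cutoff."""
    known = {}
    for period_end, filed, value in sorted(facts, key=lambda f: f[1]):
        if filed <= cutoff:
            known[period_end] = value  # later filings overwrite earlier ones
    return known


print(as_of(facts, date(2024, 3, 1)))
print(as_of(facts, date(2024, 12, 31)))
```

With the March cutoff only the originally reported value is visible; with the December cutoff the restated value replaces it.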

Bug Fixes

  • TOC named-anchor targets now correctly resolved (#727)
  • Revenue included in promoted income statement dedup
  • Shares concepts preserved in statements and concept lookup (PR #725, @ghedo44)
  • Fixed TypeError in _get_statement_concepts when stmt type is None
  • Unit filter documentation updated

Performance

  • EntityFacts memory usage reduced 27% via string interning and allocation cleanup (AAPL: 20.5 MB → 15.0 MB)
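
String interning saves memory by letting equal strings share one object. The snippet below demonstrates the mechanism with sys.intern from the standard library; it is a general illustration, says nothing about where edgartools applies interning, and does not reproduce the 27% figure.

```python
import sys


def make_label(parts):
    # Built at runtime, so each call produces its own str object.
    return "".join(parts)


a = make_label(["us-gaap:", "Revenues"])
b = make_label(["us-gaap:", "Revenues"])
print(a == b)  # equal content, held in separate objects

ia, ib = sys.intern(a), sys.intern(b)
print(ia is ib)  # interned copies are the same object
```

When the same concept name or unit string appears across many thousands of facts, sharing one object per distinct value removes the duplicate allocations.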

Data

  • Bundled ticker and CUSIP reference data refreshed

Docs

Contributors

Thanks to @ghedo44 for PRs #721 and #725!