Normalize percent-encoded URIs used as workspace dict keys #594

Merged: tombh merged 1 commit into openlawlibrary:main from edvilme:fix/normalize-uri-dict-keys on Mar 6, 2026

Contributor

@edvilme edvilme commented Mar 3, 2026

Problem

URIs used as dictionary keys in the Workspace class are stored and looked up without normalization. If a client sends a percent-encoded URI (e.g., file:///C%3A/foo) on one request and a decoded form (file:///C:/foo) on another, lookups fail with KeyError because the raw strings differ despite referring to the same resource.

This is the same class of issue as microsoft/pyright#11293, where percent-encoded Windows drive letter colons (%3A) caused path resolution failures.
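
The mismatch can be reproduced with a plain dictionary. This is a minimal sketch, not the project's code: the `documents` dict is a hypothetical stand-in for the workspace's internal mappings, and the URIs are the ones from the description above.

```python
from urllib.parse import unquote

# Hypothetical stand-in for one of the workspace's URI-keyed dictionaries.
documents: dict[str, str] = {}

encoded = "file:///C%3A/foo"
decoded = "file:///C:/foo"

# A client opens the document using the percent-encoded form...
documents[encoded] = "document contents"

# ...and a later request uses the decoded form, so the raw key is not found.
found_raw = decoded in documents

# Decoding the stored key shows that both strings name the same resource.
same_after_unquote = unquote(encoded) == decoded
```

Here `found_raw` is `False` even though `same_after_unquote` is `True`, which is the situation that surfaces as a `KeyError` on lookup.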

Fix

Apply urllib.parse.unquote to all URIs before using them as dictionary keys in:

  • _text_documents
  • _notebook_documents
  • _cell_in_notebook
  • _folders

This ensures equivalent URIs that differ only in percent encoding consistently map to the same entry.
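
The approach can be sketched roughly as follows. `MiniWorkspace`, `normalize_uri`, and the method names are hypothetical and only illustrate the idea of decoding at every dictionary access; the actual change in the Workspace class may be structured differently.

```python
from urllib.parse import unquote


def normalize_uri(uri: str) -> str:
    """Decode percent-escapes so equivalent URIs share one dictionary key."""
    return unquote(uri)


class MiniWorkspace:
    """Hypothetical, simplified stand-in for Workspace (illustration only)."""

    def __init__(self) -> None:
        self._text_documents: dict[str, str] = {}

    def put_text_document(self, uri: str, source: str) -> None:
        # Normalize on write so the stored key is always the decoded form.
        self._text_documents[normalize_uri(uri)] = source

    def get_text_document(self, uri: str) -> str:
        # Normalize on read so either form of the URI finds the entry.
        return self._text_documents[normalize_uri(uri)]

    def remove_text_document(self, uri: str) -> None:
        self._text_documents.pop(normalize_uri(uri), None)
```

With this in place, a document stored under `file:///C%3A/foo` can be read or removed with `file:///C:/foo`, and vice versa.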

Tests

Added 7 new tests covering:

  • Text document put/get/remove with mixed encoded/decoded URIs
  • Notebook document put/get/update with mixed encoded/decoded URIs
  • Folder add/remove with mixed encoded/decoded URIs

All 26 tests pass (19 existing + 7 new).
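
A mixed-form test might look something like the sketch below. It is not taken from the PR's test suite: `FolderRegistry` and both test names are hypothetical, and the registry only mimics the `_folders` handling described above.

```python
from urllib.parse import unquote

ENCODED = "file:///C%3A/foo"
DECODED = "file:///C:/foo"


class FolderRegistry:
    """Hypothetical stand-in for the workspace's _folders handling."""

    def __init__(self) -> None:
        self._folders: dict[str, str] = {}

    def add_folder(self, uri: str, name: str) -> None:
        self._folders[unquote(uri)] = name

    def remove_folder(self, uri: str) -> None:
        self._folders.pop(unquote(uri), None)


def test_add_encoded_remove_decoded() -> None:
    registry = FolderRegistry()
    registry.add_folder(ENCODED, "foo")
    registry.remove_folder(DECODED)
    assert registry._folders == {}


def test_add_both_forms_keeps_one_entry() -> None:
    registry = FolderRegistry()
    registry.add_folder(DECODED, "foo")
    registry.add_folder(ENCODED, "foo")
    assert list(registry._folders) == [DECODED]
```

Both functions follow the pytest naming convention, so a test runner would collect them as written.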

Collaborator

@tombh tombh left a comment

This looks good, I think you just need to add feat: to the front of the commit message.

@edvilme force-pushed the fix/normalize-uri-dict-keys branch from 14c6408 to 29d04f5 on March 4, 2026 17:44
Contributor Author

edvilme commented Mar 4, 2026

> This looks good, I think you just need to add feat: to the front of the commit message.

Updated the commit message, thanks!

@edvilme edvilme requested a review from tombh March 4, 2026 18:05
URIs used as dictionary keys in the Workspace class were stored and
looked up without normalization. If a client sent a percent-encoded URI
(e.g. file:///C%3A/foo) on one request and a decoded form (file:///C:/foo)
on another, lookups would fail with KeyError since the raw strings differ.

Apply urllib.parse.unquote to all URIs before using them as keys in
_text_documents, _notebook_documents, _cell_in_notebook, and _folders
dictionaries. This ensures equivalent URIs that differ only in percent
encoding map to the same entry.
@edvilme force-pushed the fix/normalize-uri-dict-keys branch from 29d04f5 to 01fc44e on March 4, 2026 19:26
@tombh tombh merged commit 27c74d3 into openlawlibrary:main Mar 6, 2026
18 checks passed
Collaborator

tombh commented Mar 6, 2026

Thank you
