Strip wrapper directory from generated doc URLs #49

Open
danceratopz wants to merge 3 commits into SamWilsn:master from danceratopz:strip-non-package-path
@danceratopz
What

When discovery is configured with a wrapper directory like paths = ["src"], drop that prefix from every generated URL. URLs become docc/build.py.html instead of src/docc/build.py.html — the source-tree directory belongs to the repo, not the package being documented.

Approach

Auto-detect at discovery time, no new config:

A discovery root is stripped iff it is not itself a Python package (i.e. contains no __init__.py).

| paths entry | Has __init__.py? | Behavior |
|---|---|---|
| src/ | no → wrapper | strip → pkg/... |
| . | no → wrapper | strip (no-op) |
| mypkg/ | yes → package | keep → mypkg/... |
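
For illustration, here is a minimal sketch of the rule above. It is not the code in this PR: `is_wrapper_root`, `output_path_for`, and `demo` are hypothetical names, and renaming `__init__.py` to `index` is assumed from the `output_path` snippet discussed in the review.

```python
import tempfile
from pathlib import Path


def is_wrapper_root(root: Path) -> bool:
    """A discovery root is a wrapper iff it is not itself a package."""
    return not (root / "__init__.py").is_file()


def output_path_for(project: Path, root: Path, source: Path) -> Path:
    """Hypothetical helper: URL-relative path for one discovered file."""
    # Wrapper roots are dropped from the URL; package roots keep their name.
    base = source.relative_to(root if is_wrapper_root(root) else project)
    # Assumption: __init__.py is rendered as the package's index page.
    return base.with_name("index") if base.name == "__init__.py" else base


def demo() -> dict:
    """Build a throwaway tree with one wrapper root and one package root."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        for rel in (
            "src/pkg/__init__.py",
            "src/pkg/build.py",
            "mypkg/__init__.py",
            "mypkg/util.py",
        ):
            path = project / rel
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()
        cases = (
            ("src", "src/pkg/build.py"),
            ("src", "src/pkg/__init__.py"),
            ("mypkg", "mypkg/util.py"),
            ("mypkg", "mypkg/__init__.py"),
        )
        return {
            rel: output_path_for(project, project / root, project / rel).as_posix()
            for root, rel in cases
        }
```

In this sketch `src/pkg/build.py` maps to `pkg/build.py`, while `mypkg/util.py` keeps its `mypkg/` prefix because `mypkg/` contains an `__init__.py`.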

PythonSource.relative_path is unchanged (still project-rooted, used for source-link display); only output_path shifts.

Listing module

Once output_path and relative_path can disagree, the listing's hierarchy lookups need a single source of truth. Two small helpers do this:

  • _hierarchy_path(source) — navigation-tree position; for an index source (__init__.py or ListingSource) it is the directory indexed, otherwise output_path.
  • _display_path(source) — file-style display path; rejoins the original filename to the URL-relative directory so a stripped __init__.py still shows under its package name.

Listing.add_source / descendants / siblings and the listing sort keys all key off _hierarchy_path. siblings() of an index source returns its own descendants — a __init__.py is a member of the package it defines, not a sibling of the parent directory's entries.
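
A rough sketch of how the two helpers could behave, under stated assumptions. `Src` is a hypothetical stand-in rather than docc's real source class, the helper names drop the leading underscore to mark them as illustrations, and displaying a generated listing as its directory is a guess, not something the PR text confirms.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class Src:
    """Hypothetical stand-in for a source; not docc's real class."""

    output_path: PurePosixPath  # URL-relative, wrapper already stripped
    filename: Optional[str]  # original file name; None for a generated listing

    @property
    def is_index(self) -> bool:
        return self.filename is None or self.filename == "__init__.py"


def hierarchy_path(source: Src) -> PurePosixPath:
    """Navigation-tree position: the indexed directory for index sources."""
    return source.output_path.parent if source.is_index else source.output_path


def display_path(source: Src) -> PurePosixPath:
    """File-style path shown to the reader."""
    if source.filename is None:
        # Assumption: a generated listing is displayed as its directory.
        return hierarchy_path(source)
    # Rejoin the original file name to the URL-relative directory.
    return source.output_path.parent / source.filename


init = Src(PurePosixPath("docc/index"), "__init__.py")
module = Src(PurePosixPath("docc/build.py"), "build.py")
listing = Src(PurePosixPath("docc/plugins/index"), None)
```

Here `hierarchy_path(module).parent` equals `hierarchy_path(init)`, which is why a regular module's siblings and an `__init__.py`'s own members end up being the same directory.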

Tests

Added tests/test_python_discover.py and tests/test_listing.py covering:

  • wrapper roots stripped, package roots kept, mixed roots, namespace-style wrappers
  • _hierarchy_path / _display_path for __init__.py, ListingSource, regular modules
  • Listing.add_source / descendants / siblings invariants, including the __init__.py-is-its-own-package case

Verification

Built this repo's docs locally: tree starts at docc/, docs/docc/index.html lists all package members with correct links, no src/ leaks anywhere in the navigation.

Auto-detect whether each discovery root is itself a Python package: if
the root has no __init__.py it's treated as a wrapper (e.g. ``src/``)
and dropped from output_path so generated URLs start at the top-level
package, not the source-tree directory. Roots that *are* packages keep
their name.

The listing plugin's hierarchy lookups now key off output_path (via a
small _hierarchy_path helper) so navigation matches the URL tree once
the wrapper is stripped, and listing entries use a separate
_display_path so a file-backed index like ``__init__.py`` still shows
under its directory rather than the source-tree name.

An `__init__.py` source's hierarchy position is the directory it
indexes, so `_hierarchy_path(...).parent` pointed one level too high
and `siblings()` returned the parent listing instead of the package's
own members. Treat an index source as a member of the directory it
indexes when computing siblings, matching what regular files in the
same package see.

Drop the per-discover strip cache: detect each root's wrapper status
inline in `discover()` and pass `strip_root` directly to the source
factory. Collapse `output_path`, `_hierarchy_path`, `_display_path`,
and `Listing.siblings` to their conditional one-liners and reuse
`descendants()` for index-source siblings. Pure simplification — no
behavior change.
Owner

@SamWilsn SamWilsn left a comment

Two alternate proposals we could consider:

  • We use os.path.commonpath on all of the paths from tool.docc.plugins."docc.python.discover".paths, and just strip that from everything. For our case of just src, it works. If someone else were to have src1 and src2, it wouldn't smush everything into one listing.
  • Instead of stripping prefixes, we do what GitHub does and shortcut any listing with only one child. The advantage here is that this isn't Python-specific, and would work for any listing. For us, this would mean [./diffs/, ./src/] would become [./diffs/, ./src/ethereum/]. An even bigger advantage would be cleaning up the diffs listing.
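
Both proposals can be sketched in a few lines, purely as an illustration. `strip_common_prefix` and `collapse_single_child` are hypothetical names, and the nested-dict tree (a dict for a directory, `None` for a file) is an assumed stand-in for the real listing structure.

```python
import os.path
from pathlib import PurePosixPath


def strip_common_prefix(roots, files):
    """First proposal: strip os.path.commonpath of the configured roots."""
    common = os.path.commonpath(roots)
    if not common:
        # Roots such as src1 and src2 share nothing, so nothing is stripped.
        return [PurePosixPath(f) for f in files]
    return [PurePosixPath(f).relative_to(common) for f in files]


def collapse_single_child(tree):
    """Second proposal: merge a directory with its only child directory."""
    collapsed = {}
    for name, child in tree.items():
        # Walk down while the directory holds exactly one subdirectory.
        while isinstance(child, dict) and len(child) == 1:
            ((only_name, only_child),) = child.items()
            if not isinstance(only_child, dict):
                break  # a lone file stays inside its directory
            name, child = f"{name}/{only_name}", only_child
        if isinstance(child, dict):
            collapsed[name] = collapse_single_child(child)
        else:
            collapsed[name] = child
    return collapsed


tree = {
    "diffs": {"a.html": None, "b.html": None},
    "src": {"ethereum": {"__init__.py": None, "frontier": {"fork.py": None}}},
}
```

With this tree, the top-level entries after collapsing are `diffs` and `src/ethereum`, matching the `[./diffs/, ./src/ethereum/]` example in the comment above.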

Comment on lines +215 to +222

+        An index source like ``__init__.py`` is treated as a member of
+        the directory it indexes, so its siblings are that directory's
+        entries rather than entries one level higher in the tree.
+        """
-        source_path = source.relative_path or source.output_path
-        return self.sources[source_path.parent]
+        if Listable._index_dir(source) is not None:
+            return self.descendants(source)
+        return self.sources[_hierarchy_path(source).parent]
Owner

In add_source, if we're still adding the index page in two places, why do we need to change this?

Comment on lines +424 to +425
display = _display_path(source)
path = display.name if node.leaf else str(display)
Owner

Suggested change
- display = _display_path(source)
- path = display.name if node.leaf else str(display)
+ path = _display_path(source).name if node.leaf else str(display)

Tiny micro optimization :P

Comment on lines +95 to +96
*,
strip_root: bool = False,
Owner

Would you mind also bumping the major version in src/docc/__init__.py? I believe this is a breaking change.

Comment on lines +197 to +202
base = (
    self.absolute_path.relative_to(self.root_path)
    if self._strip_root
    else self._relative_path
)
return base.with_name("index") if self._is_init() else base
Owner

Suggested change
- base = (
-     self.absolute_path.relative_to(self.root_path)
-     if self._strip_root
-     else self._relative_path
- )
- return base.with_name("index") if self._is_init() else base
+ if self._strip_root:
+     base = self.absolute_path.relative_to(self.root_path)
+ else:
+     base = self._relative_path
+ return base.with_name("index") if self._is_init() else base

Comment on lines +26 to +32
return PluginSettings(
    Settings(
        root,
        {"tool": {"docc": {"plugins": {"docc.python.discover": {}}}}},
    ),
    {"paths": list(paths)},
)
Owner

A bit more repetitive I guess, but also more correct:

Suggested change
- return PluginSettings(
-     Settings(
-         root,
-         {"tool": {"docc": {"plugins": {"docc.python.discover": {}}}}},
-     ),
-     {"paths": list(paths)},
- )
+ plugin = "docc.python.discover"
+ return Settings(
+     root,
+     {"tool": {"docc": {"plugins": {plugin: {"paths": list(paths)}}}}}
+ ).for_plugin(plugin)

]


def test_package_root_is_kept(tmp_path: Path) -> None:
Owner

Why do we preserve the directory if it contains an __init__.py?

At first I was like, yeah, that makes sense. Now I'm kinda questioning it.

),
)

discover = PythonDiscover(_settings(tmp_path, ("src", "lib")))
Owner

Good test!
