Use UTF-8 for filesystem paths#50

Merged
cwbaker merged 25 commits into
mainfrom
use-utf8-for-filesystem-paths
May 29, 2026

@cwbaker cwbaker commented May 18, 2026

Owner

Replaces boost::filesystem with std::filesystem so that the Boost dependency can be removed. Uses UTF-8 for all paths by specifying a manifest on Windows and explicitly converting paths to UTF-8 when they're detected in forge_hooks.dll.

@cwbaker cwbaker self-assigned this May 18, 2026
@cwbaker cwbaker force-pushed the use-utf8-for-filesystem-paths branch 12 times, most recently from 330bf18 to 13bb415 Compare May 25, 2026 09:28
@cwbaker cwbaker force-pushed the use-utf8-for-filesystem-paths branch from 13bb415 to 1d1ece3 Compare May 28, 2026 08:10
cwbaker added 15 commits May 28, 2026 20:57
Upgrades to C++17 for all platforms.
Build with C++17 in bootstrap scripts and forge.lua.  There's no
portable local time conversion in C++20 and Forge doesn't use any other
features from C++20.  Might as well revert to C++17 for the increased
compatibility.

Bump dependency graph serialization format version to 33.  The last
write time for a target has changed from 32-bit time_t to a 64-bit
file_time_type duration.  This breaks compatibility with the existing
format.
Changes tested behavior to require a drive letter for absolute paths on
Windows.  The update to use std::filesystem instead of Boost filesystem
has brought with it a change in behavior of `is_absolute()` -- it now
requires a drive letter to be considered an absolute path on Windows.
This is better behavior than before, where paths that started with "/"
on Windows might end up being placed on different drives.

Fixed by updating the tests to reflect the change in behavior.
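
The change in `is_absolute()` can be observed with a small probe like the one below. `PathShape` and `classify` are hypothetical names used only for this sketch; they are not part of Forge or its tests.

```cpp
#include <filesystem>
#include <string>

// Hypothetical probe (not from the Forge tests): reports the pieces of a
// path that std::filesystem considers when deciding whether it is absolute.
struct PathShape
{
    bool has_root_name;      // e.g. "C:" on Windows
    bool has_root_directory; // a leading "/" (or "\" on Windows)
    bool is_absolute;
};

PathShape classify(const std::string& text)
{
    const std::filesystem::path path(text);
    return {path.has_root_name(), path.has_root_directory(), path.is_absolute()};
}
```

On Windows, `classify("/usr/lib")` has a root directory but no root name, so `is_absolute` is false, while `classify("C:/usr/lib")` is absolute. On POSIX systems `"/usr/lib"` is absolute because no root name is needed.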
Converts wide character paths to UTF-8 in forge_hooks_windows.cpp.
The Windows Forge executable will ship with a manifest that sets its
expected code page to be UTF-8 but because the forge_hooks DLL is
injected into processes it needs to explicitly convert wide characters
to UTF-8.  The main Forge executable can then interpret these paths as
UTF-8 too.
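
To show what converting wide characters to UTF-8 involves, here is a portable, hand-rolled UTF-16 to UTF-8 encoder. It is a sketch only: `utf16_to_utf8` is a hypothetical function, it is not the code in forge_hooks_windows.cpp, and it deliberately avoids any Windows API so that it runs anywhere. Replacing unpaired surrogates with U+FFFD is a choice made for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, portable encoder (not the Forge implementation): converts
// UTF-16 code units, as used by Windows wide-character paths, to UTF-8.
std::string utf16_to_utf8(const std::u16string& text)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        char32_t cp = text[i];
        const bool high = cp >= 0xD800 && cp <= 0xDBFF;
        const bool next_is_low =
            i + 1 < text.size() && text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF;
        if (high && next_is_low)
        {
            // Combine a surrogate pair into one code point above U+FFFF.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (text[i + 1] - 0xDC00);
            ++i;
        }
        else if (cp >= 0xD800 && cp <= 0xDFFF)
        {
            cp = 0xFFFD; // unpaired surrogate: emit the replacement character
        }

        if (cp < 0x80)
        {
            out += static_cast<char>(cp);
        }
        else if (cp < 0x800)
        {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

A single UTF-16 code unit never produces more than three UTF-8 bytes here: characters outside the Basic Multilingual Plane take four bytes but consume two code units.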
Any files specified in the "manifests" attribute of a target are
collected, made absolute, and passed to "mt" invocations to be built
into the embedded manifest that is linked into the executable.
Not strictly necessary because the on disk format hasn't changed but
this will avoid any errors where wide character paths weren't converted
correctly on Windows and written to the dependency graph on disk.
The original console output code page is saved and restored around
running Forge.  This ensures that any output, in particular UTF-8
encoded paths, printed to the console is shown correctly.
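
Saving and restoring the console output code page fits naturally into an RAII guard. The sketch below assumes the code page is switched to UTF-8 in between, which the commit message implies but does not state; `ConsoleOutputUtf8Guard` is a hypothetical name, and on non-Windows platforms the guard does nothing.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical guard (not the Forge implementation): remembers the console
// output code page, switches to UTF-8, and restores the original on exit.
class ConsoleOutputUtf8Guard
{
public:
    ConsoleOutputUtf8Guard()
    {
#ifdef _WIN32
        original_ = GetConsoleOutputCP(); // returns 0 on failure
        if (original_ != 0)
        {
            SetConsoleOutputCP(CP_UTF8);
        }
#endif
    }

    ~ConsoleOutputUtf8Guard()
    {
#ifdef _WIN32
        if (original_ != 0)
        {
            SetConsoleOutputCP(original_);
        }
#endif
    }

    ConsoleOutputUtf8Guard(const ConsoleOutputUtf8Guard&) = delete;
    ConsoleOutputUtf8Guard& operator=(const ConsoleOutputUtf8Guard&) = delete;

    // True when a code page was saved and will be restored.
    bool active() const { return original_ != 0; }

private:
    unsigned int original_ = 0;
};
```

Declaring one instance at the top of `main` would cover the whole run, including early returns and exceptions that unwind through `main`.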
This is to support the maximum path length on Windows which we
previously didn't do.  Buffers are 32768 * 3 + 4 + 1 to account for
maximum characters converted to UTF-8 plus "\\?\" and a null
terminator.
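
The buffer arithmetic above works out as follows. The names are hypothetical and chosen for this sketch; the factor of three reflects that one UTF-16 code unit expands to at most three UTF-8 bytes (a four-byte UTF-8 sequence always comes from a two-unit surrogate pair).

```cpp
#include <cstddef>

// Hypothetical constants (names are not from the Forge source) spelling out
// the buffer size given in the commit message: 32768 * 3 + 4 + 1.
constexpr std::size_t max_wide_units = 32768;        // maximum path length in wide characters
constexpr std::size_t max_utf8_bytes_per_unit = 3;   // worst case per UTF-16 code unit
constexpr std::size_t extended_prefix_length = 4;    // the four characters of "\\?\"
constexpr std::size_t terminator_length = 1;         // trailing null

constexpr std::size_t utf8_path_buffer_size(std::size_t wide_units)
{
    return wide_units * max_utf8_bytes_per_unit + extended_prefix_length + terminator_length;
}

static_assert(utf8_path_buffer_size(max_wide_units) == 98309,
              "32768 * 3 + 4 + 1 bytes");
```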
- Remove Boost from bootstrap scripts
- Remove Boost references from AGENTS.md
- Remove unused boost_integration.cpp from src/error/error.forge
- Remove BOOST_ALL_NO_LIB definitions that disable linking to Boost
- Remove now unused src/boost/**
@cwbaker cwbaker force-pushed the use-utf8-for-filesystem-paths branch from 1d1ece3 to 2ea1631 Compare May 28, 2026 08:59
@cwbaker cwbaker merged commit 2ea1631 into main May 29, 2026
3 checks passed
@cwbaker cwbaker deleted the use-utf8-for-filesystem-paths branch May 29, 2026 08:33
@cwbaker cwbaker temporarily deployed to github-pages May 29, 2026 08:34 — with GitHub Pages Inactive