This document describes the format and purpose of function_index.json, a lightweight index used by AI agents to locate functions quickly without scanning hundreds of generated .cpp files or querying the SQLite database for basic metadata.
The index is generated only when C++/report generation is enabled (e.g., --generate-cpp) and at least one function exists in the functions table. It includes both functions with usable decompiled output and functions where decompilation failed.
The index is written into the per-module output directory:
{cpp_output_dir}/{module_name}/function_index.json
module_name is derived from the input file name as {stem}_{extension} (extension without the dot), then sanitized with CppGenerator.sanitize_filename() (:: -> _, non [a-zA-Z0-9_.-] replaced with _, truncated to 100 chars). For example: kernel32.dll -> kernel32_dll.
Output directory resolution:
main.pydefault: if--cpp-output-diris not provided, output goes to{sqlite_db_dir}/extracted_raw_code/{module_name}/.main.pywith--cpp-output-dir: output goes to{cpp_output_dir}/{module_name}/.headless_batch_extractor.ps1: passes--cpp-output-dir "{StorageDir}/extracted_code", so the index is written to{StorageDir}/extracted_code/{module_name}/.
The index maps each function name to the generated .cpp file(s) containing it, plus lightweight metadata (function_id, decompilation/assembly availability, and optional library tag).
{
"<function_name>": {
"files": ["string", ...],
"library": "WIL | STL | WRL | CRT | ETW/TraceLogging | null",
"function_id": 123,
"has_decompiled": true,
"has_assembly": true
}
}function_name(object key): The extracted function name, matchingfunctions.function_namefrom the analysis database. This includes C++ class methods (Class::Method), thunks, and demangled names when available.files: List of generated.cppfilenames containing the function. Files are located in the same directory as the index. Most functions appear in a single file (one-element list). Functions with the same demangled name across different grouping boundaries appear in multiple files. The list is empty ([]) when decompilation failed and no C++ output file exists.library: Optional tag for known library/runtime boilerplate. Values:WIL— Windows Implementation Library (wil::,wistd::, or mangled@wil@@,@wistd@@)STL— C++ standard library (std::,stdext::, or mangled@std@@,@stdext@@)WRL— Windows Runtime C++ Template Library (Microsoft::WRL::)CRT— C/C++ runtime support (__scrt_,__acrt_,_CRT_)ETW/TraceLogging— TraceLogging and ETW helpers (_tlgWrite,TraceLoggingCorrelationVector::)null— No library match (treat as application code)
function_id: Integer primary key fromfunctions.function_id.has_decompiled: Boolean flag.truemeans valid decompiled output was available and the function was emitted to a.cppfile;falsemeans decompilation failed or was unavailable.has_assembly: Boolean flag.truemeansfunctions.assembly_codewas present for this function;falsemeans no assembly listing was stored.
{
"IsFamilyProvisioned": {
"files": ["appinfo_dll_standalone_group_50.cpp"],
"library": null,
"function_id": 861,
"has_decompiled": true,
"has_assembly": true
},
"CSyncMLDPU::AppendAlertStatus": {
"files": ["coredpus_dll_CSyncMLDPU_group_1.cpp"],
"library": null,
"function_id": 1732,
"has_decompiled": true,
"has_assembly": true
},
"ClientBase::CreateInstallRequest": {
"files": [
"AppXDeploymentClient_dll_Windows_group_32.cpp",
"AppXDeploymentClient_dll_Windows_group_33.cpp",
"AppXDeploymentClient_dll_Windows_group_34.cpp"
],
"library": null,
"function_id": 4201,
"has_decompiled": true,
"has_assembly": true
},
"wil::details_abi::ProcessLocalStorageData<...>::MakeAndInitialize": {
"files": ["appinfo_dll_standalone_group_1.cpp"],
"library": "WIL",
"function_id": 37,
"has_decompiled": true,
"has_assembly": true
},
"SomeFailedFunc": {
"files": [],
"library": null,
"function_id": 42,
"has_decompiled": false,
"has_assembly": true
}
}- When a function name appears in multiple generated files (due to duplicate demangled names across grouping boundaries), all files are recorded in the
"files"list and a warning is logged. - Functions with failed or unavailable decompilation are still included with
"files": []and"has_decompiled": falseso agents can discover all database functions from the index alone. - Rich function-level metadata such as signatures and xref details remains in the SQLite database and
file_info.json.