[XeGPU] uArch definition (PR 1/N)#2
Conversation
4666e33 to
5dd0ebe
Compare
There was a problem hiding this comment.
this one should be on the top I guess
There was a problem hiding this comment.
Ah, yes, I think I auto-added by co-pilot suggestion. Will remove it :)
|
A general question on the structure - what's the benefit of maintaining both yaml and C++ definitions? |
There was a problem hiding this comment.
Opaque string mapping doesn't feel user friendly. Especially as it seems to be wrapping numerical bit width that could be a numerical key directly - I assume it's some limitation due to using (conversion from) yaml.
matrix_size is even worse offender in that regard. These wrappers are hard to use without yaml definitions which again makes me wonder if we really need to split it into two separate files.
There was a problem hiding this comment.
Opaque string mapping doesn't feel user friendly. Especially as it seems to be wrapping numerical bit width that could be a numerical key directly - I assume it's some limitation due to using (conversion from) yaml.
You are right, the limitation is due to the yaml mapping.
matrix_size is even worse offender in that regard. These wrappers are hard to use without yaml definitions which again makes me wonder if we really need to split it into two separate files.
Sorry, the matrix size is a mistake. The value should be a vector of uint. But yes, in general you are right, the structs and and yaml to some extent have to be used in conjunction.
The C++ structure actually comes out of a necessity. To use yaml mapping utlity in LLVM, we have to have a C++ object mapped to it, hence the structures. And since we need the C++ structures anyway, I wanted to make some of them to be re-usable across uArchs.
Not really, not in this context anyway. But we wanted to keep the base structs in a such a way that they can be used in a non-LLVM cases, but that's more of hope not necessasity.
I thought about this, but it takes me back to tablegen. Should we use tablegen? |
It feels like the overhead of this approach in its complexity and possible maintenance cost is not really justified. On its own it seems fine but these C++ bindings are not great. 😅
It's worth a try if what we need easily translates to tablegen entries and structure. That is, if you find yourself at any point fighting against tablegen or in need to introduce custom "hacks", I would not necessarily double down on it. I would suggest starting with user interfaces to flesh out what's really needed. Then it should become clearer where to store all the uarch details e.g., embed shared memory size directly in Next quick test would be to see how well it all holds up for another target like battlemage. |
Thanks, Adam. I think that's a good idea. Let me try to collect the uArch info battlemage. I looked into them during the design, but more details may help us finding the approach that would be easy to maintain. |
llvm#138091) Check this error for more context (https://github.com/compiler-research/CppInterOp/actions/runs/14749797085/job/41407625681?pr=491#step:10:531) This fails with ``` * thread #1, name = 'CppInterOpTests', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x55500356d6d3) * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99 frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830 frame #2: 0x00007fffee20917a libclangCppInterOp.so.21.0gitstd::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 58 frame #3: 0x00007fffee224796 libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 838 frame #4: 0x00007fffee22494d libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 13 frame llvm#5: 0x00007fffed95ec62 libclangCppInterOp.so.21.0gitclang::IncrementalCUDADeviceParser::~IncrementalCUDADeviceParser() + 98 frame llvm#6: 0x00007fffed9551b6 libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 102 frame llvm#7: 0x00007fffed95598d libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 13 frame llvm#8: 0x00007fffed9181e7 libclangCppInterOp.so.21.0gitcompat::createClangInterpreter(std::vector<char const*, std::allocator<char const*>>&) + 2919 ``` Problem : 1) The destructor currently handles no clearance for the DeviceParser and the DeviceAct. We currently only have this https://github.com/llvm/llvm-project/blob/976493822443c52a71ed3c67aaca9a555b20c55d/clang/lib/Interpreter/Interpreter.cpp#L416-L419 2) The ownership for DeviceCI currently is present in IncrementalCudaDeviceParser. But this should be similar to how the combination for hostCI, hostAction and hostParser are managed by the Interpreter. As on master the DeviceAct and DeviceParser are managed by the Interpreter but not DeviceCI. This is problematic because : IncrementalParser holds a Sema& which points into the DeviceCI. On master, DeviceCI is destroyed before the base class ~IncrementalParser() runs, causing Parser::reset() to access a dangling Sema (and as Sema holds a reference to Preprocessor which owns PragmaNamespace) we see this ``` * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99 frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830 ```
Fix for: `Assertion failed: (false && "Architecture or OS not supported"), function CreateRegisterContextForFrame, file /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/elf-core/ThreadElfCore.cpp, line 182. PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and include the crash backtrace. #0 0x000000080cd857c8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/src/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13 #1 0x000000080cd85ed4 /usr/src/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:797:3 #2 0x000000080cd82ae8 llvm::sys::RunSignalHandlers() /usr/src/contrib/llvm-project/llvm/lib/Support/Signals.cpp:104:5 #3 0x000000080cd861f0 SignalHandler /usr/src/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:403:3 #4 0x000000080f159644 handle_signal /usr/src/lib/libthr/thread/thr_sig.c:298:3 `
The mcmodel=tiny memory model is only valid on ARM targets. While trying this on X86 compiler throws an internal error along with stack dump. llvm#125641 This patch resolves the issue. Reduced test case: ``` #include <stdio.h> int main( void ) { printf( "Hello, World!\n" ); return 0; } ``` ``` 0. Program arguments: /opt/compiler-explorer/clang-trunk/bin/clang++ -gdwarf-4 -g -o /app/output.s -fno-verbose-asm -S --gcc-toolchain=/opt/compiler-explorer/gcc-snapshot -fcolor-diagnostics -fno-crash-diagnostics -mcmodel=tiny <source> 1. <eof> parser at end of file #0 0x0000000003b10218 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b10218) #1 0x0000000003b0e35c llvm::sys::CleanupOnSignal(unsigned long) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b0e35c) #2 0x0000000003a5dbc3 llvm::CrashRecoveryContext::HandleExit(int) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a5dbc3) #3 0x0000000003b05cfe llvm::sys::Process::Exit(int, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b05cfe) #4 0x0000000000d4e3eb LLVMErrorHandler(void*, char const*, bool) cc1_main.cpp:0:0 llvm#5 0x0000000003a67c93 llvm::report_fatal_error(llvm::Twine const&, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a67c93) llvm#6 0x0000000003a67df8 (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a67df8) llvm#7 0x0000000002549148 llvm::X86TargetMachine::X86TargetMachine(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, std::optional<llvm::Reloc::Model>, std::optional<llvm::CodeModel::Model>, llvm::CodeGenOptLevel, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x2549148) llvm#8 0x00000000025491fc llvm::RegisterTargetMachine<llvm::X86TargetMachine>::Allocator(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, std::optional<llvm::Reloc::Model>, std::optional<llvm::CodeModel::Model>, llvm::CodeGenOptLevel, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x25491fc) llvm#9 0x0000000003db74cc clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3db74cc) llvm#10 0x0000000004460d95 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4460d95) llvm#11 0x00000000060005ec clang::ParseAST(clang::Sema&, bool, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x60005ec) llvm#12 0x00000000044614b5 clang::CodeGenAction::ExecuteAction() (/opt/compiler-explorer/clang-trunk/bin/clang+++0x44614b5) llvm#13 0x0000000004737121 clang::FrontendAction::Execute() (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4737121) llvm#14 0x00000000046b777b clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x46b777b) llvm#15 0x00000000048229e3 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x48229e3) llvm#16 0x0000000000d50621 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd50621) llvm#17 0x0000000000d48e2d ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0 llvm#18 0x00000000044acc99 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0 llvm#19 0x0000000003a5dac3 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a5dac3) llvm#20 0x00000000044aceb9 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0 llvm#21 0x00000000044710dd clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x44710dd) llvm#22 0x0000000004472071 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4472071) llvm#23 0x000000000447c3fc clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x447c3fc) llvm#24 0x0000000000d4d2b1 clang_main(int, char**, llvm::ToolContext const&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd4d2b1) llvm#25 0x0000000000c12464 main (/opt/compiler-explorer/clang-trunk/bin/clang+++0xc12464) llvm#26 0x00007ae43b029d90 (/lib/x86_64-linux-gnu/libc.so.6+0x29d90) llvm#27 0x00007ae43b029e40 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e40) llvm#28 0x0000000000d488c5 _start (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd488c5) ``` --------- Co-authored-by: Shashwathi N <nshashwa@pe31.hpc.amslabs.hpecorp.net>
…142952) This was removed in llvm#135343 in favour of making it a format variable, which we do here. This follows the precedent of the `[opt]` and `[artificial]` markers. Before: ``` thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2 * frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3 frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17 frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43 frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3 frame #4: 0x0000000186345be4 dyld`start + 7040 ``` After (note the `[inlined]` markers): ``` thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2 * frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3 [inlined] frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17 frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43 [inlined] frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3 frame #4: 0x0000000186345be4 dyld`start + 7040 ``` rdar://152642178
|
Hi Adam (@adam-smnk), Igor (@Garra1980 ), Modified the uarch definition based on our discussion. Please let me know what you think. I only added implementation for DPAS. Will add Load/store/prefetch if we agree on the process. Thanks a lot in advance :). P.S. I still kept some old code intentionally, if we want to go back or use some of it. I'll clean up once we agree on the process. |
|
@silee2 , @nbpatel , @chencha3 , @charithaintc , @Jianhui-Li, @akroviakov |
|
also cc @tkarna |
|
Do you think xegpu could at some point support the Data Layout and Target Information (DLTI) dialect for querying the uarch info? |
|
Thanks for adding me, @tkarna ! Agreed, it would be great to collaborate on ways to expose this target info through DLTI. I think @tkarna's work on performant schedules for Xe - which have plentiful magic values derivable from hw info - should be a good exemplar of a user of Xe's DLTI. Showing how to automatically derive those values through DLTI would be a great success indicator. I had a quick attempt at what exposing this data through DLTI might look like (here using the transform dialect though you can also do the same queries from C++): module attribute { #xevm.target<arch = "intel_gpu_pvc", ... = ?any other flags that vary across hardware?> } {
...
}
module {transform.with_named_sequence} {
transform.named_sequence @__main__(%mod: !transform.any_op) {
%number_of_eus_per_xe_core = transform.dlti.query ["xe_core","#eus"] at %mod // #xevm.target answers 8
%number_of_threads_per_eu = transform.dlti.query ["xe_core","eu","#threads"] at %mod // #xevm.target answers [8, 4]
%shared_memory_size = transform.dlti.query ["xe","shared_memory","size"] at %mod // #xevm.target answers 524288
%shared_memory_alignment = transform.dlti.query ["xe","shared_memory","alighment"] at %mod // #xevm.target answers 64
%grf_bitwidth = transform.dlti.query ["xe","grf","bitwidth"] at %mod // #xevm.target answers 512
%grf_modes = transform.dlti.query ["xe","grf","modes"] at %mod // #xevm.target answers ["small", "large"]
%grf_count_per_thread = transform.dlti.query ["xe","grf","mode","small","per_thread"] at %mod // #xevm.target answers 128
%dpas_opcode = transform.dlti.query ["xe","inst","dpas","opcode"] at %mod // #xevm.target answers 0x83
%dpas_systolic_depth = transform.dlti.query ["xe","inst","dpas","systolic_depth"] at %mod // #xevm.target answers [8]
%dpas_repeat_count = transform.dlti.query ["xe","inst","dpas","repeat_count"] at %mod // #xevm.target answers 1,2,3,4,5,6,7,8 as multiple associations for single result handle OR [1, 8] OR #dlti.map<"min" = 1, "max" = 8> OR #dlti.range<1,2,...,8>/#tune.range<1,2,...,8> which would also respond to "min" and "max" queries,
%dpas_ops_per_channel = transform.dlti.query ["xe","inst","dpas","ops_per_channel"] at %mod // #xevm.target answers #dlti.map<19: 1, 16: 2, 8: 4, 4: 8, 2:8>
%dpas_supported_types = transform.dlti.query ["xe","inst","dpas","supported_types"] at %mod // #xevm.target answers #xevm.dpas_types so that ...
%dpas_supported_types_B_wildcard = transform.dlti.query ["xe","inst","dpas","supported_types", f16, f16, f16, #dlti._] at %mod // #xevm.dpas_types (yielded by #xevm.target which answered prefix) answers f16
%dpas_supported_types_Dst_wildcard = transform.dlti.query ["xe","inst","dpas","supported_types", #dlti._, f16, f16, f16] at %mod // #xevm.dpas_types (yielded by #xevm.target which answered prefix) answers f16,f32 (i.e. two type params associated to the handle)
%dpas_supported_types_Dst_and_Acc_wildcard:2 = transform.dlti.query ["xe","inst","dpas","supported_types", #dlti._, #dlti._, f16, f16] at %mod // #xevm.dpas_types (yielded by #xevm.target which answered prefix) answers f16,f32,f16,f32 for first result and f16,f16,f32,f32 for second result
... // transform ops would take the above Values as parameters
}
}That is, there's an Not every query above is (easily) implemented with current upstream DLTI. I am working on a PR (oriented towards DLTI attributes that respond to queries by querying for target info and applying cost models) that will enable all the above that I will push as a draft this week. |
|
@tkarna , yes, we would like to. As @rolfmorel pointed out, the current DLTI is too verbose and simple, @adam-smnk & I spoke about using DLTI directly, the issue was mapping all the hardware info in the DLTI attribute was making the IR way too verbose and big (e.g., think the .yaml file in the atttibute list :(). Thanks @rolfmorel for the detailed example. I remember you said that you envision that a query can essentially take a C++ function as key. That's why in the current iteration, I tried to make the utilities a first class citizen, this way, any ops that implement these interfaces can be queried via DLTI:
This would essentially call that specific uArch's implementation of getSupportedM() function. I'd like to discuss more with you all on this. |
adam-smnk
left a comment
There was a problem hiding this comment.
I like the general direction 👍
The overall concept to have common descriptors (the uArch.h) which can have platform specific implementations (PVC, battlemage variants etc.) is great.
I haven't paid too close attention to exact implementation yet. However, it is a bit hard to really judge these helper APIs and their exact abstraction right now. I think it might be best to just start with a small prototype as soon as possible and just iterate on the go.
@chencha3 I think the on-going XeGPU verifier relaxation could great opportunity to run more hands-on experiments with this infrastructure. Perhaps a small uArch-aware validation pass for XeGPU or XeVM ops?
@tkarna If you could share what key hardware properties you already use for tuning or which further info could help you in that, it'd be super valuable feedback to help steer this design.
There was a problem hiding this comment.
What exactly is the meaning of no_of_dims.
In this case, if dims.size() is different than no_of_dims then what happens?
There was a problem hiding this comment.
You are right, we could just remove no_of_dims and get it from dims.size().
Maybe remove the class itself. Let me see.
In my preliminary tests I've use the following parameters for 4k matmul benchmark. Fixed parameters:
Tunable parameters:
Implemented constrains: tile size must not exceed the parent tile size and divide parent tile evenly. SG tiling must not exceed max number of threads (32). Prefetch tile size must be compatible with the SG level tread configuration (coop prefetching). There are other hardware constraints I have not explicitly expressed yet, for example maximum/supported load tile size and VNNI packing constraints for B. Some configs do spill registers and run slower. |
Thank you so much, @adam-smnk for the feedback :-) . |
042311e to
94ae51e
Compare
The function already exposes a work list to avoid deep recursion, this commit starts utilizing it in a helper that could also lead to a deep recursion. We have observed this crash on `clang/test/C/C99/n590.c` with our internal builds that enable aggressive optimizations and hit the limit earlier than default release builds of Clang. See the added test for an example with a deeper recursion that used to crash in upstream Clang before this change with the following stack trace: ``` #0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:804:13 #1 llvm::sys::RunSignalHandlers() /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Signals.cpp:106:18 #2 SignalHandler(int, siginfo_t*, void*) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3 #3 (/lib/x86_64-linux-gnu/libc.so.6+0x3fdf0) #4 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12772:0 llvm#5 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#6 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#7 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#8 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#9 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#10 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#11 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#12 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#13 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#14 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#15 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#16 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 llvm#17 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3 llvm#18 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7 llvm#19 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5 ... 700+ more stack frames. ```
94ae51e to
af7098b
Compare
Extend support in LLDB for WebAssembly. This PR adds a new Process plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly targets. It adds support for WebAssembly's memory model with separate address spaces, and the ability to fetch the call stack from the WebAssembly runtime. I have tested this change with the WebAssembly Micro Runtime (WAMR, https://github.com/bytecodealliance/wasm-micro-runtime) which implements a GDB debug stub and supports the qWasmCallStack packet. ``` (lldb) process connect --plugin wasm connect://localhost:4567 Process 1 stopped * thread #1, name = 'nobody', stop reason = trace frame #0: 0x40000000000001ad wasm32_args.wasm`main: -> 0x40000000000001ad <+3>: global.get 0 0x40000000000001b3 <+9>: i32.const 16 0x40000000000001b5 <+11>: i32.sub 0x40000000000001b6 <+12>: local.set 0 (lldb) b add Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c (lldb) c Process 1 resuming Process 1 stopped * thread #1, name = 'nobody', stop reason = breakpoint 1.1 frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12 1 int 2 add(int a, int b) 3 { -> 4 return a + b; 5 } 6 7 int (lldb) bt * thread #1, name = 'nobody', stop reason = breakpoint 1.1 * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12 frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12 frame #2: 0x40000000000001fe wasm32_args.wasm ``` This PR is based on an unmerged patch from Paolo Severini: https://reviews.llvm.org/D78801. I intentionally stuck to the foundations to keep this PR small. I have more PRs in the pipeline to support the other features/packets. My motivation for supporting Wasm is to support debugging Swift compiled to WebAssembly: https://www.swift.org/documentation/articles/wasm-getting-started.html
| Operation *op = getOperation(); | ||
|
|
||
| // Use XeVM target | ||
| auto gpuModuleOp = op->getParentOfType<gpu::GPUModuleOp>(); |
There was a problem hiding this comment.
can we wrap these code as an XeGPU utility function? say getXeGPUChipStr()
There was a problem hiding this comment.
+1, I think it is worth to have a getXeArch() to hide defining PVCuArch and BMGuArch instantiation at the beginning. It sounds not scalable.
Pointers and GEP are untyped. SPIR-V required structured OpAccessChain. This means the backend will have to determine a good way to retrieve the structured access from an untyped GEP. This is not a trivial problem, and needs to be addressed to have a robust compiler. The issue is other workstreams relies on the access chain deduction to work. So we have 2 options: - pause all dependent work until we have a good chain deduction. - submit this limited fix to we can work on both this and other features in parallel. Choice we want to make is #2: submitting this **knowing this is not a good** fix. It only increase the number of patterns we can work with, thus allowing others to continue working on other parts of the backend. This patch as-is has many limitations: - If cannot robustly determine the depth of the structured access from a GEP. Fixing this would require looking ahead at the full GEP chain. - It cannot always figure out the correct access indices, especially with dynamic indices. This will require frontend collaboration. Because we know this is a temporary hack, this patch only impacts the logical SPIR-V target. Physical SPIR-V, which can rely on pointer cast remains on the old method. Related to llvm#145002
There was a problem hiding this comment.
It should be IntelGpuXe2.h here.
There was a problem hiding this comment.
Do Load and Store belong to different unit?
There was a problem hiding this comment.
I am not sure whether it is ok and necessary to expose the opcode here.
| // For example, a restriction that checks if the number of dimensions in a | ||
| // std::vector<std::vector<uint32_t>> is 2 can be represented as: | ||
| // std::vector<std::vector<uint32_t>> rt = | ||
| // {{1, 32}, {2, 16}}; Restriction<std::vector<std::vector<uint32_t>>> r1(rt, |
There was a problem hiding this comment.
nit: a format issue? Happy to see the restriction abstraction.
| std::string name = ""; // optional name of the hierarchy component | ||
| // no. of lower hierarchy component it contains, e.g., for PVC XeCore it | ||
| // contains 8 threads, so no_of_component=8 | ||
| uint32_t no_of_component; |
There was a problem hiding this comment.
nit: num_components? what is the major use of it?
| namespace xegpu { | ||
| namespace uArch { | ||
| namespace Xe2Plus { | ||
| struct XeCoreInfo { |
There was a problem hiding this comment.
can XeCoreInfo be used by other archs?
| // MemoryAccessType memory_access_type; | ||
| // std::vector<std::string> supported_types; | ||
| std::vector<uint32_t> supported_types_bitwidth; | ||
| std::map<std::string, uint32_t> alignment; |
There was a problem hiding this comment.
what is the string used for in alingment? the type name?
| } | ||
| }; | ||
|
|
||
| namespace PVCuArch { |
There was a problem hiding this comment.
nit: maybe we can remove PVCuArch namespace if it is not hard required.
| assert(tile.size() == 2); | ||
| return tile[1] * array_len * | ||
| (dataType.getIntOrFloatBitWidth() / 8) <= | ||
| 64; |
There was a problem hiding this comment.
nit: format issue.
Is it also available for store2d, which doesn't have array length support in, e.g., pvc
| Operation *op = getOperation(); | ||
|
|
||
| // Use XeVM target | ||
| auto gpuModuleOp = op->getParentOfType<gpu::GPUModuleOp>(); |
There was a problem hiding this comment.
+1, I think it is worth to have a getXeArch() to hide defining PVCuArch and BMGuArch instantiation at the beginning. It sounds not scalable.
…lvm#152156) With this new A320 in-order core, we follow adding the FeatureUseFixedOverScalableIfEqualCost feature to A510 and A520 (llvm#132246), which reaps the same code generation benefits of preferring fixed over scalable when the cost is equal. So when we have: ``` void foo(float* a, float* b, float* dst, unsigned n) { for (unsigned i = 0; i < n; ++i) dst[i] = a[i] + b[i]; } ``` When compiling without the feature enabled, we get: ``` ... ld1b { z0.b }, p0/z, [x0, x10] ld1b { z2.b }, p0/z, [x1, x10] add x12, x0, x10 ldr z1, [x12, #1, mul vl] add x12, x1, x10 ldr z3, [x12, #1, mul vl] fadd z0.s, z2.s, z0.s add x12, x2, x10 fadd z1.s, z3.s, z1.s dech x11 st1b { z0.b }, p0, [x2, x10] incb x10, all, mul #2 str z1, [x12, #1, mul vl] ... ``` When compiling with, we get: ``` ... ldp q0, q1, [x12, #-16] ldp q2, q3, [x11, #-16] subs x13, x13, llvm#8 fadd v0.4s, v2.4s, v0.4s fadd v1.4s, v3.4s, v1.4s add x11, x11, llvm#32 add x12, x12, llvm#32 stp q0, q1, [x10, #-16] add x10, x10, llvm#32 ... ```
## Problem When the new setting ``` set target.parallel-module-load true ``` was added, lldb began fetching modules from the devices from multiple threads simultaneously. This caused crashes of lldb when debugging on android devices. The top of the stack in the crash look something like this: ``` #0 0x0000555aaf2b27fe llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/llvm/bin/lldb-dap+0xb87fe) #1 0x0000555aaf2b0a99 llvm::sys::RunSignalHandlers() (/opt/llvm/bin/lldb-dap+0xb6a99) #2 0x0000555aaf2b2fda SignalHandler(int, siginfo_t*, void*) (/opt/llvm/bin/lldb-dap+0xb8fda) #3 0x00007f9c02444560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0 #4 0x00007f9c04ea7707 lldb_private::ConnectionFileDescriptor::Disconnect(lldb_private::Status*) (usr/bin/../lib/liblldb.so.15+0x22a7707) llvm#5 0x00007f9c04ea5b41 lldb_private::ConnectionFileDescriptor::~ConnectionFileDescriptor() (usr/bin/../lib/liblldb.so.15+0x22a5b41) llvm#6 0x00007f9c04ea5c1e lldb_private::ConnectionFileDescriptor::~ConnectionFileDescriptor() (usr/bin/../lib/liblldb.so.15+0x22a5c1e) llvm#7 0x00007f9c052916ff lldb_private::platform_android::AdbClient::SyncService::Stat(lldb_private::FileSpec const&, unsigned int&, unsigned int&, unsigned int&) (usr/bin/../lib/liblldb.so.15+0x26916ff) llvm#8 0x00007f9c0528b9dc lldb_private::platform_android::PlatformAndroid::GetFile(lldb_private::FileSpec const&, lldb_private::FileSpec const&) (usr/bin/../lib/liblldb.so.15+0x268b9dc) ``` Our workaround was to set `set target.parallel-module-load ` to `false` to avoid the crash. ## Background PlatformAndroid creates two different classes with one stateful adb connection shared between the two -- one through AdbClient and another through AdbClient::SyncService. The connection management and state is complex, and seems to be responsible for the segfault we are seeing. The AdbClient code resets these connections at times, and re-establishes connections if they are not active. Similarly, PlatformAndroid caches its SyncService, which uses an AdbClient class, but the SyncService puts its connection into a different 'sync' state that is incompatible with a standard connection. ## Changes in this diff * This diff refactors the code to (hopefully) have clearer ownership of the connection, clearer separation of AdbClient and SyncService by making a new class for clearer separations of concerns, called AdbSyncService. * New unit tests are added * Additional logs were added (see llvm#145382 (comment) for details)
…namic (llvm#153420) Canonicalizing the following IR: ``` func.func @mul_zero_dynamic_nofold(%arg0: tensor<?x17xf32>) -> tensor<?x17xf32> { %0 = "tosa.const"() <{values = dense<0.000000e+00> : tensor<1x1xf32>}> : () -> tensor<1x1xf32> %1 = "tosa.const"() <{values = dense<0> : tensor<1xi8>}> : () -> tensor<1xi8> %2 = tosa.mul %arg0, %0, %1 : (tensor<?x17xf32>, tensor<1x1xf32>, tensor<1xi8>) -> tensor<?x17xf32> return %2 : tensor<?x17xf32> } ``` resulted in a crash ``` #0 0x000056513187e8db backtrace (./build-release/bin/mlir-opt+0x9d698db) #1 0x0000565131b17737 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:838:8 #2 0x0000565131b187f3 PrintStackTraceSignalHandler(void*) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:918:1 #3 0x0000565131b18c30 llvm::sys::RunSignalHandlers() /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Signals.cpp:105:18 #4 0x0000565131b18c30 SignalHandler(int, siginfo_t*, void*) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:409:3 llvm#5 0x00007f2e4165b050 (/lib/x86_64-linux-gnu/libc.so.6+0x3c050) llvm#6 0x00007f2e416a9eec __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 llvm#7 0x00007f2e4165afb2 raise ./signal/../sysdeps/posix/raise.c:27:6 llvm#8 0x00007f2e41645472 abort ./stdlib/abort.c:81:7 llvm#9 0x00007f2e41645395 _nl_load_domain ./intl/loadmsgcat.c:1177:9 llvm#10 0x00007f2e41653ec2 (/lib/x86_64-linux-gnu/libc.so.6+0x34ec2) llvm#11 0x00005651443ec4ba mlir::DenseIntOrFPElementsAttr::getRaw(mlir::ShapedType, llvm::ArrayRef<char>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:1361:3 llvm#12 0x00005651443f1209 mlir::DenseElementsAttr::resizeSplat(mlir::ShapedType) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:0:10 llvm#13 0x000056513f76f2b6 mlir::tosa::MulOp::fold(mlir::tosa::MulOpGenericAdaptor<llvm::ArrayRef<mlir::Attribute>>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/Dialect/Tosa/IR/TosaCanonicalizations.cpp:0:0 ``` from the folder for `tosa::mul` since the zero value was being reshaped to `?x17` size which isn't supported. AFAIK, `tosa.const` requires all dimensions to be static. So in this case, the fix is to not to fold the op.
This can happen when JIT code is run, and we can't symbolize those
frames, but they should remain numbered in the stack. An example
spidermonkey trace:
```
#0 0x564ac90fb80f (/builds/worker/dist/bin/js+0x240e80f) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
#1 0x564ac9223a64 (/builds/worker/dist/bin/js+0x2536a64) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
#2 0x564ac922316f (/builds/worker/dist/bin/js+0x253616f) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
#3 0x564ac9eac032 (/builds/worker/dist/bin/js+0x31bf032) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
#4 0x0dec477ca22e (<unknown module>)
```
Without this change, the following symbolization is output:
```
#0 0x55a6d72f980f in MOZ_CrashSequence /builds/worker/workspace/obj-build/dist/include/mozilla/Assertions.h:248:3
#1 0x55a6d72f980f in Crash(JSContext*, unsigned int, JS::Value*) /builds/worker/checkouts/gecko/js/src/shell/js.cpp:4223:5
#2 0x55a6d7421a64 in CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), js::CallReason, JS::CallArgs const&) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:501:13
#3 0x55a6d742116f in js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:597:12
#4 0x55a6d80aa032 in js::jit::DoCallFallback(JSContext*, js::jit::BaselineFrame*, js::jit::ICFallbackStub*, unsigned int, JS::Value*, JS::MutableHandle<JS::Value>) /builds/worker/checkouts/gecko/js/src/jit/BaselineIC.cpp:1705:10
#4 0x2c803bd8f22e (<unknown module>)
```
The last frame has a duplicate number. With this change the numbering is
correct:
```
#0 0x5620c58ec80f in MOZ_CrashSequence /builds/worker/workspace/obj-build/dist/include/mozilla/Assertions.h:248:3
#1 0x5620c58ec80f in Crash(JSContext*, unsigned int, JS::Value*) /builds/worker/checkouts/gecko/js/src/shell/js.cpp:4223:5
#2 0x5620c5a14a64 in CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), js::CallReason, JS::CallArgs const&) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:501:13
#3 0x5620c5a1416f in js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:597:12
#4 0x5620c669d032 in js::jit::DoCallFallback(JSContext*, js::jit::BaselineFrame*, js::jit::ICFallbackStub*, unsigned int, JS::Value*, JS::MutableHandle<JS::Value>) /builds/worker/checkouts/gecko/js/src/jit/BaselineIC.cpp:1705:10
llvm#5 0x349f24c7022e (<unknown module>)
```
…build breakage from llvm#155943) (llvm#156103) ASan now detects dereferences of zero-sized allocations (llvm#155943; the corresponding MSan change is llvm#155944). This appears to have detected a bug in CrossOverTest.cpp, causing a buildbot breakage. This patch fixes the test. Buildbot report: https://lab.llvm.org/buildbot/#/builders/4/builds/8732 ``` 7: ==949882==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xf169cfbe0010 at pc 0xb5f45efc6d1c bp 0xffffd933e460 sp 0xffffd933e458 check:20'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 8: READ of size 1 at 0xf169cfbe0010 thread T0 check:20'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 9: #0 0xb5f45efc6d18 in LLVMFuzzerTestOneInput /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/llvm/compiler-rt/test/fuzzer/CrossOverTest.cpp:48:7 check:20'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ check:20'1 ? possible intended match 10: #1 0xb5f45eec7288 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/llvm/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:619:13 check:20'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 11: #2 0xb5f45eec85d4 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/llvm/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:812:3 check:20'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 12: #3 0xb5f45eec8c60 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/llvm/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:872:3 check:20'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 13: #4 0xb5f45eeb5c64 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/llvm/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:923:6 check:20'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 14: llvm#5 0xb5f45eee09d0 in main /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/llvm/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10 check:20'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` For context, FuzzerLoop.cpp:812 tries empty input: ``` 810 // Test the callback with empty input and never try it again. 811 uint8_t dummy = 0; 812 ExecuteCallback(&dummy, 0); ```
Reverts llvm#154949 due to suspected buildbot breakage (https://lab.llvm.org/buildbot/#/builders/55/builds/16630/steps/11/logs/stdio). Previously commented on the original pull request: llvm#154949 (comment) ``` ******************** TEST 'MLIR :: Dialect/XeGPU/subgroup-distribute.mlir' FAILED ******************** ... # | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. # | Stack dump: # | 0. Program arguments: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/mlir-opt -xegpu-subgroup-distribute -allow-unregistered-dialect -canonicalize -cse -split-input-file /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/test/Dialect/XeGPU/subgroup-distribute.mlir # | #0 0x0000c0af4b066df0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:834:13 # | #1 0x0000c0af4b060e20 llvm::sys::RunSignalHandlers() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Signals.cpp:105:18 # | #2 0x0000c0af4b0691b4 SignalHandler(int, siginfo_t*, void*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:426:38 # | #3 0x0000ee25a3dcb8f8 (linux-vdso.so.1+0x8f8) # | #4 0x0000ee25a36c7608 (/lib/aarch64-linux-gnu/libc.so.6+0x87608) # | llvm#5 0x0000ee25a367cb3c raise (/lib/aarch64-linux-gnu/libc.so.6+0x3cb3c) # | llvm#6 0x0000ee25a3667e00 abort (/lib/aarch64-linux-gnu/libc.so.6+0x27e00) # | llvm#7 0x0000c0af4ae7e4b0 __sanitizer::Atexit(void (*)()) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp:168:10 # | llvm#8 0x0000c0af4ae7c354 __sanitizer::Die() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:52:5 # | llvm#9 0x0000c0af4ae66a30 Unlock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_mutex.h:250:16 # | llvm#10 0x0000c0af4ae66a30 ~GenericScopedLock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_mutex.h:386:51 # | llvm#11 0x0000c0af4ae66a30 __hwasan::ScopedReport::~ScopedReport() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:54:5 # | llvm#12 0x0000c0af4ae661b8 __hwasan::(anonymous namespace)::BaseReport::~BaseReport() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:477:7 # | llvm#13 0x0000c0af4ae63f5c __hwasan::ReportTagMismatch(__sanitizer::StackTrace*, unsigned long, unsigned long, bool, bool, unsigned long*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:1094:1 # | llvm#14 0x0000c0af4ae4f8e0 Destroy /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_common.h:532:31 # | llvm#15 0x0000c0af4ae4f8e0 ~InternalMmapVector /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_common.h:642:56 # | llvm#16 0x0000c0af4ae4f8e0 __hwasan::HandleTagMismatch(__hwasan::AccessInfo, unsigned long, unsigned long, void*, unsigned long*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan.cpp:245:1 # | llvm#17 0x0000c0af4ae51e8c __hwasan_tag_mismatch4 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan.cpp:764:1 # | llvm#18 0x0000c0af4ae67b30 __interception::InterceptFunction(char const*, unsigned long*, unsigned long, unsigned long) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/interception/interception_linux.cpp:60:0 # | llvm#19 0x0000c0af5641cd24 getNumResults /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/Operation.h:404:37 # | llvm#20 0x0000c0af5641cd24 getOpResultImpl /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/Operation.h:1010:5 # | llvm#21 0x0000c0af5641cd24 getResult /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/Operation.h:407:54 # | llvm#22 0x0000c0af5641cd24 mlir::OpTrait::detail::MultiResultTraitBase<mlir::gpu::WarpExecuteOnLane0Op, mlir::OpTrait::VariadicResults>::getResult(unsigned int) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/OpDefinition.h:638:62 # | llvm#23 0x0000c0af56426b60 getType /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/Value.h:63:33 # | llvm#24 0x0000c0af56426b60 getType /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/include/mlir/IR/Value.h:105:39 # | llvm#25 0x0000c0af56426b60 (anonymous namespace)::LoadDistribution::matchAndRewrite(mlir::gpu::WarpExecuteOnLane0Op, mlir::PatternRewriter&) const /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp:991:55 ... ```
A few improvements to logging when lldb-dap is started in **Server Mode** AND when the **`lldb-dap.logFolder`** setting is used (not `lldb-dap.log-path`). ### Improvement #1 **Avoid the prompt of restarting the server when starting each debug session.** That prompt is caused by the combination of the following facts: 1. The log filename changes every time a new debug session is starting (see [here](https://github.com/llvm/llvm-project/blob/9d6062c490548a5e6fea103e010ab3c9bc73a86d/lldb/tools/lldb-dap/src-ts/logging.ts#L47)) 2. The log filename is passed to the server via an environment variable called "LLDBDAP_LOG" (see [here](https://github.com/llvm/llvm-project/blob/9d6062c490548a5e6fea103e010ab3c9bc73a86d/lldb/tools/lldb-dap/src-ts/debug-adapter-factory.ts#L263-L269)) 3. All environment variables are put into the "spawn info" variable (see [here](https://github.com/llvm/llvm-project/blob/9d6062c490548a5e6fea103e010ab3c9bc73a86d/lldb/tools/lldb-dap/src-ts/lldb-dap-server.ts#L170-L172)). 4. The old and new "spawn info" are compared to decide if a prompt should show (see [here](https://github.com/llvm/llvm-project/blob/9d6062c490548a5e6fea103e010ab3c9bc73a86d/lldb/tools/lldb-dap/src-ts/lldb-dap-server.ts#L107-L110)). The fix is to remove the "LLDBDAP_LOG" from the "spawn info" variable, so that the same server can be reused if the log path is the only thing that has changed. ### Improvement #2 **Avoid log file conflict when multiple users share a machine and start server in the same second.** The problem: If two users start lldb-dap server in the same second, they will share the same log path. The first user will create the log file. The second user will find that they cannot access the same file, so their server will fail to start. The fix is to add a part of the VS Code session ID to the log filename. ### Improvement #3 **Avoid restarting the server when the order of environment variables changed.** This is done by sorting the environment variables before putting them into the "spawn info".
Need this as `mlir/dialects/transform/smt.py` imports it: ```py from .._transform_smt_extension_ops_gen import * from .._transform_smt_extension_ops_gen import _Dialect ```
A recent change adding a new sanitizer kind (via Sanitizers.def) was reverted in c74fa20 ("Revert "[Clang][CodeGen] Introduce the AllocToken SanitizerKind" (llvm#162413)"). The reason was this ASan report, when running the test cases in clang/test/Preprocessor/print-header-json.c: ``` ==clang==483265==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7d82b97e8b58 at pc 0x562cd432231f bp 0x7fff3fad0850 sp 0x7fff3fad0848 READ of size 16 at 0x7d82b97e8b58 thread T0 #0 0x562cd432231e in __copy_non_overlapping_range<const unsigned long *, const unsigned long *> zorg-test/libcxx_install_asan_ubsan/include/c++/v1/string:2144:38 #1 0x562cd432231e in void std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__init_with_size[abi:nn220000]<unsigned long const*, unsigned long const*>(unsigned long const*, unsigned long const*, unsigned long) zorg-test/libcxx_install_asan_ubsan/include/c++/v1/string:2685:18 #2 0x562cd41e2797 in __init<const unsigned long *, 0> zorg-test/libcxx_install_asan_ubsan/include/c++/v1/string:2673:3 #3 0x562cd41e2797 in basic_string<const unsigned long *, 0> zorg-test/libcxx_install_asan_ubsan/include/c++/v1/string:1174:5 #4 0x562cd41e2797 in clang::ASTReader::ReadString(llvm::SmallVectorImpl<unsigned long> const&, unsigned int&) clang/lib/Serialization/ASTReader.cpp:10171:15 llvm#5 0x562cd41fd89a in clang::ASTReader::ParseLanguageOptions(llvm::SmallVector<unsigned long, 64u> const&, llvm::StringRef, bool, clang::ASTReaderListener&, bool) clang/lib/Serialization/ASTReader.cpp:6475:28 llvm#6 0x562cd41eea53 in clang::ASTReader::ReadOptionsBlock(llvm::BitstreamCursor&, llvm::StringRef, unsigned int, bool, clang::ASTReaderListener&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) clang/lib/Serialization/ASTReader.cpp:3069:11 llvm#7 0x562cd4204ab8 in clang::ASTReader::ReadControlBlock(clang::serialization::ModuleFile&, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, clang::serialization::ModuleFile const*, unsigned int) clang/lib/Serialization/ASTReader.cpp:3249:15 llvm#8 0x562cd42097d2 in clang::ASTReader::ReadASTCore(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, clang::serialization::ModuleFile*, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, long, long, clang::ASTFileSignature, unsigned int) clang/lib/Serialization/ASTReader.cpp:5182:15 llvm#9 0x562cd421ec77 in clang::ASTReader::ReadAST(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, unsigned int, clang::serialization::ModuleFile**) clang/lib/Serialization/ASTReader.cpp:4828:11 llvm#10 0x562cd3d07b74 in clang::CompilerInstance::findOrCompileModuleAndReadAST(llvm::StringRef, clang::SourceLocation, clang::SourceLocation, bool) clang/lib/Frontend/CompilerInstance.cpp:1805:27 llvm#11 0x562cd3d0b2ef in clang::CompilerInstance::loadModule(clang::SourceLocation, llvm::ArrayRef<clang::IdentifierLoc>, clang::Module::NameVisibilityKind, bool) clang/lib/Frontend/CompilerInstance.cpp:1956:31 llvm#12 0x562cdb04eb1c in clang::Preprocessor::HandleHeaderIncludeOrImport(clang::SourceLocation, clang::Token&, clang::Token&, clang::SourceLocation, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) clang/lib/Lex/PPDirectives.cpp:2423:49 llvm#13 0x562cdb042222 in clang::Preprocessor::HandleIncludeDirective(clang::SourceLocation, clang::Token&, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) clang/lib/Lex/PPDirectives.cpp:2101:17 llvm#14 0x562cdb043366 in clang::Preprocessor::HandleDirective(clang::Token&) clang/lib/Lex/PPDirectives.cpp:1338:14 llvm#15 0x562cdafa84bc in clang::Lexer::LexTokenInternal(clang::Token&, bool) clang/lib/Lex/Lexer.cpp:4512:7 llvm#16 0x562cdaf9f20b in clang::Lexer::Lex(clang::Token&) clang/lib/Lex/Lexer.cpp:3729:24 llvm#17 0x562cdb0d4ffa in clang::Preprocessor::Lex(clang::Token&) clang/lib/Lex/Preprocessor.cpp:896:11 llvm#18 0x562cd77da950 in clang::ParseAST(clang::Sema&, bool, bool) clang/lib/Parse/ParseAST.cpp:163:7 [...] 0x7d82b97e8b58 is located 0 bytes after 3288-byte region [0x7d82b97e7e80,0x7d82b97e8b58) allocated by thread T0 here: #0 0x562cca76f604 in malloc zorg-test/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:67:3 #1 0x562cd1cce452 in safe_malloc llvm/include/llvm/Support/MemAlloc.h:26:18 #2 0x562cd1cce452 in llvm::SmallVectorBase<unsigned int>::grow_pod(void*, unsigned long, unsigned long) llvm/lib/Support/SmallVector.cpp:151:15 #3 0x562cdbe1768b in grow_pod llvm/include/llvm/ADT/SmallVector.h:139:11 #4 0x562cdbe1768b in grow llvm/include/llvm/ADT/SmallVector.h:525:41 llvm#5 0x562cdbe1768b in reserve llvm/include/llvm/ADT/SmallVector.h:665:13 llvm#6 0x562cdbe1768b in llvm::BitstreamCursor::readRecord(unsigned int, llvm::SmallVectorImpl<unsigned long>&, llvm::StringRef*) llvm/lib/Bitstream/Reader/BitstreamReader.cpp:230:10 llvm#7 0x562cd41ee8ab in clang::ASTReader::ReadOptionsBlock(llvm::BitstreamCursor&, llvm::StringRef, unsigned int, bool, clang::ASTReaderListener&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) clang/lib/Serialization/ASTReader.cpp:3060:49 llvm#8 0x562cd4204ab8 in clang::ASTReader::ReadControlBlock(clang::serialization::ModuleFile&, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, clang::serialization::ModuleFile const*, unsigned int) clang/lib/Serialization/ASTReader.cpp:3249:15 llvm#9 0x562cd42097d2 in clang::ASTReader::ReadASTCore(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, clang::serialization::ModuleFile*, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, long, long, clang::ASTFileSignature, unsigned int) clang/lib/Serialization/ASTReader.cpp:5182:15 llvm#10 0x562cd421ec77 in clang::ASTReader::ReadAST(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, unsigned int, clang::serialization::ModuleFile**) clang/lib/Serialization/ASTReader.cpp:4828:11 llvm#11 0x562cd3d07b74 in clang::CompilerInstance::findOrCompileModuleAndReadAST(llvm::StringRef, clang::SourceLocation, clang::SourceLocation, bool) clang/lib/Frontend/CompilerInstance.cpp:1805:27 llvm#12 0x562cd3d0b2ef in clang::CompilerInstance::loadModule(clang::SourceLocation, llvm::ArrayRef<clang::IdentifierLoc>, clang::Module::NameVisibilityKind, bool) clang/lib/Frontend/CompilerInstance.cpp:1956:31 llvm#13 0x562cdb04eb1c in clang::Preprocessor::HandleHeaderIncludeOrImport(clang::SourceLocation, clang::Token&, clang::Token&, clang::SourceLocation, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) clang/lib/Lex/PPDirectives.cpp:2423:49 llvm#14 0x562cdb042222 in clang::Preprocessor::HandleIncludeDirective(clang::SourceLocation, clang::Token&, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) clang/lib/Lex/PPDirectives.cpp:2101:17 llvm#15 0x562cdb043366 in clang::Preprocessor::HandleDirective(clang::Token&) clang/lib/Lex/PPDirectives.cpp:1338:14 llvm#16 0x562cdafa84bc in clang::Lexer::LexTokenInternal(clang::Token&, bool) clang/lib/Lex/Lexer.cpp:4512:7 llvm#17 0x562cdaf9f20b in clang::Lexer::Lex(clang::Token&) clang/lib/Lex/Lexer.cpp:3729:24 llvm#18 0x562cdb0d4ffa in clang::Preprocessor::Lex(clang::Token&) clang/lib/Lex/Preprocessor.cpp:896:11 llvm#19 0x562cd77da950 in clang::ParseAST(clang::Sema&, bool, bool) clang/lib/Parse/ParseAST.cpp:163:7 [...] SUMMARY: AddressSanitizer: heap-buffer-overflow clang/lib/Serialization/ASTReader.cpp:10171:15 in clang::ASTReader::ReadString(llvm::SmallVectorImpl<unsigned long> const&, unsigned int&) ``` The reason is this particular RUN line: ``` // RUN: env CC_PRINT_HEADERS_FORMAT=json CC_PRINT_HEADERS_FILTERING=direct-per-file CC_PRINT_HEADERS_FILE=%t.txt %clang -fsyntax-only -I %S/Inputs/print-header-json -isystem %S/Inputs/print-header-json/system -fmodules -fimplicit-module-maps -fmodules-cache-path=%t %s -o /dev/null ``` which was added in 8df194f ("[Clang] Support includes translated to module imports in -header-include-filtering=direct-per-file (llvm#156756)"). The problem is caused by an incremental build reusing stale cached module files (.pcm) that are no longer binary-compatible with the updated compiler. Adding a new sanitizer option altered the implicit binary layout of the serialized LangOptions data structure. The build + test system is oblivious to such changes. When the new compiler attempted to read the old module file (from the previous test invocation), it misinterpreted the data due to the layout mismatch, resulting in a heap-buffer-overflow. Unfortunately Clang's PCM format does not encode nor detect version mismatches here; a more graceful failure mode would be preferable. For now, fix the test to be more robust with incremental build + test.
**Mitigation for:** google/sanitizers#749 **Disclosure:** I'm not an ASan compiler expert yet (I'm trying to learn!), I primarily work in the runtime. Some of this PR was developed with the help of AI tools (primarily as a "fuzzy `grep` engine"), but I've manually refined and tested the output, and can speak for every line. In general, I used it only to orient myself and for "rubberducking". **Context:** The msvc ASan team (👋 ) has received an internal request to improve clang's exception handling under ASan for Windows. Namely, we're interested in **mitigating** this bug: google/sanitizers#749 To summarize, today, clang + ASan produces a false-positive error for this program: ```C++ #include <cstdio> #include <exception> int main() { try { throw std::exception("test"); }catch (const std::exception& ex){ puts(ex.what()); } return 0; } ``` The error reads as such: ``` C:\Users\dajusto\source\repros\upstream>type main.cpp #include <cstdio> #include <exception> int main() { try { throw std::exception("test"); }catch (const std::exception& ex){ puts(ex.what()); } return 0; } C:\Users\dajusto\source\repros\upstream>"C:\Users\dajusto\source\repos\llvm-project\build.runtimes\bin\clang.exe" -fsanitize=address -g -O0 main.cpp C:\Users\dajusto\source\repros\upstream>a.exe ================================================================= ==19112==ERROR: AddressSanitizer: access-violation on unknown address 0x000000000000 (pc 0x7ff72c7c11d9 bp 0x0080000ff960 sp 0x0080000fcf50 T0) ==19112==The signal is caused by a READ memory access. ==19112==Hint: address points to the zero page. #0 0x7ff72c7c11d8 in main C:\Users\dajusto\source\repros\upstream\main.cpp:8 #1 0x7ff72c7d479f in _CallSettingFrame C:\repos\msvc\src\vctools\crt\vcruntime\src\eh\amd64\handlers.asm:49 #2 0x7ff72c7c8944 in __FrameHandler3::CxxCallCatchBlock(struct _EXCEPTION_RECORD *) C:\repos\msvc\src\vctools\crt\vcruntime\src\eh\frame.cpp:1567 #3 0x7ffb4a90e3e5 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18012e3e5) #4 0x7ff72c7c1128 in main C:\Users\dajusto\source\repros\upstream\main.cpp:6 llvm#5 0x7ff72c7c33db in invoke_main C:\repos\msvc\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78 llvm#6 0x7ff72c7c33db in __scrt_common_main_seh C:\repos\msvc\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288 llvm#7 0x7ffb49b05c06 (C:\WINDOWS\System32\KERNEL32.DLL+0x180035c06) llvm#8 0x7ffb4a8455ef (C:\WINDOWS\SYSTEM32\ntdll.dll+0x1800655ef) ==19112==Register values: rax = 0 rbx = 80000ff8e0 rcx = 27d76d00000 rdx = 80000ff8e0 rdi = 80000fdd50 rsi = 80000ff6a0 rbp = 80000ff960 rsp = 80000fcf50 r8 = 100 r9 = 19930520 r10 = 8000503a90 r11 = 80000fd540 r12 = 80000fd020 r13 = 0 r14 = 80000fdeb8 r15 = 0 AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: access-violation C:\Users\dajusto\source\repros\upstream\main.cpp:8 in main ==19112==ABORTING ``` The root of the issue _appears to be_ that ASan's instrumentation is incompatible with Window's assumptions for instantiating `catch`-block's parameters (`ex` in the snippet above). The nitty gritty details are lost on me, but I understand that to make this work without loss of ASan coverage, a "serious" refactoring is needed. In the meantime, users risk false positive errors when pairing ASan + catch-block parameters on Windows. **To mitigate this** I think we should avoid instrumenting catch-block parameters on Windows. It appears to me this is as "simple" as marking catch block parameters as "uninteresting" in `AddressSanitizer::isInterestingAlloca`. My manual tests seem to confirm this. I believe this is strictly better than today's status quo, where the runtime generates false positives. Although we're now explicitly choosing to instrument less, the benefit is that now more programs can run with ASan without _funky_ macros that disable ASan on exception blocks. **This PR:** implements the mitigation above, and creates a simple new test for it. _Thanks!_ --------- Co-authored-by: Antonio Frighetto <me@antoniofrighetto.com>
…am (llvm#167724) This got exposed by `09262656f32ab3f2e1d82e5342ba37eecac52522`. The underlying stream of `m_os` is referenced by the `TextDiagnostic` member of `TextDiagnosticPrinter`. It got turned into a `llvm::formatted_raw_ostream` in the commit above. When `~TextDiagnosticPrinter` (and thus `~TextDiagnostic`) is invoked, we now call `~formatted_raw_ostream`, which tries to access the underlying stream. But `m_os` was already deleted because it is earlier in the order of destruction in `TextDiagnosticPrinter`. Move the `m_os` member before the `TextDiagnosticPrinter` to avoid a use-after-free. Drive-by: * Also move the `m_output` member which the `m_os` holds a reference to. The fact it's a reference indicates the expectation is most likely that the string outlives the stream. The ASAN macOS bot is currently failing with this: ``` 08:15:39 ================================================================= 08:15:39 ==61103==ERROR: AddressSanitizer: heap-use-after-free on address 0x60600012cf40 at pc 0x00012140d304 bp 0x00016eecc850 sp 0x00016eecc848 08:15:39 READ of size 8 at 0x60600012cf40 thread T0 08:15:39 #0 0x00012140d300 in llvm::formatted_raw_ostream::releaseStream() FormattedStream.h:205 08:15:39 #1 0x00012140d3a4 in llvm::formatted_raw_ostream::~formatted_raw_ostream() FormattedStream.h:145 08:15:39 #2 0x00012604abf8 in clang::TextDiagnostic::~TextDiagnostic() TextDiagnostic.cpp:721 08:15:39 #3 0x00012605dc80 in clang::TextDiagnosticPrinter::~TextDiagnosticPrinter() TextDiagnosticPrinter.cpp:30 08:15:39 #4 0x00012605dd5c in clang::TextDiagnosticPrinter::~TextDiagnosticPrinter() TextDiagnosticPrinter.cpp:27 08:15:39 llvm#5 0x0001231fb210 in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47 08:15:39 llvm#6 0x0001231fb3bc in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47 08:15:39 llvm#7 0x000129aa9d70 in clang::DiagnosticsEngine::~DiagnosticsEngine() Diagnostic.cpp:91 08:15:39 llvm#8 0x0001230436b8 in llvm::RefCountedBase<clang::DiagnosticsEngine>::Release() const IntrusiveRefCntPtr.h:103 08:15:39 llvm#9 0x0001231fe6c8 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93 08:15:39 llvm#10 0x0001231fe858 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93 ... 08:15:39 08:15:39 0x60600012cf40 is located 32 bytes inside of 56-byte region [0x60600012cf20,0x60600012cf58) 08:15:39 freed by thread T0 here: 08:15:39 #0 0x0001018abb88 in _ZdlPv+0x74 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4bb88) 08:15:39 #1 0x0001231fb1c0 in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47 08:15:39 #2 0x0001231fb3bc in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47 08:15:39 #3 0x000129aa9d70 in clang::DiagnosticsEngine::~DiagnosticsEngine() Diagnostic.cpp:91 08:15:39 #4 0x0001230436b8 in llvm::RefCountedBase<clang::DiagnosticsEngine>::Release() const IntrusiveRefCntPtr.h:103 08:15:39 llvm#5 0x0001231fe6c8 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93 08:15:39 llvm#6 0x0001231fe858 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93 ... 08:15:39 08:15:39 previously allocated by thread T0 here: 08:15:39 #0 0x0001018ab760 in _Znwm+0x74 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4b760) 08:15:39 #1 0x0001231f8dec in lldb_private::ClangModulesDeclVendor::Create(lldb_private::Target&) ClangModulesDeclVendor.cpp:732 08:15:39 #2 0x00012320af58 in lldb_private::ClangPersistentVariables::GetClangModulesDeclVendor() ClangPersistentVariables.cpp:124 08:15:39 #3 0x0001232111f0 in lldb_private::ClangUserExpression::PrepareForParsing(lldb_private::DiagnosticManager&, lldb_private::ExecutionContext&, bool) ClangUserExpression.cpp:536 08:15:39 #4 0x000123213790 in lldb_private::ClangUserExpression::Parse(lldb_private::DiagnosticManager&, lldb_private::ExecutionContext&, lldb_private::ExecutionPolicy, bool, bool) ClangUserExpression.cpp:647 08:15:39 llvm#5 0x00012032b258 in lldb_private::UserExpression::Evaluate(lldb_private::ExecutionContext&, lldb_private::EvaluateExpressionOptions const&, llvm::StringRef, llvm::StringRef, std::__1::shared_ptr<lldb_private::ValueObject>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, lldb_private::ValueObject*) UserExpression.cpp:280 08:15:39 llvm#6 0x000120724010 in lldb_private::Target::EvaluateExpression(llvm::StringRef, lldb_private::ExecutionContextScope*, std::__1::shared_ptr<lldb_private::ValueObject>&, lldb_private::EvaluateExpressionOptions const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, lldb_private::ValueObject*) Target.cpp:2905 08:15:39 llvm#7 0x00011fc7bde0 in lldb::SBTarget::EvaluateExpression(char const*, lldb::SBExpressionOptions const&) SBTarget.cpp:2305 08:15:39 ==61103==ABORTING ... ```
… errors (llvm#169989) We can see the following while running clang-repl in C mode ``` anutosh491@vv-nuc:/build/anutosh491/llvm-project/build/bin$ ./clang-repl --Xcc=-x --Xcc=c --Xcc=-std=c23 clang-repl> printf("hi\n"); In file included from <<< inputs >>>:1: input_line_1:1:1: error: call to undeclared library function 'printf' with type 'int (const char *, ...)'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1 | printf("hi\n"); | ^ input_line_1:1:1: note: include the header <stdio.h> or explicitly provide a declaration for 'printf' error: Parsing failed. clang-repl> #include <stdio.h> hi ``` In debug mode while dumping the generated Module, i see this ``` clang-repl> printf("hi\n"); In file included from <<< inputs >>>:1: input_line_1:1:1: error: call to undeclared library function 'printf' with type 'int (const char *, ...)'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 1 | printf("hi\n"); | ^ input_line_1:1:1: note: include the header <stdio.h> or explicitly provide a declaration for 'printf' error: Parsing failed. clang-repl> #include <stdio.h> === compile-ptu 1 === [TU=0x55556cfbf830, M=0x55556cfc13a0 (incr_module_1)] [LLVM IR] ; ModuleID = 'incr_module_1' source_filename = "incr_module_1" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" @.str = private unnamed_addr constant [4 x i8] c"hi\0A\00", align 1 @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__sub_I_incr_module_1, ptr null }] define internal void @__stmts__0() #0 { entry: %call = call i32 (ptr, ...) @printf(ptr noundef @.str) ret void } declare i32 @printf(ptr noundef, ...) #1 ; Function Attrs: noinline nounwind uwtable define internal void @_GLOBAL__sub_I_incr_module_1() #2 section ".text.startup" { entry: call void @__stmts__0() ret void } attributes #0 = { "min-legal-vector-width"="0" } attributes #1 = { "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" } attributes #2 = { noinline nounwind uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" } !llvm.module.flags = !{!0, !1, !2, !3, !4} !llvm.ident = !{!5} !0 = !{i32 1, !"wchar_size", i32 4} !1 = !{i32 8, !"PIC Level", i32 2} !2 = !{i32 7, !"PIE Level", i32 2} !3 = !{i32 7, !"uwtable", i32 2} !4 = !{i32 7, !"frame-pointer", i32 2} !5 = !{!"clang version 22.0.0git (https://github.com/anutosh491/llvm-project.git 81ad8fb)"} === end compile-ptu === execute-ptu 1: [TU=0x55556cfbf830, M=0x55556cfc13a0 (incr_module_1)] hi ``` Basically I see that CodeGen emits IR for a cell before we know whether DiagnosticsEngine has an error. For C code like `printf("hi\n");` without <stdio.h>, Sema emits a diagnostic but still produces a "codegen-able" `TopLevelStmt`, so the `printf` call is IR-generated into the current module. Previously, when `Diags.hasErrorOccurred()` was true, we only cleaned up the PTU AST and left the CodeGen module untouched. The next successful cell then called `GenModule()`, which returned that same module (now also containing the next cell’s IR), causing side effects from the failed cell (e.g. printf)
…er. (llvm#181941) The progress event reporter has a thread that reports events every 250 millisecond. and is destroyed in its destructor. When in event reporter desctructor, the event reporter may have pending event but the call mutex is destroyed leading to the crash. Relevant stack trace from CI. ``` [2026-02-13T17:46:13.577Z] libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument [2026-02-13T17:46:13.577Z] PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash report from ~/Library/Logs/DiagnosticReports/. [2026-02-13T17:46:13.577Z] #0 0x0000000102b6943c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x10008943c) [2026-02-13T17:46:13.577Z] #1 0x0000000102b67368 llvm::sys::RunSignalHandlers() (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x100087368) [2026-02-13T17:46:13.577Z] #2 0x0000000102b69f20 SignalHandler(int, __siginfo*, void*) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x100089f20) [2026-02-13T17:46:13.577Z] #3 0x000000018bbdb744 (/usr/lib/system/libsystem_platform.dylib+0x1804e3744) [2026-02-13T17:46:13.577Z] #4 0x000000018bbd1888 (/usr/lib/system/libsystem_pthread.dylib+0x1804d9888) [2026-02-13T17:46:13.577Z] llvm#5 0x000000018bad6850 (/usr/lib/system/libsystem_c.dylib+0x1803de850) [2026-02-13T17:46:13.577Z] llvm#6 0x000000018bb85858 (/usr/lib/libc++abi.dylib+0x18048d858) [2026-02-13T17:46:13.577Z] llvm#7 0x000000018bb744bc (/usr/lib/libc++abi.dylib+0x18047c4bc) [2026-02-13T17:46:13.577Z] llvm#8 0x000000018b7a0424 (/usr/lib/libobjc.A.dylib+0x1800a8424) [2026-02-13T17:46:13.577Z] llvm#9 0x000000018bb84c2c (/usr/lib/libc++abi.dylib+0x18048cc2c) [2026-02-13T17:46:13.577Z] llvm#10 0x000000018bb88394 (/usr/lib/libc++abi.dylib+0x180490394) [2026-02-13T17:46:13.577Z] llvm#11 0x000000018bb8833c (/usr/lib/libc++abi.dylib+0x18049033c) [2026-02-13T17:46:13.577Z] llvm#12 0x000000018bb01b90 (/usr/lib/libc++.1.dylib+0x180409b90) [2026-02-13T17:46:13.577Z] llvm#13 0x000000018bb01b34 (/usr/lib/libc++.1.dylib+0x180409b34) [2026-02-13T17:46:13.577Z] llvm#14 0x000000018bb038a0 (/usr/lib/libc++.1.dylib+0x18040b8a0) [2026-02-13T17:46:13.577Z] llvm#15 0x0000000102b6fbac lldb_dap::DAP::Send(std::__1::variant<lldb_dap::protocol::Request, lldb_dap::protocol::Response, lldb_dap::protocol::Event> const&) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x10008fbac) [2026-02-13T17:46:13.577Z] llvm#16 0x0000000102b6f890 lldb_dap::DAP::SendJSON(llvm::json::Value const&) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x10008f890) [2026-02-13T17:46:13.577Z] llvm#17 0x0000000102b78788 std::__1::__function::__func<lldb_dap::DAP::DAP(lldb_dap::Log&, lldb_dap::ReplMode, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>, bool, llvm::StringRef, lldb_private::transport::JSONTransport<lldb_dap::ProtocolDescriptor>&, lldb_private::MainLoopPosix&)::$_0, std::__1::allocator<lldb_dap::DAP::DAP(lldb_dap::Log&, lldb_dap::ReplMode, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>, bool, llvm::StringRef, lldb_private::transport::JSONTransport<lldb_dap::ProtocolDescriptor>&, lldb_private::MainLoopPosix&)::$_0>, void (lldb_dap::ProgressEvent&)>::operator()(lldb_dap::ProgressEvent&) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x100098788) [2026-02-13T17:46:13.577Z] llvm#18 0x0000000102b8939c lldb_dap::ProgressEventManager::ReportIfNeeded() (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x1000a939c) [2026-02-13T17:46:13.577Z] llvm#19 0x0000000102b8982c lldb_dap::ProgressEventReporter::ReportStartEvents() (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x1000a982c) [2026-02-13T17:46:13.577Z] llvm#20 0x0000000102b8a038 void* std::__1::__thread_proxy[abi:nn200100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, lldb_dap::ProgressEventReporter::ProgressEventReporter(std::__1::function<void (lldb_dap::ProgressEvent&)>)::$_0>>(void*) (/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake-os-verficiation/lldb-build/bin/lldb-dap+0x1000aa038) [2026-02-13T17:46:13.577Z] llvm#21 0x000000018bbd1c08 (/usr/lib/system/libsystem_pthread.dylib+0x1804d9c08) [2026-02-13T17:46:13.577Z] llvm#22 0x000000018bbccba8 (/usr/lib/system/libsystem_pthread.dylib+0x1804d4ba8) ``` rdar://170331108
I created an issue about this in llvm#179976. Clang's Address Sanitizer installs its own SEH filter which handles some types of uncaught exceptions. Along with register values and some other information, it also generates a stack trace. However, current logic is incomplete. It relies on DbgHelp's SymFunctionTableAccess64 and SymGetModuleBase64 which won't work with machine code that has its RUNTIME_FUNCTION entry registered with Rtl* (e.g. RtlAddFunctionTable) system calls. Most likely, this is because DbgHelp either relies on information in PDB files or considers PDATA and XDATA only from loaded EXE and DLL modules. Either way, consider the following example: ``` #include <windows.h> #include <iostream> #include <vector> typedef union _UNWIND_CODE { struct { BYTE CodeOffset; BYTE UnwindOp : 4; BYTE OpInfo : 4; }; USHORT FrameOffset; } UNWIND_CODE, * PUNWIND_CODE; typedef struct _UNWIND_INFO { BYTE Version : 3; BYTE Flags : 5; BYTE SizeOfProlog; BYTE CountOfCodes; BYTE FrameRegister : 4; BYTE FrameOffset : 4; UNWIND_CODE UnwindCode[1]; // Variable size } UNWIND_INFO, * PUNWIND_INFO; #define UWOP_PUSH_NONVOL 0 #define UWOP_ALLOC_LARGE 1 #define UWOP_ALLOC_SMALL 2 #define UWOP_SET_FPREG 3 #define UWOP_SAVE_NONVOL 4 #define UWOP_SAVE_NONVOL_FAR 5 #define UWOP_SAVE_XMM128 8 #define UWOP_SAVE_XMM128_FAR 9 #define UWOP_PUSH_MACHFRAME 10 int main() { // PUSH RBX (0x53) - Save non-volatile register // SUB RSP, 0x20 (0x48 0x83 0xEC 0x20) - Allocate 32 bytes (shadow space) // XOR RAX, RAX (0x48 0x31 0xC0) - Zero out RAX // MOV RAX, [RAX] (0x48 0x8B 0x00) - Dereference NULL std::vector<unsigned char> code = { 0x53, 0x48, 0x83, 0xEC, 0x20, 0x48, 0x31, 0xC0, 0x48, 0x8B, 0x00 }; size_t codeSize = code.size(); size_t totalSize = 100; LPVOID pMemory = VirtualAlloc(NULL, totalSize, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE); BYTE* pCodeBase = (BYTE*)pMemory; PUNWIND_INFO pUnwindInfo = (PUNWIND_INFO)(pCodeBase + codeSize); size_t alignmentPadding = 0; if ((size_t)pUnwindInfo % 4 != 0) { alignmentPadding = 4 - ((size_t)pUnwindInfo % 4); pUnwindInfo = (PUNWIND_INFO)((BYTE*)pUnwindInfo + alignmentPadding); } memcpy(pCodeBase, code.data(), codeSize); pUnwindInfo->Version = 1; pUnwindInfo->Flags = UNW_FLAG_NHANDLER; pUnwindInfo->Flags = 0; pUnwindInfo->SizeOfProlog = 5; pUnwindInfo->CountOfCodes = 2; pUnwindInfo->FrameRegister = 0; pUnwindInfo->FrameOffset = 0; pUnwindInfo->UnwindCode[0].CodeOffset = 5; pUnwindInfo->UnwindCode[0].UnwindOp = UWOP_ALLOC_SMALL; pUnwindInfo->UnwindCode[0].OpInfo = 3; pUnwindInfo->UnwindCode[1].CodeOffset = 1; pUnwindInfo->UnwindCode[1].UnwindOp = UWOP_PUSH_NONVOL; pUnwindInfo->UnwindCode[1].OpInfo = 3; // RBX RUNTIME_FUNCTION tableEntry = {}; tableEntry.BeginAddress = 0; tableEntry.EndAddress = (DWORD)codeSize; tableEntry.UnwindData = (DWORD)((BYTE*)pUnwindInfo - (BYTE*)pMemory); DWORD64 baseAddress = (DWORD64)pMemory; RtlAddFunctionTable(&tableEntry, 1, baseAddress); typedef void(*FuncType)(); FuncType myFunc = (FuncType)pMemory; myFunc(); return 0; } ``` Windows' kernel can propagate hardware exception through that function, so clearly these entries are at least partially correct. Right now, ASan's stack walking produces this (compiled with latest release, clang++): ``` PS D:\Local Projects\cpp-playground> ./a.exe ================================================================= ==14216==ERROR: AddressSanitizer: access-violation on unknown address 0x000000000000 (pc 0x0199561c0008 bp 0x004cf0cffb30 sp 0x004cf0cff970 T0) ==14216==The signal is caused by a READ memory access. ==14216==Hint: address points to the zero page. #0 0x0199561c0007 (<unknown module>) #1 0x000000000000 (<unknown module>) #2 0x000000000000 (<unknown module>) ==14216==Register values: rax = 0 rbx = 4cf0cffaa0 rcx = 7ffcb97b4e28 rdx = 19955dc0000 rdi = 11bf564a0040 rsi = 0 rbp = 4cf0cffb30 rsp = 4cf0cff970 r8 = 7ffffffffffffffc r9 = 1 r10 = 0 r11 = 246 r12 = 0 r13 = 0 r14 = 0 r15 = 0 AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: access-violation (<unknown module>) ==14216==ABORTING ``` Frames one and two is just some stack space allocated by that dynamic function. While patched version produces this: ``` PS D:\Local Projects\cpp-playground> ./a.exe ================================================================= ==13660==ERROR: AddressSanitizer: access-violation on unknown address 0x000000000000 (pc 0x01ed5ad70008 bp 0x00d76492f650 sp 0x00d76492f490 T0) ==13660==The signal is caused by a READ memory access. ==13660==Hint: address points to the zero page. #0 0x01ed5ad70007 (<unknown module>) #1 0x7ff732e518a1 in main (D:\Local Projects\cpp-playground\a.exe+0x1400018a1) #2 0x7ff732e56a9b in invoke_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78 #3 0x7ff732e56a9b in __scrt_common_main_seh D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288 #4 0x7ffcb878e8d6 (C:\WINDOWS\System32\KERNEL32.DLL+0x18002e8d6) llvm#5 0x7ffcb966c53b (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18008c53b) ==13660==Register values: rax = 0 rbx = d76492f5c0 rcx = 7ffcb97b4e28 rdx = 1ed5a870000 rdi = 12135afa0040 rsi = 0 rbp = d76492f650 rsp = d76492f490 r8 = 7ffffffffffffffc r9 = 1 r10 = 0 r11 = 246 r12 = 0 r13 = 0 r14 = 0 r15 = 0 AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: access-violation (<unknown module>) ==13660==ABORTING ``` Now we see that stack walking handled our dynamic function properly. Interestingly enough, it appears that other overloaded version of UnwindSlow procedure that works without CONTEXT structure already has some logic to handle this. Theoretically, symbolizer should also be able to provide some information about these functions, but I don't think that this is necessary. I added SANITIZER_WINDOWS64 check because I am pretty sure Microsoft only mentions these functions for 64 bit version of their OS. I also can't check how this works on ARM.
Using code/ideas from the x86 backend to optimize a select on a bitcast integer. The previous aarch64 approach was to individually extract the bits from the mask, which is kind of terrible. https://rust.godbolt.org/z/576sndT66 ```llvm define void @if_then_else8(ptr %out, i8 %mask, ptr %if_true, ptr %if_false) { start: %t = load <8 x i32>, ptr %if_true, align 4 %f = load <8 x i32>, ptr %if_false, align 4 %m = bitcast i8 %mask to <8 x i1> %s = select <8 x i1> %m, <8 x i32> %t, <8 x i32> %f store <8 x i32> %s, ptr %out, align 4 ret void } ``` turned into ```asm if_then_else8: // @if_then_else8 sub sp, sp, llvm#16 ubfx w8, w1, #4, #1 and w11, w1, #0x1 ubfx w9, w1, llvm#5, #1 fmov s1, w11 ubfx w10, w1, #1, #1 fmov s0, w8 ubfx w8, w1, llvm#6, #1 ldp q5, q2, [x3] mov v1.h[1], w10 ldp q4, q3, [x2] mov v0.h[1], w9 ubfx w9, w1, #2, #1 mov v1.h[2], w9 ubfx w9, w1, #3, #1 mov v0.h[2], w8 ubfx w8, w1, llvm#7, #1 mov v1.h[3], w9 mov v0.h[3], w8 ushll v1.4s, v1.4h, #0 ushll v0.4s, v0.4h, #0 shl v1.4s, v1.4s, llvm#31 shl v0.4s, v0.4s, llvm#31 cmlt v1.4s, v1.4s, #0 cmlt v0.4s, v0.4s, #0 bsl v1.16b, v4.16b, v5.16b bsl v0.16b, v3.16b, v2.16b stp q1, q0, [x0] add sp, sp, llvm#16 ret ``` With this PR that instead emits ```asm if_then_else8: adrp x8, .LCPI0_1 dup v0.4s, w1 ldr q1, [x8, :lo12:.LCPI0_1] adrp x8, .LCPI0_0 ldr q2, [x8, :lo12:.LCPI0_0] ldp q4, q3, [x2] and v1.16b, v0.16b, v1.16b and v0.16b, v0.16b, v2.16b ldp q5, q2, [x3] cmeq v1.4s, v1.4s, #0 cmeq v0.4s, v0.4s, #0 bsl v1.16b, v2.16b, v3.16b bsl v0.16b, v5.16b, v4.16b stp q0, q1, [x0] ret ``` So substantially shorter. Instead of building the mask element-by-element, this approach (by virtue of not splitting) instead splats the mask value into all vector lanes, performs a bitwise and with powers of 2, and compares with zero to construct the mask vector. cc rust-lang/rust#122376 cc llvm#175769
`SE.getUMaxExpr` causes assertion failure due to type mismatch here: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/LoopAccessAnalysis.cpp#L253 Running `opt -S -p loop-vectorize -debug-only=loop-vectorize llvm/test/Analysis/LoopAccessAnalysis/type-mismatch-in-scalar-evolution.ll ` without the changes made in LoopAccessAnalysis.cpp causes assertion failure. Attaching the stack dump for reference: ``` LV: Checking a loop in 'loop_contains_store_assumed_bounds' from input.ll LV: Loop hints: force=? width=4 interleave=0 LV: Found a loop: for.body LV: Found an induction variable. opt: /home/kshitij/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:3918: const llvm::SCEV* llvm::ScalarEvolution::getMinMaxExpr(llvm::SCEVTypes, llvm::SmallVectorImpl<const llvm::SCEV*>&): Assertion `getEffectiveSCEVType(Ops[i]->getType()) == ETy && "Operand types don't match!"' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug. Stack dump: 0. Program arguments: opt -S -passes=loop-vectorize -debug-only=loop-vectorize -force-vector-width=4 -disable-output input.ll 1. Running pass "function(loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>)" on module "input.ll" 2. Running pass "loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>" on function "loop_contains_store_assumed_bounds" #0 0x000058ee97c5e652 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/local/bin/opt+0x4f44652) #1 0x000058ee97c5af0f llvm::sys::RunSignalHandlers() (/usr/local/bin/opt+0x4f40f0f) #2 0x000058ee97c5b05c SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0 #3 0x00007c49d4c45330 (/lib/x86_64-linux-gnu/libc.so.6+0x45330) #4 0x00007c49d4c9eb2c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 llvm#5 0x00007c49d4c9eb2c __pthread_kill_internal ./nptl/pthread_kill.c:78:10 llvm#6 0x00007c49d4c9eb2c pthread_kill ./nptl/pthread_kill.c:89:10 llvm#7 0x00007c49d4c4527e raise ./signal/../sysdeps/posix/raise.c:27:6 llvm#8 0x00007c49d4c288ff abort ./stdlib/abort.c:81:7 llvm#9 0x00007c49d4c2881b _nl_load_domain ./intl/loadmsgcat.c:1177:9 llvm#10 0x00007c49d4c3b517 (/lib/x86_64-linux-gnu/libc.so.6+0x3b517) llvm#11 0x000058ee98003fdb llvm::ScalarEvolution::getMinMaxExpr(llvm::SCEVTypes, llvm::SmallVectorImpl<llvm::SCEV const*>&) (/usr/local/bin/opt+0x52e9fdb) llvm#12 0x000058ee98004507 llvm::ScalarEvolution::getUMaxExpr(llvm::SCEV const*, llvm::SCEV const*) (/usr/local/bin/opt+0x52ea507) llvm#13 0x000058ee980dc728 llvm::getStartAndEndForAccess(llvm::Loop const*, llvm::SCEV const*, llvm::Type*, llvm::SCEV const*, llvm::SCEV const*, llvm::ScalarEvolution*, llvm::DenseMap<std::pair<llvm::SCEV const*, llvm::Type*>, std::pair<llvm::SCEV const*, llvm::SCEV const*>, llvm::DenseMapInfo<std::pair<llvm::SCEV const*, llvm::Type*>, void>, llvm::detail::DenseMapPair<std::pair<llvm::SCEV const*, llvm::Type*>, std::pair<llvm::SCEV const*, llvm::SCEV const*>>>*, llvm::DominatorTree*, llvm::AssumptionCache*, std::optional<llvm::ScalarEvolution::LoopGuards>&) (/usr/local/bin/opt+0x53c2728) llvm#14 0x000058ee9814008b llvm::isDereferenceableAndAlignedInLoop(llvm::LoadInst*, llvm::Loop*, llvm::ScalarEvolution&, llvm::DominatorTree&, llvm::AssumptionCache*, llvm::SmallVectorImpl<llvm::SCEVPredicate const*>*) (/usr/local/bin/opt+0x542608b) llvm#15 0x000058ee9a0fa1ca llvm::LoopVectorizationLegality::canUncountableExitConditionLoadBeMoved(llvm::BasicBlock*) (/usr/local/bin/opt+0x73e01ca) llvm#16 0x000058ee9a0faee0 llvm::LoopVectorizationLegality::isVectorizableEarlyExitLoop() (/usr/local/bin/opt+0x73e0ee0) llvm#17 0x000058ee9a104678 llvm::LoopVectorizationLegality::canVectorize(bool) (/usr/local/bin/opt+0x73ea678) llvm#18 0x000058ee9a08c953 llvm::LoopVectorizePass::processLoop(llvm::Loop*) (/usr/local/bin/opt+0x7372953) llvm#19 0x000058ee9a090e21 llvm::LoopVectorizePass::runImpl(llvm::Function&) (/usr/local/bin/opt+0x7376e21) llvm#20 0x000058ee9a0914e0 llvm::LoopVectorizePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/opt+0x73774e0) llvm#21 0x000058ee99e419a5 llvm::detail::PassModel<llvm::Function, llvm::LoopVectorizePass, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) PassBuilderPipelines.cpp:0:0 llvm#22 0x000058ee97f18905 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/opt+0x51fe905) llvm#23 0x000058ee995d70d5 llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) AMDGPUTargetMachine.cpp:0:0 llvm#24 0x000058ee97f17051 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/opt+0x51fd051) llvm#25 0x000058ee995d7775 llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) AMDGPUTargetMachine.cpp:0:0 llvm#26 0x000058ee97f1783d llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/opt+0x51fd83d) llvm#27 0x000058ee9c153909 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool, bool) (/usr/local/bin/opt+0x9439909) llvm#28 0x000058ee97c3f380 optMain (/usr/local/bin/opt+0x4f25380) llvm#29 0x00007c49d4c2a1ca __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3 llvm#30 0x00007c49d4c2a28b call_init ./csu/../csu/libc-start.c:128:20 llvm#31 0x00007c49d4c2a28b __libc_start_main ./csu/../csu/libc-start.c:347:5 llvm#32 0x000058ee97c309a5 _start (/usr/local/bin/opt+0x4f169a5) ``` This is caused by a type mismatch between `SE.getSCEV(DerefRK.IRArgValue)` and `DerefBytesSCEV`. Fixing this by extending them to the wider type.
…D tensor.extract (llvm#187085) Vectorizing a rank-0 `linalg.generic` whose body contains `tensor.extract` with data-dependent indices hits the Gather classification in `getTensorExtractMemoryAccessPattern` because `isOutput1DVector` returns false for a 0-D result. This produces an invalid `vector.gather` where operand #2 must be a vector of index values but gets a scalar `index` instead. Fix classifies a 0-D result as ScalarBroadcast rather than Gather, and skips mask generation for 0-D in that path.
…e edge case (llvm#188590) llvm#186966 was reverted because the test case triggered a use-of-uninitialized-memory (https://lab.llvm.org/buildbot/#/builders/94/builds/16379), due to the include directive omitting a trailing newline. This patch adds a minor fix to avoid the use-of-uninitialized-memory, and deliberately re-adds the test case sans trailing newline for regression testing. MSan report prior to this fix: ``` @@@BUILD_STEP sanitizer logs: stage2/msan_track_origins check@@@ ==clang-scan-deps==616960==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x5555599c3300 in isAnnotation /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/include/clang/Lex/Token.h:131:38 #1 0x5555599c3300 in setLength /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/include/clang/Lex/Token.h:152:13 #2 0x5555599c3300 in clang::Lexer::FormTokenWithChars(clang::Token&, char const*, clang::tok::TokenKind) /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/include/clang/Lex/Lexer.h:644:12 #3 0x5555599cf895 in clang::Lexer::LexEndOfFile(clang::Token&, char const*) /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/Lexer.cpp:3166:5 #4 0x555559bb229b in clang::Preprocessor::Lex(clang::Token&) /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/Preprocessor.cpp:916:11 llvm#5 0x555559aa5365 in __invoke<void (clang::Preprocessor::*&)(clang::Token &), clang::Preprocessor *, clang::Token &> /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/libcxx_install_msan_track_origins/include/c++/v1/__type_traits/invoke.h:90:27 llvm#6 0x555559aa5365 in invoke<void (clang::Preprocessor::*&)(clang::Token &), clang::Preprocessor *, clang::Token &> /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/libcxx_install_msan_track_origins/include/c++/v1/__functional/invoke.h:29:10 llvm#7 0x555559aa5365 in operator()<void (clang::Preprocessor::*)(clang::Token &)> /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:470:5 llvm#8 0x555559aa5365 in clang::Preprocessor::CheckEndOfDirective(llvm::StringRef, bool, llvm::SmallVectorImpl<clang::Token>*) /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:478:5 llvm#9 0x555559ab96b5 in clang::Preprocessor::HandleIncludeDirective(clang::SourceLocation, clang::Token&, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:2205:7 ... ```
Running gcc test c-c++-common/tsan/tls_race.c on s390 we get: ThreadSanitizer: CHECK failed: tsan_platform_linux.cpp:618 "((thr_beg)) >= ((tls_addr))" (0x3ffaa35e140, 0x3ffaa35e250) (tid=2419930) #0 __tsan::CheckUnwind() /devel/src/libsanitizer/tsan/tsan_rtl.cpp:696 (libtsan.so.2+0x91b57) #1 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /devel/src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:86 (libtsan.so.2+0xd211b) #2 __tsan::ImitateTlsWrite(__tsan::ThreadState*, unsigned long, unsigned long) /devel/src/libsanitizer/tsan/tsan_platform_linux.cpp:618 (libtsan.so.2+0x8faa3) #3 __tsan::ThreadStart(__tsan::ThreadState*, unsigned int, unsigned long long, __sanitizer::ThreadType) /devel/src/libsanitizer/tsan/tsan_rtl_thread.cpp:225 (libtsan.so.2+0xaadb5) #4 __tsan_thread_start_func /devel/src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1065 (libtsan.so.2+0x3d34d) llvm#5 start_thread <null> (libc.so.6+0xae70d) (BuildId: d3b08de1b543c2d15d419bf861b3c2e4c01ac75b) llvm#6 thread_start <null> (libc.so.6+0x12d2ff) (BuildId: d3b08de1b543c2d15d419bf861b3c2e4c01ac75b) In order to determine the static TLS blocks in GetStaticTlsBoundary we iterate over the modules and try to find the largest range without a gap. Here we might have that modules are spaced exactly by the alignment. For example, for the failing test we have: (gdb) p/x ranges.data_[0] $1 = {begin = 0x3fff7f9e6b8, end = 0x3fff7f9e740, align = 0x8, tls_modid = 0x3} (gdb) p/x ranges.data_[1] $2 = {begin = 0x3fff7f9e740, end = 0x3fff7f9eed0, align = 0x40, tls_modid = 0x2} (gdb) p/x ranges.data_[2] $3 = {begin = 0x3fff7f9eed8, end = 0x3fff7f9eef8, align = 0x8, tls_modid = 0x4} (gdb) p/x ranges.data_[3] $4 = {begin = 0x3fff7f9eefc, end = 0x3fff7f9ef00, align = 0x4, tls_modid = 0x1} where ranges[3].begin == ranges[2].end + ranges[3].align holds. Since in the loop a strict inequality test is used we compute the wrong address (gdb) p/x *addr $5 = 0x3fff7f9eefc whereas 0x3fff7f9e6b8 is expected which is why we bail out in the subsequent.
…8271) Example: int foo(int a, int b) { return a - 1 + ~b; } Before, on AArch64: mvn w8, w1 add w8, w0, w8 sub w0, w8, #1 After (matches gcc): sub w0, w0, w1 sub w0, w0, #2 Proof: https://alive2.llvm.org/ce/z/g_bV01
…bols add' (llvm#188377) Context: lldb might crash when running to a debuggee crashing state and do a target symbols add command. Backtrace: ``` #0 0x000055ca6790dc65 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:848:11 #1 0x000055ca6790e434 PrintStackTraceSignalHandler(void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:931:1 #2 0x000055ca6790b839 llvm::sys::RunSignalHandlers() /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Signals.cpp:104:5 #3 0x000055ca6790ff6b SignalHandler(int, siginfo_t*, void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:430:38 #4 0x00007fe9e5e44560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0 llvm#5 0x00007fe9e5f25649 syscall /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/misc/../sysdeps/unix/sysv/linux/x86_64/syscall.S:38:0 llvm#6 0x00007fe9ec649170 SignalHandler(int, siginfo_t*, void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:429:7 llvm#7 0x00007fe9e5e44560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0 llvm#8 0x00007fe9ebb77bf0 lldb_private::operator<(lldb_private::StackID const&, lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackID.cpp:99:16 llvm#9 0x00007fe9ebb6863d CompareStackID(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackFrameList.cpp:683:3 llvm#10 0x00007fe9ebb6d049 bool __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>::operator()<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/predefined_ops.h:196:4 llvm#11 0x00007fe9ebb6cefe __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>> std::__lower_bound<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID, __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&, __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/stl_algobase.h:1464:8 llvm#12 0x00007fe9ebb6cdfc __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>> std::lower_bound<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/stl_algo.h:2062:14 llvm#13 0x00007fe9ebb685fa auto llvm::lower_bound<std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>&, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>(std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>&, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)) /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/STLExtras.h:2001:10 llvm#14 0x00007fe9ebb68441 lldb_private::StackFrameList::GetFrameWithStackID(lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackFrameList.cpp:697:11 llvm#15 0x00007fe9ebbee395 lldb_private::Thread::GetFrameWithStackID(lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/include/lldb/Target/Thread.h:459:7 llvm#16 0x00007fe9ebac7cf7 lldb_private::ExecutionContextRef::GetFrameSP() const /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/ExecutionContext.cpp:643:25 llvm#17 0x00007fe9ebac80e1 lldb_private::GetStoppedExecutionContext(lldb_private::ExecutionContextRef const*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/ExecutionContext.cpp:164:34 llvm#18 0x00007fe9eb8903fa lldb_private::Statusline::Redraw(std::optional<lldb_private::ExecutionContextRef>) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Statusline.cpp:139:7 llvm#19 0x00007fe9eb7ac8be lldb_private::Debugger::RedrawStatusline(std::optional<lldb_private::ExecutionContextRef>) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Debugger.cpp:1233:3 llvm#20 0x00007fe9eb804d1e lldb_private::IOHandlerEditline::RedrawCallback() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:446:3 llvm#21 0x00007fe9eb80aa81 lldb_private::IOHandlerEditline::IOHandlerEditline(lldb_private::Debugger&, lldb_private::IOHandler::Type, std::shared_ptr<lldb_private::File> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, unsigned int, char const*, llvm::StringRef, llvm::StringRef, bool, bool, unsigned int, lldb_private::IOHandlerDelegate&)::$_2::operator()() const /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:262:73 llvm#22 0x00007fe9eb80aa5d void llvm::detail::UniqueFunctionBase<void>::CallImpl<lldb_private::IOHandlerEditline::IOHandlerEditline(lldb_private::Debugger&, lldb_private::IOHandler::Type, std::shared_ptr<lldb_private::File> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, unsigned int, char const*, llvm::StringRef, llvm::StringRef, bool, bool, unsigned int, lldb_private::IOHandlerDelegate&)::$_2>(void*) /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:213:5 llvm#23 0x00007fe9eb93bfbf llvm::unique_function<void ()>::operator()() /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:365:5 llvm#24 0x00007fe9eb93bb80 lldb_private::Editline::GetCharacter(wchar_t*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:0:5 llvm#25 0x00007fe9eb941a18 lldb_private::Editline::ConfigureEditor(bool)::$_0::operator()(editline*, wchar_t*) const /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1287:5 llvm#26 0x00007fe9eb9419e2 lldb_private::Editline::ConfigureEditor(bool)::$_0::__invoke(editline*, wchar_t*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1286:27 llvm#27 0x00007fe9f3384e26 el_getc /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:439:14 llvm#28 0x00007fe9f3384e26 el_getc /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:400:1 llvm#29 0x00007fe9f3384f90 read_getcmd /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:247:14 llvm#30 0x00007fe9f3384f90 el_gets /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:586:14 llvm#31 0x00007fe9eb9409f3 lldb_private::Editline::GetLine(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, bool&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1636:16 llvm#32 0x00007fe9eb8044d7 lldb_private::IOHandlerEditline::GetLine(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, bool&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:339:5 llvm#33 0x00007fe9eb805609 lldb_private::IOHandlerEditline::Run() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:600:11 llvm#34 0x00007fe9eb7b214c lldb_private::Debugger::RunIOHandlers() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Debugger.cpp:1280:16 llvm#35 0x00007fe9eb98f00f lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:3620:16 llvm#36 0x00007fe9eb4f0e09 lldb::SBDebugger::RunCommandInterpreter(bool, bool) /home/hyubo/osmeta/external/llvm-project/lldb/source/API/SBDebugger.cpp:1234:42 llvm#37 0x000055ca6788d6b0 Driver::MainLoop() /home/hyubo/osmeta/external/llvm-project/lldb/tools/driver/Driver.cpp:677:3 llvm#38 0x000055ca6788e226 main /home/hyubo/osmeta/external/llvm-project/lldb/tools/driver/Driver.cpp:887:17 llvm#39 0x00007fe9e5e2c657 __libc_start_call_main /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/nptl/libc_start_call_main.h:58:16 llvm#40 0x00007fe9e5e2c718 call_init /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:128:20 llvm#41 0x00007fe9e5e2c718 __libc_start_main@GLIBC_2.2.5 /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:379:5 llvm#42 0x000055ca67889a11 _start /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86_64/start.S:118:0 Segmentation fault (core dumped) ``` When `target symbols add` is run, `Symtab::AddSymbol()` can reallocate the underlying `std::vector<Symbol>` and resize it, invalidating all existing Symbol* pointers. While `Process::Flush()` clears stale stack frames, the statusline caches its own `ExecutionContextRef` containing a `StackID` with a `SymbolContextScope*` (which can be a `Symbol*`). This cached reference is not cleared by `Process::Flush()`, so the next statusline redraw accesses a dangling pointer and crashes. Fix this by adding `Statusline::Flush()` which clears the cached frame, `Debugger::Flush()` which forwards to it under the statusline mutex, and calling `Debugger::Flush()` from `Process::Flush()` so that all flush paths (symbol add, exec, module load) also invalidate the statusline's stale state. After this fix, lldb is not crashing anymore, new symbols from a symbol file are correctly loaded --------- Co-authored-by: George Hu <georgehuyubo@gmail.com>
When Control Flow Integrity (CFI) is enabled, jump tables are used to redirect indirect calls. Previously, these jump table entries lacked debug information, making it difficult for profilers and debuggers to attribute execution time correctly. Now stack trace, when stopped on jump table entry will looks like this: ``` #0: __ubsan_check_cfi_icall_jt at sanitizer/ubsan_interface.h:0 #1: c::c() (.cfi_jt) at sanitizer/ubsan_interface.h:0:0 #2: .cfi.jumptable.81 at sanitizer/ubsan_interface.h:0:0 ```
…93670) When Control Flow Integrity (CFI) is enabled, jump tables are used to redirect indirect calls. Previously, these jump table entries lacked debug information, making it difficult for profilers and debuggers to attribute execution time correctly. Now stack trace, when stopped on jump table entry will looks like this: ``` #0: __ubsan_check_cfi_icall_jt at sanitizer/ubsan_interface.h:0 #1: c::c() (.cfi_jt) at sanitizer/ubsan_interface.h:0:0 #2: .cfi.jumptable.81 at sanitizer/ubsan_interface.h:0:0 ``` This is reland of llvm#192736, reverted with llvm#193663. This version don't update debug info for "Cross-DSO CFI" mode.
…s an equality comparison operator" (llvm#177415) Reland PR llvm#176893 which was an attempt to reland PR llvm#176429. Fix: There was a test case asserting that for types `StructA` `StructB` where `operator==(const StructB &, const StructA &)` is defined, `has_equality_comparison_v<StructA, StructB>` is false because the arguments are the wrong way around. However, in C++20, operator overload resolution was changed so that for reflexive comparison operators `==` and `!=` if a candidate exists with the arguments swapped then this will be used. This means the test case failed when compiled with C++20. That check was simply removed in this version.
…input" (llvm#195551) Reverts llvm#190863 due to buildbot breakage e.g., https://lab.llvm.org/buildbot/#/builders/52/builds/16951 ``` Failed Tests (1): LLVM :: tools/llvm-profgen/filter-build-id.test ``` ``` ==llvm-profgen==3809550==ERROR: AddressSanitizer: container-overflow on address 0x6e80441e1762 at pc 0x6216c3f2cdce bp 0x7fff3c3ddf60 sp 0x7fff3c3dd710 READ of size 8 at 0x6e80441e1762 thread T0 #0 0x6216c3f2cdcd in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:848:7 #1 0x6216c3f2d25c in bcmp /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:894:10 #2 0x6216c400b836 in operator== /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/StringRef.h:914:10 #3 0x6216c400b836 in operator!= /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/StringRef.h:917:69 #4 0x6216c400b836 in llvm::sampleprof::PerfScriptReader::extractCallstack(llvm::sampleprof::TraceStream&, llvm::SmallVectorImpl<unsigned long>&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:801:36 llvm#5 0x6216c400d37a in llvm::sampleprof::HybridPerfReader::parseSample(llvm::sampleprof::TraceStream&, unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:881:8 llvm#6 0x6216c40150d8 in parseSample /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1118:3 llvm#7 0x6216c40150d8 in llvm::sampleprof::PerfScriptReader::parseEventOrSample(llvm::sampleprof::TraceStream&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1201:5 llvm#8 0x6216c401539a in llvm::sampleprof::PerfScriptReader::parseAndAggregateTrace() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1210:5 llvm#9 0x6216c4018c88 in llvm::sampleprof::PerfScriptReader::parsePerfTraces() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1457:3 llvm#10 0x6216c3ff2c7a in main /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/llvm-profgen.cpp:229:19 llvm#11 0x72404502a8c0 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a8c0) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb) llvm#12 0x72404502a9d7 in __libc_start_main (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a9d7) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb) llvm#13 0x6216c3f0f3d4 in _start (/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-profgen+0x2f083d4) 0x6e80441e1762 is located 18 bytes inside of 48-byte region [0x6e80441e1750,0x6e80441e1780) allocated by thread T0 here: #0 0x6216c3feab0d in operator new(unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:109:35 #1 0x724045511c07 in __libcpp_allocate<char> /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__new/allocate.h:42:28 #2 0x724045511c07 in allocate /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocator.h:92:14 #3 0x724045511c07 in allocate_at_least /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocator.h:99:13 #4 0x724045511c07 in allocate_at_least<std::__1::allocator<char> > /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocator_traits.h:340:22 llvm#5 0x724045511c07 in __allocate_at_least<std::__1::allocator<char> > /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocate_at_least.h:36:16 llvm#6 0x724045511c07 in __allocate_long_buffer /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/string:2259:21 llvm#7 0x724045511c07 in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__grow_by(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/string:2769:25 llvm#8 0x6216c401d90a in __grow_by_without_replace /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/string:2795:3 llvm#9 0x6216c401d90a in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>& std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::append[abi:sqn230000]<char const*, 0>(char const*, char const*) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/string:1431:9 llvm#10 0x6216c401d1a6 in std::__1::basic_istream<char, std::__1::char_traits<char>>& std::__1::getline[abi:sqn230000]<char, std::__1::char_traits<char>, std::__1::allocator<char>>(std::__1::basic_istream<char, std::__1::char_traits<char>>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, char) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/istream:1309:15 llvm#11 0x6216c4014a76 in getline<char, std::__1::char_traits<char>, std::__1::allocator<char> > /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/istream:1343:10 llvm#12 0x6216c4014a76 in advance /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.h:52:10 llvm#13 0x6216c4014a76 in llvm::sampleprof::PerfScriptReader::parseAggregatedCount(llvm::sampleprof::TraceStream&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1110:13 llvm#14 0x6216c4015095 in parseSample /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1116:20 llvm#15 0x6216c4015095 in llvm::sampleprof::PerfScriptReader::parseEventOrSample(llvm::sampleprof::TraceStream&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1201:5 llvm#16 0x6216c401539a in llvm::sampleprof::PerfScriptReader::parseAndAggregateTrace() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1210:5 llvm#17 0x6216c4018c88 in llvm::sampleprof::PerfScriptReader::parsePerfTraces() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1457:3 llvm#18 0x6216c3ff2c7a in main /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/llvm-profgen.cpp:229:19 llvm#19 0x72404502a8c0 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a8c0) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb) llvm#20 0x72404502a9d7 in __libc_start_main (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a9d7) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb) llvm#21 0x6216c3f0f3d4 in _start (/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-profgen+0x2f083d4) ```
…198548) When an MCP client disconnects (EOF), `IOTransport::OnRead` called `handler.OnClosed()` before resetting `m_read_handle`. The MCP server's `OnClosed` handler erases the client from `m_instances`, destroying both the transport (`this`) and the binder (`handler`). The subsequent `m_read_handle.reset()` then accessed the destroyed transport's member, causing a use-after-free (SIGSEGV). * thread #1, stop reason = signal SIGSEGV: address not mapped to object (fault address=0x28) * frame #0: 0x00007ff5d4d5afda liblldb.so.23.2`lldb_private::transport::IOTransport<lldb_protocol::mcp::ProtocolDescriptor>::OnRead(lldb_private::MainLoopBase&, lldb_private::transport::JSONTransport<lldb_protocol::mcp::ProtocolDescriptor>::MessageHandler&) + 1274 frame #1: 0x00007ff5d1140ad8 liblldb.so.23.0`lldb_private::MainLoopPosix::Run() + 408 frame #2: 0x00007ff5d1760c1c liblldb.so.23.0`std::thread::_State_impl<std::thre Fix by resetting the read handle before calling `OnClosed()`, so no transport members are accessed after the handler potentially destroys the transport. Then when the scope is left, the destructor is called for the new read_handle local variable and it is cleaned up. New unit tests added that fail without this change. With the change, the custom 'ai' script (allows end user locally to communicate lldb context to agent backend via a spun up MCP server: "protocol-server start MCP listen://localhost:{port}") now successfully concludes without this crash Assisted with: claude
Update 2:
After another round of review update the uArch definition in the following way:
Update:
After the first iteration of review comment, the current PR is update in the following way:
-- Full uArch definition, utilities and all (at least for one instruction for now, DPAS).
-- Definition for 2 uArchs (PVC, BMG).
-- A pass to attach the uArch name to the module attribute using DLTI.
-- Shows the usage of the uArch interface to verify XeGPU ops.
-- Few test cases.
P.S. I still kept some old code intentionally, if we want to go back or use some of it. I'll clean up once we agree on the process.
================================
Hi All,
This is the initial PR for uArch definition. The primary purpose for this PR is to have a discussion and finalize the design. So please think of it as an RFC as well.
First target we want to achieve here is to finalize the skeleton:
Due to this purpose, the main focus point of this PR are the following files:
mlir/lib/Dialect/XeGPU/Utils/intel_gpu_pvc.yaml: holds the YAML definition of uArch (PVC is used for this use case)mlir/include/mlir/Dialect/XeGPU/Utils/uArch.h: holds the base uArch struct that can be re-used by specialized uArchs such as PVC.mlir/include/mlir/Dialect/XeGPU/Utils/IntelGpuPVC.h: holds the PVC-specific structs.mlir/lib/Dialect/XeGPU/Utils/IntelGpuPVC.cpp is added as placeholder for YAML mapping. This file is intentionally not complete. We want to reach a consensus on the YAML, and C++ structures before filling this one in.
Please provide your valuable feedback as we try to finalize this together.