
feat: C# support (System.Security.Cryptography) - initial draft#376

Draft
fynnth wants to merge 4 commits into main from feat/csharp-support


@fynnth fynnth commented Mar 4, 2026

C# Support — Initial Draft (feat/csharp-support)

Hey TSC-Members, just putting this up as a draft so you can take a look at what I've been working on. This is not meant to be merged anytime soon and it's more of a "here's where I'm at, please tell me if I'm going in the right direction" kind of thing.


What's in here

I've added a first pass at C# / System.Security.Cryptography support. The approach is basically the one discussed a year ago for parsing: a custom SonarQube sensor backed by an ANTLR grammar that parses C# source files (it is not yet a complete C# parser). The sensor hooks into the existing engine and detection rule infrastructure, so detection rules and the mapper layer stay consistent with the other languages.

I was heavily inspired by the recently added Go support and had Claude basically replicate all engine files and the sensor setup, adapted for C#. This seemed to work very well, but please correct me on any mistakes I made there.

There are a handful of detection rules for the most common System.Security.Cryptography classes (AES, DES, RSA, HMAC variants, hashing, etc.) so that it's enough to validate that the whole pipeline works end to end.

I tested this locally against SonarQube and the rules do produce findings, so at least the basic flow is working to some extent.

Things I'd specifically like @n1ckl0sk0rtge to look at

As I said, I worked on this partly together with Claude (my first time using it as well) and I'm honestly not 100% sure whether some of my/its structural choices align with how things are supposed to be done here. Specifically:

  • Does the module layout (the new csharp/ module, the sensor, etc.) fit the pattern you had in mind for new language support?
  • Is the ANTLR approach the right call, or would you have done something differently? Also, this is just an initial parser that does not cover the whole Microsoft C# grammar. How complete does it need to be for our purposes?
  • Any issues with how I'm wiring things into the main plugin entry point?
  • I have not yet put any effort into a detection rule strategy or mapping for C#, so tips there are welcome before I start on library coverage (if you have any)!

No need to go deep on the detection rules themselves for now, since those will need to be expanded a lot anyway. I'm mostly asking about the skeleton.

Please test on another system if you can

I've only run this on my own machine (WSL2). It would be really helpful if someone could pull this branch, build it, and drop the JAR into a SonarQube instance to see whether the findings actually show up. I ask because I previously had issues on Windows with CRLF vs. LF line endings in this project's .sh scripts, and for me compilation works out of the box only on Linux, not on Windows.

Build and deploy steps are the same as for the rest of the plugin -> build with mvn clean package, then copy the JAR from sonar-cryptography-plugin/target/ into your SonarQube plugins folder and restart.


Again, this is very much a draft. I am very happy to rework anything that's off or needs some fine tuning.

Adds a first-pass C# language module (csharp/) using an ANTLR-based
custom SonarQube sensor, following the pattern established by the Go
sensor. Includes a custom tree representation, detection engine, and
detection rules for the most common System.Security.Cryptography classes:
AES, DES, TripleDES, RC2, RSA, DSA, ECDsa, ECDiffieHellman, HMAC variants,
SHA hashing, and Rfc2898DeriveBytes.

All detection rules have unit tests and have been validated end-to-end
against a local SonarQube instance (23 findings produced).

Signed-off-by: Fynn Thierling <Fynnth@outlook.de>
@fynnth fynnth force-pushed the feat/csharp-support branch from 7a469e6 to 244d023 on March 4, 2026 12:24
@fynnth
Author

fynnth commented Mar 4, 2026

One thing I noticed while looking at the implementation more closely: property assignments like aes.Mode = CipherMode.CBC are currently not detected at all -> the tree converter only picks up method calls and constructors, so anything set via a property after object creation is invisible to the detection engine. This means mode, padding, and similar parameters are missed unless they're passed directly as constructor arguments.

I'm not sure how to fix this properly. From what I can tell it would require either tracking the variable across statements (Symbol table) or using a different approach with actual semantic analysis. Any input on that?

@san-zrl
Contributor

san-zrl commented Mar 6, 2026

Hi @fynnth - Thanks a lot for the code drop. Amazing work! Do you have a C# test repo with a set of known findings a scan should produce? I could give it a try on a Mac.

@fynnth
Author

fynnth commented Mar 6, 2026

Hi @san-zrl,
I have just used the sonar-crypto repo, since its test files contain the invocations I wanted to detect (for the test file folder alone it produced 23 findings, if I recall correctly). Later, when I take a more serious approach to proper detection rules, I will of course test on some open-source .NET repos!

Also, since the parser does not use the full C# grammar, many things in real-world repos won't work.

@san-zrl
Contributor

san-zrl commented Mar 6, 2026

Hi @fynnth - I pulled your PR, built it and ran it in a docker container SonarQube instance on my system. I scanned the files in csharp/src/test/files and got the summary below. Is this what you expected to see?

INFO: ========== CBOM Statistics ==========
INFO: Detected Assets                  : 23
INFO:  - Mac                           : 5
INFO:  - KeyAgreement                  : 2
INFO:  - PasswordBasedKeyDerivationFunction: 1
INFO:  - MessageDigest                 : 5
INFO:  - Signature                     : 3
INFO:  - PublicKeyEncryption           : 2
INFO:  - BlockCipher                   : 5
INFO: =====================================

@fynnth
Author

fynnth commented Mar 6, 2026

Thank you, that looks perfect!

In the coming week I will try to extend the parser to be more complete and maybe introduce a symbol table to also detect properties and similar things. There could still be some issues after that with properties outside of functions, but I think it will be a big step if I can achieve that :)

I will also try to integrate any feedback Nicklas still has!

@n1ckl0sk0rtge
Contributor

@fynnth some initial feedback. In general looks very good!!! Thanks for coding this up ☺️

Revert JsonCipherSuites.java changes — no logic change, just noisy reformatting

The cipher suites are condensed from multi-line to single-line format. The license header was also removed. This adds diff noise, will fail mvn spotless:check (missing license header), and has nothing to do with C# support.

Please revert this file entirely.

Move CSharpTreeConverter logic into the engine module

Looking at the other language modules like Go, the equivalent tree walking/conversion is handled within the engine layer (GoDetectionEngine directly processes sonar-go's Tree types inside the engine module).

Also, the ANTLR build should already generate a CSharpParserBaseVisitor class (both <visitor>true</visitor> and <listener>true</listener> are set in pom.xml). Instead of manually walking the parse tree with instanceof checks and casting, consider extending CSharpParserBaseVisitor and overriding methods like visitMethodInvocation(), visitObjectCreation(), etc. This would make the tree conversion cleaner and easier to maintain as the grammar evolves.

ANTLR silent error handling in CryptoCSharpSensor.java

lexer.removeErrorListeners() and parser.removeErrorListeners() silently swallow parse errors. Malformed C# files will be partially scanned with no indication. The Go sensor, by comparison, logs warnings and optionally fails fast.

At least, add a custom error listener that logs at WARN level. Ideally, but optionally, add the fail-fast option like it's done for Go.
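To make the suggestion concrete, here is a minimal sketch of the logging and fail-fast behavior. It is deliberately ANTLR-free so it runs on its own: `ParseErrorReporter` and its methods are hypothetical names, and in the real sensor the same logic would presumably live in a subclass of ANTLR's `BaseErrorListener`, whose `syntaxError` callback supplies the line, column, and message used below.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

/** ANTLR-free stand-in for a parse error listener (all names are hypothetical). */
final class ParseErrorReporter {
    private static final Logger LOG = Logger.getLogger(ParseErrorReporter.class.getName());

    private final String fileName;
    private final boolean failFast;
    private final List<String> errors = new ArrayList<>();

    ParseErrorReporter(String fileName, boolean failFast) {
        this.fileName = fileName;
        this.failFast = failFast;
    }

    /** Takes the same position data a parser error callback would provide. */
    void syntaxError(int line, int charPositionInLine, String msg) {
        String formatted = String.format("%s:%d:%d %s", fileName, line, charPositionInLine, msg);
        errors.add(formatted);
        // Always surface the problem instead of swallowing it.
        LOG.log(Level.WARNING, "C# parse error at {0}", formatted);
        if (failFast) {
            // Optional strict mode: abort instead of scanning a partial tree.
            throw new IllegalStateException("Aborting scan, parse error at " + formatted);
        }
    }

    int errorCount() {
        return errors.size();
    }

    List<String> errors() {
        return new ArrayList<>(errors);
    }
}
```

With `failFast` set to false the scan continues and every error is logged and counted; with true the first error stops the file.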

withoutParameters() on constructors that accept parameters

AesCcm(key), AesGcm(key), and Rfc2898DeriveBytes(password, salt, iterations) all use .withoutParameters(), meaning arguments are not captured. For things like key derivation, capturing password length and salt length is relevant.

This is not a problem if we skip it for now, but then add a TODO in the code as a known gap, or add parameter detection for those cases.

Limited variable/assignment tracking

It seems like only literals, member access (CipherMode.CBC), and direct identifiers are resolved? So, var cipher = Aes.Create(); cipher.KeySize = 256; cannot be traced?

I don't know if that is a limitation of using ANTLR (no semantic analysis), but it should be documented clearly so users understand the detection scope.

@n1ckl0sk0rtge
Contributor

n1ckl0sk0rtge commented Mar 7, 2026

@fynnth

Does the module layout fit the pattern for new language support?

Yes, mostly. I think, just the CSharpTreeConverter should move into the engine module (engine/src/main/java/com/ibm/engine/language/csharp/), since that's where the equivalent logic lives for other languages.

Is the ANTLR approach the right call? How complete does the grammar need to be?

ANTLR is the right call here :) there's no sonar-csharp rule API to hook into (unlike Java/Python), so a custom parser is the only option.

Any issues with the plugin entry point wiring?

No, looks good! :)

Tips on detection rule strategy and mapping?

Some thoughts:

  • Start with factory methods over constructors — Aes.Create(), RSA.Create(keySize) are the idiomatic .NET patterns. You're already doing this, which is good.
  • Property assignments matter — In C#, crypto configuration is often done via property setters (aes.KeySize = 256; aes.Mode = CipherMode.CBC; aes.Padding = PaddingMode.PKCS7). Without tracking these, we would miss relevant parameters. That will require some variable tracking in the engine.
  • For mapping, look at how the Java module maps JCA algorithm strings (e.g., "AES/CBC/PKCS5Padding") to the CBOM model. C#'s System.Security.Cryptography is more explicit (separate classes per algorithm), so the mapping is actually simpler — class name → algorithm is mostly 1:1.
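As a rough illustration of the mapping point, the sketch below contrasts the two styles. The class name `CSharpAlgorithmNames` and the algorithm labels are assumptions made for this example; the real mapper would emit the project's CBOM model types rather than plain strings.

```java
import java.util.Map;
import java.util.Optional;

/** Illustrative class-name lookup; labels are placeholders, not the CBOM model. */
final class CSharpAlgorithmNames {
    // In C#, the class itself names the algorithm, so a table lookup is enough.
    private static final Map<String, String> BY_CLASS = Map.of(
            "Aes", "AES",
            "AesGcm", "AES",
            "AesCcm", "AES",
            "DES", "DES",
            "TripleDES", "3DES",
            "RSA", "RSA",
            "SHA256", "SHA256",
            "HMACSHA256", "HMAC-SHA256",
            "Rfc2898DeriveBytes", "PBKDF2");

    static Optional<String> algorithmFor(String className) {
        return Optional.ofNullable(BY_CLASS.get(className));
    }

    /** JCA-style contrast: one string carries algorithm, mode, and padding. */
    static String[] splitJcaTransformation(String transformation) {
        return transformation.split("/");
    }
}
```

`algorithmFor("Rfc2898DeriveBytes")` yields PBKDF2 directly, whereas a JCA string such as "AES/CBC/PKCS5Padding" first has to be split into its three parts. Mode and padding in C# would come from the property setters instead.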

In general, this repo might help as a reference: https://github.com/FRI-DAY/sonar-gosu-plugin/tree/main. It's a Sonar plugin based on ANTLR.

Signed-off-by: Fynn Thierling <Fynnth@outlook.de>
@fynnth
Author

fynnth commented Mar 7, 2026

Hey @n1ckl0sk0rtge ,

Thank you for the quick response!
Here's a summary of what was addressed in the latest commit based on your review:

Addressed feedback:

  • Added CSharpParserErrorListener with WARN-level logging on ANTLR parse errors, with a unit test (CSharpParserErrorListenerTest) verifying the behavior on malformed and valid input

  • Moved CSharpTreeConverter from csharp/ to engine/ and refactored it to extend CSharpParserBaseVisitor, using visitBlock and visitPrimary_expression overrides instead of manual instanceof traversal (I hope this is what you meant :) )

  • Documented the variable tracking limitation clearly in CSharpDetectionEngine's comment.

  • Implemented Phase 1 of variable tracking: CSharpTreeConverter.StatementCollector now overrides visitLocal_variable_declarator to populate assignedIdentifier on CSharpMethodInvocationTree and CSharpObjectCreationTree nodes (e.g. var aes = Aes.Create() → assignedIdentifier = "aes"). isInitForVariable and isInvocationOnVariable in CSharpDetectionEngine are now implemented accordingly. Added CSharpTreeConverterTest to directly verify the extraction.
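The idea behind Phase 1 can be sketched roughly as follows. This is not the PR's code: `CallSketch`, `VariableTrackingSketch`, and their methods are hypothetical stand-ins that only mirror the concept of an assigned identifier on an initializing call and a receiver check on later calls within the same block.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical flattened call node: receiver.method(...) with an optional target variable. */
final class CallSketch {
    final String receiver;           // e.g. "Aes" or "aes"
    final String method;             // e.g. "Create"
    final String assignedIdentifier; // e.g. "aes", or null if the result is not stored

    CallSketch(String receiver, String method, String assignedIdentifier) {
        this.receiver = receiver;
        this.method = method;
        this.assignedIdentifier = assignedIdentifier;
    }
}

final class VariableTrackingSketch {
    /** True if the call's result is stored in the given variable. */
    static boolean initializes(CallSketch call, String variable) {
        return variable.equals(call.assignedIdentifier);
    }

    /** True if the call is made on the given variable. */
    static boolean isCalledOn(CallSketch call, String variable) {
        return variable.equals(call.receiver);
    }

    /** Calls after init in the same block that target the variable init assigned. */
    static List<CallSketch> followUpCalls(List<CallSketch> block, CallSketch init) {
        List<CallSketch> result = new ArrayList<>();
        int start = block.indexOf(init);
        if (start < 0 || init.assignedIdentifier == null) {
            return result;
        }
        for (int i = start + 1; i < block.size(); i++) {
            CallSketch next = block.get(i);
            if (initializes(next, init.assignedIdentifier)) {
                break; // variable was reassigned, so stop tracking
            }
            if (isCalledOn(next, init.assignedIdentifier)) {
                result.add(next);
            }
        }
        return result;
    }
}
```

Because the lookup is limited to a single statement list, it has the same blind spot noted in the documentation: anything in a nested or different block is out of reach.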

Does everything fit what you had in mind?

Question on Phase 2 (property assignment tracking):

For patterns like aes.Mode = CipherMode.CBC / aes.KeySize = 256, i was thinking of extracting these as a new CSharpPropertyAssignmentTree node type in the converter, and then extending the detection engine to scan the block for property assignments on a tracked variable after an initial detection fires. Would this be the approach you'd recommend, or would you prefer to model property setters as synthetic method invocations (e.g. set_Mode(CipherMode.CBC)) so that existing withDependingDetectionRules machinery can be reused? Happy to go either direction or something completely different, because i really do not know what would be best.

@n1ckl0sk0rtge
Contributor

Hey @fynnth,

thanks for adjusting this! :)

Regarding the property assignment tracking, I'd think modeling property setters as method invocations sounds good, but I'm not 100% sure either :)

C# properties are syntactic sugar for get_/set_ methods, so the compiler turns such an assignment into something like aes.set_Mode(CipherMode.CBC) anyway. Modeling them as synthetic method invocations in the tree converter is actually closer to the underlying semantics.

As i would see it, in CSharpTreeConverter, when you encounter an assignment expression where the left-hand side is a member access on a tracked variable (e.g., aes.Mode), emit a CSharpMethodInvocationTree with method name set_Mode and the right-hand side as the argument. The existing isInvocationOnVariable + withDependingDetectionRules chain would handle the rest.

The only thing to watch out for is that the tree visitor visits these synthetic nodes the same way it visits real method invocations, so that the depending rule chain picks them up correctly.
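A minimal sketch of that rewrite, under stated assumptions: `SyntheticInvocation` and `PropertySetterRewriter` are hypothetical names rather than the PR's CSharpMethodInvocationTree or converter, and the assignment is passed in as already-separated strings instead of a parse tree node.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical stand-in for an invocation node produced by the converter. */
final class SyntheticInvocation {
    final String receiver;
    final String methodName;
    final List<String> arguments;

    SyntheticInvocation(String receiver, String methodName, List<String> arguments) {
        this.receiver = receiver;
        this.methodName = methodName;
        this.arguments = arguments;
    }
}

final class PropertySetterRewriter {
    private final Set<String> trackedVariables = new HashSet<>();

    /** Registers a variable that holds a crypto object, e.g. from var aes = Aes.Create(). */
    void track(String variableName) {
        trackedVariables.add(variableName);
    }

    /** Rewrites receiver.Property = rhs into receiver.set_Property(rhs) for tracked receivers. */
    Optional<SyntheticInvocation> rewrite(
            String receiver, String property, String operator, String rightHandSide) {
        if (!"=".equals(operator)) {
            return Optional.empty(); // compound assignments are not plain setters
        }
        if (!trackedVariables.contains(receiver)) {
            return Optional.empty(); // unrelated object, nothing to emit
        }
        return Optional.of(
                new SyntheticInvocation(receiver, "set_" + property, List.of(rightHandSide)));
    }
}
```

After `track("aes")`, the assignment aes.Mode = CipherMode.CBC becomes an invocation named set_Mode with CipherMode.CBC as its single argument, which a depending rule could then match like any other method call.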

fynnth added 2 commits March 9, 2026 19:21
…cations

Model C# property assignments (e.g. ) as
synthetic  method invocations in the tree converter, following
the CLR convention where properties compile to / methods.
This enables the existing  chain to detect
property-based crypto configuration without any changes to the detection
engine itself.

Signed-off-by: Fynn Thierling <Fynnth@outlook.de>
Signed-off-by: Fynn Thierling <Fynnth@outlook.de>
@fynnth
Author

fynnth commented Mar 9, 2026

Hi @n1ckl0sk0rtge,

I have added an initial version of property tracking in the latest 2 commits (the last one was only a quick assertion fix). Please check whether you think it is enough for first detection rules or whether I should focus more on detection capabilities on the engine side. Here is a quick recap of what I think the current version can and cannot do:

Detection Capabilities:

What is detectable now through the right detection rules:

  • Static factory calls: Aes.Create(), RSA.Create(), SHA256.Create(), ...

  • Constructor calls: new AesManaged(), new AesGcm(key), new HMACSHA256(), ...

  • Method calls on objects: rsa.SignData(...), sha.ComputeHash(...), ...

  • Property setters:
    obj.Mode = CipherMode.CBC → synthetic set_Mode(CBC)
    obj.KeySize = 256 → synthetic set_KeySize(256)
    obj.Padding = PaddingMode.PKCS7 → synthetic set_Padding(PKCS7)
    -> works as soon as a detection rule for the setter exists

What I assume is still problematic:

Parser issues:

  • Object initializer syntax: new AesManaged { Mode = CipherMode.CBC, KeySize = 256 } -> generates no node in the tree
  • Cross-block property tracking: creation in an outer block, setter in an if/while/for body -> the connection between the two gets lost
  • Compound assignments: +=, -= are not identified as setters
  • Multi-level property chains: a.Inner.Prop = x is ignored

Engine gaps:

  • Inter-procedural variable tracking
  • Conditional/loop variable and crypto tracking (as already mentioned)

Grammar gaps (the parser fails and the file is skipped, since the grammar I am using is only complete up to C# 6):

  • File-scoped Namespaces namespace Foo; (C# 10) -> Grammar expects namespace Foo { }
  • Primary Constructors class Foo(int x) { } (C# 12)
  • Collection Expressions [1, 2, 3] (C# 12)
  • List-Patterns if (x is [1, 2, ..]) (C# 11)
  • Function Pointers delegate* managed (C# 9)
  • Raw String Literals """...""" (C# 11)

I am probably missing many things here (I do not code in C#), so this is not a complete list. If you know of something, let me know! I would like to start with detection rules, but if you say we should complete the skeleton first and find solutions to these problems, I understand that and will try harder to solve them. (I don't know whether that is easily done for the issues with the ANTLR grammar, though.)


3 participants