From 5a809b5cc8b6240b9a131ebf5bcfdb9f67cf1273 Mon Sep 17 00:00:00 2001 From: zyf265600 Date: Thu, 17 Jul 2025 23:25:22 -0400 Subject: [PATCH 1/4] new onboarding doc for future URA student --- .../Checker Framework URA Onboarding.md | 1650 +++++++++++++++++ 1 file changed, 1650 insertions(+) create mode 100644 docs/manual/Checker Framework URA Onboarding.md diff --git a/docs/manual/Checker Framework URA Onboarding.md b/docs/manual/Checker Framework URA Onboarding.md new file mode 100644 index 000000000000..19080885c2e6 --- /dev/null +++ b/docs/manual/Checker Framework URA Onboarding.md @@ -0,0 +1,1650 @@ +# Checker Framework Start up / Onboarding + +## 1. **Introduction** + +Welcome to this comprehensive guide on understanding the Checker Framework's internal workings. As a new contributor to this project, you're about to embark on a journey through one of Java's most sophisticated static analysis tools. This guideline will walk you through exactly what happens when the Checker Framework processes a simple Java file, layer by layer, from the moment you invoke the checker until it completes its analysis. + +### 1.1 **What is the Checker Framework?** + +The Checker Framework is a pluggable type-checking system for Java that detects potential programming errors through static analysis, such as null pointer exceptions, resource leaks, thread safety issues, and more. It operates as an annotation processor for the javac compiler, which means it's invoked during the Java compilation process and has access to analyze the source code's Abstract Syntax Tree (AST). For more information, feel free to check the developer manual https://eisop.github.io/cf/manual/checker-framework-manual.pdf. + +**Our Example: NewObject.java** + +We'll use an extremely simple yet representative Java file as our analysis subject: + +```java +public class NewObject { + void test() { + Object nn = new Object(); + } +} +``` + +While this file appears trivial, when the Checker Framework processes it, it triggers a series of complex internal mechanisms, including: +- **Compiler Integration**: Running as a javac plugin +- **AST Construction**: Converting source code into an analyzable tree structure +- **Type Inference**: Determining the type and annotations for each expression +- **Dataflow Analysis**: Tracking variable states throughout program execution +- **Control Flow Graph Construction**: Creating a graph of program execution paths +- **Error Detection**: Identifying potential type safety issues + +### 1.2 **Why This Matters** + +Understanding these internal mechanisms is crucial for: +- **Contributing Code**: Knowing how to modify and extend existing functionality +- **Debugging Issues**: Being able to locate and resolve bugs within checkers +- **Performance Optimization**: Understanding bottlenecks to improve analysis efficiency +- **Creating New Checkers**: Developing custom analyzers for specific checking requirements + +### 1.3 What You'll Learn + +In this guideline, we will deeply explore: + +1. **Compiler Layer**: How javac invokes and integrates the Checker Framework +2. **Annotation Processor Layer**: How SourceChecker is initialized and executed +3. **AST Processing Layer**: How the Abstract Syntax Tree is parsed and traversed +4. **Type Factory Layer**: How AnnotatedTypeFactory assigns types to code elements +5. **Dataflow Analysis Layer**: How CFAnalysis tracks program state changes +6. **Visitor Pattern Layer**: How BaseTypeVisitor performs type safety checks + +Each layer will be explained in detail regarding its responsibilities, working principles, and how it collaborates with other layers to complete the comprehensive static analysis. + +### 1.4 The Journey Ahead + +We'll trace the execution path from the moment you run: +```bash +javac -processor org.checkerframework.checker.nullness.NullnessChecker NewObject.java +``` + +Through every internal method call, data structure creation, and analysis step, until the checker completes its work and reports any findings. + +Are you ready to dive deep into the internal workings of this powerful tool? Let's begin this technical journey! + + + +## 2. What is Processor + +Before understanding the -processor parameter, let's first explain the main phases of Java compilation. The Checker Framework integrates into this compilation process as an annotation processor, which is why understanding the overall compilation flow is crucial. + +### 2.1 Java Compilation Process Overview + +The following diagram illustrates the complete Java compilation pipeline: + +image-20250710140646269 + +#### Phase 1: Lexical Analysis + +The source code text is broken down into a stream of tokens including keywords, identifiers, operators, and literals. Each character sequence is classified according to the Java language specification. This phase produces a linear sequence of tokens that serves as input for syntax analysis. + +**Input**: Source code text +**Process**: The source code is broken down into tokens +**Output**: Token stream + +```java +// Source code +public class Hello { + public static void main(String[] args) { + System.out.println("Hello"); + } +} + +// After lexical analysis: Token stream +[PUBLIC] [CLASS] [IDENTIFIER:Hello] [LBRACE] +[PUBLIC] [STATIC] [VOID] [IDENTIFIER:main] [LPAREN] ... +``` + +#### Phase 2: Syntax Analysis +The token stream is parsed according to Java's grammar rules to construct an Abstract Syntax Tree (AST). This hierarchical structure represents the syntactic relationships between different code elements. The AST captures the program's structure in a form that subsequent phases can easily analyze and manipulate. + +**Input**: Token stream +**Process**: Constructs a hierarchical syntax structure +**Output**: Abstract Syntax Tree (AST) + +``` +CompilationUnit +└── ClassDeclaration (Hello) + └── MethodDeclaration (main) + └── BlockStatement + └── ExpressionStatement + └── MethodInvocation (System.out.println) +``` + +#### Phase 3: Symbol Resolution + +Classes, methods, variables, and other program entities are identified and added to symbol tables. Type information is initially resolved and cross-references between different program elements are established. This creates the foundation for semantic analysis by organizing all declared symbols. + +**Process**: Classes, methods, and variables are added to the symbol table +**Output**: Initial symbol table + +#### Phase 4: Annotation Processing (What we are looking for!) + +**Annotation processors like the Checker Framework are invoked to analyze the AST and potentially generate additional source files.** The javac compiler calls each processor's `init()` and `process()` methods in sequence. This phase can run multiple rounds if new source files are generated, allowing for iterative analysis and code generation. + +```java +// This is where the Processor participates! +// javac calls the annotation processor +processor.init(processingEnvironment); +processor.process(annotations, roundEnvironment); +``` + +**Key Points:** + +- Multiple rounds possible +- Can generate new source files +- **Checker Framework operates at this annotation processing phase** + +#### Phase 5: Attribute/Type Checking + +Type compatibility is verified throughout the program, ensuring that method calls, assignments, and other operations are semantically valid. Variable scoping rules are enforced and type inference is performed where necessary. The Checker Framework integrates here through TaskListener to perform its specialized type system analysis. + +**Process**: Type checking and variable analysis +- **Checker Framework's entry point**: Listens for ANALYZE events through TaskListener + +**This is where annotation processing and attribute analysis together complete the Checker Framework's analysis** + +#### Phase 6: Flow Analysis + +The compiler analyzes control flow to detect unreachable code, uninitialized variables, and other flow-related issues. This includes checking that final variables are properly initialized and that all code paths return appropriate values. Flow analysis ensures the program's execution semantics are well-defined. + +**Process**: Checks if variables are initialized, code reachability, etc. + +#### Phase 7: Desugaring + +Syntactic sugar constructs like enhanced for-loops, try-with-resources, and lambda expressions are converted into more basic Java constructs. This transformation simplifies the remaining compilation phases by reducing the variety of language constructs. The desugared code maintains the same semantics while using only fundamental language features. + +Converts syntactic sugar into more basic constructs: + +```java +// Syntactic sugar +for (String s : list) { } + +// After desugaring +for (Iterator it = list.iterator(); it.hasNext(); ) { + String s = it.next(); +} +``` + +#### Phase 8: Code Generation + +The final AST is converted into Java bytecode and written to `.class` files. Optimization may be applied during this phase to improve runtime performance. The resulting bytecode can be executed by any Java Virtual Machine. + +**Output**: .class files + +#### Understanding `-processor` + +The -processor parameter tells javac which annotation processors to run during Phase 4 of compilation. When you specify a processor, javac loads the specified class and gives it access to the AST and compilation environment. The Checker Framework uses this mechanism to perform static analysis and report type safety violations without modifying the standard compilation process. + + + +### 2.2 What is Processor in Java? + +Before we dive into how the Checker Framework works, we need to understand what a "Processor" is and why it's crucial for the Checker Framework's operation. + +#### 2.2.1 Basic Concepts of the Processor Interface + +The Processor interface is part of JSR 269 (Pluggable Annotation Processing API), which is Java's core interface that defines the standard protocol for annotation processing. + +```java +public interface Processor { + // Initialization of the processor + void init(ProcessingEnvironment processingEnv); + + // Annotation processing + boolean process(Set annotations, RoundEnvironment roundEnv); + + // Get supported annotation types + Set getSupportedAnnotationTypes(); + + // Get supported source code version + SourceVersion getSupportedSourceVersion(); + + // Get supported options + Set getSupportedOptions(); + + // Get auto-completion suggestions + Iterable getCompletions(...); +} +``` + + + +#### 2.2.2 Core Methods of the Processor Interface + +**A. init() - Initialization** + +```java +void init(ProcessingEnvironment processingEnv) +``` + +- **Purpose**: Connect to the compilation environment and obtain tool services +- **Timing**: The processor is created and initialized, called only once +- **Provided Tools**: Filer (file creation), Messager (message reporting), Elements/Types (class utilities) + +**B. process() - Annotation Processing** + +```java +boolean process(Set annotations, RoundEnvironment roundEnv) +``` + +- **Purpose**: Handle the core logic of **annotation processing** +- **Return Value**: true indicates the annotation has been processed; other processors should not handle it +- **Multiple Calls**: Can be called multiple times (annotation processing is multi-round) + +**C. getSupportedAnnotationTypes() - Declare Supported Annotations** + +```java +Set getSupportedAnnotationTypes() +``` + +- **Return Value**: Set of supported annotation class name collections +- **Special Value**: "*" indicates processing all annotations + +#### 2.2.3 Significance for Checker Framework + +By creating a concrete NullnessChecker instance that inherits from the entire framework hierarchy, the system achieves three critical capabilities that distinguish it from standard Java compilation: + +**A. Standard Interface** + +The Checker Framework implements the Processor interface, becoming a standard Java annotation processor: + +```java +// Complete inheritance chain +javax.annotation.processing.Processor (interface) + ↓ implements +javax.annotation.processing.AbstractProcessor (JDK standard abstract class) + ↓ inherits +org.checkerframework.javacutil.AbstractTypeProcessor (Checker Framework extension) + ↓ inherits +org.checkerframework.framework.source.SourceChecker (base checker) + ↓ inherits +org.checkerframework.common.basetype.BaseTypeChecker (type checker base class) + ↓ inherits +org.checkerframework.checker.initialization.InitializationChecker (initialization checker) + ↓ inherits +org.checkerframework.checker.nullness.NullnessChecker (nullness checker) +``` + +**B. Integration with javac** + +```java +// Achieved through standard -processor options +javac -processor org.checkerframework.checker.nullness.NullnessChecker MyClass.java +``` + +**C. Compile-time Type Checking** + +- **Traditional Approach**: Runtime type error discovery +- **Checker Framework: Compile-time type error discovery** +- **Implementation**: Using Processor API for compilation integration + +#### 2.2.4 Checker Framework's Extension of Processor + +![image-20250711165204213](/Users/zyf/Library/Application Support/typora-user-images/image-20250711165204213.png) + + + +#### 2.2.5 Key Design Decisions + +##### **A. Why Use the Processor Interface?** + +1. **Standardization**: Follows Java official standards +2. **Tool Support**: All tools supporting JSR 269 can be used +3. **Ecosystem**: Compatible with other annotation processors + +##### **B. AbstractTypeProcessor's Innovation** + +**Core Problem:** +The standard `process()` method is called during the **annotation processing phase** when type information is incomplete, making it impossible to perform in-depth type checking. + +**Root Causes of Incomplete Type Information:** + +1. **Compilation Design**: javac is segmented, with annotation processing before type analysis +2. **Dependency Analysis**: Must first analyze all dependencies to complete type information +3. **Circular Dependencies**: May exist dependencies between classes, requiring multi-round analysis +4. **Annotation Semantics**: Some annotation semantics can only be understood with complete type context + +Therefore, the Checker Framework must wait until the **ANALYZE** phase (when type information is complete) to perform accurate type checking. This is why it designed the **TaskListener** mechanism as the real timing entry point. + +##### **Solution (Important):** + +**AbstractTypeProcessor** adopts a two-phase approach: + +**Phase 1: Registration (process method)** + +```java +@Override +public final boolean process(...) { + // Collect the names of classes that need to be checked + for (TypeElement elem : roundEnv.getRootElements()) { + elements.add(elem.getQualifiedName()); // e.g.: "NewObject" + } + return false; // Don't claim any annotations +} +``` + +- **Timing**: Annotation processing phase (early) +- **Purpose**: Record which classes need to be checked +- **Limitation**: Type information incomplete + +**Phase 2: Type Checking (TaskListener) - Will be talked about later** + +```java +private class AttributionTaskListener implements TaskListener { + public void finished(TaskEvent e) { + if (e.getKind() == TaskEvent.Kind.ANALYZE) { + // Now type information is complete! + typeProcess(elem, path); // Perform real type checking + } + } +} +``` + +- **Timing**: ANALYZE phase (late) +- **Purpose**: Perform real type checking +- **Advantage**: Type information complete, enabling in-depth analysis + +**Why This Design:** + +1. **Follow Standards**: Still implements the Processor interface +2. **Real-time Checking**: Can check types in real-time during compilation +3. **Maintain Compatibility**: Can coexist with other annotation processors + +**This design allows the Checker Framework to utilize the compilation-time mechanism to access the complete type checking system.** + +#### 2.2.6 Practical Application Examples + +**A. Traditional Annotation Processor** + +```java +@SupportedAnnotationTypes("com.example.MyAnnotation") +public class MyProcessor extends AbstractProcessor { + @Override + public boolean process(...) { + // Handle @MyAnnotation + return true; + } +} +``` + +**B. Checker Framework Type Checker** + +```java +// Simplified version +public class NullnessChecker extends InitializationChecker { + // Doesn't override process() here! + // Type checking implemented in visitor +} +``` + +#### 2.2.8 Limitations of the Processor Interface and Checker Framework's Solutions + +| Limitation | Checker Framework's Solution | +| ------------------------------------------------- | ---------------------------------------------------- | +| process() called when type information incomplete | Use TaskListener to handle at ANALYZE phase | +| Can only handle annotations, not types | Extend to complete type system checking | +| Lacks AST traversal framework | Provide BaseTypeVisitor and complete visitor pattern | +| Limited error reporting functionality | Enhanced error reporting and diagnostic system | + +The Checker Framework skillfully utilizes the infrastructure provided by the Processor interface, while using mechanisms like TaskListener to overcome its limitations, creating a powerful compile-time type checking framework. This design maintains standard compatibility while achieving innovative functional extensions. + +Now that we understand what a Processor is and its role, let's continue with the detailed explanation of what happens when you run the javac command with the Checker Framework. + + + +## 3. How Processor Works + +After introducing what a Java processor is in the previous chapter, this chapter explains how a processor works. Using the command `javac -processor org.checkerframework.checker.nullness.NullnessChecker NewObject.java` as our reference, we will break down step by step what the Java compiler does after this line is executed for our simple code example: + +```java +public class NewObject { + void test() { + Object nn = new Object(); + } +} +``` + +### 3.1 Step 1: javac Command Startup + +```bash +javac -processor org.checkerframework.checker.nullness.NullnessChecker \ + -cp checker-framework.jar \ + NewObject.java +``` + +When javac starts, it analyzes the command line parameters. The `-processor org.checkerframework.checker.nullness.NullnessChecker` parameter tells javac to run this specific annotation processor during compilation. javac will instantiate this class through **reflection** using the **Java Standard Annotation Processing Mechanism**. + +#### 3.1.1 What is reflection + +**Reflection** is a fundamental Java feature that allows a program to examine and modify its own structure during **runtime**, essentially enabling code to "look at itself" and make decisions based on what it finds. Think of reflection as giving a program the ability to hold up a mirror to its own code and understand what it's made of. + +In simple terms, reflection works like this: instead of directly calling a method when you know its name at compile time, reflection allows you to discover and call methods whose names you only learn at runtime. For example, rather than writing `obj.getName()` directly, you could use reflection to find a method called "getName" and then invoke it dynamically. + +```java +// Normal method call (compile-time) +String name = person.getName(); + +// Reflection-based call (runtime) +Method method = person.getClass().getMethod("getName"); +String name = (String) method.invoke(person); +``` + +The Checker Framework relies heavily on reflection throughout its architecture. When you execute `javac -processor org.checkerframework.checker.nullness.NullnessChecker NewObject.java`, the Java compiler uses reflection to dynamically load the `NullnessChecker` class by its string name, instantiate it, and then invoke its processing methods. The framework couldn't work without reflection because it needs to examine the annotations, class structures, and method signatures of your source code to perform type checking. + +During the analysis of our `NewObject.java` example, the NullnessChecker uses reflection to inspect the class structure, discover that there's a `test()` method, examine the `new Object()` expression, and determine the appropriate nullness annotations to apply. This introspective capability is what makes the Checker Framework so flexible and powerful—it can analyze any Java code and adapt to different type systems without being hardcoded to specific program structures. + +image-20250711222214255 + +The diagram above illustrates how reflection powers the Checker Framework's operation. Each step involving reflection demonstrates how the framework dynamically discovers and interacts with code elements that weren't known at compile time, enabling the flexible and extensible type checking that makes the Checker Framework so powerful for ensuring code safety. + +#### 3.1.2 Java Standard Annotation Processing Mechanism + +Java's annotation processing mechanism, introduced in Java 6, allows code analysis and generation during compilation. This serves as the foundation for the Checker Framework's type-checking capabilities. + +**Understanding Annotation Processing** + +Annotation processing performs **compile-time introspection** - examining source code during compilation rather than at runtime. The Java compiler (`javac`) provides an API that lets annotation processors access and analyze the Abstract Syntax Tree (AST) before bytecode generation. + +```java +public class NewObject { + void test() { + Object nn = new Object(); // <- Processor analyzes this expression + } +} +``` + +**The Processing Lifecycle** + +Annotation processing occurs in multiple rounds during compilation: + +image-20250711231434036 + +Processing continues in rounds until no new files are generated, then standard compilation proceeds. + +**Processing NewObject.java** + +When processing our example, the NullnessChecker: + +1. **Discovers** all classes and methods to analyze +2. **Traverses** the AST of each method +3. **Analyzes** expressions like `new Object()` to determine nullness types +4. **Verifies** type consistency with nullness rules +5. **Reports** any violations through the compiler's diagnostic system + +This mechanism enables the Checker Framework to perform sophisticated compile-time verification without modifying the Java language or compiler, providing powerful static analysis for preventing null pointer exceptions and other type-related errors. + +### 3.2 Step 2: Annotation Processor Discovery and Instantiation + +#### 3.2.1 javac Instantiates NullnessChecker + +Javac's Internal Code for Instantiation: +```java +// javac's internal code for instantiation: +NullnessChecker checker = new NullnessChecker(); // Creates NullnessChecker instance +``` + +#### 3.2.2 Inheritance Chain Construction Process + +When javac calls `new NullnessChecker()`, the actual construction process involves: + +```java +public class NullnessChecker extends InitializationChecker { + public NullnessChecker() {} // Step 1: First call this constructor +} + +public abstract class InitializationChecker extends BaseTypeChecker { + public InitializationChecker() {} // Step 2: Then call parent constructor +} + +public abstract class BaseTypeChecker extends SourceChecker { + // No explicit constructor, use default constructor // Step 3: Continue inheritance chain +} + +public abstract class SourceChecker extends AbstractTypeProcessor { + protected SourceChecker() {} // Step 4: Call parent constructor again +} + +public abstract class AbstractTypeProcessor extends AbstractProcessor { + protected AbstractTypeProcessor() {} // Step 5: Finally call top-level parent + // Key creation during this process: listener = new AttributionTaskListener() +} +``` + +**Key Points from the Construction Process:** + +1. Assigns internal storage space to the entire NullnessChecker object +2. Must initialize all parent class members and code +3. Then initializes subclass-specific components + +#### 3.2.3 Critical Component: AttributionTaskListener + +During construction (line 77 in AbstractTypeProcessor): +```java +private final AttributionTaskListener listener = new AttributionTaskListener(); +``` + +This `listener` is defined in AbstractTypeProcessor but its external reference points to the concrete NullnessChecker instance. + +The construction process establishes the complete inheritance hierarchy: **NullnessChecker → InitializationChecker → BaseTypeChecker → SourceChecker → AbstractTypeProcessor**, with the actual instantiated object being a NullnessChecker instance that inherits all capabilities from its parent classes. + + + +### 3.3 Step 3: javac Invoke Processor.init() + +The `init()` method is a fundamental part of the Java annotation processing lifecycle, defined in the `javax.annotation.processing.Processor` interface. This method serves as the initialization hook for annotation processors. + +#### 3.3.1 When is init() Called? + +The `init()` method is called **immediately after** the annotation processor is **instantiated** by `javac`, but **before** any processing rounds begin. According to the Java annotation processing specification, this occurs: + +1. **Once per processor instance** - never called multiple times on the same processor object +2. **Before getSupportedAnnotationTypes(), getSupportedOptions(), and getSupportedSourceVersion()** are called +3. **Before any process() or typeProcess() methods** are invoked + +#### 3.3.2 What is the Purpose of init()? + +The `init()` method provides the processor with its `ProcessingEnvironment`, which contains all the essential tools and utilities needed for code analysis: + +```java +public interface Processor { + void init(ProcessingEnvironment processingEnv); + // ... other methods +} +``` + +The `ProcessingEnvironment` provides access to: +- **Messager**: For reporting errors, warnings, and notes +- **Filer**: For creating new files during processing +- **Elements**: Utilities for working with program elements +- **Types**: Utilities for working with type information +- **Options**: Access to processor options passed via command line + +#### 3.3.3 The Complete init() Call Chain in the Checker Framework + +Let's trace the complete initialization sequence when `javac` processes our command: +`javac -processor org.checkerframework.checker.nullness.NullnessChecker docs/examples/NewObject.java` + +##### Step 1: Processor Instantiation +```java +// javac internally creates the processor instance +NullnessChecker checker = new NullnessChecker(); +``` + +##### Step 2: AbstractTypeProcessor.init() Call +file location: javacutil/src/main/java/org/checkerframework/javacutil/AbstractTypeProcessor.java + +```java +@Override +public synchronized void init(ProcessingEnvironment env) { + super.init(env); // Call AbstractProcessor.init() + JavacTask.instance(env).addTaskListener(listener); // Register TaskListener + Context ctx = ((JavacProcessingEnvironment) processingEnv).getContext(); + JavaCompiler compiler = JavaCompiler.instance(ctx); + compiler.shouldStopPolicyIfNoError = CompileState.max(compiler.shouldStopPolicyIfNoError, CompileState.FLOW); + compiler.shouldStopPolicyIfError = CompileState.max(compiler.shouldStopPolicyIfError, CompileState.FLOW); +} +``` + +**This critical init() method accomplishes:** + +1. **Stores the ProcessingEnvironment**: Calls `super.init(env)` to store the processing environment +2. **Registers the AttributionTaskListener**: This is the **key innovation** - it tells javac to notify the processor when type analysis is complete +3. **Configures Compiler Behavior**: Ensures javac will complete the FLOW phase (type analysis) before stopping + + + +### 3.4 Step 4: Waiting for Type Analysis Completion (Core) + +#### 3.4.1 Current State Summary + +At this point, the system has completed the basic initialization: + +```java +// Current state after init() completion: +NullnessChecker instance = /* Created and initialized */; +TaskListener listener = /* Registered with javac */; +ProcessingEnvironment env = /* Set up and ready */; +``` + +The NullnessChecker is **ready but waiting** - it has established communication channels with javac but has not yet begun any actual type checking work. + + + +#### 3.4.2 Review of Javac Compilation Phase Order + +**The javac compiler executes phases in the following order:** + +1. **PARSE** - Lexical analysis and syntax analysis, generating AST +2. **ENTER** - Symbol table construction, type declarations enter symbol table +3. **ANALYZE - Type checking, semantic analysis, data flow analysis** +4. **GENERATE** - Bytecode generation + +##### 3.4.2.1 ANALYZE Phase Details + +The standard compilation flow of the Java compiler (`com.sun.tools.javac.main.JavaCompiler`): + +```text +SOURCE FILE (.java) + ↓ +PARSE → Syntax analysis, generate AST + ↓ +ENTER → Symbol registration (class names, method names into symbol table) + ↓ +ANALYZE → Type checking, semantic analysis, annotation inference (Checker Framework core work phase) + ↓ +LOWER/TRANSTYPES → AST transformations + ↓ +GENERATE → Output .class bytecode +``` + +##### 3.4.2.2 TaskEvent Notification System + +Upon completion of each compilation phase, the compiler emits corresponding `TaskEvent` notifications to registered listeners: + +```java +TaskEvent.Kind.PARSE // Parsing completion +TaskEvent.Kind.ENTER // Symbol table construction completion +TaskEvent.Kind.ANALYZE // Type checking and semantic analysis completion +TaskEvent.Kind.GENERATE // Bytecode generation completion +``` + +##### 3.4.2.3 ANALYZE Phase Specification + +The ANALYZE phase represents the most semantically intensive stage of the compilation process, encompassing several critical operations: + +| **Operation** | **Description** | +| ------------------------- | ------------------------------------------------------------ | +| **Type Attribution** | Assignment of type information to all expressions, variables, and method invocations | +| **Semantic Validation** | Verification of language semantics compliance and type safety constraints | +| **Symbol Resolution** | Resolution of all identifiers to their corresponding symbol table entries | +| **Data Flow Analysis** | Detection of uninitialized variables, unreachable code, and control flow anomalies | +| **Annotation Processing** | Execution of pluggable annotation processors and type system extensions | + +##### 3.4.2.4 Implementation Architecture + +The ANALYZE phase is implemented through several key compiler components: + +```java +com.sun.tools.javac.comp.Attr // Type attribution engine +com.sun.tools.javac.comp.Flow // Data flow and definite assignment analyzer +com.sun.tools.javac.comp.Check // Semantic constraint validator +``` + +##### 3.4.2.5 Processing Sequence + +During the ANALYZE phase, the compiler performs the following sequence of operations: + +1. **Type Attribution**: The `Attr` component traverses the AST and assigns type information to each node +2. **Flow Analysis**: The `Flow` component performs definite assignment analysis and checks for: + - Uninitialized local variables + - Unreachable statements + - Missing return statements + - Exception handling completeness +3. **Semantic Validation**: The `Check` component validates: + - Method override correctness + - Access modifier compliance + - Type conversion legality + - Generic type parameter constraints +4. **Extension Processing**: Registered annotation processors and type system extensions are invoked + +This phase is fundamental to the Checker Framework's operation, as it provides the complete type-attributed AST necessary for advanced static analysis and type checking operations. + + + +#### 3.4.3 Review of The NullnessChecker's Two-Phase Strategy + +Building on the javac compilation phases, the Nullness Checker introduces a specialized two-phase strategy to perform precise nullness analysis using fully resolved type information. + +**Traditional annotation processors** work during annotation processing rounds, but they lack access to complete type information. The Checker Framework needs **fully resolved type information** to perform accurate nullness checking. + +The Checker Framework employs a sophisticated **two-phase approach**: + + + +#### 3.4.4 Phase 1: Registration (process method) + +```java +@Override +public final boolean process(Set annotations, RoundEnvironment roundEnv) { + // PURPOSE: Just register elements - do NO real work + for (TypeElement elem : ElementFilter.typesIn(roundEnv.getRootElements())) { + elements.add(elem.getQualifiedName()); // Add "NewObject" to pending list + } + + return false; +} +``` + +Collect class names that need type checking for later processing during ANALYZE phase, acting as a deferred registration mechanism. + +**Elements:** `TypeElement` objects representing classes/interfaces from the source code obtained via `ElementFilter.typesIn(roundEnv.getRootElements())`. For example, if compiling `NewObject.java`, this would include the `TypeElement` for the `NewObject` class. + +**What is Added:** The qualified names (strings like `"com.example.NewObject"`, `"java.lang.String"`) of these type elements are added to an internal `elements` HashSet collection via `elements.add(elem.getQualifiedName())`. + +**What is Registered:** A pending work list of class names that will be processed when javac completes type analysis. For instance, when processing: +```java +// Input: NewObject.java +public class NewObject { + Object obj = new Object(); // This will need nullness checking later +} +``` + +The system registers `"NewObject"` in the elements set. Later, when javac triggers `TaskEvent.Kind.ANALYZE` for this class, the AbstractTypeProcessor will: +1. **Remove `"NewObject"` from the pending elements set** +2. **Call `typeProcess(NewObject, treePath)` to perform actual nullness checking** + +This design exists because **type information isn't fully available during PROCESS phase** - method signatures, generics, and type relationships are incomplete until ANALYZE phase. + +##### Who Calls It and When? + +**Caller**: `javac`'s annotation processing infrastructure +**When**: During the **PROCESS** phase of compilation, immediately after **ENTER** phase completes +**Context**: This happens for **every** annotation processor registered via `-processor` flag + +```java +// javac internal sequence for our command: +// javac -processor org.checkerframework.checker.nullness.NullnessChecker docs/examples/NewObject.java + +1. javac creates NullnessChecker instance +2. javac calls init(ProcessingEnvironment) +3. javac calls getSupportedAnnotationTypes() → returns ["*"] +4. javac calls process(annotations, roundEnv) ← THIS IS WHERE WE ARE +``` + +##### Why This Design? The Fundamental Problem + +Traditional annotation processors face a **critical limitation**: + +```java +// Traditional annotation processor problem: +@Override +public boolean process(Set annotations, RoundEnvironment roundEnv) { + // Problem: Type information is NOT fully available here! + // - No resolved method signatures + // - No inheritance relationships + // - No generic type parameters resolved + // - Cannot determine if "new Object()" is @NonNull or @Nullable +} +``` + + + +#### 3.4.5 Phase 2: Type Checking (TaskListener) + +**Purpose**: Perform actual nullness analysis +**Timing**: After javac completes type analysis +**Method**: `AttributionTaskListener.finished()` → `typeProcess()` + +```java +private final class AttributionTaskListener implements TaskListener { + @Override + public void finished(TaskEvent e) { + // WHO CALLS: javac's internal TaskListener notification system + // WHEN: After each compilation phase completes for each file + // WHY: javac fires events to notify interested parties about compilation progress + + if (e.getKind() != TaskEvent.Kind.ANALYZE) { + return; // We only care about type analysis completion, not PARSE, ENTER, etc. + } + + // FIRST-TIME SETUP: Initialize the checker infrastructure + if (!hasInvokedTypeProcessingStart) { + typeProcessingStart(); // → initChecker() → create type factory, visitor, etc. + hasInvokedTypeProcessingStart = true; + } + + // CHECK: Was this element registered during Phase 1? + if (!elements.remove(e.getTypeElement().getQualifiedName())) { + return; // Skip elements we didn't register (shouldn't happen for our case) + } + + // EXTRACT: Get the fully-analyzed element and tree path + TypeElement elem = e.getTypeElement(); // NewObject class with complete type info + TreePath p = Trees.instance(processingEnv).getPath(elem); // AST path to NewObject + + // BEGIN ACTUAL TYPE CHECKING: + typeProcess(elem, p); // This is where nullness analysis finally happens! + } +} +``` + +Execute actual nullness analysis and type checking on fully-analyzed classes with complete type information available. + +##### Who Calls It and When? + +javac's internal task management system creates and fires `TaskEvent.Kind.ANALYZE` events through its `notifyAnalyzeDone()` method when completing analysis of each compilation unit. + +**Complete Compilation Pipeline for NewObject.java:** + +```java +// javac internal compilation pipeline for NewObject.java: +PARSE: "public class NewObject { void test() { Object nn = new Object(); } }" + ↓ (creates AST) +ENTER: Create symbol table entries for NewObject, test(), nn variable + ↓ (symbols created but not resolved) +PROCESS: Call all annotation processors' process() methods (including ours) + ↓ (we register "NewObject" in elements set) +ANALYZE: ← TYPE RESOLUTION HAPPENS HERE ← + - Resolve "Object" type reference + - Determine "new Object()" return type + - Complete symbol table with full type information + ↓ (TaskEvent.Kind.ANALYZE fired) +FINISHED: Our AttributionTaskListener.finished() gets called ← PHASE 2 BEGINS +``` + +![image-20250712233933724](assets/image-20250712233933724.png) + +**TaskEvent Creation Process:** +When javac completes analyzing the NewObject class, it internally creates a TaskEvent containing: + +- `Kind`: `TaskEvent.Kind.ANALYZE` +- `compilationUnit`: CompilationUnitTree for "NewObject.java" +- `typeElement`: TypeElement with qualified name "NewObject" +- Full AST with complete type attribution + +**Connection to Phase 1:** Phase 2 consumes the work list created in Phase 1. When javac completes analysis of a class (e.g., `"NewObject"`), the system: +1. Receives TaskEvent through `AttributionTaskListener.finished(event)` +2. Checks if this class name exists in the `elements` set registered during Phase 1 +3. If found, removes it from the pending list (`elements.remove(event.getTypeElement().getQualifiedName())`) +4. Extracts the fully-analyzed `TypeElement` and constructs `TreePath` using `Trees.instance(processingEnv).getPath(elem)` +5. Calls `typeProcess(elem, treePath)` to perform actual nullness checking on the complete, type-attributed AST + +**Example Flow:** +- Phase 1: Register `"NewObject"` → pending list +- Phase 2: javac completes analyzing `NewObject` → creates TaskEvent(Kind.ANALYZE, NewObject) → removes from pending list → performs nullness analysis on `new Object()` expressions with full type information + +This two-phase design ensures type checking happens at the optimal time when complete type information is available, solving the fundamental limitation that type details are incomplete during the initial PROCESS phase. + +##### Why This Design? The Innovation + +Critical Information Available Now (vs. Phase 1) + +**Phase 1 (process method)** - Limited Information: + +```java +// What we CAN'T know during process(): +TypeElement newObject = ...; // Incomplete type information +// - Don't know if Object() constructor exists +// - Don't know return type of new Object() +// - Don't know inheritance hierarchy +// - Don't know if any annotations are inherited +``` + +**Phase 2 (finished method)** - Complete Information: + +```java +// What we CAN know during finished(): +TypeElement newObject = e.getTypeElement(); // FULLY RESOLVED +// - Object() constructor is resolved and valid +// - new Object() returns exactly java.lang.Object type +// - Complete inheritance: Object extends nothing, implements nothing +// - All type annotations are computed and available +// - Symbol table is complete with method signatures, field types, etc. +``` + +**Why Choose the ANALYZE Phase** + +The ANALYZE phase is the most suitable timing for type checking because: + +- **Complete type information**: All type declarations have entered the symbol table +- **Complete AST**: The syntax tree has been fully constructed +- **Symbol resolution complete**: All identifiers have been resolved to corresponding symbols +- **Type inference complete**: Generic type parameters have been inferred +- **Data flow analysis available**: More complex analysis can be performed + +**Element Collection Management** + +`AbstractTypeProcessor` maintains an `elements` collection to track classes that need processing: + +- **process() phase**: Add all root elements to the `elements` collection +- **After ANALYZE completion**: Remove processed elements from the `elements` collection +- **Processing completion determination**: Call `typeProcessingOver()` when `elements` is empty + +**Key Timing** + +1. **Compilation starts** → `process()` is called → Collect class names to process +2. **ANALYZE phase** → Each class completes analysis → Triggers `TaskEvent.Kind.ANALYZE` +3. **Event handling** → `AttributionTaskListener.finished()` → Calls `typeProcess()` +4. **Type checking** → `SourceChecker.typeProcess()` → Traverses AST for checking +5. **Processing complete** → All classes processed → Calls `typeProcessingOver()` + + + +#### 3.4.6 NewObject.java Example Walkthrough + +**Phase 1 Registration for NewObject.java** + +```java +// When javac processes: docs/examples/NewObject.java +// During PROCESS phase: + +roundEnv.getRootElements() contains: +├── PackageElement: "" (default package) +└── TypeElement: "NewObject" (but type info incomplete) + +// Our process() method executes: +for (TypeElement elem : ElementFilter.typesIn(roundEnv.getRootElements())) { + // elem = TypeElement for "NewObject" class (but incomplete) + elements.add(elem.getQualifiedName()); // elements.add("NewObject") +} + +// Result: elements = {"NewObject"} - ready for later processing +``` + +**Phase 2 Type Checking for NewObject.java** + +```java +// After ANALYZE phase completes for NewObject.java: +// TaskEvent.Kind.ANALYZE fired with complete type information + +TaskEvent e contains: +├── e.getTypeElement() = fully resolved NewObject TypeElement +│ ├── Methods: [test()] +│ │ └── test() has complete signature: void test() +│ └── Fields: [] (none) +└── e.getCompilationUnit() = complete AST with resolved types + └── ClassTree for NewObject + └── MethodTree for test() + └── VariableTree for "nn" with type java.lang.Object + └── NewClassTree for "new Object()" with resolved constructor + +// Our finished() method executes: +elements.remove("NewObject") = true // Successfully found and removed +typeProcess(NewObjectElement, TreePath) // Begin nullness checking! +``` + + + +#### 3.4.7 What's Next? Type Checking Phase Preview + +After `typeProcess(NewObject, treePath)` is called, the **actual nullness analysis** begins: + +#### Next Steps (Brief Overview) + +1. **AST Traversal**: `NullnessNoInitVisitor` walks through the NewObject AST +2. **Expression Analysis**: Encounters `new Object()` expression +3. **Type Factory Consultation**: `NullnessNoInitAnnotatedTypeFactory` determines this should be `@NonNull` +4. **Assignment Checking**: Verifies `Object nn = new Object()` is type-safe +5. **Result**: No nullness violations found (new Object() is always @NonNull) + +#### The Transition + +```java +// We're transitioning from: +Phase 1 & 2: "Getting ready to check types" + ↓ +Phase 3: "Actually checking NewObject.java for nullness violations" + +// Next major method calls: +typeProcess(NewObject, treePath) + → visitor.visit(ClassTree) + → visitor.visitMethod(test()) + → visitor.visitVariable(nn declaration) + → visitor.visitNewClass(new Object()) + → Check: Is new Object() @NonNull? YES! + → Check: Can @NonNull be assigned to nn? YES! + → Result: No errors +``` + +The two-phase design ensures that when we finally analyze `new Object()` in our NewObject.java, we have complete certainty about its type properties, inheritance relationships, and nullness characteristics - enabling accurate and reliable null pointer exception prevention. + + + +### 3.5 Step 5: Start Checking! - typeProcess + +#### 3.5.1 Event Trigger Layer - AttributionTaskListener.finished() +##### Detailed Analysis of finished() Method: + +As mentioned in section 3.4, based on the actual implementation in `AbstractTypeProcessor.java`, the `AttributionTaskListener.finished()` method performs a comprehensive multi-stage validation and processing sequence: + +```java +@Override +public void finished(TaskEvent e) { + // Step 1: Event Type Filtering - Only handle ANALYZE events + if (e.getKind() != TaskEvent.Kind.ANALYZE) { + return; // Ignore other events (PARSE, ENTER, GENERATE, etc.) + } + + // Step 2: First-time Initialization - Called exactly once + if (!hasInvokedTypeProcessingStart) { + typeProcessingStart(); // → initChecker() → create type factory, visitor, etc. + hasInvokedTypeProcessingStart = true; + } + + // Step 3: Early Completion Check - Handle empty processing queue + if (!hasInvokedTypeProcessingOver && elements.isEmpty()) { + typeProcessingOver(); // All processing completed cleanup + hasInvokedTypeProcessingOver = true; + } + + // Step 4: Data Integrity Validation - Ensure complete event data + if (e.getTypeElement() == null) { + throw new BugInCF("event task without a type element"); + } + if (e.getCompilationUnit() == null) { + throw new BugInCF("event task without compilation unit"); + } + + // Step 5: Registration Verification - Check if this class was registered in Phase 1 + // elements collection populated in process() during PROCESS phase, contains qualified names + if (!elements.remove(e.getTypeElement().getQualifiedName())) { + return; // Skip elements we didn't register (shouldn't happen for our case) + } + + // Step 6: Extract Fully-analyzed Element and Tree Path + TypeElement elem = e.getTypeElement(); // Complete type element + TreePath p = Trees.instance(processingEnv).getPath(elem); // AST path + + // Step 7: Begin Actual Type Checking - Call into concrete implementation + typeProcess(elem, p); // This is where nullness analysis finally happens! + + // Step 8: Final Completion Check - Check if all processing completed + if (!hasInvokedTypeProcessingOver && elements.isEmpty()) { + typeProcessingOver(); + hasInvokedTypeProcessingOver = true; + } +} +``` + +Within function `finished` there are two important code lines: + +```java +TypeElement elem = e.getTypeElement(); +TreePath p = Trees.instance(processingEnv).getPath(elem); +``` + +Represents two parameters are passed into function `typeProcess()`. We will dive into these two parameters and figure out why they are crucial about typeProcess process. + + + +##### 1. First Line: Obtaining TypeElement + +```java +TypeElement elem = e.getTypeElement(); +``` + +**TaskEvent.getTypeElement() Functionality** + +- `e` is a TaskEvent object representing an event in javac's compilation process +- `getTypeElement()` returns the type element associated with this event +- TypeElement is a Java Language Model API representing a class or interface element + +**TypeElement vs Tree Distinction** + +The compilation process creates two parallel representations: + +**Source Code** → **Syntax Analysis** → **Two Outputs:** + +1. **AST Tree (ClassTree)** - **Syntax Structure** + - Tree nodes + - Position information + - Source code representation + +2. **Symbol Table (TypeElement)** - **Semantic Information** + - Type element + - Type information + - Member information + - Resolution information + +**TypeElement Information Content with NewObject Example** + +Consider our example NewObject.java: +```java +public class NewObject { + void test() { + Object nn = new Object(); + } +} +``` + +When javac completes ANALYZE phase for this class, the TaskEvent contains a TypeElement with complete semantic information: + +```java +// TypeElement provides key semantic information for NewObject class +TypeElement elem = e.getTypeElement(); + +elem.getQualifiedName() // Returns: "NewObject" (or "com.example.NewObject" if in package) +elem.getSimpleName() // Returns: "NewObject" +elem.getKind() // Returns: ElementKind.CLASS +elem.getModifiers() // Returns: [PUBLIC] +elem.getEnclosedElements() // Returns: [test() method, implicit constructor] +elem.asType() // Returns: TypeMirror representing NewObject type +``` + + + +##### 2. Second Line: Obtaining TreePath + +```java +TreePath p = Trees.instance(processingEnv).getPath(elem); +``` + +**Trees.instance()** + +- `Trees` is a utility class provided by javac +- `instance(processingEnv)` obtains the Trees instance associated with the current processing environment +- Provides bridging functionality between AST and language model + +**getPath(elem) Working Principle** + +Converts TypeElement (semantic information) to TreePath (syntax structure information) + +Working Steps: + +1. **Obtain Current Compilation Unit**: Retrieve CompilationUnitTree from processing environment +2. **AST Search**: Traverse syntax tree to find Tree node corresponding to TypeElement +3. **Element Matching**: Compare each Tree node's Element with target TypeElement +4. **Path Construction**: After finding matching node, construct complete TreePath from root to target + +**Trees.getPath(TypeElement) Implementation Mechanism** + +The core functionality of `Trees.getPath(elem)` method is: + +- **Find Corresponding Tree from TypeElement**: Search for AST node corresponding to TypeElement in current compilation unit +- **Construct TreePath**: Create complete path from root node to target node +- **Caching Optimization**: Avoid redundant searches, improve performance + +##### 3. Detailed Implementation Flow + +![image-20250714123825161](assets/image-20250714123825161.png) + + + +##### 4. TreePath Structure and Functionality + +4.1 TreePath Composition + +```java +// TreePath represents hierarchical path from root to target +// Example: for foo() method in MyClass +TreePath path = [ + CompilationUnitTree, // Root node: entire .java file + ClassTree, // Class declaration node + MethodTree // Method declaration node (leaf) +] +``` + +4.2 TreePath Provided Functionality + +**TreePath p = Trees.instance(processingEnv).getPath(elem);** + +```java +// Obtain leaf node (target node) - in tree construction, leaf node is target node +Tree leaf = p.getLeaf(); // ClassTree + +// Obtain parent path +TreePath parent = p.getParentPath(); // Path to CompilationUnitTree + +// Obtain compilation unit +CompilationUnitTree unit = p.getCompilationUnit(); + +// Traverse entire path +for (Tree tree : p) { + System.out.println(tree.getKind()); +} +``` + +##### 5. Why These Two Operations Are Required + +5.1 TypeElement Limitations + +- **Only semantic information**: types, members, annotations, etc. +- **Lacks positional information**: Cannot know location in source code +- **Cannot access syntax structure**: Cannot obtain detailed syntax information + +5.2 TreePath Advantages + +- **Complete context**: Contains complete path from root to target +- **Positional information**: Can obtain source positions, line numbers, etc. +- **Syntax structure**: Can access detailed AST syntax information +- **Traversal capability**: Supports AST traversal and analysis + +##### 6. Practical Application Scenario + +```java +// In AttributionTaskListener.finished() +TypeElement elem = e.getTypeElement(); // Obtain type element +TreePath p = Trees.instance(processingEnv).getPath(elem); // Obtain AST path + +// Now can perform type checking +typeProcess(elem, p); // Pass to type processor + +// Usage in typeProcess() +public void typeProcess(TypeElement element, TreePath path) { + // element provides semantic information + String className = element.getQualifiedName().toString(); + + // path provides syntax information + ClassTree classTree = (ClassTree) path.getLeaf(); + CompilationUnitTree unit = path.getCompilationUnit(); + + // Begin AST traversal + visitor.visit(path); +} +``` + +##### 7. Summary + +The core functionality of these two lines of code is **establishing the bridge between semantic information and syntax information**: + +1. **TypeElement**: Provides semantic information for classes (names, members, annotations, etc.) +2. **TreePath**: Provides corresponding syntax structure information (AST nodes, positions, context) + +Through this approach, Checker Framework can simultaneously access: +- **Compiled type system** (via TypeElement) +- **Source syntax structure** (via TreePath) + +This provides a comprehensive information foundation for subsequent type checking, enabling accurate analysis in the correct context. + +**Enhanced Context**: The `finished()` method serves as a critical **synchronization point** where javac's internal compilation state transitions from "analyzing" to "analyzed." At this precise moment, both the semantic model (TypeElement with complete type information) and syntax model (TreePath with complete AST) are guaranteed to be consistent and complete. This timing is essential because: + +- **Before ANALYZE phase**: Type information is incomplete, generics unresolved, symbols unlinked +- **During ANALYZE phase**: Type attribution is in progress, state may be inconsistent +- **After ANALYZE phase**: Perfect moment - all information complete and consistent +- **After GENERATE phase**: Too late - AST may be optimized/transformed for bytecode generation + +The dual-representation approach (TypeElement + TreePath) enables Checker Framework to perform **semantic-syntax correlation analysis**, where type checking rules can correlate semantic constraints with specific syntax constructs in the source code. + + + +#### 3.5.2 Abstract Layer - AbstractTypeProcessor.typeProcess() + +```java +public abstract void typeProcess(TypeElement element, TreePath tree); +``` + +**This abstract method is implemented by `SourceChecker` class in the Checker Framework.** + +**Implementation Hierarchy** + +``` +AbstractTypeProcessor (javacutil) → abstract typeProcess() + ↓ +SourceChecker (framework) → concrete implementation + ↓ +BaseTypeChecker (common/basetype) → createSourceVisitor() + ↓ +NullnessChecker (checker/nullness) → domain-specific logic +``` + +**SourceChecker.typeProcess() Key Operations** + +```java +@Override +public void typeProcess(TypeElement e, TreePath p) { + // 1. Process subcheckers (for composite type systems) + for (SourceChecker subchecker : getSubcheckers()) { + subchecker.typeProcess(e, p); + } + + // 2. Validate inputs and compilation state + if (javacErrored || e == null || p == null) return; + + // 3. Core delegation to visitor for AST traversal + setRoot(p.getCompilationUnit()); + visitor.visit(p); // ← START TYPE CHECKING + warnUnneededSuppressions(); +} +``` + +**NewObject Example Call Chain** + +```java +// TypeElement: NewObject class metadata +// TreePath: [CompilationUnitTree → ClassTree for NewObject] + +SourceChecker.typeProcess(NewObjectElement, TreePath) + → setRoot(NewObject.java CompilationUnitTree) + → visitor.visit(TreePath) // Start AST traversal + → scan(ClassTree for NewObject) + → visitMethod(test() method) + → visitVariable(Object nn = ...) + → visitNewClass(new Object()) // Apply @NonNull annotation +``` + +**Framework vs. Checker Responsibilities** + +**SourceChecker (Framework):** +- Javac integration, subchecker coordination, error handling +- AST root management, visitor orchestration + +**Individual Checkers (NullnessChecker):** +- Type system rules, AST node processing (visitNewClass, etc.) +- Domain-specific error messages and suppression logic + +The abstract design enables robust framework infrastructure while supporting modular, extensible type system implementations. + + + +#### 3.5.3 Implementation Layer - SourceChecker.typeProcess() + +##### 1. Method Signature and Responsibilities + +```java +@Override +public void typeProcess(TypeElement e, TreePath p) { + // Process class-specific type checking +} +``` + +**Primary Responsibility**: Orchestrate the complete type checking workflow by coordinating subcheckers, validating compilation state, and delegating AST traversal to the visitor pattern implementation. + +##### 2. Detailed Implementation Analysis + +**Message Storage Management** + +```java +if (messageStore != null && parentChecker == null) { + messageStore.clear(); // Clear previous error messages +} +``` + +**Purpose**: The framework implements a **centralized error reporting system** that collects all diagnostic messages during type checking and sorts them by source location before final output. + +- **Message aggregation**: Stores errors from all subcheckers for the current compilation unit +- **Only top-level checkers clear**: Prevents subcheckers from interfering with message coordination +- **Sorted output**: Groups related messages together for better user experience + +**NewObject Example**: When processing `NewObject.java`, any errors from nullness checking (like missing `@NonNull` annotations) are stored here rather than printed immediately. + +##### 3. Subchecker Processing + +```java +for (SourceChecker subchecker : getSubcheckers()) { + subchecker.errsOnLastExit = numErrorsOfAllPreviousCheckers; + subchecker.messageStore = messageStore; + + // Invoke subchecker + subchecker.typeProcess(e, p); + + // Accumulate error counts + numErrorsOfAllPreviousCheckers += errorsAfterTypeChecking - errorsBeforeTypeChecking; +} +``` + +**Composite Checker Pattern**: Enables running multiple type systems simultaneously on the same source code. Each subchecker processes the same `TypeElement` and `TreePath` independently. + +**Error State Coordination**: Tracks cumulative error counts across all subcheckers to maintain consistent error reporting state. This prevents later checkers from being confused by errors from earlier checkers. + +**NewObject Example**: If running both nullness and interning checkers together, both would analyze the `new Object()` expression in sequence, each applying their own type system rules. + +##### 4. Parameter Validation + +```java +if (e == null) { + messager.printMessage(Diagnostic.Kind.ERROR, "Refusing to process empty TypeElement"); + return; +} +if (p == null) { + messager.printMessage(Diagnostic.Kind.ERROR, + "Refusing to process empty TreePath in TypeElement: " + e); + return; +} +``` + +**Defensive Programming**: Validates that the javac integration provided complete data. While these null checks should theoretically never trigger in normal operation, they provide clear error messages if the javac integration breaks. + +**Early Termination**: Prevents `NullPointerException`s deeper in the type checking pipeline by failing fast with clear diagnostic messages. + +##### 5. Compilation Environment Checks + +```java +// Check Java version support +Source source = Source.instance(context); +if (source.compareTo(Source.lookup("8")) < 0) { + messager.printMessage(Diagnostic.Kind.WARNING, + "-source " + source.name + " does not support type annotations"); +} + +// Check for compilation errors +if (log.nerrors > this.errsOnLastExit) { + this.errsOnLastExit = log.nerrors; + javacErrored = true; + return; // Skip type checking if compilation failed +} +``` + +**Version Compatibility**: Ensures the checker runs on supported Java versions since type annotations were introduced in Java 8. + +**Compilation Error Detection**: Monitors javac's error count to detect if previous compilation phases failed. If Java syntax errors exist, type checking is skipped since the AST may be incomplete or inconsistent. + +**NewObject Example**: If `NewObject.java` had syntax errors (like missing semicolons), this check would detect them and skip nullness analysis entirely. + +##### 6. Root Node Setup + +```java +if (p.getCompilationUnit() != currentRoot) { + setRoot(p.getCompilationUnit()); // Set current compilation unit + + if (printFilenames) { + message(Diagnostic.Kind.NOTE, "%s is type-checking %s", + this.getClass().getSimpleName(), + currentRoot.getSourceFile().getName()); + } +} +``` + +**Compilation Unit Context**: Updates the checker's internal state to track which source file is currently being processed. This enables accurate source position reporting in error messages. + +**Visitor State Synchronization**: Calls `visitor.setRoot(currentRoot)` to ensure the visitor has the correct AST context for traversal. + +**NewObject Example**: Sets `currentRoot` to the `CompilationUnitTree` representing `NewObject.java`, enabling accurate line/column reporting for any nullness errors found. + +##### 7. Core AST Traversal + +```java +try { + visitor.visit(p); // Critical delegation: Start AST traversal + warnUnneededSuppressions(); +} catch (UserError ce) { + logUserError(ce); +} catch (TypeSystemError ce) { + logTypeSystemError(ce); +} catch (BugInCF ce) { + logBugInCF(ce); +} catch (Throwable t) { + logBugInCF(wrapThrowableAsBugInCF("SourceChecker.typeProcess", t, p)); +} finally { + this.errsOnLastExit = log.nerrors; + printStoredMessages(p.getCompilationUnit()); +} +``` + +**Core Delegation**: The critical line `visitor.visit(p)` transfers control from framework infrastructure to type-system-specific logic. This is where actual nullness checking begins. + +**Comprehensive Error Handling**: Catches and properly reports different categories of errors: +- **UserError**: Configuration or input errors (wrong command-line options) +- **TypeSystemError**: Type system implementation errors +- **BugInCF**: Internal framework bugs requiring investigation +- **Throwable**: Unexpected runtime errors + +**Resource Cleanup**: The `finally` block ensures error counts are updated and stored messages are printed even if exceptions occur during type checking. + +**NewObject Example**: `visitor.visit(p)` starts with the `ClassTree` for `NewObject` and recursively traverses to eventually reach `visitNewClass()` for the `new Object()` expression, where the `@NonNull` annotation is applied. + +The complete flow transforms abstract framework coordination into concrete type checking through careful orchestration of subcheckers, state validation, and visitor delegation. + + + +#### 3.5.4 Visitor Layer - BaseTypeVisitor.visit() + +**Inheritance Hierarchy** + +``` +TreePathScanner (Java Standard Library) + ↓ +SourceVisitor (Checker Framework) + ↓ +BaseTypeVisitor (Checker Framework) +``` + +The visitor pattern implementation provides automatic AST traversal with type-system-specific processing at each node. + +**SourceVisitor.visit() Method Implementation** + +```java +// SourceVisitor.visit() +public void visit(TreePath path) { + lastVisited = path.getLeaf(); // Record last visited node (for error reporting) + this.scan(path, null); // Call TreePathScanner.scan() +} +``` + +**Entry point delegation** that sets up error reporting context and initiates the recursive AST traversal. The `lastVisited` field enables accurate source location reporting when exceptions occur during type checking. + +**TreePathScanner Working Mechanism** + +TreePathScanner is Java's standard AST traversal infrastructure that implements the **Visitor Pattern** through automatic method dispatch: + +```java +// TreePathScanner core logic (conceptual) +public R scan(TreePath path, P p) { + if (path == null) return null; + + Tree tree = path.getLeaf(); + + // Automatic dispatch based on tree node type + return tree.accept(this, p); // ← KEY: calls appropriate visitXxx method +} +``` + +**Automatic Method Dispatch**: Each `Tree` node type implements an `accept` method that calls the corresponding visitor method: + +```java +// Automatic dispatching examples: +ClassTree.accept(visitor, p) → visitor.visitClass(ClassTree, p) +MethodTree.accept(visitor, p) → visitor.visitMethod(MethodTree, p) +NewClassTree.accept(visitor, p) → visitor.visitNewClass(NewClassTree, p) +VariableTree.accept(visitor, p) → visitor.visitVariable(VariableTree, p) +// ... for all AST node types +``` + +**BaseTypeVisitor Enhancements** + +BaseTypeVisitor extends SourceVisitor and overrides the `scan` method to provide **type-checking context management**: + +```java +@Override +public Void scan(@Nullable Tree tree, Void p) { + if (tree == null) return null; + + // Update type factory with current tree path for accurate type attribution + if (getCurrentPath() != null) { + this.atypeFactory.setVisitorTreePath(new TreePath(getCurrentPath(), tree)); + } + + // Handle newer Java features (switch expressions, etc.) + if (SystemUtil.jreVersion >= 14 && tree.getKind().name().equals("SWITCH_EXPRESSION")) { + visitSwitchExpression17(tree); + return null; + } + + return super.scan(tree, p); // Continue standard traversal +} +``` + +**Context Synchronization**: Ensures the type factory has accurate TreePath information for type attribution and error reporting at each AST node. + +**Visit Method Overrides for Type Checking** + +BaseTypeVisitor overrides specific visit methods to inject type-checking logic: + +```java +@Override +public final Void visitClass(ClassTree classTree, Void p) { + // Skip classes based on configuration + if (checker.shouldSkipDefs(classTree) || checker.shouldSkipFiles(classTree)) { + return null; + } + + // Set up class processing context + atypeFactory.preProcessClassTree(classTree); + atypeFactory.setVisitorTreePath(TreePath.getPath(root, classTree)); + + try { + processClassTree(classTree); // Delegate to subclass-specific logic + atypeFactory.postProcessClassTree(classTree); + } finally { + // Restore previous context + atypeFactory.setVisitorTreePath(preTreePath); + } + return null; +} +``` + +**NewObject Example: Complete Traversal Flow** + +For `NewObject.java` with `new Object()` expression: + +```java +// TreePath: [CompilationUnitTree → ClassTree for NewObject] + +1. SourceVisitor.visit(TreePath) + → Sets lastVisited = ClassTree for NewObject + → Calls scan(TreePath, null) + +2. TreePathScanner.scan() + → tree = ClassTree for NewObject + → Calls ClassTree.accept(this, null) + → Automatically dispatches to visitClass(ClassTree, null) + +3. BaseTypeVisitor.visitClass(ClassTree, null) + → Sets up class processing context + → Calls processClassTree(ClassTree) + → Recursively scans class members via super.visitClass() + +4. Recursive traversal continues: + → visitMethod(test() method) + → visitVariable(Object nn = ...) + → visitNewClass(new Object()) ← TARGET REACHED + +5. BaseTypeVisitor.visitNewClass(NewClassTree, null) + → Gets annotated type: AnnotatedDeclaredType dt = atypeFactory.getAnnotatedType(tree) + → Applies @NonNull annotation to new Object() result type + → Performs constructor invocation checking +``` + +**Error Context Tracking** + +The visitor maintains **comprehensive error context** through multiple mechanisms: + +- **lastVisited field**: Tracks the most recent AST node for exception reporting +- **TreePath management**: Maintains complete path from compilation unit to current node +- **Type factory synchronization**: Ensures accurate source position information for type errors + +**Framework vs. Domain-Specific Responsibilities** + +**SourceVisitor/BaseTypeVisitor (Framework):** +- AST traversal infrastructure and automatic method dispatch +- Error context management and source position tracking +- Type factory coordination and TreePath synchronization + +**Individual Checkers (NullnessVisitor):** +- Override specific visitXxx methods (visitNewClass, visitAssignment, etc.) +- Implement type-system-specific rules and error messages +- Domain-specific suppression and annotation logic + +The visitor layer transforms generic AST traversal into targeted type checking by providing **automatic method dispatch** combined with **comprehensive context management**, enabling individual checkers to focus on type system rules rather than traversal infrastructure. + + + + + + + + + + + + + + + + + + + + + + + + From 01dce461a26ac13e24e795c85c5f7eacdd62a5ac Mon Sep 17 00:00:00 2001 From: zyf265600 Date: Thu, 17 Jul 2025 23:37:20 -0400 Subject: [PATCH 2/4] Table Content update --- .../Checker Framework URA Onboarding.md | 88 ++++++++++++++++++- 1 file changed, 87 insertions(+), 1 deletion(-) diff --git a/docs/manual/Checker Framework URA Onboarding.md b/docs/manual/Checker Framework URA Onboarding.md index 19080885c2e6..52af57ff1ed8 100644 --- a/docs/manual/Checker Framework URA Onboarding.md +++ b/docs/manual/Checker Framework URA Onboarding.md @@ -1,5 +1,92 @@ # Checker Framework Start up / Onboarding +## Table of Contents + +- [Checker Framework Start up / Onboarding](#checker-framework-start-up--onboarding) + - [Table of Contents](#table-of-contents) + - [1. **Introduction**](#1-introduction) + - [1.1 **What is the Checker Framework?**](#11-what-is-the-checker-framework) + - [1.2 **Why This Matters**](#12-why-this-matters) + - [1.3 What You'll Learn](#13-what-youll-learn) + - [1.4 The Journey Ahead](#14-the-journey-ahead) + - [2. What is Processor](#2-what-is-processor) + - [2.1 Java Compilation Process Overview](#21-java-compilation-process-overview) + - [Phase 1: Lexical Analysis](#phase-1-lexical-analysis) + - [Phase 2: Syntax Analysis](#phase-2-syntax-analysis) + - [Phase 3: Symbol Resolution](#phase-3-symbol-resolution) + - [Phase 4: Annotation Processing (What we are looking for!)](#phase-4-annotation-processing-what-we-are-looking-for) + - [Phase 5: Attribute/Type Checking](#phase-5-attributetype-checking) + - [Phase 6: Flow Analysis](#phase-6-flow-analysis) + - [Phase 7: Desugaring](#phase-7-desugaring) + - [Phase 8: Code Generation](#phase-8-code-generation) + - [Understanding `-processor`](#understanding--processor) + - [2.2 What is Processor in Java?](#22-what-is-processor-in-java) + - [2.2.1 Basic Concepts of the Processor Interface](#221-basic-concepts-of-the-processor-interface) + - [2.2.2 Core Methods of the Processor Interface](#222-core-methods-of-the-processor-interface) + - [2.2.3 Significance for Checker Framework](#223-significance-for-checker-framework) + - [2.2.4 Checker Framework's Extension of Processor](#224-checker-frameworks-extension-of-processor) + - [2.2.5 Key Design Decisions](#225-key-design-decisions) + - [**A. Why Use the Processor Interface?**](#a-why-use-the-processor-interface) + - [**B. AbstractTypeProcessor's Innovation**](#b-abstracttypeprocessors-innovation) + - [**Solution (Important):**](#solution-important) + - [2.2.6 Practical Application Examples](#226-practical-application-examples) + - [2.2.8 Limitations of the Processor Interface and Checker Framework's Solutions](#228-limitations-of-the-processor-interface-and-checker-frameworks-solutions) + - [3. How Processor Works](#3-how-processor-works) + - [3.1 Step 1: javac Command Startup](#31-step-1-javac-command-startup) + - [3.1.1 What is reflection](#311-what-is-reflection) + - [3.1.2 Java Standard Annotation Processing Mechanism](#312-java-standard-annotation-processing-mechanism) + - [3.2 Step 2: Annotation Processor Discovery and Instantiation](#32-step-2-annotation-processor-discovery-and-instantiation) + - [3.2.1 javac Instantiates NullnessChecker](#321-javac-instantiates-nullnesschecker) + - [3.2.2 Inheritance Chain Construction Process](#322-inheritance-chain-construction-process) + - [3.2.3 Critical Component: AttributionTaskListener](#323-critical-component-attributiontasklistener) + - [3.3 Step 3: javac Invoke Processor.init()](#33-step-3-javac-invoke-processorinit) + - [3.3.1 When is init() Called?](#331-when-is-init-called) + - [3.3.2 What is the Purpose of init()?](#332-what-is-the-purpose-of-init) + - [3.3.3 The Complete init() Call Chain in the Checker Framework](#333-the-complete-init-call-chain-in-the-checker-framework) + - [Step 1: Processor Instantiation](#step-1-processor-instantiation) + - [Step 2: AbstractTypeProcessor.init() Call](#step-2-abstracttypeprocessorinit-call) + - [3.4 Step 4: Waiting for Type Analysis Completion (Core)](#34-step-4-waiting-for-type-analysis-completion-core) + - [3.4.1 Current State Summary](#341-current-state-summary) + - [3.4.2 Review of Javac Compilation Phase Order](#342-review-of-javac-compilation-phase-order) + - [3.4.2.1 ANALYZE Phase Details](#3421-analyze-phase-details) + - [3.4.2.2 TaskEvent Notification System](#3422-taskevent-notification-system) + - [3.4.2.3 ANALYZE Phase Specification](#3423-analyze-phase-specification) + - [3.4.2.4 Implementation Architecture](#3424-implementation-architecture) + - [3.4.2.5 Processing Sequence](#3425-processing-sequence) + - [3.4.3 Review of The NullnessChecker's Two-Phase Strategy](#343-review-of-the-nullnesscheckers-two-phase-strategy) + - [3.4.4 Phase 1: Registration (process method)](#344-phase-1-registration-process-method) + - [Who Calls It and When?](#who-calls-it-and-when) + - [Why This Design? The Fundamental Problem](#why-this-design-the-fundamental-problem) + - [3.4.5 Phase 2: Type Checking (TaskListener)](#345-phase-2-type-checking-tasklistener) + - [Who Calls It and When?](#who-calls-it-and-when-1) + - [Why This Design? The Innovation](#why-this-design-the-innovation) + - [3.4.6 NewObject.java Example Walkthrough](#346-newobjectjava-example-walkthrough) + - [3.4.7 What's Next? Type Checking Phase Preview](#347-whats-next-type-checking-phase-preview) + - [Next Steps (Brief Overview)](#next-steps-brief-overview) + - [The Transition](#the-transition) + - [3.5 Step 5: Start Checking! - typeProcess](#35-step-5-start-checking---typeprocess) + - [3.5.1 Event Trigger Layer - AttributionTaskListener.finished()](#351-event-trigger-layer---attributiontasklistenerfinished) + - [Detailed Analysis of finished() Method:](#detailed-analysis-of-finished-method) + - [1. First Line: Obtaining TypeElement](#1-first-line-obtaining-typeelement) + - [2. Second Line: Obtaining TreePath](#2-second-line-obtaining-treepath) + - [3. Detailed Implementation Flow](#3-detailed-implementation-flow) + - [4. TreePath Structure and Functionality](#4-treepath-structure-and-functionality) + - [5. Why These Two Operations Are Required](#5-why-these-two-operations-are-required) + - [6. Practical Application Scenario](#6-practical-application-scenario) + - [7. Summary](#7-summary) + - [3.5.2 Abstract Layer - AbstractTypeProcessor.typeProcess()](#352-abstract-layer---abstracttypeprocessortypeprocess) + - [3.5.3 Implementation Layer - SourceChecker.typeProcess()](#353-implementation-layer---sourcecheckertypeprocess) + - [1. Method Signature and Responsibilities](#1-method-signature-and-responsibilities) + - [2. Detailed Implementation Analysis](#2-detailed-implementation-analysis) + - [3. Subchecker Processing](#3-subchecker-processing) + - [4. Parameter Validation](#4-parameter-validation) + - [5. Compilation Environment Checks](#5-compilation-environment-checks) + - [6. Root Node Setup](#6-root-node-setup) + - [7. Core AST Traversal](#7-core-ast-traversal) + - [3.5.4 Visitor Layer - BaseTypeVisitor.visit()](#354-visitor-layer---basetypevisitorvisit) + +--- + ## 1. **Introduction** Welcome to this comprehensive guide on understanding the Checker Framework's internal workings. As a new contributor to this project, you're about to embark on a journey through one of Java's most sophisticated static analysis tools. This guideline will walk you through exactly what happens when the Checker Framework processes a simple Java file, layer by layer, from the moment you invoke the checker until it completes its analysis. @@ -517,7 +604,6 @@ public abstract class AbstractTypeProcessor extends AbstractProcessor { #### 3.2.3 Critical Component: AttributionTaskListener -During construction (line 77 in AbstractTypeProcessor): ```java private final AttributionTaskListener listener = new AttributionTaskListener(); ``` From 6bd02a06868022cf6d73947200bee14c15aeb998 Mon Sep 17 00:00:00 2001 From: zyf265600 Date: Thu, 17 Jul 2025 23:46:23 -0400 Subject: [PATCH 3/4] add new example java file --- docs/examples/NewObject.java | 5 +++++ 1 file changed, 5 insertions(+) create mode 100644 docs/examples/NewObject.java diff --git a/docs/examples/NewObject.java b/docs/examples/NewObject.java new file mode 100644 index 000000000000..d2125fdfe33d --- /dev/null +++ b/docs/examples/NewObject.java @@ -0,0 +1,5 @@ +public class NewObject { + void test() { + Object object = new Object(); + } +} From f79db8d0c6cea310cafa66301866676a132f7c21 Mon Sep 17 00:00:00 2001 From: zyf265600 Date: Thu, 17 Jul 2025 23:49:52 -0400 Subject: [PATCH 4/4] ura onboarding doc rename --- .../{Checker Framework URA Onboarding.md => ura-onboarding.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename docs/manual/{Checker Framework URA Onboarding.md => ura-onboarding.md} (100%) diff --git a/docs/manual/Checker Framework URA Onboarding.md b/docs/manual/ura-onboarding.md similarity index 100% rename from docs/manual/Checker Framework URA Onboarding.md rename to docs/manual/ura-onboarding.md