- https://thelearningjourneyebooks.com/ebooks/TheLinuxKernelDataStructuresJourney_v2.0_April2024.pdf
- https://medium.com/embedworld/maximizing-performance-in-embedded-linux-with-cache-aware-programming-ec3d7ad21e5a
| Command | Description |
|---|---|
| df (disk free) | Shows disk space usage for all mounted filesystems. Options: -h (human-readable sizes such as MB/GB), --local (only locally mounted filesystems). |
| echo c > /proc/sysrq-trigger | Triggers an immediate kernel crash through the magic SysRq interface; typically used to test crash-dump (kdump) handling. |
| efibootmgr -v | Shows the active boot entries registered in NVRAM. |
| free -m | The free -m command provides a snapshot of your system's memory (RAM) usage in Megabytes. |
| free -mh | |
| getconf -a \| grep CACHE_LINESIZE | Shows the CPU cache line size. The CPU moves data to and from RAM in fixed-size units called cache lines; the reported unit is bytes, not bits. |
| grep <option> | |
| grub2-mkconfig | grub2-mkconfig is the actual command-line tool that generates the final grub.cfg file. |
| head Makefile | Shows the kernel version (VERSION, PATCHLEVEL, SUBLEVEL) when run on the top-level Makefile of the kernel source tree. |
| insmod | Insert module. It loads a kernel module (.ko file) directly into the running kernel. |
| ls -R /boot/efi | Safely view the files under /boot/efi. |
| lsmod | Lists all currently loaded kernel modules. Each line has 3 parts: module name, size in bytes, and use count followed by the modules that depend on it. |
| lspci | Lists all PCI and PCI‑Express devices on the system. |
| lstopo | Visualizes the NUMA hierarchy. |
| man -k grub | Searches man page names and descriptions for the keyword grub. |
| make defconfig | Default kernel configuration. It generates a standard, safe, general-purpose configuration. |
| make distclean | Does everything make mrproper does and also removes editor backup and patch files. Run it in the root of the kernel source tree when you want to restart the kernel build from scratch. |
| make help | Shows the available make targets and options. |
| make -j8 | Runs up to eight build jobs in parallel. All jobs write to the same stdout (the console or terminal window), so the output may appear out of order or interleaved. |
| make localmodconfig | Reads the currently loaded kernel modules (lsmod, /proc/modules) and hardware information from /sys and /proc, then creates a new .config file. |
| make menuconfig | Menu-driven UI to fine-tune the kernel configuration. |
| make modules | Builds the kernel modules (.ko, kernel object files). |
| make modules_install | Installs the kernel modules. sudo is not required if INSTALL_MOD_PATH refers to a location that is writable without root. |
| make oldconfig | Updates the current configuration using a provided .config as the base. |
| modinfo -p <module_name> | Lists the parameters the module accepts. |
| mknod | Creates special files in Linux/Unix, such as character devices, block devices, and FIFOs. Usage: mknod NAME TYPE MAJOR MINOR, where TYPE is c (character), b (block), or p (FIFO; takes no MAJOR/MINOR). |
| mkfs | Make filesystem. Creates (formats) a filesystem on a block device, for example a partition like /dev/sdb1. Because it destroys all data on the target device, it must be used carefully. |
| ps -A | Lists all processes currently running on the system. |
| ps -el | List of the processes and their respective nice values (under the column marked NI). |
| ps -eo state,uid,pid,ppid,rtprio,time,comm | List of the processes and their respective real-time priority (under the column marked RTPRIO). A value of "-" means the process is not real-time. |
| ps -LA | Shows all threads of all processes on the system. |
| ps aux | Shows a detailed snapshot of all running processes using BSD-style (Berkeley Software Distribution) options. |
| pstree | pstree is a classic tool that shows your running processes as a tree structure. It’s much easier to read than ps when you want to see which process started (parented) another. |
| readelf -S <module_name> \| grep ksym | Shows the __ksymtab/__ksymtab_gpl sections of the module's ELF file; the module's exported symbols are listed in these sections. |
| rmmod | Remove module. |
| sed '1d' / sed '2d' | Deletes the 1st line / the 2nd line of the input. |
| systemctl isolate graphical.target | Tells Ubuntu to immediately start the full desktop environment (GUI). |
| time <cmd_name> | Shows how long a command takes to execute. |
| ulimit | Views and sets resource limits. The -f option queries the maximum size of files the shell process may write. A value of unlimited only means that the OS imposes no particular limit; the size is still bounded by the disk space actually available. |
| uname <option> | |
| vmstat -m | Slab cache detail (vmstat --> Report virtual memory statistics). |
| wc -l | counts the number of lines in input. |
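
The ulimit -f query in the table above can also be made from a program. The following is a minimal sketch, assuming a POSIX system; the helper names file_size_limit and print_file_size_limit are hypothetical and chosen only for illustration. Note that getrlimit reports the limit in bytes, whereas the shell reports it in blocks.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft RLIMIT_FSIZE limit of the calling
 * process. Returns 0 on success and -1 on failure (errno is set by
 * getrlimit). */
int file_size_limit(rlim_t *soft)
{
    struct rlimit rl;

    assert(soft != NULL);
    if (getrlimit(RLIMIT_FSIZE, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;
    return 0;
}

/* Hypothetical helper: print the limit, mapping RLIM_INFINITY to the
 * word "unlimited" as the shell does. */
void print_file_size_limit(void)
{
    rlim_t soft;

    if (file_size_limit(&soft) != 0) {
        perror("getrlimit");
        return;
    }
    if (soft == RLIM_INFINITY)
        printf("file size limit: unlimited\n");
    else
        printf("file size limit: %llu bytes\n", (unsigned long long)soft);
}
```

Calling print_file_size_limit() from main() prints the current soft limit of the process.
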

| Type | Specifier |
|---|---|
| size_t | %zu |
| ssize_t | %zd |
| Kernel pointer for security (hashed or hidden, depending on kptr_restrict) | %pK |
| Actual pointer (don't use in production) | %px |
| Physical address (phys_addr_t, passed by reference) | %pa |
| Raw buffer as a string of hex characters | %*ph (* is replaced by the number of bytes). Use it for buffers of up to 64 bytes, and use the print_hex_dump_bytes() routine for larger ones. |
| IPv4 address | %pI4 |
| IPv6 address | %pI6 |
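
The %zu and %zd specifiers from the table behave the same way in user-space printf, so they can be tried without building a module. This is a small sketch under that assumption; format_sizes is a hypothetical helper, and the %p extensions in the table are kernel-only and are not shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

/* Hypothetical helper: format an unsigned size and a signed size/return
 * value. %zu matches size_t and %zd matches ssize_t, so no casts are
 * needed and the output is correct on both 32-bit and 64-bit builds. */
int format_sizes(char *buf, size_t len, size_t count, ssize_t ret)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "count=%zu ret=%zd", count, ret);
}
```

For example, format_sizes(buf, sizeof buf, 42, -1) writes the text count=42 ret=-1 into buf.
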

| Log level | Value |
|---|---|
| KERN_EMERG | 0 |
| KERN_ALERT | 1 |
| KERN_CRIT | 2 |
| KERN_ERR | 3 |
| KERN_WARNING | 4 |
| KERN_NOTICE | 5 |
| KERN_INFO | 6 |
| KERN_DEBUG | 7 |
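
Lower numbers mean higher priority. As a rough user-space illustration (not kernel code), the sketch below maps the numeric values in the table to their names and models the console filter: a message is shown on the console only when its level is numerically lower than the current console log level. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Names indexed by the numeric values from the table above. */
static const char *const level_names[] = {
    "KERN_EMERG", "KERN_ALERT", "KERN_CRIT", "KERN_ERR",
    "KERN_WARNING", "KERN_NOTICE", "KERN_INFO", "KERN_DEBUG",
};

/* Hypothetical helper: name of a log level, or NULL if out of range. */
const char *loglevel_name(int level)
{
    if (level < 0 || level > 7)
        return NULL;
    return level_names[level];
}

/* Hypothetical helper: models the console filter. Returns 1 when a
 * message of msg_level would be printed for the given console level. */
int reaches_console(int msg_level, int console_loglevel)
{
    assert(msg_level >= 0 && msg_level <= 7);
    return msg_level < console_loglevel;
}
```

With a console log level of 4, a KERN_ERR (3) message passes the filter while a KERN_WARNING (4) message does not.
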

| Error | Meaning |
|---|---|
| ESRCH | No such process. |
| EINVAL | Invalid argument. |
| ERESTARTSYS | Kernel-internal error code returned when a signal interrupts a blocking system call. It is primarily used with interruptible sleeps (such as mutex_lock_interruptible or wait_event_interruptible). |
| EINTR | Interrupted system call (error code 4). |
| EPERM | Operation not permitted (error code 1). |
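
User space never sees ERESTARTSYS directly: the kernel either restarts the system call or returns EINTR. A common user-space response to EINTR is to retry the call. The following is a minimal sketch of that pattern; read_retry is a hypothetical wrapper name, not a library function.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: call read() again whenever it was interrupted by
 * a signal before transferring any data (errno == EINTR). Any other
 * result, including other errors, is returned to the caller unchanged. */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, count);
    } while (n < 0 && errno == EINTR);
    return n;
}
```

Whether retrying is appropriate depends on the program; some callers deliberately treat EINTR as a cue to check a flag set by the signal handler.
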

| Signal | Description |
|---|---|
| SIGCHLD | Signal: Child. The notification the kernel sends to a parent process whenever one of its child processes terminates, stops, or continues. |
| SIGSTOP | The "hard pause" button for a process. SIGSTOP cannot be ignored, blocked, or handled by the process; when it is delivered, the process stops immediately, exactly where it is. |
| SIGTTIN | Signal: Terminal Input. Sent to a background process when it attempts to read from its controlling terminal (keyboard). |
| SIGTTOU | Signal: Terminal Output. Sent to a background process when it tries to write data to its controlling terminal (tty). |
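
The claim that SIGSTOP cannot be handled is easy to check from user space: sigaction() refuses to install a handler for it. The sketch below assumes a POSIX system; can_catch and noop_handler are hypothetical names. SIGKILL, which is not in the table, behaves the same way.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Handler that does nothing; only used to probe whether installing a
 * handler is allowed. */
static void noop_handler(int signo)
{
    (void)signo;
}

/* Hypothetical helper: returns 1 if a handler can be installed for
 * signo and 0 otherwise. The previous disposition is restored before
 * returning, so the probe leaves the process unchanged. */
int can_catch(int signo)
{
    struct sigaction sa, old;

    assert(signo > 0);
    sa.sa_handler = noop_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);

    if (sigaction(signo, &sa, &old) != 0)
        return 0;                 /* rejected, as for SIGSTOP */
    sigaction(signo, &old, NULL); /* put the old disposition back */
    return 1;
}
```

can_catch(SIGSTOP) returns 0, while can_catch(SIGCHLD) and can_catch(SIGTTIN) return 1.
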

| Flags | Header File | Description |
|---|---|---|
| IRQF_SHARED | <linux/interrupt.h> | Allows the IRQ line to be shared between several devices. Required for devices on the PCI bus. |
| IRQF_ONESHOT | <linux/interrupt.h> | The IRQ is not re-enabled after the hardirq handler finishes executing. This flag is typically used by threaded interrupts to ensure that the IRQ remains disabled until the threaded handler completes. |
| __IRQF_TIMER | <linux/interrupt.h> | Marks the interrupt as a timer interrupt. The timer interrupt fires at periodic intervals and is responsible for implementing the kernel's timer/timeout mechanism, scheduler-related housekeeping and so on. |
| IRQF_NO_SUSPEND | <linux/interrupt.h> | Specifies that the interrupt remains enabled even when the system goes into a suspend state. |
| IRQF_NO_THREAD | <linux/interrupt.h> | Specifies that this interrupt cannot use the threaded model. |
| IRQF_PROBE_SHARED | <linux/interrupt.h> | Used by drivers that perform IRQ probing (automatic detection of interrupt lines) on devices that share an interrupt line with other hardware. It tells the kernel that the driver is willing to share the interrupt line even during the sensitive probing phase, and allows the probe to proceed even if the IRQ is already in use by another "shareable" driver. |
| IRQF_PERCPU | <linux/interrupt.h> | Indicates that a specific interrupt line is private to each CPU core. |
| IRQF_NOBALANCING | <linux/interrupt.h> | Excludes a specific interrupt from the kernel's automatic IRQ balancing mechanism. IRQ balancing is a kernel process (often assisted by the userspace irqbalance daemon) that periodically redistributes hardware interrupts across different CPU cores to prevent any single core from being overwhelmed. |
| IRQF_IRQPOLL | <linux/interrupt.h> | Supports the kernel's irqpoll mechanism, which is primarily a diagnostic and recovery tool used when hardware or firmware fails to correctly signal interrupts. When the irqpoll boot option is active, the kernel polls all handlers registered with the IRQF_IRQPOLL flag whenever an unhandled interrupt occurs on any line. |
| IRQF_FORCE_RESUME | <linux/interrupt.h> | Ensures that a specific interrupt line is re-enabled immediately during the system resume process, even if the device itself has not yet been resumed. |
| IRQF_EARLY_RESUME | <linux/interrupt.h> | Controls the timing of when an interrupt is re-enabled during the system's transition from sleep (suspend) back to a running state (resume). |
| IRQF_COND_SUSPEND | <linux/interrupt.h> | Used to safely share an interrupt line between a standard device and a "non-suspending" device (like a system timer) during system sleep transitions. |
| IRQF_TRIGGER_NONE | <linux/interrupt.h> | Indicates that the driver is not specifying a hardware trigger style (such as edge or level). The kernel then uses the default trigger configuration already defined for that interrupt line. |
| IRQF_TRIGGER_RISING | <linux/interrupt.h> | Configures an interrupt as edge-triggered. The interrupt is generated when the electrical signal on the IRQ line transitions from a low voltage to a high voltage (the "rising edge"). |
| IRQF_TRIGGER_FALLING | <linux/interrupt.h> | Configures an interrupt as edge-triggered. The interrupt is generated when the electrical signal on the IRQ line transitions from a high voltage to a low voltage (the "falling edge"). |
| IRQF_TRIGGER_HIGH | <linux/interrupt.h> | Configures a level-triggered interrupt. The interrupt remains active as long as the voltage on the IRQ line is held at a high logical level. |
| IRQF_TRIGGER_LOW | <linux/interrupt.h> | Configures a level-triggered interrupt. The interrupt is considered active as long as the voltage on the IRQ line is held at a low logical level. |
| TIMER_DEFERRABLE | <linux/timer.h> | Used when initializing a kernel timer to indicate that the timer does not need to wake up a CPU core from a deep sleep (idle) state. Standard timers are "hard" deadlines: if one expires while a CPU is in a power-saving C-state, the hardware forces the CPU to wake up just to handle the interrupt, which consumes significant battery/power. A deferrable timer instead waits until the CPU wakes up for another reason. |
| TIMER_PINNED | <linux/timer.h> | Used during timer initialization to ensure that a timer's callback function always executes on the same CPU core that scheduled it. Normally, the Linux scheduler and the timer wheel may move a timer to a different CPU core to balance the load or save power (especially on multi-core SoCs). |
| TIMER_IRQSAFE | <linux/timer.h> | Used during timer initialization to indicate that the timer's callback function can be safely executed in hard-interrupt (atomic) context without triggering deadlocks. |
| PF_EXITING | <linux/sched.h> | A process flag bit used to mark a task that has begun its termination sequence. |

| Abbreviation | Full Form | Description |
|---|---|---|
| .ko | Kernel object | Provides kernel functionality in a modular manner. |
| ABI | Application Binary Interface | The low-level interface between the kernel and other software (either user-space applications or kernel modules). Unlike the API (Application Programming Interface), which is defined at the source code level, the ABI is defined at the binary level (registers, memory layouts, and stack conventions). Unlike the user-space interface, the internal kernel ABI is unstable. |
| ASAN | Address SANitizer | |
| ASLR | Address Space Layout Randomization | |
| BCC | BPF Compiler Collection | A toolkit and framework for creating eBPF programs used to trace, profile, and observe Linux systems at runtime with very low overhead. It’s widely used for performance analysis, debugging, networking, and security. |
| BDI | Backing Device Info | A core data structure (struct backing_dev_info) that represents the properties and state of a storage device (the "backing store") that sits underneath a filesystem. |
| BIOS | Basic Input Output System | |
| BKL | Big Kernel Lock | When held, it kept the kernel in a non-preemptible state for long periods of time. It has since been removed. |
| BoF | Buffer Overflow | |
| BSA | Buddy System Allocator | |
| BSP | Board Support Package | Support files for hardware/SoC. |
| CFS | Completely Fair Scheduler | |
| cgroup | Control groups | |
| CISC | Complex Instruction Set Computing | |
| CMA | Contiguous Memory Allocator | |
| cmpxchg | Compare and Exchange | An atomic instruction provided by the CPU hardware. Conceptually, executed as one indivisible step: bool cmpxchg(int *address, int expected, int new_value) { if (*address == expected) { *address = new_value; return true; /* success */ } return false; /* someone else changed it */ } |
| CPIO | Copy In, Copy Out | A simple archive file format used widely in Linux systems, especially for initramfs/initrd, embedded systems, and packaging files for the kernel. |
| CONFIG_MODULE_SIG | Module Signature Verification | A kernel build-time option that controls module signature verification, i.e. whether .ko modules are checked for a cryptographic signature when loaded. CONFIG_MODULE_SIG_ALL: sign all modules automatically during the kernel build. CONFIG_MODULE_SIG_FORCE: the kernel refuses to load unsigned modules. |
| cpuhp | CPU Hotplug | Manages the state transitions (online/offline) for each CPU core. |
| CR3 | Control Register 3 | The control register that holds the physical address of the top-level page table (PGD/PML4) on x86/x86-64. It is the x86/x86-64 equivalent of ARM64’s TTBR0/TTBR1. |
| CTF | Common Trace Format | |
| CVE | Common Vulnerabilities and Exposures | A CVE is a standardized identifier for a publicly disclosed security flaw. |
| CWE | Common Weakness Enumeration | While a CVE identifies a specific security flaw in a program (like a specific bug in kernel 6.17), a CWE identifies the type or root cause of that weakness. |
| DAMON | Data Access MONitor | Captures and analyses the memory access patterns of user-space processes. |
| dd | Disc duplicator | |
| debugfs | Debug File System | |
| defconfig | Default Kernel Configuration | |
| dentry | Directory entry | A core Virtual File System (VFS) structure that represents a specific component in a file path. |
| DKMS | Dynamic Kernel Module Support | Framework that automatically rebuilds out-of-tree kernel modules when a new kernel is installed. |
| dm-verity | Device-Mapper Verity | A kernel feature that ensures the integrity of read-only partitions like /system and /vendor. |
| DSO | Dynamic Shared Object | |
| DSP | Digital Signal Processor | Special processor for signal operations. |
| DTB | Device Tree Blob | Binary hardware description. |
| DTS | Device Tree Source | Source format of the DTB. |
| eBPF | Extended Berkeley Packet Filter | An in-kernel programmable VM that lets you run user-defined programs inside the Linux kernel without loading kernel modules. It’s widely used for observability, networking, and security. |
| ELF | Executable and Linkable Format | |
| Epoll | Event Poll | A scalable Linux-specific I/O event notification mechanism used to monitor multiple file descriptors (FDs) to see if I/O is possible on any of them. |
| EUID | Effective User ID | Who the kernel trusts for access control. |
| EXPORT_SYMBOL | | By default, a kernel module's symbols are private to it. EXPORT_SYMBOL makes a symbol visible to all other kernel modules. |
| ext2 | Second Extended File System | It does not keep a log (journal) of intended changes. |
| ext3 | Third Extended File System | It records changes in a dedicated area (the "journal") before they are permanently applied to the main file system. |
| f2fs | Flash-Friendly File System | |
| FIQ | Fast Interrupt Request | A legacy hardware-level interrupt specific to the ARM (32-bit) architecture. It was designed to provide a higher-priority, lower-latency alternative to the standard IRQ. |
| FPU | Floating Point Unit | |
| FSUID | File System User ID | Used for file-system-specific permission checks. |
| GFP | Get Free Pages | |
| GIC | Generic Interrupt Controller | The interrupt controller on ARM. |
| GKI | Generic Kernel Image | A generic, common Linux kernel image that works across many Android devices without vendors heavily modifying the core kernel. |
| GPL | General Public License | Code that is upstreamed into the mainline kernel must be under the GNU GPL-2.0 license. |
| GPOS | General Purpose Operating System | |
| GRUB | Grand Unified Bootloader | |
| HAL | Hardware Abstraction Layer | Layer between the hardware and the OS. |
| HID | Human Interface Device | |
| HRT | High-resolution timers | The interrupt source for the kernel's high-precision timing subsystem, which allows microsecond-level (or even nanosecond-level) accuracy, far exceeding the old "jiffies" system. |
| I2C | Inter-Integrated Circuit | |
| IDR | Integer ID Management | A library used to map small integer identifiers (IDs) to pointer-based data structures. It solves the problem of efficiently allocating, managing, and looking up unique IDs (such as file descriptors, process IDs (PIDs), or device instance numbers) without the high memory overhead of a large array or the slow lookup times of a linked list. |
| initramfs | Initial RAM filesystem | A tiny, temporary root filesystem that is loaded into RAM right after GRUB but before the actual Ubuntu system starts. Think of it as the "bridge" that helps the kernel find and mount the real root filesystem. |
| inode | Index node | Contains file metadata such as access permissions, size, owner, and creation time. The inode object represents all the information the kernel needs to manipulate a file or directory. An inode is created in two distinct scenarios: physically on the disk and logically in the kernel's memory. On disk, a new inode is allocated on the storage medium whenever a new file system object is created. In memory, even if a file already exists on disk, a "virtual" inode object must be created in RAM so the OS can work with it. |
| [IO][A]PIC | IO-[Advanced] Programmable Interrupt Controller | IO-APIC on x86. |
| IOCTL | Input-Output Control | The ioctl system call is used to issue commands to the device (via its driver). |
| IoF | Integer Overflow | |
| IRQ | Interrupt ReQuest | A signal sent by hardware to the CPU to indicate that an event requires immediate attention. It allows the processor to stop its current task, handle the hardware event, and then resume. |
| ISR | Interrupt Service Routine | |
| IWI | IRQ Work Interrupt | The inter-processor interrupt used by the kernel's irq_work mechanism to make a CPU run queued irq_work callbacks in hard-interrupt context. It is listed as IWI in /proc/interrupts on x86. |
| KASAN | Kernel Address SANitizer | A dynamic memory error detector used primarily to find out-of-bounds accesses (buffer overflow/underflow), use-after-free, and double-free bugs. |
| KASLR | Kernel ASLR | |
| KCSAN | Kernel Concurrency SANitizer | |
| Kbuild | Kernel Build System | |
| Kconfig | Kernel Configuration | System for selecting kernel features. |
| KMSAN | Kernel Memory Sanitizer | |
| kprobe | Kernel probe | |
| kretprobe | Kernel return probe | |
| KSE | Kernel Schedulable Entity | In Linux, the KSE is a thread, not a process. |
| LANANA | Linux Assigned Names And Numbers Authority | Only this body can officially assign the device node (the type and the major:minor numbers) to devices. |
| LDM | Linux Device Model | |
| LKM | Loadable Kernel Module | Kernel code loaded/unloaded at runtime. |
| LLC | Last Level Cache | |
| loff_t | Long Offset Type | A signed 64-bit integer used to represent file positions and offsets. |
| LPA | Large Physical Address | |
| LTTng | Linux Trace Toolkit: next generation | Powerful and popular open-source tracing system for the Linux kernel. |
| MAC | Mandatory Access Control | |
| MBR | Master Boot Record | The first sector of a storage device (sector 0), occupying exactly 512 bytes. It is the legacy standard for partitioning disks, used primarily by BIOS-based systems to locate and load an operating system. Example: sudo xxd -l 512 /dev/nvme0n1 shows a protective MBR in which every byte is zero except 000001c0: 0200 eeff ffff 0100 0000 ffff ff18 0000 (the single partition entry, type 0xee) and the last two bytes of 000001f0, 55aa (the boot signature). The 512 bytes are strictly divided into three main components: bootstrap code (446 bytes), the partition table (64 bytes, four 16-byte entries), and the boot signature (2 bytes, 0x55 0xAA). |
| min_flt | Number of minor page faults | A minor page fault occurs when the page is already in memory but is not yet mapped into the process's page tables, so it is resolved without disk I/O. |
| MMU | Memory Management Unit | Hardware for memory translation. |
| Mutex | Mutual Exclusion | |
| NAPI | New API | |
| NBD | Network Block Device | |
| NMI | Non-maskable Interrupt | A high-priority hardware interrupt that cannot be ignored or disabled by standard software masking techniques. It is reserved for critical events that must be handled immediately, even if the CPU is in a state where regular interrupts are disabled. NMI interrupt lines cannot be shared. |
| NUMA | Non-Uniform Memory Access | |
| nivcsw | Number of non-voluntary context switches | A non-voluntary context switch happens when the scheduler preempts a running task, for example because its time slice expired or a higher-priority task became runnable. (The task_struct field nvcsw counts voluntary switches.) |
| OF | Open Firmware | Used for device tree bindings. |
| PA | Page Allocator | |
| PB | Petabyte | A unit of digital storage size. |
| PCIe | PCI Express | |
| PFN | Page Frame Number | |
| PGD | Page Global Directory | The top-level page table used by the Linux kernel to translate virtual addresses → physical addresses. |
| PMD | Page Middle Directory | The third level in the Linux page table hierarchy (for most modern configs); it sits between PUD and PTE. |
| PMI | Performance Monitoring Interrupt | A specialized interrupt generated by the CPU's Performance Monitoring Unit (PMU) to signal that a specific hardware counter has overflowed. |
| POST | Power On Self Test | |
| PSS | Proportional Set Size | Physical memory used by a process, where shared pages are divided proportionally among all sharers. Example: a process with 4 MB of private pages and 6 MB of pages shared with two other processes has a PSS of 4 + 6/3 = 6 MB. |
| pts | Pseudo-Terminal Slave | Unlike /dev/tty1 (which represents a physical keyboard and monitor attached to the machine), a pts is a "fake" terminal created by software. You get a pts whenever you connect to the system via SSH, a graphical terminal emulator, or a multiplexer such as tmux or screen. |
| PTE | Page Table Entry | The lowest (leaf) level of the Linux page table hierarchy; it directly maps a virtual page to a physical page. Virtual Address → PGD (L0) → PUD (L1) → PMD (L2) → PTE (L3) → Physical Page (4 KB). |
| pty | Pseudo-terminal | |
| PUD | Page Upper Directory | The second level in the Linux page table hierarchy; it sits between PGD and PMD. |
| RCU | Read-Copy-Update | A high-performance synchronization mechanism that allows multiple "readers" to access data simultaneously with a "writer". |
| RISC | Reduced Instruction Set Computer | |
| RMW | Read-Modify-Write | |
| RSS | Resident Set Size | The amount of physical RAM currently occupied by a process. RSS counts only pages that are resident in RAM. |
| RTC | Real Time Clock | |
| RTL | Real Time Linux | |
| RUID | Real User ID | Who started the process. |
| SCL | Serial Clock | |
| SCSI | Small Computer System Interface | |
| SDA | Serial Data | |
| sed | Stream editor | |
| SELinux | Security Enhanced Linux | |
| SEV | Send Event | The companion instruction to WFE. It acts as a signaling mechanism to wake up processor cores that have entered a low-power standby state. When a core executes SEV, it causes an event to be signaled to all cores in the multiprocessor system (or within a specific sharing domain). |
| SIMD | Single Instruction, Multiple Data | |
| SLOCs | Source Lines of Code | |
| SMP | Symmetric MultiProcessing | Multiple CPUs sharing memory. |
| SOH | Start of Header | |
| SPDX | Software Package Data Exchange | A concise shorthand format for expressing the license the code is under. It must be the first line in every source file, e.g. // SPDX-License-Identifier: GPL-2.0 |
| SUID | Saved User ID | For temporarily dropping/regaining privilege. |
| SVE | Scalable Vector Extension | |
| systemd | System Daemon | |
| TLB | Translation Lookaside Buffer | A CPU hardware cache that speeds up virtual → physical address translation. |
| TGID | Thread Group ID | |
| TTBR0 | Translation Table Base Register 0 | An ARM64 CPU register that tells the MMU where the page tables for user space start. TTBR0_EL1 holds the physical base address of the page tables used for translating user-space virtual addresses. |
| TTBR1 | Translation Table Base Register 1 | Holds the base address of the kernel page tables. |
| tty | Teletype terminal | |
| UAF | Use After Free | |
| UB | Undefined Behavior | |
| UBSAN | Undefined Behavior Sanitizer | A runtime debugging tool for the Linux kernel that detects undefined behavior: actions in C that the language standard doesn't define, often leading to unpredictable crashes or security flaws. It identifies common "silent" bugs that compilers usually ignore, such as signed integer overflow, out-of-range shifts, and out-of-bounds array indexing. |
| UEFI | Unified Extensible Firmware Interface | |
| umh | User Mode Helper | |
| UMR | Uninitialized Memory Reads | |
| UTS | Unix Timesharing System | Provides domain name and hostname isolation. |
| VDSO | Virtual Dynamic Shared Object | |
| VFS | Virtual File System | |
| wfe | Wait For Event | A hint instruction used to put a processor into a low-power standby state until a specific "event" occurs. |
| w/w | Wound/Wait | A specialized mutex implementation used for deadlock avoidance when a thread needs to acquire multiple locks at once. W/W mutexes use a ticket (timestamp) system: every "transaction" (a set of lock attempts) gets a serial number. |
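
The cmpxchg entry in the table above describes compare-and-exchange conceptually. As a user-space illustration, the sketch below expresses the same idea with C11 atomics and shows the usual retry loop built on top of it. The names cas_int and cas_increment are hypothetical; the kernel's own cmpxchg() macro differs in that it returns the previous value rather than a bool.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: atomically store desired into *addr only if
 * *addr still holds expected. Returns true when the swap happened. */
bool cas_int(atomic_int *addr, int expected, int desired)
{
    assert(addr != NULL);
    return atomic_compare_exchange_strong(addr, &expected, desired);
}

/* Hypothetical helper: lock-free increment. The loop retries until no
 * other thread changed the value between our read and our CAS.
 * Returns the value that was stored. */
int cas_increment(atomic_int *addr)
{
    int old;

    assert(addr != NULL);
    old = atomic_load(addr);
    while (!atomic_compare_exchange_weak(addr, &old, old + 1)) {
        /* On failure, old has been refreshed with the current value,
         * so the next attempt uses up-to-date data. */
    }
    return old + 1;
}
```

The weak variant may fail spuriously on some architectures, which is why it is only used inside a loop here.
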

Q. When to use a softirq?
Ans: Use a softirq only if you are writing core kernel infrastructure that requires extreme performance and massive parallelism.
- Parallelism: The same softirq can run on multiple CPUs simultaneously.
- Complexity: You must ensure your code is perfectly re-entrant and uses complex fine-grained locking.
- Usage: Reserved for Networking (NET_RX/NET_TX), Block I/O, and RCU.
- Static: You cannot add new softirqs without modifying and recompiling the core kernel.
Q. When to use a Tasklet?
Ans: Use a tasklet if you are maintaining legacy driver code that requires a simple, atomic bottom half.
- Ease of Use: Tasklets are dynamically allocatable and don't require you to worry about multi-CPU concurrency.
- Serialization: A specific tasklet will never run on two CPUs at once. This simplifies locking significantly.
- Execution: They always run on the same CPU that scheduled them, which is good for cache locality.
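The serialization rule can be pictured with a small userspace model. This is only a sketch: every `model_` name is hypothetical and none of it is kernel API. It assumes a tasklet carries a "scheduled" bit and a "running" bit, and that a run attempt is refused while the running bit is set, which is how the points above behave from the driver's point of view.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: all model_ names are hypothetical, not kernel API. */
enum { MODEL_STATE_SCHED = 1u << 0, MODEL_STATE_RUN = 1u << 1 };

struct model_tasklet {
    unsigned state;                        /* SCHED and RUN bits */
    void (*func)(struct model_tasklet *);  /* the bottom-half handler */
    int runs;                              /* completed handler runs */
    int refused;                           /* run attempts refused because RUN was set */
};

/* Mark the tasklet pending; returns false if it was already pending. */
bool model_tasklet_schedule(struct model_tasklet *t)
{
    if (t->state & MODEL_STATE_SCHED)
        return false;
    t->state |= MODEL_STATE_SCHED;
    return true;
}

/* Try to run a pending tasklet; refused while another "CPU" is running it. */
bool model_tasklet_try_run(struct model_tasklet *t)
{
    if (t->state & MODEL_STATE_RUN) {
        t->refused++;
        return false;
    }
    if (!(t->state & MODEL_STATE_SCHED))
        return false;
    t->state |= MODEL_STATE_RUN;
    t->state &= ~MODEL_STATE_SCHED;   /* may be scheduled again from now on */
    t->func(t);
    t->runs++;
    t->state &= ~MODEL_STATE_RUN;
    return true;
}

/* Handler that, on its first run, simulates a second CPU scheduling and
 * trying to run the same tasklet while it is still executing. */
void model_handler(struct model_tasklet *t)
{
    if (t->runs == 0) {
        model_tasklet_schedule(t);   /* accepted: SCHED was already cleared */
        model_tasklet_try_run(t);    /* refused: RUN is still set */
    }
}
```

In this model, scheduling an already pending tasklet is a no-op, and the attempt made from inside the handler is counted in `refused` instead of running the handler a second time in parallel. The tasklet stays pending and runs once more later.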
Q. On an SMP system, can a thread execute a critical section on one core while an interrupt handler enters the same critical section on a second core?
Ans: Yes. Without proper synchronization, a critical section can be accessed simultaneously by a process on one core and an interrupt handler on another.
The Scenario: The "Race Condition"
Imagine you have a shared data structure protected by a standard mutex or a simple flag.
- Core 1: Thread A enters the critical section (acquires a lock).
- Core 2: A hardware interrupt occurs. The CPU stops what it's doing and jumps to the Interrupt Service Routine (ISR).
- The Conflict: If the ISR on Core 2 tries to access the same data structure while Thread A is still holding it on Core 1, you have a collision.
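The collision can be replayed deterministically in plain C. This is a minimal sketch, assuming the shared data is a single counter and that an increment is a separate load and store. The `model_` names are hypothetical, and the "cores" are just the order in which the steps are executed.

```c
#include <assert.h>

/* Hypothetical shared state touched by both the thread and the ISR. */
struct model_shared { int counter; };

/* An increment is really a load followed by a store. */
int model_load(const struct model_shared *s) { return s->counter; }
void model_store(struct model_shared *s, int v) { s->counter = v; }

/* Bad interleaving: the ISR on Core 2 runs between the thread's load and store. */
int model_replay_unsynchronized(void)
{
    struct model_shared s = { 0 };
    int thread_tmp = model_load(&s);        /* Core 1 thread: loads 0 */
    model_store(&s, model_load(&s) + 1);    /* Core 2 ISR: 0 -> 1 */
    model_store(&s, thread_tmp + 1);        /* Core 1 thread: stores stale 0 + 1 */
    return s.counter;                       /* one increment is lost */
}

/* Same two increments, but each one completes before the other starts. */
int model_replay_serialized(void)
{
    struct model_shared s = { 0 };
    model_store(&s, model_load(&s) + 1);    /* Core 1 thread */
    model_store(&s, model_load(&s) + 1);    /* Core 2 ISR */
    return s.counter;
}
```

Two increments were issued in both replays, but the unsynchronized one ends with a counter of 1 while the serialized one ends with 2.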
Q. Why do standard mutexes fail here?
Ans: In the Linux kernel, an interrupt handler cannot sleep. Because a standard mutex (mutex_lock()) puts the caller to sleep when the lock is already held, it must not be used inside an interrupt handler. If the ISR tries to take a mutex held by the thread on Core 1, it attempts to sleep in atomic context, which the kernel typically reports as a "scheduling while atomic" bug and which can hang or crash the system.
Q. Why will a plain spin_lock() not work in this case?
Ans: A simple spin_lock fails because it does not account for the same-core deadlock scenario. Even in an SMP (Symmetric Multiprocessing) system, an interrupt can fire on the same core that is currently holding the lock.
The Local Deadlock (Self-Deadlock)
If a process on Core 1 acquires a simple spin_lock, it successfully enters the critical section. However, if a hardware interrupt occurs on that same core (Core 1) before the lock is released:
- The kernel stops the process and starts the Interrupt Service Routine (ISR).
- If the ISR tries to acquire the same spinlock, it will see the lock is already "taken" and will begin spinning (looping) to wait for it.
- The process that holds the lock can never run to release it because it has been interrupted by the very ISR that is now spinning.
- Result: The core is deadlocked in a permanent "spin".
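The steps above can be sketched as a single-core userspace model. It is an illustration under stated assumptions, not kernel code: the `model_` names are hypothetical, the lock is a plain flag, and the ISR gives up after `MODEL_SPIN_LIMIT` attempts so the demo terminates, whereas a real ISR would spin forever.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_SPIN_LIMIT 1000   /* hypothetical bound so the demo terminates */

struct model_lock { bool held; };

/* Plain lock attempt: it does not touch the core's interrupt state. */
bool model_try_lock(struct model_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

void model_unlock(struct model_lock *l) { l->held = false; }

/* ISR running on the same core as the interrupted lock holder.
 * Returns the number of failed attempts before it got the lock or gave up. */
int model_isr_same_core(struct model_lock *l)
{
    int spins = 0;
    while (!model_try_lock(l)) {
        if (++spins == MODEL_SPIN_LIMIT)
            return spins;       /* a real ISR would keep spinning here */
    }
    model_unlock(l);
    return spins;
}

/* Returns true if the ISR exhausted its spin budget, i.e. the core "hung". */
bool model_same_core_deadlocks(void)
{
    struct model_lock l = { false };
    if (!model_try_lock(&l))    /* process on Core 1 takes the lock */
        return false;
    /* Interrupt fires on Core 1: the holder cannot run until the ISR returns. */
    int spins = model_isr_same_core(&l);
    model_unlock(&l);           /* reached only because the model gives up */
    return spins == MODEL_SPIN_LIMIT;
}
```

The holder's `model_unlock()` sits after the ISR call, which mirrors the real problem: the release can only happen once the ISR returns, and the ISR only returns once the lock is released.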
Q. Why SMP Doesn't Solve This?
Ans: While you might think the ISR on Core 2 would be fine (it would just spin until Core 1 finishes), you cannot guarantee which core will receive a specific interrupt. If the interrupt happens to hit the core holding the lock, the entire system can hang.
Q. In an SMP system, an interrupt arriving on Core 2 cannot physically preempt a process running on Core 1. So why is spin_lock still "wrong"?
Ans: The reason you are told a "simple spin_lock will not work" is not because of Core 2; it is because of the uncertainty of interrupt routing.
In most modern systems, the interrupt controller (for example, the Advanced Programmable Interrupt Controller, APIC, on x86) decides which core gets an interrupt. You cannot guarantee the interrupt will always go to Core 2. If that same interrupt happens to be routed to Core 1 while Core 1 is holding the lock:
- Core 1 stops the process to handle the interrupt.
- The ISR on Core 1 tries to grab the lock Core 1 is already holding.
- Deadlock: Core 1 spins forever waiting for itself.
The Solution: Spinlocks with IRQ Disabling
To protect a critical section from being accessed by both a thread and an interrupt handler across different cores, you must use a Spinlock combined with Interrupt Disabling.
spin_lock_irqsave()
This is the "gold standard" for this problem. When you call this:
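- On the local core (Core 1): It saves the current interrupt state and disables interrupts. This prevents an interrupt from firing on this core and trying to re-enter the critical section. The matching spin_unlock_irqrestore() puts the saved state back.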
- Across the system: It acquires a spinlock. If an interrupt fires on Core 2 and tries to enter the same critical section, it will "spin" (loop rapidly) waiting for Core 1 to release the lock.
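Why the state is saved and restored, rather than simply disabled and re-enabled, shows up when two such locks are nested. The sketch below is a single-core userspace model with hypothetical `model_` names. In the kernel the real calls are `spin_lock_irqsave(lock, flags)` and `spin_unlock_irqrestore(lock, flags)` with an `unsigned long flags`; here the saved state is reduced to one `bool`, and an interrupt that arrives while disabled is assumed to stay pending until interrupts are enabled again.

```c
#include <assert.h>
#include <stdbool.h>

struct model_lock { bool held; };

struct model_core {
    bool irqs_enabled;          /* local interrupt-enable flag */
    bool irq_pending;           /* an interrupt arrived while disabled */
    int isr_runs;               /* ISRs that actually ran on this core */
    int isr_saw_lock_held;      /* must stay 0: ISR ran inside the critical section */
    struct model_lock *lock;    /* the lock the ISR also needs */
};

/* The ISR records whether it was entered while the shared lock was held. */
void model_run_isr(struct model_core *c)
{
    c->isr_runs++;
    if (c->lock->held)
        c->isr_saw_lock_held++;
}

/* Hardware raises an interrupt on this core. */
void model_raise_irq(struct model_core *c)
{
    if (c->irqs_enabled)
        model_run_isr(c);
    else
        c->irq_pending = true;  /* deferred until interrupts are enabled again */
}

/* Model of spin_lock_irqsave(): save the flag, disable, then take the lock. */
bool model_lock_irqsave(struct model_core *c, struct model_lock *l)
{
    bool flags = c->irqs_enabled;
    c->irqs_enabled = false;
    l->held = true;
    return flags;
}

/* Model of spin_unlock_irqrestore(): release, then restore the saved flag. */
void model_unlock_irqrestore(struct model_core *c, struct model_lock *l, bool flags)
{
    l->held = false;
    c->irqs_enabled = flags;
    if (c->irqs_enabled && c->irq_pending) {
        c->irq_pending = false;
        model_run_isr(c);       /* the deferred interrupt is delivered now */
    }
}
```

With an outer lock `a` and an inner lock `b`, releasing `b` restores "disabled" (the state saved when `b` was taken), so an interrupt raised in between is delivered only after `a` is released and the ISR never sees `a` held.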
Q. spin_lock() works in process_context or atomic_context?
Ans: spin_lock() is versatile and can be used in both process context and atomic context, but acquiring it changes how the kernel treats the caller: while the lock is held, the code must not sleep.