Computer architecture resources, including simulators, benchmarks, tools, and tutorials for researchers, graduate students, and hardware engineers.
- Books
- Conferences
- Simulators
- Benchmarks
- Tools for Research
- ISA References & Specs
- Compilers & Binary Analysis
- Traces & Datasets
- Tutorials
- Cutting-Edge Research & Emerging Architectures
- Elite Research Labs in Computer Architecture & Systems
- Hall of Fame
Essential reading materials for computer architecture, ranging from foundational concepts to advanced quantitative analysis.
- CMOS VLSI Design: A Circuits & Systems Perspective - Fundamentals of CMOS technology, circuits, and VLSI chip design by Weste & Harris.
- Computer Architecture: A Quantitative Approach - The definitive guide for graduate-level architecture by Hennessy & Patterson.
- Computer Organization and Design (RISC-V Edition) - Best for undergraduate basics and RISC-V fundamentals by Patterson & Hennessy.
- Digital Design and Computer Architecture - Connecting logic gates to processor design by Harris & Harris.
- Parallel Computer Architecture: A Hardware/Software Approach - The classic text on parallel computing and high-performance architecture by Culler, Singh, & Gupta.
- Principles and Practices of Interconnection Networks - Definitive guide on router architecture, topology, and routing by Dally & Towles.
Top-tier academic conferences where the latest computer architecture research is published.
- ASPLOS - International Conference on Architectural Support for Programming Languages and Operating Systems; the intersection of hardware, operating systems, and compilers.
- HPCA - International Symposium on High-Performance Computer Architecture; high-performance system design and analysis.
- ISCA - International Symposium on Computer Architecture; the premier forum for new ideas in computer architecture.
- MICRO - International Symposium on Microarchitecture; advanced microarchitecture and compiler techniques.
- ChampSim - Trace-driven simulator for branch prediction, prefetching, and cache replacement studies, widely used in ISCA/MICRO research.
- gem5 - Modular, open-source CPU/SoC simulator used for studying cache, branch prediction, and out-of-order CPUs.
- MARSSx86 - Cycle-accurate x86 simulator for microarchitecture analysis.
- QEMU - Fast full-system emulator heavily used for OS and software prototyping.
- Sniper - Interval-based multicore simulator for large-scale CMP performance studies.
- SST - Parallel simulator for large-scale systems focusing on HPC/SoC co-design.
- ZSim - Fast, scalable x86-64 multicore simulator for cache hierarchy and NUCA research.
- Accel-Sim - Validated GPU simulator that models modern hardware features such as tensor cores and runs deep learning workloads.
- GPGPU-Sim - Cycle-accurate NVIDIA GPU simulator for CUDA kernel analysis.
- SCALE-Sim - Systolic array simulator for deep neural networks and TPU-like architectures.
- STONNE - Cycle-level microarchitectural simulator for flexible DNN inference accelerators.
- Timeloop - Deep learning accelerator modeling framework for energy and performance optimization.
- DRAMSim3 - Cycle-accurate DRAM simulator for DDRx and LPDDRx modeling.
- MQSim - NVMe and SSD simulator for computational storage and near-data processing.
- Ramulator - Cycle-accurate DRAM simulator supporting HBM, GDDR, and emerging memories for memory controller design.
- Ramulator 2 - Redesigned Ramulator with a cleaner, modular API for next-generation memory protocol research.
- FireSim - FPGA-accelerated datacenter simulator for cloud hardware research.
- Simics - Commercial full-system simulator for enterprise virtual prototyping.
- VCS - Commercial RTL simulator by Synopsys for ASIC/FPGA validation.
- Verilator - Fast RTL-to-C++ simulator for pre-silicon verification.
- SimpleScalar - Classic CPU simulator generally used for teaching pipeline basics.
- Spike - RISC-V ISA reference simulator for RISC-V software development.
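Trace-driven simulators such as ChampSim replay a recorded stream of events through a model instead of executing the program itself. As a rough illustration of that style (this is not ChampSim's code or trace format), the sketch below replays a hypothetical list of `(pc, taken)` pairs through a bimodal predictor built from 2-bit saturating counters. The names `BimodalPredictor` and `run_trace`, the table size, and the trace layout are all assumptions made for the example.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters indexed by the low bits of the PC."""

    def __init__(self, table_bits=10):
        self.mask = (1 << table_bits) - 1
        # Counter values 0..1 predict not-taken, 2..3 predict taken.
        self.counters = [1] * (1 << table_bits)  # start weakly not-taken

    def predict(self, pc):
        return self.counters[pc & self.mask] >= 2

    def update(self, pc, taken):
        idx = pc & self.mask
        if taken:
            self.counters[idx] = min(3, self.counters[idx] + 1)
        else:
            self.counters[idx] = max(0, self.counters[idx] - 1)


def run_trace(trace, predictor):
    """Replay (pc, taken) pairs; return the fraction predicted correctly."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)  # train on the actual outcome
    return correct / len(trace) if trace else 0.0


if __name__ == "__main__":
    # Hypothetical trace: one loop branch, taken 9 times and then not taken,
    # repeated for 100 executions of the loop.
    loop_trace = [(0x400, i % 10 != 9) for i in range(1000)]
    print(f"accuracy: {run_trace(loop_trace, BimodalPredictor()):.3f}")
```

On this synthetic loop trace the predictor mispredicts roughly once per loop exit, which is the behavior that more elaborate predictors evaluated in these simulators try to improve on.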
Benchmark suites crucial for evaluating computer system performance, selected based on their relevance and widespread use in academic research and industry.
- Graph500 - Graph-based workload benchmark for memory-bound and graph analytics research.
- MiBench - Benchmark suite for evaluating embedded systems performance.
- MLPerf - Industry-standard ML hardware benchmark suite for AI accelerator and GPU evaluation.
- NAS Parallel Benchmarks 3.4.2 - High-performance computing benchmarks for evaluating parallel system performance.
- PARSEC 3.0 - Benchmark suite for shared-memory computers, heavily used in multicore research.
- PolyBench/C 4.2 - Kernel benchmark suite for compilers and performance optimization.
- Rodinia v3.1 - Heterogeneous computing benchmark suite for GPU and heterogeneous architectures.
- SPEC CPU2017 - Standard benchmark suite for comprehensive CPU performance evaluation.
- SPLASH-3 - Shared-memory parallel programs for cache coherence and multicore studies.
- STREAM - Synthetic benchmark program that measures sustainable memory bandwidth.
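To make "sustainable memory bandwidth" concrete, the sketch below times a bulk copy between two large buffers and converts the best time into GB/s, counting one read and one write per byte in the way STREAM's Copy kernel does. It is only an illustration, not the STREAM benchmark: the function name `copy_bandwidth_gbs`, the default buffer size, and the best-of-N timing are assumptions, and the buffers are assumed to be much larger than the last-level cache.

```python
import time


def copy_bandwidth_gbs(n_bytes=64 * 1024 * 1024, repeats=5):
    """Estimate copy bandwidth in GB/s from the best of several timed copies."""
    src = bytearray(n_bytes)
    dst = bytearray(n_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # bulk copy done in C, analogous to c[i] = a[i]
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Each copy reads n_bytes and writes n_bytes.
    return (2 * n_bytes) / best / 1e9


if __name__ == "__main__":
    print(f"copy bandwidth: {copy_bandwidth_gbs():.2f} GB/s")
```

The number this prints depends on the interpreter, the allocator, and the machine, so it should be treated as a rough single-threaded estimate rather than a result comparable with published STREAM figures.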
Essential tools categorized for performance analysis, power measurement, design, and visualization.
- ARM Development Studio - Performance tools for ARM architectures.
- Intel VTune - Detailed performance analysis for Intel architectures.
- Linux perf - Profiling tool for Linux systems.
- CACTI - Access time, power, and area model for caches and SRAM/DRAM memories.
- Intel® PCM - Energy measurement tool for Intel processors.
- McPAT - Power, area, and timing modeling framework.
- Wattch - Power modeling integrated with cycle-level simulators.
- Bluespec SystemVerilog - High-level language for hardware design.
- Chisel - Hardware construction language for custom processors.
- Flame Graphs - Performance visualization tool.
- SpeedScope - Web-based profiling data analysis tool.
Official specifications and reference manuals for major instruction set architectures.
- AMD64 Architecture Programmer's Manual - AMD's x86-64 ISA reference.
- ARM Architecture Reference Manual - Complete ARMv8/v9 ISA reference.
- Intel x86 Software Developer Manuals - Full x86-64 ISA and microarchitecture reference.
- MIPS Architecture - Classic MIPS ISA reference, useful for teaching.
- RISC-V Spec - Official unprivileged and privileged ISA specifications.
Tools for compiler research, binary instrumentation, and code generation.
- Capstone - Lightweight multi-architecture disassembly framework for binary analysis and reverse engineering.
- DynamoRIO - Runtime code manipulation framework for dynamic analysis and custom instrumentation.
- GCC - GNU Compiler Collection with broad architecture support for cross-compilation and ISA-level code generation.
- Intel Pin - Dynamic binary instrumentation framework for trace generation, profiling, and microarch analysis.
- LLVM - Modular compiler infrastructure and IR framework for custom backends, pass development, and arch-aware optimization.
- Valgrind - Instrumentation framework for dynamic analysis, memory error detection, and cache profiling.
Memory, branch, and instruction traces used as inputs for simulators and predictor research.
- CRC2 Traces - Traces from the Cache Replacement Championship for replacement policy research.
- CVP-1 Traces - Instruction traces from the first Championship Value Prediction (CVP-1) for value predictor research.
- DPC-3 Traces - Traces from the Data Prefetching Championship for prefetcher design and evaluation.
- MLPerf Inference Traces - Inference workload traces for AI accelerator research and LLM hardware evaluation.
- SPEC CPU Traces - Instruction and memory traces derived from SPEC workloads for simulator input.
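As a minimal sketch of how an address trace can drive replacement policy research, the example below replays a list of byte addresses through a set-associative cache with LRU replacement and reports the miss rate. The function name `lru_miss_rate`, the cache geometry, and the plain list-of-integers trace are assumptions for illustration; the championship traces listed above use their own binary formats and are normally consumed by a full simulator.

```python
from collections import OrderedDict


def lru_miss_rate(addresses, num_sets=64, ways=4, line_bytes=64):
    """Replay byte addresses through a set-associative LRU cache model."""
    # One OrderedDict per set; insertion order tracks recency of use.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        index, tag = line % num_sets, line // num_sets
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)  # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
    return misses / len(addresses) if addresses else 0.0


if __name__ == "__main__":
    # Hypothetical trace: 8 cache lines accessed cyclically, twice over.
    trace = [i * 64 for i in range(8)] * 2
    print(lru_miss_rate(trace, num_sets=1, ways=4))  # working set exceeds the set
    print(lru_miss_rate(trace, num_sets=1, ways=8))  # working set fits
```

The cyclic trace shows a well-known weakness of LRU: when the working set is slightly larger than the cache, every access misses, which is one motivation for the alternative policies studied with these traces.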
Online courses and specific guides for learning computer architecture, catering to both beginners and graduate researchers.
- David Patterson RISC-V Lectures - Lectures by the co-inventor of RISC-V on modern ISA design.
- gem5 Bootcamp - Recorded sessions from the official gem5 bootcamp.
- gem5 Tutorials - Official tutorials for hands-on learning with the gem5 simulator.
- Georgia Tech HPCA - Advanced course on high-performance computer architecture.
- GPU Architecture Tutorial - Detailed explanation of GPU architecture internals.
- MIT 6.004 Computation Structures - Foundational digital design and architecture course.
- Princeton Computer Architecture - Comprehensive architecture course for graduate students.
- RISC-V Tutorials - Official learning resources for the open standard RISC-V ISA.
Tools and frameworks actively used in recent top-tier publications (ISCA, MICRO, HPCA, ASPLOS) to evaluate next-generation computing paradigms.
- ASTRA-sim - Distributed deep learning training simulator for modeling multi-GPU communication and network endpoints.
- Sparseloop - Analytical modeling framework for sparse tensor accelerators, covering hardware that exploits unstructured and structured sparsity.
- Timeloop + Accelergy - Joint performance and energy modeling for full design-space exploration of AI chips and DNN accelerators.
- Booksim2 - Cycle-accurate network-on-chip (NoC) simulator for modeling Network-on-Interposer (NoI) and chiplet routing.
- Garnet (gem5) - Detailed NoC model integrated into gem5 for on-chip network design and evaluation.
- CXL Consortium - Official CXL specification and resources for understanding CXL memory pooling and coherency protocols.
- ZSim+Ramulator - Processing-in-memory simulation framework that combines ZSim and Ramulator.
- Revizor - Microarchitectural fuzzing tool for automatically detecting hardware information leaks such as Spectre variants.
- SoftMC - FPGA-based DRAM testing framework for discovering and mitigating Rowhammer vulnerabilities in DDR4/DDR5.
Below is a curated list of research groups with sustained, high-impact contributions in top-tier computer architecture and systems venues such as ISCA, MICRO, HPCA, and ASPLOS.
- CMU CALCM – Computer Architecture Lab at Carnegie Mellon - The primary architecture hub at CMU. Historically and currently elite in storage systems, non-volatile memories, accelerators, parallel processing, and HW/SW co-design.
- Georgia Tech MSL – Memory Systems Lab - Led by Moinuddin Qureshi. World-class research focusing on cache/memory hierarchies, scalable memory systems, secure architecture, and quantum computing architecture.
- UC Berkeley SLICE Lab - Successor to the ADEPT Lab. Focuses on open-source silicon ecosystems (RISC-V), agile hardware design methodologies, ML accelerators, and warehouse-scale computing.
- UIUC I-ACOMA Lab - Led by Josep Torrellas. Renowned for work on extreme-scale architectures, parallel architectures, memory consistency models, and secure hardware.
- Cambridge Computer Architecture Group (CAG) - Specializes in manycore and interconnect architectures, memory systems, on-chip networks, and secure, low-power design.
- Cornell Computer Systems Laboratory (CSL) - A powerhouse for HW/SW co-design, datacenter networking microarchitecture, secure enclaves, scalable servers, and agile hardware.
- EPFL PARSA / EcoCloud - Focuses on datacenter and cloud server architectures, rack-scale computing, energy-efficient computing, and hardware for big data.
- ETH Zürich SAFARI Research Group - Led by Onur Mutlu. A highly prolific lab pioneering Processing-in-Memory (PIM), DRAM/NVM architectures, Rowhammer/hardware security, and bioinformatics acceleration.
- Georgia Tech Synergy Lab - Led by Tushar Krishna. Leading research in deep learning accelerators, Network-on-Chip (NoC), spatial architectures, and AI performance modeling.
- MIT CSG – Computer Systems Group - Highly influential in parallel architectures, spatial accelerators, cache coherence, secure hardware, and programmable ML hardware.
- Princeton Parallel Group - Led by David Wentzlaff. Known for manycore processors, scalable memory systems, datacenter/cloud architectures, and open-source hardware (e.g., OpenPiton).
- Stanford AHA – Agile Hardware Project - Pushing the boundaries of domain-specific accelerators, CGRAs, reconfigurable fabrics, memory systems, and agile hardware design tooling.
- Tsinghua PACMAN Group - Focuses on high-performance processors, accelerators for AI and HPC, memory systems, and heterogeneous systems design.
- UCLA VAST Lab - Led by Jason Cong. Pioneers in domain-specific computing, FPGA/heterogeneous acceleration, High-Level Synthesis (HLS), and automated ML hardware design tools.
- University of Michigan – Computer Engineering Lab (CE) - Home to multiple top PIs researching secure/trustworthy hardware, in-memory computing, energy-efficient architectures, and robust system design.
- University of Toronto – EECG - Contains multiple elite sub-groups. Globally recognized for energy-efficient ML systems, GPU microarchitecture, FPGA design, and memory for AI.
- UT Austin LCA – Laboratory for Computer Architecture - Elite research in core CPU microarchitecture, branch prediction, cache/memory hierarchies, GPU memory systems, and hardware reliability.
- University of Washington – Sampa Lab - Cutting-edge work in large-scale systems, approximate computing, DNA data storage, ML systems, and novel hardware substrates.
- UW–Madison Computer Architecture - A historically legendary hub for architecture. Focuses on classic and modern CPU microarchitecture, memory consistency models, heterogeneous computing, and tool development (e.g., gem5).
- Harvard Architecture, Circuits, and Compilers Group - Renowned for pioneering work in power/thermal-efficient architectures, edge AI hardware accelerators, autonomous systems, and HW/SW co-design.
- Universitat Politècnica de Catalunya (UPC) DAC - Deeply integrated with the Barcelona Supercomputing Center (BSC), conducting world-leading research in HPC, superscalar microarchitecture, vector processors, and memory systems.
- UC Berkeley BAR – Berkeley Architecture Research Group - Deeply influential in the RISC-V movement, vector architectures, out-of-order processor generators (BOOM), and SoC design tools.
- University of Edinburgh – ICSA - A leading European institute known for high-performance compilers, heterogeneous computing, low-power systems, and advanced microarchitecture.
- University of Manchester – APT Group - Famous for the SpiNNaker project, neuromorphic computing, and massive-scale many-core systems.
- TU Delft – Computer Engineering Lab - Leading European research in quantum computing control microarchitecture, memristor-based computing, and edge AI accelerators.
- KAIST – Computer Architecture & Systems (Navigate to Systems/Arch labs) - A dominant force in Asia producing highly influential papers in NVM, SSD architectures, AI accelerators, and main memory systems.
- Seoul National University – Computer Architecture & Embedded Systems (Navigate to labs) - World-class output in deep learning accelerators, GPU architectures, advanced memory structures, and hardware-software co-design.
Official Hall of Fame registries maintained by the top-tier conferences based on paper publication counts.
- HPCA Hall of Fame - Recognizes authors who have made significant contributions to HPCA over the years.
- ISCA Hall of Fame - Recognizes authors with 8 or more papers in the International Symposium on Computer Architecture.
- MICRO Hall of Fame - Recognizes authors with 8 or more papers in the International Symposium on Microarchitecture.
Contributions are welcome! Please see CONTRIBUTING.md for details on how to add new tools, benchmarks, or tutorials to this list.