diff --git a/README.es.md b/README.es.md
index 64eb1bf..2c72258 100644
--- a/README.es.md
+++ b/README.es.md
@@ -24,6 +24,13 @@ Pools de buffers acotados sin bloqueos y eficientes en memoria para Go, optimiza
 - **Generación de IoVec sin copia** para llamadas al sistema de I/O vectorizado.
 - **Retroceso cooperativo**: Usa `iox.Backoff` para manejar el agotamiento de recursos con elegancia.
 
+## Requisitos
+
+- **Go 1.25+**
+- **CPU de 64 bits** (amd64, arm64, riscv64, loong64, ppc64, s390x, mips64, etc.)
+
+> **Nota:** Las arquitecturas de 32 bits no son compatibles debido a las operaciones atómicas de 64 bits en la implementación del pool sin bloqueos.
+
 ## Instalación
 
 ```bash
@@ -79,18 +86,22 @@ addr, n := iobuf.IoVecAddrLen(iovecs)
 
 ## Niveles de Buffer
 
-Progresión de potencias de 4, comenzando en 16 bytes:
+Progresión de potencias de 4, comenzando en 32 bytes (12 niveles, 32 B a 128 MiB):
 
 | Nivel | Tamaño | Caso de Uso |
 |-------|--------|-------------|
-| Pico | 16 B | Metadatos pequeños, flags |
-| Nano | 64 B | Cabeceras pequeñas, tokens |
-| Micro | 256 B | Cabeceras de protocolo |
-| Small | 1 KiB | Mensajes pequeños |
-| Medium | 4 KiB | I/O de tamaño de página |
-| Large | 16 KiB | Transferencias grandes |
-| Huge | 64 KiB | UDP máximo |
-| Giant | 256 KiB | I/O masivo, cargas grandes |
+| Pico | 32 B | UUIDs, flags, mensajes de control pequeños |
+| Nano | 128 B | Cabeceras HTTP, tokens JSON, payloads RPC pequeños |
+| Micro | 512 B | Paquetes DNS, mensajes MQTT, tramas de protocolo |
+| Small | 2 KiB | Frames WebSocket, respuestas HTTP pequeñas |
+| Medium | 8 KiB | Segmentos TCP, mensajes gRPC, I/O de página |
+| Big | 32 KiB | Registros TLS (máx 16 KiB), chunks de stream |
+| Large | 128 KiB | Buffer rings io_uring, transferencias de red masivas |
+| Great | 512 KiB | Páginas de base de datos, respuestas API grandes |
+| Huge | 2 MiB | Alineado a huge pages, archivos mapeados en memoria |
+| Vast | 8 MiB | Procesamiento de imágenes, archivos comprimidos |
+| Giant | 32 MiB | Frames de video, pesos de modelos ML |
+| Titan | 128 MiB | Datasets grandes, buffer máximo seguro para stack |
 
 ## Resumen de API
 
@@ -119,9 +130,13 @@ func NewNanoBufferPool(capacity int) *NanoBufferBoundedPool
 func NewMicroBufferPool(capacity int) *MicroBufferBoundedPool
 func NewSmallBufferPool(capacity int) *SmallBufferBoundedPool
 func NewMediumBufferPool(capacity int) *MediumBufferBoundedPool
+func NewBigBufferPool(capacity int) *BigBufferBoundedPool
 func NewLargeBufferPool(capacity int) *LargeBufferBoundedPool
+func NewGreatBufferPool(capacity int) *GreatBufferBoundedPool
 func NewHugeBufferPool(capacity int) *HugeBufferBoundedPool
+func NewVastBufferPool(capacity int) *VastBufferBoundedPool
 func NewGiantBufferPool(capacity int) *GiantBufferBoundedPool
+func NewTitanBufferPool(capacity int) *TitanBufferBoundedPool
 ```
 
 ### Asignación de Memoria
diff --git a/README.fr.md b/README.fr.md
index 88f0c94..059db80 100644
--- a/README.fr.md
+++ b/README.fr.md
@@ -24,6 +24,13 @@ Pools de buffers bornés sans verrou et économes en mémoire pour Go, optimisé
 - **Génération IoVec sans copie** pour les appels système d'I/O vectorisées.
 - **Recul coopératif** : Utilise `iox.Backoff` pour gérer l'épuisement des ressources avec élégance.
 
+## Prérequis
+
+- **Go 1.25+**
+- **CPU 64 bits** (amd64, arm64, riscv64, loong64, ppc64, s390x, mips64, etc.)
+
+> **Note :** Les architectures 32 bits ne sont pas prises en charge en raison des opérations atomiques 64 bits dans l'implémentation du pool sans verrou.
+
 ## Installation
 
 ```bash
@@ -79,18 +86,22 @@ addr, n := iobuf.IoVecAddrLen(iovecs)
 
 ## Niveaux de Buffer
 
-Progression en puissances de 4, à partir de 16 octets :
+Progression en puissances de 4, à partir de 32 octets (12 niveaux, 32 o à 128 Mio) :
 
 | Niveau | Taille | Cas d'Usage |
 |--------|--------|-------------|
-| Pico | 16 o | Petites métadonnées, drapeaux |
-| Nano | 64 o | Petits en-têtes, jetons |
-| Micro | 256 o | En-têtes de protocole |
-| Small | 1 Kio | Petits messages |
-| Medium | 4 Kio | I/O de taille de page |
-| Large | 16 Kio | Grands transferts |
-| Huge | 64 Kio | UDP maximum |
-| Giant | 256 Kio | I/O en masse, grandes charges |
+| Pico | 32 o | UUIDs, drapeaux, petits messages de contrôle |
+| Nano | 128 o | En-têtes HTTP, jetons JSON, petits payloads RPC |
+| Micro | 512 o | Paquets DNS, messages MQTT, trames de protocole |
+| Small | 2 Kio | Frames WebSocket, petites réponses HTTP |
+| Medium | 8 Kio | Segments TCP, messages gRPC, I/O de page |
+| Big | 32 Kio | Enregistrements TLS (max 16 Kio), chunks de flux |
+| Large | 128 Kio | Anneaux de tampon io_uring, transferts réseau massifs |
+| Great | 512 Kio | Pages de base de données, grandes réponses API |
+| Huge | 2 Mio | Aligné sur huge pages, fichiers mappés en mémoire |
+| Vast | 8 Mio | Traitement d'images, archives compressées |
+| Giant | 32 Mio | Frames vidéo, poids de modèles ML |
+| Titan | 128 Mio | Grands ensembles de données, buffer max sûr pour pile |
 
 ## Aperçu de l'API
 
@@ -119,9 +130,13 @@ func NewNanoBufferPool(capacity int) *NanoBufferBoundedPool
 func NewMicroBufferPool(capacity int) *MicroBufferBoundedPool
 func NewSmallBufferPool(capacity int) *SmallBufferBoundedPool
 func NewMediumBufferPool(capacity int) *MediumBufferBoundedPool
+func NewBigBufferPool(capacity int) *BigBufferBoundedPool
 func NewLargeBufferPool(capacity int) *LargeBufferBoundedPool
+func NewGreatBufferPool(capacity int) *GreatBufferBoundedPool
 func NewHugeBufferPool(capacity int) *HugeBufferBoundedPool
+func NewVastBufferPool(capacity int) *VastBufferBoundedPool
 func NewGiantBufferPool(capacity int) *GiantBufferBoundedPool
+func NewTitanBufferPool(capacity int) *TitanBufferBoundedPool
 ```
 
 ### Allocation Mémoire
diff --git a/README.ja.md b/README.ja.md
index 36d006e..5a366d8 100644
--- a/README.ja.md
+++ b/README.ja.md
@@ -24,6 +24,13 @@
 - **ゼロコピーIoVec生成**:ベクトル化I/Oシステムコール用。
 - **協調的バックオフ**:`iox.Backoff` を使用してリソース枯渇を優雅に処理。
 
+## システム要件
+
+- **Go 1.25+**
+- **64ビットCPU**(amd64、arm64、riscv64、loong64、ppc64、s390x、mips64など)
+
+> **注意:** ロックフリープール実装で64ビットアトミック操作を使用しているため、32ビットアーキテクチャはサポートされていません。
+
 ## インストール
 
 ```bash
@@ -79,18 +86,22 @@ addr, n := iobuf.IoVecAddrLen(iovecs)
 
 ## バッファ階層
 
-16バイトから始まる4の累乗で増加:
+32バイトから始まる4の累乗で増加(12階層、32 B から 128 MiB):
 
 | 階層 | サイズ | 用途 |
 |------|--------|------|
-| Pico | 16 B | 小さなメタデータ、フラグ |
-| Nano | 64 B | 小さなヘッダ、トークン |
-| Micro | 256 B | プロトコルヘッダ |
-| Small | 1 KiB | 小さなメッセージ |
-| Medium | 4 KiB | ページサイズI/O |
-| Large | 16 KiB | 大きな転送 |
-| Huge | 64 KiB | 最大UDP |
-| Giant | 256 KiB | バルクI/O、大きなペイロード |
+| Pico | 32 B | UUID、フラグ、小さな制御メッセージ |
+| Nano | 128 B | HTTPヘッダ、JSONトークン、小さなRPCペイロード |
+| Micro | 512 B | DNSパケット、MQTTメッセージ、プロトコルフレーム |
+| Small | 2 KiB | WebSocketフレーム、小さなHTTPレスポンス |
+| Medium | 8 KiB | TCPセグメント、gRPCメッセージ、ページI/O |
+| Big | 32 KiB | TLSレコード(最大16 KiB)、ストリームチャンク |
+| Large | 128 KiB | io_uringバッファリング、バルクネットワーク転送 |
+| Great | 512 KiB | データベースページ、大規模APIレスポンス |
+| Huge | 2 MiB | ヒュージページ整列、メモリマップファイル |
+| Vast | 8 MiB | 画像処理、圧縮アーカイブ |
+| Giant | 32 MiB | ビデオフレーム、MLモデル重み |
+| Titan | 128 MiB | 大規模データセット、最大スタック安全バッファ |
 
 ## API概要
 
@@ -119,9 +130,13 @@ func NewNanoBufferPool(capacity int) *NanoBufferBoundedPool
 func NewMicroBufferPool(capacity int) *MicroBufferBoundedPool
 func NewSmallBufferPool(capacity int) *SmallBufferBoundedPool
 func NewMediumBufferPool(capacity int) *MediumBufferBoundedPool
+func NewBigBufferPool(capacity int) *BigBufferBoundedPool
 func NewLargeBufferPool(capacity int) *LargeBufferBoundedPool
+func NewGreatBufferPool(capacity int) *GreatBufferBoundedPool
 func NewHugeBufferPool(capacity int) *HugeBufferBoundedPool
+func NewVastBufferPool(capacity int) *VastBufferBoundedPool
 func NewGiantBufferPool(capacity int) *GiantBufferBoundedPool
+func NewTitanBufferPool(capacity int) *TitanBufferBoundedPool
 ```
 
 ### メモリ割り当て
diff --git a/README.md b/README.md
index a710a7e..be65c8b 100644
--- a/README.md
+++ b/README.md
@@ -24,6 +24,13 @@ English | [简体中文](README.zh-CN.md) | [Español](README.es.md) | [日本
 - **Zero-copy IoVec generation** for vectored I/O syscalls.
 - **Cooperative back-off**: Uses `iox.Backoff` to handle resource exhaustion gracefully.
 
+## Requirements
+
+- **Go 1.25+**
+- **64-bit CPU** (amd64, arm64, riscv64, loong64, ppc64, s390x, mips64, etc.)
+
+> **Note:** 32-bit architectures are not supported due to 64-bit atomic operations in the lock-free pool implementation.
+
 ## Installation
 
 ```bash
@@ -35,7 +42,7 @@ go get code.hybscloud.com/iobuf
 ### Buffer Pools
 
 ```go
-// Create a pool of 1024 small buffers (1 KiB each)
+// Create a pool of 1024 small buffers (2 KiB each)
 pool := iobuf.NewSmallBufferPool(1024)
 pool.Fill(iobuf.NewSmallBuffer)
 
@@ -79,18 +86,22 @@ addr, n := iobuf.IoVecAddrLen(iovecs)
 
 ## Buffer Tiers
 
-Power-of-4 progression starting at 16 bytes:
+Power-of-4 progression starting at 32 bytes (12 tiers, 32 B to 128 MiB):
 
 | Tier | Size | Use Case |
 |------|------|----------|
-| Pico | 16 B | Tiny metadata, flags |
-| Nano | 64 B | Small headers, tokens |
-| Micro | 256 B | Protocol headers |
-| Small | 1 KiB | Small messages |
-| Medium | 4 KiB | Page-sized I/O |
-| Large | 16 KiB | Large transfers |
-| Huge | 64 KiB | Maximum UDP |
-| Giant | 256 KiB | Bulk I/O, large payloads |
+| Pico | 32 B | UUIDs, flags, tiny control messages |
+| Nano | 128 B | HTTP headers, JSON tokens, small RPC payloads |
+| Micro | 512 B | DNS packets, MQTT messages, protocol frames |
+| Small | 2 KiB | WebSocket frames, small HTTP responses |
+| Medium | 8 KiB | TCP segments, gRPC messages, page I/O |
+| Big | 32 KiB | TLS records (16 KiB max), stream chunks |
+| Large | 128 KiB | io_uring buffer rings, bulk network transfers |
+| Great | 512 KiB | Database pages, large API responses |
+| Huge | 2 MiB | Huge page aligned, memory-mapped files |
+| Vast | 8 MiB | Image processing, compressed archives |
+| Giant | 32 MiB | Video frames, ML model weights |
+| Titan | 128 MiB | Large datasets, maximum stack-safe buffer |
 
 ## API Overview
 
@@ -119,17 +130,27 @@ func NewNanoBufferPool(capacity int) *NanoBufferBoundedPool
 func NewMicroBufferPool(capacity int) *MicroBufferBoundedPool
 func NewSmallBufferPool(capacity int) *SmallBufferBoundedPool
 func NewMediumBufferPool(capacity int) *MediumBufferBoundedPool
+func NewBigBufferPool(capacity int) *BigBufferBoundedPool
 func NewLargeBufferPool(capacity int) *LargeBufferBoundedPool
+func NewGreatBufferPool(capacity int) *GreatBufferBoundedPool
 func NewHugeBufferPool(capacity int) *HugeBufferBoundedPool
+func NewVastBufferPool(capacity int) *VastBufferBoundedPool
 func NewGiantBufferPool(capacity int) *GiantBufferBoundedPool
+func NewTitanBufferPool(capacity int) *TitanBufferBoundedPool
 ```
 
 ### Memory Allocation
 
 ```go
+// Page-aligned memory
 func AlignedMem(size int, pageSize uintptr) []byte
 func AlignedMemBlocks(n int, pageSize uintptr) [][]byte
 func AlignedMemBlock() []byte
+
+// Cache-line-aligned memory (prevents false sharing)
+func CacheLineAlignedMem(size int) []byte
+func CacheLineAlignedMemBlocks(n int, blockSize int) [][]byte
+const CacheLineSize // 64 or 128 depending on architecture
 ```
 
 ### IoVec Generation
diff --git a/README.zh-CN.md b/README.zh-CN.md
index 1823d4d..a662ad0 100644
--- a/README.zh-CN.md
+++ b/README.zh-CN.md
@@ -24,6 +24,13 @@
 - **零拷贝 IoVec 生成**:用于向量化 I/O 系统调用。
 - **协作式退避**:使用 `iox.Backoff` 优雅处理资源耗尽。
 
+## 系统要求
+
+- **Go 1.25+**
+- **64 位 CPU**(amd64、arm64、riscv64、loong64、ppc64、s390x、mips64 等)
+
+> **注意:** 由于无锁池实现中使用 64 位原子操作,不支持 32 位架构。
+
 ## 安装
 
 ```bash
@@ -79,18 +86,22 @@ addr, n := iobuf.IoVecAddrLen(iovecs)
 
 ## 缓冲区层级
 
-4 的幂次递增,从 16 字节开始:
+4 的幂次递增,从 32 字节开始(12 层,32 B 到 128 MiB):
 
 | 层级 | 大小 | 用途 |
 |------|------|------|
-| Pico | 16 B | 微型元数据、标志位 |
-| Nano | 64 B | 小型头部、令牌 |
-| Micro | 256 B | 协议头部 |
-| Small | 1 KiB | 小型消息 |
-| Medium | 4 KiB | 页大小 I/O |
-| Large | 16 KiB | 大型传输 |
-| Huge | 64 KiB | 最大 UDP |
-| Giant | 256 KiB | 批量 I/O、大型负载 |
+| Pico | 32 B | UUID、标志、微型控制消息 |
+| Nano | 128 B | HTTP 头部、JSON 令牌、小型 RPC 载荷 |
+| Micro | 512 B | DNS 数据包、MQTT 消息、协议帧 |
+| Small | 2 KiB | WebSocket 帧、小型 HTTP 响应 |
+| Medium | 8 KiB | TCP 分段、gRPC 消息、页 I/O |
+| Big | 32 KiB | TLS 记录(最大 16 KiB)、流块 |
+| Large | 128 KiB | io_uring 缓冲环、批量网络传输 |
+| Great | 512 KiB | 数据库页、大型 API 响应 |
+| Huge | 2 MiB | 大页对齐、内存映射文件 |
+| Vast | 8 MiB | 图像处理、压缩归档 |
+| Giant | 32 MiB | 视频帧、机器学习模型权重 |
+| Titan | 128 MiB | 大型数据集、最大栈安全缓冲区 |
 
 ## API 概览
 
@@ -119,9 +130,13 @@ func NewNanoBufferPool(capacity int) *NanoBufferBoundedPool
 func NewMicroBufferPool(capacity int) *MicroBufferBoundedPool
 func NewSmallBufferPool(capacity int) *SmallBufferBoundedPool
 func NewMediumBufferPool(capacity int) *MediumBufferBoundedPool
+func NewBigBufferPool(capacity int) *BigBufferBoundedPool
 func NewLargeBufferPool(capacity int) *LargeBufferBoundedPool
+func NewGreatBufferPool(capacity int) *GreatBufferBoundedPool
 func NewHugeBufferPool(capacity int) *HugeBufferBoundedPool
+func NewVastBufferPool(capacity int) *VastBufferBoundedPool
 func NewGiantBufferPool(capacity int) *GiantBufferBoundedPool
+func NewTitanBufferPool(capacity int) *TitanBufferBoundedPool
 ```
 
 ### 内存分配
diff --git a/doc.go b/doc.go
new file mode 100644
index 0000000..b1411fb
--- /dev/null
+++ b/doc.go
@@ -0,0 +1,97 @@
+// ©Hayabusa Cloud Co., Ltd. 2025. All rights reserved.
+// Use of this source code is governed by a MIT-style
+// license that can be found in the LICENSE file.
+
+// Package iobuf provides lock-free buffer pools and memory management utilities
+// for high-performance I/O operations.
+//
+// The package implements a 12-tier buffer size hierarchy and lock-free bounded
+// pools optimized for zero-allocation hot paths. All pools use semantic error
+// types from iox for non-blocking control flow.
+//
+// # Buffer Tiers
+//
+// Buffers are organized into 12 size tiers following a power-of-4 progression:
+//
+//	Tier    Size      Use Case
+//	────    ────      ────────
+//	Pico    32 B      Tiny metadata, flags
+//	Nano    128 B     Small headers, control frames
+//	Micro   512 B     Protocol frames, small messages
+//	Small   2 KiB     Typical network packets
+//	Medium  8 KiB     Stream buffers, large packets
+//	Big     32 KiB    TLS records, stream chunks
+//	Large   128 KiB   io_uring buffer rings
+//	Great   512 KiB   Large transfers
+//	Huge    2 MiB     Huge page aligned buffers
+//	Vast    8 MiB     Large file chunks
+//	Giant   32 MiB    Video frames, datasets
+//	Titan   128 MiB   Maximum allocation tier
+//
+// Each tier has corresponding type aliases (e.g., SmallBuffer, LargeBuffer) and
+// factory functions for bounded pools (e.g., NewSmallBufferPool).
+//
+// # Bounded Pool
+//
+// BoundedPool is a lock-free multi-producer multi-consumer (MPMC) pool based on
+// the algorithm from "A Scalable, Portable, and Memory-Efficient Lock-Free FIFO
+// Queue" (Ruslan Nikolaev, 2019). Key characteristics:
+//
+//   - Lock-free: Uses atomic CAS operations, no mutexes
+//   - Bounded: Fixed capacity rounded to power of two
+//   - Memory-efficient: Single contiguous array, no per-element allocation
+//   - Cache-optimized: Aligned to cache line boundaries to prevent false sharing
+//
+// # Indirect Pool Pattern
+//
+// Pools store indices (int) rather than buffer values directly. This enables:
+//
+//   - Zero-copy access via Value(indirect) method
+//   - Efficient pool operations without moving large buffers
+//   - Clear ownership semantics through index hand-off
+//
+// Usage pattern:
+//
+//	pool := NewSmallBufferPool(100)  // Creates pool with ~128 capacity
+//	pool.Fill(NewSmallBuffer)        // Initialize with buffer factory
+//	idx, err := pool.Get()           // Acquire buffer index
+//	if err != nil {
+//		// Handle iox.ErrWouldBlock (pool empty)
+//	}
+//	buf := pool.Value(idx)           // Access buffer by index
+//	// Use buf[:]...
+//	pool.Put(idx)                    // Return buffer to pool
+//
+// # Page-Aligned Memory
+//
+// For DMA and io_uring operations requiring page alignment:
+//
+//	mem := AlignedMem(4096, PageSize)         // Returns page-aligned []byte
+//	block := AlignedMemBlock()                // Single page using default PageSize
+//	blocks := AlignedMemBlocks(16, PageSize)  // Multiple aligned blocks
+//
+// # Vectored I/O
+//
+// IoVec provides scatter/gather I/O support for readv/writev syscalls:
+//
+//	buffers := make([]SmallBuffer, 8)
+//	iovecs := IoVecFromSmallBuffers(buffers)
+//	addr, n := IoVecAddrLen(iovecs)  // Get pointer for syscall
+//
+// # Architecture Requirements
+//
+// This package requires a 64-bit CPU architecture (amd64, arm64, riscv64, loong64,
+// ppc64, ppc64le, s390x, mips64, mips64le). 32-bit architectures are not supported
+// due to 64-bit atomic operations in BoundedPool.
+//
+// # Thread Safety
+//
+// All pool operations are safe for concurrent use. BoundedPool supports multiple
+// concurrent producers and consumers without external synchronization.
+//
+// # Dependencies
+//
+// iobuf depends on:
+//   - iox: Semantic error types (ErrWouldBlock, ErrMore)
+//   - spin: Spinlock and spin-wait primitives for backpressure
+package iobuf
diff --git a/types.go b/types.go
index b8a20ca..b0bf6a8 100644
--- a/types.go
+++ b/types.go
@@ -6,10 +6,19 @@ package iobuf
 
 import "net"
 
-// PageSize defines the standard memory page size (4 KiB) used for alignment.
+// PageSize is the memory page size used for aligned allocations.
+//
+// The default value (4 KiB) matches the typical x86-64 and ARM64 page size.
+// Use SetPageSize to configure for systems with different page sizes.
 var PageSize uintptr = 4096
 
-// SetPageSize updates the package-level page size used for allocations.
+// SetPageSize updates the package-level page size used for aligned allocations.
+//
+// This should be called once during initialization, before any calls to
+// AlignedMem or AlignedMemBlocks. Common values:
+//   - 4096 (4 KiB): Standard x86-64, ARM64
+//   - 16384 (16 KiB): Some ARM64 configurations (Apple Silicon)
+//   - 65536 (64 KiB): Some embedded systems
 func SetPageSize(size int) {
 	PageSize = uintptr(size)
 }
@@ -18,7 +27,15 @@ func SetPageSize(size int) {
 // multiple byte slices for vectored I/O operations.
 type Buffers = net.Buffers
 
-// noCopy is a sentinel used to prevent copying of synchronization primitives.
+// noCopy is a sentinel type that triggers "go vet" warnings when a
+// containing struct is copied by value.
+//
+// Embedding this type in a struct (e.g., BoundedPool) causes go vet to
+// report "copies lock value" when the struct is passed by value or assigned.
+// This is a compile-time safety mechanism for types that must not be copied.
+//
+// The Lock/Unlock methods satisfy the sync.Locker interface, which is
+// the detection mechanism used by go vet's copylock analyzer.
 type noCopy struct{}
 
 func (*noCopy) Lock() {}
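
The tier table in the READMEs and doc.go, and the `~128 capacity` comment in the doc.go usage pattern, both come down to simple arithmetic. The following is a minimal Go sketch for checking those figures; it is not code from iobuf. `tierSize` assumes each tier is exactly four times the previous one, starting at 32 B, and `roundUpPow2` is a hypothetical helper showing one way "rounded to power of two" could work. The library's actual rounding logic does not appear in this diff.

```go
package main

import (
	"fmt"
	"math/bits"
)

// tierNames lists the 12 tiers in the order used by the README table.
var tierNames = []string{
	"Pico", "Nano", "Micro", "Small", "Medium", "Big",
	"Large", "Great", "Huge", "Vast", "Giant", "Titan",
}

// tierSize returns the byte size of the i-th tier (0-based), assuming the
// first tier is 32 bytes and every later tier is 4x the one before it.
func tierSize(i int) int {
	return 32 << (2 * uint(i)) // 32 * 4^i
}

// roundUpPow2 is a hypothetical helper (not part of iobuf) that rounds n up
// to the nearest power of two; values <= 1 map to 1.
func roundUpPow2(n int) int {
	if n <= 1 {
		return 1
	}
	return 1 << bits.Len(uint(n-1))
}

func main() {
	// Print every tier with its computed size in bytes.
	for i, name := range tierNames {
		fmt.Printf("%-6s %d B\n", name, tierSize(i))
	}
	// Requested capacity of 100, as in the doc.go usage pattern.
	fmt.Println("capacity for 100:", roundUpPow2(100))
}
```

Under those assumptions the twelfth tier (Titan) comes out at 2^27 bytes, which is 128 MiB, and a requested capacity of 100 rounds up to 128, matching the figures stated in the diff.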