Add onboarding guide (Chinese) #109
62a422f to 98f99d7
The <pre> element should have a background that sets it apart from the page background.
Let the highlighting themes handle this.
This does not affect existing cards (omitting `title` is allowed):
{% card(type="warning") %}
Writing a person's name with a red pen is considered disrespectful.
{% end %}
To use an alternative title, specify a title in the `card()` template
call:
{% card(type="warning", title="ACHTUNG") %}
Rauchen ist hier verboten.
{% end %}
This section is dedicated to the onboarding guide for AOSC OS development.
98f99d7 to 914edbe
* Fix hardcoded width. Use the relative size of the parent container.
* Force images to be blocks to allow positioning.
fc60184 to 8b02363
stdmnpkg left a comment
Should the corresponding autobuild4 variables be introduced as well, such as CFLAGS, USECLANG, NOLTO, AB_FLAGS_O3, and AB_FLAGS_PIC?
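| - `-march=arrowlake -mtune=arrowlake`: generates code targeting the instruction sets supported by the Arrow Lake microarchitecture and tunes for Arrow Lake; the resulting program uses instructions from the sets Arrow Lake supports (for example AVX2) wherever possible | ||
| - `-march=armv9-a -mtune=cortex-a710`: generates code targeting the ARMv9.0-A instruction set, tuned for processors based on the Cortex-A710 core (earlier cores do not support ARMv9.0-A) | ||
|
| However, there is one exception within the Intel processor family: |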
An exception relative to what norm?
Early x86 also has a similar exception: the 3DNow! extension.
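| - [ARM AArch64][gcc-aarch64-options]: the AArch64 backend does not yet provide switches of the form `-m<extension name>`. To control instruction set extensions precisely, pass an argument of the form `-march=armv8-a+[no]<extension name>` | ||
| - [LoongArch64][gcc-loongarch64-options]: `-mlsx` and `-mlasx` | ||
|
| The architecture fine-tuning switches preset by a distribution generally include only `-march=` (or `-mcpu=`) and `-mtune=`. Some architectures may additionally need flags that work around architecture-specific problems, such as `-mfix-loongson-llsc` (works around the atomic operation problem on the MIPS-based Loongson 3). Because the remaining options are numerous and most of them play no part in defining the distribution baseline, they are not covered in detail. |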
Only two architectures implement this option, and the corresponding features are controlled by separate -m switches, so it is neither mentioned nor introduced here.
On LoongArch -msimd= is never intended to be used by end users. It's basically some internal convention between gcc and cc1.
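A minimal sketch of how the switches in this hunk become visible inside a translation unit: GCC and Clang predefine feature macros according to `-march=` and the extension switches, so a C file can report what the command line enabled. The helper names `simd_baseline` and `print_simd_baseline` are hypothetical; the macros (`__AVX2__`, `__ARM_FEATURE_SVE`, `__loongarch_asx`, `__loongarch_sx`) are defined by the compiler.

```c
#include <stdio.h>

/* Hypothetical helper: names the widest SIMD extension, among the few
 * checked here, that the -march=/-m<ext> flags enabled for this file. */
const char *simd_baseline(void) {
#if defined(__AVX2__)
    return "x86 AVX2";
#elif defined(__ARM_FEATURE_SVE)
    return "AArch64 SVE";
#elif defined(__loongarch_asx)
    return "LoongArch LASX";
#elif defined(__loongarch_sx)
    return "LoongArch LSX";
#else
    return "none of the checked extensions";
#endif
}

/* Prints the result so the effect of different flags can be compared. */
void print_simd_baseline(void) {
    printf("SIMD baseline: %s\n", simd_baseline());
}
```

Building the same file once with the distribution defaults and once with, say, `-mavx2`, `-march=armv8-a+sve`, or `-mlasx` should change the reported string, which is also why such a binary no longer runs on processors that lack the extension.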
| } | ||
| ``` | ||
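|
| Software projects shipped in the distribution enable other warning switches, and some projects that demand a certain level of code quality force all warnings to be treated as errors in order to reduce unintentionally non-conforming behavior in the code. GCC provides hundreds of warning options covering every aspect of the code. Because the warning options are numerous and all of them catch common classes of errors, only the following are introduced here: |
| - Compiler behavior control: options that affect compiler behavior, such as selecting the language standard to follow (`-std=`) or treating all warnings as errors (`-Werror`) | ||
|
| {% card(type="tips") %} | ||
| For options of the same kind repeated on the command line, the later one overrides the earlier one: |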
Some options, such as -Wall and -Wextra, turn on other options, such as -Wunused, which may turn on further options, such as -Wunused-variable. The combined effect of positive and negative forms is that more specific options have priority over less specific ones, independently of their position in the command line. For options of the same specificity, the last one takes effect. Options enabled or disabled via pragmas (see Diagnostic Pragmas) take effect as if they appeared at the end of the command line.
Quoted from https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html (a supplementary explanation is still needed in the guide).
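To make the quoted rule concrete, here is a small sketch with one deliberately unused variable. The function name `answer` and the file name `warn.c` used below are hypothetical.

```c
/* `scratch` is deliberately left unused so that -Wunused-variable
 * (which -Wall turns on through -Wunused) has something to report. */
int answer(void) {
    int scratch;
    return 42;
}
```

Going by the documentation quoted above, `gcc -c -Wno-unused-variable -Wall warn.c` should stay silent about `scratch`, because the more specific `-Wno-unused-variable` beats the less specific `-Wall` regardless of position. With `gcc -c -Wno-unused-variable -Wunused-variable warn.c` both options have the same specificity, so the last one wins and the warning appears.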
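| - `-fpic`: generates position-independent code using the small code model (the GOT size is limited) | ||
| - `-fPIC`: generates position-independent code using the large code model (supports an unlimited GOT size) | ||
| - `-fPIE`: instructs the linker to produce a relocatable binary executable |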
IMO we should only say something like "the difference between -fpic and -fPIC, or -fpie and -fPIE depends on the architecture; in general always use the upper-case one." For example, on MIPS64 -fPIC does not help at all for GOT size (you actually need -mxgot).
Actually -fPIC stands for "position independent code suitable for SVR4-style shared objects" instead of simply "position independent code." On the contrary -fPIE really stands for "position independent code." They are some unfortunate historical terminologies confusing all the people due to the fact that position independent code was only used for SVR4 shared objects 20 or 30 years ago. Let's not repeat the same confusion.
Only compiler- and toolchain-related content is covered here; the Autobuild control switches are introduced separately later in the guide.
| - `llc`: the LLVM intermediate representation (LLVM IR) compiler | ||
| - `lld`: the LLVM project's linker implementation | ||
| - `clang`: the C/C++ compiler frontend |
Should an extra note be added that LLVM does not include a standalone assembler or assembly step, and that only clang contains a built-in assembler implementation?
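| - `libc++`: LLVM's C++ standard library implementation | ||
| - `compiler-rt`: the LLVM project's compiler runtime library implementation (the counterpart of `libgcc`) | ||
|
| Compared with GCC, LLVM [supports fewer architectures][llvm-targets]. However, the LLVM compiler suite is very flexible, and a single build of LLVM can support multiple architectures. Even so, compiling programs for another architecture still requires that architecture's Binutils, C standard runtime library, and other components. |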
still requires that architecture's Binutils
Source, please? As far as I know, LLVM can do without binutils when lld is used as the linker.
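| LTO can improve the performance of binaries to some extent, but it may also over-optimize and cause abnormal program behavior. Because LTO hands all the binary files that make up the program to the compiler during the final link, LTO therefore significantly increases compile time, as well as the memory required. Running LTO on large projects such as web browsers typically requires tens or even hundreds of GB of memory. | ||
|
| GCC and Clang can enable LTO with `-flto`. Likewise, `-fno-lto` tells the compiler explicitly not to enable LTO. |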
Is it necessary to mention how the build artifacts differ when LTO is enabled (especially when LLVM codegen is used?), and the resulting difficulty of cross-toolchain LTO?
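| ### Position-independent code | ||
|
| Position-independent code can be loaded at a random address in memory without affecting its branches, jumps, static variable references, or function calls. Because all branch addresses are relative offsets, the addresses of static variables, pointers, and function calls have to be obtained through the Global Offset Table, replacing the absolute addresses hard-coded at link time. This offers some protection against return-to-libc attacks. |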
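Because all branch addresses are relative offsets, the addresses of static variables, pointers, and function calls have to be obtained through the Global Offset Table, replacing the absolute addresses hard-coded at link time.
Is there a factual error here? Only symbols whose offset cannot be determined before run time need to be addressed through the GOT or reached indirectly through the PLT. For symbols whose offset is already known (not crossing a DSO boundary and not preemptible), the compiler/linker may choose to address them directly with PC-relative instructions.
Suggested rewording:
Position-independent code can be loaded at an arbitrary address in memory without affecting how its branches, jumps, static variable references, and function calls compute symbol addresses. In this case the absolute addresses of symbols are not hard-coded at link time. Instead, the code uses addresses computed from an offset relative to the current code position (for symbols whose offset can be determined at build time), or addresses fetched from the Global Offset Table (GOT) (for symbols whose relative offset can only be determined at run time).
Being executable at any location lets position-independent code undergo a degree of Address Space Layout Randomization (ASLR) when it is loaded, which offers some protection against attacks that need to know the address of a particular symbol, such as return-to-libc attacks.
This would also spell out the relationship between PIC, ASLR, and return-to-libc.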
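The distinction drawn in this comment can be sketched with two variables. This is an illustration under assumptions, not a statement about every target: the names are hypothetical, and the addressing described in the comments is what one would typically see on x86-64 ELF when the file is built with `-fPIC` for a shared object.

```c
/* Internal linkage: the symbol cannot be preempted by another DSO, so its
 * offset from the code is fixed at link time and the compiler can use
 * PC-relative addressing. */
static int local_hits;

/* Default visibility: in a shared object this symbol may be preempted at
 * load time, so -fPIC code typically loads its address from the GOT. */
int shared_hits;

/* Touches both variables so that both access patterns show up in the
 * generated assembly. */
int record_hit(void) {
    local_hits += 1;
    shared_hits += 1;
    return local_hits + shared_hits;
}
```

Comparing the output of `gcc -O2 -fPIC -S` with and without `-fvisibility=hidden` is one way to see the GOT access for `shared_hits` appear and disappear.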
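| - `-fpic`: generates position-independent code using the small code model (the GOT size is limited) | ||
| - `-fPIC`: generates position-independent code using the large code model (supports an unlimited GOT size) | ||
| - `-fPIE`: instructs the linker to produce a relocatable binary executable |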
It may be worth emphasizing how -fPIE differs from -fPIC, i.e. objects compiled with -fPIE cannot be correctly linked into a shared library.
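One way to check which of these modes a translation unit was actually built in is to read the macros GCC and Clang predefine for them. A minimal sketch; the function names `pic_level` and `built_as_pie` are hypothetical, while `__PIC__` and `__PIE__` are compiler-defined.

```c
/* Returns 0 for non-PIC code, 1 for -fpic/-fpie, 2 for -fPIC/-fPIE. */
int pic_level(void) {
#ifdef __PIC__
    return __PIC__;
#else
    return 0;
#endif
}

/* Returns 1 when the unit was built with -fpie or -fPIE, 0 otherwise.
 * Objects built this way are meant for executables, not shared libraries. */
int built_as_pie(void) {
#ifdef __PIE__
    return 1;
#else
    return 0;
#endif
}
```

A build system could use the same macros in a `#error` guard to refuse PIE objects in a shared library target.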
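| ```c | ||
| void foo () { | ||
| foo3 (); // function not declared beforehand | ||
| printf ("Hello!\n"); // function not declared beforehand (it is declared in the <stdio.h> header) | ||
| } | ||
|
| foo3 (int x) { // function definition has no return type; defaults to int | ||
| int y = x + 42; | ||
| return; // return statement does not match the return type | ||
| } | ||
|
| int *foo4(int x) { | ||
| int y = x + 42; | ||
| return &x; // returns the address of a local variable (not treated as an error) | ||
| } |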
Should the K&R-style function definitions that have been deprecated since C89 be mentioned? For example:
int foo(a, b)
int a;
int b;
{
}
This style exists in a large number of older projects.
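If the guide does cover this, a side-by-side conversion might help packagers patch such code. A minimal sketch with a hypothetical function `add`; the old-style form is kept in a comment because C23 removed it from the language, so a compiler defaulting to C23 rejects it.

```c
/* Old-style (K&R) definition, shown only as a comment:
 *
 *     int add(a, b)
 *     int a;
 *     int b;
 *     {
 *         return a + b;
 *     }
 */

/* Prototype-style definition with the same meaning, accepted by every
 * C standard from C89 onward. */
int add(int a, int b) {
    return a + b;
}
```

For projects that cannot be patched, pinning an older standard with `-std=` is the usual alternative.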
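| The compiler's frontend program integrates many steps: by default, running the compiler performs compilation, assembly, and linking, finally producing an executable binary or a shared library. The compiler also provides options that control its mode of operation, making it easy to inspect the output of each compilation stage: | ||
|
| - `-E`: preprocesses the source file with the preprocessor and outputs the preprocessed source | ||
| - `-S`: compiles the source into assembly language source without assembling it into a binary relocatable file |
| - `-U<macro>`: undefines a preprocessor macro defined on the command line or built into the compiler (cannot control macros defined in header files) | ||
| - `-I<header path>`: searches the given path for header files; may be specified multiple times | ||
| - `-isystem <system header path>`: specifies the search path for headers in `#include <...>` statements | ||
| - `-iquote <system header path>`: specifies the search path for headers in `#include "..."` statements |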
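| Link-time optimization (Link Time Optimization), called link-time code generation (Link Time Code Generation) in the MSVC compiler, is an optimization step performed while linking binaries. LTO lets the linker hand all the binary files taking part in the link back to the compiler, which then optimizes across the whole program, for example by removing unused symbols and arranging the binary layout sensibly. It can also treat all the files being linked as one file to be compiled and run the enabled optimization passes over every function. | ||
|
| LTO can improve the performance of binaries to some extent, but it may also over-optimize and cause abnormal program behavior. Because LTO hands all the binary files that make up the program to the compiler during the final link, LTO therefore significantly increases compile time, as well as the memory required. Running LTO on large projects such as web browsers typically requires tens or even hundreds of GB of memory. |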
因此 LTO 会显著增加编译时间,以及所需的内存。 ("LTO therefore significantly increases compile time, as well as the memory required.")
The comma here does not seem to read smoothly in Chinese?
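| # Architecture fine-tuning options | ||
|
| All architecture-related options in GCC begin with `-m`. Each architecture must define its own architecture-related options. These options control the generated code in the following respects: |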
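Each architecture must define its own architecture-related options.
Change to
Such options are defined entirely by each architecture, with no guarantee of cross-architecture compatibility.
Would that make the meaning clearer?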
To me the document seems to deviate too far into the basics of toolchain usage. IMO if you don't understand those basics, you should simply refrain from altering the toolchain flags instead of messing them up with your own inaccurate interpretation of the text.
| - `lld`: the LLVM project's linker implementation | ||
| - `clang`: the C/C++ compiler frontend | ||
| - `flang`: the Fortran compiler frontend | ||
| - `libc++`: LLVM's C++ standard library implementation |
Maybe we should emphasize that at least in AOSC we still use GCC libstdc++ for clang++.
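| - `-march=arrowlake -mtune=arrowlake`: generates code targeting the instruction sets supported by the Arrow Lake microarchitecture and tunes for Arrow Lake; the resulting program uses instructions from the sets Arrow Lake supports (for example AVX2) wherever possible | ||
| - `-march=armv9-a -mtune=cortex-a710`: generates code targeting the ARMv9.0-A instruction set, tuned for processors based on the Cortex-A710 core (earlier cores do not support ARMv9.0-A) | ||
|
| However, there is one exception within the Intel processor family: |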
Even loongson3 (MIPS) has such an "exception" regarding the DSP instructions. There's also Loongson 3B6000M, which lacks LASX (compared to 3A5000M). I don't think this is so "exceptional."
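| ## Instruction set extension switches | ||
|
| Instruction set extension switches give finer control over the instructions used during code generation. The `-march` option implicitly enables some extension switches, but extensions can also be added or disabled on top of `-march`. Once an extra extension is added, the resulting program cannot run on processors that do not support it. Such options are therefore not allowed in distribution development. Below are the instruction set extension switches for some common architectures: |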
"Unallowed" is too strong. If you are building shared objects for hwcap subdirectories you can absolutely use them.
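| ## Other common architecture fine-tuning options | ||
|
| The following architecture fine-tuning options are seen in distribution development and other scenarios: |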
I think they are not "minor adjustments" because they actually switched the entire ABI.
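| Modern compilers usually use the `-O<level>` option to control the optimization passes run during compilation: | ||
|
| - `-O0`: no optimization; generates code that corresponds directly to the source program |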
Even at -O0 the compiler still optimizes printf to puts if possible, optimizes divisions by constants into multiply-shift sequences, etc.
There actually doesn't exist a concept like "object code directly mapped from the source code" because the language standards (at least, ISO 9899 and ISO 14882) only specify the side effects of the code. Let's not mislead people into assuming "hey my code has undefined behavior but it should work at -O0, why not?"
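The two transformations mentioned in this comment are easy to try out. A minimal sketch with hypothetical function names; whether a given compiler version performs each rewrite at `-O0` is the reviewer's observation, not something this snippet guarantees.

```c
#include <stdio.h>

/* A printf call with a constant string ending in a newline and an unused
 * result: the kind of call the comment says may be emitted as puts. */
void greet(void) {
    printf("Hello!\n");
}

/* Unsigned division by a constant: the kind of expression the comment
 * says may be emitted as a multiply-and-shift sequence. */
unsigned div10(unsigned x) {
    return x / 10;
}
```

Running `gcc -O0 -S` on this file and looking for a `puts` call and the absence of a divide instruction shows what the installed compiler actually does.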
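| - `-O0`: no optimization; generates code that corresponds directly to the source program | ||
| - `-O1`: runs simple optimization passes, balancing compile time against code performance | ||
| - `-O2`: runs as many, and more complex, optimization passes as possible; this lengthens compile time but handles most of the patterns in source programs that need optimizing | ||
| - `-O3`: runs fairly aggressive optimization passes; this significantly lengthens compile time, and the passes may break the program's original logic |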
The main point is not that it may disrupt the program logic. If the logic is disrupted, either the source code has a bug or the compiler has a bug. -O3 is not (or at least, should not be) an excuse for bugs.
The main point is that the object code produced with -O3 is not guaranteed to be faster than the result of -O2. It depends on the pattern of the code. (I.e. if the result of -O3 is slower than -O2, it's not necessarily considered a bug.)
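| In addition, GCC has some other presets: | ||
|
| - `-Os`: runs the passes defined by `-O2` except those, such as code alignment, that significantly increase binary size; also inlines short functions as much as possible | ||
| - `-Og`: runs the passes defined by `-O1` except all those that change the order or structure of the code, so as to improve performance as much as possible without affecting debugging, keeping a one-to-one correspondence between source and binary |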
Still, do not use the non-existent concept of "object code directly mapped from the source code." In my experience, even at -Og you'd still see a lot of "value is optimized away" in GDB.
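| Link-time mechanisms that help harden the resulting binary include: | ||
|
| - The Relocation Read-Only mechanism marks the memory pages containing the GOT as read-only, preventing pointers in the GOT from being overwritten, to defend against buffer-overflow-based attacks | ||
| - The Bind Now mechanism resolves all external functions (i.e. symbols) that the program depends on as soon as the dynamic linker loads it, rather than resolving them lazily as the program runs, preventing function pointers in the PLT from being overwritten |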
GOT instead of PLT. Yes you'll then find out it conflicts with the previous item, so the previous item is incorrect too. -Wl,-z,relro only turns a part of the GOT read-only. To turn the entire GOT read-only you need -Wl,-z,now.
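| glibc's hardening mechanism has two levels: | ||
|
| - `_FORTIFY_SOURCE=1`: statically checks for possible buffer overflows at compile time, and allows overflowing into the parent container | ||
| - `_FORTIFY_SOURCE=2`: inserts run-time checks at compile time, and limits the range of memory operations to the size of the variable itself (no overflow into the parent container) |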
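| - [Undefined Behavior Sanitizer (UBSan)][ubsan]: detects undefined behavior at run time, such as integer overflow or underflow, out-of-range floating-point conversions, and misaligned or null address accesses | ||
| - [Thread Sanitizer][tsan]: detects race conditions on data shared between threads at run time | ||
|
| These sanitizers are debugging aids and are very useful while debugging behavioral problems in a program. Day-to-day distribution development therefore does not enable them. |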
The main point is not performance degradation. The main point is they are never intended for production use. I.e. the bugs in libasan.so etc. are never considered security vulnerabilities but if you use sanitizers in production environment you turn them into security vulnerabilities.
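| ## Language standard control | ||
|
| Major GCC releases usually come with changes to the default language standard or to the standard library headers. After the GCC 15 update switched the default C/C++ language standard to C/C++23, under-maintained projects that had not released code fixes for C/C++23 in time caused numerous build problems during distribution maintenance. Software projects usually specify the language standard at build time to make sure they build with every major compiler version: |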
Actually C23/C++17 with GCC 15, C23/C++20 with GCC 16.
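Since the default differs between compiler releases, a quick probe can confirm which C standard a given toolchain selects when no `-std=` is passed. A minimal sketch; the function name `c_standard_version` is hypothetical, while `__STDC_VERSION__` is defined by the language standard.

```c
/* Returns the value of __STDC_VERSION__, or 0 when the compiler runs in
 * C89/C90 mode, where the macro is not defined. */
long c_standard_version(void) {
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}
```

Values such as `201710L` correspond to C17 and `202311L` to C23; building once without `-std=` and once with the `-std=` a project pins shows whether the two differ on the installed compiler.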
| ``` | ||
| gcc [other options] -D_FORTIFY_SOURCE=2 | ||
| ``` | ||
There are several other hardening facilities, read the doc for -fhardened in GCC man page for them.
I.e. perhaps we should trim most of them, for the same logic as "the graphical installer shouldn't attempt to cover advanced use cases." If you don't understand compiler flags, don't touch them (like if you don't understand XXXfs, don't try it).
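If the `_FORTIFY_SOURCE` levels do stay in the guide, the "parent container" wording in the hunk above can be tied to something observable. The sketch below assumes the usual glibc approach of deriving bounds from the GCC/Clang builtin `__builtin_object_size`, with type 0 measuring to the end of the enclosing object and type 1 measuring only the member. The struct and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical record: `name` is a member inside a larger object. */
struct record {
    char name[8];
    char note[24];
};

static struct record rec;

/* Type 0: bytes from `name` to the end of the whole enclosing object,
 * the looser bound associated with level 1. */
size_t whole_object_bound(void) {
    return __builtin_object_size(rec.name, 0);
}

/* Type 1: bytes in the `name` member only, the stricter bound that
 * level 2 applies to string functions. */
size_t member_bound(void) {
    return __builtin_object_size(rec.name, 1);
}
```

When the compiler can resolve both sizes, the first bound covers the whole `struct record` and the second only `name`, which is why a copy that spills from `name` into `note` can pass at level 1 and be caught at level 2. When a size cannot be determined, the builtin returns `(size_t)-1` for these types.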
No description provided.