zeroize: replace atomic_fence with optimization_barrier (#1252)
Conversation
I really don't understand the motivation for this or why you're making it a public function. Something like this seems like a potential replacement for the existing `atomic_fence`. Can you start with a demonstration of a problem in the existing implementation, then show what this solves? Making it public seems like an implementation detail leaking out of the API.
You can see a motivation example in the function docs. We can not "zeroize" types like `BlockRng`. I encountered a need for this when working on the block buffer for RustCrypto/stream-ciphers#491.
Yes. I plan to do it in a separate PR.
I'm not sure I understand the issue in RustCrypto/stream-ciphers#491 or what it's even trying to do... zeroize the entire keystream it generates?
The current API does not allow it. With the `observe` function we could write:

```rust
// in rand_core
impl<R: Generator> BlockRng<R> {
    pub fn zeroize(&mut self) {
        self.results = Default::default();
    }
}

// in chacha20
struct ChachaRng(BlockRng<ChaChaBlockRng>);

impl Drop for ChachaRng {
    fn drop(&mut self) {
        self.0.zeroize();
        zeroize::observe(self);
    }
}
```
I'm still not sure I follow... what calls the `zeroize` method? Where exactly is the "hack"?
How is...

```rust
impl Drop for ChachaRng {
    fn drop(&mut self) {
        self.0.zeroize();
        zeroize::observe(self);
    }
}
```

...any different from...

```rust
impl Drop for ChachaRng {
    fn drop(&mut self) {
        self.0.zeroize();
    }
}
```
The `observe` call makes the compiler assume the just-written data may still be read, so the preceding writes can not be removed as dead stores. Without it, the compiler is free to elide the zeroing writes entirely, since the value is destroyed immediately afterwards.
Note that `self.0.zeroize()` here performs plain, non-volatile writes.
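The mechanism under discussion can be sketched with an empty `asm!` statement. This is an illustrative sketch, not the PR's actual code: the function body and the `#`-comment template (x86-64 assembly comment syntax assumed) are mine, but it shows how handing a pointer to an opaque `asm!` block keeps preceding plain writes alive.

```rust
use core::arch::asm;

// Hypothetical sketch of an `observe`/`optimization_barrier`-style
// function (names from this PR; the body is an assumption, not the
// crate's implementation). The asm statement receives a pointer to the
// data, so the compiler must assume the memory may be read there and
// can not delete the preceding writes as dead stores. x86-64 assumed.
fn optimization_barrier<T: ?Sized>(val: &T) {
    let p: *const u8 = (val as *const T).cast();
    unsafe {
        // `readonly`: the asm only reads memory. `#` starts an assembly
        // comment, so no instruction is actually emitted.
        asm!("# {0}", in(reg) p, options(readonly, nostack, preserves_flags));
    }
}

// Usage mirroring the `Drop` impls in this thread:
struct Secret([u8; 32]);

impl Drop for Secret {
    fn drop(&mut self) {
        self.0 = [0; 32]; // plain, optimizable writes...
        optimization_barrier(self); // ...kept alive by the barrier
    }
}
```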
So the problem is that the plain zeroing writes could be optimized away without the barrier?
In a certain (uncharitable) sense, yes. But while […]
Perhaps […]. I do still think something like this, internally within `zeroize`, could replace `atomic_fence`.
Not everyone needs RNG buffer zeroization. Introducing a crate feature for it is also not desirable. I think that zeroization should be handled on the […] side. I don't understand why you are against exposing the function. It is a simple, safe function which can not be misused (as opposed to […]).
I'm a bit worried about the notion that there's a user pulling in one crate which is expected to do some otherwise insecure zeroing, then separately pulling in `zeroize` to make that zeroing effective. Perhaps for a start you could use this technique as an internal replacement for `atomic_fence`?
Yes, it's unfortunate, but sometimes there is just no other way. Potentially unreliable erasure is better than no erasure at all. And it's much better than risking UB with […].

Also note my point about efficiency:

```rust
struct Foo {
    a: [u8; 32],
    b: [u64; 16],
    c: u64,
}

impl Drop for Foo {
    fn drop(&mut self) {
        // This impl is more efficient
        self.a = [0; 32];
        self.b = [0u64; 16];
        self.c = 0;
        zeroize::observe(self);
        // than this impl
        self.a.zeroize();
        self.b.zeroize();
        self.c.zeroize();
    }
}
```

See https://rust.godbolt.org/z/K7hxoePKE
Personally, I don't see much point in it, but sure.
The existing performance problems are already noted in #743. If you replaced `atomic_fence` with this internally, […].
Well, IIUC even `asm!`-based barriers are not fully guaranteed, since compilers are not explicitly forbidden from analyzing the generated assembly. cc @RalfJung
Title changed: "observe function" → "atomic_fence with observe" → "atomic_fence with optimization_barrier"
Do you have more information about that? I wasn't aware compilers would change codegen based on an analysis of the ASM.
I did not mean that existing compilers do it, but that AFAIK it's not explicitly forbidden. For example, an advanced LTO pass could in theory apply optimizations directly to generated assembly, thus removing "useless" writes to a stack frame which immediately gets released. I tagged Ralf in the hope that he would clarify this point in case I am mistaken.
The details around what optimizations are allowed on inline `asm!` are not something I can speak to definitively.
At the same time, the entire concept of zeroize is to try to do something that can't reliably be done in the Rust AM. From a formal opsem perspective, reliable zeroizing is impossible. (I wish it were different, but sadly that's where we are. Even if we wanted we couldn't do much about that in rustc until LLVM has proper support for this.) Best-effort zeroizing is a lot more about what compilers and optimizers actually do than about their specifications, and I'm the wrong person to ask about that. |
@RalfJung what term would you use for this sort of tactical defense against code elimination that, at some future date, could be subject to another round of cat-and-mouse when the compiler outsmarts it? |
I still think the best thing we could do with this is use it internally, per my earlier comment.
If we are to trust the compiler to not mess with `asm!` blocks, […]
The paragraph you quoted was talking about reliable / sound reasoning principles. |
I think we only need one implementation of memset/bzero per architecture. The implementation of zeroize itself is already abstracted such that there are two places to actually invoke it from: […]
I’ve heard “optimization barrier” used for this by several people, though if that sounds like it has a property this isn’t providing, perhaps something like “optimization impediment” is clearer?
I only see disadvantages when compared to the observation approach. We don't win anything in language guarantees, while by making it general you prevent the compiler from applying optimizations which we want, such as unrolling loops, reusing zeroed registers, or SIMD-ifying code when appropriate. And it is obviously much harder to write correctly than slapping the same empty `asm!` block after plain writes.
That sounds like it may be in conflict with self-modifying code which needs the asm blocks to be unchanged. OTOH something like BOLT could do all sorts of "fun" things and we'd probably say it's fine, it's just incompatible with self-modifying code. But you're asking about things way outside of what I can confidently talk about. I suggest you bring this to the t-opsem Zulip; there are other people there that know more about these low-level things.
I always interpreted "barrier" as something that actually guarantees certain reorderings don't happen. So yeah, I think a term that makes the best-effort nature more clear would be preferable.
By using volatile writes we get a guarantee that each write actually happens. We get somewhat similar guarantees out of the `asm!` barrier. If we remove the volatile writes to improve performance, I don't think we really have any guarantees at that point, just something future compiler versions are unlikely to see through. Maybe that's fine; as I said earlier, it seems fine for many e.g. libc implementations and their secure-memset functions.
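For reference, the volatile-write strategy being contrasted here can be sketched as follows. This is a minimal illustration assuming per-byte volatile stores, not the crate's exact code:

```rust
use core::ptr;
use core::sync::atomic::{compiler_fence, Ordering};

// Minimal sketch of volatile zeroization (an illustration of the
// strategy discussed above, not the crate's exact implementation).
// Each store goes through `write_volatile`, which the compiler may not
// elide, at the cost of inhibiting vectorization and store coalescing.
fn zeroize_bytes(buf: &mut [u8]) {
    for b in buf.iter_mut() {
        unsafe { ptr::write_volatile(b, 0) };
    }
    // Keep subsequent code from being reordered before the stores.
    compiler_fence(Ordering::SeqCst);
}
```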
Last time I asked about LTO, I was told LTO would not touch `asm!` blocks.
My point was that it also fully applies to […].
IIUC they rely on dynamic linking to act as an optimization "impediment" (in other words, on keeping the functions "external" and thus out of the optimizer reach). We probably do not want to rely on it by default. |
The major difference is you're still relying on Rust code to do the actual writing to memory, and I would be a lot more worried about that code being eliminated than I would anything in the `asm!` block.
@newpavlov I think the most common way to implement them is with an ASM barrier immediately after calling the normal `memset`. Sidebar: we could potentially use those libc functions (#1254).
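The "normal memset followed by an ASM barrier" pattern described here could look like this in Rust. This is a sketch: the function name and asm template are illustrative assumptions, with x86-64 comment syntax assumed.

```rust
use core::arch::asm;

// Sketch of the "ordinary clear + trailing ASM barrier" pattern
// (illustrative; name and template are assumptions). The clear itself
// stays fully optimizable (the compiler may vectorize it or lower it
// to memset), while the barrier claims to read the buffer so the
// clear can not be discarded as a dead store. x86-64 assumed.
fn secure_clear(buf: &mut [u8]) {
    buf.fill(0); // ordinary, optimizable memset
    unsafe {
        asm!("# {0}", in(reg) buf.as_ptr(), options(readonly, nostack, preserves_flags));
    }
}
```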
If we are to trust the black box model of `asm!`, […]. Technically, IIUC it's allowed for the compiler to allocate a similarly sized buffer on the stack, zero it out, and then pass a pointer to it. But the same hypothetical also applies to "secure" memsets as well (i.e. the compiler may call it on a copy of the zeroized object). But I believe we are in the realm of wild speculations at this point.
FWIW, I agree -- […]
@RalfJung, which of these two variants would you consider preferable?

```rust
pub struct Foo {
    a: u64,
}

impl Drop for Foo {
    fn drop(&mut self) {
        self.a = 0;
        unsafe {
            core::arch::asm!(
                "# {}",
                in(reg) &self.a,
                options(readonly, preserves_flags, nostack),
            );
        }
    }
}
```

```rust
impl Drop for Foo {
    fn drop(&mut self) {
        unsafe {
            core::arch::asm!(
                "mov qword ptr [{}], 0",
                in(reg) &mut self.a,
                options(preserves_flags, nostack),
            );
        }
    }
}
```

Assuming that the compiler is forbidden from analyzing the `asm!` string, are both equally reliable?
As I said above, that's outside my expertise and I recommend asking on the t-opsem Zulip. At least to a first-order approximation, I'd prefer the 2nd variant, as that guarantees that the actual instructions you wrote there are in the final binary, so you don't even have to begin reasoning about what exactly the compiler may or may not do.
Should this function be `unsafe`?

```rust
/// Observe the referenced data and prevent the compiler from removing previous writes to it.
/// ...
fn optimization_barrier<R: ?Sized>(val: &R) { /* ... */ }
```

To quote from the Rustonomicon: […]

The function states in its contract that it prevents the compiler from removing previous writes to it. There are two ways to interpret this: […]

My opinion is thus that it should be marked `unsafe`, e.g.:

```rust
/// Observe the referenced data and prevent the compiler from removing previous writes to it. The user must ensure the existence of previous writes.
/// ...
unsafe fn enforce_write<R: ?Sized>(val: &R) { /* ... */ }
```
It obviously should not. This function is essentially a "hardened" variant of `core::hint::black_box`.
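For comparison, `core::hint::black_box` is the existing safe observation function in `core`; its documentation describes it as a best-effort hint rather than a guarantee. A sketch of the analogous usage:

```rust
use core::hint::black_box;

// `black_box` is documented as a best-effort optimization hint. Using
// it after plain zeroing writes is analogous to the proposed function,
// but with no stronger promise than "the compiler is encouraged to
// treat the value as used".
fn zero_then_observe() -> [u8; 4] {
    let mut buf = [0xAAu8; 4];
    buf = [0u8; 4];
    black_box(&buf); // hint that `buf` may be observed here
    buf
}
```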
This guarantee has nothing to do with the notion of memory safety. |
Force-pushed c8ee20e to 1f03a14.