---
title: A Rusty Stack Jump
description: Jumping into a new stack with Rust
date: 2025-02-27
featuredImage:
featuredImageDesc:
tags:
- rust
- asm
- systems
- operating systems
- async
---
import { Notes, PostImage } from "~/components/Markdown";
import { Tree } from "~/components/Tree";
In my quest to learn how to build an async runtime in Rust, I had to learn about CPU context switching. To switch from one async task to another, our async runtime has to perform a context switch: it saves the CPU registers that the System V ABI marks as callee-saved for the current task, then loads the registers, including the stack pointer, for the stack of the next task.
In this article, I will show you what I have learned about jumping onto a new stack on an x86_64 CPU.
<Notes>
I'm learning about async runtimes in Rust from the amazing book [Asynchronous Programming in Rust: Learn asynchronous programming by building working examples of futures, green threads, and runtimes](https://www.packtpub.com/en-mt/product/asynchronous-programming-in-rust-9781805128137).
It's an amazing book, don't get me wrong, but I feel the explanations can be hand-wavy at times. So I'm writing this to record my own explanation and hopefully help other people who also struggle with the subject.
</Notes>
<Notes>
Most async runtimes in Rust do not use stackful coroutines (the model behind
Go's goroutines and Erlang's processes). Instead, they use state machines to
manage async tasks.
</Notes>
## Contents
<hr />
## Setting the stage
Why do we need to swap the stack of async tasks in a runtime with stackful coroutines?
Async tasks, by nature, are paused and resumed. Every time a task is paused so that another task can run, we have to save the context of the task that is currently running and load the context of the upcoming task.
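To make "context" a bit more concrete, here is a minimal sketch of what a fuller saved context could look like on x86_64. The struct name `FullThreadContext` and the helpers `context_size` and `rsp_offset` are my own, made up for illustration. The field list assumes the registers that the System V ABI treats as callee-saved (`rsp`, `rbx`, `rbp`, `r12` to `r15`). The listing later in this article only needs `rsp`.

```rust
// Hypothetical fuller context (not used by the listing in this article):
// one 8-byte slot per register that the System V ABI treats as callee-saved.
#[allow(dead_code)] // the fields are only here to show the layout
#[derive(Debug, Default)]
#[repr(C)] // keep the fields in declaration order so assembly can use fixed offsets
struct FullThreadContext {
    rsp: u64,
    r15: u64,
    r14: u64,
    r13: u64,
    r12: u64,
    rbx: u64,
    rbp: u64,
}

// Size of the saved context in bytes.
fn context_size() -> usize {
    std::mem::size_of::<FullThreadContext>()
}

// Byte offset of the rsp field, i.e. the `0x00` a `[ptr + 0x00]` operand would use.
fn rsp_offset() -> usize {
    std::mem::offset_of!(FullThreadContext, rsp)
}

fn main() {
    let ctx = FullThreadContext::default();
    println!("{ctx:?}");
    println!("context size: {} bytes, rsp offset: {}", context_size(), rsp_offset());
}
```

Saving a task then means writing these registers into its struct, and resuming it means reading them back. With `#[repr(C)]`, each field sits at a predictable offset, which is what lets assembly address a field as a pointer plus a constant.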
## Jumping into the new stack
Here is the code in its entirety. I'd recommend you run it on the [Rust Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2024). I have left comments throughout the code so you can get the general idea.
Note that you have to manually stop the process.
```rust file="stack_swap.rs"
use core::arch::asm;

// a stack size of 48 bytes, so it's easy to print the whole stack before we switch contexts
const SSIZE: isize = 48;

// a struct that represents our CPU state
//
// For now, this struct only stores the stack pointer
#[derive(Debug, Default)]
#[repr(C)]
struct ThreadContext {
    rsp: u64,
}

// Returning `!` means the function never returns:
// it either panics or runs forever
fn hello() -> ! {
    println!("I LOVE WAKING UP ON A NEW STACK!");
    loop {}
}

// `new` is a pointer to a ThreadContext
unsafe fn gt_switch(new: *const ThreadContext) {
    // inline assembly, wrapped in an explicit unsafe block to keep the 2024 edition quiet
    unsafe {
        asm!(
            "mov rsp, [{0} + 0x00]", // load the value stored at offset 0 of `new` (the rsp field) into the rsp register
            "ret", // ret pops the return address from our custom stack and jumps to it; in our example, the address of hello
            in(reg) new,
        );
    }
}

fn main() {
    // initialize the context
    let mut ctx = ThreadContext::default();
    // allocate the memory for our new stack
    // e.g. assume the buffer starts at address 0x10
    let mut stack = vec![0_u8; SSIZE as usize];
    unsafe {
        // we get the bottom of the stack
        // remember that the stack grows downward, from high memory addresses to low memory addresses
        // e.g. 0x40, because 0x10 + 0x30 = 0x40 and 0x30 is SSIZE (48) in hex
        // NOTE: offset() is applied in units of the size of the type that the pointer points to
        // in our case, stack is a pointer to u8 (a byte) so offset(SSIZE) == offset(48 bytes) == offset(0x30)
        let stack_bottom = stack.as_mut_ptr().offset(SSIZE);
        // we round the bottom of the stack down to a 16-byte boundary
        // this matters because some CPU instructions (such as aligned SSE/SIMD loads and stores)
        // expect 16-byte-aligned memory
        // The technicality: 15 is 0b1111, so (stack_bottom AND !15) zeroes out the lowest 4 bits
        //
        // we cast back to a pointer to a byte (u8) so that offsets are still counted in bytes
        let sb_aligned = (stack_bottom as usize & !15) as *mut u8;
        // Here, we write the address of the hello function as 64 bits (8 bytes)
        // Remember that 16 bytes = 0x10 in hex
        // So we go DOWN 0x10 (16) memory addresses, e.g. from 0x40 to 0x30
        // NOTE: we go 16 bytes down (0x10) even though the hello function pointer is ONLY 8 bytes
        // After `ret` pops those 8 bytes, rsp + 8 is a multiple of 16,
        // which is what the System V ABI expects at the entry of a function
        std::ptr::write(sb_aligned.offset(-16) as *mut u64, hello as u64);
        // we store the address of that slot as the rsp inside our context
        ctx.rsp = sb_aligned.offset(-16) as u64;
        // print the stack, from just below the aligned bottom down to the start of the buffer
        // NOTE: we start at 1 because sb_aligned itself can be one byte past the end of the buffer,
        // and we stop at the start of the buffer so we never read memory we do not own
        let used = sb_aligned as usize - stack.as_ptr() as usize;
        for i in 1..=used as isize {
            println!("mem: {}, val: {}",
                sb_aligned.offset(-i) as usize,
                *sb_aligned.offset(-i))
        };
        // we go into the function
        // it loads our saved stack pointer into the CPU's rsp register
        // and `ret` then pops the address of hello from the new stack and jumps to it
        gt_switch(&mut ctx);
    }
}
```
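The alignment arithmetic in the listing above is easier to follow with plain numbers. Here is a minimal sketch that repeats the `& !15` step and the `offset(-16)` step on ordinary integers instead of real pointers. The helpers `align_down_16` and `return_slot` are names I made up for illustration, and the addresses are hypothetical, picked to match the `0x40` example from the comments.

```rust
// Round `addr` down to the nearest multiple of 16 by clearing its lowest 4 bits.
fn align_down_16(addr: usize) -> usize {
    addr & !15
}

// Address of the 8-byte slot that holds the return address:
// 16 bytes below the aligned bottom of the stack.
fn return_slot(stack_bottom: usize) -> usize {
    align_down_16(stack_bottom) - 16
}

fn main() {
    // hypothetical stack bottoms; only 0x40 and 0x50 are already aligned
    for bottom in [0x40_usize, 0x47, 0x4f, 0x50] {
        let aligned = align_down_16(bottom);
        let slot = return_slot(bottom);
        // `ret` pops 8 bytes, so rsp ends up 8 bytes above the slot
        let rsp_after_ret = slot + 8;
        println!(
            "bottom: {bottom:#x}, aligned: {aligned:#x}, slot: {slot:#x}, rsp after ret: {rsp_after_ret:#x}"
        );
    }
}
```

For a bottom of `0x40`, the slot lands at `0x30` and `rsp` is `0x38` once `ret` has popped the address, so `rsp + 8` is back on a 16-byte boundary.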