---
title: A Rusty Stack Jump
description: Jumping into a new stack with Rust
date: 2025-02-27
featuredImage:
featuredImageDesc:
tags:
  - rust
  - asm
  - systems
  - operating systems
  - async
---

import { Notes, PostImage } from "~/components/Markdown";
import { Tree } from "~/components/Tree";

In my quest to learn how to build an async runtime in Rust, I had to learn about CPU context switching. To switch from one async task to another, our async runtime has to perform a context switch. This means saving the CPU registers that the System V ABI marks as callee-saved and then loading those registers with the values saved for the next task, including its stack pointer.

In this article, I will show you what I have learned about jumping onto a new stack on an x86_64 CPU.

<Notes>
I'm learning about async runtimes in Rust from the amazing book [Asynchronous Programming in Rust: Learn asynchronous programming by building working examples of futures, green threads, and runtimes](https://www.packtpub.com/en-mt/product/asynchronous-programming-in-rust-9781805128137).

It's an amazing book, don't get me wrong, but I feel like the explanations can be hand-wavy at times. So I'm writing this to archive my own explanation and potentially help other people who also struggle with the subject.

</Notes>

<Notes>
Most async runtimes in Rust do not use stackful coroutines (the model behind
Go's goroutines and Erlang's processes) and instead use state machines to
manage async tasks.
</Notes>

## Contents

<hr />

## Setting the stage

Why do we need to swap stacks between async tasks in a runtime with stackful coroutines?

Async tasks, by nature, are paused and resumed. Every time a task is paused so that another one can run, we have to save the context of the task that is currently running and load the context of the upcoming task. With stackful coroutines, each task has its own stack, so the stack pointer is part of that context.

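To make "context" a bit more concrete: on x86_64, the System V ABI treats `rbx`, `rbp`, `r12` to `r15` and the stack pointer `rsp` as preserved across function calls. A minimal sketch of a struct that could hold them might look like the one below. The name `FullContext` and the field order are my own assumptions for illustration; the example in this post only needs `rsp`.

```rust
use std::mem::size_of;

// Hypothetical struct (not from the book): one 64-bit slot per
// callee-saved register on x86_64 under the System V ABI.
// The field order is an arbitrary choice for this sketch.
#[derive(Debug, Default)]
#[repr(C)]
struct FullContext {
    rsp: u64,
    r15: u64,
    r14: u64,
    r13: u64,
    r12: u64,
    rbx: u64,
    rbp: u64,
}

fn main() {
    let ctx = FullContext::default();
    // 7 registers * 8 bytes each = 56 bytes
    println!("context size: {} bytes", size_of::<FullContext>());
    println!("{:?}", ctx);
}
```

A real context switch would write the current register values into one such struct and read the next task's values out of another.
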
## Jumping into the new stack

Here is the code in its entirety. I'd recommend running it on the [Rust Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2024). I have left comments throughout the code so you can get the general idea.

Note that you have to stop the process manually, because `hello` ends in an infinite loop.

```rust file="stack_swap.rs"
use core::arch::asm;

// stack size of 48 bytes, so it's easy to print the whole stack before we switch contexts
const SSIZE: isize = 48;

// a struct that represents our CPU state
//
// For now, this struct only stores the stack pointer
#[derive(Debug, Default)]
#[repr(C)]
struct ThreadContext {
    rsp: u64,
}

// Returning `!` (the never type) means this function never returns:
// it either panics or runs forever
fn hello() -> ! {
    println!("I LOVE WAKING UP ON A NEW STACK!");
    loop {}
}

// new is a pointer to a ThreadContext
unsafe fn gt_switch(new: *const ThreadContext) {
    // inline assembly
    // edition 2024 warns if unsafe operations in an unsafe fn are not wrapped in an unsafe block
    unsafe {
        asm!(
            "mov rsp, [{0} + 0x00]", // load the value that `new` points to (ctx.rsp) into the rsp register
            "ret", // ret pops the return address from our custom stack; in our example, the address of hello
            in(reg) new,
        );
    }
}

fn main() {
    // initialize the context
    let mut ctx = ThreadContext::default();

    // initialize the stack
    // for the examples in the comments below, pretend this buffer starts at address 0x10
    let mut stack = vec![0_u8; SSIZE as usize];

    unsafe {
        // we get the bottom of the stack
        // remember that the stack grows downward, from high memory addresses to low memory addresses
        // i.e. 0x40, because 0x10 + 0x30 = 0x40 and 0x30 is SSIZE (48) in hex
        // NOTE: offset() is applied in units of the size of the type that the pointer points to
        // in our case, stack is a pointer to u8 (a byte), so offset(SSIZE) == offset(48 bytes) == offset(0x30)
        let stack_bottom = stack.as_mut_ptr().offset(SSIZE);

        // we align the bottom of the stack down to a 16-byte boundary
        // some SSE instructions (e.g. `movaps`) require their memory operands to be 16-byte aligned
        //
        // The technicality: 15 is 0b1111, so (stack_bottom AND !15) zeroes out the lowest 4 bits
        //
        // we also cast back to a byte pointer (*mut u8) so the offsets below are counted in bytes
        let sb_aligned = (stack_bottom as usize & !15) as *mut u8;

        // Here, we write the address of the hello function as 64 bits (8 bytes)
        // Remember that 16 bytes = 0x10 in hex
        // So we go DOWN 0x10 memory addresses, i.e. from 0x40 to 0x30
        // NOTE: we go 16 bytes down (0x10) even though the hello function pointer is ONLY 8 bytes
        // This is because the System V ABI requires the stack pointer to be 16-byte aligned
        // whenever a function is called, so the rsp we load stays on a 16-byte boundary
        std::ptr::write(sb_aligned.offset(-16) as *mut u64, hello as u64);

        // we store that address in the rsp field of our context
        ctx.rsp = sb_aligned.offset(-16) as u64;

        // print every byte of the stack, from the highest address down to the lowest
        // we index from the start of the buffer so every read stays inside the allocation
        for i in (0..SSIZE).rev() {
            let byte = stack.as_ptr().offset(i);
            println!("mem: {}, val: {}", byte as usize, *byte);
        }

        // we go into the function
        // it loads our stack pointer into the CPU's stack pointer (rsp),
        // and `ret` then pops the address of hello off the new stack and jumps to it
        gt_switch(&mut ctx);
    }
}
```

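The masking trick used for `sb_aligned` is easier to see on plain numbers. Here is a small sketch; `align_down_16` is a hypothetical helper name I made up for this example, not something from the book or the standard library.

```rust
// Hypothetical helper: round `addr` down to the nearest multiple of 16.
// 15 is 0b1111, so `!15` has every bit set except the lowest four,
// and the AND clears exactly those four bits.
fn align_down_16(addr: usize) -> usize {
    addr & !15
}

fn main() {
    for addr in [0x40_usize, 0x47, 0x4f, 0x50] {
        println!("{:#x} -> {:#x}", addr, align_down_16(addr));
    }
    // prints:
    // 0x40 -> 0x40
    // 0x47 -> 0x40
    // 0x4f -> 0x40
    // 0x50 -> 0x50
}
```

An address that is already aligned stays where it is; anything else moves down to the previous 16-byte boundary, which is why the result never points past the end of the buffer.
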
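If you want to check the stack layout without actually switching stacks, the same arithmetic can be done in safe Rust. This is only a sketch under my own assumptions: `place_entry` is a hypothetical helper, and it uses slice indexing instead of raw pointer writes, so it runs to completion and needs no inline assembly.

```rust
fn hello() {}

// Hypothetical helper: stores `entry` in the 8-byte slot that starts
// 16 bytes below the aligned bottom of `stack`, and returns the address
// of that slot (the value the code above stores in `ctx.rsp`).
// Assumes `stack` is at least 31 bytes long so the slot fits.
fn place_entry(stack: &mut [u8], entry: u64) -> usize {
    let start = stack.as_ptr() as usize;
    let bottom = start + stack.len(); // one past the highest byte
    let aligned = bottom & !15; // round down to a multiple of 16
    let rsp = aligned - 16; // 16 bytes below the aligned bottom
    let idx = rsp - start; // the same slot, as an index into the buffer
    stack[idx..idx + 8].copy_from_slice(&entry.to_ne_bytes());
    rsp
}

fn main() {
    let mut stack = vec![0_u8; 48];
    let entry = hello as *const () as u64;
    let rsp = place_entry(&mut stack, entry);

    // Read the slot back the way `ret` would: 8 bytes starting at rsp.
    let idx = rsp - stack.as_ptr() as usize;
    let mut slot = [0_u8; 8];
    slot.copy_from_slice(&stack[idx..idx + 8]);
    let read_back = u64::from_ne_bytes(slot);

    println!("rsp is 16-byte aligned: {}", rsp % 16 == 0);
    println!("slot holds the address of hello: {}", read_back == entry);
}
```

Both lines print `true`: the value that would go into `rsp` sits on a 16-byte boundary, and the 8 bytes at that address are exactly what `ret` would pop as its return address.
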