Threads
OxidizedOS is a multicore, x86-64 kernel written in Rust. For more information, see An Introduction.
The code for this post is located here.
In this Series, we will be discussing the implementation of kernel threads and a scheduler in Rust.
Background
In its present state, the amount of tasks OxidizedOS can currently work on concurrently is limited to the number of physical cores on the system. Clearly, this is a major problem – an operating system needs to be able to handle multiple tasks simultaneously. As a result, we introduce the thread: an abstraction that represents a task being performed. Threads are able to be paused and restarted, either by voluntarily giving up control of the CPU (cooperative multitasking) or being kicked off the CPU and replaced with another thread (preemption). How are we able to accomplish this? Abstraction!
Introducing the TCB
A Thread Control Block (TCB) is the typical name of the representation of a thread by a kernel, and it’s what we’ll use in OxidizedOS. First, we need to figure out what determines the task’s state. Clearly, we need the program counter (RIP), which leads us to realize that we must somehow save all of the registers. Additionally, since one of the registers points to a stack (RSP), we need to make sure we have a stack that isn’t corrupted by other threads running on the system. As well, CR3 points to a paging structure, which could be different per thread, but since we have identity mappings, and are only dealing with physical memory currently, we can overlook this for now. CR2 (page fault address) and RFLAGS could be overwritten by other threads, yet are important to our state, so we need to save those as well. Finally, we need to store the work that needs to be performed. In Rust, our first attempt would look something like this:
type Task = 'static + FnOnce() + Send + Sync;
#[repr(C)]
pub struct TCB {
stack: Box<[u8]>,
work: Box<Task>,
// RIP, RSP, CR3, other registers...?
}
We use a trait bound as we want to allow different types of functions/closures to be used as a task. However, we will restrict ourselves to move closures, as this helps prevent lifetime problems, hence the additional trait bounds. We use a C representation of the struct, since we will pass pointers to the struct to an assembly function.
This seems like a good start. Let’s try to construct a TCB.
impl TCB {
pub fn new(work: Box<Task>) -> TCB {
TCB {stack: box [0; 4096], work: work}
// Umm...?
}
(There’s an incredibly important reason we need to use box syntax to construct the stack, which I will explain in another post.)
Now we have a bootstrapping problem - how do we initialize the TCB so that we can actually run it’s task? Maybe writing the function to switch threads will help us find a solution.
Context Switch
We call the switch between threads a context switch, as the running thread represents the core’s context. Since we need to save registers, context_switch must be written in assembly. Naively, a context switch should look something like this:
context_switch:
# save all the registers
# restore all the registers
ret
Great. We figured out 1 instruction.
Let’s think about where we can store the contents of our registers. Since a stack is private to each process, the current thread’s
stack is an ideal candidate for where to save all our registers. Additionally, the System-V ABI splits
the general purpose registers into caller-saved and callee-saved registers, so by calling context_switch,
all the caller saved registers are automatically saved on the stack, so we are only responsible for saving the callee-saved ones.
Improving our context_switch, we get:
context_switch:
push rbx
push rbp
push r12
push r13
push r14
push r15
mov rax, cr2
push rax
pushfq
push rsp ???
mov rsp, ???
It doesnt’t make a lot of sense to store RSP on the stack, as we need it to switch back to the current stack when some thread in the future yields to us. We have the TCB, so perhaps that will work.
Since we’re switching to a thread that is not currently running, it must have yielded to another thread by calling context_switch, which means it saved all its registers on its stack. Noticing this, we realize that we simply just need to pop the (other thread’s) stack in reverse order to restore all of the registers.
Recall that the first 2 arguments to subroutines are stored in RDI and RSI, so we get:
# context_switch(current: *mut TCB, next: *mut TCB)
context_switch:
push rbx
push rbp
push r12
push r13
push r14
push r15
mov rax, cr2
push rax
pushfq
mov [rdi], rsp
mov rsp, [rsi]
cli
popfq
pop rax
mov cr2, rax
pop r15
pop r14
pop r13
pop r12
pop rbp
pop rbx
ret
We include a cli immediately after we switch stacks, as we don’t know if the new thread wanted interrupts enabled or disabled. We then immediately restore the flags to solve that issue (enabling interrupts if they were supposed to be enabled).
The first time we switch to a new thread, its stack won’t be setup properly, but since we know what it needs to look like, we can do that now.
Fixing the TCB
Let’s refine the TCB.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct TCBInfo {
stack_pointer: usize,
}
impl TCBInfo {
pub fn new(stack_pointer: usize) -> TCBInfo {
TCBInfo {
stack_pointer: stack_pointer,
}
}
}
Why a separate structure? As it turns out, we really have two types of TCBs – the ones that are created when each core is initialized, and the ones that are created explicitly by calling TCB::new, and as a result, require a different implementation. TCBInfo is a simple way to represent the things that all TCBs share in common. At the moment, this is simply the current RSP.
As a result, TCB really needs to represent a common interface (or in Rust terms, a trait) between the different types of TCB’s.
pub trait TCB: Send + Sync {
fn get_info(&mut self) -> *mut TCBInfo;
fn get_work(&mut self) -> Box<'static + FnOnce() + Send + Sync>;
}
Why FnOnce() + Send + Sync? We’re sending this between cores, so that explains Send + Sync. FnOnce represents a function (or closure) that can be run at least once, which includes anything that is Fn(). Since we’re using move closures, values may be dropped after the closure is ran, so the closure cannot be called again.
We then change our original TCB implementation to:
#[repr(C)]
pub struct TCBImpl {
tcb_info: TCBInfo,
stack: Box<[u64]>,
work: Option<Box<Task>>,
}
impl TCBImpl {
const NUM_CALLEE_SAVED: usize = 6;
pub fn new(work: Box<Task>) -> TCBImpl<T> {
let mut stack: Box<[u64]> = box [0; 512];
let end_of_stack = 511;
stack[end_of_stack] = thread_entry_point as *const () as u64;
let index: usize = end_of_stack - TCBImpl::NUM_CALLEE_SAVED - 1; // Represents the callee saved registers
stack[index] = 0; // CR2
stack[index - 1] = 0; // RFLAGS
let stack_ptr = Box::into_raw(stack);
let stack_ptr_as_usize = stack_ptr as *mut u64 as usize;
stack = unsafe {Box::from_raw(stack_ptr)};
let stack_ptr_start = stack_ptr_as_usize + ((index - 1) * core::mem::size_of::<usize>());
let tcb_info = TCBInfo::new(stack_ptr_start);
TCBImpl {
tcb_info: tcb_info,
stack: stack,
work: Some(work),
}
}
}
This simply makes the TCB’s stack appear as if context_switch has been called with it as the current thread by making space on the stack for the callee saved registers, CR2, and RFLAGS, and pointing RSP at the location of RFLAGS on the stack. When all of the values have been popped off the stack, RSP will point at the bottom of the stack. Since ret will pop the stack and set RIP to that value, we need to return somewhere that will setup the calling thread’s environment, which includes finding and the performing the task stored inside the TCB representing that thread. thread_entry_point will serve that purpose for us.
#[no_mangle]
pub extern "C" fn thread_entry_point() -> ! {
println!("Thread made it to entry point!");
// Access the thread's TCB and do work
loop {}
}
We currently have no way of accessing the currently running TCB – we will deal with this in the next post.
Implementing TCB is pretty trivial:
impl TCB for TCBImpl {
fn get_info(&mut self) -> *mut TCBInfo {
&mut self.tcb_info as *mut TCBInfo
}
fn get_work(&mut self) -> Box<Task> {
let mut work = None;
core::mem::swap(&mut work, &mut self.work);
match work {
Some(task) => task,
None => panic!("TCBImpl had no work!")
}
}
}
And now we can write a simple example:
pub fn context_switch_test() {
let mut test1 = Box::new(TCBImpl::new(Box::new(move || ())));
let mut test2 = Box::new(TCBImpl::new(Box::new(move || ())));
unsafe {
machine::context_switch(test1.get_info(), test2.get_info());
}
}
Once we run it, we’ll see the output from thread_entry_point in the console.
In the next post, I will discuss the implementation of a cooperative scheduler.