Zig Version Of @nullprogram's Thread API


@nullprogram's Thread API

@nullprogram's thread API doesn't use libc, and it inspired me to try the same thing in Zig. Zig can already do a lot without linking libc, using direct syscalls on Linux to access the OS. (Zig also has its own thread API, which I will not be using here.)

An Overview Of @nullprogram's Ideas

(spoiler alert)

Linux uses the clone() syscall to create new threads. However, the raw clone() syscall doesn't take a thread entry point; it only takes a pointer to the new stack and creates a new thread running on that stack. @nullprogram found a way to change the API so we can pass a thread body function when creating a new thread.

@nullprogram's idea is to pre-load a return address on the new stack, so that in the new thread newthread() returns into the thread entry function instead of back to its caller. After clone() both threads execute the same [ret]: the main thread returns as normal, since it is still on its original stack (without any manipulated return address), and the new thread "returns" into the thread entry function.

The thread entry function can never return, since there is nothing on its stack to return to. It must call exit() to end its thread.

@nullprogram also includes a futex api so that the main thread can join/sync with the threads it starts.

Zig Conversion

These are the main issues I encountered when trying to port @nullprogram's C/asm code to zig/asm.

Building The Example:

$ zig build-exe -target native -fsingle-threaded -O ReleaseSmall threads.zig

# TODO debug build fails (cos of the newthread function enter/leave code)
$ zig build-exe -target native -fsingle-threaded threads.zig

The Main Function - Starts a Thread and Joins It

The overall test code starts a thread which prints a message twenty times. The main thread then joins it using a futex and uses exit_group() to exit all threads.

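pub fn main() !void {
  @setAlignStack(16); // TODO superfluous?
  const stack = try newstack(64*1024);
  stack.entry = thread_body;
  stack.message = "hello world\n";
  stack.print_count = 20;
  stack.join_futex = 0;
  newthread(stack); // start the thread; it "returns" into stack.entry
  futex_wait(&stack.join_futex, 0); // sleep while join_futex is still 0
  linux.exit_group(0); // exit all threads
}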

Thread Body Function

The thread body function. This runs when the thread starts. It must never return; instead it exits its own thread.

const thread_entry_fn_t = *const fn(sh: *StackHead) void;

fn thread_body(stack: *StackHead) noreturn {
  debug.print("thread_body entered entry:{p} stackmsg:{s}\n", .{stack.entry, stack.message});
  const message = stack.message;
  const count = stack.print_count;
  for(0..count) |_| {
    write_all(os.STDOUT_FILENO, message) catch unreachable;
  }
  // signal completion, wake the joining main thread, then exit this thread only
  @atomicStore(usize, &stack.join_futex, 1, std.builtin.AtomicOrder.SeqCst);
  futex_wake(&stack.join_futex);
  linux.exit(0);
}

Pre Loading New Thread Stack

This is the stack overlay struct. It carries the thread entry function (as a fake return address) together with the data that the thread entry function receives.

const StackHead = struct {
  entry: thread_entry_fn_t align(16), // this should be equivalent to nullprogram's align on the struct
  message: []const u8,
  print_count: usize,
  join_futex: usize,
};
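The return address trick only works if entry sits at offset 0 of the overlay and the overlay is 16-byte aligned. A plain Zig struct does not promise a field order, so a compile-time check can turn a silent layout change into a build error. The following is a minimal sketch, not part of the original code: Head is a simplified stand-in for StackHead (the entry pointer is a plain usize so the example compiles on its own).

```zig
const std = @import("std");

// Simplified, hypothetical stand-in for StackHead.
const Head = struct {
    entry: usize align(16),
    print_count: usize,
    join_futex: usize,
};

comptime {
    // [ret] pops the word at the new stack pointer, so entry must come first.
    std.debug.assert(@offsetOf(Head, "entry") == 0);
    // The head is kept 16-byte aligned, as the align(16) on entry requests.
    std.debug.assert(@alignOf(Head) == 16);
    std.debug.assert(@sizeOf(Head) % 16 == 0);
}

pub fn main() void {
    std.debug.print("size={d} align={d}\n", .{ @sizeOf(Head), @alignOf(Head) });
}
```

If the compiler ever placed another field first, the first assertion would stop the build instead of letting the new thread jump to a garbage address.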

New Thread

newthread() enters the new thread function using the return address trick (i.e. it puts the function address on the stack in the return address position and uses the [ret] instruction to "return" to it).

This means newthread() breaks if it is inlined (and Zig will normally inline it). However, Zig doesn't allow marking a function as noinline (AFAICT), so I need the @call wrapper shown further below.

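fn _newthread(sh: *StackHead) void {
  // xchg:    rdi = clone flags, rsi = new stack pointer
  // syscall: clone(flags, stack); the new thread continues here on the new stack
  // mov:     pass the stack pointer as the first argument of the entry function
  // ret:     new thread pops stack.entry, main thread returns to its caller
  asm volatile (
      \\xchg   %rdi, %rsi
      \\syscall
      \\mov    %rsp, %rdi
      \\ret
      : // no return
      : [a] "{rax}" (linux.SYS.clone),
        [sh] "{rdi}" (sh),
        [S] "{rsi}" (0x50f00)
      : "rdi", "rcx", "r11", "memory"
  );
}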
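The literal 0x50f00 passed to clone() is a combination of flags from the Linux header <linux/sched.h>. The sketch below spells the value out; the constant names mirror the kernel's, and thread_flags is a hypothetical name, since the code above passes the literal directly.

```zig
const std = @import("std");

// Flag values from the Linux UAPI header <linux/sched.h>.
const CLONE_VM: usize = 0x00000100; // share the address space
const CLONE_FS: usize = 0x00000200; // share cwd, root and umask
const CLONE_FILES: usize = 0x00000400; // share the file descriptor table
const CLONE_SIGHAND: usize = 0x00000800; // share signal handlers
const CLONE_THREAD: usize = 0x00010000; // join the caller's thread group
const CLONE_SYSVSEM: usize = 0x00040000; // share System V semaphore undo values

// Hypothetical name for the combined value.
const thread_flags: usize = CLONE_VM | CLONE_FS | CLONE_FILES |
    CLONE_SIGHAND | CLONE_THREAD | CLONE_SYSVSEM;

comptime {
    // Compilation fails if the named flags do not add up to the literal.
    std.debug.assert(thread_flags == 0x50f00);
}

pub fn main() void {
    std.debug.print("clone flags: 0x{x}\n", .{thread_flags});
}
```

Naming the flags makes it easier to see that the new task shares memory, files and signal handlers with its creator, which is what makes it a thread rather than a separate process.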

This function must not have the normal enter/leave (prologue/epilogue) asm compiled in. Zig uses callconv(.Naked) to mark a function as having no enter/leave asm. However, when I set that on the function, I can no longer call it:

fn newthread(sh: *StackHead) callconv(.Naked) noreturn {
threads.zig:22:3: error: unable to call function with naked calling convention

Currently I rely on a hack: the release build of newthread omits the enter/leave code, so the code works. The debug build, however, fails unless I add these asm instructions at the head of newthread() to counteract the unwanted enter code.

\\add    $0x8,%rsp
\\pop    %rbp

This is not a proper solution. Does anyone know why Zig will not allow me to call a naked function? As long as I provide a [ret] and don't mess up the stack (unintentionally) I see no problem; what am I missing?

Zig doesn't appear to have a way to mark a function noinline, only a call. To avoid having to use this special call syntax for every new thread, I wrap it in a function. I then mark that wrapper as inline, so it should disappear and serve only to trick the compiler into never inlining _newthread.

fn newthread(sh: *StackHead) callconv(.Inline) void {
  @call(.never_inline, _newthread, .{sh});
}
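The wrapper pattern can be tried in isolation with a trivial function. The sketch below uses hypothetical names (_double, double, triple) and the inline fn spelling of an inline function. It also shows the noinline keyword that Zig accepts on function declarations, which may make the wrapper unnecessary; that variant has not been tried against the thread code here.

```zig
const std = @import("std");

// Hypothetical stand-in for _newthread: a function that must stay behind a
// real call instruction.
fn _double(x: u32) u32 {
    return x * 2;
}

// Wrapper in the style used above: the wrapper itself is inlined, the call
// inside it is forced to remain a call.
inline fn double(x: u32) u32 {
    return @call(.never_inline, _double, .{x});
}

// Alternative: mark the declaration itself as never inlined.
noinline fn triple(x: u32) u32 {
    return x * 3;
}

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ double(21), triple(14) });
}
```

Both forms keep the function body out of its callers; the keyword form does so without needing a second function.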

Creating The New Thread Stack With mmap

The stack for the new thread is created using mmap(). newstack() wraps a syscall to mmap with the correct flags, and then overlays the StackHead struct on the top of the resulting memory block in such a way that the return address trick will work.

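fn newstack(size: usize) !*StackHead {
  const m1 = @as(isize, -1);
  const m4096 = @as(isize, -4096);
  // prot 3 = PROT_READ|PROT_WRITE, flags 0x22 = MAP_PRIVATE|MAP_ANONYMOUS, fd = -1
  var p = linux.syscall6(linux.SYS.mmap, 0, size, 3, 0x22, @bitCast(usize,m1), 0);
  // a raw syscall reports failure as -errno, i.e. a value in [-4095, -1]
  if(p > @bitCast(usize,m4096)){
    return error.StackCreateFailed;
  }
  // put the StackHead in the last whole slot at the top of the block (stacks grow down)
  const count = size / @sizeOf(StackHead);
  p += (count-1)*@sizeOf(StackHead);
  return @intToPtr(*StackHead, p);
}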
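The two pieces of arithmetic in newstack() can be checked on their own, without calling mmap. This is a minimal sketch with hypothetical helper names (isSyscallError, headOffset); the head size of 48 bytes is an assumption for illustration, and the real value is whatever @sizeOf(StackHead) yields.

```zig
const std = @import("std");

// A raw Linux syscall reports failure by returning -errno, which as an
// unsigned value is one of the top 4095 values of usize.
fn isSyscallError(ret: usize) bool {
    const limit = @as(usize, 0) -% 4096; // same bit pattern as @as(isize, -4096)
    return ret > limit;
}

// Byte offset of the last whole head_size slot inside a block of size bytes.
fn headOffset(size: usize, head_size: usize) usize {
    const count = size / head_size;
    return (count - 1) * head_size;
}

pub fn main() void {
    const enomem = @as(usize, 0) -% 12; // -ENOMEM as a raw mmap call would return it
    std.debug.print("error? {}\n", .{isSyscallError(enomem)});
    std.debug.print("error? {}\n", .{isSyscallError(0x1000)});
    // 48 is an assumed head size, see the note above.
    std.debug.print("offset {d}\n", .{headOffset(64 * 1024, 48)});
}
```

With a 64 KiB block and a 48-byte head, the head lands at offset 65472, which is a multiple of 16, and everything below it is left for the thread's stack to grow into.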

Futex For Thread Sync

We need sync primitives so that the main thread can wait for the second thread. We use Linux futexes for this.

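fn futex_wait(pfutex: *usize, expect: usize) void {
  const futex = @ptrToInt(pfutex);
  // sleeps only if the futex word still equals expect; timeout 0 (null) means no timeout
  // note: the kernel reads a 32-bit word here (the low half of the usize on x86-64)
  _ = linux.syscall4(linux.SYS.futex, futex, linux.FUTEX.WAIT, expect, 0);
}

fn futex_wake(pfutex: *usize) void {
  const futex = @ptrToInt(pfutex);
  // wake up to 0x7fffffff waiters, i.e. all of them
  _ = linux.syscall3(linux.SYS.futex, futex, linux.FUTEX.WAKE, 0x7fffffff);
}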

The Code

The entire working code is here in threads.zig
