Lists with new and free

When people say that there are 'memory-unsafe languages', this API (malloc and free) accounts for most of what they mean, and even if you don't accept Rust's definition of memory safety, you can pick a better way to manage memory than this. If you have no idea what I'm talking about, please read https://www.rfleury.com/p/untangling-lifetimes-the-arena-allocator immediately.

Anyway, consider the following program:

package main

import "core:fmt"

// A list is nil or a pointer to one of these.
Cons :: struct($T: typeid) {
	car: T,
	cdr: ^Cons(T),
}

cons :: proc(car: $T, cdr: ^Cons(T)) -> (result: ^Cons(T)) {
	result = new(Cons(T))
	if result == nil {
		fmt.panicf("memory exhausted in cons")
	}
	result.car = car
	result.cdr = cdr
	return
}

// Free every cell of the provided list.
free_list :: proc(list: ^Cons($T)) {
	list := list
	for list != nil {
		next := list.cdr
		free(list)
		list = next
	}
}

// Print e.g. [1, 2, 3, 4] given a list of those values
println :: proc(list: ^Cons($T)) {
	list := list
	fmt.print('[')
	for list != nil {
		fmt.print(list.car)
		if list.cdr != nil {
			fmt.print(", ")
		}
		list = list.cdr
	}
	fmt.println(']')
}

// Return the last element of a list, if it exists.
last :: proc(list: ^Cons($T)) -> Maybe(T) {
	list := list
	for list != nil {
		switch list.cdr {
		case nil:
			return list.car
		case:
			list = list.cdr
		}
	}
	return nil
}

// Build e.g. the list [1, 2, 3, 4], given 4
iota :: proc(n: int) -> (result: ^Cons(int)) {
	n := n
	for n > 0 {
		result = cons(n, result)
		n -= 1
	}
	return
}

main :: proc() {
	list := cons(1, cons(2, cons(3, nil)))
	fmt.println(last(list).? or_else -1)
	free_list(list)

	list = iota(5)
	println(list)
	free_list(list)
}

Compared to lists/stack-allocated, this might seem to have a new problem: failure is possible, with a panic in cons. But failure was already possible with stack allocation; there it was just pretty sure to end in a stack overflow and a segfault. On modern Linux systems with overcommit, the practical 'failure condition' is murky even with that nice panic, as new can claim to succeed when memory isn't actually available, and the Linux OOM killer can start making life unpleasant while this program allocates unimpeded. On a server with extremely heavy I/O that almost always hits the filesystem cache, nightmarish conditions are possible even without the OOM killer kicking in, as memory pressure shrinks the cache and that I/O starts hitting slow disks.

Low memory conditions are pretty bad all around, in any language.

Anyway, we can confirm that dynamic allocation happens in this program, and also confirm how it responds to memory exhaustion, by setting the nil allocator:

$ odin run malloc1.odin -file -default-to-nil-allocator
malloc1.odin(14:7) Panic

Great! Now, running it with the default context allocator:

$ odin run malloc1.odin -file 
3
[1, 2, 3, 4, 5]

This is pretty standard malloc/free code. From main's perspective all that's changed is that there's now a convenience function to construct cons cells, and lists are freed explicitly. Despite the insistence above that this is a bad way to manage memory, it's a very popular one, and you could certainly make a useful and even a correct application that manages memory like this.

For fun, here's that cons-wasteful program again, this time as 100mil_malloc.odin:

package main

import "core:fmt"
import "core:time"

// A list is nil or a pointer to one of these.
Cons :: struct($T: typeid) {
	car: T,
	cdr: ^Cons(T),
}

// Free every cell of the provided list.
free_list :: proc(list: ^Cons($T)) {
	list := list
	for list != nil {
		next := list.cdr
		free(list)
		list = next
	}
}

// Return the last element of a list, if it exists.
last :: proc(list: ^Cons($T)) -> Maybe(T) {
	list := list
	for list != nil {
		switch list.cdr {
		case nil:
			return list.car
		case:
			list = list.cdr
		}
	}
	return nil
}

// Build e.g. the list [1, 2, 3, 4], given 4
iota :: proc(n: int) -> (result: ^Cons(int)) {
	n := n
	for n > 0 {
		cell := new(Cons(int))
		if cell == nil {
			fmt.panicf("exhausted memory in iota on n=%d", n)
		}
		cell^ = Cons(int){n, result}
		result = cell
		n -= 1
	}
	return
}

main :: proc() {
	start := time.tick_now()

	list := iota(100_000_000)
	fmt.println("iota:", time.tick_since(start))

	fmt.println(last(list))
	fmt.println("last:", time.tick_since(start))

	free_list(list)
	fmt.println("free:", time.tick_since(start))
}

With output:

$ odin run 100mil_malloc.odin -file -o:speed 
iota: 5.331800493s
100000000
last: 5.768355374s
free: 7.13808926s

That's only about seven times slower than the static option, but it also uses 4.5 GB of RAM. Note that free_list certainly isn't free, needing 1.3s just to clean up. In a batch program like this you could save that time by simply not calling free_list; the only casualty is that memory checkers become harder to use, since they'll report the unfreed list as a leak. But as soon as the program doesn't exit immediately after it's done with the list, you're back to spending 1.3s on cleanup.

Why is freeing so expensive?!

You have to involve the allocator 100 million times, once per cell of the list. So it's really linked lists that are expensive here. But even if you use an array instead, once the array contains RAII objects (or objects implementing the Drop trait, in Rust), you pay a 100,000,000x cost one way or another. You might appreciate a language that doesn't push you into such patterns of memory management.
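One way out, as the arena article linked at the top suggests, is to give every cell of a list the same lifetime, so cleanup is a single operation rather than a walk. Here's a minimal sketch using Odin's core:mem fixed-buffer Arena (arena_init, arena_allocator, free_all); the fixed 1 MB backing buffer is my own simplifying assumption, and a real program might prefer a growing, virtual-memory-backed arena:

```odin
package main

import "core:fmt"
import "core:mem"

// A list is nil or a pointer to one of these.
Cons :: struct($T: typeid) {
	car: T,
	cdr: ^Cons(T),
}

main :: proc() {
	// Back the arena with one fixed buffer.
	backing := make([]byte, 1 * mem.Megabyte)
	defer delete(backing)

	arena: mem.Arena
	mem.arena_init(&arena, backing)
	arena_alloc := mem.arena_allocator(&arena)

	// Build [1, 2, 3, 4] out of the arena.
	list: ^Cons(int)
	for n := 4; n > 0; n -= 1 {
		cell := new(Cons(int), arena_alloc)
		cell^ = Cons(int){n, list}
		list = cell
	}

	for c := list; c != nil; c = c.cdr {
		fmt.println(c.car)
	}

	// One call releases every cell, however long the list got,
	// instead of visiting the allocator once per cell.
	free_all(arena_alloc)
}
```

The trade is explicit: you give up per-cell freeing, and in exchange free_list disappears entirely.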

Slower than Python?

Even a completely naive Python program is faster and uses less memory than this, taking under 6s and 3.7 GB of RAM:

def iota(n):
    return [i for i in range(1, n+1)]

list = iota(100_000_000)
print(list[-1])

How is that possible? Python's not using linked lists. Here's a comparable Odin program:

package main

import "core:fmt"

iota :: proc(n: int) -> (result: [dynamic]int) {
	for i in 0 ..< n do append(&result, i+1)
	return
}

main :: proc() {
	list := iota(100_000_000)
	fmt.println(list[len(list)-1])
	delete(list)
}

This runs in about 830ms, needing 1 GB of RAM.
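If you know n up front, you can go a little further: reserving capacity means append never has to reallocate and copy the backing array as it grows. A sketch of that variant (the use of Odin's built-in reserve here is the only change):

```odin
package main

import "core:fmt"

// Same iota, but with capacity reserved up front so append
// never reallocates the backing array mid-loop.
iota :: proc(n: int) -> (result: [dynamic]int) {
	reserve(&result, n)
	for i in 0 ..< n do append(&result, i+1)
	return
}

main :: proc() {
	list := iota(10)
	fmt.println(list[len(list)-1]) // 10
	delete(list)
}
```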

More on linked lists

Linked lists aren't associated with garbage collection (Lisp, SML) by accident: they permit constructions like circular lists, or multiple 'heads' of lists sharing the same 'tail', that greatly complicate the task of a function like free_list. For example, here's a program that prints a list forever:

package main

import "core:fmt"
import "core:time"

// A list is nil or a pointer to one of these.
Cons :: struct($T: typeid) {
	car: T,
	cdr: ^Cons(T),
}

cons :: proc(car: $T, cdr: ^Cons(T)) -> (result: ^Cons(T)) {
	result = new(Cons(T))
	if result == nil {
		fmt.panicf("memory exhausted in cons")
	}
	result.car = car
	result.cdr = cdr
	return
}

// Print e.g. [1, 2, 3, 4] given a list of those values
println :: proc(list: ^Cons($T)) {
	list := list
	fmt.print('[')
	for list != nil {
		fmt.print(list.car)
		if list.cdr != nil {
			fmt.print(", ")
		}
		list = list.cdr
		time.sleep(500 * time.Millisecond) // !
	}
	fmt.println(']')
}

main :: proc() {
	list := cons(1, cons(2, cons(3, nil)))
	list.cdr.cdr.cdr = list
	println(list)
}

Usage:

$ odin run circular.odin -file
[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ... continues forever
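Shared tails cause free_list similar trouble even without a full cycle. A sketch, reusing the definitions from above (the second free_list is left commented out because running it would be a double free):

```odin
package main

// A list is nil or a pointer to one of these.
Cons :: struct($T: typeid) {
	car: T,
	cdr: ^Cons(T),
}

cons :: proc(car: $T, cdr: ^Cons(T)) -> (result: ^Cons(T)) {
	result = new(Cons(T))
	result.car = car
	result.cdr = cdr
	return
}

// Free every cell of the provided list.
free_list :: proc(list: ^Cons($T)) {
	list := list
	for list != nil {
		next := list.cdr
		free(list)
		list = next
	}
}

main :: proc() {
	tail := cons(1, nil)
	x := cons(2, tail) // the list [2, 1]
	y := cons(3, tail) // the list [3, 1] — shares tail's cell with x
	free_list(x)       // frees x's own cell and tail's cell
	// free_list(y)    // would free tail's cell a second time, after
	//                 // reading y.cdr from it while it's already freed
	free(y)            // safe: frees only y's own cell
}
```

Nothing in the types distinguishes y, whose tail is owned elsewhere, from a list that free_list may walk to the end, which is exactly the bookkeeping a garbage collector does for you.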

Circular lists can come up even when you don't want them, as in this bug in the Apache webserver:

  1. an Apache module tries to spawn a subprocess
  2. to do that, it uses fork() to separate itself off into a new process
  3. it then, before calling exec(), dutifully cleans up some internal Apache data structures
  4. ... which data structures included some circular lists, because other threads were modifying those lists concurrently with the fork, and those modifications got interrupted partway
  5. result: instead of spawning a subprocess, Apache spawned a process that consumed 100% CPU walking a circular list forever
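If you must free lists that might be circular, you can at least detect the cycle before walking them. A sketch of Floyd's tortoise-and-hare check against the same Cons struct; the name has_cycle is my own:

```odin
package main

import "core:fmt"

// A list is nil or a pointer to one of these.
Cons :: struct($T: typeid) {
	car: T,
	cdr: ^Cons(T),
}

// Floyd's cycle detection: advance one pointer by one cell and
// another by two; they meet again if and only if the list loops.
has_cycle :: proc(list: ^Cons($T)) -> bool {
	slow, fast := list, list
	for fast != nil && fast.cdr != nil {
		slow = slow.cdr
		fast = fast.cdr.cdr
		if slow == fast do return true
	}
	return false
}

main :: proc() {
	a := new(Cons(int)); a.car = 1
	b := new(Cons(int)); b.car = 2
	a.cdr = b

	fmt.println(has_cycle(a)) // false
	b.cdr = a                 // tie the knot
	fmt.println(has_cycle(a)) // true
}
```

It runs in O(n) time and O(1) space, which is why fork-time cleanup code like Apache's could afford a guard like this, though the real fix for that bug was not to walk concurrently-mutated structures after fork at all.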