odin notes: lists/stack-allocated

Stack-Allocated Lists?!

You probably shouldn't use bare linked lists all the time in Odin, but they're simple while needing a lot of allocation, so they're illustrative.

Consider the following program:

package main

import "core:fmt"

// A list is nil or a pointer to one of these.
Cons :: struct($T: typeid) {
	car: T,
	cdr: ^Cons(T),
}

// Print e.g. [1, 2, 3, 4] given a list of those values
println :: proc(list: ^Cons($T)) {
	list := list
	fmt.print('[')
	for list != nil {
		fmt.print(list.car)
		if list.cdr != nil {
			fmt.print(", ")
		}
		list = list.cdr
	}
	fmt.println(']')
}

// Return the last element of a list, if it exists.
last :: proc(list: ^Cons($T)) -> Maybe(T) {
	list := list
	for list != nil {
		switch list.cdr {
		case nil:
			return list.car
		case:
			list = list.cdr
		}
	}
	return nil
}

// Build e.g. the list [1, 2, 3, 4], given 4
iota :: proc(n: int) -> (result: ^Cons(int)) {
	n := n
	for n > 0 {
		result = &Cons(int){n, result}
		fmt.println(result) // NB. addresses!
		n -= 1
	}
	return
}

main :: proc() {
	list := &Cons(int){1, &Cons(int){2, &Cons(int){3, nil}}}

	fmt.println(last(list).? or_else -1)
	println(iota(5))
}

This defines a generic Cons type that contains either a value of the given type, or a pointer to another Cons - i.e., a cons cell of a linked list, using Lisp terminology. It also defines last to return the last element of a list (if it exists) and iota to produce a list of integers. In main it constructs a list, prints the last element of it (or -1), and then it prints... garbage:

$ odin run static1.odin -file -default-to-nil-allocator
3
&Cons(int){car = 5, cdr = <nil>}
&Cons(int){car = 4, cdr = 0x7FFC8E972EC0}
&Cons(int){car = 3, cdr = 0x7FFC8E972EC0}
&Cons(int){car = 2, cdr = 0x7FFC8E972EC0}
&Cons(int){car = 1, cdr = 0x7FFC8E972EC0}
[140722700758704,

'cause this ain't Go. The compiler isn't performing escape analysis to decide to allocate those Cons cells on the heap because they're getting returned. The compiler is dutifully allocating them on the stack and then letting the return invalidate the data that was returned. Although it was invalid before it was returned as iota's output shows: each of the 'created' cons cells is the same.

So that won't do at all.

iota, though, is nice in that you know in advance exactly how many cons cells it will need, so those could be provided:

// Build e.g. the list [1, 2, 3, 4], given 4
iota :: proc(n: int, cells: []Cons(int)) -> (result: ^Cons(int)) {
	n := n
	for n > 0 {
		cells[n - 1] = Cons(int){n, result}
		result = &cells[n - 1]
		n -= 1
	}
	return
}

main :: proc() {
	list := &Cons(int){1, &Cons(int){2, &Cons(int){3, nil}}}
	fmt.println(last(list).? or_else -1)

	cells: [5]Cons(int)
	println(iota(5, cells[:]))
}

Which does work:

$ odin run static2.odin -file -default-to-nil-allocator
3
[1, 2, 3, 4, 5]

There! Fully stack-allocated lists, with -default-to-nil-allocator confirming there's no sneaky dynamic allocation. But obviously this won't get you far with list functions.

But this idea, of giving iota exactly the cons cells it needs to function, that that could be adapted to giving cons cells in abundance from which it'd only pick out those it needs, or to giving it a function that would return cons cells somehow. And with list functions taking a 'source of cons cells' as an argument, they can pass it on to other list functions that they call internally. What an idea.

Let's end with some cons-wasteful code, a program called 100mil_static (no longer using the stack to be clear, as stacks tend not to fit 1.5 GB arrays, but it's similar):

package main

import "core:fmt"

// A list is nil or a pointer to one of these.
Cons :: struct($T: typeid) {
	car: T,
	cdr: ^Cons(T),
}

// Return the last element of a list, if it exists.
last :: proc(list: ^Cons($T)) -> Maybe(T) {
	list := list
	for list != nil {
		switch list.cdr {
		case nil:
			return list.car
		case:
			list = list.cdr
		}
	}
	return nil
}

// Build e.g. the list [1, 2, 3, 4], given 4
iota :: proc(n: int, cells: []Cons(int)) -> (result: ^Cons(int)) {
	n := n
	for n > 0 {
		cells[n - 1] = Cons(int){n, result}
		result = &cells[n - 1]
		n -= 1
	}
	return
}

CELLS: [100_000_000]Cons(int)

main :: proc() {
	list := iota(100_000_000, CELLS[:])
	fmt.println(last(list))
}

On my machine this takes, with default (-o:minimal) optimizations, about 2.5 seconds and 1.5 GB of RAM to calculate that the last member of [1, 2, ..., 100e6] is 100e6. That ain't bad, relatively speaking. With -o:speed it goes down to 1.2 seconds. We'll come back to this.

for reference...

(defun iota (n)
  (loop for m upto n collect m))

(defun main ()
  (princ (last (iota 100000000)))
  (terpri))

(main)

This Common Lisp program gets the result in about 3s and 2.5 GB of RAM, run with sbcl --dynamic-space-size $((5*1024)) --script 100mil.lisp. Optimizing with DECLAIM didn't significantly change this.

This Scheme program, compiled with chicken-csc -O5 -unsafe -disable-interrupts -fixnum-arithmetic -disable-stack-overflow-checks -specialize -strict-types, completes in about 12 seconds and 4.5 GB of RAM.

(define (last lst)
  (if (null? (cdr lst))
    (car lst)
    (last (cdr lst))))

(define (iota n acc)
  (if (= n 0)
    acc
    (iota (- n 1) (cons n acc))))

(define (main)
  (let ((lst (iota 100000000 (list))))
    (display (last lst))
    (newline)))

(main)