Zero-Cost Abstractions

We’re very happy to present Airspeed Velocity, the nom de plume behind the eponymous and highly regarded Swift blog. In this talk, Airspeed considers the problems of working dynamically at run time. Using a number of persuasive examples, he shows how Swift’s static functionality lets you make much better use of compile-time optimizations, of the kind that more dynamic code would block. Much of Swift’s type system is focused on correctness and managing complexity, but it also allows you to write faster code at a higher level of abstraction — and with almost zero cost.

This talk is presented with audio and slides only, to preserve anonymity.

Annotated slides with extra information, credits and further links, can be found here.


Swift Problems: Background (0:00)

I spent a lot of time over the Christmas holidays on Stack Overflow, answering as many questions about Swift as I could. It was really interesting, because it told me a lot about the kinds of things that people new to Swift were finding confusing, and what kind of idioms they were missing. A number of things frequently crop up, so I collected them together. Some of them are fairly obvious, like people using var instead of let. I’ve come to really appreciate the way F# handles mutable values, which have to be declared with let mutable — lazy people do the right thing by default. Objective-C interoperability can be quite surprising: it sets up an expectation that you can cast between different kinds of types, when really that only applies specifically to the NS types.

It’s the other type of question in my collection that I’m going to talk about today. People love doing things dynamically at runtime. They’re desperate to do it, in fact. Even when the types that they’re dealing with, and the things that they’re operating on, can be statically known at compile time, they still want to determine the types dynamically. I think this comes from the fact that many languages, including Objective-C, don’t have much in the way of elaborate static functionality. Swift does, which means that there’s an alternative. A relevant quote comes to mind: “Programming languages teach you not to want what they don’t provide.” Swift does provide some of these alternatives.

The second most popular Stack Overflow Swift question is generally, “How do I make this work? What is this?”. Users want to define a generic function, then at compile time they want to give it a type, and will cast an Any to that type. Then they give it another function, which gets applied to the value at that specific type, whether that type is an Int, a String, or a custom type. This is a weird and very inefficient form of runtime polymorphism, roughly the shape sketched below.
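
To make that concrete, here’s a hypothetical sketch of the pattern (the function and its names are mine, written in the same Swift 1.x style as the rest of this talk, to illustrate the anti-pattern rather than recommend it):

func applyDynamically<T>(value: Any, type: T.Type, f: T -> ()) {
    // Cast the Any back to a type the caller already knew statically;
    // the runtime check here is pure overhead.
    if let typed = value as? T {
        f(typed)
    }
}

applyDynamically(42, Int.self) { n in println(n + 1) }
applyDynamically("hello", String.self) { s in println(count(s)) }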

The answer they get back is usually not very satisfying, because generics are a compile-time feature. They’re designed to be used with static types, and mixing static generics with dynamic types at runtime is rather like mixing steak and ice cream. Steak is great. Ice cream is great. Steak with ice cream, well, it’s a bit unpleasant. At that point, they usually give up.

Here’s an example: you create two classes, and the only thing they have in common is a method with the same name. You can write a function that takes an AnyObject and calls the method of that name. It works, and so the question is “Is this Swift?”, both in the sense, “Are you writing Swift, or are you writing Objective-C in Swift?”, and, “Is it Swift”, as in, “Is it fast?”. The answer is, “It might be fast enough”. In many circumstances, however, it isn’t necessarily fast.

@objc class C {
    func doThing() { println("something") }
}

@objc class D {
    func doThing() { println("something else") }
}

func takeThing(o: AnyObject) {
    o.doThing()
}

takeThing(C())
takeThing(D())

Dynamic Typing (3:40)

The way I think about dynamic typing in Swift is that it’s like any other language feature: if your use case is appropriate, it can be really useful. If you want to build something like a serialization framework, dynamic typing is really handy. But it’s definitely possible to overuse it in inappropriate circumstances. So the question really is, “If I’m not going to use dynamic typing, and I use static typing instead, what do I get in return?” This is the “What’s in it for me?” question. A number of people have given really excellent talks about the value of values, and how, if you use value types rather than reference types, they can help reduce the complexity of your code and make it significantly easier to reason about. But these talks can pretty much be paraphrased as, “Programming, you’re doing it wrong”.

Four-Character Codes (4:43)

I’m going to detail a couple of examples where static types let you do something that’s actually very appealing. If you do more statically, at compile time, you can make your code faster, and let’s face it, everybody wants fast code. Another question that comes up quite frequently is “Aren’t these value types going to make my code slower, because I’ll be copying them all the time?”. In languages like Haskell, the copying is often an illusion: it’s just the way that the language models values, not what actually happens when you compile your code.

The first example is a four-character code, which is a way of representing four ASCII characters as a 32-bit unsigned integer. You do this by taking each of the four characters, shifting it into place in one of the four bytes of the 32-bit integer, and OR-ing them all together in turn. Here, you can see this applied to a string, to generate a four-character code:

extension FourCharCode {
    init(fromFourCharString str: StaticString) {
        precondition(str.byteSize == 4)

        self = str.withUTF8Buffer { buf in
            var result: FourCharCode = 0
            for byte in buf {
                result <<= 8
                result |= FourCharCode(byte)
            }
            return result
        }
    }
}

It extends FourCharCode, which is just a type alias for an unsigned 32-bit integer. It gives it an init method that takes a static string, which is a type in the Swift Standard Library — a very cut-down and immutable string that can be completely known at compile time. You can only create it from string literals; you can’t create it by composing them together at runtime. The code checks that there are only four characters in the string, then iterates over the string buffer, shifting the result to the left and OR-ing in the next character in turn. If you want to create the four-character code for ‘utxt’ (Unicode text), you use this init method, and you can print it out.

println("0x" + String(FourCharCode(fromFourCharString: "utxt"), radix: 16))
// prints 0x75747874

If you’re doing bit-wise arithmetic, there’s a really handy feature of String that allows you to supply a radix, which lets you print out hex digits. If you supply a radix of two, it prints out the binary bits. If you compile this code with swiftc and look at the assembly with the optimizer turned on, you’ll see that the LLVM compiler is really awesome. It looks at your code and says, “I know what this is, I know what these four characters are at compile time, and I know exactly what the code does”. It doesn’t even call your function, but collapses it into a single constant at compile time. It runs your code at compile time on that string. This kind of optimization becomes increasingly available as you use more static features of the language.
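
As a quick aside on that radix feature, here are a couple of illustrative values (using the same String(_:radix:) initializer shown above):

println(String(0xbadbeef, radix: 16)) // prints badbeef
println(String(10, radix: 2))         // prints 1010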

Using reduce (8:13)

You might look at that previous code and say, “Okay, but I thought we were writing Swift. We’re supposed to use things like map and reduce”. If you rewrite it with reduce, the code becomes a little bit simpler, because it’s leveraging the Standard Library’s reduce function. You supply reduce with an initial value and a closure that takes the running total, shifts it to the left by eight bits, and then ORs in the next character in turn. If you do this, the compiler will still figure out a way to reduce the code down to a single compile-time constant. Even though you’re using a higher-order function from the Swift Standard Library, the compiler is still able to do the compile-time optimization.

extension FourCharCode {
    init(fromFourCharString str: StaticString) {
        precondition(str.byteSize == 4)

        self = str.withUTF8Buffer { buf in
            reduce(buf, 0) { code, byte in
                (code << 8) | FourCharCode(byte)
            }
        }
    }
}

Using String (9:05)

Suppose you don’t want to use a static string, because sometimes you create the four-character code at runtime from something like user input. You might say, “Okay, I’ll just replace static string with string”. Here’s the version of the code that does just that:

extension FourCharCode {
    init(fromFourCharString str: String) {
        // String has no byteSize, so count the UTF-8 code units instead
        precondition(count(str.utf8) == 4)

        self = reduce(str.utf8, 0) { code, byte in
            (code << 8) | UInt32(byte)
        }
    }
}

It’s even simpler, because strings are easy to use, but unfortunately, if you compile this code, you won’t get the compile-time generation of a constant value. The behavior of String is slightly more dynamic, so the compiler is unable to optimize the code and generate a constant at compile time. In the future, the compiler might improve, so that if you’re initializing with a string literal, it would take care of the constant collapsing. This is another big reason in favor of using Swift’s static features: as Swift improves, so long as you’re using as static a type as you can, it’s quite possible that your code will get faster without you making any changes.

I think about compiler optimizations like this by analogy with the kid’s game, “Buckaroo!”. If you’re not familiar with Buckaroo, you take turns putting bags on the back of a donkey; eventually, one unlucky child puts a new bag on the donkey, and the irritable plastic animal kicks upwards, throwing everything into the air. You can think of the compiler in a similar way: the more dynamic runtime behavior you put into your code, the more you are hampering the optimizer’s ability to do things like constant collapsing, or to remove certain bits of code that it knows can’t execute. The kinds of things that will guarantee a kicking optimizer — which will just give up, leaving your code as it found it — include using Any and AnyObject, or using dynamic typing capabilities. The compiler simply doesn’t know what the types are at compile time.

Nibble Sort Challenge (11:07)

Here’s another example. John Regehr, a professor at the University of Utah, posted a challenge to write the fastest sorting algorithm to sort all of the hex digits in an unsigned 64-bit integer — a pretty straightforward bit-shifting challenge. If you put in 0xbadbeef, it returns all of the hex digits sorted in order, with the zeroes that were implicit at the front moved to the end. He gave a reference implementation that was deliberately substandard. It provided a read operation, where you supply an index and get back the hex digit at that index, and a write operation, where you give it an index and a value, and it sets the hex digit at that index in the integer.

nibble_sort_word_ref(0xbadbeef)
// returns 0xfeedbba000000000

// Read the hex digit at index i (index 0 is the least significant nibble).
func read_nibble(w: UInt64, i: UInt64) -> UInt64 {
    let res = w >> (i * 4)
    return res & 0xf
}

// Clear the hex digit at index i, then OR in the new value v.
func write_nibble(inout w: UInt64, i: UInt64, v: UInt64) {
    var mask: UInt64 = 0xf
    mask <<= (i * 4)
    w &= ~mask
    var prom = v
    prom <<= (i * 4)
    w |= prom
}
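
For context, here’s a rough reconstruction (mine, in Swift, not John Regehr’s original C) of the kind of selection sort the reference implementation builds on top of these two functions:

// Selection sort over the 16 nibbles: repeatedly find the smallest remaining
// nibble and swap it into place, so the smallest ends up in the lowest nibble.
func nibble_sort_word_selection(word: UInt64) -> UInt64 {
    var w = word
    for i in UInt64(0)..<16 {
        var smallest = i
        for j in (i + 1)..<16 {
            if read_nibble(w, j) < read_nibble(w, smallest) {
                smallest = j
            }
        }
        if smallest != i {
            let tmp = read_nibble(w, i)
            write_nibble(&w, i, read_nibble(w, smallest))
            write_nibble(&w, smallest, tmp)
        }
    }
    return w
}

nibble_sort_word_selection(0xbadbeef) // 0xfeedbba000000000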

Once you have the read and write operations, it’s fairly straightforward to write a sorting function using them; this is basically a selection sort, like the sketch above. However, if you think about it, reading and writing hex digits at an index is exactly what a collection does, and so those are really subscript operations. You can write a structure that wraps that 64-bit integer, gives it a subscript with a getter and setter, and makes it conform to the MutableCollectionType protocol. If you do that, then you can use the Standard Library’s sort on something that isn’t even a collection — it’s just a single integer, but it still works, and actually does the nibble-sorting challenge for you.

struct Nibbles: MutableCollectionType {
    var val: UInt64

    var startIndex: UInt64 { return 0 }
    var endIndex:   UInt64 { return 16 }

    subscript(idx: UInt64) -> UInt64 {
        get {
            return (val >> (idx * 4)) & 0xf
        }
        set(n) {
            let mask = 0xf << (idx * 4)
            val &= ~mask
            val |= n << (idx * 4)
        }
    }

    typealias Generator = IndexingGenerator<Nibbles>
    func generate() -> Generator { return Generator(self) }
}

// use the stdlib's sort
var col = Nibbles(val: 0xbadbeef)
sort(&col)
col.val // is 0xfeedbba000000000

You might think that you’re paying a big penalty for wrapping an integer in a structure like that, and giving it all of these getting and setting operations, but you’re really not. If you wrote this as a class, or in Objective-C, and these methods were dispatched via message sending, then yes, it’s likely that you would pay a runtime penalty. With structs, there is no penalty. The size and behavior of that collection struct are exactly as if you were operating directly on the underlying type itself. The upshot of this: you can write the challenge and beat the reference implementation purely by using the Standard Library sort. This isn’t because mutable collections are a kind of magic, but because the Standard Library sort is really good. Instead of being a selection sort as we saw before, it’s a hybrid of a quicksort, a heap sort beyond certain recursion depths, and an insertion sort for very small inputs.

Zero-Cost Abstractions (14:11)

Those are examples of the kinds of things that Swift is really good at, namely, zero-cost abstractions. These abstractions are really important, so important that you’re actually using one every time you write Swift. The most fundamental type in Swift is actually not a fundamental type: Int is just a struct that wraps the real integer inside it, and this is what allows you to do things like extend Int with your own functions. It’s what allows the standard library to do assertions on overflow, and so on. Because structs in Swift are zero-cost, you can do some really efficient operations while still working at a really high level of abstraction.

// Simplified sketch of how the Standard Library defines Int. This isn't code
// you can write yourself: Builtin is only visible inside the Standard Library.
struct Int {
    var value: Builtin.Int64
}

func +(lhs: Int, rhs: Int) -> Int {
    precondition(no overflow)  // pseudocode: check for overflow and trap
    // ... return an Int wrapping the sum of the two Builtin values
}

Generics are another zero-cost abstraction, and they’re what makes sort work. If you compare the following two functions, the first takes a value of protocol type, which lets you perform operations on it through the protocol. The second doesn’t take a protocol; it takes a generic placeholder that has to conform to that protocol.

func f(x: P) { }
// versus
func f<T: P>(x: T) { }

The distinction between these two function declarations is that, at compile time, the second one gets specialized, so you’re not actually operating through the protocol. You are operating directly on the type, and the compiler will actually write the function for you as if you had written it directly against Int. The good thing is that, depending on what protocol the placeholder conforms to, it can work on Ints and Strings alike. The Standard Library sort is an example: it’s a generic function that works on any type of collection. It works on arrays, and on things like the nibble-sorting collection. It really doesn’t care. It’s as efficient as if you’d written it directly against the types themselves, which, I think, is a pretty powerful capability of generics.
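
To make that concrete, here’s a small illustrative example (the protocol and function names are mine, not from the talk):

protocol Describable {
    func describe() -> String
}

extension Int: Describable {
    func describe() -> String { return "Int: \(self)" }
}

extension String: Describable {
    func describe() -> String { return "String: \(self)" }
}

// Protocol version: the argument is boxed, and the call goes through a witness table.
func printDynamic(x: Describable) {
    println(x.describe())
}

// Generic version: the compiler can specialize this for each concrete type,
// so the optimized code operates on Int or String directly.
func printGeneric<T: Describable>(x: T) {
    println(x.describe())
}

printGeneric(42)      // works on Int
printGeneric("hello") // works on String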

Write Your Own Fast Generic Functions! (16:06)

Generics are available to you, so you too can write really efficient code that operates directly on the types, using protocol conformance. Suppose for a second that the Swift sort wasn’t quite so good for small collections: perhaps it didn’t implement insertion sort and was just a quicksort. You could write an insertion sort yourself that did exactly what the Swift sort does, and it would be just as fast… or it would be, if you wrote it properly.
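
Here’s a rough sketch of what that might look like, written in the same Swift 1.2-era style as the rest of this talk (this is my illustration, not the Standard Library’s actual implementation):

// A generic insertion sort over any mutable collection with bidirectional
// indices and comparable elements. It works on arrays and on the Nibbles
// wrapper above, and the compiler can specialize it per concrete type.
func insertionSort<C: MutableCollectionType where C.Index: BidirectionalIndexType,
                                                  C.Generator.Element: Comparable>(inout col: C) {
    if col.startIndex == col.endIndex { return }
    var i = col.startIndex.successor()
    while i != col.endIndex {
        let value = col[i]
        var j = i
        // Shift larger elements one slot to the right, then drop value into place.
        while j != col.startIndex && value < col[j.predecessor()] {
            col[j] = col[j.predecessor()]
            j = j.predecessor()
        }
        col[j] = value
        i = i.successor()
    }
}

var nibbles = Nibbles(val: 0xbadbeef)
insertionSort(&nibbles)
nibbles.val // 0xfeedbba000000000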

The final moral here: don’t unintentionally rewrite your own versions of Standard Library functions. This is something that you see people doing quite often. It’s really important to stay familiar with the contents of the Standard Library, because the Standard Library functions are very efficiently implemented. In many cases, for example with map or reduce, they actually drop down to pointers and avoid a lot of the overhead that you sometimes experience with things like subscripting or indexing.
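
For example, here is an illustrative comparison (not from the talk) of a hand-rolled loop versus the Swift 1.2-era built-ins:

let values = [3, 1, 4, 1, 5]

// Hand-rolled loop:
var doubledByHand = [Int]()
for v in values {
    doubledByHand.append(v * 2)
}

// Standard Library equivalents:
let doubled = values.map { $0 * 2 }   // [6, 2, 8, 2, 10]
let total = reduce(values, 0, +)      // 14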

Stay Familiar with the Swift Standard Library (17:15)

Q&A (18:13)

Q: What’s the difference between the protocol version and the generic version? Do you think it’s a temporary limitation, or is it a structural problem?
Airspeed: So the difference here is that, as of Swift 1.2, a protocol value can hold even a struct; you can decay an actual struct to a protocol, and at that point it’s a reference to that struct. If, for example, you take the size of that protocol object, you’ll see that it’s actually 24 bytes. It’s not the eight bytes of, say, an integer. And so protocols, and operating on things through protocols, are fundamentally not as efficient under all circumstances as operating directly on the types themselves. You are really adding a layer of abstraction.
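
One rough way to see that extra size for yourself (a sketch with hypothetical names; the exact byte counts vary by Swift version and protocol):

protocol Shaped {
    func area() -> Double
}

struct Square: Shaped {
    let side: Double
    func area() -> Double { return side * side }
}

let concrete = Square(side: 2)
let boxed: Shaped = concrete

println(sizeofValue(concrete)) // the size of the struct itself
println(sizeofValue(boxed))    // larger: the protocol existential adds overhead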

The cool thing about generics is that there is no layer of abstraction. At compile time, everything drops away. That’s a small simplification; there is more to it than that. Sometimes the compiler doesn’t do that, because you might get code bloat. It’s a problem that C++ suffers from, where too much template usage causes your binary to bloat to a huge size. The Swift team has done some very clever stuff to allow a kind of runtime templating as well, but for the most part it’s much more efficient to operate directly on the types, and that’s what generics give you the capability to do.

Q: What are your tools of the trade? How do you find all this information?
Airspeed: Writing import Swift and then Option-clicking it is actually quite useful. If you play around with swiftc a bit, you can actually get it to spit out lots of the intermediary stages. You can get it to dump the Abstract Syntax Tree, which shows you exactly what kind of type inference is going on. You can also get it to dump the assembly language directly to the console.

If you’re writing little short snippets, one of the nice things about Swift is the ability to write really useful bits of code in 10 lines. Then you can compile it and look at the assembly, and it actually makes sense to you, so long as you know a little bit of assembly. One of the things I like about Swift is that it lets you write code like it’s Ruby. You can use map and reduce — it’s a high-level language. For example, the reference code for the nibble sorting on John Regehr’s website was originally in C, but I ported it to Swift. If you compile it with the most aggressive optimization in Swift, it performs the same as the C. If you look at the assembly code that it spits out and compare it to the C assembly code, it’s very similar.

What’s so exciting to me about Swift is the fact that we have a high-level language, with the potential of having the actual runtime capability of C and C++ when it gets down to the assembly level, so long as you use it in the right way.

The key is learning how to use it in the right way.



Airspeed Velocity