Embrace Immutability

Parsing inherently heterogeneous data has always been a point of conversation within the Swift community. After nearly 2 years of different approaches to this problem, Keith will look at where we have been and how today’s Swift language features provide for cleaner and safer solutions to immutable models.


Introduction (0:00)

I’m Keith Smiley, and I work at Lyft in San Francisco. In this post, I will address two related topics: parsing heterogeneous data in Swift (focusing on JSON), and immutability. Finally, I’ll also discuss why you should use the open source library Mapper.

Mapper: Not a JSON Parser (1:03)

First of all, Mapper has nothing to do with actual JSON parsing, i.e. taking a string and making a dictionary, like NSJSONSerialization does. Mapper also does not let you create JSON from model objects, because we did not find a solution that we liked within the confines of how Mapper works today. I am going to talk about this blob of JSON, and how we can use it to get a useful model object. I’ll reiterate, though, that none of this is specific to JSON:

{
  "user": {
    "id": "123",
    "photoURL": "http://example.com/123.jpg",
    "phone": {
      "verified": false
    }
  }
}

The original version of Mapper (pre-Swift 1.0) had disadvantages: because we were not initializing properties in the initializer, everything had to be mutable, and every property had to be either optional or given a default value.

Optional properties can be great, e.g. photoURL can be optional because not every user may have a photo, but consider the optional string User.id. We could leave it optional and deal with that throughout our code, but we would not know what to do in the else branches - a user without an ID is not a case we support. We could also make it a non-optional string with a default value (e.g. -1 or something). In our apps, though, a network request for a user uses the ID in the path or the authentication, so a default of -1 would blow up just as badly as having no ID at all. When creating a user from JSON, we want to verify that we get a user with a valid ID; if the ID is not valid, we want to ignore the JSON entirely.

struct User: Mappable {
    var id: String?
    var photoURL: NSURL?
    var verified: Bool = true

    init() {}

    mutating func map(mapper: Mapper) {
        id       « mapper["id"]
        photoURL « mapper["photoURL"]
        verified « mapper["phone.verified"]
    }
}

All of this follows from not initializing the properties in the initializer. It is also easy to create an “empty” user, whatever that means: you can call the initializer with no arguments and still get something back. That is a huge source of problems, e.g. you can pass the wrong JSON in and still get a user back. We’d like to be able to say at this layer, “This JSON is invalid, let’s ignore it”.
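
To make the “empty user” problem concrete, here is a minimal sketch in current Swift syntax (which differs slightly from the listings in this post). OldUser and its map(_:) method are hypothetical stand-ins for the old mutable style, not the original library code, and photoURL is kept as a plain string to avoid extra imports.

```swift
// Hypothetical stand-in for the old, fully mutable model style.
struct OldUser {
    var id: String?
    var photoURL: String?
    var verified: Bool = true

    init() {}

    // Copies over whichever keys happen to be present and silently skips the rest.
    mutating func map(_ json: [String: Any]) {
        id = json["id"] as? String
        photoURL = json["photoURL"] as? String
        if let phone = json["phone"] as? [String: Any],
           let flag = phone["verified"] as? Bool {
            verified = flag
        }
    }
}

// JSON describing a completely different kind of object still "parses".
let wrongJSON: [String: Any] = ["rideID": "abc"]
var user = OldUser()
user.map(wrongJSON)
// user.id is nil and user.verified keeps its default, yet we are holding a user.
```

Nothing in this shape lets the model layer reject the input; every caller has to remember to check id itself.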

Another limitation is the complexity of the library. Mapper was a protocol:

protocol Mapper {
    func baseType<T>(field: T?) -> T?
    func baseTypeArray<T>(field: [T]?) -> [T]?
    // ...
    subscript(keyPath: String) -> Mapper { get }
}

We had a lot of baseType functions which defined how to get specific types from JSON, such as the optional T? and the array [T], as well as a subscript that takes a string and returns an instance of Mapper. We also had custom operators.

func « <T>(inout left: T?, mapper: Mapper) {
    left = mapper.baseType(left)
}

func « <T>(inout left: [T]?, mapper: Mapper) {
    left = mapper.baseTypeArray(left)
}

For the subscripts, I would mutate a currentValue property on the Mapper itself, setting it to the raw JSON value for the key (without any of the type information of the field you were expecting), and return self. That meant the subscript lookup had already happened by the time you got inside the operator, which then only had to call the matching baseType function.

class MapperFromJSON: Mapper {
    var JSON: NSDictionary
    var currentValue: AnyObject?

    // ...

    subscript(keyPath: String) -> Mapper {
        get {
            self.currentValue = self.JSON.valueForKeyPath(keyPath)
            return self
        }
    }
}

The baseType functions would grab that current value (see below), and switch on the expected type. The huge switch statement covered many different types. We have a custom case for NSURL (if you have a string and you expect a URL, you can try to create one from it). We have other cases too, which worked for strings and other native Swift types.

func baseType<T>(field: T?) -> T? {
    let value = self.currentValue
    switch T.self {
        case is NSURL.Type where value is String:
            return NSURL(string: value as! String) as? T

        // ...

        default:
            return value as? T
    }
}

We also had cases specific to our app, such as a coordinate, which we defined as a subobject with a few keys. This is not ideal: every new type meant another case in one huge switch statement. As a result, we tried to write a new Mapper. We did not want to sacrifice the interface of the library for the complexity of its implementation.

case is CLLocationCoordinate2D.Type:
    if let castedValue = value as? [String: Double],
      let latitude = castedValue["lat"],
      let longitude = castedValue["lng"]
    {
        return CLLocationCoordinate2D(latitude: latitude,
            longitude: longitude) as? T
    }

    return nil

With many open source JSON parsing libraries, you see a lot of duplicated type information. Even though the property is already declared as a string, you have to restate that by adding .string at the end, because at the moment in Swift, subscripts cannot be generic. With the old Mapper we didn’t have to do that, since we relied on return type inference instead. These are the problems we were trying to solve with Mapper.

id = JSON["id"].string
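
As a rough illustration of return type inference, the sketch below (current Swift syntax) uses a hypothetical free function value(_:_:) and a hypothetical LookupError; neither comes from Mapper. The declared type of the constant selects T, so the type is written only once. Swift has since gained generic subscripts as well, but the idea is the same.

```swift
enum LookupError: Error {
    case missingOrWrongType(String)
}

// Generic over the return type: the caller's declared type picks T.
func value<T>(_ json: [String: Any], _ key: String) throws -> T {
    guard let typed = json[key] as? T else {
        throw LookupError.missingOrWrongType(key)
    }
    return typed
}

let json: [String: Any] = ["id": "123", "rides": 7]

// The type is stated once, on the declaration; no trailing ".string" needed.
let id: String = try value(json, "id")
let rides: Int = try value(json, "rides")
```

Asking for the wrong type, such as an Int for "id", throws instead of quietly producing a default.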

Mapper Today (8:15)

With Mapper as it stands today, our properties are all immutable. This works well because we are setting them in the initializer. The initializer can fail by throwing: if an ID does not exist in the JSON, or if it is not a string, we end up throwing (you do not get a user object back).

We also avoid the “to JSON” thing. We encode the key-to-property mapping in the initializer, which means we cannot reverse the process: there is no way to run an initializer backwards. In the old Mapper, we owned the subscript definitions and the custom operator, so we could call the same map function again on an existing object and get JSON out. Supporting that today would take either a separate protocol or a separate library that duplicates the mapping, which we didn’t want. It is also not how we update model objects with our API.

struct User: Mappable {
    let id: String
    let photoURL: NSURL?
    let verified: Bool

    init(map: Mapper) throws {
        try id = map.from("id")
        photoURL = map.optionalFrom("photoURL")
        verified = map.optionalFrom("phone.verified") ?? false
    }
}

We also wanted to avoid the implementation complexity. The library gets the field from the JSON for a given string and tries to cast it to the expected type; if that fails, it throws. The optionalFrom function works similarly, except that you get an optional T? back instead of an error.

func from<T>(field: String) throws -> T {
    if let value = self.JSONFromField(field) as? T {
        return value
    }

    throw MapperError()
}
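
Putting the pieces together, here is a minimal, self-contained sketch in current Swift syntax. MiniMapper is a hypothetical, pared-down stand-in (the real library's internals may differ), the dotted key path handling is an assumption about how "phone.verified" gets resolved, and photoURL is a plain string to keep the example free of imports.

```swift
struct MapperError: Error {}

// Hypothetical, pared-down mapper used only for illustration.
struct MiniMapper {
    let json: [String: Any]

    // Walks a dotted key path such as "phone.verified".
    private func rawValue(_ field: String) -> Any? {
        var current: Any? = json
        for key in field.split(separator: ".") {
            current = (current as? [String: Any])?[String(key)]
        }
        return current
    }

    // Required field: throws when the key is missing or has the wrong type.
    func from<T>(_ field: String) throws -> T {
        guard let value = rawValue(field) as? T else { throw MapperError() }
        return value
    }

    // Optional field: a missing or mistyped value simply becomes nil.
    func optionalFrom<T>(_ field: String) -> T? {
        return rawValue(field) as? T
    }
}

struct User {
    let id: String
    let photoURL: String?
    let verified: Bool

    init(map: MiniMapper) throws {
        id = try map.from("id")
        photoURL = map.optionalFrom("photoURL")
        verified = map.optionalFrom("phone.verified") ?? false
    }
}

let phone: [String: Any] = ["verified": true]
let valid = MiniMapper(json: ["id": "123", "phone": phone])
let invalid = MiniMapper(json: ["photoURL": "http://example.com/123.jpg"])

let user = try? User(map: valid)      // a fully initialized user
let nobody = try? User(map: invalid)  // nil: no "id", so no user at all
```

Because every property is a let set in a throwing initializer, there is no way to end up with a half-filled User.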

We wanted to decentralize where the custom types were defined, so we created a Convertible protocol. The definition for NSURL tries to get a string, tries to create a URL from it, and returns it if it can; otherwise, it throws an error. This is the only implementation of Convertible that lives in the Mapper library itself (everything else was specific to our app). We have defined the coordinate conversion in our own model layer, so it does not have to live in the open source library.

extension NSURL: Convertible {
    static func fromMap(value: AnyObject?) throws -> NSURL {
        if let string = value as? String,
          let URL = NSURL(string: string)
        {
            return URL
        }

        throw MapperError()
    }
}
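
The same idea carries over to app-specific types. The sketch below (current Swift syntax) assumes a Convertible protocol of roughly the shape shown above; the Coordinate type and the free function from(_:_:) are hypothetical and exist only to keep the example self-contained.

```swift
struct MapperError: Error {}

// Hypothetical protocol mirroring the shape described above.
protocol Convertible {
    static func fromMap(_ value: Any?) throws -> Self
}

// An app-specific type that the library never needs to know about.
struct Coordinate: Convertible, Equatable {
    let latitude: Double
    let longitude: Double

    static func fromMap(_ value: Any?) throws -> Coordinate {
        guard let dict = value as? [String: Double],
              let lat = dict["lat"],
              let lng = dict["lng"] else {
            throw MapperError()
        }
        return Coordinate(latitude: lat, longitude: lng)
    }
}

// One generic entry point serves every Convertible type.
func from<T: Convertible>(_ json: [String: Any], _ field: String) throws -> T {
    return try T.fromMap(json[field])
}

let location: [String: Double] = ["lat": 37.77, "lng": -122.41]
let json: [String: Any] = ["pickup": location]
let pickup: Coordinate? = try? from(json, "pickup")
let missing: Coordinate? = try? from(json, "dropoff")
```

Adding a new type means adding a conformance next to that type, rather than another case in a central switch.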

With the generic functions, we can also do transformations. Below, we have an object (AppInfo) that is a container for controlled components of our app, specifically strings. “Hints” are user on-boarding bubbles that guide users through new flows. The server sends us an array of hints under the hints key, but what we really want is the dictionary declared below, mapping an ID to a hint, so that a view controller can check whether there is a hint matching a given ID to present. The toDictionary transformation takes a closure that defines how to get the key from each object we create; here, $0.id produces the HintID used as the dictionary key.

struct AppInfo {
    let hints: [HintID: Hint]

    init(map: Mapper) throws {
        try hints = map.from("hints",
            transformation: Transform.toDictionary { $0.id })
    }
}

We have several overloads of these functions with similar protocol constraints (Swift isn’t always happy about that; it can be hard to tell which function is going to get called). But the resulting API is appealing, and still simpler than the previous implementation.

func from<T>(field: String,
  transformation: AnyObject? throws -> T) rethrows -> T
{
    return try transformation(self.JSONFromField(field))
}
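
One way such a transformation could be built is sketched below in current Swift syntax. The free functions toDictionary(parse:key:), from(_:_:transformation:) and parseHint(_:), the Hint fields, and the sample hint text are all made up for illustration; the library's own Transform.toDictionary may be implemented differently.

```swift
struct TransformError: Error {}

struct Hint {
    let id: String
    let text: String
}

// Builds a transformation that turns an array of raw objects into a
// dictionary keyed by whatever `key` extracts from each parsed element.
func toDictionary<Element, Key: Hashable>(
    parse: @escaping ([String: Any]) throws -> Element,
    key: @escaping (Element) -> Key
) -> (Any?) throws -> [Key: Element] {
    return { raw in
        guard let objects = raw as? [[String: Any]] else { throw TransformError() }
        var result: [Key: Element] = [:]
        for object in objects {
            let element = try parse(object)
            result[key(element)] = element
        }
        return result
    }
}

// Hands the raw field to the transformation and returns whatever it produces.
func from<T>(_ json: [String: Any], _ field: String,
             transformation: (Any?) throws -> T) rethrows -> T {
    return try transformation(json[field])
}

func parseHint(_ object: [String: Any]) throws -> Hint {
    guard let id = object["id"] as? String,
          let text = object["text"] as? String else { throw TransformError() }
    return Hint(id: id, text: text)
}

let rawHints: [[String: Any]] = [
    ["id": "pickup", "text": "Drag the pin to set your pickup"],
    ["id": "tip", "text": "You can tip after the ride"],
]
let json: [String: Any] = ["hints": rawHints]

let hints = try from(json, "hints",
                     transformation: toDictionary(parse: parseHint) { $0.id })
```

A view controller can then look up hints["tip"] directly instead of searching an array.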

Embrace Immutability (13:13)

I think everyone agrees that immutability is a good thing, in the larger sense, so I won’t argue for it specifically. Instead, I’ll mention how adding Mapper has changed how we treat model objects and how we handle immutable model objects throughout our entire app.

Dumb models (13:47)

A great affordance of immutability is dumb models, like this one from above:

struct User: Mappable {
    let id: String
    let photoURL: NSURL?

    init(map: Mapper) throws {
        try id = map.from("id")
        photoURL = map.optionalFrom("photoURL")
    }
}

Those models are a mapping from the server to the client, and they hold no hidden complexities. We cannot do anything with didSet here (where you set a property and then change some other state on the object), which can cause hidden issues on models. The compiler backs us up too: since we set the id property in the initializer, a didSet would not get called there even if id were a var.

When everything was a var in our app, we had many didSets. Maybe one property would mutate another property, leading to unpredictable behavior. The new way helps with clarity: you have a simple interface to model objects. Whoever uses a model has no idea how it is created (except from a “Mapper”) or how it’s updated. We have also moved all of our models to a separate framework.
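
The point about didSet and initializers can be checked with a small sketch (current Swift syntax). Profile and changeCount are hypothetical names chosen for this example.

```swift
struct Profile {
    var changeCount = 0

    var name: String {
        // Hidden side effect: setting one property quietly changes another.
        didSet { changeCount += 1 }
    }

    init(name: String) {
        // Property observers are not called while the initializer sets the value.
        self.name = name
    }
}

var profile = Profile(name: "first")
let afterInit = profile.changeCount      // 0

profile.name = "second"
let afterMutation = profile.changeCount  // 1
```

With let properties the second assignment would not compile at all, so this kind of side effect has nowhere to live.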

Code smell (15:03)

Another advantage of having immutable properties on all of our models is the code smell it exposes when you look at pull requests. If you see:

-    let pickup: Place?
+    var pickup: Place?

… it’s pretty obvious you are doing something the wrong way, and that we should try to find a better solution (not have to make everything mutable).

Separate creation (15:32)

Another advantage is that we now restrict how models can be used. With our old ride model, before we made everything immutable (example below), it was easy to create empty rides: we could call an empty initializer and get a ride out of it. Since we could do that, and because a ride shares some properties with the action of requesting a ride, we reused the model in both places. Because pickup was mutable, that was easy: we could set the pickup to a place, pass the ride around, and eventually request a ride with it.

struct Ride: Mappable {
    var pickup: Place?
}

This affects the rest of our app. We have an object called RideManager which owns the current ride. It has lots of one-way observers that can update the view hierarchy or do network calls. Then, we have this “evil function”, which can, for example, decide to mutate a pickup state:

RideManager.ride.pickup = Place(name: "Realm")

Let’s say something happens in the app (e.g. we update the pickup location with the user’s current location). The evil function changes the pickup to the place that corresponds to here at Realm, and that changes the state on the RideManager itself. In this case, what should happen? Should the change propagate to the UI, update the server, both, or neither? What if the user was already in a ride when we did this? Nothing prevented that from happening.

The “evil function” would update the RideManager, which would fire a didSet on the manager itself (it owns the ride, the ride changed, so the didSet fires). That might sound harmless, until you think about all the observers.

We re-fire all of the observers, and they do what they always do when they get a new ride. But if the user is already in a ride and we update its pickup location to Realm, should the UI change the pickup location even though the user has already been picked up? There is nothing in place to prevent that behavior.

By making all of our models entirely immutable, we have locked this down on the compiler level. We do not have the option of updating the pickup place on the ride even if we wanted to. The ride comes from the server, and that is the only ride that exists. The server, in our case, is the source of truth for this information.
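
The re-firing described above can be reproduced in a few lines (current Swift syntax). This RideManager is a hypothetical instance-based stand-in with a plain array of closures as observers; the real manager is certainly more involved.

```swift
struct Place {
    let name: String
}

// Old style: the ride's pickup is mutable from anywhere.
struct Ride {
    var pickup: Place?
}

final class RideManager {
    var observers: [(Ride) -> Void] = []

    var ride: Ride {
        // Any change to the ride, including a nested property, lands here.
        didSet {
            for observer in observers { observer(ride) }
        }
    }

    init(ride: Ride) {
        self.ride = ride
    }
}

let manager = RideManager(ride: Ride(pickup: nil))
var notifications = 0
manager.observers.append { _ in notifications += 1 }

// The "evil function": nothing stops any call site from doing this,
// and every observer re-fires as if a new ride had arrived.
manager.ride.pickup = Place(name: "Realm")

// If Ride declared `let pickup: Place?` instead, the assignment above
// would be rejected at compile time.
```

Mutating a property of a struct stored in the manager counts as setting the whole ride, which is why the observers cannot tell this apart from a genuine update.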

We also needed a new way to request a ride, so we created a one-off model for it. I only showed the pickup property before, but our real ride model probably has more than 40 properties (pickup locations, multiple passengers, drivers).

struct RideRequest {
    var pickup: Place?
}

There was only a little overlap between requesting a ride and being in one. If you are requesting a ride, you only need a few properties; if you are in a ride, you need far more, since that ride is passed around between people. There was no contract in the code saying that was the case.

Now the request has no server-side counterpart: requesting a ride is entirely a client-side action. You can mutate the request because you build it up over a few steps. However, we have no more unexpected mutations on the ride itself (which matters, since having a ride determines the state of our app). We also avoid duplicated properties, and properties that may or may not be in use at a given point in time.
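
A rough sketch of that split, in current Swift syntax: the Ride fields shown and the isReadyToSend helper are hypothetical, chosen only to show the two models side by side.

```swift
struct Place: Equatable {
    let name: String
}

// Server-owned state: immutable, only ever created from a server response.
struct Ride {
    let id: String
    let pickup: Place
}

// Client-only state: mutable while the user steps through the request flow.
struct RideRequest {
    var pickup: Place?

    // Hypothetical helper: a request can only be sent once it has a pickup.
    var isReadyToSend: Bool {
        return pickup != nil
    }
}

var request = RideRequest()
let readyBefore = request.isReadyToSend   // false

request.pickup = Place(name: "Realm")
let readyAfter = request.isReadyToSend    // true

let ride = Ride(id: "123", pickup: Place(name: "Realm"))
// ride.pickup = Place(name: "Elsewhere")  // would not compile: pickup is a let
```

The types now carry the contract: a RideRequest may be incomplete and may change, while a Ride is always complete and never changes on the client.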

Simplicity Checklist (20:02)

When writing any line of code, one can think about:

  1. How can I do this?
  2. How can I do this easier?
  3. How can I do this simpler?
  4. How can I not do this?

Simplicity was the goal of Mapper. Since we need model objects, we cannot get to #4 here, but as Swift evolves, we hope to make it even simpler.

Q&A (21:33)

Q: How do you handle data consistency with immutable models? Say a ride object has a user attached to it, and that user changes somehow, like the user editing their profile picture. How is the ride object then notified of that change? Before you would probably use something like KVO.

Keith: I don’t have a great answer, because I feel like it is specific to our app and our implementation. When you update a property, you get everything that has changed (and is relevant to that change) back from the server. Our app is 100% server driven. We have zero persistence of model objects; if we get a new thing that is entirely different from the last version we had, we have to change the entire UI and everything about our app to reflect the new state. We do not have a great solution for the normal case (where you have some persistence and some client state that you have to reconcile), but that is how it has worked for us.


Keith Smiley

Keith is an iOS engineer at Lyft in San Francisco. Previously, he coded at Thoughtbot, and has occasionally clicked the big green button for CocoaPods.