With the new tvOS came a new way to interact with views, called the focus engine. Saniul Ahmed discusses the sometimes overlooked details of the focus engine, and takes a look at building a UI for tvOS using custom touch processing to implement a Tinder-style swipe gesture. He also presents a more advanced example of what’s possible on tvOS, with custom touch handling to jiggle your imagination and inspire you to build interesting and unique experiences for the newest Apple platform.
Introduction (0:00)
Hi, my name is Saniul, and I’ve been playing around with the focus engine in tvOS for the past few weeks. In this talk, I’ll cover the focus engine in general and some caveats that I found, Siri remote touch processing, and I’ll also present two demos: one about gesture recognition, and another about multistroke gesture recognition.
The Focus Engine (1:03)
The focus engine is what drives interaction on tvOS. Whenever I slide my finger on the touch pad, the highlight on screen moves left and right. You can think of it as a smart spotlight that shines on the elements.
Views and other view-related classes, like view controllers, define the focus environment in which the engine operates. The user then controls the focus with the remote. However, you as the developer can influence it as well.
I think the workhorse for us is a property of the UIFocusEnvironment protocol, preferredFocusedView. The focus engine uses it whenever it needs to figure out which view to give focus to. That happens in three scenarios: selecting the initial focus, user-initiated focus changes, and forcing a focus update programmatically.
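For example, here's a minimal sketch (the controller and outlet names are hypothetical) of overriding preferredFocusedView in a view controller so the focus engine lands on a specific button:

import UIKit

// Hypothetical example: point the focus engine at a specific button
// instead of relying on the top-leading fallback.
class LoginViewController: UIViewController {

    @IBOutlet weak var signInButton: UIButton!

    override var preferredFocusedView: UIView? {
        // Consulted during initial focus selection and whenever a
        // focus update is requested.
        return signInButton
    }
}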
Initial Focus Selection (2:30)
This happens whenever we launch an app, and also whenever we present a new view controller. During initial focus selection, the focus engine starts figuring out which view to focus by querying the rootViewController's preferredFocusedView, and it goes on recursively from there.
By default, the preferredFocusedView for a view controller is its root view. Views aren't actually focusable by default, so there's a fallback procedure: if the focus engine cannot find anything, it just jumps to the focusable view closest to the top leading corner of the screen. (For left-to-right languages like English, that's the top left corner.)
When I open that view controller, the focus jumps to the top left corner. However, I can specify the preferredFocusedView to be the button on the bottom right. By specifying preferredFocusedView and overriding the canBecomeFocused method of UIView, you have indirect control over the focus engine. Most of the time, though, the user is in control with the touch pad.
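A minimal sketch of the view side, assuming a plain UIView subclass (the class name is hypothetical), since views don't opt into focus by default:

import UIKit

// Hypothetical example: a UIView subclass that makes itself focusable.
class FocusableCardView: UIView {

    override func canBecomeFocused() -> Bool {
        // UIView returns false by default; returning true lets the focus
        // engine consider this view as a candidate during focus updates.
        return true
    }
}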
User-Initiated Focus Changes (3:52)
When a user gestures on the remote, it triggers a focus update, and it goes roughly like this: the engine extends the frame of the currently focused view in the given direction. If I swipe down, it takes the focused button's frame and extends that rect downward, and whatever focusable view intersects the extended rect is considered for the focus update.
When I first played around with tvOS, this looked like magic in the sense that I expected things to work without much work on the part of the developer. However, there’s actually quite a bit of legwork that you have to do to make it seem intuitive.
Imagine a cross-like layout of buttons, like this:
Without focus guides, if I swipe up and down while on one of the buttons on the left or right sides, nothing happens. As a user, I would expect it to jump to a neighboring button. If I swipe up, I would expect it to jump up to the top button. If I swipe down, I would expect it to jump down to the bottom button.
If I add focus guides, I can move my finger around in a circular motion, and it’s going to do what I expect it to. However, I had to manually place a UI focus guide on the edge of every button and tell it which view to redirect focus to. It’s quite a bit of work for something as simple as this.
What are Focus Guides? (5:35)
Focus guides are non-UIView elements that participate in the focus system. They're instances of UIFocusGuide, a subclass of UILayoutGuide. It turns out that Interface Builder doesn't know how to build them, so you can only create them in code. They have a preferredFocusedView property with which you tell the engine where to redirect focus.
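Here's a rough sketch of that fix in code, assuming hypothetical button outlets. It places one guide above the left button of the cross layout and redirects focus to the top button; you'd repeat the same thing for the other edges:

import UIKit

// Hypothetical example: one UIFocusGuide bridging the gap above the
// left button of the cross layout.
class CrossLayoutViewController: UIViewController {

    @IBOutlet weak var topButton: UIButton!
    @IBOutlet weak var leftButton: UIButton!

    override func viewDidLoad() {
        super.viewDidLoad()

        // Focus guides can only be created in code, not in Interface Builder.
        let guide = UIFocusGuide()
        view.addLayoutGuide(guide)

        // Pin the guide directly above the left button.
        guide.leadingAnchor.constraintEqualToAnchor(leftButton.leadingAnchor).active = true
        guide.trailingAnchor.constraintEqualToAnchor(leftButton.trailingAnchor).active = true
        guide.bottomAnchor.constraintEqualToAnchor(leftButton.topAnchor).active = true
        guide.heightAnchor.constraintEqualToAnchor(leftButton.heightAnchor).active = true

        // When focus would land on the guide, redirect it to the top button.
        guide.preferredFocusedView = topButton
    }
}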
Forcing Focus Change Programmatically (6:19)
There are some situations where you might want to trigger a focus update programmatically. Similar to setNeedsLayout and layoutIfNeeded, you have two methods here: setNeedsFocusUpdate and updateFocusIfNeeded.
Imagine a container with two subviews, subview A and subview B. When I press a button on the remote, the outer container changes its preferredFocusedView and triggers a focus update; you have to notify the focus engine that you want the update to happen. If I press a second button, it changes the preferredFocusedView in the subcontainer, and the correct button gets focus.
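A minimal sketch of that pattern, with hypothetical names, showing how setNeedsFocusUpdate and updateFocusIfNeeded work together:

import UIKit

// Hypothetical example: a container that swaps its preferred focus target
// and then asks the focus engine to re-evaluate.
class ContainerViewController: UIViewController {

    @IBOutlet weak var subviewA: UIView!
    @IBOutlet weak var subviewB: UIView!

    private var focusTarget: UIView?

    override var preferredFocusedView: UIView? {
        return focusTarget
    }

    func moveFocusToSubviewB() {
        focusTarget = subviewB

        // Mark this focus environment as dirty, then force the engine to
        // resolve the update now (mirrors setNeedsLayout / layoutIfNeeded).
        setNeedsFocusUpdate()
        updateFocusIfNeeded()
    }
}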
Touch Processing (8:03)
The type property on UITouch was added in 9.2, and it has three different touch types: .Direct, which is basically what we've been working with for years, the touch type for iPhones and iPads; .Stylus, which is the touch type for Apple Pencil touches; and .Indirect, which is what the remote touchpad works with.
I was sad to learn that the touches processed from this remote always begin in the center of the view they’re happening in. You have to think of it like those little joysticks on game pads that always start out at (0,0). It doesn’t matter where the user actually touched the remote. If you’re thinking, “I’m going to do this cool thing whenever the user swipes from the right edge”…NOPE.
Another remote touchpad downside: no multitouch. This thing can only work with one touch at a time. However, the good news is that you can use the good old UIKit APIs that you’re familiar with. They work pretty much the same way as you expect them to.
touchesBegan(touches: Set<UITouch>, withEvent event: UIEvent?)
touchesMoved(touches: Set<UITouch>, withEvent event: UIEvent?)
touchesEnded(touches: Set<UITouch>, withEvent event: UIEvent?)
touchesCancelled(touches: Set<UITouch>, withEvent event: UIEvent?)
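As a rough sketch (the view subclass and its behavior are hypothetical), this is what tracking the remote's indirect touches with those overrides might look like. Since the touch always begins at the center of the view, only the relative movement is meaningful:

import UIKit

// Hypothetical example: track relative movement of the Siri Remote's
// indirect touches using the standard UIKit touch overrides.
class RemoteTrackingView: UIView {

    private var initialLocation: CGPoint?

    override func touchesBegan(touches: Set<UITouch>, withEvent event: UIEvent?) {
        guard let touch = touches.first where touch.type == .Indirect else { return }
        initialLocation = touch.locationInView(self)
    }

    override func touchesMoved(touches: Set<UITouch>, withEvent event: UIEvent?) {
        guard let touch = touches.first, start = initialLocation else { return }
        let current = touch.locationInView(self)
        // The absolute position is meaningless; only the delta matters.
        let delta = CGPoint(x: current.x - start.x, y: current.y - start.y)
        print("moved by \(delta)")
    }

    override func touchesEnded(touches: Set<UITouch>, withEvent event: UIEvent?) {
        initialLocation = nil
    }
}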
UIGestureRecognizers (9:40)
UIGestureRecognizers work, but only some of them. We have tap, double tap, long press, swipe, and pan. The tap gesture recognizers don't work with a tap on the remote; they work with a click. For double taps, you can set the numberOfTapsRequired property on the tap gesture recognizer.
You can also filter a gesture recognizer to only work with a certain press type. For instance, I could filter it so that nothing happens when I press the touch pad, and it only reacts to pressing the play/pause button.
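For example, a sketch of that filtering (the controller and action names are hypothetical):

import UIKit

// Hypothetical example: a tap recognizer that ignores touch-pad clicks
// and only fires for the play/pause button.
class PlayerViewController: UIViewController {

    override func viewDidLoad() {
        super.viewDidLoad()

        let tap = UITapGestureRecognizer(target: self, action: "playPausePressed")
        // allowedPressTypes takes NSNumber-wrapped UIPressType raw values.
        tap.allowedPressTypes = [NSNumber(integer: UIPressType.PlayPause.rawValue)]
        view.addGestureRecognizer(tap)
    }

    func playPausePressed() {
        print("play/pause pressed")
    }
}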
Long press also works, and swipes work. If I set swipe right, I can swipe right, but swiping left or up won’t make anything happen.
As for the pan gesture recognizer, there's a weird thing that happens. If I start panning left and right, it updates the translation along that axis as expected. However, if I then start panning up, it just continues updating: it still captures my original touch and keeps working with it, because the focus system is actually independent from your processing of the touches. If you want to stop processing touches whenever your view loses focus, you have to handle that yourself; it's not automatic.
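One way to handle that yourself, sketched with hypothetical names: watch for the view losing focus in didUpdateFocusInContext and cancel the in-flight gesture by toggling the recognizer's enabled property:

import UIKit

// Hypothetical example: stop pan processing as soon as this view loses focus.
class PanningCardView: UIView {

    let panRecognizer = UIPanGestureRecognizer()

    override init(frame: CGRect) {
        super.init(frame: frame)
        addGestureRecognizer(panRecognizer)
    }

    required init?(coder aDecoder: NSCoder) {
        super.init(coder: aDecoder)
        addGestureRecognizer(panRecognizer)
    }

    override func canBecomeFocused() -> Bool {
        return true
    }

    override func didUpdateFocusInContext(context: UIFocusUpdateContext,
                                          withAnimationCoordinator coordinator: UIFocusAnimationCoordinator) {
        super.didUpdateFocusInContext(context, withAnimationCoordinator: coordinator)

        if context.previouslyFocusedView == self && context.nextFocusedView != self {
            // We just lost focus: disabling and re-enabling the recognizer
            // cancels whatever gesture it was tracking.
            panRecognizer.enabled = false
            panRecognizer.enabled = true
        }
    }
}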
Demo 1: Tinder for Art (12:42)
The first demo, which you can watch in the video above, is a "Tinder for Art" project I initially talked about a year ago. It's loosely based on Brian Gesiak's "MDCSwipeToChoose," and it uses a UIPanGestureRecognizer for the main interaction.
It tracks my touch on the touch pad: I can swipe right to choose art I like, and swipe left to reject it. I can also press the play/pause button to reset the data source. I was careful to let the user mindlessly flick through without having to wait for the end of each animation.
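The demo's source isn't reproduced here, but a minimal sketch of that kind of pan-driven swipe (the card view, threshold, and names are all hypothetical) might look like this:

import UIKit

// Hypothetical example: the core of a Tinder-style swipe driven by a
// UIPanGestureRecognizer on tvOS.
class CardViewController: UIViewController {

    @IBOutlet weak var cardView: UIView!

    override func viewDidLoad() {
        super.viewDidLoad()
        let pan = UIPanGestureRecognizer(target: self, action: "handlePan:")
        view.addGestureRecognizer(pan)
    }

    func handlePan(recognizer: UIPanGestureRecognizer) {
        let translation = recognizer.translationInView(view)

        switch recognizer.state {
        case .Changed:
            // Drag the card horizontally as the finger moves on the touch pad.
            cardView.transform = CGAffineTransformMakeTranslation(translation.x, 0)
        case .Ended, .Cancelled:
            // Past an arbitrary threshold, treat it as a choice; otherwise snap back.
            let chose = abs(translation.x) > 500
            UIView.animateWithDuration(0.3) {
                self.cardView.transform = chose
                    ? CGAffineTransformMakeTranslation(translation.x > 0 ? 2000 : -2000, 0)
                    : CGAffineTransformIdentity
            }
        default:
            break
        }
    }
}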
Demo 2: Magic (14:09)
You can also see this demo in the video above. This one is a bit more complicated, and none of it is really original work. It's built on top of the MultistrokeGestureRecognizer-iOS project, which is an implementation of the "$N Gesture Recognizer", where you can teach it how to recognize glyphs made of multiple strokes. It's pretty useful on something as small as a remote touch pad, and it's quite versatile and extensible. You can come up with any combination of strokes and teach it to recognize them.
I loaded it with three templates: square, X, and triangle. If I draw an X, it understands what the strokes are, and it recognizes the X. I can do the same with a square and a triangle. It's powerful enough to understand what the strokes are, even if they're shifted around a bit.
You might think that this is useless, but I guess you could use it in something like a Harry Potter game, where the user can draw the gesture for a “spell.”
Debug Focus Changes (15:59)
Let's say you have a view and you don't know why it's not focusable. You can pause execution, and there's a hidden method in UIKit that you can call on any view. It's called _whyIsThisViewNotFocusable, and it runs some simple diagnostics that will hopefully help you figure out why that view isn't focusable.
The cooler thing is that if you pause execution in the methods to do with focus changes, you can Quick Look the focus update context that Apple provides, and it will give you a visual representation of the focus change. It draws the search path in red, so you can see which views it's going to consider for the focus change.
Q&A (16:56)
Q: I'm curious how you specify which button you want to register gesture recognizers for. You mentioned double tap with the play button. At what point in the API are you specifying this?
Saniul: This is a property on the tap gesture recognizer; I think it's allowedPressTypes. UIPressType is an enum, so in Swift you can just pass in the ones you want. That press type is unique to tvOS, but you can filter on the touch type on iOS as well; this is how you filter stylus touches from normal touches.
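For example, a sketch of the iOS side (hypothetical controller and action, targeting iOS 9.1 or later for the stylus touch type):

import UIKit

// Hypothetical example: filter a recognizer so it only responds to
// Apple Pencil (stylus) touches on iOS.
class SketchViewController: UIViewController {

    override func viewDidLoad() {
        super.viewDidLoad()
        let pan = UIPanGestureRecognizer(target: self, action: "handleStylusPan:")
        // allowedTouchTypes is the UITouchType counterpart of allowedPressTypes.
        pan.allowedTouchTypes = [NSNumber(integer: UITouchType.Stylus.rawValue)]
        view.addGestureRecognizer(pan)
    }

    func handleStylusPan(recognizer: UIPanGestureRecognizer) {
        print("stylus pan: \(recognizer.translationInView(view))")
    }
}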
Q: Imagine that you would want to see a layer-based view. Not view-based views, but something completely custom. How would you go about working with focus when you have layers and not really views?
Saniul: I don’t think you can. However, you can always have an invisible view. You have to make sure its alpha is 1 and it’s not hidden, because otherwise it doesn’t participate in the focus environment. You can have an invisible view that will capture the focus, and then you can do whatever you want.
Q: I remember with accessibility you can have custom paths and a lot of stuff to kind of hint to the system that this is where I want rectangles and the focus to be. There’s nothing like that for focus engine?
Saniul: I don't think so. In a CALayer-based environment, why do you care where the focus is? What are you trying to achieve?
Q: Hypothetically, let’s say I wanted to hack a web view into Apple TV. Hypothetically.
Saniul: The interesting thing is that your view doesn't have to be in focus for you to capture the touches. If I'm getting you correctly, you want to display a browser, but then you still want to capture touches to do something, right? As long as you have a UIView that has user interaction enabled, even if it's not in focus, if it's the only thing that the user can interact with, it will get the touches.
Q: Can you access media in the API?
Saniul: Yeah, definitely, I’m pretty sure you have access to AVFoundation, which means you can stream HLS, you can stream anything. This is the part of the SDK that doesn’t really deal with focus, and I was trying to focus on the focus.