
Trying to bring speech recognition programmers to eye-tracking, p1

Postby JeffKang » 10 Jul 2014, 03:30

https://docs.google.com/document/d/1iN9 ... ejolA/edit

VoiceCode “is an Open Source initiative started by the National Research Council of Canada, to develop a programming by voice toolbox.
The aim of the project is to make programming through voice input as easy and productive as with mouse and keyboard.”

Just like Palaver (Linux speech recognition) is an add-on for Google’s speech recognition, VoiceCode is an add-on for Dragon speech recognition.

I posted a message in their group about the consumer-level, developer eye trackers and SDKs that are now available for preorder.
I did so because some of their users have disabilities (repetitive strain injury, a.k.a. tendinosis, a.k.a. chronic tendinitis), and I believe that eye tracking is a technology that can help, and that it can also enhance speech recognition.

The majority of them are programmers, and since some of them have a vested interest in accessibility hardware and software, I tried to see if I could nudge a couple of them to consider dabbling in eye tracking development.

I’d like to post it here just in case I decide to refer to it later.
When I try to promote eye-tracking in places like Reddit or Google+, I like to have information ready.

#eyetracking #speechrecognition #ux #ui #assistivetechnology #accessibility #tendinosis #disability #opensource

**Eye trackers, accessibility, eye tracking interfaces, eye tracking + speech recognition together, eye tracking advantages**

This post is about eye-tracking, which has recently become affordable for consumers.
A while back, Frank Olaf Sem-jacobsen put up a thread on eye tracking: https://groups.yahoo.com/neo/groups/Voi ... opics/7223.
(Edit: I also found Anthony’s post on eye tracking: https://groups.yahoo.com/neo/groups/Voi ... opics/7695).

Mark Lillibridge mentioned that a number of individuals in this group have disabilities, so I think eye tracking is worth another repost to update people on what’s available.

I would like to post this here because:

1) Eye-Tracking is a powerful accessibility technology

Eye-tracking is one of the most powerful technologies for accessibility, and can aid many people with a motor disability.
People with ALS/Lou Gehrig’s disease are among the most severely disabled individuals.
As their neurons degenerate and they lose control of their muscles, eye tracking is one of the few technologies they can turn to for regaining control.

2) Eye-tracking can be used in varying degrees

Eye-Tracking can be incorporated in varying degrees to suit a user’s situation.
More severely disabled individuals can use more active control, such as manipulating interface objects with just their eyes.
Then there’s the more common passive control, e.g. when your eye gaze approaches the bottom of a window or page of text, it automatically scrolls down (video demo: http://youtu.be/2q9DarPET0o?t=22s).
If you were instead pressing Page Down, your gaze most likely reached the bottom of the window anyway before you pressed the key.
Or, you look at an interface widget to highlight it, and then press a keyboard button to select it.
Had you been using a mouse or a touchscreen, chances are that you would have looked at the widget first before moving the mouse, or touching the touchscreen to select it.

3) Eye-tracking interfaces can now be explored

I talk about types of interfaces that work with eye tracking, which can now be explored because eye-tracking itself has only recently become affordable.

4) Eye-tracking can assist other input methods like speech recognition

I discuss eye tracking working in conjunction with speech recognition.

5) Mouse control vs. eye-tracking + mouse control (eye control to initially teleport cursor near target, and mouse to precisely place cursor)

There are features that are for using mouse control and eye control in conjunction.
E.g. Eye-tracking is used to initially teleport your cursor near your target, and then you can use the mouse to precisely place the cursor.
I discuss a research paper later on that pits mouse control by itself against “mouse control + eye-tracking teleport” in a series of computer tasks.
“Mouse control + eye-tracking teleport” ends up being the clear winner.
(If you want to skip to a demo, the authors of the paper put up a YouTube video.
2:41 shows a competition where you have to click targets as fast as possible: http://youtu.be/7BhqRsIlROA?t=2m41s).

6) Augmenting with eye-tracking is almost always faster – it’s not a choice between “mouse + keyboard” vs. “eye tracker + keyboard” – it’s “mouse + eye-tracking teleport + keyboard” or “eye tracker + keyboard” – that is, eye-tracking is always there

Eye-tracking + keyboard: Eye-tracking doesn’t have the precision of a mouse, but if an interface element and its hit state are large enough, a “click-where-I’m-looking-at” keyboard button will work.

Eye tracking + keyboard two-step process: I later talk about some eye tracking features that allow an eye controlled cursor to snap, zoom, etc. to a smaller target element, or make smaller elements project into large elements.
Sometimes it’s a two-step process, so even if you have the ability to instantly teleport the cursor, “both-hands-on-keyboard + eye-tracking two-step process” may not be suitable in certain situations.

Eye tracking teleport + mouse and keyboard: However, whenever you need the mouse, eye-tracking will still be there to provide an initial cursor teleport.

Without eye-tracking: If you have both hands on the keyboard, you lose time switching one hand to the mouse, and bringing the hand back to the keyboard.
You’re usually choosing between both hands on the keyboard, or one hand on the mouse.

With eye-tracking: With eye tracking, it can be used either with both hands on the keyboard (click-what-I’m-looking-at keyboard button), or one on the mouse (initial cursor teleport, then use the mouse).
You never have to forgo something to use eye-tracking; it’s always ready to make normal computer interaction faster.

7) Eye-tracking can make on-screen buttons, and thus macros, more prevalent

Eye-tracking can make macros more popular because eye-tracking allows for easier activation, and thus more use of custom widgets and on-screen buttons.
A collection of custom on-screen macro buttons with recognizable, self-documenting text labels is easier to maintain than a collection of Control + Alt + Shift + <whatever> keyboard shortcuts for activating macros.

8) Eye-tracking is now very cheap

Besides the under-$100 external eye trackers that are available now, eye-tracking companies will eventually have their eye tracking components built into devices like smartphones, tablets, notebooks, and laptops, which already have front-facing cameras and sensors.
Because so much of the hardware is already there, the cost of the additional modifications is extremely low for manufacturers.
(“OEM vendors could likely add this sensor to their handsets for just five dollars” – http://www.cnet.com/news/eye-tribe-show ... ile-phone/).
(“sensors are coming out that can switch between regular camera and infrared camera” – http://www.wired.co.uk/news/archive/201 ... be-android).
Therefore, your next device might have eye tracking by default.

9) Many people aren’t aware that eye-tracking is now accessible

I would like to make more of the public aware of eye-tracking, as it has a lot of potential to elevate productivity.
Also, demand from general users is the only way to advance eye-tracking for those who need it for accessibility purposes.

It’s a very long write-up, so a lot of it might be off-topic.
If too much of this post is irrelevant to put here, feel free to remove the post.

Warning: I’m not an expert of anything, and I’m not a developer, so there’s a good chance that a lot of the suggestions don’t make any sense.

1 **Consumer-level, developer eye trackers, and SDKs available**
2 **Eye tracking can remove physical limits**
3 **First round of eye trackers intended for developers**
4 **Eye tracking companies – Tobii and Eye Tribe**
5 **Eye-tracking (for initial warping/teleporting of cursor, and large cursor movements) + game controller (for precise cursor movements, finishing with an accurate cursor placement, and clicking)**
6 **Eye-tracking pointer motion teleport + mouse: use eye-tracking for initial warping/teleporting of cursor, and then use precision of mouse to finish selection – research paper on benefits of initially warping cursor**
6.1 **Mouse-cursor-teleport user setting: time that mouse controlled cursor must be in rest before eye control is involved again (mouse precision still in use)**
6.2 **Mouse-cursor-teleport user setting: point-of-gaze must be a certain distance from the mouse controlled cursor before eye control is involved again (eye-tracking is activated for larger cursor jumps)**
6.3 **Research paper: “Mouse and Keyboard Cursor Warping to Accelerate and Reduce the Effort of Routine HCI Input Tasks – competition between mouse vs. an eye tracker + mouse to click randomly generated buttons as fast as possible”**
7 **Eye tracking working with speech recognition**
7.1 **Adding context, and limiting the number of selection choices that a “select-what-I-say" speech command will match**
7.2 **Fast speech corrections with “correct-what-I’m-looking-at” button**
7.3 **Google Earth - eye tracker to get location-based information, and speech recognition to initiate an action according to location e.g. what city is this?**
7.4 **Commands that use both speech and eye input – e.g. select range of text lines**
7.4.1 **Detecting interface elements – Vimium, VimFx, LabelControl, et al.**
7.4.2 **Select range of text lines**
7.4.3 **Advantages of eye tracking and speech in range selection command**
8 **Make eye tracking work with existing, smaller-element applications**
8.1 **Magnification/zoom for clarification of smaller, close-together elements**
8.1.1 **Magnification in eye-tracking interfaces**
8.1.2 **Zoom in touch interfaces e.g. Chrome on Android**
8.1.3 **E.g. of zoom in PCEye interface**
8.1.4 **Multiple magnifications for commands that require multiple user input steps**
8.2 **Make interface elements of existing applications responsive to gaze e.g. detecting words on a webpage**
8.2.1 **e.g. research paper: detecting words on a webpage**
8.3 **Cursor snapping to interface elements**
8.3.1 **Dwell Clicker 2**
8.3.2 **EyeX for Windows**
8.3.3 **MyGaze tracker – snapping to desktop interface elements, buttons of WizKeys on-screen keyboard, and Peggle game**
8.4 **Generate/project large-button, touch/eye-tracking UI from existing non-touch/non-eye-tracking UI**
8.4.1 **Tag elements near point-of-gaze w/ colors, IDs, and lines – project to large elements (screenshot and mock-up)**
8.4.2 **Only pop out alternate elements that are near point-of-gaze, instead of considering all the elements, like in Vimium**
8.4.3 **Can have less organization, and use flow layout for projected larger elements: easier to set up, and quick eye movements can still work with disorganization (eye-controlled cursor covers distances between scattered objects quickly)**
8.4.4 **Possibly keep generated letters/numbers/IDs of Vimium-like, element-detection programs for labeling large elements**
8.4.5 **Generate colors to match smaller elements with larger elements – similar to semantic color highlighting in programming editors**
8.4.6 **Displaying lines to connect smaller elements to larger elements**
8.4.7 **Conclusion – two-step projection process can still be fast**
8.4.8 **E.g. DesktopEye: open-source prototype for generating eye-tracking compatible versions of window processes**
8.4.9 **E.g. generating speech-compatible versions of open windows**
8.5 **Conclusion – give people a sample of a potentially speedier touch/eye-tracking interface**
9 **Current interfaces, and open-source interfaces for eye-tracking**
9.1 **Interfaces by eye tracking organizations: PCEye, EyeX, GazeMouse, GazeTalk**
9.1.1 **Floating window, or detachable quick-access toolbar for quicker access?**
9.1.2 **GazeTalk – free, predictive text entry system**
9.1.2.1 **Notable feature of GazeTalk: accumulate dwell time: resume dwell time after looking away**
9.2 **Action upon slowing down, or stoppage of the cursor = eye is fixating**
9.2.1 **E.g. Activating widgets in GazeMouse, PCEye, and bkb**
9.2.2 **E.g. video demonstration: slow-down upon fixation to increase cursor steadiness in a game**
9.2.3 **E.g. Reverse cursor upon slowing down, or stoppage of the cursor (minimize overshooting) – found in video for paper, “Mouse and Keyboard Cursor Warping to Accelerate and Reduce the Effort of Routine HCI Input Tasks”**
9.2.3.1 **Conclusion**
9.3 **Other software interfaces – free head tracking software: Camera Mouse, eViacam, and Mousetrap – head-tracking interfaces similar to eye-tracking interfaces**
9.4 **Open-source interfaces**
9.4.1 **Eye-tracking Python SDK for buttons + another scripting language (AutoHotkey, AutoIt, AutoKey, PYAHK, Sikuli, etc.) for macros**
9.4.1.1 **AutoIt example – “left-click-with-magnification”**
9.4.1.2 **Automatic generation of touch/eye alternatives for smaller elements, vs. manual creation of on-screen buttons for more complex macros**
9.4.2 **Open-Source keyboards**
9.4.2.1 **e.g. web-based virtual keyboard, and JavaScript framework**
9.4.2.2 **e.g. Virtual Keyboard using jQuery UI**
9.4.3 **Alt Controller – cursor going over designed regions can launch actions – e.g. page-down or scroll-down when gaze reaches a region at the bottom**
9.4.4 **DesktopEye: open-source prototype for switching between window processes**
9.4.5 **Open source, eye-tracking, predictive-typing software program by Washington State University student group, Team Gleason**
9.4.6 **bkb: open source application to control keyboard and mouse with the Tobii EyeX, The Eye Tribe gaze tracker, or an Airmouse (e.g. Leap Motion, Haptix) – has mouse actions, virtual keyboard, and automatic scrolling**
9.4.6.1 **Repeating click mode**
9.4.6.2 **Automatically scroll down when eyes reach bottom of window**
9.4.7 **Gazespeaker – design your own grids that have cells that launch actions, predictive keyboard, automatic scrolling, shareable grids, desktop or tablet**
9.4.7.1 **Work from within the program with built-in interfaces for eye-tracking e.g. custom email interface, and web browser – similar to The Grid 2, Maestro, and Vmax+ software**
9.4.7.2 **Customized grids with customized cells made with a visual editor (cells launch AutoHotkey, AutoKey, Autoit, PYAHK, Sikuli etc.?)**
9.4.7.3 **Sharing custom grids and cells – visual grids are easier to share, as opposed to sharing something like an AutoHotkey or Autoit script/text file**
9.4.8 **Eye-Tracking Universal Driver (ETU-Driver) – independence from particular eye tracking device**
9.5 **Conclusion**
10 **Google speech and accessibility**
10.1 **Growth of speech**
10.1.1 **Taking more and more context into account**
10.1.1.1 **Google Research: “we are releasing scripts that convert a set of public data into a language model consisting of over a billion words”**
10.1.2 **Valuing speech recognition**
10.1.2.1 **E.g. “Spell Up - a new word game and Chrome Experiment that helps you improve your English”**
10.1.2.2 **Speech recognition in Android Wear, Android Auto, and Android TV**
10.1.3 **E.g. speech-to-text dictation is coming to the desktop version of Google Docs**
10.2 **Google accessibility**
11 **Eye tracking + Google speech recognition**
12 **Eye tracking potential depends on software – e.g. eye-typing**
12.1 **Video examples of eye-typing: PCEye, JavaScript web-based virtual keyboard, bkb, and Gazespeaker**
12.2 **E.g. of fast eye-typing: Eyegaze Edge**
12.3 **Thesis – review of the research conducted in the area of gaze-based text entry**
12.4 **Advanced software e.g. Swype, Fleksy, SwiftKey – eye-typing with Minuum on Google Glass**
12.4.1 **BBC video interview: Gal Sont, a programmer with ALS, creates Click2Speak, an on-screen keyboard that is powered by Swiftkey**
12.4.2 **Eye-typing with Minuum on Google Glass**
12.4.3 **Google software**
12.5 **Touch software and UIs (and thus close-to-eye-tracking interfaces) coming to desktops, laptops, notebooks – e.g. touchscreen Windows 8 laptops, foldable Chromebook, Project Athena in Chromium, Material Design, Android apps going to Chrome OS, HTC Volantis (Flounder/Nexus 9) 8.9" tablet**
12.6 **Auto-complete and auto-correct for eye-typing code**
12.6.1 **Mirror the drop-down list of auto-complete suggestions to the on-screen keyboard**
12.6.2 **Virtual keyboard indicator for number of drop-down list suggestions**
13 **Building eye-tracking applications by using web tools**
13.1 **Research paper: Text 2.0 Framework: Writing Web-Based Gaze-Controlled Realtime Applications Quickly and Easily e.g. interactive reading: words disappear when you skim them**
13.2 **Text 2.0 framework (create eye tracking apps using HTML, CSS and JavaScript) now known as gaze.io**
13.3 **Advantages of using web technology**
13.4 **Tobii EyeX Chrome extension, and JavaScript API**
13.5 **Pupil: web-based virtual keyboard, and JavaScript framework**
13.6 **Conclusion**
14 **Future of eye tracking**
14.1 **Eye tracking patents, and possible, future support from larger companies**
14.2 **OpenShades – Google Glass eye tracking – WearScript: JavaScript on Glass**
14.3 **Augmented reality – future AR: manipulating virtual objects, disability profiles, overlay forearm with buttons (Minuum video), current AR: labeling objects (OpenShades)**
14.3.1 **Manipulating virtual objects, invisible disability profiles**
14.3.2 **Augmented reality Minuum keyboard on forearm**
14.3.3 **OpenShades augmented reality**
15 **Advantages of eye tracking:**
15.1 **Already using your eyes**
15.2 **Comfort and ergonomics**
15.2.1 **Vertical touchscreen pain**
15.3 **Augmentation, not replacement – “click-where-I’m-looking-at” keyboard button, or cursor teleport before using mouse**
15.4 **Bringing speed, recognition, and malleability of virtual buttons in a touch UI to desktop users and vertical touchscreen users that are using a non-touch UI**
15.4.1 **e.g. Control + <whatever> = action vs. visual shortcut: button that is labeled with action**
15.4.2 **Sharing easily learnable scripts and visual interfaces e.g. sharing Gazespeaker grids as XML files**
15.5 **Using only a keyboard for maximum speed**
15.5.1 **Elements that are hard to access, or access quickly by keyboard**
15.5.2 **E.g. editing code without using a mouse: “fold-the-block-of-code-that-I’m-looking-at”**
15.6 **Mobile touch: eye-highlighting + only needing a few buttons (e.g. “single-tap-where-I’m-looking”, “double-tap-where-I’m-looking”) – hands-free scrolling – vertical mobile touchscreen – two-step process for selecting smaller elements, like text links, on non-touch-optimized websites while using mobile**
15.6.1 **Touch gestures + “touch-where-I’m-looking” buttons vs. touch gestures alone vs. mouse-clicking on a desktop**
15.6.1.1 **Advantages of eye tracking + few function buttons: speed, comfort, and less finger and hand movement – single-taps and tablets**
15.6.1.2 **Example apps on mobile**
15.6.1.2.1 **Customizing the Android Navigation Bar (easy-to-reach buttons)**
15.6.1.2.2 **Eye Tribe’s Android demo: tap anywhere on the screen**
15.6.1.2.3 **Launchers that require more than just taps i.e. swiping, double taps – replace with eye tracking + single-taps**
15.6.1.2.4 **Eye Tribe’s corner thumb buttons**
15.6.2 **Vertical touchscreen + “tap-where-I’m-looking” button**
15.6.3 **Hands-free interaction: while eating, while using another computer, etc.**
15.6.4 **Two-step process for selecting harder-to-press links and characters on non-touch-optimized (or touch-optimized) websites while using mobile**
15.6.4.1 **Eye-tracking two-step process for selecting a range of characters and words**
15.7 **Future: head-mounted display with eye tracking + armband, watch, ring, computer vision system, augmented reality system, etc. for input (helps work with limited space and limited gestures)**
15.7.1 **Watches – small screen real estate works with eye-tracking**
15.7.2 **Clothes: Makey Makey and OpenShades**
15.7.3 **Computer vision recognition of forearm (Minuum video) – overlay augmented reality keys on forearm**
15.7.4 **Gestures**
15.7.4.1 **While mobile: armbands, rings**
15.7.4.2 **While stationary: Tango, Haptix, Leap Motion Table mode**
15.7.4.2.1 **Haptix (transform any flat surface into a 3-D multitouch surface)**
15.7.4.2.2 **Leap Motion Table mode (interact with surfaces) – if using air gestures without eye tracking, many more-physically-demanding gestures to memorize**
15.7.4.2.3 **Project Tango and Movidius chip**
15.7.5 **Conclusion**
15.7.6 **Oculus Rift + eye tracking**
15.7.6.1 **Oculus Rift with the Haytham gaze tracker**
15.7.6.2 **Selecting and manipulating virtual objects with more speed and comfort, and less hand positions – e.g. Oculus Rift with camera for augmented reality**
15.7.6.3 **Navigating 20 virtual stock trading screens in Oculus Rift**
15.7.6.4 **Oculus Rift + eye tracking for traversal - non-gamers – “go-to-where-I’m-looking-at” e.g. eye-tracking wheelchair by researchers at Imperial College London**
16 **Conclusion**
17 **Communities**

1 **Consumer-level, developer eye trackers, and SDKs available**

A while back, Tobii started preorders for their EyeX Controller eye tracker, and their EyeX Engine and SDK:
https://www.youtube.com/watch?v=P8a46q6u8_s.

The price of Tobii’s package, which consists of the Tobii EyeX Controller, and the Tobii EyeX Engine and SDK, is $195.
Edit: it’s now $140.

Eye Tribe is another eye tracking company, and their eye tracker and software development kit are priced at $99:
https://www.youtube.com/watch?v=2q9DarPET0o.

(Eye Tribe is a spinoff of Gaze Group, a research group located at the IT University of Copenhagen).

2 **Eye tracking can remove physical limits**

I wanted to be a more active participant in this group, but my repetitive strain injury/tendinosis (chronic tendinitis) made me lose too much hand endurance for speech recognition to make up (I’ve had laryngitis twice).

Hopefully, eye tracking can allow many people in the disability community to control the computer more fully.

3 **First round of eye trackers intended for developers**

The eye trackers come with SDKs, and the first couple batches are meant for developers (it was kind of improper of me to order one for my disability, but I plan to program and develop, and the eye tracker can help out here).

4 **Eye tracking companies – Tobii and Eye Tribe**

The Tobii EyeX Controller, and the Tobii EyeX Engine and SDK are pricier than the eye tracker from Eye Tribe, but I got both because I wanted to try both of them, and I wanted to see what kind of support would be offered for each.
But mostly, it’s because a motor disability makes obtaining these devices at these prices an absolute bargain compared to what I’d actually pay for them.

Some people might recall Eye Tribe when they were Gaze Group, as they created the open-source ITU GazeTracker software for use with cameras.
The software allows low-cost webcams to become eye trackers, and provides a low-cost alternative to commercial gaze tracking systems.
Lots of students and hobbyists use the software, and it helped bring the technology to many more individuals.

(Gaze Group is a member of the nonprofit Communication by Gaze Interaction Association (COGAIN), a European network that aims to help citizens with motor impairments communicate by “developing new technologies and systems, improving existing gaze-based interaction techniques, and facilitating the implementation of systems for everyday communication”.)

At one time, I was about to buy a high definition webcam, and was learning how to rip out the infrared filter so that I could pair the webcam with the GazeTracker software, but Eye Tribe revealed at CES 2013 that a product would come in the next several months, so I just decided to wait.
But the open-source ITU GazeTracker software is still available for anyone that is interested: http://www.gazegroup.org/downloads.
In the “downloads” section of Gaze Group’s website, there are already some accessibility eye-tracking applications for performing some basic actions.

Umoove is another company that’s doing face tracking and eye-tracking, but I think you have to be selected in order to get access to a product and SDK.

Nevertheless, people will have to weigh all of the offerings for themselves.
Either way, I’m rooting for all of these companies to do very well.
Everyone still has to cooperate, because a lot of people aren’t yet aware of eye-tracking, or don’t yet accept the technology as useful.

5 **Eye-tracking (for initial warping/teleporting of cursor, and large cursor movements) + game controller (for precise cursor movements, finishing with an accurate cursor placement, and clicking)**

In the eye-tracking subreddit (http://www.reddit.com/r/EyeTracking), there’s a video of a redditor (who has some RSI) controlling the desktop, and surfing Reddit with an eye tracker and a game controller (https://www.youtube.com/watch?v=2IjTZcbXYQY).
Eye gaze is for initial, instant, and possibly large cursor movements, and then the joystick of the controller overrides the gaze-control to offer an accurate selection of a target.
The controller buttons are for clicking.
https://github.com/gigertron/EyeTracker.
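To make the division of labor concrete, here is a minimal Python sketch of that kind of blending (my own illustration, not the code from the linked repository; the function, the thresholds, and the speeds are all made up): gaze warps the cursor when the stick is idle, and any joystick deflection overrides gaze for the final, precise placement.

```python
# Sketch: blend coarse gaze jumps with fine joystick control.
import math

DEAD_ZONE = 0.15        # ignore tiny joystick noise
GAZE_JUMP_PX = 120      # only warp when gaze is far from the current cursor
JOYSTICK_SPEED = 8.0    # pixels per update at full deflection

def update_cursor(cursor, gaze, joystick):
    """Return the new (x, y) cursor position for one update tick.

    cursor   -- current cursor position (x, y)
    gaze     -- estimated point-of-gaze (x, y) from the eye tracker
    joystick -- joystick deflection (dx, dy), each in -1..1
    """
    jx, jy = joystick
    if math.hypot(jx, jy) > DEAD_ZONE:
        # Manual fine control wins while the stick is deflected.
        return (cursor[0] + jx * JOYSTICK_SPEED,
                cursor[1] + jy * JOYSTICK_SPEED)
    if math.hypot(gaze[0] - cursor[0], gaze[1] - cursor[1]) > GAZE_JUMP_PX:
        # Stick is idle and the user is looking far away: warp to the gaze point.
        return gaze
    return cursor
```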

6 **Eye-tracking pointer motion teleport + mouse: use eye-tracking for initial warping/teleporting of cursor, and then use precision of mouse to finish selection – research paper on benefits of initially warping cursor**

Similar to the game controller example, eye tracking can be used to initially teleport a mouse-controlled cursor near an intended target.
Once there, the mouse can override eye-control when precision is needed.

(For large interface elements like Windows 8 tiles, the mouse may not be needed to accurately place the cursor in order to finish the interaction.
You might just keep your hands on the keyboard, and use a “click-where-I’m-looking-at” keyboard button).

6.1 **Mouse-cursor-teleport user setting: time that mouse controlled cursor must be in rest before eye control is involved again (mouse precision still in use)**

Although neither the Eye Tribe nor the Tobii developer eye tracker is really meant for operating the computer out of the box, both have the option to turn on a basic gaze-controlled cursor that could potentially be used with a mouse.

Tobii has a time setting that determines how quickly a teleport-to-point-of-gaze-upon-movement-of-mouse will occur.
You can set the time that a mouse-controlled cursor has to be still before moving the mouse will cause a teleport.

You can decide the amount of time that the mouse has to sit still before eye control is involved again (return of eye control could mean that either gaze controls the cursor again, or the next movement of the mouse will warp/teleport the cursor to the point-of-gaze).
It’s for, “wait, I’m still using the mouse for stability and precision.
The mouse-controlled cursor is still working in this area”.

6.2 **Mouse-cursor-teleport user setting: point-of-gaze must be a certain distance from the mouse controlled cursor before eye control is involved again (eye-tracking is activated for larger cursor jumps)**

Another setting involves deciding the distance from the mouse-controlled cursor that the point-of-gaze has to be before gaze-teleporting is involved.
It’s for, “some of the targets are close enough, so I can just use the mouse.
I’ll save eye teleporting for when the distance is large”.
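As an illustration only (a hypothetical re-creation of the two settings described above, not Tobii’s actual code), the warp decision might look something like this in Python: the next mouse movement warps the cursor to the point-of-gaze only if the mouse has rested long enough and the gaze is far enough away.

```python
# Sketch: gate the cursor warp behind a rest-time and a gaze-distance threshold.
import math
import time

REST_TIME_S = 0.4       # 6.1: how long the mouse must sit still
MIN_GAZE_DIST_PX = 200  # 6.2: how far gaze must be from the cursor

class WarpGate:
    def __init__(self):
        self.last_mouse_move = time.monotonic()

    def on_mouse_moved(self, cursor, gaze):
        """Call whenever the mouse moves; returns the position the cursor
        should jump to (either the gaze point or the unchanged cursor)."""
        now = time.monotonic()
        rested = (now - self.last_mouse_move) >= REST_TIME_S
        far = math.hypot(gaze[0] - cursor[0],
                         gaze[1] - cursor[1]) >= MIN_GAZE_DIST_PX
        self.last_mouse_move = now
        if rested and far:
            return gaze    # warp; the mouse then takes over for fine placement
        return cursor      # stay put; the user is still working in this area
```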

6.3 **Research paper: “Mouse and Keyboard Cursor Warping to Accelerate and Reduce the Effort of Routine HCI Input Tasks – competition between mouse vs. an eye tracker + mouse to click randomly generated buttons as fast as possible”**

A paper called “Mouse and Keyboard Cursor Warping to Accelerate and Reduce the Effort of Routine HCI Input Tasks” evaluates how initially teleporting the cursor with eye tracking affects other common human-computer interaction tasks.
They find that adding the teleportation clearly makes computer interaction faster.
The authors have a video demonstration here: https://www.youtube.com/watch?v=7BhqRsIlROA.
A segment of the video has a scenario where “click-me” buttons are generated in random locations.
The task requires the user to click the buttons as fast as possible, and the test pits a mouse vs. an eye tracker + mouse.
You can see the performance of the eye-tracking warping + mouse at 2:41 of the video: http://youtu.be/7BhqRsIlROA?t=2m41s.

Abstract: “This work explores how to use gaze tracking to aid traditional cursor positioning methods with both the mouse and the keyboard during standard human computer interaction (HCI) tasks.
The proposed approach consists of eliminating a large portion of the manual effort involved in cursor movement by warping the cursor to the estimated point of regard (PoR) of the user on the screen as estimated by video-oculography gaze tracking.
With the proposed approach, bringing the mouse cursor or the keyboard cursor to a target position on the screen still involves a manual task but the effort involved is substantially reduced in terms of mouse movement amplitude or number of keystrokes performed.
This is accomplished by the cursor warping from its original position on the screen to whatever position the user is looking at when a single keystroke or a slight mouse movement is detected.
The user adjust then the final fine-grained positioning of the cursor manually.
Requiring the user to be looking at the target position to bring the cursor there only requires marginal adaptation on the part of the user since most of the time, that is the default behavior during standard HCI.
This work has carried out an extensive user study on the effects of cursor warping in common computer tasks involving cursor repositioning.
The results show how cursor warping using gaze tracking information can speed up and reduced the physical effort required to complete several common computer tasks: mouse/trackpad target acquisition, text cursor positioning, mouse/trackpad/keyboard based text selection and drag and drop operations.
The effects of gaze tracking and cursor warping on some of these tasks have never been studied before.
The results show unequivocally that cursor warping using gaze tracking data can significantly speed up and reduce the manual effort involved in HCI for most, but not all, of the previously listed tasks.”.

In summary, adding eye-tracking undoubtedly speeds up computer interaction.

7 **Eye tracking working with speech recognition**

7.1 **Adding context, and limiting the number of selection choices that a “select-what-I-say" speech command will match**

As Frank mentioned, it would be very convenient if you could use the tracker to narrow down the context to what’s near the current point-of-gaze.
You could limit the targets for speech to match, and consequently increase the accuracy of speech recognition selection.
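As a rough sketch of the idea (the names and structure here are hypothetical, not a real Dragon or VoiceCode API), limiting the candidates to elements near the point-of-gaze could look like this:

```python
# Sketch: restrict the candidates a "select <word>" command has to match
# to interface elements near the current point-of-gaze.
import math

GAZE_RADIUS_PX = 150

def candidates_near_gaze(targets, gaze):
    """targets: list of (label, (x, y)); returns labels near the gaze point."""
    return [label for label, (x, y) in targets
            if math.hypot(x - gaze[0], y - gaze[1]) <= GAZE_RADIUS_PX]

def match_spoken_target(spoken, targets, gaze):
    """Match a recognized word against the gaze-limited candidates only,
    which shrinks the search space and reduces misrecognitions."""
    for label in candidates_near_gaze(targets, gaze):
        if label.lower() == spoken.lower():
            return label
    return None
```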

7.2 **Fast speech corrections with “correct-what-I’m-looking-at” button**

A lot of the time, when you produce text with speech on Android, there will be a grey line underneath the text that you just produced.
It indicates that you can select it to optionally see a drop-down list of alternatives, just in case you need to correct.
Since I’m using the speech recognition on a touchscreen, it’s fast and not a bother to touch in order to do corrections.
When the desired correction appears in the list, touching it is usually quicker than doing corrections in Dragon on a desktop.

If you have a vertically propped up tablet with an external keyboard, a notebook, or a laptop with an eye tracker, you could remap a keyboard button to be the “correct-what-I’m-looking-at” button, and get fast corrections on a vertical monitor.

7.3 **Google Earth - eye tracker to get location-based information, and speech recognition to initiate an action according to location e.g. what city is this?**

Olav Hermansen made a program that uses the Eye Tribe eye tracker and speech recognition to control Google Earth.
He posted a YouTube video here: https://www.youtube.com/watch?v=zNpll95MvnE.
This is the video description: “By combining an eye tracker and speech recognition you can rule the world.
Or at least navigate Google Earth.
The idea is to use an eye tracker to get location based information from Google Earth and use speech recognition to initiate an action accordingly to location.
By looking at Google Earth and give voice commands like "zoom in", "zoom out", "center" and "what country is this" will navigate Google Earth accordingly to where the user is looking.”.

http://olavz.com/rule-the-world-with-ey ... cognition/.

7.4 **Commands that use both speech and eye input – e.g. select range of text lines**

7.4.1 **Detecting interface elements – Vimium, VimFx, LabelControl, et al.**

The VimFx, and Mouseless Browsing add-ons for Firefox, the Vimium extension for Chrome, the Show Numbers command for Windows speech recognition, the Show Numbers Plus application, and the LabelControl AutoHotkey script can overlay buttons, controls, and other interface elements with a number and/or letters.
You can access an interface control at any time by inputting the number and/or letter ID that belong to one of them.
(E.g. here is a GIF of the LabelControl AutoHotkey script with its numbers: i.imgur.com/INB0Jt1.gif).
(LabelControl AutoHotkey script: http://www.donationcoder.com/Software/S ... ontrol.ahk or http://www.donationcoder.com/Software/S ... ontrol.exe).
(E.g. here is a timestamp for a video that shows the letters of Vimium: http://youtu.be/t67Sn0RGK54?t=23s.
Here’s a picture of the letters: http://i.imgur.com/YxRok5K.jpeg).

7.4.2 **Select range of text lines**

If numbers were bound to text lines, you could possibly have a speech command select a range of lines by saying something like, “select <number that corresponds to line that begins the selection range> <number that corresponds to line that ends the selection range>” e.g. “select 17 39”.

Let’s say that you were to do a similar selection command, but instead, you mix in eye tracking.
You could just look at the starting line, and say “select line range” to select that starting line.
Move your gaze to the line that ends the selection, and then it would select that ending line to complete the selection range, with no declaration of the numbers involved.
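Here is a minimal sketch of that two-step command in Python, with made-up helper names, assuming the gaze y-coordinate and a fixed line height are available:

```python
# Sketch: two-step "select line range" driven by gaze position.
def line_at(gaze_y, line_height, top_y=0):
    """Map a vertical gaze coordinate to a 1-based line number."""
    return int((gaze_y - top_y) // line_height) + 1

class SelectLineRange:
    def __init__(self, line_height):
        self.line_height = line_height
        self.start = None

    def on_command(self, gaze_y):
        """Call once when "select line range" is heard, and again to finish."""
        line = line_at(gaze_y, self.line_height)
        if self.start is None:
            self.start = line            # first step: remember the starting line
            return None
        start, end = sorted((self.start, line))
        self.start = None
        return (start, end)              # second step: the range to select
```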

7.4.3 **Advantages of eye tracking and speech in range selection command**

If you have numbers overlaying things like classes, members, lines, tokens, characters, and more in combination, packing in too many numbers could clutter your view (e.g. after you say “select comma” in Dragon NaturallySpeaking, Dragon will highlight all visible instances of a comma by attaching a number to each instance: http://i.imgur.com/VeomWwK.jpg).
Bringing in eye tracking can help you avoid that by not needing the numbers.

Eye input naturally takes over the highlighting step of a command.

Lastly, combining eye tracking with speech commands can let you complete commands with less use of your voice.

8 **Make eye tracking work with existing, smaller-element applications**

Touch usually places similar demands on widget sizes as eye-tracking does.
Large buttons suit both the restlessness of an eye-controlled cursor and thicker fingers.

More and more applications will be built for eye tracking and touch.
Until then, there are some possible ways to make programs indirectly compatible with eye tracking.

8.1 **Magnification/zoom for clarification of smaller, close-together elements**

8.1.1 **Magnification in eye-tracking interfaces**

To deal with some of the impreciseness of eye tracking, some eye tracking interfaces have a magnification feature (known as “Gaze Clarification” to some) for selecting small targets.
As you look at your intended target, a portion of the screen that includes that target becomes magnified, and the zoom level keeps increasing as you home in on your desired spot.
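Here is a small sketch of how that progressive clarification could work in principle (assumed behavior, not any vendor’s implementation): each step magnifies a smaller region centered on the gaze, so the same angular gaze error covers fewer and fewer original pixels.

```python
# Sketch: progressively shrink the magnified region around the point-of-gaze.
def zoom_region(gaze, step, screen_w=1920, screen_h=1080, factor=2.0):
    """Return (left, top, width, height) of the region to magnify at a step.

    Step 0 covers the whole screen; each later step covers 1/factor of the
    previous width and height, centered on the current point-of-gaze.
    """
    w = screen_w / (factor ** step)
    h = screen_h / (factor ** step)
    left = min(max(gaze[0] - w / 2, 0), screen_w - w)
    top = min(max(gaze[1] - h / 2, 0), screen_h - h)
    return (left, top, w, h)

# After a few steps, the magnified region is small enough that a one- or
# two-degree gaze error maps to only a few pixels in the original coordinates.
```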

8.1.2 **Zoom in touch interfaces e.g. Chrome on Android**

Touch interfaces have a similar feature.
When you touch multiple web elements simultaneously, perhaps due to the widgets being too small, and/or too close together, the area around where you touched pops out, giving you a magnified view of that area.
You can then clarify the interface control that you intended to touch.
Here’s a picture of magnification in Chrome on Android: (http://i.stack.imgur.com/95nHA.png).

8.1.3 **E.g. of zoom in PCEye interface**

You can see a demonstration of gaze clarification, and magnification at 4:22 of this video: http://youtu.be/6n38nQQOt8U?t=4m22s.

The video shows the interface for controlling Windows for Tobii’s PCEye (I think the tracker sells for around $2000.
Thank goodness for rapidly dropping technology prices).

Docked to the right is a vertical menu bar (it’s kind of like the Windows 8 Charm Bar, except that I don’t think it’s context-sensitive) which has icons for actions such as left click, double-click, drag-and-drop, etc.
At the 4:22 timestamp, the user dwells and fixates on the left click icon, moves the gaze to the target, magnification begins, and then a left click is executed.

8.1.4 **Multiple magnifications for commands that require multiple user input steps**

So for the above “select line range” speech command that involves eye tracking, there could be a magnification and clarification before selecting the line that starts the range, and a magnification before selecting the line that will end the range.

To get a visual idea of this, a process that involves two separate magnifications can be found at 6:00 (http://youtu.be/6n38nQQOt8U?t=6m) of the PCEye video with the demonstration of drag-and-drop.
The drag-and-drop icon is first activated from the vertical menu bar on the right.
The user moves their gaze to the first target, magnification occurs, and the target is selected and picked up.
The user moves their focus to the second target, which is a folder where the first selected object will be dropped, magnification occurs, and the first target, which is a Word document, is placed into the folder.

8.2 **Make interface elements of existing applications responsive to gaze e.g. detecting words on a webpage**

The eye tracker SDKs can be used for building new programs that are compatible with eye tracking, but I think they can also make interface controls of existing desktop and web applications react to gaze.

8.2.1 **e.g. research paper: detecting words on a webpage**

According to an eye tracking research paper called “The Text 2.0 Framework: Writing Web-Based Gaze-Controlled Realtime Applications Quickly and Easily” (http://gbuscher.com/publications/Bieder ... mework.pdf), there is a way to make words on webpages responsive to gaze: “Updates of the page’s general geometry are observed by a JavaScript function that tags modified DOM elements, sorts them into buckets, rechecks these buckets periodically, and batch-transmits changes back to the plugin.
The problem of word access is handled by a process we call spanification that segments text nodes consisting of multiple words into a set of span nodes containing one word each.
Using this technique we are able to obtain the bounding boxes for every word on the web page”.
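The spanification step itself is browser-side JavaScript; as a sketch of the other half of the idea (a hypothetical example, not the Text 2.0 code), once per-word bounding boxes are available, finding the word under the gaze is just a hit test:

```python
# Sketch: hit-test the point-of-gaze against per-word bounding boxes.
def word_under_gaze(word_boxes, gaze):
    """word_boxes: list of (word, (left, top, width, height)) in screen pixels.
    Returns the word the gaze currently falls on, or None."""
    gx, gy = gaze
    for word, (left, top, width, height) in word_boxes:
        if left <= gx <= left + width and top <= gy <= top + height:
            return word
    return None
```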

8.3 **Cursor snapping to interface elements**

8.3.1 **Dwell Clicker 2**

There is a program called Dwell Clicker 2 (http://sensorysoftware.com/more-assisti ... clicker-2/).
It apparently works with a headpointer, joystick, and the Eye Tribe tracker, which was recently tested.
It “allows you to use a mouse or other pointing device without clicking buttons”.
It allows you to snap your clicks to targets.
“Target snapping is a feature that makes it easier to click on specific elements on the screen.
These elements include buttons, menu items and links.
Target snapping works by detecting elements near the pointer that you might want to click on, and locking onto the nearest element.”.

(There is a free and paid version, and I think the paid version has the snapping.)

Programs like this can provide a way to make an application unofficially compatible with eye tracking.
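As a sketch of what target snapping might boil down to (my own illustration, not Dwell Clicker’s implementation; the range and data layout are assumptions), the pointer locks onto the nearest clickable element within a snap range:

```python
# Sketch: snap the (gaze-driven) pointer to the nearest clickable element.
import math

SNAP_RANGE_PX = 60

def snap_to_target(pointer, elements):
    """elements: list of (name, (center_x, center_y)).
    Returns (snapped_position, element_name), or (pointer, None) if nothing
    is within range."""
    best = None
    best_dist = SNAP_RANGE_PX
    for name, center in elements:
        dist = math.hypot(center[0] - pointer[0], center[1] - pointer[1])
        if dist <= best_dist:
            best, best_dist = (name, center), dist
    if best is None:
        return pointer, None
    return best[1], best[0]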

8.3.2 **EyeX for Windows**

On the Tobii forums, I read something about being able to let a “click-by-gaze snap to the nearest activatable interactor within range” in an application.
It’s possible that this is Tobii’s version of Dwell Clicker’s target snapping, and it might be found in EyeX for Windows, which is their computer control software.

Here is a response from the forums: “The EyeX for Windows software is designed for multi-modal use as a complement to keyboard, mouse, touchpad or other hand-based input.
We have no intent of adapting it for mono-modal use at the moment, but a third-party developer could potentially build a separate software to accomplish this, for example by using eye-based input to generate activation events for EyeX for Windows.
The precision should be enough, especially since EyeX for Windows is context-aware and can snap to clickable objects etc.”.

https://www.youtube.com/watch?v=1dFSZC2a5kI apparently shows a demo of Tobii EyeX for Windows, but I don’t see any snapping to elements in that particular video.
Windows 8, with its large buttons, is used in the video, so perhaps it’s not needed.

8.3.3 **MyGaze tracker – snapping to desktop interface elements, buttons of WizKeys on-screen keyboard, and Peggle game**

SpecialEffect (a charity that helps to bring gaming to those with disabilities) has a YouTube video that exhibits the MyGaze gaze tracker (a bit more expensive at €500).
At 1:50 of the video (http://youtu.be/2YQf_lmx_oI?t=1m50s), eye-tracking is turned on, and you can immediately see the cursor gravitate towards any interface elements that are nearby, such as desktop icons, and Windows controls.
At 2:12, a commercial on-screen keyboard called WizKeys is shown.
The cursor drifts towards the center area of keyboard buttons.
At the request of SpecialEffect, the people of MyGaze modified a game to work with eye-tracking, and Peggle was chosen.
At 5:25, you’ll see the cursor stick to close-by elements in the game.

8.4 **Generate/project large-button, touch/eye-tracking UI from existing non-touch/non-eye-tracking UI**

While trying to deal with non-eye-tracking elements, zooming and snapping might not consistently get good results, as the size and arrangement of elements can vary greatly.

To help, an additional process could involve projecting larger-sized, touch, Windows 8 Metro-like versions of non-touch/non-eye-tracking elements.
After an activation, a program would scan and detect the elements of an application (links, menus, menu items, drop-down list items, tabs et al.) that are near the point-of-gaze.
Then, the program would project larger-sized versions of those elements, while still somewhat preserving the layout.

It might operate on demand like Vimium, where you bring out the letter IDs for elements when needed, and then they disappear.
Similarly, you would bring out the alternate large-button elements, make the selection, and then continue in the regular, non-touch/non-eye-tracking interface.
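A rough sketch of that projection step (all names, sizes, and thresholds below are made up for illustration): collect the small elements near the gaze, lay out large proxy buttons for them, and keep a mapping back to the originals so that activating a proxy activates the real element.

```python
# Sketch: project large proxy buttons for small elements near the gaze.
import math

NEAR_PX = 200
BUTTON_W, BUTTON_H = 300, 120
COLUMNS = 3

def project_large_buttons(elements, gaze):
    """elements: list of (element_id, label, (x, y)) for detected small widgets.
    Returns a list of (element_id, label, (left, top, w, h)) proxy buttons
    arranged in a simple flow layout."""
    nearby = [(eid, label) for eid, label, (x, y) in elements
              if math.hypot(x - gaze[0], y - gaze[1]) <= NEAR_PX]
    proxies = []
    for i, (eid, label) in enumerate(nearby):
        row, col = divmod(i, COLUMNS)
        proxies.append((eid, label,
                        (col * BUTTON_W, row * BUTTON_H, BUTTON_W, BUTTON_H)))
    return proxies
```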

8.4.1 **Tag elements near point-of-gaze w/ colors, IDs, and lines – project to large elements (screenshot and mock-up)**

Here’s a screenshot with a mock-up of the larger, colored elements that project from the smaller elements: http://i.imgur.com/3erfG6K.png.

8.4.2 **Only pop out alternate elements that are near point-of-gaze, instead of considering all the elements, like in Vimium**

Unlike Vimium, the elements that are acted upon to pop out larger elements in place could be just the ones that are near the point-of-gaze, instead of detecting everything, and putting IDs on every link.

8.4.3 **Can have less organization, and use flow layout for projected larger elements: easier to set up, and quick eye movements can still work with disorganization (eye-controlled cursor covers distances between scattered objects quickly)**

To set up this process faster, the projection program might disregard a lot of the structure and exact locations of the regular, non-touch elements of an application.
The projected, larger elements might be arranged randomly, and be packed together in a flow layout (an example of this can be seen in Gazespeaker, an eye-tracking program that is mentioned below.
If you take its virtual keyboard, and set the positioning mode to automatic, the cells will order and position themselves to fill the available space, and cells will grow in size if other cells are deleted).
(Although, it would be better if the popped-out, larger elements somewhat mimic how the original, smaller elements are positioned relative to other smaller elements).

If it takes time to get proper organization, and one is still stuck with a disorderly arrangement of the elements, the instant movements of an eye-controlled cursor can help to handle a more cluttered interface.

8.4.4 **Possibly keep generated letters/numbers/IDs of Vimium-like, element-detection programs for labeling large elements**

It might be difficult to extract the necessary information to label the generated element.
E.g. you’re trying to project from a picture icon with no text label.
The generated letters and numbers of the element-detection programs could be useful here for automatically labeling the larger elements when they can’t be labeled otherwise.

You very briefly look at the character IDs that are attached to the smaller elements, and then, when you pop out the larger elements, the smaller elements’ IDs are carried over to their larger counterparts so that you can find them more easily.

8.4.5 **Generate colors to match smaller elements with larger elements – similar to semantic color highlighting in programming editors**

The speed of this two-step selection process depends on how fast a person can track the location of projected larger elements.
In addition to IDs, the smaller elements that will be selected could be tagged with colors.
A color of a small element would be the same color as its corresponding large element.

The color matching would kind of be like the following code editor plug-ins that match similar tokens with the same color:

Color coder plug-in for Sublime: https://github.com/vprimachenko/Sublime-Colorcoder.
“this plugin for Sublime Text will highlight every variable in its own, consistent color — feature known as semantic highlighting, variable-name highlighting, contextual highlighting — you name it”.
(http://i.imgur.com/X4pu379.png).

“This Atom package enables semantic highlighting for JavaScript code.
Identifiers are highlighted in different colors (the same identifier always in the same color) while everything else (like language keywords) is displayed in various shades of gray”.
(http://i.imgur.com/Ae9OH6G.png).
https://atom.io/packages/language-javascript-semantic.

Color Identifiers mode for Emacs: https://github.com/ankurdave/color-identifiers-mode.
“Color Identifiers is a minor mode for Emacs that highlights each source code identifier uniquely based on its name.”.
(http://i.imgur.com/CLQk4Ov.png).

Polychromatic for Xcode: https://github.com/kolinkrewinkel/Polychromatic.
“By giving properties, ivars, and local variables each a unique, dynamic color, and stripping away color from types which do not need it, logic becomes clear and apparent”.
(http://i.imgur.com/dHMDo34.jpg).

Recognizer for Brackets: https://github.com/equiet/recognizer.
“Experimental implementation of semantic highlighting for JavaScript development”.

Most of these plug-ins were inspired by a post called “Coding in color” by Evan Brooks: (https://medium.com/programming-ideas-tu ... 6db2743a1e).
(picture of semantic highlighting in JavaScript: http://i.imgur.com/rQDcAQB.png).

Semantic highlighting has also been in the KDevelop IDE for many years.
http://zwabel.wordpress.com/2009/01/08/ ... hlighting/.
(http://i.imgur.com/MYVeKzW.png).

These plug-ins can make it easier to quickly locate another instance of something, like a variable, so you can see where it comes from.
Colors could also make it easier to locate a popped-out element.

Since you may only have to detect, and project from smaller elements that are near the point-of-gaze, instead of all the elements on the screen, there would be fewer colors to generate.
This means that each color could be more distinct from other colors (I believe another way to say this is that there is more distance between the perceptually equidistant colors), and there would be fewer incidents of using colors that are similar to each other, which are harder to distinguish.
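As a toy example of the color generation (not any plug-in’s actual algorithm), evenly spacing hues around the color wheel shows why fewer nearby elements means more distinguishable colors:

```python
# Sketch: generate n easily distinguishable colors by spacing hues evenly.
import colorsys

def distinct_colors(n):
    """Return n RGB tuples (0-255) with evenly spaced hues."""
    colors = []
    for i in range(max(n, 1)):
        r, g, b = colorsys.hsv_to_rgb(i / max(n, 1), 0.8, 0.95)
        colors.append((int(r * 255), int(g * 255), int(b * 255)))
    return colors

# With 4 nearby elements the hues are 90 degrees apart; with 40 elements they
# would be only 9 degrees apart and much harder to tell at a glance.
```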

8.4.6 **Displaying lines to connect smaller elements to larger elements**

Instead of using colors to match tokens, “the Racket Scheme environment performs a similar function by drawing lines from a selected token to other places where it is used”.
http://i.imgur.com/jsoDsPG.jpg.

Matching smaller elements to larger elements could involve both colors and lines.

8.4.7 **Conclusion – two-step projection process can still be fast**

A two-step pop-out process might seem slower, but with the ability to instantly eye-move the cursor before a selection, and not needing to reach for and move a mouse, the process may be faster in many instances.
Even without appropriate positioning of button projections, interacting with an eye tracker + keyboard in a crudely generated touch UI still has the potential to be faster than interacting with a mouse + keyboard in an organized, non-touch UI in certain situations.

8.4.8 **E.g. DesktopEye: open-source prototype for generating eye-tracking compatible versions of window processes**

There is an open source program on the Eye Tribe forums by Olav Hermansen: “The DesktopEye (lack of name creativity) is a prototype to easy access and changing between window processes.
It uses the eye tracker from The Eye Tribe to record eye position and trigger a menu to show when looking in the bottom left corner.
The menu will build itself from running processes present as windows on the desktop.
By looking at the menu entries, it will trigger the process to show and be brought to front.
When looking upwards and away from the menu will result in automatically closing the menu.
Eye tracking to switch between windows.
The project can be downloaded here as DesktopEye” [http://olavz.com/wp-content/uploads/2014/02/DesktopEye.zip].
(https://github.com/Olavz/DesktopEye).

Here is a picture of the windows of processes: http://i.imgur.com/DHuhv3e.png.
The windows at the bottom, which are generated from and represent the running processes, are large and suitable for eye tracking.

8.4.9 **E.g. generating speech-compatible versions of open windows**

There is a speech recognition add-on for Dragon NaturallySpeaking called VoiceComputer that I tried a couple years ago.
At 0:11 of the video, http://youtu.be/24ihD1dCK70?t=11s, a user brings up a window that has a list of the open windows that are on the desktop.
The difference with the items in this list is that they have numbers attached to them.
You can use a speech recognition command with a number, such as “switch to 14”.

DesktopEye mirrors window processes into substitutes that work with eye tracking, and VoiceComputer mirrors open windows into variants that work with speech recognition.
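As a sketch of the window-numbering half of that idea (my own illustration, not VoiceComputer’s implementation), using the pywin32 package on Windows to list visible windows, attach numbers, and activate one by number:

```python
# Sketch: number the visible top-level windows so a command like
# "switch to 14" can activate the matching window.
import win32gui  # pip install pywin32 (Windows only)

def numbered_windows():
    """Return {number: (hwnd, title)} for visible windows that have a title."""
    found = []
    def collect(hwnd, _):
        title = win32gui.GetWindowText(hwnd)
        if win32gui.IsWindowVisible(hwnd) and title:
            found.append((hwnd, title))
        return True  # keep enumerating
    win32gui.EnumWindows(collect, None)
    return {i + 1: item for i, item in enumerate(found)}

def switch_to(number, windows):
    """Bring window <number> to the foreground, e.g. for "switch to 14"."""
    hwnd, _ = windows[number]
    win32gui.SetForegroundWindow(hwnd)
```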

8.5 **Conclusion – give people a sample of a potentially speedier touch/eye-tracking interface**

Methods like magnification, snapping, and projection of a touch/eye-tracking UI could speed up computer control, give people a glimpse of the possibilities, and create more demand for applications that have layouts that are fully designed for eye tracking and touch.

#ux #design #ui #usability #tendinitis #tendonitis #repetitivestraininjury #rsi #repetitivestrain #a11y #Google #palaver #AAC #stroke #strokerecovery #spinalcordinjury #quadriplegia #googleglass #cerebralpalsy #hci #humancomputerinteraction