Try bring speech recognition programmers to eye-tracking p2

Postby JeffKang » 10 Jul 2014, 03:38

https://docs.google.com/document/d/1iN9 ... ejolA/edit

#eyetracking #speechrecognition #ux #ui #assistivetechnology #accessibility #tendinosis #disability #opensource

1 **Consumer-level, developer eye trackers, and SDKs available**
2 **Eye tracking can remove physical limits**
3 **First round of eye trackers intended for developers**
4 **Eye tracking companies – Tobii and Eye Tribe**
5 **Eye-tracking (for initial warping/teleporting of cursor, and large cursor movements) + game controller (for precise cursor movements, finishing with an accurate cursor placement, and clicking)**
6 **Eye-tracking pointer motion teleport + mouse: use eye-tracking for initial warping/teleporting of cursor, and then use precision of mouse to finish selection – research paper on benefits of initially warping cursor**
6.1 **Mouse-cursor-teleport user setting: time that mouse controlled cursor must be in rest before eye control is involved again (mouse precision still in use)**
6.2 **Mouse-cursor-teleport user setting: point-of-gaze must be a certain distance from the mouse controlled cursor before eye control is involved again (eye-tracking is activated for larger cursor jumps)**
6.3 **Research paper: “Mouse and Keyboard Cursor Warping to Accelerate and Reduce the Effort of Routine HCI Input Tasks – competition between mouse vs. an eye tracker + mouse to click randomly generated buttons as fast as possible”**
7 **Eye tracking working with speech recognition**
7.1 **Adding context, and limiting the number of selection choices that a “select-what-I-say" speech command will match**
7.2 **Fast speech corrections with “correct-what-I’m-looking-at” button**
7.3 **Google Earth - eye tracker to get location-based information, and speech recognition to initiate an action according to location e.g. what city is this?**
7.4 **Commands that use both speech and eye input – e.g. select range of text lines**
7.4.1 **Detecting interface elements – Vimium, VimFx, LabelControl, et al.**
7.4.2 **Select range of text lines**
7.4.3 **Advantages of eye tracking and speech in range selection command**
8 **Make eye tracking work with existing, smaller-element applications**
8.1 **Magnification/zoom for clarification of smaller, close-together elements**
8.1.1 **Magnification in eye-tracking interfaces**
8.1.2 **Zoom in touch interfaces e.g. Chrome on Android**
8.1.3 **E.g. of zoom in PCEye interface**
8.1.4 **Multiple magnifications for commands that require multiple user input steps**
8.2 **Make interface elements of existing applications responsive to gaze e.g. detecting words on a webpage**
8.2.1 **e.g. research paper: detecting words on a webpage**
8.3 **Cursor snapping to interface elements**
8.3.1 **Dwell Clicker 2**
8.3.2 **EyeX for Windows**
8.3.3 **MyGaze tracker – snapping to desktop interface elements, buttons of WizKeys on-screen keyboard, and Peggle game**
8.4 **Generate/project large-button, touch/eye-tracking UI from existing non-touch/non-eye-tracking UI**
8.4.1 **Tag elements near point-of-gaze w/ colors, IDs, and lines – project to large elements (screenshot and mock-up)**
8.4.2 **Only pop out alternate elements that are near point-of-gaze, instead of considering all the elements, like in Vimium**
8.4.3 **Can have less organization, and use flow layout for projected larger elements: easier to set up, and quick eye movements can still work with disorganization (eye-controlled cursor covers distances between scattered objects quickly)**
8.4.4 **Possibly keep generated letters/numbers/IDs of Vimium-like, element-detection programs for labeling large elements**
8.4.5 **Generate colors to match smaller elements with larger elements – similar to semantic color highlighting in programming editors**
8.4.6 **Displaying lines to connect smaller elements to larger elements**
8.4.7 **Conclusion – two-step projection process can still be fast**
8.4.8 **E.g. DesktopEye: open-source prototype for generating eye-tracking compatible versions of window processes**
8.4.9 **E.g. generating speech-compatible versions of open windows**
8.5 **Conclusion – give people a sample of a potentially speedier touch/eye-tracking interface**
9 **Current interfaces, and open-source interfaces for eye-tracking**
9.1 **Interfaces by eye tracking organizations: PCEye, EyeX, GazeMouse, GazeTalk**
9.1.1 **Floating window, or detachable quick-access toolbar for quicker access?**
9.1.2 **GazeTalk – free, predictive text entry system**
9.1.2.1 **Notable feature of GazeTalk: accumulate dwell time: resume dwell time after looking away**
9.2 **Action upon slowing down, or stoppage of the cursor = eye is fixating**
9.2.1 **E.g. Activating widgets in GazeMouse, PCEye, and bkb**
9.2.2 **E.g. video demonstration: slow-down upon fixation to increase cursor steadiness in a game**
9.2.3 **E.g. Reverse cursor upon slowing down, or stoppage of the cursor (minimize overshooting) – found in video for paper, “Mouse and Keyboard Cursor Warping to Accelerate and Reduce the Effort of Routine HCI Input Tasks”**
9.2.3.1 **Conclusion**
9.3 **Other software interfaces – free head tracking software: Camera Mouse, eViacam, and Mousetrap – head-tracking interfaces similar to eye-tracking interfaces**
9.4 **Open-source interfaces**
9.4.1 **Eye-tracking Python SDK for buttons + another scripting language (AutoHotkey, AutoIt, AutoKey, PYAHK, Sikuli, etc.) for macros**
9.4.1.1 **AutoIt example – “left-click-with-magnification”**
9.4.1.2 **Automatic generation of touch/eye alternatives for smaller elements, vs. manual creation of on-screen buttons for more complex macros**
9.4.2 **Open-Source keyboards**
9.4.2.1 **e.g. web-based virtual keyboard, and JavaScript framework**
9.4.2.2 **e.g. Virtual Keyboard using jQuery UI**
9.4.3 **Alt Controller – cursor going over designed regions can launch actions – e.g. page-down or scroll-down when gaze reaches a region at the bottom**
9.4.4 **DesktopEye: open-source prototype for switching between window processes**
9.4.5 **Open source, eye-tracking, predictive-typing software program by Washington State University student group, Team Gleason**
9.4.6 **bkb: open source application to control keyboard and mouse with the Tobii EyeX, The Eye Tribe gaze tracker, or an Airmouse (e.g. Leap Motion, Haptix) – has mouse actions, virtual keyboard, and automatic scrolling**
9.4.6.1 **Repeating click mode**
9.4.6.2 **Automatically scroll down when eyes reach bottom of window**
9.4.7 **Gazespeaker – design your own grids that have cells that launch actions, predictive keyboard, automatic scrolling, shareable grids, desktop or tablet**
9.4.7.1 **Work from within the program with built-in interfaces for eye-tracking e.g. custom email interface, and web browser – similar to The Grid 2, Maestro, and Vmax+ software**
9.4.7.2 **Customized grids with customized cells made with a visual editor (cells launch AutoHotkey, AutoKey, Autoit, PYAHK, Sikuli etc.?)**
9.4.7.3 **Sharing custom grids and cells – visual grids are easier to share, as opposed to sharing something like an AutoHotkey or Autoit script/text file**
9.4.8 **Eye-Tracking Universal Driver (ETU-Driver) – independence from particular eye tracking device**
9.5 **Conclusion**
10 **Google speech and accessibility**
10.1 **Growth of speech**
10.1.1 **Taking more and more context into account**
10.1.1.1 **Google Research: - “we are releasing scripts that convert a set of public data into a language model consisting of over a billion words”**
10.1.2 **Valuing speech recognition**
10.1.2.1 **E.g. “Spell Up - a new word game and Chrome Experiment that helps you improve your English”**
10.1.2.2 **Speech recognition in Android Wear, Android Auto, and Android TV**
10.1.3 **E.g. speech-to-text dictation is coming to the desktop version of Google Docs**
10.2 **Google accessibility**
11 **Eye tracking + Google speech recognition**
12 **Eye tracking potential depends on software – e.g. eye-typing**
12.1 **Video examples of eye-typing: PCEye, JavaScript web-based virtual keyboard, bkb, and Gazespeaker**
12.2 **E.g. of fast eye-typing: Eyegaze Edge**
12.3 **Thesis – review of the research conducted in the area of gaze-based text entry**
12.4 **Advanced software e.g. Swype, Fleksy, SwiftKey – eye-typing with Minuum on Google Glass**
12.4.1 **BBC video interview: Gal Sont, a programmer with ALS, creates Click2Speak, an on-screen keyboard that is powered by Swiftkey**
12.4.2 **Eye-typing with Minuum on Google Glass**
12.4.3 **Google software**
12.5 **Touch software and UIs (and thus close-to-eye-tracking interfaces) coming to desktops, laptops, notebooks – e.g. touchscreen Windows 8 laptops, foldable Chromebook, Project Athena in Chromium, Material Design, Android apps going to Chrome OS, HTC Volantis (Flounder/Nexus 9) 8.9" tablet**
12.6 **Auto-complete and auto-correct for eye-typing code**
12.6.1 **Mirror the drop-down list of auto-complete suggestions to the on-screen keyboard**
12.6.2 **Virtual keyboard indicator for number of drop-down list suggestions**
13 **Building eye-tracking applications by using web tools**
13.1 **Research paper: Text 2.0 Framework: Writing Web-Based Gaze-Controlled Realtime Applications Quickly and Easily e.g. interactive reading: words disappear when you skim them**
13.2 **Text 2.0 framework (create eye tracking apps using HTML, CSS and JavaScript) now known as gaze.io**
13.3 **Advantages of using web technology**
13.4 **Tobii EyeX Chrome extension, and JavaScript API**
13.5 **Pupil: web-based virtual keyboard, and JavaScript framework**
13.6 **Conclusion**
14 **Future of eye tracking**
14.1 **Eye tracking patents, and possible, future support from larger companies**
14.2 **OpenShades – Google Glass eye tracking – WearScript: JavaScript on Glass**
14.3 **Augmented reality – future AR: manipulating virtual objects, disability profiles, overlay forearm with buttons (Minuum video), current AR: labeling objects (OpenShades)**
14.3.1 **Manipulating virtual objects, invisible disability profiles**
14.3.2 **Augmented reality Minuum keyboard on forearm**
14.3.3 **OpenShades augmented reality**
15 **Advantages of eye tracking:**
15.1 **Already using your eyes**
15.2 **Comfort and ergonomics**
15.2.1 **Vertical touchscreen pain**
15.3 **Augmentation, not replacement – “click-where-I’m-looking-at” keyboard button, or cursor teleport before using mouse**
15.4 **Bringing speed, recognition, and malleability of virtual buttons in a touch UI to desktop users and vertical touchscreen users that are using a non-touch UI**
15.4.1 **e.g. Control + <whatever> = action vs. visual shortcut: button that is labeled with action**
15.4.2 **Sharing easily learnable scripts and visual interfaces e.g. sharing Gazespeaker grids as XML files**
15.5 **Using only a keyboard for maximum speed**
15.5.1 **Elements that are hard to access, or access quickly by keyboard**
15.5.2 **E.g. editing code without using a mouse: “fold-the-block-of-code-that-I’m-looking-at”**
15.6 **Mobile touch: eye-highlighting + only needing a few buttons (e.g. “single-tap-where-I’m-looking”, “double-tap-where-I’m-looking”) – hands-free scrolling – vertical mobile touchscreen – two-step process for selecting smaller elements, like text links, on non-touch-optimized websites while using mobile**
15.6.1 **Touch gestures + “touch-where-I’m-looking” buttons vs. touch gestures alone vs. mouse-clicking on a desktop**
15.6.1.1 **Advantages of eye tracking + few function buttons: speed, comfort, and less finger and hand movement – single-taps and tablets**
15.6.1.2 **Example apps on mobile**
15.6.1.2.1 **Customizing the Android Navigation Bar (easy-to-reach buttons)**
15.6.1.2.2 **Eye Tribe’s Android demo: tap anywhere on the screen**
15.6.1.2.3 **Launchers that require more than just taps i.e. swiping, double taps – replace with eye tracking + single-taps**
15.6.1.2.4 **Eye Tribe’s corner thumb buttons**
15.6.2 **Vertical touchscreen + “tap-where-I’m-looking” button**
15.6.3 **Hands-free interaction: while eating, while using another computer, etc.**
15.6.4 **Two-step process for selecting harder-to-press links and characters on non-touch-optimized (or touch-optimized) websites while using mobile**
15.6.4.1 **Eye-tracking two-step process for selecting a range of characters and words**
15.7 **Future: head-mounted display with eye tracking + armband, watch, ring, computer vision system, augmented reality system, etc. for input (helps work with limited space and limited gestures)**
15.7.1 **Watches – small screen real estate works with eye-tracking**
15.7.2 **Clothes: Makey Makey and OpenShades**
15.7.3 **Computer vision recognition of forearm (Minuum video) – overlay augmented reality keys on forearm**
15.7.4 **Gestures**
15.7.4.1 **While mobile: armbands, rings**
15.7.4.2 **While stationary: Tango, Haptix, Leap Motion Table mode**
15.7.4.2.1 **Haptix (transform any flat surface into a 3-D multitouch surface)**
15.7.4.2.2 **Leap Motion Table mode (interact with surfaces) – if using air gestures without eye tracking, many more-physically-demanding gestures to memorize**
15.7.4.2.3 **Project Tango and Movidius chip**
15.7.5 **Conclusion**
15.7.6 **Oculus Rift + eye tracking**
15.7.6.1 **Oculus Rift with the Haytham gaze tracker**
15.7.6.2 **Selecting and manipulating virtual objects with more speed and comfort, and less hand positions – e.g. Oculus Rift with camera for augmented reality**
15.7.6.3 **Navigating 20 virtual stock trading screens in Oculus Rift**
15.7.6.4 **Oculus Rift + eye tracking for traversal - non-gamers – “go-to-where-I’m-looking-at” e.g. eye-tracking wheelchair by researchers at Imperial College London **
16 **Conclusion**
17 **Communities**

9 **Current interfaces, and open-source interfaces for eye-tracking**

9.1 **Interfaces by eye tracking organizations: PCEye, EyeX, GazeMouse, GazeTalk**

I’ve read some online posts, and I believe that PCEye’s “Tobii Windows control” (can move the cursor, and click with your eyes only), and “EyeX for Windows” (move the cursor with your eyes, but use another input to click) might only be bundled with their respective eye trackers, and they will not be sold separately.

Gaze Group has a free gaze-based interface for simulating mouse clicks by gaze input only called GazeMouse (http://www.gazegroup.org/downloads).
It’s less polished, but it looks like it functions similarly to the interfaces of PCEye and bkb (an eye-tracking application mentioned later).

(One thing that I’ve noticed with GazeMouse is that choosing a command, like left click, means that it will continually be repeated every time the cursor stops (i.e. the software thinks that the user is fixating) after it has been moving.
Every movement and subsequent stoppage of the cursor will produce an action until you dwell on/mouse-over a “pause” widget in a docked vertical menu bar (similar to PCEye’s bar).
On the other hand, with PCEye’s and bkb’s interfaces, you keep going back to a widget in the vertical menu bar for each action.

You don’t need an eye tracker to test out GazeMouse, as you can just use your mouse to mouse-over the widgets in order to activate them.
Stopping mouse movement will simulate a fixation, and the action will then be applied).

9.1.1 **Floating window, or detachable quick-access toolbar for quicker access?**

Interaction with both programs might speed up if they had a movable, floating window, like Paint.net’s floating “Tools” window, to optionally house shortcuts to the widgets for quicker access (the gaze wouldn’t have to go all the way to the edges of the screen).

Edit: I found an older video of what appears to be an older version of the PCEye software.
At 56 seconds in (http://youtu.be/euBDysPgRPQ?t=56s), you can see a floating window with 5 icons for actions.
You can pin the floating window to the edge of the screen, which shrinks it (small window with 2 small widgets).
Activating the small window from the edge brings out a full-screen version (large window with 9 large icons).
You can take the shrunken-down version of the window (2 widgets) out from the edge to make it a floating window again (5 widgets).

9.1.2 **GazeTalk – free, predictive text entry system**

GazeTalk is a free, predictive text entry system by Gaze Group: http://wiki.cogain.org/index.php/Gazetalk_About.

9.1.2.1 **Notable feature of GazeTalk: accumulate dwell time: resume dwell time after looking away**

If jumpiness from the eye tracking, or a brief, voluntary glance away, interrupts the fixation on an intended button, you could have the option to recognize the resumption of the dwell, and continue building the dwell time for that particular button.
GazeTalk has an "Accumulate dwell time" function to “avoid the re-setting of a button”.
That is, you don’t have to start over again.
A successful activation of one button resets partial dwell time buildups in any other buttons.
(I’m still not sure how, or whether, you can transfer text out of GazeTalk to the window in focus.
With Dragon NaturallySpeaking, you can dictate into a Dictation Box, and then transfer the text.
Likewise, pressing the buttons in Windows’s on-screen keyboard transfers output to the current window in focus).
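
To make the accumulate-dwell-time behaviour described above concrete, here is a minimal Python sketch (my interpretation, not GazeTalk’s actual code; the 0.6-second threshold and the names are illustrative):

# A minimal sketch of accumulating dwell time per button, so that a brief glance
# away does not reset the partial dwell (my interpretation, not GazeTalk's code).
DWELL_NEEDED = 0.6  # seconds of accumulated dwell required to activate a button (illustrative)

dwell = {}  # button id -> accumulated seconds

def update(button_under_gaze, dt):
    """Call once per gaze sample; dt is the time since the previous sample."""
    if button_under_gaze is None:
        return None  # looking away: keep all partial totals (no reset)
    dwell[button_under_gaze] = dwell.get(button_under_gaze, 0.0) + dt
    if dwell[button_under_gaze] >= DWELL_NEEDED:
        dwell.clear()  # a successful activation resets the partial buildups of the other buttons
        return button_under_gaze  # report the activated button
    return None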

9.2 **Action upon slowing down, or stoppage of the cursor = eye is fixating**

9.2.1 **E.g. Activating widgets in GazeMouse, PCEye, and bkb**

In GazeMouse, bkb, and PCEye’s interface, a widget is selected as soon as the gaze slows down to a stop on that widget.

In the video, http://youtu.be/EIGq7oV5T8A?t=2m49s, a user plays a point-and-click adventure game with an eye tracker.
At 2:49 of the video, the user talks about the stream of eye tracking data that is being shown, and in that stream, there are “Gaze Data” coordinates for when the point-of-gaze is in motion, and “Fixation Data” coordinates for when the point-of-gaze is at rest.
I’m presuming that a fixation might be detected when the last several eye coordinates are deemed to be close together.
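
To illustrate that guess, a simple dispersion-based check over the last few samples could look like the sketch below (this is only a generic illustration, not how any particular SDK actually detects fixations; the window size and threshold are made up):

# Guessed dispersion-based fixation detection: if the last several gaze samples
# fall within a small box, treat it as a fixation. Thresholds are illustrative.
from collections import deque

WINDOW = 6            # number of recent gaze samples to examine
MAX_DISPERSION = 30   # pixels; (max - min) of x plus (max - min) of y

samples = deque(maxlen=WINDOW)

def is_fixating(x, y):
    samples.append((x, y))
    if len(samples) < WINDOW:
        return False
    xs, ys = zip(*samples)
    return (max(xs) - min(xs)) + (max(ys) - min(ys)) <= MAX_DISPERSION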

9.2.2 **E.g. video demonstration: slow-down upon fixation to increase cursor steadiness in a game**

In another game video, the same user writes a program to detect when the movement of the point-of-gaze starts to slow down into a fixation, and then slows the cursor down further to increase steadiness (at 1:08 (http://youtu.be/_bz-4ZNs5tg?t=1m08s), there is a “Slow Mode/Slow Zone”.
With a big change, the cursor moves quickly as usual.
If the new eye point is less than 100 pixels away, then it doesn’t move as fast; there’s a damped response).
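
In code, the slow-zone behaviour could be as simple as the following sketch (my guess at what the video describes; the 100-pixel radius comes from the video, and the damping factor is made up):

SLOW_ZONE_RADIUS = 100  # pixels, as mentioned in the video

def next_cursor_position(cursor, gaze, damping=0.2):
    # Move the cursor toward the latest gaze point, damping small movements for steadiness.
    dx, dy = gaze[0] - cursor[0], gaze[1] - cursor[1]
    if (dx * dx + dy * dy) ** 0.5 < SLOW_ZONE_RADIUS:
        # Inside the slow zone: approach the gaze point gradually (damped response).
        return (cursor[0] + damping * dx, cursor[1] + damping * dy)
    return gaze  # big change: jump straight to the gaze point, as usual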

9.2.3 **E.g. Reverse cursor upon slowing down, or stoppage of the cursor (minimize overshooting) – found in video for paper, “Mouse and Keyboard Cursor Warping to Accelerate and Reduce the Effort of Routine HCI Input Tasks”**

The cursor movement change upon slowing down could be similar to what is talked about at 3:34 of the video for the paper, “Mouse and Keyboard Cursor Warping to Accelerate and Reduce the Effort of Routine HCI Input Tasks”: http://youtu.be/7BhqRsIlROA?t=3m34s.

“To minimize overshooting of the mouse cursor when placed over the target, according to the gaze estimation area, an inertia compensation mechanism can be put in place by which instead of placing the cursor at the center of the gaze estimation area, the cursor position can be offset towards the incoming path of the manual mouse activation vector.
This technique increases the time required to perform target acquisition, but makes the cursor appear in motion toward the target after the warping occurs, which some users find convenient”.
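
My reading of that technique, as a rough sketch (not the authors’ code; the 40-pixel offset is arbitrary): instead of warping to the centre of the gaze estimate, step the cursor back along the direction the mouse was already travelling, so it appears to still be moving toward the target.

def warp_with_inertia(gaze, mouse_prev, mouse_now, offset_px=40):
    # Offset the warp destination back along the incoming mouse-movement vector.
    vx, vy = mouse_now[0] - mouse_prev[0], mouse_now[1] - mouse_prev[1]
    length = (vx * vx + vy * vy) ** 0.5
    if length == 0:
        return gaze  # no incoming motion to compensate for
    return (gaze[0] - offset_px * vx / length, gaze[1] - offset_px * vy / length)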

9.2.3.1 **Conclusion**

I think these actions that occur when a cursor slows down are a way to simulate fixating on an object that’s designed to respond to eye-tracking.
I believe that a proper eye tracking interface means that the application’s objects know where the point-of-gaze is, even if it’s not necessarily near an object.

If you could just use mouse-hover events for fixating and dwelling, where would the eye tracking SDKs come in?

I’m not sure how much of the eye tracking SDKs and APIs are involved in software like GazeMouse, if any.
You can after all use GazeMouse by normally moving the mouse.

9.3 **Other software interfaces – free head tracking software: Camera Mouse, eViacam, and Mousetrap – head-tracking interfaces similar to eye-tracking interfaces**

In addition to the programs mentioned above, the nonprofit organization behind Camera Mouse, a free program for controlling the cursor with head-tracking, gives brief reviews of some of the applications that work well with it, such as Dwell Clicker 2, on their website: http://www.cameramouse.org/downloads.html.
Applications like these also work well with eye trackers, as head-tracking interfaces also use large interface elements.

Two other free head-tracking programs are eViacam and Mousetrap (both are open-source, and work on Linux).

9.4 **Open-source interfaces**

I’m sure that there can be good open source interfaces for controlling the PC that function similarly to Tobii’s interfaces, Dwell Clicker 2, and GazeMouse (the latter is free, but I’m not sure if it’s open-source).

9.4.1 **Eye-tracking Python SDK for buttons + another scripting language (AutoHotkey, AutoIt, AutoKey, PYAHK, Sikuli, etc.) for macros**

E.g. I’m wondering if one option for creating similar widgets and interfaces is to use a GUI toolkit like wxPython or Qt Designer to design the buttons.
I read that Eye Tribe might have a Python SDK in the near future, and I think that there are Python API wrappers that are already available: https://github.com/baekgaard/peyetribe, https://github.com/sjobeek/pytribe, and https://github.com/esdalmaijer/PyTribe.
You could then use a scripting language like AutoIt, AutoHotkey, AutoKey, PYAHK, or Sikuli to run the scripts and functions.

(AutoIt has a macro recorder called Au3Record).

(Sikuli uses the open-source Tesseract optical character recognition program.
Instead of writing keystrokes to access the interface elements that could be involved in macros, you just take screenshots of the interface elements.
e.g. click <screenshot of interface element>.
Here is a picture of the in-line screenshots that are used in Sikuli scripting: http://i.imgur.com/2dqGSPr.png).

9.4.1.1 **AutoIt example – “left-click-with-magnification”**

E.g. Dwell on a widget with a text label of “left-click-with-magnification”, and it will activate an AutoIt script of:

Send("#{NUMPADADD}") ; Windows key + Plus sign triggers the built-in Windows Magnifier zoom ("#" is AutoIt's Win modifier)
MouseClick("left") ; left-click at the current cursor position while zoomed in
Send("#{ESC}") ; Windows key + Esc closes the Magnifier (zoom back out)
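
A rough sketch of the Python glue that could dwell-activate such a widget and launch the script (the button region, dwell threshold, AutoIt path, and script name are all assumptions; real gaze coordinates would come from one of the wrappers above, but for testing without a tracker, the sketch just reads the mouse position with pyautogui, in the same spirit as testing GazeMouse with a mouse):

import subprocess, time
import pyautogui  # stand-in source of coordinates for testing without a tracker

BUTTON = (20, 20, 180, 80)  # x1, y1, x2, y2 of the "left-click-with-magnification" widget (made up)
DWELL_NEEDED = 0.6          # seconds

def run_macro():
    # Paths and the script name are assumptions for the example.
    subprocess.run([r"C:\Program Files (x86)\AutoIt3\AutoIt3.exe", "left_click_with_magnification.au3"])

dwell_start = None
while True:
    x, y = pyautogui.position()  # replace with the eye tracker wrapper's gaze coordinates
    if BUTTON[0] <= x <= BUTTON[2] and BUTTON[1] <= y <= BUTTON[3]:
        dwell_start = dwell_start or time.time()
        if time.time() - dwell_start >= DWELL_NEEDED:
            run_macro()
            dwell_start = None
    else:
        dwell_start = None
    time.sleep(0.03)  # poll at roughly 30 Hz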

9.4.1.2 **Automatic generation of touch/eye alternatives for smaller elements, vs. manual creation of on-screen buttons for more complex macros**

Buttons for things like menu items could be automatically generated by other methods, like the projection concept that was mentioned above.
However, accessing common elements with a “keyboard + mouse + eye tracking cursor teleport” could be on par with “keyboard + eye tracking two-step projection process”, so using “keyboard + mouse + eye tracking cursor teleport” could be fine for a lot of the time.

Nevertheless, custom macros that are activated with keyboard + eye tracking have the potential to be the quickest.
The more complex a macro is, the more steps can be avoided.

Eye-tracking can make macros more commonplace because it allows for easier activation, and thus more use, of custom widgets and labeled on-screen buttons.
A collection of custom on-screen macro buttons with recognizable text labels is easier to maintain than a collection of Control + Alt + Shift + <whatever> keyboard shortcuts for activating macros.

9.4.2 **Open-Source keyboards**

From brief research, and taking keyboards as an example, there are open-source on-screen/virtual keyboard projects for GNOME, KDE, Ubuntu Linux, and Chrome OS.
I also noticed a few other open-source on-screen keyboards in online repositories that are written in JavaScript and Python.

9.4.2.1 **e.g. web-based virtual keyboard, and JavaScript framework**

Ignacio Freiberg from the eye-tracking subreddit (http://www.reddit.com/r/EyeTracking) put up a video that demonstrates typing on a web-based virtual keyboard (great sound effects!): https://www.youtube.com/watch?v=JoIMzfIKVDI.
It is apparently powered by Pupil, an open source, mobile eye tracking hardware and software platform, and the particular keyboard is based on an open source JavaScript framework.

9.4.2.2 **e.g. Virtual Keyboard using jQuery UI**

jQuery on-screen virtual keyboard plugin:

https://github.com/Mottie/Keyboard

Features:

“An on-screen virtual keyboard embedded within the browser window which will popup when a specified entry field is focused.
Autocomplete extension will integrate this keyboard plugin with jQuery UI's autocomplete widget.
Position the keyboard in any location around the element, or target another element on the page.
Easily modify the key text to any language or symbol.”
Etc.

9.4.3 **Alt Controller – cursor going over designed regions can launch actions – e.g. page-down or scroll-down when gaze reaches a region at the bottom**

“Alt Controller is free Open Source software to help make PC games more accessible.
It lets you map computer inputs (like mouse pointer movements) to actions (like key presses) in order to create alternative controls.” http://altcontroller.net/.

This application works very well with less accurate eye trackers, which may have more imprecision, as you can create large regions where actions occur when the mouse pointer moves inside them.

One of the eye tracking features that has been showcased, and is already available on mobile prototypes, is the ability to have a page scroll down, or page down, when the eyes approach text at the bottom of the page (I think this feature alone is worth more than the $5 cost to integrate an eye tracker).
Accuracy issues don’t matter as much because you can decide for yourself how big the “bottom of the window” will be.
You could simulate this with Alt Controller by defining a large horizontal rectangle at the bottom of the screen.
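
The same idea as a quick Python sketch (not Alt Controller itself; the strip height is whatever you decide, and pyautogui supplies the pointer position and the key press):

import time
import pyautogui

SCREEN_W, SCREEN_H = pyautogui.size()
STRIP_HEIGHT = 150  # pixels; you decide how big "the bottom of the screen" is

while True:
    x, y = pyautogui.position()  # with an eye tracker driving the cursor, this follows the gaze
    if y >= SCREEN_H - STRIP_HEIGHT:
        pyautogui.press("pagedown")
        time.sleep(1.0)  # pause so one glance doesn't fire several page-downs
    time.sleep(0.05)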

9.4.4 **DesktopEye: open-source prototype for switching between window processes**

As mentioned above, DesktopEye, a prototype for switching between window processes, is open source.
(https://github.com/Olavz/DesktopEye).

9.4.5 **Open source, eye-tracking, predictive-typing software program by Washington State University student group, Team Gleason**

“Fifteen competing senior design teams from EECS displayed their posters in the halls of the department on April 24th.
The winning team, Team Gleason, was chosen based on their poster, their project as a whole, and their presentation.

Team Gleason has been developing a reliable predictive-typing software program which runs on a generic Android or Windows-8 tablet; and uses two hardware platforms for eye tracking: The Eye Tribe and The Pupil.”

http://school.eecs.wsu.edu/story_team_g ... ior_design

“WSU’s “World Class, Face to Face” students and faculty will develop inexpensive technology and release it under open source license with no royalties”

http://teamgleason.eecs.wsu.edu/

“Like a smartphone’s auto-complete function, it anticipates a word or phrase based on a couple of letters.
Currently, the students are putting the software on PUPIL, a 3-D printed set of glasses that connects to a computer to translate eye movement into computer action”.

http://wsm.wsu.edu/s/index.php?id=1097

9.4.6 **bkb: open source application to control keyboard and mouse with the Tobii EyeX, The Eye Tribe gaze tracker, or an Airmouse (e.g. Leap Motion, Haptix) – has mouse actions, virtual keyboard, and automatic scrolling**

There is an open source application called bkb, by MastaLomaster, for controlling a computer.
It can be found here: https://github.com/MastaLomaster/bkb

The page says that the program works with the Tobii EyeX, The Eye Tribe gaze tracker, or an Airmouse (e.g. Leap Motion, Haptix).
(There is a video demonstration on that github page: https://www.youtube.com/watch?v=O68C4d2SNC8.
The video is labeled in Russian, but if you select Swahili as the closed captioning language in YouTube, you’ll get English).

With bkb, you dwell/fixate on widgets in a vertical menu bar that is docked on either the left or right side of your screen (there is a “switch sides” widget).
There are widgets for single-clicking, double-clicking, etc.
There is also a virtual, on-screen keyboard that can be brought out.

It looks like bkb functions similarly to GazeMouse and the PCEye software (although it appears to work more like the PCEye software, since there is a virtual keyboard and an “activate eye-scroll” widget, and you keep going back to an icon in the menu bar for each action).

9.4.6.1 **Repeating click mode**

Edit: the author says that a “repeating click” mode was added.
This was mentioned above with GazeMouse, where an action can be repeated every time that the point-of-gaze comes to a stop, and you don’t have to keep going back to a widget in the side menu bar.

9.4.6.2 **Automatically scroll down when eyes reach bottom of window**

Even though bkb is an accessibility program, it has a scroll widget that almost anyone could benefit from.
When your eyes reach the bottom of a window, the window automatically scrolls down, and vice versa.

9.4.7 **Gazespeaker – design your own grids that have cells that launch actions, predictive keyboard, automatic scrolling, shareable grids, desktop or tablet**

Gazespeaker is another open source program for controlling the computer.

Some of the functionalities that are listed on its website include:

“display communication grids
integrated grid visual editor
write a text with an auto-adaptative predictive keyboard
read web pages on the internet
read ebooks (in html format)
read scanned books or comic strips”

http://www.gazespeaker.org/features/

It looks similar to the Maestro and Vmax+ software from Dynavox (makes “symbol-adapted special education software”), and The Grid 2, which is AAC (augmentative and alternative communication) software from Sensory Software (they also make Dwell Clicker 2).

I found the creator’s blog, and screenshots and info about the program when it was a work in progress can be found in a post from last year.
He/she mentions that ITU Gazetracker was used to test the program as it was being created.

9.4.7.1 **Work from within the program with built-in interfaces for eye-tracking e.g. custom email interface, and web browser – similar to The Grid 2, Maestro, and Vmax+ software**

It feels similar to GazeTalk, as the user works more within the program.
For example, you can enter a POP/SMTP server, and pull the data from an email service (?).
The program then provides a user with an email interface that works with eye-tracking (i.e. large buttons).
Gazespeaker can also pull websites into its own built-in web-viewer.
The browser is compatible with eye tracking, where scrolling down automatically occurs when a user’s gaze is at the bottom of a window.

Similarly, GazeTalk has its own email, web, and media viewer.

By contrast, programs like bkb, PCEye, and GazeMouse try to assist the user in working with outside interfaces.
That is, they have features like magnification to deal with the more standard-sized elements.

9.4.7.2 **Customized grids with customized cells made with a visual editor (cells launch AutoHotkey, AutoKey, Autoit, PYAHK, Sikuli etc.?)**

One awesome feature of the program is the ability to design your own grids and cells with a visual editor.
Grids hold cells.
The software lets you define the dimensions of grids and cells, label and decorate cells, and have cells launch some of the predefined actions that the program provides.

(Perhaps in the future, the program could work with other programs like AutoHotkey, AutoIt, AutoKey, PYAHK, Sikuli, etc.
In addition to launching predefined actions, the cells could launch more customized actions that can be built with the scripting programs).

9.4.7.3 **Sharing custom grids and cells – visual grids are easier to share, as opposed to sharing something like an AutoHotkey or Autoit script/text file**

The website mentions that grids are stored in a standard XML file, and can be shared with other people.

I have some AutoHotkey scripts, and macros are launched by inputting keystrokes.
I wouldn’t bother to try sharing some of my scripts/text files, as I doubt anyone’s going to take the time to memorize the very personalized set of keystrokes.

Virtual buttons like Gazespeaker cells can have their labels customized, unlike physical keyboard buttons.

With an eye tracker, on-screen buttons like the cells are just as fast to activate as keyboard shortcuts.
You are actually still touching the keyboard: look at the on-screen macro button, and then press a “click-what-I’m-looking-at” keyboard button.

E.g. instead of memorizing and pressing something like Control + F6 to launch a favorite command, you could take a cell, stick an easily recognizable text label on it (it could simply be the name of the command), and then activate the cell.

9.4.8 **Eye-Tracking Universal Driver (ETU-Driver) – independence from particular eye tracking device**

“Eye-Tracking Universal Driver (ETU-Driver) has been developed as a software layer to be used between actual eye tracker driver and end-user application to provide device-independent data access and control.”.

http://www.sis.uta.fi/~csolsp/projects.php

9.5 **Conclusion**

A program being open source could potentially speed up development, especially for any applications that have features like Dwell Clicker 2 or MyGaze’s element detection and target snapping, or bkb’s eye-scrolling, which any average person could benefit from.

It’s only now that eye tracking interfaces can really be examined, because it’s only now that an eye tracker costs $100, and not a few thousand dollars.

10 **Google speech and accessibility**

A while back, I read about some of this group’s disappointment with Nuance’s disinterest in user groups like this one (in fairness, some of the speech extensions offer functionalities that Nuance’s products deliver, except that the extensions are free, and can sometimes do much more).
I then wrote a post about how the opening of Google’s speech APIs, in combination with some burgeoning applications that allowed the creation of commands with Google’s speech, could provide some alternatives.

10.1 **Growth of speech**

I’ll assume that Google’s speech, and its augmentative programs still meet very little of this community’s requirements.

However, keep in mind that only a year ago, when Palaver Speech Recognition (the Google-speech command program) was just starting, it had only a few people in its Google Plus community; as of today, it has 338 members: https://plus.google.com/communities/117 ... 2112738135.

Google speech commands are also gaining popularity, especially in the mobile arena.

“Always-listening-for-speech” mode was a major selling point in the Moto X phone.
I’ve read that spellcheck corrections are something that Google considers to be a competitive advantage, so I’m sure they value, and want to advance, speech recognition, as voice samples and corrections help improve the general recognition for everyone.

For dictation, I’ve seen Google’s recognition handle some of the most obscure terms that I try to throw at it.

I suppose that free speech recognition on the leading search engine, without the need for a powerful local processor, means more users submitting their voice samples and correction data.

10.1.1 **Taking more and more context into account**

In a recent test of Google speech recognition, I purposely bumbled the pronunciation of “Targaryen”, from Game of Thrones.
I get Targaryen, but I sometimes get “dark urine”, and “Tiger rant”.
If you first say “Game of Thrones”, and then follow that up with the bumbled pronunciation of the “Targaryen” speech pattern, you are much more likely to get “Targaryen”.
More uncommon words like that won’t work in Dragon NaturallySpeaking, with or without context.
Google most likely has much more vocabulary and context data now.

10.1.1.1 **Google Research: - “we are releasing scripts that convert a set of public data into a language model consisting of over a billion words”**

Google Research: - “we are releasing scripts that convert a set of public data into a language model consisting of over a billion words”

http://googleresearch.blogspot.ca/2014/ ... guage.html

In a screenshot, a person swipes to type “New York” first: http://i.imgur.com/8PpCkoQ.png.

“the input patterns for “Yankees” and “takes” look very similar”.
“but in this context, the word, “New York”, is more likely to be followed by the word, “Yankees”, even though “Yankees” has a similar swipe pattern to “takes”.

10.1.2 **Valuing speech recognition**

“Even though Nuance has been in the voice recognition business for some time now, Google is quickly ramping up its own efforts.
Voice commands sit at the heart of Google Now, Google Glass, and the Moto X, and it has also hired renowned futurist Ray Kurzweil to work on language processing and artificial intelligence.
(Kurzweil, it’s worth noting, founded the digital imaging company that would become ScanSoft, which ended up merging with Nuance in 2005.)”

http://venturebeat.com/2013/09/08/how-n ... to-mobile/

10.1.2.1 **E.g. “Spell Up - a new word game and Chrome Experiment that helps you improve your English”**

“That's the idea behind Spell Up, a new word game and Chrome Experiment that helps you improve your English using your voice—and a modern browser, of course.
It’s like a virtual spelling bee, with a twist.

We worked with game designers and teachers to make Spell Up both fun and educational.
The goal of the game is to correctly spell the words you hear and stack them to build the highest word tower you can—letter by letter, word by word.
The higher the tower gets, the more difficult the word challenges: You’ll be asked to pronounce words correctly, solve word jumbles and guess mystery words.
You can earn bonuses and coins to level up faster.”.

http://chrome.blogspot.ca/2014/05/speak ... atest.html

At the same time, data is being gathered to keep improving speech recognition.

10.1.2.2 **Speech recognition in Android Wear, Android Auto, and Android TV**

Edit: Android Wear, Android Auto, and Android TV were demoed at Google I/O 2014, and speech recognition was one of the main interfaces for all of these platforms.

Speech recognition was significantly used in Android Wear, where a small screen of a watch isn’t suitable for typing.

10.1.3 **E.g. speech-to-text dictation is coming to the desktop version of Google Docs**

Marques Brownlee
Shared publicly -
“You saw it here first - Google Docs is getting Voice typing.”

http://www.androidpolice.com/2014/03/03 ... ogle-docs/

10.2 **Google accessibility**

In my old post about Palaver, I also mentioned that Google seems to value accessibility more than Nuance.
At their latest developer conference, Google demonstrated some great open-source tools that make it easy to create accessible apps and websites, and some of them automatically fix any issues.

I still think Google targets the more visible disabilities, like blindness (I know plenty of people with lower back pain, and I’m seeing a lot more people online discussing hand pain, especially with the relatively new and extremely pervasive nature of mobile computing.
Unfortunately, these are invisible ailments, and are harder to measure, so they don’t get the same awareness), but I’m sure motor and dexterity issues are on their radar, as I’ve seen posts about dexterity on Google’s accessibility Google group.

Google has forums where you can talk specifically about accessibility, and they actually have employees that frequent the groups.
If you look for Google I/O videos, there are talks exclusively for accessibility, presented by teams of people that work exclusively on accessibility.
Google I/O 2014 had 14 sessions on accessibility.

11 **Eye tracking + Google speech recognition**

Then again, attitude towards accessibility doesn’t matter if the software doesn’t come near meeting the functional necessities.
In terms of the speech command capabilities that users here require, Dragon and its extensions are still dominant (for commands, but not as much for dictation, where Google is catching up, if not starting to pull ahead).

However, the fact that eye tracking can stand completely on its own as an input, and that Google and its technologies like speech recognition are growing rapidly, means that combining the two could create an option that helps level the playing field in an area where there are few choices.

(I had a bad experience with the accuracy of Windows speech recognition for dictation on Windows 7, but if it has improved on Windows 8, then eye tracking with Windows speech recognition could also be a viable combo).

12 **Eye tracking potential depends on software – e.g. eye-typing**

Eye tracking can be the sole input, but if you take an activity like typing words, I’ve seen that some of the methods of eye-typing on a virtual/soft keyboard can be slower, as you have to dwell on each letter for a certain amount of time.

For example, if you want to produce the word “testing”, and you fixed the dwell time necessary for activation at 0.6 seconds, you would stare at the letter “T” for 0.6 seconds, stare at the letter “E” for 0.6 seconds, and so on.
That said, a lot of eye-tracking input software offers predicted words that become available for you to auto-complete or auto-correct, and they really speed things up.

12.1 **Video examples of eye-typing: PCEye, JavaScript web-based virtual keyboard, bkb, and Gazespeaker**

Examples of eye-typing can be seen at 5:26 of the PCEye video: http://youtu.be/6n38nQQOt8U?t=5m26s, or, again, in Ignacio Freiberg’s video of a web-based virtual keyboard: https://www.youtube.com/watch?v=JoIMzfIKVDI.
Bkb (http://youtu.be/O68C4d2SNC8?t=1m20s) and Gazespeaker (http://youtu.be/03w7eIu6rY8?t=1m17s) also have virtual keyboards.

12.2 **E.g. of fast eye-typing: Eyegaze Edge**

Edit: someone recently posted a YouTube link, and actually, eye-typing can be pretty fast already (Eyegaze Edge): http://youtu.be/lY22CZ7XP-4?t=42s

12.3 **Thesis – review of the research conducted in the area of gaze-based text entry**

For more information about eye-tracking text entry, you can check out the thesis, “Text Entry by Eye Gaze” by Päivi Majaranta.
The “thesis provides an extensive review of the research conducted in the area of gaze-based text entry.
It summarizes results from several experiments that study various aspects of text entry by gaze.”: https://tampub.uta.fi/handle/10024/66483

12.4 **Advanced software e.g. Swype, Fleksy, SwiftKey – eye-typing with Minuum on Google Glass**

Depending on the software, eye-typing could be greatly hastened.
If you could grab an application like Swype, which allows you to speedily bounce from letter to letter without stopping, snatch applications like Fleksy and SwiftKey, which allow you to type without looking at the keyboard because they have immensely proficient word-prediction and auto-complete, and combine them with eye-tracking, then a feature like eye-typing, which currently isn’t even meant for the mainstream population, might not be so slow anymore.

12.4.1 **BBC video interview: Gal Sont, a programmer with ALS, creates Click2Speak, an on-screen keyboard that is powered by Swiftkey**

Gal Sont is a programmer who was diagnosed with ALS in 2009.
He created Click2Speak, an on-screen keyboard that is powered by Swiftkey.
http://www.bbc.co.uk/programmes/p021r01n
https://www.youtube.com/watch?v=WWMsPpBRV3A

Features:

Works with all standard Microsoft Windows applications.
Includes Swiftkey’s powerful features like the award-winning prediction engine, and 'Flow'.
Supports more than 60 languages.
Floats over other applications.
Includes advanced visual and audio features.
Auto-spacing and auto-capitalization.
Choose between different layouts and sizing options.
Contains Dwell feature that allows you to imitate a mouse click by hovering.

“After being diagnosed with the disease, I contacted other individuals who suffer from ALS at different stages, and began to learn about the different challenges that I would face as my disease progressed.
I also learned about the tech solutions they used to cope with these challenges.
The most basic challenge was typing, which is done using a virtual on screen keyboard, a common solution shared by not only individuals affected by ALS, but a variety of illnesses such as brain trauma, MS and spinal cord injuries victims.
The fully featured advanced on screen keyboards, again proved relatively very expensive (starting at $250), so I decided to develop the ultimate on screen keyboard on my own.
Through the development process, my own physical condition continued to deteriorate and I reached the point of needing to use these cameras and on screen keyboards myself.
I started with Microsoft’s 'ease of access’ keyboard that comes with windows.
This is an acceptable keyboard and it has a reasonable prediction engine.

For my own development needs I purchased the developer version of TOBII’s eye gaze camera.
This allowed me to code (with my eyes!) additional important features that were lacking in the Microsoft keyboard for eye control such as highlighted keys, virtual keys, auto scroll, right click, drag and much more.

It quickly became apparent that using our 'powered by Swiftkey’ keyboard enabled me to work faster and more accurately.
Friends who used other solutions prior to ours (not necessarily Microsoft’s) were delighted with the results, albeit a small sample size.

This started a new journey that introduced me to Swiftkey’s revolutionary technologies and how we customize them to our specific needs.
I reached a first version of our keyboard and distributed it to friends who also suffer from ALS.
They gave us invaluable feedback through the development process, and they all raved about its time saving capabilities and accuracy and how it makes their lives a little easier.
Even Swiftkey’s 'Flow’ feature is translated successfully to this environment; basically, it replaces the finger when using Swiftkey on an Android device with an eye/head/leg when using a PC/Tablet/laptop + camera/other input device + our Swiftkey powered keyboard installed.

At this point I had my good friend Dan join me in this endeavor as I needed help with detail design, quality assurance, market research, project management, and many other tasks.
We formed 'Click2Speak’, and we plan to make the world a better place! ...”.

http://www.click2speak.net/

12.4.2 **Eye-typing with Minuum on Google Glass**

Minuum is a one-line keyboard for Android smartphones (it also has strong auto-correction), and the team behind it released a concept video that showed various ways of typing on Google Glass.
One method combined their virtual keyboard, and an eye tracker on the head-mounted device.
http://youtu.be/AjcHzO3-QEg?t=46s.

12.4.3 **Google software**

After the Android 4.4 KitKat update, you can now swipe across the space bar between swiping words in the stock Google keyboard.
This allows you to keep the gesture and flow going without letting go of the screen, which is apparently similar to the “Flow through Space” feature that SwiftKey has.
Android 4.4’s auto-correction is also extraordinarily good.

I’m sure that one can incorporate eye tracking interfaces with high-level APIs, such as perhaps using eye-typing with Google Docs’ semantic and spelling suggestions, just like Palaver utilized Google’s speech APIs.

Recently, Google launched an add-on store for Docs and Sheets, which will potentially give people access to a large collection of third-party tools, just like the Play store.

Many Google APIs are free, and they keep adding features that were once less accessible, like the newly released speech recognition for Google Docs, so the standard is continually being raised.

12.5 **Touch software and UIs (and thus close-to-eye-tracking interfaces) coming to desktops, laptops, notebooks – e.g. touchscreen Windows 8 laptops, foldable Chromebook, Project Athena in Chromium, Material Design, Android apps going to Chrome OS, HTC Volantis (Flounder/Nexus 9) 8.9" tablet**

The above mobile typing applications, and other touch-based applications will most likely come to the desktop, and will bring with them their larger, more-compatible-with-eye-tracking elements.

These applications could come through the digital distribution platforms that are on the desktop, such as the Chrome Web Store (the “For Your Desktop” section of the Chrome Web Store holds apps that run offline and outside the browser, and are known as Chrome Apps), as the number of notebooks with touch screens keeps rising (NPD DisplaySearch Quarterly Mobile PC Shipment and Forecast Report).

Foldable Chromebook

There are more touch/non-touch hybrid devices, such as the Lenovo ThinkPad Yoga 11e Chromebook, which folds 360° for tablet mode.
Lenovo’s N20p Chromebook flexes to 300° (laptop mode to standing mode).

Project Athena

There is a new window manager for Chromium called Project Athena, and it hints to more touch interaction on Chrome OS.

Android apps going to Chrome OS

Google announced at I/O that Android apps will be coming to Chrome OS.

Material Design

Material Design is a new design language that was introduced at Google I/O 2014 for achieving consistency across devices.
Instead of a design language that is just for Android phones and tablets, it will be for the web, Chrome OS, and other devices.

Android L 5.0

Android L 5.0 will make use of Material Design.

The new multitasking feature in Android L hints at a multi-window system. It is built on Chromium and has the same look as the windowing system on Chrome OS.

HTC Volantis (Flounder/Nexus 9) 8.9" tablet

I believe that eye tracking provides more benefits when it’s applied to a vertical touchscreen (e.g. ergonomic benefits), but I don’t believe that most people are using a vertically-set up Android device for heavy work.

The specs on the tablet (Tegra K1 64-bit) are powerful enough to handle Chrome OS.

Android Police reports that Volantis may come with official accessories like a keyboard case (http://www.androidpolice.com/2014/06/21 ... us-tablet/), and keyboards are currently and mainly used for Chrome OS, not Android.

It has a more narrow 4:3 aspect ratio, which is more suitable for general work (reading, writing, browsing, programming, image editing, etc.).
This is opposed to a wider 16:9 aspect ratio, which is more suitable for watching videos.
(In the "Google I/O 2013 - Cognitive Science and Design" talk, the speaker says that experiments show that people can be faster with reading longer lines, but a lot of people prefer, and are more comfortable with reading shorter and more narrow lines: http://youtu.be/z2exxj4COhU?t=23m29s.
Dyson and Kipping (1997), Dyson and Haselgrove (2001), Bernard, Fernandez, and Hull (2002), Ling and van Schaik (2006); via samnabi.com.
I’m sort of making that happen in this post with my text replacement of “period” “space” with “period” “manual line break”, which puts each sentence on a new line, and shortens a lot of lines).

Perhaps Volantis could be a device that functions in a hybrid fashion like the Microsoft Surface.

These mentioned changes show that another touch/eye-tracking interface (Android) will make its way to a more work-oriented PC (Chrome OS device), and eye-tracking would see more usage.
On the other end, touch devices (e.g. Volantis tablet) with their touch UI will see more uses for productivity, as these devices add more power, and features like multitasking.
Also, when you use a computer for productivity, you’re more likely to use a vertical screen, and eye-tracking is more valuable when the screen is upright.
(Although, eye-tracking just for teleporting a mouse controlled cursor would still be advantageous for the typical non-touch, smaller-element-interface, work-oriented PC, so an eye tracker should still find a route into a work-oriented device regardless).

Air-swiping with the Nod ring

There is a YouTube video of a gesture ring that is called the Nod.
It shows a person watching TV, and he air-swipes to produce a search term on Netflix: http://youtu.be/dy-Ac9X9oSo?t=6s.

Rise of touchscreens for Windows 8, and rise of touchscreen laptops

“But they're taking off fast: in the US, touchscreen models account for a quarter of current Windows 8 laptop sales, says NPD, and Windows 8 boss Julie Larson-Green has said she expects virtually all laptops to have touchscreens soon.”

http://www.theguardian.com/technology/2 ... ptops-pain

12.6 **Auto-complete and auto-correct for eye-typing code**

I read that some people on the VoiceCoder forums don’t actually use much of the speech recognition extensions for programming (https://groups.yahoo.com/neo/groups/Voi ... opics/7781).
Instead, speech recognition is used for producing characters, and the auto-complete of IDEs does most of the work.

12.6.1 **Mirror the drop-down list of auto-complete suggestions to the on-screen keyboard**

If an eye tracking on-screen keyboard were to be used in an IDE, then hopefully, there’s a way to detect any drop-down list of code completion choices that appear after you start typing, and have the suggestions be mirrored to a location that is close to the keyboard.
It could be similar to the location of predicted natural language words in numerous keyboards, like Gazespeaker’s virtual keyboard, Windows’ on-screen keyboard, and Android’s stock keyboard, which usually place predictions on top of the keyboard.
(Suggestions that appear even closer to your view can be seen when you swipe with the Android keyboard.
Floating, continuously-building predicted words appear right on the letters that you swipe).

12.6.2 **Virtual keyboard indicator for number of drop-down list suggestions**

If one can’t mirror the drop-down list items to the virtual keyboard, perhaps there can be an indicator near the virtual keyboard instead.
Any open drop-down lists and their items are detected, and the indicator displays the number of suggestions/drop-down list items that are currently available.
It could assist the user in deciding when to put focus back on the combo box.
As characters are typed, the number of suggestions would narrow down, and the indicator number would get smaller.
When the number of suggestions is low enough, a user could activate a virtual keyboard button that switches the view and focus to a drop-down list.
It could then be followed by a large-element, color projection (mentioned above) of the drop-down list of choices.
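
As a small sketch of the display side only (how the suggestion count would be read out of the IDE’s drop-down list is the hard part, and is just assumed to be available here; the value of 12 is a made-up demo number):

import tkinter as tk

root = tk.Tk()
root.title("Suggestions")
root.attributes("-topmost", True)  # keep the indicator above the virtual keyboard
label = tk.Label(root, text="0", font=("Arial", 32), width=4)
label.pack()

def update_count(n):
    # Call this whenever the IDE's drop-down list changes (the detection hook is not shown).
    label.config(text=str(n))

update_count(12)  # made-up demo value: 12 suggestions currently available
root.mainloop()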

#ux #design #ui #usability #tendinitis #tendonitis #repetitivestraininjury #rsi #repetitivestrain #a11y #Google #palaver #AAC #stroke #strokerecovery #spinalcordinjury #quadriplegia #googleglass #cerebralpalsy #hci #humancomputerinteraction