Academic Research

Gaze Typing in Virtual Reality: Impact of Keyboard Design, Selection Method, and Motion

Gaze tracking in virtual reality (VR) allows for hands-free text entry, but gaze typing in VR remains largely unexplored. We investigate how keyboard design, selection method, and motion in the field of view may impact typing performance and user experience. We present two studies of people (n = 32) typing with gaze+dwell and gaze+click inputs in VR. In study 1, the typing keyboard was flat and entirely within view; in study 2, it was curved and larger than the view. Both studies included stationary and dynamic motion conditions in the user's field of view.
Our findings suggest that 1) gaze typing in VR is viable but constrained, 2) users perform best (10.15 WPM) when the entire keyboard is within view, whereas the larger-than-view keyboard (9.15 WPM) induces physical strain due to increased head movements, 3) motion in the field of view impacts performance: users perform better while stationary than while in motion, and 4) gaze+click outperforms dwell-only interaction (dwell fixed at 550 ms).

A user gaze typing in a VR environment: A) FOVE VR headset with an eye tracking unit, B) a virtual keyboard displayed inside the headset, C) the trigger buttons on the bike simulator, D) pedaling the bike simulator moves the user along a straight road in the virtual world.

Flat, Within-View Keyboard: on this flat keyboard, the user needs minimal to no head movement to focus on any key, since the virtual keyboard (VKB) is entirely within the user's field of view. To select a character, the user first focuses on the key, and its background changes to green (A). When the key is selected, either by dwell or click, the background changes to yellow (B). Note: this image was captured on a desktop; the field of view is smaller inside the VR headset.
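The focus-then-select dwell logic described above can be sketched as a small state machine. This is our illustration, not the studies' implementation; only the 550 ms dwell time comes from the studies, and all names are ours.

```python
DWELL_MS = 550  # fixed dwell time used in the studies

class DwellSelector:
    """Selects a key once gaze has rested on it for a fixed dwell time."""

    def __init__(self, dwell_ms=DWELL_MS):
        self.dwell_ms = dwell_ms
        self.focused_key = None   # key currently highlighted (green)
        self.elapsed_ms = 0.0     # time gaze has rested on focused_key

    def update(self, gazed_key, dt_ms):
        """Feed one gaze sample; returns the selected key, or None."""
        if gazed_key != self.focused_key:
            # Gaze moved to a new key (or off the keyboard): restart the timer.
            self.focused_key = gazed_key
            self.elapsed_ms = 0.0
            return None
        if gazed_key is None:
            return None
        self.elapsed_ms += dt_ms
        if self.elapsed_ms >= self.dwell_ms:
            self.elapsed_ms = 0.0   # reset so the key is not selected twice
            return gazed_key        # selection event (key turns yellow)
        return None
```

In a gaze+click setup, the same focus tracking applies, but the selection event is triggered by the click instead of the elapsed-time check.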

Curved, Larger-than-View Keyboard: only the central half of the keyboard is within view; the remaining quarter on each side is out of view. In `A' the user is about to type the character `T', which requires no head movement. In `B' the user is trying to type `W'. Since `W' is out of the field of view, the user is forced to turn her head to the left so that `W' can be focused. Similarly, in `C' the user has entered `O'. Since `O' was out of the field of view, the user had to turn her head to the right to focus on the key. Note: this image was captured on a desktop; the field of view is smaller inside the VR headset.

Demo - Gaze typing in virtual reality

PressTapFlick: Exploring a Gaze and Foot-based Multimodal Approach to Gaze Typing

Foot gesture recognizer
Foot gesture recognizer device
Foot gestures
Foot gesture recognizer. From left to right: (1) Foot Gesture Recognition Device, master unit: the entire circuitry of the master unit is housed inside a 3D-printed container that is attached to the user's footwear. The user is executing a toe tap gesture. (2) Foot Gesture Recognition Device: the master and receiver units. The master unit is attached to the user's footwear, and the receiver unit is connected to the computer through a USB port. (3) Gesture Recognition: the list of foot gestures recognized by the device. The angular velocities indicate the speed at which the user has to tap or flick the foot to trigger the corresponding gesture.
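A minimal sketch of how the four foot gestures might be classified from peak angular velocity, as the caption above suggests. This is our illustration, not the authors' recognizer: the axis assignments and threshold values are hypothetical.

```python
# Hypothetical thresholds in deg/s; a tap rotates the foot about the
# side-to-side (pitch) axis, a flick about the vertical (yaw) axis.
TAP_THRESHOLD = 120.0
FLICK_THRESHOLD = 150.0

def classify_gesture(pitch_vel, yaw_vel):
    """Map one peak angular-velocity reading (deg/s) to a gesture name.

    Returns None when the movement is too slow to count as deliberate.
    """
    if abs(pitch_vel) >= TAP_THRESHOLD and abs(pitch_vel) >= abs(yaw_vel):
        # Positive pitch = toes rotating down/forward (our sign convention).
        return "toe_tap" if pitch_vel > 0 else "heel_tap"
    if abs(yaw_vel) >= FLICK_THRESHOLD:
        return "right_flick" if yaw_vel > 0 else "left_flick"
    return None
```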

Text entry is extremely difficult, or sometimes impossible, in scenarios of situationally-induced or physical impairments and disabilities. As a remedy, many rely on gaze typing, which commonly uses dwell time as the selection method. However, dwell-based gaze typing is limited by usability issues: reduced typing speed, a high error rate, a steep learning curve, and visual fatigue with prolonged usage. We present a dwell-free, multimodal approach to gaze typing where the gaze input is supplemented with a foot input modality. In this multimodal setup, the user points her gaze at the desired character and selects it with the foot input. We further investigated two approaches to foot-based selection, a foot gesture-based selection and a foot press-based selection, which are compared against dwell-based selection.

Enhanced Virtual Keyboard:

We evaluated our system through three experiments involving 51 participants, where each experiment used one of the three target selection methods: dwell-based, foot gesture-based, and foot press-based selection. We found that foot-based selection at least matches, and likely improves, gaze typing performance compared to dwell-based selection. Among the four foot gestures (toe tapping, heel tapping, right flick, and left flick) used in the study, toe tapping was the most preferred gesture for gaze typing. Furthermore, when using foot-based activation, users quickly develop a rhythm of focusing on a character with gaze and selecting it with the foot; this familiarity reduces errors significantly. Overall, based on both typing performance and qualitative feedback, the results suggest that gaze and foot-based typing is convenient, easy to learn, and addresses the usability issues associated with dwell-based typing. We believe our findings will encourage further research into leveraging supplemental foot input in gaze typing and, more generally, assist in the development of rich foot-based interactions.

Foot Press Sensing Device: 1) the device attached to the user's footwear; the entire circuitry is housed inside a 3D-printed container, and the force-sensitive resistor that senses foot press actions extends from the main circuit and is placed inside the footwear. 2) An outline of how the foot press sensing device is attached to the user's footwear, and the placement of the pressure sensor inside the footwear.

Can Gaze Beat Touch? A Fitts' Law Evaluation of Gaze, Touch, and Mouse Inputs

Touch input
Mouse input
Gaze input
Fitts' Law Experiment: a participant performing a multi-directional point-and-select Fitts' law task. From left to right: (1) touch input, (2) mouse input, and (3) gaze input.

Gaze input has been a promising substitute for mouse input for point-and-select interactions. Individuals with severe motor and speech disabilities primarily rely on gaze input for communication. Gaze input also serves as a hands-free input modality in scenarios of situationally-induced impairments and disabilities (SIIDs). Hence, the performance of gaze input has often been compared to mouse input through standardized performance evaluation procedures like the Fitts' law task. With the proliferation of touch-enabled devices such as smartphones, tablet PCs, and other computing devices with a touch surface, it is also important to compare the performance of gaze input to touch input.

In this study, we conducted an ISO 9241-9 Fitts' law evaluation to compare the performance of multimodal gaze and foot-based input to touch input in a standard desktop environment, using mouse input as the baseline. From a study involving 12 participants, we found that gaze input has the lowest throughput (2.55 bits/s) and the highest movement time (1.04 s) of the three inputs. In addition, though touch input involves the most physical movement, it achieved the highest throughput (6.67 bits/s), the least movement time (0.5 s), and was the most preferred input. While gaze and touch inputs are similar in how quickly the pointer can be moved from source to target, target selection consumes the most time with gaze input. Hence, with a throughput over 160% higher than gaze, touch proves to be a superior input modality.
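The throughput figures above follow the ISO 9241-9 convention. A hedged sketch of the computation using the effective-width method is below; the variable names are ours, and the per-trial inputs are illustrative, not the study's data.

```python
import math
from statistics import mean, stdev

def effective_throughput(amplitudes, endpoint_errors, times_s):
    """ISO 9241-9 style throughput in bits/s.

    amplitudes:      per-trial movement distances (e.g., pixels)
    endpoint_errors: per-trial signed selection errors along the task axis
    times_s:         per-trial movement times in seconds
    """
    De = mean(amplitudes)                 # effective movement distance
    We = 4.133 * stdev(endpoint_errors)   # effective width (covers ~96% of endpoints)
    IDe = math.log2(De / We + 1)          # effective index of difficulty, in bits
    return IDe / mean(times_s)            # bits per second
```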

Visualization of the on-screen paths traversed by the cursor during touch, mouse, and gaze input, respectively.

A Fitts' Law Evaluation of Gaze Input on Large Displays Compared to Touch and Mouse Inputs

Gaze-assisted interaction has commonly been used in a standard desktop setting. When interacting with large displays, as new scenarios like situationally-induced impairments emerge, gaze-based multimodal input can be more convenient than other inputs. However, it is unknown how gaze-based multimodal input compares to touch and mouse inputs. We compared gaze+foot multimodal input to touch and mouse inputs on a large display in a Fitts' law experiment that conforms to ISO 9241-9. From a study involving 23 participants, we found that gaze input has the lowest throughput (2.33 bits/s) and the highest movement time (1.176 s) of the three inputs. In addition, though touch input involves the most physical movement, it achieved the highest throughput (5.49 bits/s), the least movement time (0.623 s), and was the most preferred input.

Fitts' Law Experiment: a participant, in an upright stance, performing a multi-directional point-and-select Fitts' Law task (1) shown on a large display. Also, an eye tracker is mounted on a tripod (2).

Left Image - Microsoft Surface Hub (84-inch). Right Image - The foot controller used in the gaze+foot selection method. 1 - a force-sensitive resistor, microcontroller, and Bluetooth module in a 3D-printed case, 2 - foot interaction.

Demo - A Fitts' Law Evaluation of Gaze Input on Large Displays

A Gaze Gesture-Based Paradigm for Situational Impairments, Accessibility, and Rich Interactions

Gaze gesture-based interactions on a computer are promising, but existing systems are limited by the number of supported gestures, recognition accuracy, the need to remember stroke order, lack of extensibility, and so on. We present a gaze gesture-based interaction framework in which a user can design gestures and associate them with appropriate commands like minimize, maximize, and scroll. This allows the user to interact with a wide range of applications using a common set of gestures. Furthermore, our gesture recognition algorithm is independent of screen size and resolution, and the user can draw a gesture anywhere on the target application. Results from a user study involving seven participants showed that the system recognizes a set of nine gestures with an accuracy of 93% and an F-measure of 0.96. We envision this framework being leveraged in developing solutions for situational impairments and accessibility, and for implementing a rich interaction paradigm.
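One common way to make gesture recognition independent of screen size, resolution, and drawing location is to normalize each path into a unit box before matching it against templates. The sketch below is our illustration of that idea, not the paper's exact algorithm, and assumes the paths being compared have equal numbers of points.

```python
import math

def normalize(points):
    """Scale a path uniformly into a unit box and translate it to the origin,
    so size and screen position no longer affect matching."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    s = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    cx, cy = min(xs), min(ys)
    return [((x - cx) / s, (y - cy) / s) for x, y in points]

def path_distance(a, b):
    """Mean point-to-point distance between two equal-length paths."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def recognize(path, templates):
    """Return the name of the template with the smallest normalized distance."""
    n = normalize(path)
    return min(templates, key=lambda name: path_distance(n, normalize(templates[name])))
```

A production recognizer would also resample paths to a fixed point count so that gaze paths of different lengths can be compared; that step is omitted here for brevity.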

A user minimizing the browser with a gaze gesture.

A user draws a gesture with their eye movements and assigns a dedicated action (e.g., minimize).

Demo - Gaze gestures for accessible and rich interactions.

DyGazePass: A Gaze Gesture-Based Dynamic Authentication System to Counter Shoulder Surfing Attacks

Shoulder surfing enables an attacker to gain the authentication details of a victim through observation and is becoming a threat to visual privacy. We present DyGazePass (Dynamic Gaze Passwords), an authentication strategy that uses dynamic gaze gestures. We also present two authentication interfaces, a dynamic and a static-dynamic interface, that leverage this strategy to counter shoulder surfing attacks. The core idea is that a user authenticates by following uniquely colored circles that move along random paths on the screen. Through multiple evaluations, we discuss how authentication accuracy varies with the transition speed of the circles and the number of moving and static circles. Furthermore, we evaluate the resiliency of our authentication method against video analysis attacks by comparing it to a gaze- and PIN-based authentication system. Overall, we found that the static-dynamic interface with a transition speed of two seconds was the most effective authentication method, with an accuracy of 97.5%.

The following three images show the password selection interface, the dynamic interface, and the static-dynamic interface.

Demo - DyGazePass: A Gaze Gesture-Based Dynamic Authentication System to Counter Shoulder Surfing Attacks

A Gaze Gesture-Based User Authentication System to Counter Shoulder-Surfing Attacks

Shoulder-surfing is the act of spying on an authorized user of a computer system with the malicious intent of gaining unauthorized access. Current solutions to shoulder-surfing, such as graphical passwords, gaze input, and tactile interfaces, are limited by low accuracy, lack of precise gaze input, and susceptibility to video analysis attacks. We present an intelligent gaze gesture-based system that authenticates users from their unique gaze patterns as they follow moving geometric shapes. The system authenticates the user by comparing their scan-path with each shape's path and recognizing the closest match. In a study with 15 users, authentication accuracy was found to be 99% with true calibration and 96% with disturbed calibration. Also, compared to a gaze- and PIN-based authentication system, our system is 40% less susceptible to video analysis attacks, and such attacks take nearly nine times longer against it.

If a user has selected Square-Star-Pie as a password, the user is authenticated by following each shape's path in its respective frame, as shown in the sequence of figures below. The user first follows the square shape, then the star, and finally the pie. The user does not receive any feedback, since the gaze point and scan-path are hidden.

Demo - A Gaze Gesture-Based User Authentication System to Counter Shoulder-Surfing Attacks

Gaze Gesture-Based Interactions for Accessible HCI

Users with physical impairments are limited in their ability to work on computers using conventional mouse- and keyboard-based interactions. Existing accessible technologies still have usability issues, require a lot of training, and are imprecise. We present a gaze gesture-based interaction paradigm that lets users with physical impairments work on a computer using just their eye movements. We use an eye tracker that tracks the user's eye movements. To perform an action like minimizing or maximizing an application, opening a new tab, scrolling down, or refreshing a browser, the user moves their eyes to make a predefined gesture. The system recognizes the gesture performed and executes the corresponding action. Users with speech impairments can also use this system to speak quick phrases by performing gestures. This is crucial when a person with a speech impairment is interacting with another person who does not know sign language.

Demo - Gaze Gesture-Based Interactions for Accessible HCI

Gaze Typing Through Foot-Operated Wearable Device

Gaze Typing, a gaze-assisted text entry method, allows individuals with motor (arm, spine) impairments to enter text on a computer using a virtual keyboard and their gaze. Though gaze typing is widely accepted, the method is limited by its lower typing speed, higher error rate, and resulting visual fatigue, since dwell-based key selection is used. In this research, we present a gaze-assisted, wearable-supplemented, foot interaction framework for dwell-free gaze typing. The framework consists of a custom-built virtual keyboard, an eye tracker, and a wearable device attached to the user's foot. To enter a character, the user looks at the character and selects it by pressing, with the foot, the pressure pad attached to the wearable device. Results from a preliminary user study involving two participants with motor impairments show that the participants achieved a mean gaze typing speed of 6.23 Words Per Minute (WPM). In addition, the mean Key Strokes Per Character (KSPC) was 1.07 (ideal 1.0), and the mean Rate of Backspace Activation (RBA) was 0.07 (ideal 0.0). Furthermore, we present our findings from multiple usability studies and design iterations, through which we created appropriate affordances and experience design for our gaze typing system.
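The three metrics reported above can be sketched as formulas. WPM and KSPC follow the standard text-entry definitions; taking total keystrokes as the RBA denominator is our assumption, since the abstract does not state it.

```python
def wpm(chars_transcribed, seconds):
    """Words per minute, with a word standardized as five characters."""
    return (chars_transcribed / 5.0) / (seconds / 60.0)

def kspc(keystrokes, chars_transcribed):
    """Key Strokes Per Character: 1.0 means no extra or corrective keystrokes."""
    return keystrokes / chars_transcribed

def rba(backspaces, keystrokes):
    """Rate of Backspace Activation: 0.0 means no corrections were made.
    Denominator (total keystrokes) is an assumption, not from the paper."""
    return backspaces / keystrokes
```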

Demo - Gaze Typing Through Foot-Operated Wearable Device

Gaze-Assisted User Authentication to Counter Shoulder-surfing Attacks

A highly secure, foolproof user authentication method is still a primary focus of research in the field of user privacy and security. Shoulder-surfing is the act of spying on an authorized user logging into a system, driven by the malicious intent of gaining unauthorized access. We present a gaze-assisted user authentication system as a potential solution to counter shoulder-surfing attacks. The system comprises an eye tracker and an authentication interface with 12 pre-defined shapes (e.g., triangle, circle, etc.) that move on the screen. A user chooses a set of three shapes as a password. To authenticate, the user follows the paths of the three shapes as they move, one on each frame, over three consecutive frames. The system uses a template-matching algorithm to compare the scan-path of the user's gaze with the path traversed by each shape. A system evaluation involving seven users showed that the template-matching algorithm achieves an accuracy of 95%. Our study also suggests that gaze-driven authentication is highly resistant to shoulder-surfing attacks; the unique pattern of eye movements for each individual makes the system hard to break into.
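A minimal sketch of the matching step: the gaze scan-path is compared against the path each shape traversed during the frame, and the shape with the smallest mean point-to-point distance is recognized. This is our illustration under the assumption of time-aligned, equal-length paths, not the paper's exact algorithm.

```python
import math

def mean_distance(path_a, path_b):
    """Mean distance between two equal-length, time-aligned point sequences."""
    return sum(math.dist(p, q) for p, q in zip(path_a, path_b)) / len(path_a)

def closest_shape(scan_path, shape_paths):
    """Return the name of the shape whose path best matches the gaze scan-path."""
    return min(shape_paths, key=lambda name: mean_distance(scan_path, shape_paths[name]))
```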
Demo - Gaze-Assisted User Authentication to Counter Shoulder-surfing Attacks

GAWSCHI: Gaze-Augmented, Wearable-Supplemented Computer-Human Interaction

Recent developments in eye tracking technology are paving the way for gaze-driven interaction as a primary interaction modality. Despite successful efforts, existing solutions to the "Midas Touch" problem have two inherent issues that are yet to be addressed: 1) low accuracy, and 2) visual fatigue. In this work we present GAWSCHI: a Gaze-Augmented, Wearable-Supplemented Computer-Human Interaction framework that enables accurate and quick gaze-driven interactions while being completely immersive and hands-free. GAWSCHI uses an eye tracker and a wearable device (a quasi-mouse) that is operated with the user's foot, specifically the big toe. The system was evaluated through a comparative user study involving 30 participants, with each participant performing eleven predefined interaction tasks (on MS Windows 10) using both mouse- and gaze-driven interactions. We found that gaze-driven interaction using GAWSCHI is as good as mouse-based interaction in time and precision as long as the dimensions of the interface element are above a threshold (0.60" x 0.51"). In addition, an analysis of the NASA Task Load Index post-study survey showed that participants experienced low mental, physical, and temporal demand, and achieved high performance. We foresee GAWSCHI as a primary interaction modality for the physically challenged and a means of enriched interaction for able-bodied demographics.

Gaze-Assisted Human-Computer Interaction

Exploring Users' Perceived Activities in a Sketch-based Intelligent Tutoring System Through Eye Movement Data

Intelligent tutoring systems (ITS) empower instructors to make teaching more engaging by providing a platform to tutor, deliver learning material, and assess students' progress. Despite these advantages, existing ITS do not automatically assess how students engage in problem solving, how they perceive various activities, or how much time they spend on each activity leading to the solution. In this research, we present an eye tracking framework that, based on eye movement data, can assess students' perceived activities and overall engagement in a sketch-based intelligent tutoring system, "Mechanix." Based on an evaluation involving 21 participants, we present the key eye movement features and demonstrate the potential of leveraging eye movement data to recognize students' perceived activities (reading, gazing at an image, and problem solving) with an accuracy of 97.12%.
First Author: Purnendu Kaul

KinoHaptics: An Automated, Haptic Assisted, Physio-therapeutic System for Post-surgery Rehabilitation and Self-care

Problem Statement: 
A carefully planned, structured, and supervised physiotherapy program following a surgery is crucial for successful recovery from physical injuries. Nearly 50% of surgeries fail due to unsupervised and erroneous physiotherapy. Retaining a physiotherapist for an extended period is expensive, and physiotherapists are sometimes inaccessible. With the advancements in wearable sensors and motion tracking, researchers have tried to build affordable, automated physio-therapeutic systems that direct a physiotherapy session by providing audio-visual feedback on the patient's performance. However, many aspects of an automated physiotherapy program are yet to be addressed by existing systems: the wide variety of patients' physiological conditions, the demographics of the patients (blind, deaf, etc.), and persuading patients to adopt the system for an extended period for self-care.

Objectives and Solution:
In our research, we address these aspects by building a health behavior change support system called KinoHaptics for post-surgery rehabilitation. KinoHaptics is an automated, persuasive, haptic-assisted, physio-therapeutic system that can be used by a wide variety of demographics and for various physiological conditions. The system provides rich and accurate vibro-haptic feedback that can be felt by any user irrespective of physiological limitations. KinoHaptics is built to ensure that no injuries are induced during the rehabilitation period. The persuasive nature of the system allows for personal goal-setting, progress tracking, and, most importantly, lifestyle compatibility.

Evaluation and Results:
The system was evaluated under laboratory conditions involving 14 users. Results show that KinoHaptics is highly convenient to use, and the vibro-haptic feedback is intuitive, accurate, and helps prevent accidental injuries. Results also show that KinoHaptics is persuasive in nature, as it supports behavior change and habit building.

The successful acceptance of KinoHaptics, an automated, haptic-assisted, physio-therapeutic system, demonstrates the need for, and future scope of, automated physio-therapeutic systems for self-care and behavior change. It also shows that such systems, when incorporating vibro-haptic feedback, encourage strong adherence to the physiotherapy program and can have a profound impact on the physiotherapy experience, resulting in a higher acceptance rate.

Healthy Leap: An Intelligent Context-Aware Fitness System for Alleviating Sedentary Lifestyles

As people in industrialized countries enjoy modern conveniences that lead to greater sedentary lifestyles and decreased involvement in physical activities, they also increase their risk of acquiring hypokinetic diseases such as obesity and heart disease that negatively impact their long-term health. While emerging wearable computing technologies are encouraging researchers and developers to create fitness-based mobile user interfaces for combating the effects of sedentary lifestyles, existing solutions instead primarily cater to fitness-minded users who wish to take advantage of technology to enhance their self-motivated physical exercises.

In this work, we propose our mobile fitness system, Healthy Leap, which provides an intelligent context-aware user interface for encouraging users to adopt healthier and more active lifestyles through contextually appropriate physical exercises. Our system consists of an Android smartphone app that leverages a Pebble smartwatch to actively monitor users' situational context and activity and identify their current sedentary state. From this sedentary state information, Healthy Leap responds with physical activity reminders tailored to the user's physical constraints, using contextual information such as location, personal preference, calendar events, current time, and weather forecasts. From our evaluations of Healthy Leap, we observed that users not only benefited from sedentary state notifications that intelligently responded to their situational context, but were also more encouraged to engage in physical exercises to alleviate their sedentary lifestyles.

Let Me Relax: Toward Automated Sedentary State Recognition and Ubiquitous Mental Wellness Solutions

Advances in ubiquitous computing technology improve workplace productivity and reduce physical exertion, but ultimately result in a sedentary work style. Sedentary behavior is associated with an increased risk of stress, obesity, and other health complications. Let Me Relax is a fully automated sedentary-state recognition framework using a smartwatch and smartphone, which encourages mental wellness through interventions in the form of simple relaxation techniques. The system was evaluated through a comparative user study of 22 participants split into a test and a control group. An analysis of NASA Task Load Index pre- and post-study surveys revealed that test subjects who followed the relaxation methods showed a trend of both increased activity and reduced mental stress. Reduced mental stress was found even in test subjects whose inactivity increased. These results suggest that repeated interventions, driven by an intelligent activity recognition system, are an effective strategy for promoting healthy habits that reduce stress, anxiety, and other health risks associated with sedentary workplaces.

Framework for Accelerometer Based Gesture Recognition and Integration with Desktop Applications

Master of Science,  Birla Institute of Technology and Science, Pilani, India

This research demonstrates simplified, alternative ways of interacting with desktop applications through natural hand-based gestures. Most desktop applications are presumed to receive user input through traditional devices like the keyboard and mouse. The gesture recognition framework implemented in this work leverages accelerometer data from a smartphone held in the user's hand to identify user gestures.

A short-distance communication protocol, Bluetooth, is used to transmit the accelerometer data from the smartphone to a desktop at a constant rate, making the whole system wireless. Accelerometer data received at the desktop computer is analyzed to identify the most appropriate gesture it encodes and is then transformed into corresponding key-press and mouse events. The key-press and mouse events thus generated control various applications and games on the desktop computer. This framework enriches interaction with desktop applications and games, and enhances the user experience through intuitive and lively gestures. The framework also enables the development of more creative games and applications.
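The accelerometer-to-key-event mapping described above can be sketched as follows. This is our illustration of the pipeline, not the project's code: the tilt thresholds, sign conventions, and key bindings are hypothetical.

```python
TILT_THRESHOLD = 4.0  # m/s^2 of gravity leaking into the x/y axes (hypothetical)

# Hypothetical binding of tilt gestures to desktop key events,
# e.g., steering in a racing game with left/right tilts.
GESTURE_TO_KEY = {
    "tilt_left": "LEFT", "tilt_right": "RIGHT",
    "tilt_forward": "UP", "tilt_back": "DOWN",
}

def gesture_from_sample(ax, ay):
    """Classify one (x, y) accelerometer sample into a tilt gesture."""
    if abs(ax) < TILT_THRESHOLD and abs(ay) < TILT_THRESHOLD:
        return None  # phone held roughly level: no gesture
    if abs(ax) >= abs(ay):
        return "tilt_right" if ax > 0 else "tilt_left"
    return "tilt_forward" if ay > 0 else "tilt_back"

def key_event(ax, ay):
    """Translate one accelerometer sample into the key event to inject."""
    return GESTURE_TO_KEY.get(gesture_from_sample(ax, ay))
```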

Demo - Accelerometer Based Gaming - Integration with NFS on Desktop

Demo - Accelerometer Based Painting - Integration with Windows Paint

Multi-threaded Download Accelerator With Resume Support

Visvesvaraya Technological University, Karnataka, India

Issues with existing file transfer protocols
When a computer transfers files over a network to another computer, it typically establishes a single connection with the server and transfers the files sequentially over this connection. This method slows down data transfer and does not utilize the available bandwidth effectively.

Using multi-threading, it is possible for several threads to connect to the server independently over different sockets and transfer either different files or different portions of a single file simultaneously. Also, when a data transfer is terminated abruptly, the entire download operation generally needs to be restarted from scratch. This can be eliminated if the data transfer package maintains the download status of every file, with resume support on the server side, ensuring that a download resumes from the point of disconnection rather than from the beginning.
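The core of splitting a single file across threads, and of resuming after a disconnect, is computing the byte range each thread is responsible for. A minimal sketch of that computation (our illustration; the original implementation and its protocol details may differ):

```python
def byte_ranges(file_size, n_threads, already_done=0):
    """Split the remaining bytes of a file into one inclusive (start, end)
    range per thread, resuming after `already_done` bytes are on disk."""
    remaining = file_size - already_done
    chunk = remaining // n_threads
    ranges = []
    start = already_done
    for i in range(n_threads):
        # The last thread absorbs any remainder from integer division.
        end = file_size - 1 if i == n_threads - 1 else start + chunk - 1
        ranges.append((start, end))
        start = end + 1
    return ranges
```

Each thread then requests only its range over its own socket; on resume, the bytes already verified (e.g., by a CRC check) are passed as `already_done` so only the missing tail is re-fetched.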

The main objectives of the Multi-threaded Download Accelerator with Resume Support:
  • To develop a server that can support file transfer transactions, with resume support
  • To develop a client that can provide an attractive graphical user interface to the user and help the user connect to specific systems and transfer files. Furthermore, the client must be able to maintain the status of all downloads.
  • To develop a protocol that ensures that the client can communicate with the server.
  • To incorporate multi-threading in order to improve bandwidth utilization, with proper communication among threads so that there are no synchronization problems or race conditions.
  • To introduce resume support by incorporating CRC checks, so that incomplete downloads can be resumed from the point where they were left off.

Graphics Editor

Graphics Editor is utility software that enables a user to carry out graphical operations like drawing geometric figures and text. The system is developed entirely in the "C" programming language. Geometric shapes that can be drawn with the editor include rectangles, circles, ellipses, lines, and spirals. Graphics Editor also supports transformations of the geometric figures, along with various other functions such as Save, Load, Clip, Rotate, and Scale.

Graphics Editor is mouse-driven, with the different functions represented as icons. The GUI is user-friendly: anyone can use the editor without any learning prerequisites. In addition, the system provides different colors that can be applied to geometric figures and different patterns that can be used to fill shapes (rectangles, circles).

Linux Shell

This project is a custom implementation of a Linux shell that enhances the basic functionality provided by default Linux shells like Bourne, Korn, C, and Bash. The custom shell can run executables with command-line arguments.

This shell is an intermediary program, an interpreter, which interprets commands entered at the terminal and translates them into commands understood by the kernel. Myshell thus acts as a blanket around the kernel and eliminates the need for a programmer to communicate directly with the kernel.

A unique feature of the Linux operating system is that all Linux commands exist as utility programs. These programs are located in individual files in one of the system directories, such as /bin, /etc, or /usr/bin. The shell can be considered a master utility program that enables a user to gain access to all the other utilities and resources of the computer.

The shell reads the first word of a command line and tries to identify whether it is an alias, a function, or an internal command. If there is an external command to be executed, the shell searches through the directories specified in the path for the command file, then executes the command.

How does it work ?
  • On logging into the terminal, the custom shell displays the Linux prompt, indicating that it is ready to receive a command from the user.
  • The user issues a command, for example: ls <directory-name>
  • The custom shell then,
    • Reads the command, 
    • Searches for and locates the file with that name in the directories containing utilities, 
    • Loads the utility into memory and executes the utility.
  • After the execution is complete, the shell once again displays the prompt, conveying that it is ready for the next command.
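The read-locate-execute loop above can be sketched in Python (the original shell is presumably written in C; this is our illustration under the assumption of a POSIX-style PATH). `shutil.which` plays the role of searching the utility directories, and `subprocess.run` loads and executes the utility.

```python
import shlex
import shutil
import subprocess

def run_command(line):
    """One iteration of the shell loop: parse, locate, execute."""
    argv = shlex.split(line)       # split the line, honoring quoting
    if not argv:
        return None
    path = shutil.which(argv[0])   # search PATH directories (/bin, /usr/bin, ...)
    if path is None:
        return f"{argv[0]}: command not found\n"
    result = subprocess.run([path] + argv[1:], capture_output=True, text=True)
    return result.stdout           # hand the utility's output back to the user

def repl():
    """Display the prompt, read a command, run it, repeat."""
    while True:
        try:
            line = input("myshell$ ")
        except EOFError:
            break                   # Ctrl-D ends the session
        out = run_command(line)
        if out:
            print(out, end="")
```

A real shell would additionally handle aliases, shell built-ins (cd, exit), pipes, and redirection before falling back to the PATH search shown here.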

Lex and Yacc

This project involved understanding, and further modifying, an existing implementation of the Lex lexical analyzer and Yacc parser to implement a custom interpreter on a UNIX system. The work enables a user to specify customized "patterns" from which the lexical analyzer generates "tokens".

These tokens are then fed to the customized Yacc parser, which lets the user specify a "grammar" to suit the requirements of the custom interpreter.