Feedback satisfaction subjective? – botched calibration?

Postby JeffKang » 30 May 2014, 09:49

Subjective satisfaction for feedback? – What about botched calibration attempts?

Subjective feedback satisfaction

Before attempting to send feedback data, I’m never really sure how to rate my satisfaction on the scale. I don’t know what “good” is.

It made me wonder about how others are making the decision. E.g. somebody might use the eye tracker, and have very high expectations. Their “poor” experience might actually be the best that the eye tracker can do.

Stars are a more important indicator?

Can the calibration stars reveal the true experience? For example, say somebody gets five stars but gives a low satisfaction rating. Does the five-star calibration indicate that the eye tracker was working optimally, regardless of how they felt about it? That is, is setting the satisfaction level less important?

Effect of fumbled calibrations, or outlier calibrations

What happens if somebody screws up their calibration? E.g. they’re blinking too many times, they miss a couple targets, or they’re moving around too much.

In some of the previous versions, I was getting one star in some attempts and five stars in others. Had the feedback-sending procedure been working in those versions, I'm not sure how you would have interpreted that. (Although, in these cases, I think it had to do with all the flickering that I was getting.)

Your preferences for the conditions that we should send feedback?

Is it best if we only send feedback on calibrations that get 4 to 5 stars? Should we evaluate how we just performed after a calibration attempt, and cancel the sending attempt and try again if we felt that we didn't do well (e.g. didn't follow the targets accurately)?

I imagine that you have to send successful attempts to some extent.

I use speech recognition, and you can correct mistakes, which improves the software for the future. However, I’m mindful of the situations where I say “correct that”, and instead of correcting the word, I decide to change the entire thought. I wonder if that screws up the speech recognition, or if there’s any mechanism to compensate for that.

The same thing goes for using speech recognition with Google search. You say your search term, and if it's incorrect, you go in and fix it manually, which is data that Google uses. Sometimes, though, I'll go in not to fix the speech-recognition attempt, but because I want a completely different term. I wonder how they deal with that. Maybe if the search term varies too much, it's ignored.

Similarly, with the eye-tracking calibration, if I rub my eyes mid-calibration, maybe there's a way for you to detect the divergence. Maybe you have a way to gather useful data regardless of the conditions of the calibration. Or maybe you don't, and we should try to be as attentive as possible during a calibration.
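
Purely as a hypothetical sketch of what I mean by detecting the divergence (I have no idea how the actual calibration works, so the data and threshold here are invented), I imagine something like flagging calibration points whose error sticks out from the rest:

Code:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical sketch only - not the actual calibration code.
    class OutlierPointSketch
    {
        // errorPerPointDeg: estimated error (in visual degrees) at each calibration point.
        // Returns the indices of points that look botched, e.g. because the user blinked,
        // rubbed their eyes, or looked away while that point was shown.
        static List<int> FindOutlierPoints(IList<double> errorPerPointDeg, double maxFactor = 3.0)
        {
            // Typical (median) error across all points.
            double median = errorPerPointDeg.OrderBy(e => e).ElementAt(errorPerPointDeg.Count / 2);

            return errorPerPointDeg
                .Select((err, index) => new { err, index })
                .Where(x => x.err > maxFactor * median)   // far worse than the typical point
                .Select(x => x.index)
                .ToList();
        }
    }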

Thanks.
JeffKang
 
Posts: 129
Joined: 15 Feb 2014, 23:59

Re: Feedback satisfaction subjective? – botched calibration?

Postby Martin » 31 May 2014, 02:29

JeffKang wrote: Subjective feedback satisfaction

Before attempting to send feedback data, I’m never really sure how to rate my satisfaction on the scale. I don’t know what “good” is.

It made me wonder about how others are making the decision. E.g. somebody might use the eye tracker, and have very high expectations. Their “poor” experience might actually be the best that the eye tracker can do.


What you say is spot on. It is a subjective rating, but we get a sense of user satisfaction from it. Given the data, we can see how the system is performing, whether the accuracy and precision are good, etc. If someone gets a top-notch calibration and still rates it low, there are either very high expectations or other factors that we need to improve upon.

For example, a good calibration result does not necessarily mean that the tracking is robust. From the data sent, we can compute statistics for many different things and get a good picture of how well it is working. We can then spot trends for specific cases (e.g. glasses, large screens, certain eye colors, etc.) and work to improve the algorithms.

JeffKang wrote: Stars are a more important indicator?

We chose to use stars as the rating scheme because the scientific standard of degrees of visual angle is not that user-friendly; for example, the on-screen size of a degree differs with the distance to the screen. Another idea would be to use pixels, as many are familiar with them, but with the introduction of high-density (retina) displays there is a big variation in pixels per inch.
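
Just to illustrate why degrees are awkward to show to users, here is a rough sketch (the viewing distance and pixel densities are made-up example numbers, not anything we measure) of how many pixels one degree of visual angle covers:

Code:

    using System;

    class VisualAngleExample
    {
        // Rough illustration with example numbers: how many pixels one degree of
        // visual angle covers, given viewing distance and display density.
        static void Main()
        {
            double distanceMm = 600;    // ~60 cm viewing distance (example assumption)
            double ppiStandard = 96;    // classic desktop display
            double ppiRetina = 220;     // high-density (retina) display (example assumption)

            Console.WriteLine("1 degree at  96 PPI: {0:F0} px",
                DegreesToPixels(1.0, distanceMm, ppiStandard));
            Console.WriteLine("1 degree at 220 PPI: {0:F0} px",
                DegreesToPixels(1.0, distanceMm, ppiRetina));
        }

        // On-screen size, in pixels, of 'degrees' of visual angle seen from
        // 'distanceMm' millimetres away on a display with 'ppi' pixels per inch.
        static double DegreesToPixels(double degrees, double distanceMm, double ppi)
        {
            double sizeMm = 2.0 * distanceMm * Math.Tan(degrees * Math.PI / 180.0 / 2.0);
            return sizeMm / 25.4 * ppi;   // 25.4 mm per inch
        }
    }

The same accuracy in degrees thus corresponds to a very different number of pixels depending on the display and how far away you sit, which is why stars are easier to interpret.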

The conversion function from visual angle to stars is very straightforward and looks like this:

Code:

    // accuracy = average error, in visual degrees, at the calibration points

    // Check for zero, i.e. no calibration data at all
    if (accuracy == 0)
    {
        RatingValue = 0;
        return "UNCALIBRATED";
    }

    // Less than 0.5 degrees of error: five stars
    if (accuracy < 0.5)
    {
        RatingValue = 5;
        return "PERFECT";
    }

    // Less than 0.7 degrees: four stars
    if (accuracy < 0.7)
    {
        RatingValue = 4;
        return "GOOD";
    }

    // Less than 1 degree: three stars
    if (accuracy < 1)
    {
        RatingValue = 3;
        return "MODERATE";
    }

    // Less than 1.5 degrees: two stars
    if (accuracy < 1.5)
    {
        RatingValue = 2;
        return "POOR";
    }

    // 1.5 degrees or more: one star, suggest redoing the calibration
    RatingValue = 1;
    return "REDO";

It should, however, be mentioned that the accuracy is computed during the calibration; as such, it only reflects the accuracy at the positions of the calibration points. To do a fair accuracy assessment, one would need to calibrate as normal and then measure all over the screen, say at 16-24 positions, and compute the average accuracy.
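
As a rough sketch of what such an assessment could look like (the data structures here are invented for the example and are not part of any SDK), you would collect gaze samples at each validation point and average the angular errors:

Code:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Sketch only: these types are invented for the example.
    class AccuracyAssessmentSketch
    {
        // A gaze estimate, expressed in visual degrees.
        public class GazeSample { public double X; public double Y; }

        public class ValidationPoint
        {
            public double TargetX;            // known target position (degrees)
            public double TargetY;
            public List<GazeSample> Samples;  // gaze estimates recorded while looking at it
        }

        // Average angular error over a grid of, say, 16-24 validation points spread
        // over the whole screen, measured after calibrating as normal.
        public static double AverageAccuracyDegrees(IEnumerable<ValidationPoint> points)
        {
            return points
                .Select(p => p.Samples.Average(s =>
                    Math.Sqrt(Math.Pow(s.X - p.TargetX, 2) +
                              Math.Pow(s.Y - p.TargetY, 2))))
                .Average();
        }
    }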

Perhaps needless to say, there has been some debate on the methodology within the eye-tracking field. For the time being, the most ambitious effort is taking place through COGAIN, which will probably end up as *the* international standard for eye-movement data quality.

JeffKang wrote: Effect of fumbled calibrations, or outlier calibrations


That's perfectly fine. We appreciate good data, but imperfect data even more. It helps us spot difficulties in recovering from lost tracking, e.g. how long it takes from when the pupil is lost until it is found again, or how the system performs with partial or unbalanced data.
A great deal of the work in building a good eye tracker is not about the perfect conditions; that's easy. The difficult part is what you do in sub-optimal cases (e.g. how to handle all types of eyes and environments).
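
As a made-up example of the kind of statistic we can compute from imperfect data, recovery time from lost tracking could be pulled out of a stream of timestamped samples roughly like this (the sample format is invented for the illustration):

Code:

    using System;
    using System.Collections.Generic;

    // Sketch only: the sample format is invented for the illustration.
    class RecoveryTimeSketch
    {
        public struct GazeSample
        {
            public double TimestampMs;  // time of the sample
            public bool PupilFound;     // did the tracker find the pupil in this frame?
        }

        // Durations (in ms) of every gap where the pupil was lost and later found again.
        public static List<double> RecoveryTimesMs(IEnumerable<GazeSample> samples)
        {
            var gaps = new List<double>();
            double? lostAt = null;

            foreach (var s in samples)
            {
                if (!s.PupilFound && lostAt == null)
                {
                    lostAt = s.TimestampMs;                   // tracking just dropped out
                }
                else if (s.PupilFound && lostAt != null)
                {
                    gaps.Add(s.TimestampMs - lostAt.Value);   // tracking recovered
                    lostAt = null;
                }
            }
            return gaps;
        }
    }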

JeffKang wrote: Your preferences for the conditions that we should send feedback?

A bit of everything would be great: some good ones, but primarily feedback data from when things are not working as well as usual, or behaving unexpectedly. Perhaps you bring the tracker to a new environment and suddenly it is jittery or giving poor calibrations. There is no better way for us to improve the system than to be able to reanalyze the data and find those specific issues.

In general, we prefer data that is as natural as possible, without any effort to make it look better or worse.
Martin
 
Posts: 567
Joined: 29 Oct 2013, 15:20

