Unfortunately, this may say as much about your computer monitor as it says about your eyes.
Edit: After finishing the test, the demographic information is clearly screwed up now, since the score range for males in the 20-29 range is -160 to 444445389, which is definitely wrong on the low end (which is supposed to be zero), and probably wrong on the high end (which is probably not supposed to be more than 400 million).
I got 0, which must be wrong because I was officially diagnosed with below-average color recognition but better-than-average contrast differentiation.
That is a badly designed test, for all the reasons you learn in psychology when talking about experimental test design.
Most importantly: How is this thing scored? Imagine you have everything but one block in the right order. Now every element following that misplaced block is one position away from where it belongs. Are they now wrong as well? Or are they scored as correct, since they are in the correct order relative to each other?
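To make the question concrete, here is a minimal sketch of two possible scoring rules. Both functions are hypothetical illustrations; nothing here is known to match how the site actually scores. Blocks are represented by their correct index, so a perfect row is 0, 1, 2, and so on.

```python
def positional_errors(order):
    """Count blocks that are not sitting at their correct index."""
    return sum(1 for index, block in enumerate(order) if block != index)


def neighbour_errors(order):
    """Sum how far each adjacent pair is from being consecutive."""
    return sum(abs(abs(a - b) - 1) for a, b in zip(order, order[1:]))


# Ten blocks, with block 1 dragged to the far end of the row.
row = [0, 2, 3, 4, 5, 6, 7, 8, 9, 1]

print(positional_errors(row))  # 9: every displaced block counts as wrong
print(neighbour_errors(row))   # 8: only the two broken joins are penalised
```

The positional rule punishes the whole shifted run, while the neighbour rule only punishes the places where the sequence breaks, so the same row can look nearly all wrong or mostly right depending on the rule.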
I'm sure there exist rules for this kind of test design, but it is a hard decision nonetheless. It's error-prone.
It would be a much sounder experiment to always present only two colour blocks and let the user put them in order (or let them decide which one is greener, stuff like that). This way one would also detect whether people are able to distinguish between the colours at all.
Edit: But if they are specifically testing the way people arrange that stuff and trying to detect patterns, then well, that might be a different story.
Well, it all depends on what you want to test. Normally, such an experiment should try to find which colours can still be distinguished when compared, detecting the smallest noticeable difference and any patterns that don't work for subgroups. I don't see how that could work well with this measurement.
If it really is about the whole row of colours, then the Levenshtein distance could indeed work well.
It seems this would not fix the issue of having one block off and thereby pushing a bunch of other blocks one off from their right position. You'd still have to move all those blocks back, which means the Levenshtein distance would not be short.
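How large the distance gets depends on which edit operations are allowed. Here is a minimal sketch, assuming the standard Levenshtein definition (insert, delete, substitute, each costing 1); the function name and the ten-block example row are made up for illustration. Under that definition, one moved block costs a deletion plus an insertion, no matter how many blocks it displaces.

```python
def levenshtein(a, b):
    """Standard edit distance with unit-cost insert, delete and substitute."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x from a
                current[j - 1] + 1,          # insert y into a
                previous[j - 1] + (x != y),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


correct = list(range(10))
# Block 1 dragged to the far end; blocks 2..9 each sit one slot too early.
shifted = [0, 2, 3, 4, 5, 6, 7, 8, 9, 1]

print(levenshtein(correct, shifted))  # 2
```

A distance that only allowed substitutions (a Hamming-style count) would give 9 for the same row, which is closer to the concern raised above.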
When inspecting the elements I saw IDs like "patch_ROW_COLUMN" in the container. Just put them in order of their IDs and you should get a perfect score.
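One catch when ordering by ID: a plain string sort would put "patch_0_10" before "patch_0_2". A small sketch of a numeric sort, assuming the IDs really do follow the "patch_ROW_COLUMN" pattern with integer parts (the sample IDs below are invented):

```python
def patch_key(patch_id):
    """Split 'patch_ROW_COLUMN' into a numeric (row, column) pair."""
    _, row, column = patch_id.split("_")
    return int(row), int(column)


# Hypothetical IDs, not copied from the actual page.
ids = ["patch_1_0", "patch_0_10", "patch_0_2", "patch_1_3", "patch_0_1"]
ordered = sorted(ids, key=patch_key)

print(ordered)
```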
I got 8 instead of 0. Bug?
I scored a 0, so it is possible; there may still be a bug (perhaps my order was wrong but it still gave me a perfect score for some reason), but it is possible.
Just clicking through without doing any reordering got me 1042 so I don't think it's in the [0,99] interval. I'm assuming 99 is just an arbitrary "you have terrible eyesight" cutoff.
I have the feeling they aren't checking the input, so anyone could probably POST them some results and they would go straight into the DB. Speaking of which, BRB, let me see if I can lower it to -200.
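For what it's worth, the kind of check that seems to be missing is small. This is only a sketch of a server-side sanity check; the function name and the upper bound are assumptions, since the real maximum score is not known from this thread.

```python
# Hypothetical cap; the site's true maximum score is unknown.
ASSUMED_MAX_SCORE = 2000


def is_plausible_score(value):
    """Accept only whole-number scores within the assumed valid range."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 <= value <= ASSUMED_MAX_SCORE


for submitted in (32, 1042, -160, 444445389, "32"):
    print(submitted, is_plausible_score(submitted))
```

Rejecting impossible values would not stop someone from submitting a fake but plausible score; for that, the server would have to recompute the score from the submitted ordering.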
My result is 32 (0 is perfect, 99 is low). But the best and highest scores for my demographic look wrong, -160 and 444445389. Is that a bug or am I misunderstanding something?
A few observations. I think it's easier if you don't group the hues roughly immediately. I did that and it made it harder to discern the hues later when I was doing fine adjustments. Also, I'm a bit worried about my monitor not accurately displaying all the colours, possibly making them look the same.
My second score is 4. I took my time and re-adjusted the screen brightness now and then. When I felt fatigue I would look away for a while. Also, and I feel most importantly, I would frequently jump from one band to another, because I felt that looking at similar hues for too long would desensitise me.
That shouldn't make a difference. If anything, it should help as it would better match your screen to the ambient light, allowing you to make better choices.
I've heard of a lot of people in graphics who use f.lux or redshift only to turn it off when they do graphics work, but I get more accurate results if it is left on. This calls for another experiment!
It will make a difference. Your eyes can see a much wider color space than your monitor can display. By shifting colors, f.lux leaves some of the monitor's color space unused, compressing the remaining colors so that some end up closer together (or even identical).
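A rough way to see the compression, assuming an 8-bit channel and assuming the warming filter simply scales the blue channel to 60% (both numbers are illustrative, not f.lux's actual behaviour):

```python
def dim_channel(value, numerator=6, denominator=10):
    """Scale an 8-bit channel value by numerator/denominator, rounding down."""
    return (value * numerator) // denominator


inputs = range(256)
outputs = [dim_channel(v) for v in inputs]

print(len(set(inputs)))   # 256 distinct blue levels before scaling
print(len(set(outputs)))  # 154 distinct blue levels after scaling
print(dim_channel(0), dim_channel(1))  # 0 0: two inputs now look the same
```

With fewer output levels than input levels, some pairs of neighbouring hues in the test necessarily map to the same displayed color.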