That scientist went to that lab, looked closely at everything for 2 days...

Yeah, that also worked for Uri Geller. Scientists with Ph.D.s investigated his spoons! They found no gimmicks!

Of course there's nothing wrong with the equipment that anyone can find by staring at it for two days. Let's do a little math. Consider the guy from SRI. He has been working on cold fusion for twenty years. The experiment is always described as "incredibly simple", so he probably should have been running it daily all that time, just cranking it over and over with the goal of getting perfect reliability and repeatability. But let's assume that, instead, he thinks really hard every time he turns on the equipment, so his group has run it about once a week. Once a week for twenty years is about 20x50 = 1000 experiments. He reports approximately 50 non-reproducible positive results. That implies that he gets negative results about 95% of the time, maybe more.
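For anyone who wants to check the arithmetic, here is a quick sketch in Python. Every input is one of the rough guesses from the paragraph above (twenty years, about fifty runs a year, about fifty positives), not a measured figure, and the function name is just something made up for illustration.

```python
# Back-of-envelope check; all inputs are the rough estimates from the post.
YEARS = 20
RUNS_PER_YEAR = 50        # assumption: roughly one run per week
POSITIVE_RESULTS = 50     # assumption: approximate count of reported positives


def run_summary(years, runs_per_year, positives):
    """Return (total runs, fraction positive, fraction negative)."""
    total_runs = years * runs_per_year
    positive_rate = positives / total_runs
    return total_runs, positive_rate, 1 - positive_rate


total, pos_rate, neg_rate = run_summary(YEARS, RUNS_PER_YEAR, POSITIVE_RESULTS)
print(f"{total} runs, {pos_rate:.0%} positive, {neg_rate:.0%} negative")
# prints "1000 runs, 5% positive, 95% negative"
```

If he actually ran the thing more often than once a week, the total goes up and the negative fraction only gets closer to 100%, which is the "maybe more" part.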

So the instrument basically works. 95% of the time, the measurement shows that you get out exactly what you put in. That's pretty good! I've worked on lots of graduate-level experiments that have a far higher failure rate. (This isn't manufacturing. Your equipment is often literally held together with duct tape.)

Of course you're not going to find an intermittent problem in a 95%-accurate instrument by staring at the thing. You will have to work with it for a while. And even working experiments will yield a certain percentage of mysterious anomalies that you will never understand. There's a power surge. One of the instruments has an intermittent software bug. One of the sensors burns out slowly, spending several months giving erroneous readings before it finally fails. A solder joint starts to work loose. Some piece of metal contacts your thermocouple and creates a bimetallic junction. There's a ground loop. Static electricity. You misread a dial. You nudge a knob without noticing. Your materials come from different batches at the supplier. Some of your students use a different procedure than the others.
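To put a rough number on "staring at the thing": suppose, purely as an assumption, that each run independently has the same 5% chance of showing an anomaly. Then the chance that a visitor sees at least one anomaly depends on how many runs they actually get to watch. The run counts below are hypothetical; nobody has said how many runs fit into a two-day visit.

```python
def chance_of_seeing_anomaly(anomaly_rate, runs_observed):
    """Probability of at least one anomaly in runs_observed independent runs."""
    return 1 - (1 - anomaly_rate) ** runs_observed


# Hypothetical visit lengths, measured in runs watched.
for runs in (1, 2, 10, 50):
    p = chance_of_seeing_anomaly(0.05, runs)
    print(f"{runs:>2} runs watched: {p:.1%} chance of seeing an anomaly")
```

Under these assumptions a visitor who watches one or two runs will most likely see nothing odd at all, and it takes dozens of runs before catching an anomaly becomes the likely outcome. Real glitches are not independent coin flips, of course, so treat this as an illustration of the scale, not a model of the lab.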

That's why irreproducible results have such a bad reputation. Everyone gets them. They are the stuff of legends ("gremlins", "Magic/More Magic", etc.) precisely because everybody gets them, and because they often lead you astray. You cannot hope to explain them all, because they are irreproducible! If you hope to pull an actual observation out of the noise, you either need to design a cleaner experiment, or make very careful observations and use statistics to pull out correlations ("Hey, every anomaly has occurred between midnight and 3am!" "Hey, every bad chip has come from a wafer that was handled by the guy who was once cited for forgetting his gloves!"). Then you construct hypotheses, and then zero in on those hypotheses. ("Hmm, if we feed the grad students more coffee the anomalies at 3am go away." "We lectured the guy about wearing gloves and the failure rate dropped by 85%.")

If you can't reproduce the experiment after a lot of effort, you need to seriously consider the possibility that you're fooling yourself. I've seen a lot of scientists fail this test, temporarily or (sometimes) for entire careers.

The other point to make is that scientists fight like tigers in print, but are often very polite in person, and exceedingly polite on camera. Remember: Scientists are anonymously graded by each other. You have to appear to fight fair, or your colleagues will downgrade your reviews and your grants will get rejected. So to get a scientist to make fun of another scientist's equipment on camera is like convincing the Pope to smack a non-Christian with a mace on live television. We prefer to eviscerate each other's manuscripts... often anonymously. That's why so many of us prefer to wait for something to appear in print... and why the cold fusion guys are justly vilified for going to the press before getting their work accepted by a journal.