
Worse than the chart crime of truncating the y-axis is putting Llama 2's HumanEval scores on there and not comparing them to Code Llama Instruct 70B. DBRX still beats Code Llama Instruct's 67.8, but not by that much.


> "On HumanEval, DBRX Instruct even surpasses CodeLLaMA-70B Instruct, a model built explicitly for programming, despite the fact that DBRX Instruct is designed for general-purpose use (70.1% vs. 67.8% on HumanEval as reported by Meta in the CodeLLaMA blog)."

To be fair, they do compare to it in the main body of the blog. It's probably just misleading to compare against CodeLLaMA on non-coding benchmarks.


Which non-coding benchmark?


> chart crime of truncating the y axis

If you chart the temperature of the ocean, do you keep the y-axis anchored at zero kelvin?


If you chart the temperature of the ocean, are you measuring it in kelvin?


Apparently, if you want to avoid "chart crime" when you chart temperatures, then it's deceptive if you don't start at absolute zero.


When was the temperature on Earth at absolute zero?


My point exactly.

In a chart of world gross domestic product for the last 12 months, when was it at zero?

In a chart of ocean salinity, when was it at absolute zero?

Is it inherently deceptive to use a y-axis that doesn't begin at zero?



