I think it's because benchmarks serve as roadmaps. Publishing benchmarks signals the direction of future development to the community and ecosystem. As long as the direction is correct, the ecosystem will naturally align itself accordingly.
Then, the company only needs to excel in the conditions defined by the benchmark. If the benchmark itself is set incorrectly, the direction becomes flawed. Disruptive companies often change the way success is measured; they prioritize differently from mainstream ones.
For example, what truly disrupted AI with generative models? Traditional AI emphasized accuracy on classification tasks, which are essentially multiple-choice questions. Generative AI, however, required models to produce much broader outputs, something nobody initially considered. This meant that autoregressive models, which hadn't seemed important at the time, became central. To recognize the value of autoregressive models, one must first value AI's horizontal use cases, that is, its generalization capabilities.
In essence, different visions lead to different benchmarks, and the perceived importance of certain directions often comes down to taste and overall strategic perspective.