dyl000 7 days ago

I wonder how long it will be before we acknowledge that AI labs are heavily gaming benchmarks, and that the benchmarks are mostly useless as a way of judging model performance.

The latest one to be caught was Meta, but they've all been doing it for a while now.