https://lukaszaph075.almoheet-travel.com/why-is-grok-3-so-bad-at-citations-even-if-summarization-looks-good

I spent the week stress-testing the Grok-2 reasoning model through its new API. As someone who lives in documentation and rate limits, I wanted to see whether its performance holds up against the industry incumbents. With pricing set at $0

Submitted on 2026-05-09 01:59:31