https://lukaszaph075.almoheet-travel.com/why-is-grok-3-so-bad-at-citations-even-if-summarization-looks-good
I spent the week stress-testing the Grok-2 reasoning model through its new API. As someone who lives in documentation and rate limits, I wanted to see if the performance holds up against the industry incumbents. With pricing set at $0