I just tried it on OpenRouter, but I was served by Cerebras. Holy... 40,000 tokens per second. That was SURREAL.
I got a 1.7k token reply delivered too fast for the human eye to perceive the streaming.
n=1 for this 120B model, but I'd rank the reply #1, just ahead of Claude Sonnet 4, for a boring JIRA-ticket-shuffling type challenge.
EDIT: The same prompt on gpt-oss, despite being served 1000x slower, wasn't as good, though it was in a similar vein. It wanted to clarify more, and as a result only half-responded.