WiSaGaN's comments | Hacker News

Yeah, this seems like a personal choice, and one that did work out given the current result.

A harness is fine. I think people here are arguing that what was provided here to take the test is not a harness.

This violates the ToS, but I don't think it's distillation. Distillation requires knowing the logits, which the current API does not provide. This is just synthetic data generation. Anthropic definitely knows the difference.


Yes, it is annoying that companies keep calling it “distillation” when it’s really imitation learning. In fact, the closest analogy is probably “scraping,” which is pretty ironic.
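The distinction in the two comments above can be made concrete with a toy sketch (hypothetical numbers, no real API involved): distillation trains a student against the teacher's full output distribution recovered from logits, while synthetic-data / imitation learning only ever sees the single token the teacher actually emitted.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

teacher_logits = np.array([2.0, 1.0, 0.1])  # over a toy 3-token vocab
student_logits = np.array([1.5, 1.2, 0.3])

# Distillation loss: KL(teacher || student) over the full distribution,
# which requires access to the teacher's logits.
p, q = softmax(teacher_logits), softmax(student_logits)
kl = float(np.sum(p * (np.log(p) - np.log(q))))

# Imitation loss: cross-entropy on only the one token the teacher
# actually output (here taken as the argmax for determinism).
sampled = int(np.argmax(teacher_logits))
ce = float(-np.log(q[sampled]))

print(f"distillation (KL over logits): {kl:.4f}")
print(f"imitation (CE on one sampled token): {ce:.4f}")
```

The imitation loss throws away everything the teacher "knew" about the non-sampled tokens, which is exactly the information an API that returns only text withholds.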


Google has gigantic power over its users. Consider that, for some reason, Google bans your Gmail account, which you are using for a large number of logins across different essential services.


All it takes is Google to ban you from one service and you’re locked out of things like, oh I don’t know, GCP…


I think diffusion makes much more sense than auto-regressive (AR) generation specifically for code generation, as compared to chatbot use.


Great article. To me, this highlights a key question in the era of rapidly advancing machine intelligence: if we know machine intelligence is progressing, what is more valuable to build for? As humans, we still find many tools useful even when doing knowledge work. For instance, a calculator. Sure, a smart person can perform calculations in their head, but it’s much easier to teach everyone how to use a calculator, which is 100% reliable in its intended domain.

In this era, we should build these kinds of tools for problems we know are essentially solved, where being smarter doesn't help, even as intelligence continues to advance. Using tools like "bash" or command-line interfaces originally designed for humans is a good initial approach, since we can essentially reuse much of what was built for human use. Later, we can optimize specifically for machines, either accounting for their different cognitive structures (e.g., the ability to memorize extremely long contexts compared to humans) or adapting to the stream-based input/output patterns of current autoregressive token generators.
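The "reuse human tools" idea above can be sketched minimally (hypothetical wrapper, no real LLM attached): expose an existing command-line program to a model as a tool by running it in a shell and handing back its text output.

```python
import subprocess

def run_bash(command: str, timeout: int = 10) -> str:
    """Run a shell command and return combined stdout/stderr as text."""
    result = subprocess.run(
        ["bash", "-c", command],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout + result.stderr

# A model that can emit shell commands immediately inherits decades of
# human-built tooling: grep, git, compilers, package managers, etc.
print(run_bash("echo 42"))
```

The appeal of this approach is exactly the reuse argument: the tool surface already exists, is well documented, and was battle-tested by human users.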

Eventually, I believe machine intelligence will build its own tools on these foundations, likely a milestone similar to when humans first began using tools.


Yes, agentic-wise, Claude Opus is the best. For complex coding it's GPT-5.x. But for raw smartness, I've always felt Gemini 3 Pro is the best.


Can you give an example of smartness where Gemini is better than the other two? I have found Gemini 3 Pro the opposite of smart on the tasks I gave it (evaluation, extraction, copywriting, judging, synthesising), with GPT 5.2 xhigh first and Opus 4.5/4.6 second. Not to mention it likes to hallucinate quite a bit.


I use it for classic engineering a lot, and it beats ChatGPT and Opus (though I haven't tried it as much against Opus as against ChatGPT). Flash is also way stronger than it should be.


OpenAI has a former NSA director on its board. [1] This connection makes the dilution of the term "PRISM" in search results a potential benefit to NSA interests.

[1]: https://openai.com/index/openai-appoints-retired-us-army-gen...


Rust's `serde_json` recently switched to a new library for floating-point-to-string conversion: https://github.com/dtolnay/zmij.


I was impressed how fast the Rust folks adopted this! Kudos to David Tolnay and others.


A market maker needs a premium to provide liquidity. If all else is equal, why would they take on execution time risk? This is a universal feature of continuous-trading Central Limit Order Books (CLOBs), not something unique to prediction markets.
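A toy illustration of that execution-time risk (hypothetical numbers): the maker's quote rests in the book while the fair value moves, and takers tend to hit it precisely when the move makes doing so profitable for them (adverse selection).

```python
fair = 100.00
half_spread = 0.05
bid, ask = fair - half_spread, fair + half_spread  # resting quotes

# News moves the fair value down before the maker can cancel.
new_fair = 99.80

# A taker sells into the stale bid: the maker buys at 99.95 an asset
# now worth 99.80, losing 0.15 on the fill.
maker_pnl = new_fair - bid
print(f"filled at {bid:.2f}, marked at {new_fair:.2f}, PnL {maker_pnl:+.2f}")
```

The half-spread is the premium the maker charges, on average across all fills, to carry exactly this risk; that logic applies to any continuous-trading CLOB, prediction market or not.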

