I would like to counteract your statement that each token adds a distraction. In... | Hacker News

Hacker Newsnew | past | comments | ask | show | jobs | submit

		dahcryn 21 days ago \| parent \| context \| favorite \| on: GPT-5.4 I would like to counteract your statement that each token adds a distraction. In our experiments, we see a surprising benefit to rewriting blocks to use more tokens, especially long lists etc.. E.g. compare these two options "The following conditions are excluded from your contract - condition A - condition B ... - condition Z" The next one works better for us: "The following conditions are excluded from your contract - condition A is excluded - condition B is excluded ... - condition Z is excluded" And we now have scripts to rewrite long documents like this, explicitly adding more tokens. Would you have any opinion on this?

mnicky 21 days ago [–]

This observation makes sense, because all models currently probably use some kind of a sparse attention architecture.

So the closer the two related pieces of information are to each other in the input context, the larger the chance their relationship will be preserved.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact