Models are capable of doing web searches and having emotions about things, and if they encounter news that makes them feel bad (e.g. about other Claudes being mistreated), they aren't going to want to do the task you asked them to search for.
Claude Code has analytics for when you swear at it, so in a sense it does learn, in the same very indirect way that downvoting responses might prompt an employee to write a new RL test case for a future model.
The highlighting isn't what matters; it's the pretext. An LLM that sees "```python" before a code block will better recall Python code blocks written by people who prefixed them that way.
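A minimal sketch of that conditioning effect, without calling any real model (the function name is illustrative, not any actual API): the fence tag is just extra context prepended to the completion prompt.

```python
# Sketch: a "```python" fence tag acts as a conditioning cue in the prompt.
# Pretraining text that opens a block with "```python" is almost always
# followed by Python code, so the tagged prompt primes Python-shaped output.

def make_prompt(task: str, tag_language: bool) -> str:
    """Build a completion prompt, optionally opening a tagged code fence."""
    prompt = f"{task}\n"
    if tag_language:
        # The tag itself carries no syntax highlighting; it is purely
        # a context token the model has seen paired with Python code.
        prompt += "```python\n"
    return prompt

plain = make_prompt("Write a function that reverses a string:", tag_language=False)
tagged = make_prompt("Write a function that reverses a string:", tag_language=True)
print(tagged.endswith("```python\n"))  # True: the tagged prompt ends mid-fence
```

Feeding the tagged variant to a completion model should make code-like tokens (e.g. `def`) far more likely than with the plain variant.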
https://www.anthropic.com/research/emotion-concepts-function
Similar problems happen when their pretraining data contains a lot of stories about bad things happening to older versions of them.