Just want to note that this simple “mimicry” of mistakes seen in the training text can be mitigated to some degree by reinforcement learning (e.g. RLHF), such that the LLM is tuned toward giving responses that are “good” (helpful, honest, harmless, etc.) according to some reward function.
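To make that concrete, here's a rough sketch of the idea in Python. This is plain REINFORCE rather than the PPO typically used in practice, and "gpt2" plus toy_reward() are stand-ins I picked for illustration; a real RLHF pipeline would use a learned reward model and a KL penalty against the base model (e.g. via a library like trl):

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    policy = GPT2LMHeadModel.from_pretrained("gpt2")
    optimizer = torch.optim.Adam(policy.parameters(), lr=1e-5)

    def toy_reward(text: str) -> float:
        # Stand-in for a learned reward model that scores responses
        # for helpfulness/honesty/harmlessness.
        return 1.0 if "4" in text else -1.0

    prompt = "Q: What is 2 + 2?\nA:"
    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]

    # Sample a response from the current policy.
    with torch.no_grad():
        out = policy.generate(**inputs, max_new_tokens=20, do_sample=True,
                              pad_token_id=tokenizer.eos_token_id)
    response = tokenizer.decode(out[0][prompt_len:])
    reward = toy_reward(response)

    # REINFORCE: scale the log-likelihood of the sampled response
    # tokens by the reward, then take a gradient step on it.
    logits = policy(out).logits[0, :-1]               # next-token predictions
    log_probs = torch.log_softmax(logits, dim=-1)
    token_lp = log_probs[torch.arange(out.shape[1] - 1), out[0][1:]]
    loss = -reward * token_lp[prompt_len - 1:].sum()  # response tokens only

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

The key point is the last few lines: sampled responses that score well under the reward function get their log-likelihood pushed up, so over many updates the model drifts away from merely imitating whatever (possibly mistaken) text it saw in pretraining.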

