This type of data is actually better than independent human text (specifically for training the LLM that originally produced the output)
GPT-4 is trained with PPO+RLHF. Web text produced by the LLM and then fed back in will be closer to the model's original token distribution than independent human text is.
In other words, by selectively publishing LLM output you're effectively performing the same action as clicking the thumbs up/thumbs down button in the ChatGPT web UI.
I agree with OpenAI that this will not be a problem, since you would need a process to gauge the quality of the data anyway, even for human text.
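The "selective publishing as implicit feedback" point can be illustrated with a toy sketch. This is not OpenAI's actual pipeline; the quality scores and threshold here are hypothetical stand-ins for human judgment about which outputs are worth posting:

```python
import random

random.seed(0)

def generate(n):
    # Stand-in for an LLM emitting outputs with quality scores in [0, 1].
    return [random.random() for _ in range(n)]

def selectively_publish(samples, threshold=0.7):
    # People tend to post only outputs they judge good enough --
    # an implicit preference signal, like a thumbs-up.
    return [s for s in samples if s >= threshold]

samples = generate(10_000)
published = selectively_publish(samples)

# The published subset has a higher mean quality than the raw output
# distribution, so training on it upweights preferred completions,
# loosely analogous to RLHF reward signals.
raw_mean = sum(samples) / len(samples)
published_mean = sum(published) / len(published)
print(raw_mean < published_mean)
```

The filtering step is doing the same job as a preference label: it biases the training pool toward outputs humans approved of.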
I've seen this referred to as "AI drift" before, and I see it being a long-term problem for training data sets. As always, the quality of the data set determines the quality of the results. Models and data sets will be the arms race in the LLM world for a while to come, I think.
LLMs still generate content containing misinformation, which ends up on the web; future models can then learn and propagate that misinformation in their responses. That's one way this feedback loop can affect their output.