
The Snorkel paper doesn't cover this in depth; the math is all in this paper:

https://arxiv.org/abs/1703.00854

I can't say I followed all the proofs, but it seems that, under certain limited assumptions about the labelling functions, they prove their generative model can do well.

When I first read about Snorkel it sounded like magic in the bad way, but this paper does make it clear that if your labelling functions are garbage or have certain kinds of problems, there's nothing the model can do about it.

Even leaving aside the generative model, I think the focus on function-based data bootstrapping is great, which is why I've been following Snorkel's projects for a while.
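
For anyone who hasn't seen the idea, here's a rough sketch of what I mean by function-based bootstrapping. This is plain Python, not Snorkel's API, and the three labelling functions and the spam/ham task are made up for illustration. I'm also using a simple majority vote to combine the votes, which is only a stand-in for the generative model described in the paper.

```python
from collections import Counter

# Hypothetical label values for a toy spam task (not from Snorkel).
ABSTAIN, HAM, SPAM = -1, 0, 1

# Each labelling function is a cheap heuristic that votes or abstains.
def lf_contains_link(text):
    return SPAM if "http" in text.lower() else ABSTAIN

def lf_mentions_prize(text):
    return SPAM if "prize" in text.lower() else ABSTAIN

def lf_short_message(text):
    return HAM if len(text.split()) <= 4 else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_link, lf_mentions_prize, lf_short_message]

def majority_vote(text, lfs=LABELING_FUNCTIONS):
    """Combine labelling-function votes; abstain on no votes or a tie."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return counts[0][0]
```

Majority vote treats every function as equally trustworthy. As far as I understand it, the point of the generative model is to estimate how reliable each function is without ground-truth labels, and that's exactly where bad labelling functions hurt.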


