Great question! Let me rephrase so you can confirm my understanding: I have some labeling functions (LFs) that are far more accurate on a majority subset of the data than on one or more minority subsets or "slices" of the data... and these subsets are not necessarily correlated with the class labels, so this isn't a traditional class imbalance problem...
We've actually done some recent work on this (https://papers.nips.cc/paper/9137-slice-based-learning-a-pro...) where users approximately define these critical "slices" so that the model being trained can pay special attention to them (via extra slice-specific representation layers) and they don't get drowned out by the majority subsets/slices. But definitely a lot more to do in this area!
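To make the idea concrete, here's a minimal plain-Python sketch (not the actual Snorkel API, and using loss upweighting rather than the extra representation layers from the paper): a "slice" is just a heuristic membership function over examples, and examples in a critical slice get extra weight during training. The `short_text_slice` heuristic and the `boost` parameter are hypothetical, for illustration only.

```python
def short_text_slice(example):
    """Hypothetical slice: very short texts, where the LFs tend to be weak."""
    return len(example["text"].split()) < 3

def slice_weights(examples, slice_fns, boost=3.0):
    """Per-example loss weights: boosted if the example falls in any slice."""
    weights = []
    for ex in examples:
        in_slice = any(fn(ex) for fn in slice_fns)
        weights.append(boost if in_slice else 1.0)
    return weights

examples = [
    {"text": "ok"},                        # in the short-text slice
    {"text": "this is a longer example"},  # not in any slice
]
print(slice_weights(examples, [short_text_slice]))  # [3.0, 1.0]
```

In practice you'd feed these weights into your training loss (or, as in the paper, route slice members through dedicated representation layers), so the minority slices aren't averaged away by the majority of the data.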