The problem with that is that taking away even specific parts of the dataset can have a large impact on performance as a whole… Like when they removed NSFW content from an image generator's dataset and suddenly it sucked at drawing bodies in general.
Ah yes, the good old WebP format, which, checks notes, just got assigned a CVE with a score of 10 (out of 10). (That's in the implementation that is used in nearly everything, anyway.)