

Good points. What’s novel information vs. wrong information? (And subtly wrong is harder to spot than very wrong.)
At some point it’s hitting a user who is giving feedback, but I imagine data lineage is tricky to understand once it gets to the end user.

Yeah, after reading a bit into it, it seems like most of the work is up front: pre-filtering and classifying the data before it hits the model. To your point, the model training part is expensive…
Broadly, though, I think the idea that they are just throwing the kitchen sink into the models without any consideration of source quality isn’t true.