@yes_this_time

yes_this_time@lemmy.world · 1 day ago

Yeah, after reading a bit into it, it seems like most of the work is up front: pre-filtering and classifying the data before it hits the model. To your point, the model training part is expensive…

Broadly, though, I think the idea that they are just throwing the kitchen sink into the models without any consideration of source quality isn’t true.

yes_this_time@lemmy.world · 1 day ago

Good points. What’s novel information vs. wrong information? (And subtly wrong is harder to understand than very wrong)

At some point it’s hitting a user who is giving feedback, but I imagine that once it gets to the end user, the data lineage is tricky to understand.

yes_this_time@lemmy.world · 1 day ago

If I’m creating a corpus for an LLM to consume, I feel like I would probably create some data source quality score and drop anything that makes my model worse.
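
Rough sketch of what I mean, in Python. The source names, the scores, and the 0.5 cutoff are all made up for illustration, and `filter_corpus` is a hypothetical helper, not anything a real lab is known to use. How you would actually compute the score (for example, by comparing eval results with and without a source) is the hard part and isn't shown here.

```python
from dataclasses import dataclass


@dataclass
class Document:
    source: str  # where the text came from
    text: str


def filter_corpus(docs, source_scores, min_score=0.5):
    """Keep documents whose source meets the minimum quality score.

    Sources with no score at all are treated as untrusted and dropped.
    """
    return [d for d in docs if source_scores.get(d.source, 0.0) >= min_score]


# Hypothetical per-source quality scores in [0, 1], invented for this example.
source_scores = {"encyclopedia": 0.9, "forum": 0.6, "spam_farm": 0.1}

docs = [
    Document("encyclopedia", "Water boils at 100 C at sea level."),
    Document("spam_farm", "Buy cheap pills now!!!"),
    Document("forum", "I think the cutoff depends on the use case."),
    Document("unknown_blog", "No score recorded for this source."),
]

kept = filter_corpus(docs, source_scores)
kept_sources = [d.source for d in kept]
```

With these made-up numbers, only the encyclopedia and forum documents survive; the low-scoring source and the unscored one are dropped.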

yes_this_time@lemmy.world · 2 months ago

Ahh spoiler, just started reading this

yes_this_time@lemmy.world · 3 months ago

It makes me a bit sad that there is a whole article on a (very likely) mirage

yes_this_time@lemmy.world · 3 months ago

500 million was specific to Claude Code; they are at a 5 billion annual run rate and growing.

yes_this_time@lemmy.world · 4 months ago

I mean, I agree that a lot of money was spent training some of these models, and I personally wouldn’t invest in an AI-based company. The economics don’t make sense.

However, worst case, self-hosted open source models have gotten pretty good, and I find it unlikely that progress will simply stop. Diminishing returns from scaling data, yes, but there will still be optimizations all through the pipeline.

That is to say, LLMs will continue to have utility regardless of whether OpenAI and Anthropic are around long term.

yes_this_time@lemmy.world · 4 months ago

Some people are finding value in LLMs; that doesn’t mean LLMs are great at everything.

Some people have work to do, and this is a tool that helps them do their work.

yes_this_time@lemmy.world · 4 months ago

The ‘cloud’ was a pretty big thing though… everyone used to self-host, now only some self-host.

AWS, GCP, Azure make a lot of money

yes_this_time@lemmy.world · 5 months ago

Yeah exactly!

yes_this_time@lemmy.world · 5 months ago

Could this be attributed to the driver mix changing?

It’s quite possible Tesla drivers are worse in 2025 than in 2024.

yes_this_time@lemmy.world · 9 months ago

Threats to other countries aren’t noise… It doesn’t seem appropriate to hand-wave away threats to other people.