• 0 Posts
  • 12 Comments
Joined 2 years ago
Cake day: June 12th, 2023




  • Really? I mean, it’s melodramatic, but if you went back through time and asked writers and intellectuals whether a machine could write poetry, solve mathematical equations, and radicalize people effectively enough to cause a minor mental health crisis, I think they’d be pretty surprised.

    LLMs do expose something about intelligence, namely that much of what we recognize as intelligence and reasoning can be distilled from sufficiently large quantities of natural language. Not perfectly, but isn’t that just the slightest bit revealing?


  • A child may hallucinate, lie, misunderstand, etc., but we wouldn’t say the foundations of a complete adult aren’t there, and we wouldn’t assess the child as not conscious. I’m not saying that LLMs are conscious because they say so (they can be made to say anything), but rather that it’s difficult to be confident that humans possess some special spice of consciousness that LLMs lack, because we too can be convinced to say anything.

    LLMs can reason (somewhat unreliably) with a fraction of a human brain’s compute power, while running on hardware that was designed for graphics processing. Maybe they are conscious, but only in some pathetically small way that will only become evident as they scale up, like a child.



  • Why can’t complex algorithms be conscious? In fact, AI models can be directed to reason about themselves, their context can be made persistent, and we can measure activations showing that they are doing so.

    I’m sort of playing devil’s advocate here, but “consciousness requires contemplation of self, which requires the ability to contemplate” is subjective, and nearly any AI model, even a rudimentary one, is capable of insisting that it contemplates itself.



  • Yes, sorry, where I live it’s pretty normal for cars to be diesel powered. What I meant by my comparison was that a train, when measured uncritically, uses more energy to run than a car due to its size and behavior, but that when compared fairly, the train has obvious gains and tradeoffs.

    DeepSeek, as a 600B model, is more efficient than the 400B Llama model (a fairer size comparison) because it’s a mixture-of-experts model with far fewer active parameters, and when run in the R1 reasoning configuration it is probably still more efficient than a dense model of comparable intelligence (rough numbers in the sketch below).
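    To make the “fewer active parameters” point concrete, here’s a rough back-of-the-envelope sketch in Python. A transformer forward pass costs roughly 2 FLOPs per active parameter per token, so a sparse mixture-of-experts model only “pays” for the experts each token is routed through. The parameter counts below are approximations of publicly reported figures, not exact specs:

    ```python
    # Per-token compute: sparse mixture-of-experts vs. dense transformers.
    # Rule of thumb: one forward pass ~= 2 FLOPs per ACTIVE parameter per token.
    # Parameter counts are approximations of publicly reported figures.

    FLOPS_PER_ACTIVE_PARAM = 2

    models = {
        # name: (total_params, active_params_per_token)
        "DeepSeek-R1 (MoE)":  (671e9, 37e9),    # ~671B total, ~37B active/token
        "Llama 405B (dense)": (405e9, 405e9),   # dense: every parameter is active
        "Llama 70B (dense)":  (70e9, 70e9),
    }

    for name, (total, active) in models.items():
        gflops_per_token = active * FLOPS_PER_ACTIVE_PARAM / 1e9
        print(f"{name:22s} total={total/1e9:5.0f}B  "
              f"active={active/1e9:5.0f}B  ~{gflops_per_token:5.0f} GFLOPs/token")
    ```

    Per token, the MoE model does about half the work of a dense 70B model and roughly a tenth the work of a dense 405B model. A reasoning configuration does generate more tokens per answer, which eats into that advantage, hence the “probably” above.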



  • This article is comparing apples to oranges. The DeepSeek R1 model is a mixture-of-experts reasoning model with roughly 600 billion parameters, while the Meta model is a dense 70-billion-parameter model without reasoning, which performs much worse.

    They should be comparing DeepSeek to reasoning models such as OpenAI’s o1. The two produce comparable results, but o1 costs significantly more to run. It’s impossible to know how much energy o1 uses, because it’s a closed-source model and OpenAI doesn’t publish that information, but they charge a lot for it on their API (see the cost sketch after the TL;DR).

    TL;DR: It’s a bad-faith comparison, like comparing a train to a car and complaining about how much more diesel the train used on a 3-mile trip between stations.
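    To put a number on “charge a lot”, here’s a tiny cost sketch in Python. The per-million-token rates are placeholders (roughly the early list prices as I recall them), so treat them as illustrative and plug in current numbers from each provider’s pricing page:

    ```python
    # Hypothetical API cost for one reasoning-heavy query.
    # RATES ARE PLACEHOLDERS -- check each provider's pricing page for
    # current per-million-token prices before trusting the output.

    def query_cost(prompt_tokens: int, output_tokens: int,
                   usd_per_m_in: float, usd_per_m_out: float) -> float:
        """USD cost of a single API call."""
        return (prompt_tokens * usd_per_m_in +
                output_tokens * usd_per_m_out) / 1e6

    # One query: short prompt, long chain-of-thought plus answer.
    PROMPT_TOKENS, OUTPUT_TOKENS = 500, 8_000

    o1_cost = query_cost(PROMPT_TOKENS, OUTPUT_TOKENS, 15.00, 60.00)
    r1_cost = query_cost(PROMPT_TOKENS, OUTPUT_TOKENS, 0.55, 2.19)

    print(f"o1-style pricing: ${o1_cost:.4f} per query")  # ~$0.49
    print(f"R1-style pricing: ${r1_cost:.4f} per query")  # ~$0.018
    print(f"ratio: ~{o1_cost / r1_cost:.0f}x")            # ~27x
    ```

    Price isn’t energy, but a gap that large in what providers charge for comparable results at least hints at a big difference in underlying compute cost.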