


  • I agree. I almost skipped it because of the title, but the article is nuanced and has some very good reflections on topics other than AI. Every technological advance is a tradeoff. The article mentions cars used to get to the grocery store and how there are advantages in walking that we give up when we always drive. Are cars in general a stupid and useless technology? No, but we need to be aware of where the tradeoffs are. And ultimately most of these tradeoffs are economic in nature.

    By industrializing the production of carpets we might have lost some of our collective ability to produce those hand-made masterpieces of old, but we get to buy ok-looking carpets for cheap.

    By reducing and industrializing the production of text content, our mastery of language is declining, but we get to read a lot of not-very-good content for free. This predates AI btw, as standardized test results in schools everywhere show.

    The new thing about GenAI, though, is that it upends the promise that technology would do the grueling, boring work for us and free up time for the creative things that give us joy. I feel the roles have reversed: even when I have to write an email or a piece of code, AI does the creative part and I’m the glorified proofreader and corrector.




  • Machine Learning models have existed for a long time. At their core they are predictors: you feed them data, you carefully tune the model’s parameters for a long time, and eventually you get a model that can make predictions in a specific domain. That way you can have a model trained specifically to identify patterns that look like cancer in medical imaging, or another one (like in your example) to predict a protein’s structure.
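
    To make the “predictor” idea concrete, here is a minimal, hypothetical sketch (a toy classifier, nothing to do with the cancer-imaging or protein models mentioned above):

    ```python
    # Toy sketch: "training" means fitting parameters to labelled data,
    # and the result is a predictor for that one domain only.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression

    X, y = load_breast_cancer(return_X_y=True)               # labelled examples from one domain
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = LogisticRegression(max_iter=5000)                 # parameters get fitted below
    model.fit(X_train, y_train)                               # the "training" step

    print(model.predict(X_test[:5]))                          # predictions on new data
    print(model.score(X_test, y_test))                        # accuracy on held-out data
    ```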

    LLMs are ML models too, but they are trained on language: they learn to identify patterns in human language and to generate long stretches of text that follow those patterns. They also accept input in natural language.

    The hype consists in slapping a new “AI” marketing label onto all of Machine Learning, conflating LLMs with other types of models, and creating the delusion that predicting a protein’s structure was achieved by people at Google casually throwing prompts at Gemini.

    And since these LLMs are exceptionally power-hungry and extremely expensive (it turns out that predicting human language from a whole internet’s worth of training data requires incredibly complex models), the hype is there to gather the trillions of investment they need. GenAI is not the whole of Machine Learning, and saying “Copilot is not worth the cost of the energy needed to power it” doesn’t mean putting obstacles in the way of ML used for cancer research.






  • Basically, model collapse happens when the training data no longer matches real-world data

    I’m more concerned about LLMs collapsing the whole idea of “real-world”.

    I’m not a machine learning expert, but I do get the basic concept of training a model and then evaluating its output against real data. The whole thing rests on the idea that you have a model trained on relatively small samples of the real world, plus a big, clearly distinct “real world” against which to check the model’s performance.
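
    As a toy sketch of that assumption (a made-up regression example, nothing to do with LLMs): the check only means something if the held-out data is genuinely independent, real data. Score the model against its own output and you get a perfect, meaningless number.

    ```python
    # Toy illustration: evaluation on real held-out data vs. on the model's own output.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)

    # "Real world": y depends on x in a way a straight line can't fully capture.
    x = rng.uniform(-3, 3, size=(2000, 1))
    y = np.sin(x[:, 0]) + rng.normal(scale=0.1, size=2000)

    model = LinearRegression().fit(x[:1000], y[:1000])

    # Honest check against held-out real data: a mediocre score, the model's limits show.
    print("score on real data:      ", model.score(x[1000:], y[1000:]))

    # "Contaminated" check against the model's own output: looks perfect, means nothing.
    y_fake = model.predict(x[1000:])
    print("score on model's output: ", model.score(x[1000:], y_fake))
    ```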

    If LLMs have already ingested basically all the information in the “real world”, and their output is so pervasive that you can’t easily tell what’s true and what’s AI-generated slop, then “how do we train our models now” is not my main concern.

    As an example, take the judges who found made-up cases cited in filings because the lawyers had used an LLM. What happens if those made-up cases end up referenced in several other places, including legal textbooks used in law schools? Don’t they become part of the “real world”?


  • I tried reading the paper; there is a free preprint version on arXiv. This page (from the article linked by OP) also links to the code they used and, at the end, to the data they tried compressing.

    While most of the theory is above my head, the basic intuition is that compression improves if you have some level of “understanding”, or higher-level context, of the data you are compressing. And LLMs are generally better at that than conventional compression algorithms.

    As an example, if you recognize a sequence of letters as the first chapter of the book Moby-Dick, you’ll probably transmit that information more efficiently than a compression algorithm would. “The first chapter of Moby-Dick”; there … I just did it.
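
    A rough, hypothetical sketch of that intuition (not the paper’s method; the file name is just a stand-in for a text both sides already know):

    ```python
    # Generic byte-level compression vs. a short reference to shared knowledge.
    import zlib
    from pathlib import Path

    chapter = Path("moby_dick_chapter1.txt").read_bytes()    # hypothetical local copy of the chapter

    compressed = zlib.compress(chapter, level=9)              # no "understanding", just byte statistics
    reference = b"The first chapter of Moby-Dick"             # works only if the receiver knows the book

    print(f"original:  {len(chapter)} bytes")
    print(f"zlib:      {len(compressed)} bytes")
    print(f"reference: {len(reference)} bytes")
    ```

    The catch, of course, is that the reference only “decompresses” correctly because sender and receiver share the same knowledge of the world, which is roughly the role the LLM plays in the approach described above.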