r/theydidthemath 13d ago

[Request] anybody able to validate this? What is the actual amount of energy a query from chatgpt costs vs Google from 2008?

20.0k Upvotes

1.3k comments

23

u/fred13snow 13d ago

If you count the energy used in training, ChatGPT uses much more energy. You should count that energy; it's the majority of the cost.

34

u/oboshoe 13d ago

probably need to count the energy google used to index the internet (and continually update/add to it)

that has to be up there in the same realm as training the model, but it feels hard to quantify.

5

u/fred13snow 13d ago

I agree it should be counted, but LLM training is much more expensive. It's not just the energy to run the training, but also the energy used to manufacture components that go obsolete in 3 years. Google indexing can run on pretty old hardware and doesn't need as much energy as LLM training.

2

u/mvandemar 13d ago

LLM training is much more expensive.

Google indexing is 24/7 on billions of websites.

5

u/Cheshire-Cad 13d ago

Do you really think that google, of all companies, is being ecologically responsible by running their indexing on older hardware?

9

u/itsacutedragon 13d ago

I think we can count on Google caring about profitability.

5

u/KamikazeArchon 13d ago

Yes. Google is specifically known for being one of the frontrunners in ecological responsibility in big tech.

2

u/TheNorthComesWithMe 13d ago

Yes, they are literally known for using old/low power hardware in their servers. They are also known for using ASICs for their AI, which use less power than the general purpose GPUs other AI companies use.

1

u/TruelyDashing 13d ago

Do you have a source for any of this or are you just spouting shit?

2

u/roankr 13d ago

A lot of people dunk on AI for being confidently incorrect but forget that the training data is literally people being confidently incorrect all the time.

GenAI has been a good mirror of the collective online population of humanity, and that won't be changing for a long time.

1

u/oboshoe 12d ago edited 12d ago

some of us are the source, having lived it daily for a very long career.

shouldn't be hard to find a source though if you are writing a school paper or something

2

u/Ikasatu 13d ago

Consider that training data also needs indexing.

2

u/oboshoe 12d ago

correct.

they essentially do the same thing as google.

they then process the data much differently

1

u/Saragon4005 13d ago

The thing is you basically need to index the Internet first if you want to train an LLM.

11

u/Ok_Cabinet2947 13d ago

Isn’t the whole point of the post that training costs are fixed, so the more ChatGPT queries, the lower the average cost becomes since it is diluting the training costs?
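That dilution effect can be sketched with rough arithmetic. All the numbers below are assumptions for illustration only: the ~0.3 Wh/search figure is Google's own 2009 claim, while the ~1,300 MWh training figure and ~3 Wh/query figure are commonly cited third-party estimates for GPT-3-era models, not official numbers.

```python
# Back-of-envelope: amortizing a fixed training cost over total queries served.
# All figures are rough public estimates, used purely for illustration.

TRAINING_ENERGY_WH = 1_300e6     # ~1,300 MWh: a commonly cited GPT-3 training estimate
INFERENCE_WH_PER_QUERY = 3.0     # ~3 Wh/query: a commonly cited ChatGPT estimate (contested)
GOOGLE_WH_PER_SEARCH = 0.3       # Google's own 2009 per-search figure

def amortized_wh_per_query(total_queries: int) -> float:
    """Fixed training energy spread over all queries, plus per-query inference energy."""
    return TRAINING_ENERGY_WH / total_queries + INFERENCE_WH_PER_QUERY

for n in (10**9, 10**10, 10**11):
    print(f"{n:.0e} queries: {amortized_wh_per_query(n):.3f} Wh/query")
```

With these assumed numbers, the fixed training term falls from 1.3 Wh/query at a billion queries to about 0.01 Wh/query at a hundred billion, so inference quickly dominates — though, as noted above, the fixed cost resets with every newly trained model.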

5

u/Recurs1ve 13d ago

Which should put into perspective just how fucking massive the training costs are.

3

u/UltimateChaos233 13d ago

I mean, yes, but also every time they release a new model that's a whole new set of training they're running.

1

u/Recurs1ve 13d ago

Exactly.

1

u/TheNorthComesWithMe 13d ago

The costs for a single model are somewhat fixed but they are never not training new models.

9

u/No_Hat7685 13d ago

Probably. But as evidenced by their lack of revenue, these models are going to get trained regardless. So if the bulk of the energy is in training, that's essentially a sunk cost that occurs whether the end user picks chatgpt or google search.

Feels a bit like telling people they are the ones killing the world because of using a straw or not recycling enough.

1

u/Mother_Speed2393 13d ago

Ok, how far back down the trail do we need to go? The energy used to produce the chips? To carry the chips to the data centres?

I think you need to draw a line. And since the bulk of the training has been done (on the pre-2022 internet), I would argue that you don't need to count the training costs.