In Energy and Policy Considerations for Deep Learning in NLP, three UMass Amherst computer science researchers investigate the carbon footprint of training machine learning models for natural language processing, and come back with the eye-popping headline figure of 78,468 lbs of CO2 equivalent to do a basic training-and-refinement operation.
This is well over half the lifetime, cradle-to-grave carbon budget for a car, including manufacture.
The bulk of the carbon is expended at the tuning stage, which involves a lot of trial and error. More complex models, like the Transformer model (employed in machine translation), use even more carbon when tuned with neural architecture search: 626,155 lbs, or about five times a car's lifetime budget.
Text and language processing are by no means the most compute-intensive (and hence carbon-intensive) forms of machine learning; things like vision systems are even more complex.
One implication the authors explore: the computational intensity of today's machine learning research has priced it outside the realm of most academic researchers, moving the most important work in the field to private firms whose research doesn't necessarily contribute to our collective store of knowledge.
What’s more, the researchers note that the figures should only be considered as baselines. “Training a single model is the minimum amount of work you can do,” says Emma Strubell, a PhD candidate at the University of Massachusetts, Amherst, and the lead author of the paper. In practice, it’s much more likely that AI researchers would develop a new model from scratch or adapt an existing model to a new data set, either of which can require many more rounds of training and tuning.
To get a better handle on what the full development pipeline might look like in terms of carbon footprint, Strubell and her colleagues used a model they’d produced in a previous paper as a case study. They found that the process of building and testing a final paper-worthy model required training 4,789 models over a six-month period. Converted to CO2 equivalent, it emitted more than 78,000 pounds and is likely representative of typical work in the field.
The significance of those figures is colossal—especially when considering the current trends in AI research. “In general, much of the latest research in AI neglects efficiency, as very large neural networks have been found to be useful for a variety of tasks, and companies and institutions that have abundant access to computational resources can leverage this to obtain a competitive advantage,” [Carlos] Gómez-Rodríguez [a computer scientist at the University of A Coruña] says. “This kind of analysis needed to be done to raise awareness about the resources being spent [...] and will spark a debate.”
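For readers wondering how a training run gets "converted to CO2 equivalent," here is a minimal Python sketch of the arithmetic. It is not the authors' code. The function names and the sample hardware wattages are hypothetical, and the two constants (a datacenter overhead factor of 1.58 and 0.954 lbs of CO2 per kWh for the average US grid) are assumptions based on the method as I understand it from the paper; check them against the paper before relying on them.

```python
# Rough sketch: estimate CO2-equivalent emissions for a training run.
# ASSUMPTIONS (not stated in this post): PUE of 1.58 and 0.954 lbs CO2e
# per kWh. All wattages used below are made-up illustrative values.

ASSUMED_PUE = 1.58             # datacenter overhead multiplier (assumption)
ASSUMED_LBS_PER_KWH = 0.954    # average US grid emissions factor (assumption)


def energy_kwh(hours, cpu_watts, dram_watts, gpu_watts, num_gpus,
               pue=ASSUMED_PUE):
    """Total energy in kWh: average power draw times hours, scaled by PUE."""
    total_watts = cpu_watts + dram_watts + num_gpus * gpu_watts
    return pue * hours * total_watts / 1000.0


def co2e_lbs(kwh, lbs_per_kwh=ASSUMED_LBS_PER_KWH):
    """Convert energy in kWh to pounds of CO2 equivalent."""
    return kwh * lbs_per_kwh


if __name__ == "__main__":
    # Hypothetical job: 10 hours on 2 GPUs at 250 W each,
    # plus 100 W of CPU and 50 W of DRAM.
    kwh = energy_kwh(hours=10, cpu_watts=100, dram_watts=50,
                     gpu_watts=250, num_gpus=2)
    print(f"energy: {kwh:.2f} kWh, emissions: {co2e_lbs(kwh):.2f} lbs CO2e")

    # Ratio of the two headline figures quoted in this post.
    print(f"ratio: {626_155 / 78_468:.2f}")
```

The point of the sketch is that the headline numbers are driven by three things: how long the hardware runs, how much power it draws, and how dirty the grid is. Multiply thousands of tuning runs by the first factor and the totals climb quickly.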
Energy and Policy Considerations for Deep Learning in NLP [Emma Strubell, Ananya Ganesh and Andrew McCallum/57th Annual Meeting of the Association for Computational Linguistics (ACL)]
Training a single AI model can emit as much carbon as five cars in their lifetimes [Karen Hao/MIT Technology Review]