By Dan Blagojevic, PhD, Chief Data Scientist; Roberto Wolf, Principal of AI Innovation; and Chris Marshall, Lead Data Architect
The environmental footprint of generative AI is becoming hard to ignore. Take Scotland as an example. Data centres there have increased their water use by 300 percent since 2021. That is the equivalent of 27 million bottles of drinking water every year, most of it used to stop servers overheating. AI adoption is accelerating, and while the major technology providers are working on greener, more efficient data centres, those improvements will not fix the near-term impact. Organisations want the value of Gen AI now, and they want to adopt it responsibly. The encouraging truth is that many of the most effective levers are in their own hands.
This became clear in a recent conversation between Dan Blagojevic, Chief Data Scientist at Optima Partners, and colleagues Roberto Wolf, Principal of AI Innovation, and Chris Marshall, Lead Data Architect. They shared a simple but important message. The environmental impact of AI is not just determined by powerful servers or hyperscale infrastructure. It is shaped by the everyday choices made inside organisations. How people write prompts. How they store and retrieve information. Which models they select. How long they allow conversations to run. These decisions, repeated thousands of times across a business, determine whether AI becomes a streamlined asset or an unnecessarily resource-hungry one.
Cloud computing is a good starting point. We often talk about the cloud as if it were floating above us, weightless and endless. In reality, it sits in thousands of buildings filled with machines that consume substantial power and water. The cloud market is heading toward an estimated two trillion dollars by 2030, and yet many organisations are unknowingly wasting compute resources. Chris sees it regularly. Oversized compute services running at 10 percent utilisation. Databases scanning entire tables when only a few fields are relevant. These are not dramatic failures. They are habits. Fixing them is one of the quickest ways to reduce operating cost and environmental impact in one step.
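As a concrete illustration of the table-scanning habit, the sketch below contrasts a query that pulls every column with one that asks only for the fields the task needs. The table and column names are invented for the example; the point is the pattern, not the schema.

```python
# Minimal sketch with an in-memory SQLite table. The "orders" table and its
# columns are made up for illustration. The wasteful and targeted queries
# differ only in how much data they ask the engine to read and return.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, notes TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(i, "open" if i % 2 else "closed", "note", "x" * 1000) for i in range(1000)],
)

# Wasteful habit: scan and return every column, including the large payload.
all_rows = conn.execute("SELECT * FROM orders").fetchall()

# Leaner habit: request only the fields the task actually needs.
open_ids = conn.execute("SELECT id FROM orders WHERE status = 'open'").fetchall()

print(len(all_rows), len(open_ids))
```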
Generative AI introduces the same pattern, but the stakes are higher. When someone submits a prompt, it is easy to forget the amount of work the model is doing behind the scenes. It breaks the message into small “tokens” and passes them through hundreds of billions of calculations to generate each word of the response. Roberto puts it simply. The more tokens you use, the more energy the model consumes. And the relationship is not linear. Small increases in token count can lead to disproportionately higher compute. This is why short, clear prompts are not just easier to read. They are genuinely more sustainable.
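To make the token point tangible, here is a small sketch using OpenAI's open-source tiktoken tokenizer. The prompts and the choice of the cl100k_base encoding are illustrative; other model families use different tokenizers, but the comparison works the same way.

```python
# Count tokens for a verbose prompt versus a concise one. cl100k_base is one
# of tiktoken's built-in encodings; swap in whichever tokenizer matches the
# model you actually use.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

verbose = (
    "I was wondering if you could possibly help me out by writing a short "
    "summary of the attached report, ideally keeping it fairly brief if that "
    "is not too much trouble, thank you so much in advance."
)
concise = "Summarise the attached report in three bullet points."

print(len(enc.encode(verbose)))  # noticeably more tokens
print(len(enc.encode(concise)))  # fewer tokens, same intent
```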
Conversation length matters too. Large language models, the engines behind Gen AI, reprocess the full conversation history every time they generate a reply. In settings like contact centres, where interactions may run for long periods, that history quietly grows into a large context window that must be processed again and again. Chris suggests organisations build sensible boundaries into their systems. Summarise older content. Reset where appropriate. Help the model focus on the relevant information and avoid unnecessary reprocessing.
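A minimal sketch of what those boundaries might look like in code is below. The summarise() helper is a placeholder for whatever condensing step an organisation chooses, and the turn limit is arbitrary; the point is that only the system prompt, a short summary, and the most recent turns are sent back to the model.

```python
# Illustrative context trimming. Assumes the first message is the system
# prompt and each message is a dict with "role" and "content" keys.

MAX_RECENT_TURNS = 6  # arbitrary: keep only the most recent turns verbatim

def summarise(turns):
    """Placeholder: condense older turns into one short summary message."""
    text = " ".join(t["content"] for t in turns)
    return {"role": "system", "content": f"Summary of earlier conversation: {text[:500]}"}

def trim_history(messages):
    """Keep the system prompt and recent turns; fold the rest into a summary."""
    system, turns = messages[:1], messages[1:]
    if len(turns) <= MAX_RECENT_TURNS:
        return messages
    older, recent = turns[:-MAX_RECENT_TURNS], turns[-MAX_RECENT_TURNS:]
    return system + [summarise(older)] + recent
```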
Model choice is another powerful lever. The largest models offer impressive general reasoning, but they are not always needed. Many business tasks are more routine, structured or narrowly defined. Smaller “specialist” models, especially those refined through distillation or quantisation, can handle these tasks extremely well while consuming far less compute. Roberto encourages organisations to select models with intention. Use the right tool for the job, not the biggest tool available.
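One simple way to make that intention explicit is a routing table that maps task types to the smallest model considered adequate. The model names and task labels below are placeholders, not recommendations.

```python
# Illustrative model routing: every name here is a stand-in, not a real model.
ROUTES = {
    "classification": "small-specialist-model",
    "extraction": "small-specialist-model",
    "open_ended_reasoning": "large-general-model",
}

def choose_model(task_type: str) -> str:
    """Default to the smallest model that can do the job."""
    return ROUTES.get(task_type, "small-specialist-model")

print(choose_model("extraction"))             # small-specialist-model
print(choose_model("open_ended_reasoning"))   # large-general-model
```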
Data quality also plays a significant role. Even the most capable models struggle when the underlying information is unstructured or inconsistent. This leads to hallucinations and repeated queries, which both increase computational load. Chris highlights that graph databases are becoming an effective way to structure information for retrieval. They allow AI systems to access what they need directly, without interpreting or reprocessing large volumes of text.
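The sketch below hints at the idea using networkx, an in-memory graph library; a production system would use a proper graph database, and the entities and relations here are invented. The benefit is that the AI system receives a handful of precise facts rather than pages of text to reinterpret.

```python
# Toy illustration: store facts as edges, then pull only the neighbourhood
# relevant to a question and pass it to the model as compact context.
import networkx as nx

g = nx.DiGraph()
g.add_edge("Acme Ltd", "Policy P-123", relation="holds")          # illustrative data
g.add_edge("Policy P-123", "Home insurance", relation="covers")
g.add_edge("Policy P-123", "2026-03-31", relation="renews_on")

def context_for(entity: str) -> str:
    """Return a few compact facts about an entity instead of whole documents."""
    facts = [f"{u} {d['relation']} {v}" for u, v, d in g.edges(entity, data=True)]
    return "; ".join(facts)

print(context_for("Policy P-123"))
# Policy P-123 covers Home insurance; Policy P-123 renews_on 2026-03-31
```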
Even image generation follows the same logic. Users often regenerate images several times when they are unsure how to guide the model. Roberto notes that providing reference images or templates gives the model a clear direction from the start. This reduces the number of iterations needed and improves efficiency.
What becomes clear across all these areas is that organisations are not passive consumers of Gen AI. They play an active role in shaping their own environmental footprint. A single prompt may seem small, but enterprise usage spans thousands of prompts running across GPU clusters that draw considerable power. The encouraging part is that financial cost and environmental cost tend to move together. Improvements that reduce waste also reduce spend.
The organisations that gain the most from Gen AI are the ones that bring intention and clarity to how they use it. Clear prompts. Clean data. Sensible models. Manageable context windows. Guided image generation. These may seem like small steps, but together they help keep AI efficient, affordable and responsible.
Key takeaways:
- Clear, specific prompts keep token counts low and reduce unnecessary compute.
- Managing conversation length through summarisation and session boundaries prevents reprocessing of irrelevant history.
- Choosing a model that fits the task avoids the overhead of larger architectures.
- High-quality structured data reduces hallucinations and unnecessary repeated queries.
- Graph databases provide an efficient foundation for retrieval-augmented generation.
- Reference images and templates streamline image generation and reduce iteration cycles.
- Intentional design and day-to-day discipline help organisations get the most out of Gen AI while keeping financial and environmental impact under control.

