In 2025, worldwide expenditure on infrastructure as a service and platform as a service (IaaS and PaaS) reached $90.9 billion, a 21% rise from the previous year, according to Canalys. From what I'm seeing, this surge is primarily driven by companies migrating their workloads to the cloud and adopting AI, which relies heavily on compute resources. Yet as businesses eagerly embrace these technologies, they are also encountering obstacles that could hinder their strategic use of AI.
Transitioning AI from research to large-scale deployment requires distinguishing between the costs of training models and those of running inference with them. Rachel Brindley, senior director at Canalys, notes that while training is usually a one-time investment, inference is an ongoing expense that can vary considerably over time. Enterprises are increasingly concerned about the cost-effectiveness of inference services as their AI projects move toward production. This deserves close attention, because inference costs can add up quickly and put real pressure on companies.
Today’s pricing plans for inferencing services are based on usage metrics, such as tokens or API calls. As a result, companies may find it difficult to predict their costs. This unpredictability could lead businesses to scale back the sophistication of their AI models, restrict deployment to critical situations, or even opt out of inferencing services altogether. Such cautious strategies might hinder the overall advancement of AI by constraining organizations to less cutting-edge approaches.
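The difficulty described above is easy to see with a little arithmetic. The sketch below models per-token pricing with entirely hypothetical rates and traffic volumes (no real vendor's prices are used); the point is only that the bill scales linearly with usage, so unpredictable traffic means an unpredictable bill.

```python
# Hypothetical illustration of usage-based inference pricing.
# All rates and volumes below are made-up assumptions, not real vendor prices.

def monthly_inference_cost(requests_per_day, tokens_per_request,
                           price_per_million_tokens, days=30):
    """Estimate a month's inference bill under per-token pricing."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * price_per_million_tokens

# The same application at three adoption levels: because cost tracks
# usage one-to-one, a 10x spike in traffic means a 10x bigger bill.
for daily_requests in (1_000, 10_000, 100_000):
    cost = monthly_inference_cost(daily_requests, 1_500, 2.00)
    print(f"{daily_requests:>7} req/day -> ${cost:,.2f}/month")
```

Under these assumed numbers the bill runs from tens to thousands of dollars a month for the same application, which is exactly the forecasting problem that pushes some organizations toward simpler models or narrower deployments.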