
Amazon pushes new controls for spendthrift GenAI models

Foundation models can now be tagged to monitor budgets and trigger alerts if anomalies are identified.

GenAI is as much a risk for enterprises as it is a benefit, with the potential for expensive disasters that could wipe out any efficiency savings in a frighteningly short space of time.

Now AWS has introduced a new feature which will enable businesses to keep track of the costs of their GenAI projects, reducing the chance of nasty surprises for CFOs when it's time to close the books (or worse).

AWS wrote: "As enterprises increasingly embrace generative AI, they face challenges in managing the associated costs. With demand for generative AI applications surging across projects and multiple lines of business, accurately allocating and tracking spend becomes more complex."

It has expanded its tagging capabilities so that foundation models can now be monitored with cost-tracking tags. Tagging was already supported across a range of Bedrock resources, including provisioned models, custom models, agents and agent aliases, model evaluations, prompts, prompt flows, knowledge bases, batch inference jobs, custom model jobs, and model duplication jobs.

Until now, there was no ability to tag on-demand foundation models, which "added complexity to cost management for generative AI initiatives", AWS wrote in a blog announcing the feature.

Organisations can now label all Amazon Bedrock models with AWS cost allocation tags, to "align usage to specific organisational taxonomies such as cost centers, business units, and applications."

Services such as AWS Budgets can then be used to set "tag-based budgets and alarms" that monitor usage and set alerts clanging whenever anomalies are detected or predefined thresholds are breached.
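For teams that want to wire this up programmatically rather than through the console, the sketch below shows roughly what a tag-based monthly budget with an alert could look like using boto3's Budgets API. The account ID, tag key and value, dollar limit and notification address are invented placeholders, and the cost-filter format should be checked against current AWS Budgets documentation before use.

```python
import boto3

# Minimal sketch: a monthly cost budget scoped to a cost allocation tag,
# with an email alert when actual spend passes 80% of the limit.
# Account ID, tag, limit and address are placeholders, not values from AWS.
budgets = boto3.client("budgets")

budgets.create_budget(
    AccountId="123456789012",
    Budget={
        "BudgetName": "bedrock-genai-research",
        "BudgetLimit": {"Amount": "500", "Unit": "USD"},
        # Cost allocation tags are referenced as "user:<key>$<value>"
        "CostFilters": {"TagKeyValue": ["user:CostCenter$genai-research"]},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
    NotificationsWithSubscribers=[
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": "finops@example.com"}
            ],
        }
    ],
)
```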

"This scalable, programmatic approach eliminates inefficient manual processes, reduces the risk of excess spending, and ensures that critical applications receive priority," AWS wrote "Enhanced visibility and control over AI-related expenses enables organisations to maximize their generative AI investments and foster innovation."

AWS recently introduced cross-region inference, which automatically routes inference requests across AWS Regions using system-defined inference profiles. For finer-grained cost and usage management, Amazon Bedrock has now added application inference profiles, which let organisations apply custom tags to track and control costs by tenant or workload.

With application inference profiles, users can configure specific inference settings for models, including tagging metadata to align with project budgets, using APIs like CreateInferenceProfile and TagResource. This tagging enables organizations to categorize and monitor spending by project or team, with AWS Budgets, Cost Explorer, and CloudWatch integration offering insights and alerts for cost management.
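As a rough illustration of how that could look in practice (a sketch under assumptions, not AWS's documented walkthrough): using boto3, the Bedrock control-plane client exposes create_inference_profile and tag_resource. The profile name, model ARN and tag values below are placeholder examples.

```python
import boto3

bedrock = boto3.client("bedrock")

# Sketch: create an application inference profile that wraps an on-demand
# foundation model, attaching cost allocation tags at creation time.
# The model ARN, profile name and tag values are placeholder examples.
response = bedrock.create_inference_profile(
    inferenceProfileName="claims-summarisation-dev",
    modelSource={
        "copyFrom": "arn:aws:bedrock:us-east-1::foundation-model/"
                    "anthropic.claude-3-5-sonnet-20240620-v1:0"
    },
    tags=[
        {"key": "CostCenter", "value": "genai-research"},
        {"key": "Application", "value": "claims-summarisation"},
    ],
)
profile_arn = response["inferenceProfileArn"]

# Tags can also be added or adjusted later on the existing profile.
bedrock.tag_resource(
    resourceARN=profile_arn,
    tags=[{"key": "BusinessUnit", "value": "insurance"}],
)
```

Inference requests are then pointed at the profile's ARN rather than the bare model ID, so the spend they generate can be broken out by those tags in AWS Budgets, Cost Explorer and CloudWatch.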

AWS also offered this advice to enterprises: "Organisations need to prioritise their generative AI spending based on business impact and criticality while maintaining cost transparency across customer and user segments. This visibility is essential for setting accurate pricing for generative AI offerings, implementing chargebacks, and establishing usage-based billing models."

"Without a scalable approach to controlling costs, organizations risk unbudgeted usage and cost overruns. Manual spend monitoring and periodic usage limit adjustments are inefficient and prone to human error, leading to potential overspending."

