
The three highest-impact SageMaker cost levers for ML teams.


AWS SageMaker removes much of the undifferentiated work of building ML infrastructure, but its flexible pricing can also hide waste. A cost optimisation guide from Wring.co identifies three patterns that drive the majority of unnecessary SageMaker spend. Each is straightforward to fix once you know where to look, and none requires an architecture change.

Idle inference endpoints

Production inference endpoints are often left running at full scale even when traffic is low or absent overnight and at weekends. SageMaker Serverless Inference and multi-model endpoints can help, but the simplest win is to scale endpoints in and out with traffic. Autoscaling policies that track invocations per instance or latency can cut idle capacity substantially without affecting user experience.
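As a rough illustration of the autoscaling approach, the sketch below builds the two requests that Application Auto Scaling expects for a SageMaker endpoint variant: one registering the variant as a scalable target and one attaching a target-tracking policy on invocations per instance. The endpoint name, variant name, capacity bounds, target value and cooldowns are all placeholder assumptions, not recommendations from the guide, and `apply_autoscaling` needs boto3 and AWS credentials to run.

```python
from typing import Any, Dict


def endpoint_resource_id(endpoint_name: str, variant_name: str) -> str:
    """Resource ID format used by Application Auto Scaling for an endpoint variant."""
    return f"endpoint/{endpoint_name}/variant/{variant_name}"


def scalable_target_request(
    endpoint_name: str, variant_name: str, min_instances: int, max_instances: int
) -> Dict[str, Any]:
    """Keyword arguments for register_scalable_target."""
    if min_instances > max_instances:
        raise ValueError("min_instances must not exceed max_instances")
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": endpoint_resource_id(endpoint_name, variant_name),
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }


def target_tracking_policy_request(
    endpoint_name: str,
    variant_name: str,
    invocations_per_instance: float,
    scale_in_cooldown_s: int = 300,   # assumed value; tune per workload
    scale_out_cooldown_s: int = 60,   # assumed value; tune per workload
) -> Dict[str, Any]:
    """Keyword arguments for put_scaling_policy (target tracking on invocations)."""
    return {
        "PolicyName": f"{endpoint_name}-{variant_name}-invocations",  # hypothetical naming scheme
        "ServiceNamespace": "sagemaker",
        "ResourceId": endpoint_resource_id(endpoint_name, variant_name),
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleInCooldown": scale_in_cooldown_s,
            "ScaleOutCooldown": scale_out_cooldown_s,
        },
    }


def apply_autoscaling(endpoint_name: str, variant_name: str) -> None:
    """Send both requests to AWS. Requires boto3 and valid credentials."""
    import boto3  # imported lazily so the builders above work without it

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(
        **scalable_target_request(endpoint_name, variant_name, 1, 4)
    )
    client.put_scaling_policy(
        **target_tracking_policy_request(endpoint_name, variant_name, 100.0)
    )
```

Keeping the request builders separate from the AWS call makes the policy easy to review and unit test before it touches a live endpoint.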

For batch or internal use cases, consider whether an always-on endpoint is needed at all. Running inference as a batch transform job, or using a smaller model behind an endpoint that scales to zero, can cut costs by an order of magnitude.
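The "order of magnitude" claim is easy to sanity-check with back-of-envelope arithmetic. The sketch below compares an always-on endpoint with a batch job that only runs for a few hours per day. The hourly rate and run schedule are made-up illustrative figures, not AWS prices.

```python
HOURS_PER_MONTH = 730  # average number of hours in a calendar month


def always_on_monthly_cost(hourly_rate: float, instance_count: int = 1) -> float:
    """Cost of instances that run every hour of the month."""
    return hourly_rate * instance_count * HOURS_PER_MONTH


def batch_monthly_cost(
    hourly_rate: float, hours_per_run: float, runs_per_month: int, instance_count: int = 1
) -> float:
    """Cost of instances that only exist while a batch job is running."""
    return hourly_rate * instance_count * hours_per_run * runs_per_month


if __name__ == "__main__":
    rate = 1.00  # hypothetical USD per instance-hour
    endpoint = always_on_monthly_cost(rate)
    batch = batch_monthly_cost(rate, hours_per_run=2, runs_per_month=30)
    print(f"always-on: ${endpoint:.2f}, batch: ${batch:.2f}, ratio: {endpoint / batch:.1f}x")
```

With these assumed numbers the endpoint is billed for 730 hours and the batch job for 60, a ratio of roughly 12 to 1. The real ratio depends entirely on how many hours the workload actually needs.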

Oversized training jobs

Training instance families vary widely in price, and it is tempting to default to the largest GPU instance available. In practice, many jobs are not bottlenecked by GPU memory or compute in the way teams assume. Start with a smaller instance, profile the job and scale up only when metrics prove it is necessary.

Use managed spot training for fault-tolerant workloads. The savings are substantial and, with checkpointing, the interruption risk is manageable for most modern training frameworks.
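To make the spot recommendation concrete, here is a minimal sketch with two helpers. The first assembles the spot-related settings that the SageMaker Python SDK's `Estimator` accepts (`use_spot_instances`, `max_run`, `max_wait`, `checkpoint_s3_uri`). The second computes the realised saving from the `TrainingTimeInSeconds` and `BillableTimeInSeconds` fields of a `DescribeTrainingJob` response. The S3 path and time limits are placeholders.

```python
from typing import Any, Dict, Mapping


def spot_training_settings(
    checkpoint_s3_uri: str, max_run_s: int, max_wait_s: int
) -> Dict[str, Any]:
    """Spot-related keyword arguments to pass to a SageMaker Estimator."""
    if max_wait_s < max_run_s:
        # max_wait covers waiting for capacity plus the run itself
        raise ValueError("max_wait_s must be at least max_run_s")
    return {
        "use_spot_instances": True,
        "max_run": max_run_s,
        "max_wait": max_wait_s,
        "checkpoint_s3_uri": checkpoint_s3_uri,  # lets an interrupted job resume
    }


def spot_savings_percent(job_description: Mapping[str, Any]) -> float:
    """Percentage saved, based on billed versus actual training seconds."""
    training_s = job_description["TrainingTimeInSeconds"]
    billable_s = job_description["BillableTimeInSeconds"]
    if training_s <= 0:
        return 0.0
    return (1 - billable_s / training_s) * 100


if __name__ == "__main__":
    settings = spot_training_settings(
        "s3://example-bucket/checkpoints/",  # hypothetical bucket
        max_run_s=3600,
        max_wait_s=7200,
    )
    print(settings)
    # Hypothetical figures standing in for a real DescribeTrainingJob response
    print(spot_savings_percent({"TrainingTimeInSeconds": 1000, "BillableTimeInSeconds": 250}))
```

Tracking the savings figure per job is a simple way to confirm that spot capacity is actually being used and that interruptions are not eroding the benefit.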

Notebooks left running

SageMaker Studio and notebook instances are convenient for experimentation, but they are also easy to leave on. A single large notebook instance running for a month can cost hundreds of dollars. Institute idle timeouts, encourage users to shut down kernels and use lifecycle configurations to stop notebooks automatically after a period of inactivity. Shared team domains should also be reviewed regularly to remove unused user profiles and old experiments.
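An auto-stop script in a lifecycle configuration usually reduces to one decision: has every kernel been inactive for longer than the timeout? The sketch below shows that decision in isolation. It assumes kernel records shaped like the Jupyter server's `/api/kernels` response, with `execution_state` and `last_activity` fields. The timeout value and the step that actually stops the instance are left to the caller.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Mapping


def parse_jupyter_time(value: str) -> datetime:
    """Parse a UTC timestamp such as '2024-01-01T12:00:00.000000Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def notebook_is_idle(
    kernels: Iterable[Mapping[str, str]], now: datetime, idle_timeout: timedelta
) -> bool:
    """True when no kernel is busy and none was active within idle_timeout."""
    for kernel in kernels:
        if kernel.get("execution_state") == "busy":
            return False
        last_activity = parse_jupyter_time(kernel["last_activity"])
        if now - last_activity < idle_timeout:
            return False
    # No kernels at all also counts as idle
    return True


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc)
    kernels = [
        {"execution_state": "idle", "last_activity": "2024-01-01T12:00:00.000000Z"},
    ]
    # One-hour timeout is an assumed policy, not a recommended value
    print(notebook_is_idle(kernels, now, timedelta(hours=1)))
```

Treating a busy kernel as active regardless of its timestamp avoids stopping a notebook in the middle of a long-running cell.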

Build a cost review habit

These three levers are not technically complex, but they require visibility. Tag SageMaker resources by team and project, set AWS Budgets alerts and review usage weekly. Watch for storage and data transfer charges alongside compute, since large datasets moved repeatedly between S3, training jobs and endpoints can add surprising costs.
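Once resources are tagged, the weekly review is largely a group-and-sort exercise. The sketch below ranks cost owners from a list of cost records. The record shape (`cost_usd` plus a `tags` mapping) is an assumption for illustration. In practice the rows would come from a billing export or a cost API, and untagged spend is reported under its own bucket so that it stays visible.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping, Tuple


def top_cost_drivers(
    records: Iterable[Mapping[str, Any]], tag_key: str = "team", limit: int = 3
) -> List[Tuple[str, float]]:
    """Sum cost per tag value and return the largest totals first."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        owner = record.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += record["cost_usd"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]


if __name__ == "__main__":
    # Hypothetical weekly cost rows
    rows = [
        {"resource": "endpoint-a", "cost_usd": 100.0, "tags": {"team": "search"}},
        {"resource": "notebook-b", "cost_usd": 50.0, "tags": {"team": "ads"}},
        {"resource": "training-c", "cost_usd": 25.0, "tags": {"team": "search"}},
        {"resource": "endpoint-d", "cost_usd": 25.0, "tags": {}},
    ]
    for owner, cost in top_cost_drivers(rows):
        print(f"{owner}: ${cost:.2f}")
```

A large "untagged" total is itself a finding for the review, since it marks spend that nobody currently owns.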

The teams that keep SageMaker spend under control treat cost as a first-class operational metric, not a finance afterthought. A short weekly review of the top cost drivers, owned jointly by engineering and FinOps, will usually surface one or two quick wins. In our experience, simply identifying and deleting idle endpoints and shutting down abandoned notebooks can reduce a team’s first SageMaker bill by 20% or more.
