MLOps

Also known as: ML operations, machine learning operations, LLMOps

MLOps is the practice of operating machine learning systems in production reliably and repeatably. It covers training, deployment, monitoring, retraining, and governance, and is analogous to DevOps for software.

MLOps brings software engineering and operations rigor to machine learning. It covers reproducible training pipelines, model registries, deployment patterns (online, batch, edge), feature stores, monitoring for accuracy and drift, automated retraining, and governance for model changes.
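As one illustration of drift monitoring, the sketch below computes a Population Stability Index (PSI) between a reference sample (for example, training-time values of a feature) and a live sample. PSI is only one of several common drift measures and is chosen here as an assumption, not something this entry prescribes. The function name `psi`, the equal-width binning, and the 0.2 alert threshold in the usage note are illustrative choices.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a reference sample and a live sample.

    Bins are equal-width over the range of the reference sample (an assumption;
    quantile bins are another common choice).
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    lo, hi = min(expected), max(expected)
    # Fall back to a width of 1.0 when every reference value is identical.
    width = (hi - lo) / bins or 1.0

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp live values outside the reference range into the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor each fraction so that empty bins do not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A monitoring job might call `psi(reference_values, live_values)` on a schedule and raise an alert when the result exceeds a chosen threshold. A value around 0.2 is a frequently quoted rule of thumb, but the right threshold depends on the feature and the cost of a false alarm.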

Mature MLOps practices treat models like any other production artifact: versioned, tested, observable, and reversible. Key tooling categories include experiment tracking, pipeline orchestration, model serving, feature platforms, evaluation frameworks, and ML observability.
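The "versioned, tested, and reversible" idea can be sketched with a toy in-memory registry. This is a minimal illustration under stated assumptions, not the API of any real model registry: `ModelRegistry`, `ModelVersion`, the `accuracy` metric key, and the artifact URIs are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ModelVersion:
    version: int
    artifact_uri: str  # hypothetical location of the serialized model
    metrics: Dict[str, float]


@dataclass
class ModelRegistry:
    versions: List[ModelVersion] = field(default_factory=list)
    history: List[int] = field(default_factory=list)  # promoted versions, newest last

    def register(self, artifact_uri: str, metrics: Dict[str, float]) -> ModelVersion:
        """Versioned: every registered artifact gets the next version number."""
        mv = ModelVersion(len(self.versions) + 1, artifact_uri, dict(metrics))
        self.versions.append(mv)
        return mv

    def promote(self, version: int, min_accuracy: float) -> bool:
        """Tested: promote only if the recorded accuracy clears the gate."""
        if not 1 <= version <= len(self.versions):
            raise ValueError(f"unknown version {version}")
        mv = self.versions[version - 1]
        if mv.metrics.get("accuracy", 0.0) < min_accuracy:
            return False
        self.history.append(version)
        return True

    def rollback(self) -> Optional[int]:
        """Reversible: return to the previously promoted version, if there is one."""
        if len(self.history) > 1:
            self.history.pop()
        return self.production

    @property
    def production(self) -> Optional[int]:
        """Observable: the version currently serving, or None if nothing is promoted."""
        return self.history[-1] if self.history else None
```

In this sketch a rollback is just a pointer change in `history`, which is what makes it fast. A real system would also need to persist this state and coordinate with the serving layer.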

For LLM systems, MLOps extends to prompt and evaluation management, retrieval-augmented generation (RAG) pipeline operations, and tool/agent observability, a specialization sometimes called LLMOps. The core principles are the same: ship safely, monitor honestly, roll back fast.
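For the LLM case, "ship safely" often takes the form of an evaluation gate: a candidate prompt or model is run against a fixed set of cases and released only if it does not regress against the current baseline. The sketch below assumes a simple substring check as the scoring rule and a hypothetical `generate` callable standing in for whatever model or prompt pipeline is under test. Real evaluation suites typically use richer scoring.

```python
from typing import Callable, List, Tuple

# (input text, substring expected somewhere in the output)
EvalCase = Tuple[str, str]


def pass_rate(generate: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Fraction of cases whose output contains the expected substring (case-insensitive)."""
    if not cases:
        raise ValueError("at least one eval case is required")
    passed = sum(
        1 for prompt, expected in cases
        if expected.lower() in generate(prompt).lower()
    )
    return passed / len(cases)


def should_ship(candidate_rate: float, baseline_rate: float,
                max_regression: float = 0.02) -> bool:
    """Allow release only if the candidate is within max_regression of the baseline."""
    return candidate_rate >= baseline_rate - max_regression
```

Keeping the eval cases in version control alongside the prompts makes the gate repeatable: the same cases score every candidate, so a drop in pass rate points at the change rather than at the test set.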

Related service

AI & Machine Learning Solutions
