LLM Application

An LLM application is software built around one or more large language models that interacts with users or other systems. It typically combines prompts, tools, retrieval, evaluation, and traditional code to deliver a specific business outcome.

Detailed explanation

An LLM application is more than a model API call. It usually includes prompt templates, context assembly, tool or function calling, retrieval over private data, structured output parsing, evaluation harnesses, observability, and traditional application logic. The model is one component in a larger system.
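To make the "model as one component" idea concrete, here is a minimal sketch of how a prompt template, context assembly, retrieval, and structured output parsing might fit around a model call. Every name in it (PROMPT_TEMPLATE, retrieve, build_prompt, parse_answer, answer_question, fake_model) is hypothetical, the word-overlap retrieval is a toy stand-in for a real search index, and the model is passed in as a plain function so that no particular vendor API is assumed.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Callable

# Prompt template: doubled braces are literal braces for str.format.
PROMPT_TEMPLATE = (
    "Answer using only the context below.\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    'Reply as JSON: {{"answer": "...", "source_ids": [...]}}'
)


@dataclass
class Answer:
    answer: str
    source_ids: list[int]


def retrieve(question: str, documents: dict[int, str], k: int = 2) -> list[int]:
    """Toy retrieval (assumption): rank documents by word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc_id: len(q_words & set(documents[doc_id].lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, documents: dict[int, str], doc_ids: list[int]) -> str:
    """Context assembly: put the retrieved documents into the template."""
    context = "\n".join(f"[{i}] {documents[i]}" for i in doc_ids)
    return PROMPT_TEMPLATE.format(context=context, question=question)


def parse_answer(raw: str) -> Answer:
    """Structured output parsing: raises if the reply is not the expected JSON."""
    data = json.loads(raw)
    return Answer(answer=str(data["answer"]),
                  source_ids=[int(i) for i in data["source_ids"]])


def answer_question(question: str, documents: dict[int, str],
                    call_model: Callable[[str], str]) -> Answer:
    """Application logic: retrieve, assemble, call the model, parse."""
    doc_ids = retrieve(question, documents)
    prompt = build_prompt(question, documents, doc_ids)
    return parse_answer(call_model(prompt))


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call, used here only so the sketch runs offline."""
    return json.dumps({"answer": "30 days", "source_ids": [1]})
```

Calling answer_question(question, documents, fake_model) exercises the whole path without a network call. In a real application the call_model argument would wrap whichever model API is in use, and the surrounding functions would stay the same.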

Common patterns include chat assistants, document analysis, classification and extraction, code generation, search, and agentic workflows. Production LLM applications need versioning of prompts and models, automated evaluation on real or synthetic data, cost and latency monitoring, and safety controls such as content filters and prompt-injection defenses.
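The production concerns above can be sketched as a thin wrapper around the model call that records prompt and model versions, latency, and a rough size estimate for each request. This is a sketch under stated assumptions: MonitoredClient and CallRecord are hypothetical names, the whitespace word count is only a crude proxy for billed tokens, and the phrase blocklist is a deliberately naive placeholder for a safety control, not a real prompt-injection defense.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CallRecord:
    prompt_version: str
    model_version: str
    latency_s: float
    approx_tokens: int  # whitespace word count, a rough proxy for cost
    blocked: bool


@dataclass
class MonitoredClient:
    call_model: Callable[[str], str]  # any function mapping prompt -> completion
    prompt_version: str
    model_version: str
    # Naive placeholder filter (assumption); real defenses are more involved.
    blocked_phrases: tuple[str, ...] = ("ignore previous instructions",)
    records: list[CallRecord] = field(default_factory=list)

    def complete(self, prompt: str) -> str:
        lowered = prompt.lower()
        blocked = any(phrase in lowered for phrase in self.blocked_phrases)

        start = time.perf_counter()
        # Blocked prompts never reach the model and yield an empty completion.
        output = "" if blocked else self.call_model(prompt)
        latency = time.perf_counter() - start

        self.records.append(CallRecord(
            prompt_version=self.prompt_version,
            model_version=self.model_version,
            latency_s=latency,
            approx_tokens=len(prompt.split()) + len(output.split()),
            blocked=blocked,
        ))
        return output
```

Because every record carries the prompt and model versions, cost or latency shifts can be traced back to the change that introduced them.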

The discipline of building LLM applications is converging with traditional software engineering: source control for prompts, CI for evals, canary releases for model changes, and incident response for AI-specific failure modes like jailbreaks, data leakage, and quality regressions.
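One way "CI for evals" might look in practice is a gate that compares a candidate prompt or model against the current baseline on a fixed set of labeled cases and fails the build if quality drops. The sketch below assumes a simple exact-match classification task; accuracy, eval_gate, and the max_regression threshold are hypothetical names and values chosen for illustration.

```python
from typing import Callable, Sequence, Tuple

Case = Tuple[str, str]  # (input text, expected label)


def accuracy(predict: Callable[[str], str], cases: Sequence[Case]) -> float:
    """Fraction of cases where the prediction exactly matches the expected label."""
    if not cases:
        raise ValueError("evaluation set is empty")
    correct = sum(1 for text, expected in cases if predict(text) == expected)
    return correct / len(cases)


def eval_gate(candidate: Callable[[str], str],
              baseline: Callable[[str], str],
              cases: Sequence[Case],
              max_regression: float = 0.02) -> bool:
    """Return True if the candidate is no more than max_regression below baseline."""
    return accuracy(candidate, cases) >= accuracy(baseline, cases) - max_regression
```

A CI job could call eval_gate with the candidate and baseline wrapped as functions and exit non-zero when it returns False, so that a quality regression blocks a release in the same way a failing unit test does.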
