Retrieval is the process of fetching external information at inference time and injecting it into the LLM Context. It differs from Prompt creation in that it is dynamic: what gets included is selected at run time under constraints such as context size, relevance, and cost.
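
To make the idea of selection under constraints concrete, here is a minimal sketch in Python. It assumes each candidate already carries a relevance score and a token count; the `Snippet` type, the `select_for_context` function, and the `min_relevance` threshold are hypothetical names chosen for illustration, and the greedy strategy is only one of many possible selection policies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snippet:
    text: str
    relevance: float  # assumed score in [0, 1], e.g. produced by a ranker
    tokens: int       # assumed token count of the text


def select_for_context(candidates: List[Snippet], token_budget: int,
                       min_relevance: float = 0.5) -> List[Snippet]:
    """Greedily keep the most relevant snippets that fit in the token budget."""
    selected = []
    used = 0
    for snippet in sorted(candidates, key=lambda s: s.relevance, reverse=True):
        if snippet.relevance < min_relevance:
            break  # everything after this point is even less relevant
        if used + snippet.tokens <= token_budget:
            selected.append(snippet)
            used += snippet.tokens
    return selected


candidates = [
    Snippet("Refund policy: 30 days.", 0.9, 40),
    Snippet("Shipping times by region.", 0.7, 80),
    Snippet("Company history.", 0.2, 30),
    Snippet("Refund exceptions.", 0.8, 50),
]
chosen = select_for_context(candidates, token_budget=100)
```

With a budget of 100 tokens, the two refund snippets fit, the shipping snippet is skipped because it would exceed the budget, and the company history is dropped for low relevance.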

Recall that the model itself is stateless. When wrapped in an Agent loop, however, it can plan what information it needs and evaluate the information it is given. An Agent Harness is required to interpret these retrieval plans and execute them. Typical retrieval techniques include Tool Calling, MCP, and RAG.
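
The division of labor between model and harness can be sketched as follows. This is an illustrative toy, not a real framework: `fake_model` stands in for an LLM call, `search_docs` stands in for a retrieval tool, and the dictionary-based action format is an assumption made for this example.

```python
from typing import Callable, Dict, List


# Hypothetical retrieval tool; a real harness might expose Tool Calling, MCP, or RAG here.
def search_docs(query: str) -> str:
    docs = {"refund": "Refunds are accepted within 30 days."}
    return docs.get(query, "no results")


TOOLS: Dict[str, Callable[[str], str]] = {"search_docs": search_docs}


def fake_model(context: List[str]) -> dict:
    """Stand-in for an LLM call: stateless, it sees only the context passed in."""
    if not any(line.startswith("RESULT: ") for line in context):
        # Nothing retrieved yet, so plan a retrieval step.
        return {"action": "retrieve", "tool": "search_docs", "arg": "refund"}
    # Retrieved information is present, so answer from it.
    return {"action": "answer", "text": context[-1][len("RESULT: "):]}


def run_agent(question: str, max_steps: int = 3) -> str:
    """Harness loop: holds the state, executes the model's retrieval plans."""
    context = [f"QUESTION: {question}"]
    for _ in range(max_steps):
        step = fake_model(context)
        if step["action"] == "answer":
            return step["text"]
        result = TOOLS[step["tool"]](step["arg"])
        context.append(f"RESULT: {result}")  # inject retrieved data into the context
    return "step limit reached"


answer = run_agent("What is the refund window?")
```

Note that all state lives in the `context` list owned by the harness; the model function is called fresh on every step and only ever sees what the harness passes in.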

Retrieval is an important component in making Agentic workflows more powerful: it allows relevant data to be collected autonomously, and it mitigates some Context Window issues by letting the model evaluate how relevant each piece of information is.