Why Agentic Architecture Won the Day
Most engineering teams face a familiar tension: you've consolidated your backend services, but the workflow layer remains fragmented. Each buying channel — Direct, Self-Serve, Programmatic — runs on shared infrastructure, yet the decision logic drifts. Budget allocation, inventory selection, and performance tradeoffs get re-implemented per channel and per surface. Over time, they diverge.
Spotify's Ads AI team confronted this exact problem. Instead of building yet another REST service with hard-coded happy paths, they bet on an agentic approach: a unified, programmable decision layer that can understand goals, reason over shared signals, and orchestrate existing Ads APIs consistently across all buying channels and surfaces.
The core insight: workflows are combinatorial. You can't capture planning, forecasting, audience selection, and optimization in a static decision tree. You need agents that can reason, adapt, and use tools.
Source: Spotify Engineering Blog

The Multi-Agent Blueprint
Spotify decomposed the media planning workflow into specialized agents, each with a focused responsibility and optimized prompt. The architecture leverages Google's Agent Development Kit (ADK) 0.2.0 for orchestration and Vertex AI (Gemini 2.5 Pro) for natural language understanding.
Agent Breakdown
- RouterAgent: The traffic controller. It analyzes incoming user messages and determines what information is present, preventing unnecessary LLM calls.
- Resolution Agents: Each handles a distinct dimension:
  - GoalResolverAgent: Maps user intent to campaign objectives (REACH, CLICKS, APP_INSTALLS) and searches for appropriate ad categories
  - AudienceResolverAgent: Extracts targeting criteria from a predefined taxonomy (interests, geography, age ranges, gender)
  - BudgetAgent: Parses various formats ($5000, 5k, €10,000) and converts to micro-units
  - ScheduleAgent: Handles date parsing including relative dates like "next month"
- MediaPlannerAgent: The optimizer. It takes all resolved information and generates optimized ad set recommendations using a heuristics engine backed by historical performance data.
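To make the BudgetAgent's job concrete, here is a minimal sketch of the deterministic half of that step: turning strings like "$5000", "5k", or "€10,000" into micro-units. It assumes one currency unit equals 1,000,000 micro-units (a common ads-API convention, not something the source confirms), and `parse_budget`, `CURRENCY_SYMBOLS`, and the USD default are hypothetical names and choices made for illustration.

```python
import re
from decimal import Decimal

# Assumption: 1 currency unit = 1,000,000 micro-units.
MICROS_PER_UNIT = 1_000_000

# Hypothetical symbol table; a real system would cover more currencies.
CURRENCY_SYMBOLS = {"$": "USD", "€": "EUR", "£": "GBP"}

# Optional currency symbol, a number with optional thousands separators,
# and an optional "k" suffix meaning "thousand".
_BUDGET_RE = re.compile(r"^\s*([$€£])?\s*([\d,]*\.?\d+)\s*([kK])?\s*$")


def parse_budget(text: str, default_currency: str = "USD") -> tuple[int, str]:
    """Parse a budget string into (amount_in_micros, currency_code)."""
    match = _BUDGET_RE.match(text)
    if match is None:
        raise ValueError(f"Unrecognized budget format: {text!r}")
    symbol, number, suffix = match.groups()
    # Decimal avoids float rounding surprises on money values.
    amount = Decimal(number.replace(",", ""))
    if suffix:
        amount *= 1_000
    currency = CURRENCY_SYMBOLS.get(symbol, default_currency)
    return int(amount * MICROS_PER_UNIT), currency


print(parse_budget("€10,000"))
```

Keeping the arithmetic in plain code and leaving only the "which part of the message is the budget" question to the LLM is one way to keep this step testable.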
Key Optimization Rules
- Minimize cost metrics (CPM, CPC, CPI) relative to historical medians
- Target campaigns with delivery rates close to 100%
- Find historically successful campaigns with similar budget ranges
- Score based on demographic and interest overlap
- Ensure diversity in format/goal combinations
- Scale recommendations by budget: €0-1,000 → 1 recommendation; €15,000+ → 4-5 recommendations
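The rules above can be sketched as a small scoring function. This is an illustration under stated assumptions, not Spotify's heuristics engine: the `HistoricalCampaign` fields, the equal weighting of the three sub-scores, the use of Jaccard overlap for interests, and the "one pick per format/goal combination" diversity rule are all hypothetical choices. The number of recommendations is left as a caller-supplied `limit`, since the source only gives the two ends of the budget scale.

```python
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoricalCampaign:
    # Hypothetical record shape for one past campaign.
    ad_format: str
    goal: str
    cpm: float            # assumed to be > 0
    delivery_rate: float  # 1.0 means the budget was fully delivered
    interests: frozenset


def score(campaign, median_cpm, target_interests):
    """Higher is better. Equal weights are an assumption."""
    cost_score = median_cpm / campaign.cpm  # above 1 when cheaper than the median
    delivery_score = 1.0 - abs(1.0 - campaign.delivery_rate)
    union = campaign.interests | target_interests
    overlap = len(campaign.interests & target_interests) / len(union) if union else 0.0
    return cost_score + delivery_score + overlap


def recommend(campaigns, target_interests, limit):
    """Rank by score, keeping at most one campaign per (format, goal) pair."""
    median_cpm = statistics.median(c.cpm for c in campaigns)
    ranked = sorted(
        campaigns,
        key=lambda c: score(c, median_cpm, target_interests),
        reverse=True,
    )
    picked, seen = [], set()
    for campaign in ranked:
        combo = (campaign.ad_format, campaign.goal)
        if combo in seen:
            continue  # diversity rule: skip repeated format/goal combinations
        seen.add(combo)
        picked.append(campaign)
        if len(picked) == limit:
            break
    return picked


history = [
    HistoricalCampaign("audio", "REACH", 8.0, 0.98, frozenset({"music", "fitness"})),
    HistoricalCampaign("audio", "REACH", 6.0, 0.70, frozenset({"music"})),
    HistoricalCampaign("video", "CLICKS", 12.0, 1.00, frozenset({"gaming"})),
]
picks = recommend(history, frozenset({"music", "fitness"}), limit=2)
print([(c.ad_format, c.goal) for c in picks])
```

In this toy data the second audio campaign is cheaper than the first but under-delivered, so it ranks lower, and the diversity rule then skips it in favor of the video campaign.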
Parallel Execution in Action
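# Simplified orchestration flow using Google ADK (illustrative sketch)
import asyncio

from google.adk.agents import Agent
from google.adk.tools import FunctionTool

# extract_goal, extract_audience, extract_budget, and extract_schedule are
# plain Python functions assumed to be defined elsewhere.
router = Agent(
    name="RouterAgent",
    model="gemini-2.5-pro",
    tools=[
        FunctionTool(func=extract_goal),
        FunctionTool(func=extract_audience),
        FunctionTool(func=extract_budget),
        FunctionTool(func=extract_schedule),
    ],
    instruction="""
    Analyze the user's message and determine which resolution agents to invoke.
    Return a structured plan with only the missing information.
    """,
)


async def build_plan(user_message):
    # goal_resolver, audience_resolver, budget_resolver, schedule_resolver,
    # and media_planner are illustrative wrappers around the agents; their
    # resolve() and generate() methods are not part of the ADK API.
    # Each resolution agent runs in parallel.
    resolved_data = await asyncio.gather(
        goal_resolver.resolve(user_message),
        audience_resolver.resolve(user_message),
        budget_resolver.resolve(user_message),
        schedule_resolver.resolve(user_message),
    )
    # MediaPlannerAgent uses all resolved data to generate recommendations.
    return await media_planner.generate(resolved_data)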
Performance Impact: Media plan creation dropped from 15-30 minutes to 5-10 seconds. Required user inputs went from 20+ form fields to 1-3 natural language messages. Agent response latency sits at ~3-5s with parallel execution.
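The latency benefit of running resolvers concurrently, and of skipping the ones the router decides are unnecessary, can be seen in a self-contained sketch. Nothing here calls a real model: `resolve` is a hypothetical stand-in that sleeps to simulate an LLM round trip, and the field names and the `already_known` dictionary are assumptions for illustration.

```python
import asyncio
import time

ALL_FIELDS = ("goal", "audience", "budget", "schedule")


async def resolve(field: str, message: str, delay: float = 0.2) -> tuple[str, str]:
    """Stand-in for one resolution agent; the sleep simulates an LLM call."""
    await asyncio.sleep(delay)
    return field, f"resolved {field}"


async def resolve_missing(message: str, already_known: dict) -> dict:
    """Invoke only the resolvers for fields that are still missing, concurrently."""
    missing = [f for f in ALL_FIELDS if f not in already_known]
    results = await asyncio.gather(*(resolve(f, message) for f in missing))
    return {**already_known, **dict(results)}


def timed_run() -> tuple[dict, float]:
    start = time.perf_counter()
    resolved = asyncio.run(
        resolve_missing("Promote my podcast next month", {"budget": "5000000000"})
    )
    return resolved, time.perf_counter() - start


resolved, elapsed = timed_run()
print(sorted(resolved), f"{elapsed:.2f}s")
```

Three simulated calls of 0.2 s each finish in roughly 0.2 s rather than 0.6 s, because `asyncio.gather` lets them wait on I/O at the same time; the budget resolver is never invoked because that field was already known.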
Tool Integration via Function Calling
Spotify uses Google ADK's FunctionTool to give agents access to real data. The @Schema annotations give the LLM structured information about each tool's parameters, which grounds outputs in real data and reduces (but does not eliminate) hallucination.
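As a rough Python analogue of an annotated tool, the sketch below shows a plain function whose name, type hints, and docstring describe its parameters; to my understanding, ADK's Python SDK derives the tool description from exactly those pieces when such a function is listed in an agent's tools. The function name, the `GEO_TARGETS` table, and the return shape are hypothetical and not taken from Spotify's system.

```python
# Hypothetical lookup table standing in for a real geo-targeting service.
GEO_TARGETS = {"SE": "Sweden", "DE": "Germany", "US": "United States"}


def search_geo_targets(query: str) -> dict:
    """Look up supported geo targets whose name contains the query.

    Args:
        query: Free-text country name or fragment, for example "swed".

    Returns:
        A dict with a "status" field ("ok" or "not_found") and a "matches"
        list of {"code", "name"} entries.
    """
    needle = query.strip().lower()
    matches = [
        {"code": code, "name": name}
        for code, name in GEO_TARGETS.items()
        if needle and needle in name.lower()
    ]
    if not matches:
        # An explicit "not_found" gives the model something to act on
        # instead of inviting it to guess a target.
        return {"status": "not_found", "matches": []}
    return {"status": "ok", "matches": matches}


print(search_geo_targets("swed"))
```

Returning a structured status rather than an empty result is a small design choice that makes it easier for the agent to ask the user a clarifying question.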

Trade-offs and Lessons Learned
Design Decisions
| Decision | Chosen Approach | Rationale |
|---|---|---|
| Single vs. multi-agent | Multi-agent | Better latency, maintainability, and parallelization |
| Cache strategy | In-memory cache for historical data | Minimizes latency; performance data is bounded and refreshed periodically |
| Response mode | Synchronous | Simpler initial implementation; streaming planned for future |
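
The in-memory cache decision can be illustrated with a small time-based cache that reloads its data once a refresh interval has passed. This is a sketch under assumptions: the class name `RefreshingCache`, the loader callback, and the interval are hypothetical, and the source does not describe how Spotify's refresh is actually triggered.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class RefreshingCache(Generic[T]):
    """Holds one value in memory and reloads it after ttl_seconds."""

    def __init__(
        self,
        loader: Callable[[], T],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._value: Optional[T] = None
        self._loaded_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            # First access or stale data: reload from the backing source.
            self._value = self._loader()
            self._loaded_at = now
        return self._value


# Usage with a fake clock and a loader that counts how often it runs.
fake_now = [0.0]
load_count = [0]


def load_history() -> int:
    load_count[0] += 1
    return load_count[0]


cache = RefreshingCache(load_history, ttl_seconds=60, clock=lambda: fake_now[0])
first = cache.get()
fake_now[0] = 30.0
second = cache.get()   # still fresh, served from memory
fake_now[0] = 61.0
third = cache.get()    # stale, triggers a reload
print(first, second, third)
```

This works well when, as the table notes, the data set is bounded; it trades some staleness for keeping lookups off the request path.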
Key Learnings
1. Prompt engineering is software engineering. Treat prompts as code: version control, testing, iteration. Small wording changes can dramatically affect output consistency. Be explicit about format, provide concrete examples, and build guardrails at both prompt and parsing layers.
2. Agent boundaries matter. Too many agents increase latency and coordination overhead. Too few create monolithic, hard-to-maintain prompts. Rule of thumb: one agent per distinct skill or data source.
3. Tools enable grounding. LLMs hallucinate. By providing agents with tools that access real data (geo targets, ad categories, historical performance), you ground outputs in reality. The LLM reasons about what to do; tools provide accurate data.
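A guardrail at the parsing layer, as the first learning suggests, might look like the sketch below: the model's reply is treated as untrusted input and checked against the known objectives before anything downstream uses it. The JSON shape `{"objective": ...}` and the function name `parse_goal_response` are assumptions for illustration; only the three objective names come from the article.

```python
import json

# Objective names as listed for the GoalResolverAgent.
ALLOWED_OBJECTIVES = {"REACH", "CLICKS", "APP_INSTALLS"}


def parse_goal_response(raw: str) -> str:
    """Validate a model reply expected to look like {"objective": "REACH"}."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("Model reply is not valid JSON") from exc
    if not isinstance(payload, dict):
        raise ValueError("Model reply must be a JSON object")
    # Normalize casing and whitespace before checking membership.
    objective = str(payload.get("objective", "")).strip().upper()
    if objective not in ALLOWED_OBJECTIVES:
        raise ValueError(f"Unknown objective: {objective!r}")
    return objective


print(parse_goal_response('{"objective": "reach"}'))
```

Because the function is deterministic, it can sit in an ordinary unit-test suite alongside a handful of recorded model replies, which is one practical way to notice when a prompt change alters output format.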
Limitations and Caveats
- Hallucination risk remains: Even with tool-grounded outputs, the LLM can misinterpret tool results or invent intermediate reasoning steps. Continuous monitoring and guardrails are essential.
- Agent coordination overhead: While parallel execution helps, the orchestration layer itself introduces latency and complexity. For simpler workflows, a single-agent approach may be more pragmatic.
- Cold start problem: The heuristics engine relies on historical data. For new campaigns or niche objectives with sparse historical data, recommendations may be less reliable.
- Prompt maintenance burden: As the system evolves, prompts for each agent need continuous refinement. Prompt drift can silently degrade performance.
Next Steps
Spotify's roadmap includes:
- Streaming responses via server-sent events for real-time feedback
- Multi-turn refinement for iterative campaign optimization
- A/B testing integration to automatically test AI-recommended plans against baselines
- Fine-tuned models for domain-specific advertising terminology
For teams considering a similar path, start small: pick one bounded workflow where fragmentation hurts most, build a minimal multi-agent prototype, and measure the latency vs. quality tradeoff before scaling.

Conclusion: The Future of Workflow Automation
Spotify's Ads AI demonstrates that complex, multi-step workflows are well-suited to multi-agent architectures. By decomposing media planning into specialized agents — each with focused prompts, relevant tools, and clear responsibilities — they created a system that's both powerful and maintainable.
The combination of Google ADK for orchestration, Vertex AI for LLM capabilities, and historical performance data creates a system that doesn't just understand what advertisers want — it knows what actually works.
The takeaway: Instead of hard-wiring more deterministic workflows per channel, treat your domain as a set of modular agents that consume shared signals, optimize jointly for user goals and business constraints, and use existing APIs as tools. That's how you centralize decision-making once and project it everywhere.