The Problem: Helping Exploratory Travelers Find Their Next Destination
When you open Airbnb without a clear destination in mind, you're exploring rather than just browsing. These "exploratory" users behave differently: they search broadly ("France"), visit less frequently, and rarely book immediately. Even so, they represent a large opportunity for engagement and conversion.
Airbnb's engineering team faced an unusual challenge: how do you recommend destinations to users who don't yet know where they want to go? Their answer is a destination recommendation model that blends short-term behavior (recent searches, views) with long-term preferences (past bookings, seasonal patterns).
This post breaks down the architecture, training strategy, and real-world deployment of this system. If you're building recommendation engines or personalization features, the insights here are directly applicable.

Model Architecture: Transformers for Travel Intent
The core idea borrows from language models: treat each user action as a token in a sequence. Here's how it works:
- Inputs: Sequences of user actions (bookings, views, searches) + contextual signals (current time, seasonality)
- Embedding: Each action is represented by the sum of embeddings for city, region, and "days to today"
- Transformer: A standard transformer encoder processes these sequences to capture both short-term and long-term dependencies
```python
# Simplified sketch of the model's forward pass
import torch
import torch.nn as nn


class DestinationTransformer(nn.Module):
    def __init__(self, num_cities, num_regions, d_model=256, nhead=8):
        super().__init__()
        self.city_embed = nn.Embedding(num_cities, d_model)
        self.region_embed = nn.Embedding(num_regions, d_model)
        self.time_embed = nn.Linear(1, d_model)  # days to today
        self.transformer = nn.TransformerEncoder(
            # batch_first=True so inputs are (batch, seq_len, d_model),
            # matching the x[:, -1, :] indexing in forward()
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True),
            num_layers=6,
        )
        # Multi-task heads: predict both city and region
        self.city_head = nn.Linear(d_model, num_cities)
        self.region_head = nn.Linear(d_model, num_regions)

    def forward(self, city_ids, region_ids, days_to_today):
        # Embed each action as the sum of city, region, and time embeddings
        x = (
            self.city_embed(city_ids)
            + self.region_embed(region_ids)
            + self.time_embed(days_to_today.float().unsqueeze(-1))
        )
        # Transformer encoding
        x = self.transformer(x)
        # Multi-task predictions from the last (most recent) token
        last = x[:, -1, :]
        return self.city_head(last), self.region_head(last)
```
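Before a history reaches the model, each action has to become a (city id, region id, days-to-today) triple. The sketch below shows one plausible way to do that in plain Python. The `Action` class, the `build_vocab` and `encode` helpers, the reserved padding index 0, and the left-padding scheme are all assumptions made for illustration, not details from Airbnb's pipeline.

```python
from dataclasses import dataclass
from datetime import date

PAD = 0  # assumption: index 0 is reserved for padding in both vocabularies


@dataclass
class Action:
    city: str
    region: str
    day: date


def build_vocab(names):
    """Map each distinct name to an integer id, starting at 1 (0 is padding)."""
    return {name: i for i, name in enumerate(sorted(set(names)), start=1)}


def encode(actions, city_vocab, region_vocab, today, max_len=4):
    """Return three equal-length lists: city ids, region ids, days to today."""
    # Keep only the newest max_len actions, oldest first.
    recent = sorted(actions, key=lambda a: a.day)[-max_len:]
    city_ids = [city_vocab[a.city] for a in recent]
    region_ids = [region_vocab[a.region] for a in recent]
    days = [float((today - a.day).days) for a in recent]
    # Left-pad so the most recent action is always the last token.
    pad = max_len - len(recent)
    return [PAD] * pad + city_ids, [PAD] * pad + region_ids, [0.0] * pad + days


history = [
    Action("Paris", "Ile-de-France", date(2024, 1, 10)),
    Action("San Jose", "Bay Area", date(2024, 5, 30)),
    Action("San Francisco", "Bay Area", date(2024, 6, 1)),
]
city_vocab = build_vocab(a.city for a in history)
region_vocab = build_vocab(a.region for a in history)
city_ids, region_ids, days = encode(
    history, city_vocab, region_vocab, today=date(2024, 6, 3)
)
print(city_ids)    # [0, 1, 3, 2]
print(region_ids)  # [0, 2, 1, 1]
print(days)        # [0.0, 145.0, 4.0, 2.0]
```

Left-padding keeps the newest action in the final position, which is the token the model reads its predictions from. A real pipeline would also pass a padding mask to the encoder so padded positions are ignored.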
Balancing Active vs. Dormant Users
One of the most interesting design decisions is how the training data is built for different user states:
- Active users (e.g., someone who searched in the Bay Area last week): Training examples use up-to-date booking, view, and search data from 1–7 days before the actual booking.
- Dormant users (e.g., someone who booked last year and hasn't returned): Training examples use only booking data from 8–365 days before the booking, simulating early-stage exploration.
This two-track training strategy is meant to make the model work well in both the "I know where I'm going" and the "I'm just looking" scenarios.
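The two windows can be sketched as a small filtering step over a user's history. This is a minimal illustration that reads the 1–7 day and 8–365 day ranges literally. The `build_examples` function and the `(kind, city, day)` tuple layout are hypothetical names chosen for the example, not Airbnb's actual data schema.

```python
from datetime import date, timedelta

ALL_KINDS = {"booking", "view", "search"}


def build_examples(actions, booking_day):
    """actions: list of (kind, city, day) tuples; booking_day: date of the target booking."""

    def window(lo, hi, kinds):
        # Keep actions of the given kinds that happened lo..hi days before the booking.
        return [
            a for a in actions
            if a[0] in kinds and lo <= (booking_day - a[2]).days <= hi
        ]

    active = window(1, 7, ALL_KINDS)        # recent bookings, views, and searches
    dormant = window(8, 365, {"booking"})   # older history, bookings only
    return active, dormant


booking_day = date(2024, 6, 10)
history = [
    ("booking", "Lisbon", booking_day - timedelta(days=200)),
    ("view", "Porto", booking_day - timedelta(days=30)),  # a view older than 7 days: unused
    ("search", "Faro", booking_day - timedelta(days=5)),
    ("view", "Faro", booking_day - timedelta(days=2)),
    ("view", "Faro", booking_day),  # same-day action: outside the 1-7 day window
]
active, dormant = build_examples(history, booking_day)
print([a[:2] for a in active])   # [('search', 'Faro'), ('view', 'Faro')]
print([a[:2] for a in dormant])  # [('booking', 'Lisbon')]
```

Both examples share the same label (the city that was eventually booked), so a single booking can teach the model what a fresh session looks like and what a year-old history looks like.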

Improving Location Understanding with Multi-Task Learning
Airbnb's location data is hierarchical: cities such as San Francisco and San Jose belong to the same region (the Bay Area). To encode this, the model uses multi-task learning:
- Two prediction heads: one for city-level, one for region-level destination
- Shared transformer backbone: learns representations that are useful for both granular and coarse predictions
- Consistency constraint: encourages the model to align city and region predictions
This approach is elegant because it needs no hand-built geographic features: the model learns the city-region hierarchy from data.
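The text does not spell out the form of the consistency constraint, so the following is one possible reading rather than Airbnb's actual loss. It sums a cross-entropy term per head and adds a KL penalty between the region distribution implied by the city head and the region head's own distribution. The `multitask_loss` function, the `city_to_region` lookup, and the `consistency_weight` value are all assumptions. It is written in plain Python so the arithmetic stays visible.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(city_logits, region_logits, city_target, region_target,
                   city_to_region, consistency_weight=0.1):
    p_city = softmax(city_logits)
    p_region = softmax(region_logits)
    # Standard cross-entropy for each head.
    city_loss = -math.log(p_city[city_target])
    region_loss = -math.log(p_region[region_target])
    # Region distribution implied by the city head: sum city probabilities per region.
    implied = [0.0] * len(region_logits)
    for city, region in enumerate(city_to_region):
        implied[region] += p_city[city]
    # KL(implied || p_region) is zero when the two heads agree.
    consistency = sum(
        q * math.log(q / p) for q, p in zip(implied, p_region) if q > 0
    )
    return city_loss + region_loss + consistency_weight * consistency


# Cities 0 and 1 (say San Francisco, San Jose) sit in region 0; city 2 sits in region 1.
city_to_region = [0, 0, 1]
city_logits = [2.0, 1.0, 0.0]

agree = multitask_loss(city_logits, [2.0, 0.0], 0, 0, city_to_region)
disagree = multitask_loss(city_logits, [0.0, 2.0], 0, 0, city_to_region)
print(agree < disagree)  # True
```

When the region head puts its mass on the region that contains the likeliest cities, the penalty shrinks toward zero. When the heads point at different regions, the penalty grows.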
Real-World Applications
The model powers two features:
- Autosuggest: When users tap the search bar, they see personalized city recommendations. A/B tests showed significant booking gains in non-English-speaking regions.
- Abandoned Search Emails: If a user leaves without booking, follow-up emails highlight listings from predicted destinations. This re-engages users and drives conversions.
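Both features come down to turning the city head's scores into a short ranked list. Here is a minimal sketch of that step. The `suggest_cities` helper and the `fallback` list of generic destinations for users with no history are hypothetical, and a production system would add business rules on top.

```python
import heapq


def suggest_cities(city_scores, city_names, k=3, fallback=()):
    """Return the k highest-scoring city names, or the fallback list when there are no scores."""
    if not city_scores:  # e.g. a user with no history to score
        return list(fallback)[:k]
    # Indices of the k largest scores, highest first.
    top = heapq.nlargest(k, range(len(city_scores)), key=city_scores.__getitem__)
    return [city_names[i] for i in top]


names = ["Paris", "Lyon", "Nice", "Lille"]
print(suggest_cities([0.1, 2.3, 1.7, -0.5], names, k=2))
# ['Lyon', 'Nice']
print(suggest_cities([], names, k=2, fallback=["Paris", "Rome", "Tokyo"]))
# ['Paris', 'Rome']
```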
Limitations and Caveats
- Cold start: Users with no history get generic recommendations. The model relies heavily on historical data.
- Privacy: Users can opt out of personalization, but the default is on. Always respect user consent.
- Seasonal bias: The model may overfit to seasonal patterns (e.g., always recommending beach destinations in summer).
- Interpretability: Transformers are black boxes. Debugging incorrect recommendations is hard.
Next Steps for Learning
- Explore graph neural networks for encoding location hierarchies more explicitly.
- Try reinforcement learning to optimize for long-term engagement, not just immediate booking.
- Experiment with contrastive learning to better separate different travel intents (e.g., business vs. leisure).
Conclusion
Airbnb's destination recommendation model is a clear example of applying transformer architectures to a real-world personalization problem. The key takeaways: treat user actions as tokens, balance active and dormant users through training data design, and use multi-task learning to encode domain knowledge. The result is a system that helps users discover where to go next and drives measurable business impact.