The Problem: Helping Exploratory Travelers Find Their Next Destination

When you open Airbnb without a clear destination in mind, you are exploring rather than simply browsing. These "exploratory" users behave differently from travelers with a fixed plan: they search broadly ("France"), visit less frequently, and rarely book immediately. Even so, they represent a large opportunity for engagement and conversion.

Airbnb's engineering team faced an unusual challenge: how do you recommend destinations to users who don't yet know where they want to go? Their answer is a destination recommendation model that blends short-term behavior (recent searches and views) with long-term preferences (past bookings, seasonal patterns).

This post breaks down the architecture, training strategy, and real-world deployment of this system. If you're building recommendation engines or personalization features, the insights here are directly applicable.

[Figure: Transformer-based destination recommendation model architecture]

Model Architecture: Transformers for Travel Intent

The core idea, borrowed from language models, is to treat each user action as a token. Here's how it works:

  • Inputs: Sequences of user actions (bookings, views, searches) + contextual signals (current time, seasonality)
  • Embedding: Each action is represented by the sum of embeddings for city, region, and "days to today"
  • Transformer: A standard transformer encoder processes these sequences to capture both short-term and long-term dependencies

# Simplified PyTorch sketch of the model's forward pass (illustrative only)
import torch
import torch.nn as nn

class DestinationTransformer(nn.Module):
    def __init__(self, num_cities, num_regions, d_model=256, nhead=8):
        super().__init__()
        self.city_embed = nn.Embedding(num_cities, d_model)
        self.region_embed = nn.Embedding(num_regions, d_model)
        self.time_embed = nn.Linear(1, d_model)  # days to today
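        self.transformer = nn.TransformerEncoder(
            # batch_first=True so inputs are (batch, seq_len, d_model), matching x[:, -1, :] in forward()
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True),
            num_layers=6
        )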
        # Multi-task heads: predict both city and region
        self.city_head = nn.Linear(d_model, num_cities)
        self.region_head = nn.Linear(d_model, num_regions)

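    def forward(self, city_ids, region_ids, days_to_today):
        # city_ids, region_ids: (batch, seq_len) integer tensors; days_to_today: (batch, seq_len)
        # Embed each action as the sum of its city, region, and recency embeddings
        time_feat = days_to_today.unsqueeze(-1).float()  # nn.Linear expects a float input
        x = self.city_embed(city_ids) + self.region_embed(region_ids) + self.time_embed(time_feat)
        # Transformer encoding
        x = self.transformer(x)
        # Multi-task predictions from the last position
        # (assumes the most recent action comes last and sequences are unpadded or left-padded)
        last = x[:, -1, :]
        city_logits = self.city_head(last)
        region_logits = self.region_head(last)
        return city_logits, region_logits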

Balancing Active vs. Dormant Users

One of the most interesting design decisions is how the team handles different user states:

  • Active users (e.g., someone who searched in the Bay Area last week): Training examples use up-to-date booking, view, and search data from 1–7 days before the actual booking.
  • Dormant users (e.g., someone who booked last year and hasn't returned): Training examples use only booking data from 8–365 days before the booking, simulating early-stage exploration.

This dual-training strategy is intended to make the model work well in both the "I know where I'm going" and the "I'm just looking" scenarios.
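
The two windows above can be sketched as a small helper that turns one user's history into an "active" and a "dormant" input sequence for a given booking. This is a minimal illustration and not Airbnb's pipeline: the `Action` record and the `build_examples` function are hypothetical names, and the sketch takes the 1–7 and 8–365 day windows literally from the description above.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Action:
    kind: str   # "booking", "view", or "search"
    city: str
    day: date


def build_examples(history, booking_day):
    """Return (active, dormant) input sequences for one target booking.

    active:  all action types from 1-7 days before the booking.
    dormant: bookings only, from 8-365 days before the booking.
    Both windows are assumptions taken from the text above.
    """
    active, dormant = [], []
    # Oldest first, so the most recent action ends up last in each sequence
    for action in sorted(history, key=lambda a: a.day):
        age = (booking_day - action.day).days
        if 1 <= age <= 7:
            active.append(action)
        elif 8 <= age <= 365 and action.kind == "booking":
            dormant.append(action)
    return active, dormant


booking_day = date(2024, 6, 30)
history = [
    Action("search", "Paris", booking_day - timedelta(days=3)),
    Action("booking", "Lyon", booking_day - timedelta(days=200)),
    Action("view", "Nice", booking_day - timedelta(days=30)),
    Action("booking", "Rome", booking_day - timedelta(days=400)),
]
active, dormant = build_examples(history, booking_day)
print([a.city for a in active], [a.city for a in dormant])
```

In this toy history, the view from 30 days ago is dropped from the dormant sequence because it is not a booking, and the booking from 400 days ago falls outside both windows.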

[Figure: Data flow showing how user actions are encoded as tokens for destination intent prediction]

Improving Location Understanding with Multi-Task Learning

Airbnb's geolocation data is hierarchical: cities such as San Francisco and San Jose belong to the same region (the Bay Area). To encode this, the model uses multi-task learning:

  • Two prediction heads: one for city-level, one for region-level destination
  • Shared transformer backbone: learns representations that are useful for both granular and coarse predictions
  • Consistency constraint: encourages the model to align city and region predictions

This approach is elegant because it needs no hand-built geographic features: the shared backbone picks up the city-to-region hierarchy from the two sets of labels.
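
To make the two-head objective concrete, here is a minimal sketch of a combined loss for a single training example, written in plain Python so it runs without PyTorch. The source does not say how the consistency constraint is implemented; the KL term below, which compares the region distribution implied by the city head with the region head's own output, is one plausible choice and should be read as an assumption. The names `multitask_loss`, `city_to_region`, and `consistency_weight` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(city_logits, region_logits, city_target,
                   city_to_region, consistency_weight=0.1):
    """City cross-entropy + region cross-entropy + weighted consistency term.

    city_to_region[i] is the region index of city i. The consistency term
    (an assumption, not from the source) is KL(implied || region_probs).
    """
    city_probs = softmax(city_logits)
    region_probs = softmax(region_logits)
    region_target = city_to_region[city_target]

    city_loss = -math.log(city_probs[city_target])
    region_loss = -math.log(region_probs[region_target])

    # Region distribution implied by summing city probabilities per region
    implied = [0.0] * len(region_logits)
    for city, p in enumerate(city_probs):
        implied[city_to_region[city]] += p

    consistency = sum(
        p * math.log(p / q) for p, q in zip(implied, region_probs) if p > 0
    )
    return city_loss + region_loss + consistency_weight * consistency


# Three cities; the first two share region 0 (think San Francisco and San Jose)
city_to_region = [0, 0, 1]
agree = multitask_loss([0.0, 0.0, 0.0], [math.log(2), 0.0], 0, city_to_region)
disagree = multitask_loss([0.0, 0.0, 0.0], [0.0, 5.0], 0, city_to_region)
print(agree < disagree)
```

When the region head agrees with what the city head implies, the consistency term vanishes and only the two cross-entropy terms remain; a region head that contradicts the city head is penalized both by its own cross-entropy and by the extra term.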

Real-World Applications

The model powers two features:

  1. Autosuggest: When users tap the search bar, they see personalized city recommendations. A/B tests showed significant booking gains in non-English-speaking regions.
  2. Abandoned Search Emails: If a user leaves without booking, follow-up emails highlight listings from predicted destinations. This re-engages users and drives conversions.
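
As a rough illustration of how city-level scores could feed either feature, the sketch below converts logits to probabilities and keeps the top few cities. How Airbnb actually ranks and filters suggestions is not described in the source, so `top_k_destinations` and its `exclude` parameter (for example, to skip the user's home city) are hypothetical.

```python
import heapq
import math


def top_k_destinations(city_logits, city_names, k=3, exclude=()):
    """Return the k highest-probability (probability, city) pairs, best first."""
    # Stable softmax over all cities, computed before any filtering
    m = max(city_logits)
    exps = [math.exp(x - m) for x in city_logits]
    total = sum(exps)
    scored = [
        (e / total, name)
        for e, name in zip(exps, city_names)
        if name not in exclude
    ]
    return heapq.nlargest(k, scored)


names = ["Paris", "Lyon", "Nice", "Lille"]
logits = [2.0, 0.5, 1.0, -1.0]
print([name for _, name in top_k_destinations(logits, names, k=2)])
print([name for _, name in top_k_destinations(logits, names, k=2, exclude={"Paris"})])
```

Because probabilities are computed before filtering, excluding a city leaves the remaining scores unnormalized; that does not affect the ranking, only how the numbers should be read.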

[Figure: Mobile phone displaying the Airbnb autosuggest feature with destination recommendations]

Limitations and Caveats

  • Cold start: Users with no history get generic recommendations. The model relies heavily on historical data.
  • Privacy: Users can opt out of personalization, but the default is on. Always respect user consent.
  • Seasonal bias: The model may overfit to seasonal patterns (e.g., always recommending beach destinations in summer).
  • Interpretability: Transformers are black boxes. Debugging incorrect recommendations is hard.

Next Steps for Learning

  • Explore graph neural networks for encoding location hierarchies more explicitly.
  • Try reinforcement learning to optimize for long-term engagement, not just immediate booking.
  • Experiment with contrastive learning to better separate different travel intents (e.g., business vs. leisure).

Conclusion

Airbnb's destination recommendation model shows how transformer architectures can be applied to a real-world personalization problem. The key takeaways: treat user actions as tokens, balance active and dormant users through training data design, and use multi-task learning to encode domain knowledge. The result is a system that helps users discover where to go next and, according to the A/B tests on autosuggest, drives measurable business impact.

This content was drafted using AI tools based on reliable sources, and has been reviewed by our editorial team before publication. It is not intended to replace professional advice.