By Shekhar Khandelwal

15 August 2023

How to use Bayesian Hierarchical Marketing Mix Modeling (BHMMM) to redefine marketing strategies at a regional level

Welcome back to the world of Marketing Mix Modeling (MMM) with a Bayesian twist. In our last article, we introduced Bayesian MMM and showed how it improves on traditional marketing models by incorporating prior knowledge and adapting to new data. Today, we go a step further and look at one of its most useful capabilities: handling hierarchical data.

In this article, we will talk about:

What is hierarchical data?
How do traditional MMMs model hierarchical data?
Hierarchical modeling using the Bayesian approach

Firstly, what is Hierarchical Data?

Before we look at what Bayesian MMM can do with hierarchical data, let's quickly understand what hierarchical data is and why it is so valuable for marketers. Imagine a multinational company selling multiple products in different regions. The sales data can be structured at various levels: country, state, city, and product category. This multi-level structure is hierarchical data.

Definition: Hierarchical Data

Hierarchical data is information that's organized in a structure similar to a pyramid or a tree, with different levels of categories and subcategories.

In marketing, consider a global brand's advertising data. At the top level, you might have different regions like North America, Europe, and Asia. Within each region, there are individual countries. Each country may be split into different states or provinces, and each state might be further divided into different cities or towns. For each city, you might track different types of advertising spend, like television, radio, print, or online. So, the data is organized hierarchically - from regions down to specific types of ad spend in individual cities. Each level of this hierarchy is part of the larger whole, and data at each level can provide different insights for the brand's marketing strategy.
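As a minimal sketch of this structure, the hypothetical records below tag each ad-spend row with its region, country, and city, and a small helper rolls spend up to any level of the hierarchy. All names and figures are invented for illustration.

```python
from collections import defaultdict

# Hypothetical ad-spend records: each row carries its full path in the hierarchy.
records = [
    {"region": "Europe", "country": "Germany", "city": "Berlin", "channel": "tv", "spend": 120.0},
    {"region": "Europe", "country": "Germany", "city": "Munich", "channel": "online", "spend": 80.0},
    {"region": "Europe", "country": "France", "city": "Paris", "channel": "radio", "spend": 50.0},
    {"region": "North America", "country": "USA", "city": "Boston", "channel": "tv", "spend": 200.0},
]


def roll_up(rows, levels):
    """Sum spend over every combination of the given hierarchy levels."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[level] for level in levels)
        totals[key] += row["spend"]
    return dict(totals)


by_region = roll_up(records, ["region"])
by_country = roll_up(records, ["region", "country"])
print(by_region)
print(by_country)
```

The same rows answer questions at different levels: passing a longer list of levels gives a finer breakdown, a shorter list a coarser one.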

For marketers and advertising agencies, this data is gold. The ability to analyze and understand the interplay between different levels of hierarchy can reveal insights about customer preferences, regional trends, and product performances. But, here's the catch: traditional MMM struggles to accurately capture the relationships in such data.

How do traditional MMMs model hierarchical data?

Within traditional machine-learning-based marketing mix models, two primary approaches emerge: the pooled and the unpooled models.

Pooled Model (Complete Pooling):

In a pooled model, we don't differentiate between the different hierarchical groups. We treat all data as if it comes from a single group. This is akin to aggregating all the data and running a single regression model on it.

Building one comprehensive model is often the most straightforward approach: all samples are aggregated, and distinctions between different groups are disregarded.

However, this approach, especially when using simple models like linear regression, might overlook nuances in the data, a phenomenon known as underfitting. More complex "black-box" methods, such as gradient boosting, might detect and learn from the different sub-datasets on their own, offering potentially better accuracy. But this comes at the expense of interpretability, making it challenging to understand the model's underlying mechanics and decisions.

Here's the visualization of the pooled model:

The dots represent sales data for three regions (A, B, and C) against different advertising spends.
The black line is the regression line for the pooled model, which is fitted to all the data across regions.

Notice that while the black line captures the general trend, it does not capture the individual trend of each region well.
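To make complete pooling concrete, here is a small sketch that fits one least-squares line to made-up data from three regions. The data, the region labels, and the helper name fit_line are assumptions for illustration, not taken from a real campaign.

```python
# Hypothetical (ad spend, sales) observations for three regions.
data = {
    "A": [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)],     # sales = 1 + 2.0 * spend
    "B": [(1.0, 11.0), (2.0, 12.0), (3.0, 13.0)],  # sales = 10 + 1.0 * spend
    "C": [(1.0, 20.5), (2.0, 21.0), (3.0, 21.5)],  # sales = 20 + 0.5 * spend
}


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Complete pooling: ignore the region labels and fit one line to everything.
pooled_points = [p for points in data.values() for p in points]
intercept, slope = fit_line(pooled_points)
print(f"pooled: sales = {intercept:.2f} + {slope:.2f} * spend")
```

With this toy data the pooled slope comes out at roughly 1.17, which matches none of the three regional slopes (2.0, 1.0, and 0.5) used to generate it.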

Unpooled Model:

In an unpooled approach, by contrast, we create a separate model for each group. So, if we have data for 3 regions, we'll run 3 separate regression models, one for each region. These models are referred to as unpooled models.

While each model specializes in a specific subset of the data, collectively, they aim to provide a comprehensive understanding of the entire dataset. The primary advantage is that these models can capture specific trends and nuances within each subset.

Here's the visualization for the unpooled models:

The dots still represent the sales data for the three regions (A, B, and C) against different advertising spends.
The colored lines are the regression lines for each of the regions:

Red line represents the regression for Region A
Green line represents the regression for Region B
Blue line represents the regression for Region C

You can observe that the individual regression lines fit their respective regional data better than the pooled model did. This is the advantage of the unpooled approach: it can capture nuances and variations specific to each group.
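The unpooled approach can be sketched the same way: one least-squares fit per region. The data and the helper name fit_line are again hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical (ad spend, sales) observations for three regions.
data = {
    "A": [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)],     # sales = 1 + 2.0 * spend
    "B": [(1.0, 11.0), (2.0, 12.0), (3.0, 13.0)],  # sales = 10 + 1.0 * spend
    "C": [(1.0, 20.5), (2.0, 21.0), (3.0, 21.5)],  # sales = 20 + 0.5 * spend
}


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# No pooling: fit a separate line for each region, using only that region's data.
unpooled = {region: fit_line(points) for region, points in data.items()}
for region, (intercept, slope) in unpooled.items():
    print(f"region {region}: sales = {intercept:.2f} + {slope:.2f} * spend")
```

Each regional fit recovers its own slope and intercept, but it sees only three observations, and nothing learned in one region informs the others.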

However, there are challenges:

The need to fit multiple models, which can be computationally intensive.
The potential for overfitting, especially when the datasets for individual groups are small.

To summarize:

Pooled Model: Provides a general trend across all data but might miss nuances in individual groups.
Unpooled Model: Captures the trends within individual groups more accurately, but you end up with multiple models.

Enter Bayesian MMM with its magic wand.

Hierarchical Modeling using the Bayesian Approach

Now, let's discuss the hierarchical Bayesian approach.

While both pooled and unpooled methods have their advantages, they also have drawbacks:

The pooled approach might be too generalized and miss out on nuances in the data.
The unpooled approach can capture nuances but might overfit to individual groups, especially when some groups have limited data. It also multiplies the number of models to build and analyze, which quickly becomes impractical as the number of groups grows.

Hierarchical models, which are often implemented using Bayesian techniques, offer a middle ground between pooled and unpooled models. Here's a simple explanation:

Partial Pooling: Hierarchical models allow for "partial pooling", meaning they share information across groups (like the pooled model) but also allow for group-specific effects (like the unpooled models).

Here's how:

Information Sharing: When analyzing sales of a new product, some regions might lack data. Bayesian MMM cleverly uses information from well-represented regions to improve predictions in data-scarce areas. This technique is invaluable for diverse businesses with varying regional characteristics.

Managing Complexity: While traditional models can stumble over intricate hierarchical data, Bayesian MMM thrives on it. It adeptly captures interactions, such as a local promotion's impact on broader sales, while also recognizing ongoing trends and seasonal shifts.

Leveraging History and Expertise: The Bayesian method integrates past data and expert insights. This is crucial when navigating hierarchical structures, ensuring decisions are grounded in both historical context and specialized knowledge.

Balanced Modeling: Bayesian MMM operates on the idea that each region or group, while unique, is part of a larger shared pattern. If a region's data suggests a significant deviation, the model adjusts. This balance between individuality and shared trends ensures both specificity and broad applicability.
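One common way to write this idea down is a varying-slope regression in which each region r has its own advertising effect, drawn from a distribution shared by all regions. The notation and the specific priors below are illustrative assumptions, not the specification of any particular MMM tool.

```latex
\begin{aligned}
\text{sales}_{r,t} &\sim \mathcal{N}\left(\alpha_r + \beta_r \, \text{spend}_{r,t},\ \sigma^2\right) \\
\beta_r &\sim \mathcal{N}\left(\mu_\beta,\ \tau^2\right) \\
\mu_\beta &\sim \mathcal{N}(0, 1), \qquad \tau \sim \text{HalfNormal}(1)
\end{aligned}
```

Here \tau controls how much the regions are allowed to differ. As \tau approaches zero, every \beta_r collapses to the shared mean \mu_\beta, which is the pooled model. As \tau grows large, each \beta_r is estimated almost independently, which approaches the unpooled models.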

In essence, hierarchical Bayesian modeling blends the strengths of both pooled and unpooled approaches. It allows for individual group differences while also benefiting from the shared information across groups. In the context of marketing mix models, this can lead to more robust insights into the effects of different marketing levers across various segments or regions.
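The following sketch shows the shrinkage effect behind partial pooling in its simplest form: a normal-normal model with fixed variances, where each region's estimate is a precision-weighted average of its own mean and the shared mean. This closed-form shortcut is a simplification. A full Bayesian fit would also estimate the variances, typically by MCMC sampling. The data, SIGMA2, TAU2, and the function names are assumptions for illustration.

```python
# Hypothetical weekly sales uplift per region; region "C" has very little data.
uplift = {
    "A": [2.0, 2.2, 1.8, 2.1, 1.9, 2.0, 2.1, 1.9],
    "B": [1.0, 1.2, 0.8, 1.1, 0.9, 1.0],
    "C": [4.0],
}

SIGMA2 = 1.0   # assumed within-region noise variance
TAU2 = 0.25    # assumed between-region variance of the true effects


def mean(values):
    return sum(values) / len(values)


# Shared pattern: average of the per-region means (a simple plug-in estimate).
mu = mean([mean(v) for v in uplift.values()])


def partial_pool(values, mu, sigma2=SIGMA2, tau2=TAU2):
    """Posterior mean of a region's effect under a normal-normal model."""
    data_precision = len(values) / sigma2
    prior_precision = 1.0 / tau2
    weight = data_precision / (data_precision + prior_precision)
    return weight * mean(values) + (1.0 - weight) * mu


for region, values in uplift.items():
    print(region, round(mean(values), 2), round(partial_pool(values, mu), 2))
```

Region C, with a single observation, is pulled strongly toward the shared mean, while regions A and B, with more data, stay closer to their own averages.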

Illustrative Example: The Coffee Shop Chain

Imagine you're the marketing manager for "BeanStreet," a thriving coffee shop chain with locations spread across various neighborhoods, cities, and states.

You're planning a new advertising campaign and want to know which types of ads (TV, social media, radio) work best in different areas, and how external factors such as weather conditions and local events affect sales.

The Challenge

Your data is hierarchical; you have coffee shops (level 1) nested within neighborhoods (level 2), which are part of cities (level 3), which belong to different states (level 4). The challenge is understanding the interplay between advertising mediums and external factors across these levels to optimize your marketing strategy.
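In code, a nested structure like this is often represented by giving every shop an integer index at each level, so that a model can look up the parameter of the group it belongs to. The sketch below uses invented shop, neighborhood, city, and state labels. The helper names build_index and parent_map are hypothetical.

```python
# Hypothetical BeanStreet shops, each tagged with its parent at every level.
shops = [
    {"shop": "S1", "neighborhood": "Z", "city": "X", "state": "A"},
    {"shop": "S2", "neighborhood": "Z", "city": "X", "state": "A"},
    {"shop": "S3", "neighborhood": "Q", "city": "X", "state": "A"},
    {"shop": "S4", "neighborhood": "R", "city": "Y", "state": "B"},
]


def build_index(rows, level):
    """Return the sorted labels at `level` and each row's integer code."""
    labels = sorted({row[level] for row in rows})
    code = {label: i for i, label in enumerate(labels)}
    return labels, [code[row[level]] for row in rows]


def parent_map(rows, child, parent):
    """Map each child label to the parent label it is nested in."""
    return {row[child]: row[parent] for row in rows}


neighborhoods, shop_to_neighborhood = build_index(shops, "neighborhood")
cities, shop_to_city = build_index(shops, "city")
neighborhood_to_city = parent_map(shops, "neighborhood", "city")
city_to_state = parent_map(shops, "city", "state")

print(neighborhoods, shop_to_neighborhood)
print(neighborhood_to_city, city_to_state)
```

With these lookups, an effect estimated for a neighborhood can be tied to its city and state, which is what allows information to flow between levels.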

Enter Bayesian MMM

You decide to employ Bayesian Marketing Mix Modeling to tackle this challenge.

Understanding State Preferences: Bayesian MMM helps you identify that social media ads are more effective in State A, while TV ads have a greater impact in State B. You realize that State A has a younger demographic, and they respond more to social media promotions.

Tailoring to Weather Patterns: You find that hot coffee sales spike in colder weather. The model, using data from different levels, reveals that this trend is particularly strong in City X, which has long, cold winters. In contrast, cold brews do better in warmer climates like in City Y.

Local Event Influence: Bayesian MMM indicates that local events in certain neighborhoods significantly boost sales. For example, during a popular art festival in Neighborhood Z, sales triple. You didn’t have enough data on Neighborhood Z specifically, but the model borrows information from similar neighborhoods to make this prediction.

Making Informed Decisions

With these insights from Bayesian MMM, you make a series of data-driven decisions:

Allocate more budget to social media advertising in State A and to TV advertising in State B.
Launch special hot coffee promotions in City X during winter and cold brew campaigns in City Y during summer.
Collaborate with event organizers in Neighborhood Z to offer special discounts or limited-edition menu items during the art festival.

This example demonstrates how Bayesian Marketing Mix Modeling’s adeptness at handling hierarchical data empowers BeanStreet to make highly targeted and effective marketing decisions that account for variances at each level of hierarchy – from state down to individual neighborhoods.

How MMT could help you

If you are interested in setting up a marketing mix model for your company, we would be happy to support you in this process, either as a consultant or with our Self-Service Marketing Mix Modeling Software, depending on your needs and available expertise. Feel free to get in touch with us!

The Way Forward

For marketers and advertising agencies, hierarchical data is like a well of insights waiting to be tapped. Bayesian Hierarchical Marketing Mix Modeling (BHMMM), with its ability to handle the complexities of hierarchical data, is the rope that will help you draw water from this well.

Whether you are trying to optimize advertising campaigns or looking to better understand regional trends, BHMMM offers a powerful and adaptable tool for turning data into insights and insights into action.

So, dive into the world of hierarchical data with BHMMM, where data tells stories and insights drive success.