Sales Funnel visualization with R
Sales (purchasing or conversion) Funnel is a consumer-focused marketing model which illustrates the theoretical customer journey towards the purchase of a product or service. Classically, the Sales funnel includes at least four steps:
There can be more steps and this depends on the business model and level of detail you want to/can represent. Obviously, there is quite a lot of creative work in order to establish the model and define the steps of the customer journey in practice. Once you managed this it is always interesting to see the result in visual form. Usually, the Sales Funnel looks like a tornado and it narrows from the top to the bottom. Therefore, the Sales Funnel allows you to find where there is a
bottle funnel neck or what is the step where you lose customers.
We will study a very simple example of creating a Sales Funnel by focusing on visualization in this post.
If we speak about e-commerce, we can use content of the site for defining steps. Let’s assume that our main page and landing pages of ad campaigns are responsible for the Awareness step. The pages with product or service descriptions correspond to the Interest from customers. Shopping cart page confirms the Desire. Finally, the conversion page (“thank you page”) is the Action. In other words, our theoretical customer journey looks like the following:
Note: you can distribute all pages to the Sales Funnel steps with the content grouping option in Google Analytics. Further, you can find a lot of examples on how to extract data from Google Analytics directly from R via API. And if you use the Google Analytics content grouping option you can extract these groups.
Let’s say we have a dataset with the grouped pages and number of sessions (visits). Further, we need to define which pages belong to each step of the funnel:
Furthermore, we can make some improvements to the bottom of the Sales Funnel. If you use LifeCycle Grids you can identify the customer’s lifecycle phase. Therefore, we can represent a breakdown of customers who made purchases into their status (e.g. new customer, engaged and loyal). If you are not familiar with the LifeCycle Grids approach, please start here.
Here is our data set with the number of customers and their statuses:
By using the same column names with the content table, it is easier to combine data.
The logic behind the R code is simple: we need to combine tables, define the order of steps, calculate dummy values for centering bars, calculate a share of session proceeded to the next step and plot the result:
Let’s go through the plot: there are stacked bars on the top of the chart that represent the total number of sessions (visits) at each step of Sales Funnel. 78.3% of sessions went from the Awareness step to the Interest one, 72.2% from the Interest to the Desire and 92.3% from the Desire to the Action. Even though it is not necessary, we used another logic at the bottom of the Funnel. If we can identify the customer’s type, we can split the total number of Actions (120K in our case) into different customer types (e.g. new, engaged and loyal customers). Further, we calculated the share of each customer type in the Action (20.8%, 33.3% and 45.8%). If we aren’t new in the market and have a number of loyal customers, we would take the hourglass view of the Sales Funnel.
click to expand R code
library(dplyr) library(ggplot2) library(reshape2) # creating a data samples # content df.content <- data.frame(content = c('main', 'ad landing', 'product 1', 'product 2', 'product 3', 'product 4', 'shopping cart', 'thank you page'), step = c('awareness', 'awareness', 'interest', 'interest', 'interest', 'interest', 'desire', 'action'), number = c(150000, 80000, 80000, 40000, 35000, 25000, 130000, 120000)) # customers df.customers <- data.frame(content = c('new', 'engaged', 'loyal'), step = c('new', 'engaged', 'loyal'), number = c(25000, 40000, 55000)) # combining two data sets df.all <- rbind(df.content, df.customers) # calculating dummies, max and min values of X for plotting df.all <- df.all %>% group_by(step) %>% mutate(totnum = sum(number)) %>% ungroup() %>% mutate(dum = (max(totnum) - totnum)/2, maxx = totnum + dum, minx = dum) # data frame for plotting funnel lines df.lines <- df.all %>% distinct(step, maxx, minx) # data frame with dummies df.dum <- df.all %>% distinct(step, dum) %>% mutate(content = 'dummy', number = dum) %>% select(content, step, number) # data frame with rates conv <- df.all$totnum[df.all$step == 'action'] df.rates <- df.all %>% distinct(step, totnum) %>% mutate(prevnum = lag(totnum), rate = ifelse(step == 'new' | step == 'engaged' | step == 'loyal', round(totnum / conv, 3), round(totnum / prevnum, 3))) %>% select(step, rate) df.rates <- na.omit(df.rates) # creting final data frame df.all <- df.all %>% select(content, step, number) df.all <- rbind(df.all, df.dum) # defining order of steps df.all$step <- factor(df.all$step, levels = c('loyal', 'engaged', 'new', 'action', 'desire', 'interest', 'awareness')) df.all <- df.all %>% arrange(desc(step)) list1 <- df.all %>% distinct(content) %>% filter(content != 'dummy') df.all$content <- factor(df.all$content, levels = c(as.character(list1$content), 'dummy')) # calculating position of labels df.all <- df.all %>% arrange(step, desc(content)) %>% group_by(step) %>% mutate(pos = cumsum(number) - 0.5*number) %>% ungroup() # creating custom palette with 'white' color for dummies cols <- c("#fec44f", "#fc9272", "#a1d99b", "#fee0d2", "#2ca25f", "#8856a7", "#43a2ca", "#fdbb84", "#e34a33", "#a6bddb", "#dd1c77", "#ffffff") # plotting chart ggplot() + theme_minimal() + coord_flip() + scale_fill_manual(values=cols) + geom_bar(data=df.all, aes(x=step, y=number, fill=content), stat="identity", width=1) + geom_text(data=df.all[df.all$content!='dummy', ], aes(x=step, y=pos, label=paste0(content, '-', number/1000, 'K')), size=4, color='white', fontface="bold") + geom_ribbon(data=df.lines, aes(x=step, ymax=max(maxx), ymin=maxx, group=1), fill='white') + geom_line(data=df.lines, aes(x=step, y=maxx, group=1), color='darkred', size=4) + geom_ribbon(data=df.lines, aes(x=step, ymax=minx, ymin=min(minx), group=1), fill='white') + geom_line(data=df.lines, aes(x=step, y=minx, group=1), color='darkred', size=4) + geom_text(data=df.rates, aes(x=step, y=(df.lines$minx[-1]), label=paste0(rate*100, '%')), hjust=1.2, color='darkblue', fontface="bold") + theme(legend.position='none', axis.ticks=element_blank(), axis.text.x=element_blank(), axis.title.x=element_blank())
Get new post notification