# Portfolio optimization applied to marketing

What do option pricing and portfolio optimization have to do with digital marketing? Let’s first recap the basic ideas behind options and investments:

- options are a type of security, more specifically a financial derivative, which can be bought or sold
- the price of an option is determined by the stock market but nevertheless shows strong stochastic fluctuations
- a portfolio is usually a combination of cash, bonds, options etc. but can be considered, for our purposes, as just a collection of options
- the investment question related to a portfolio is in essence: given a collection of options, how much of each should one buy in order to optimize the return? The problem comes down to balancing high-return/high-risk options against low-return/low-risk ones.

So, portfolio optimization is related to digital marketing through the following mapping (homomorphism):

- an option is just a stochastic variable, with the risk corresponding to the standard deviation and the return corresponding to the expected value. Typically, a high expected value is linked to a large variation, i.e. risk.
- a feature is just a stochastic variable giving the usage, with a high usage presumably corresponding to a high chance of classifying the client (buy). One can discard the noisy features so that high-usage features carrying only non-buying information do not distort the picture.
- a large deviation corresponds to attracting a larger number of clients and presenting information not immediately related to the buying process
- a portfolio corresponds to a selection of features which can be presented as marketing optimizations
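
The mapping above can be made concrete in a few lines of Python. This is only a sketch: the feature names and daily usage counts are invented for illustration, and the "return" and "risk" of a feature are simply taken to be the sample mean and sample standard deviation of its usage.

```python
from statistics import mean, stdev

# Hypothetical daily usage counts per feature (invented numbers).
usage = {
    "search_box": [120, 135, 128, 140, 131],
    "promo_banner": [40, 210, 15, 180, 60],
    "checkout_link": [55, 58, 54, 57, 56],
}


def risk_return(series):
    """Return (expected usage, risk) as (sample mean, sample standard deviation)."""
    return mean(series), stdev(series)


# One (return, risk) pair per feature, exactly as one would compute for an asset.
profile = {name: risk_return(series) for name, series in usage.items()}

for name, (expected, risk) in profile.items():
    print(f"{name}: expected usage {expected:.1f}, risk {risk:.1f}")
```

In this toy data the banner has a usage comparable to the other features but a much larger spread, which is the marketing analogue of a high-risk asset.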

Loosely speaking, this leads to the following table:

Finance | Marketing | Advertising |
---|---|---|
Option | Feature | Ad |
Price | Usage | Clicks |
Portfolio | Suggestions | Placement |
Profit | Conversions | Clicks |
Risk | Sales drop | Click-through drop |

According to the information available (mostly US patents, scientific articles and newspapers), the way Efficient Frontier and Adobe use this technique relies on the following steps:

- Stage 1
  - tracking: usage of keywords, pages and other data
  - search engines: data from search engine analytics
  - keyword scoring and relations between pages (page ranking)
- Stage 2
  - model creation and parameter estimation; forecasting of how many hits a feature will have **within a geographic region**
- Stage 3
  - comparing the forecast with reality and tuning the model; the model is updated **continuously**
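
The forecast, compare and tune loop of stages 2 and 3 can be sketched as follows. The sources do not say which model is actually used, so simple exponential smoothing stands in here as an assumption; the hit counts and the smoothing factor `alpha` are invented for illustration.

```python
def update_forecast(forecast, observed, alpha=0.3):
    """Move the forecast toward the observed value (simple exponential smoothing)."""
    return forecast + alpha * (observed - forecast)


# Hypothetical hits for one feature in one geographic region, arriving day by day.
observed_hits = [100, 110, 105, 130, 125]

forecast = observed_hits[0]  # start from the first observation
errors = []
for hits in observed_hits[1:]:
    errors.append(hits - forecast)              # compare the forecast with reality
    forecast = update_forecast(forecast, hits)  # tune the model with the new data

print(f"forecast for the next day: {forecast:.1f}")
```

A larger `alpha` makes the model react faster to recent data at the price of a noisier forecast.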

### Details

The following is a slightly more mathematical translation of the portfolio optimization described above.

Let’s say you want to invest some money in the stock market. You choose a set of stocks and have a sum of money to invest. How should you distribute the money into the different stocks? There is a general tradeoff between risk and return; with higher potential return we often face higher risk. If we have a goal return in mind, then we should choose the portfolio allocation that minimizes the risk for that return. How can we do this?

Say we have a series of stocks

```python
symbols = ['GOOG', 'AIMC', 'GS', 'BH', 'TM', 'F', 'HLS', 'DIS', 'LUV', 'MSFT']
```

How do we find a weight vector $x$, with entries summing to one, that optimizes the return?

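You need to look at the assets as stochastic variables $X_i$; the past values tell you the mean $\mu_i$, the variance and the other statistical parameters of each variable. Once the stochastic variables are parametrized, the optimization starts from the expectation of the portfolio return $R = \sum_i x_i X_i$,

$$E[R] = \sum_i x_i \mu_i = \mu^T x$$

but one usually looks at the quadratic form instead, i.e. one minimizes the variance (the risk) of the portfolio

$$\mathrm{Var}(R) = x^T C x$$

with $C$ the covariance matrix. The goal of having a minimum return $r_{min}$ amounts to

$$\mu^T x \geq r_{min}$$

and the additional constraints are, in the standard Markowitz formulation,

$$\sum_i x_i = 1, \qquad x_i \geq 0$$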

This is a so-called convex optimization problem which can be solved via a quadratic programming technique.
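
As a minimal sketch of such a quadratic program, the following pure-Python code minimizes the portfolio variance for a fixed target return with the weights summing to one. It is a simplification: the no-short-selling constraint is left out, so the problem reduces to a linear (KKT) system that plain Gaussian elimination can solve. The expected returns `mu` and the covariance matrix `cov` are invented numbers, and in practice one would use a dedicated QP solver.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def portfolio_variance(cov, x):
    """Compute x^T C x."""
    n = len(x)
    return sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))


def min_variance_weights(cov, mu, target):
    """Minimize x^T C x subject to mu^T x = target and sum(x) = 1.

    Solves the KKT system of the equality-constrained problem;
    the constraint x_i >= 0 is NOT enforced in this sketch.
    """
    n = len(mu)
    # Stationarity rows: 2 C x + l1 * mu + l2 * 1 = 0
    kkt = [[2 * cov[i][j] for j in range(n)] + [mu[i], 1.0] for i in range(n)]
    kkt.append(list(mu) + [0.0, 0.0])   # mu^T x = target
    kkt.append([1.0] * n + [0.0, 0.0])  # sum(x) = 1
    rhs = [0.0] * n + [target, 1.0]
    return solve_linear(kkt, rhs)[:n]


# Invented example data for three assets (or features).
mu = [0.10, 0.05, 0.02]
cov = [
    [0.04, 0.006, 0.0],
    [0.006, 0.01, 0.001],
    [0.0, 0.001, 0.0025],
]

weights = min_variance_weights(cov, mu, target=0.06)
print([round(w, 4) for w in weights], portfolio_variance(cov, weights))
```

Repeating the call for a range of target returns traces out the risk/return curve one point at a time.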

Solving this problem for a range of target returns yields a curve, the efficient frontier, which shows the optimal return for a given risk. Below the curve a portfolio has too low a return for the given risk.
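
To get a feeling for the shape of such a risk/return curve, here is a small sketch for the simplest case of two assets, where the portfolio variance has a closed form. The means, standard deviations and correlation are invented values.

```python
from math import sqrt


def two_asset_curve(mu1, mu2, sd1, sd2, rho, steps=10):
    """Return (risk, return) pairs for weights w = 0, 1/steps, ..., 1 on asset 1."""
    points = []
    for k in range(steps + 1):
        w = k / steps
        ret = w * mu1 + (1 - w) * mu2
        var = (w * sd1) ** 2 + ((1 - w) * sd2) ** 2 + 2 * w * (1 - w) * rho * sd1 * sd2
        points.append((sqrt(var), ret))
    return points


# Asset 1: high return, high risk. Asset 2: low return, low risk.
points = two_asset_curve(mu1=0.10, mu2=0.04, sd1=0.20, sd2=0.10, rho=0.2)

for risk, ret in points:
    print(f"risk {risk:.4f}  return {ret:.4f}")
```

With a low correlation, mixing in a little of the risky asset lowers the total risk below that of the safe asset alone; the part of the curve above that minimum-risk point is the efficient part.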

### Concrete R code to calculate the efficient frontier

Many resources across the web give information in Python, R and other languages. The following was taken from Building an Optimized Portfolio with R. The site in fact contains a whole course on so-called Modern Portfolio Theory (MPT), which is equivalent to the Markowitz theory and the efficient frontier.

Note that “efficient frontier” is both a technical term from MPT and the name of the company that Adobe bought on the way to their digital marketing cloud.

```r
library(stockPortfolio)
library(quadprog)
library(ggplot2)

stocks <- c("SPY", "EFA", "IWM", "VWO", "LQD", "HYG")
returns <- getReturns(stocks, freq = "week")

eff.frontier <- function (returns, short = "no", max.allocation = NULL,
                          risk.premium.up = .5, risk.increment = .005) {
  covariance <- cov(returns)
  print(covariance)
  n <- ncol(covariance)

  # Create initial Amat and bvec assuming only equality constraint
  # (short-selling is allowed, no allocation constraints)
  Amat <- matrix (1, nrow = n)
  bvec <- 1
  meq <- 1

  # Then modify the Amat and bvec if short-selling is prohibited
  if (short == "no") {
    Amat <- cbind(1, diag(n))
    bvec <- c(bvec, rep(0, n))
  }

  # And modify Amat and bvec if a max allocation (concentration) is specified
  if (!is.null(max.allocation)) {
    if (max.allocation > 1 | max.allocation < 0) {
      stop("max.allocation must be greater than 0 and less than 1")
    }
    if (max.allocation * n < 1) {
      stop("Need to set max.allocation higher; not enough assets to add to 1")
    }
    Amat <- cbind(Amat, -diag(n))
    bvec <- c(bvec, rep(-max.allocation, n))
  }

  # Calculate the number of loops based on how high to vary the risk premium
  # and by what increment
  loops <- risk.premium.up / risk.increment + 1
  loop <- 1

  # Initialize a matrix to contain allocation and statistics
  # This is not necessary, but speeds up processing and uses less memory
  eff <- matrix(nrow = loops, ncol = n + 3)
  # Now I need to give the matrix column names
  colnames(eff) <- c(colnames(returns), "Std.Dev", "Exp.Return", "sharpe")

  # Loop through the quadratic program solver
  for (i in seq(from = 0, to = risk.premium.up, by = risk.increment)) {
    dvec <- colMeans(returns) * i # This moves the solution up along the efficient frontier
    sol <- solve.QP(covariance, dvec = dvec, Amat = Amat, bvec = bvec, meq = meq)
    eff[loop, "Std.Dev"] <- sqrt(sum(sol$solution * colSums((covariance * sol$solution))))
    eff[loop, "Exp.Return"] <- as.numeric(sol$solution %*% colMeans(returns))
    eff[loop, "sharpe"] <- eff[loop, "Exp.Return"] / eff[loop, "Std.Dev"]
    eff[loop, 1:n] <- sol$solution
    loop <- loop + 1
  }

  return(as.data.frame(eff))
}

eff <- eff.frontier(
  returns = returns$R,
  short = "yes",
  max.allocation = .45,
  risk.premium.up = .5,
  risk.increment = .001
)

eff.optimal.point <- eff[eff$sharpe == max(eff$sharpe),]

ealred <- "#7D110C"
ealtan <- "#CDC4B6"
eallighttan <- "#F7F6F0"
ealdark <- "#423C30"

ggplot(eff, aes(x = Std.Dev, y = Exp.Return)) +
  geom_point(alpha = .1, color = ealdark) +
  geom_point(
    data = eff.optimal.point,
    aes(x = Std.Dev, y = Exp.Return, label = sharpe),
    color = ealred,
    size = 5
  ) +
  annotate(
    geom = "text",
    x = eff.optimal.point$Std.Dev,
    y = eff.optimal.point$Exp.Return,
    label = paste(
      "Risk: ", round(eff.optimal.point$Std.Dev * 100, digits = 3),
      "\nReturn: ", round(eff.optimal.point$Exp.Return * 100, digits = 4),
      "%\nSharpe: ", round(eff.optimal.point$sharpe * 100, digits = 2), "%",
      sep = ""
    ),
    hjust = 0,
    vjust = 1.2
  ) +
  ggtitle("Efficient Frontier\nand Optimal Portfolio") +
  labs(x = "Risk (standard deviation of portfolio variance)", y = "Return") +
  theme(
    panel.background = element_rect(fill = eallighttan),
    text = element_text(color = ealdark),
    plot.title = element_text(size = 24, color = ealred)
  )
```

The key concept in all of this is the **mean-variance optimization approach (Markowitz’s Modern Portfolio Theory)**.

### Sketch of a custom solution for digital marketing

- one of the assumptions needed to apply MPT to marketing is that the usage of a feature is directly linked to a classified client. This is of course not true in general, since one can in principle have a highly used webpage with, say, movie information which is unrelated to the goal or conversion while still being of interest to users. The way to alleviate this is to apply MPT to the signals and not to all (noisy) features. This means that one takes the chi2-reduced features and applies MPT to end up with a way to order them.
- calculate the statistical mean and variation of the signals, using either a large amount of data or an estimate of the probability distribution (with an implicit assumption about the normality of the stochastic variables)
- the Markowitz technique then immediately gives the efficient frontier and thus a feature portfolio for an investor’s profile
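
The first step of this sketch, reducing the features to signals with a chi2 test, could look as follows. The feature names and counts are invented, each feature is reduced to a 2x2 table (used or not versus bought or not), and the 5% significance threshold is an assumption rather than something the approach prescribes.

```python
def chi2_2x2(a, b, c, d):
    """Chi-square statistic of the 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


# Hypothetical counts per feature:
# (used & bought, used & not bought, not used & bought, not used & not bought)
counts = {
    "price_page": (80, 20, 20, 80),
    "movie_info": (50, 50, 50, 50),
    "reviews": (60, 40, 40, 60),
}

CRITICAL_95 = 3.841  # chi-square critical value for 1 degree of freedom at the 5% level

scores = {name: chi2_2x2(*table) for name, table in counts.items()}

# Keep only the features whose usage is significantly related to buying,
# ordered from the strongest to the weakest signal.
signals = sorted(
    (name for name, score in scores.items() if score > CRITICAL_95),
    key=lambda name: -scores[name],
)
print(signals)
```

The heavily used but uninformative `movie_info` page drops out, and only the remaining signals would enter the mean-variance step.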

What one should have in the end is a cube with the following dimensions:

- the geographic region (a nominal variable)
- the scores (chi2, PCA, apriori)
- the centrality with respect to PageRank
- possibly information from search engine analytics
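
One possible way to hold such a cube in memory is sketched below. The class name `CubeCell`, its fields, and the example values are hypothetical and only meant to show how the dimensions listed above could sit side by side.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class CubeCell:
    """One cell of the cube: a feature observed in a geographic region."""
    feature: str
    region: str                                             # geographic nominal value
    scores: Dict[str, float] = field(default_factory=dict)  # e.g. chi2, PCA, apriori
    pagerank: float = 0.0                                    # centrality of the page
    search_hits: Optional[int] = None                        # optional search engine analytics


# The cube itself is a dictionary keyed by (feature, region).
cube: Dict[Tuple[str, str], CubeCell] = {}


def add_cell(cell: CubeCell) -> None:
    """Store a cell under its (feature, region) key."""
    cube[(cell.feature, cell.region)] = cell


# Invented example values.
add_cell(CubeCell("price_page", "BE", {"chi2": 72.0}, pagerank=0.12))
add_cell(CubeCell("price_page", "NL", {"chi2": 40.5}, pagerank=0.12, search_hits=310))

# Slice the cube: all cells for one region.
belgium = [cell for cell in cube.values() if cell.region == "BE"]
print(belgium)
```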