Bayesian A/B Testing
========================================================
author: Alex Yakubovich
date: July 9, 2014
transition: none
```{r, echo=F}
source('~/playfair/shiny/ab_key_metrics_builder/beta_binomial_ab_test.R')
```
Frequentist A/B Testing
========================================================
- **P-value**: Probability of observing the result of a test (or a more extreme one) if there is no effect
- Probability of data given model
----

Problems
========================================================
- **No peeking**: unrealistic sample size requirements
- Makes crude (but valid) approximations.
- Not all mistakes are created equal
- **Inflexible** and hard to interpret

The Bayesian approach
========================================================
- Compute the **probability of the model given the data**
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
alpha <- 4
beta <- 10
conf.level <- .1
curve(dbeta(x, alpha, beta), from=0, to=1, n=1e5, col='darkblue', lwd=5, xlab='True Day 1 Retention', ylab='density', xaxs="i", yaxs="i")
ci <- qbeta(c(conf.level/2, 1-conf.level/2), alpha, beta)
n <- 25
dx <- seq(ci[1], ci[2], length.out=n)
polygon(c(ci[1], dx, ci[2]), c(0, dbeta(dx, alpha, beta), 0), col='lightgrey', border=NA)
text(.3, 1, labels='90%')
```
Prior Distribution
========================================================
Beliefs about a random variable before seeing the data.
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
curve(dbeta(x, 22, 29), from=0, to=1, n=1e5, col='darkblue', lwd=2, xlab='True Day 1 Retention', ylab='density')
```
Beta Distribution
========================================================

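The Beta distribution is the standard prior for a rate like retention: its two shape parameters act as pseudo-counts of prior "successes" and "failures", so its mean is $\alpha / (\alpha + \beta)$. A minimal sketch with the same prior used on the following slides:

```{r}
# Beta(alpha, beta) prior: alpha ~ prior "retained" users, beta ~ prior "churned" users
alpha <- 22
beta <- 29
prior.mean <- alpha / (alpha + beta)  # mean retention implied by the prior
curve(dbeta(x, alpha, beta), from=0, to=1, n=1e3, col='darkblue', lwd=2,
      xlab='True Day 1 Retention', ylab='density')
```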
Likelihood Function
========================================================
Measures how likely it is that the data $D$ was generated by the model.

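For a retention test a natural model is binomial: $y$ retained users out of $n$. A sketch of the resulting likelihood (the counts `y=3`, `n=10` are illustrative, matching the `p=.3` used on later slides):

```{r}
# Binomial likelihood: how plausible each retention rate theta makes the data
likelihood <- function(theta, y, n) dbinom(y, size=n, prob=theta)
curve(likelihood(x, y=3, n=10), from=0, to=1, col='darkblue', lwd=2,
      xlab='theta (true Day 1 retention)', ylab='likelihood')
# The likelihood peaks at the observed proportion y/n = 0.3
```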
Bayes Rule
========================================================
Tells us how to update our beliefs after seeing the data.
$$ P(W | D) \propto P(D|W) \times P (W)$$
* $W =$ world state
* $D =$ observed data
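With a Beta prior and a binomial likelihood, Bayes rule has a closed form (conjugacy): the posterior is simply $\text{Beta}(\alpha_0 + y,\; \beta_0 + n - y)$. A minimal sketch with the prior from earlier and illustrative counts:

```{r}
# Conjugate beta-binomial update: add observed successes/failures to the prior pseudo-counts
alpha0 <- 22; beta0 <- 29   # prior: Beta(22, 29)
y <- 3; n <- 10             # observed: 3 of 10 users retained
alpha.post <- alpha0 + y
beta.post  <- beta0 + n - y
posterior.mean <- alpha.post / (alpha.post + beta.post)  # shrunk toward the prior mean
```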
Updating our beliefs (n=0)
========================================================
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
alpha0 <- 22
beta0 <- 29
curve(dbeta(x, alpha0, beta0), from=0, to=1, n=1e5, col='darkblue', lwd=2, xlab='day 1 retention', ylab='density')
```
Changing our beliefs (n=10)
========================================================
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
p <- .3
n <- 10
y <- round(n * p)
curve(dbeta(x, alpha0 + y, beta0+n-y), from=0, to=1, n=1e5, col='darkblue', lwd=2, xlab='day 1 retention', ylab='density')
```
Changing our beliefs (n=20)
========================================================
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
n <- 20
y <- round(n * p)
curve(dbeta(x, alpha0 + y, beta0+n-y), from=0, to=1, n=1e5, col='darkblue', lwd=2, xlab='day 1 retention', ylab='density')
```
Changing our beliefs (n=30)
========================================================
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
n <- 30
y <- round(n * p)
curve(dbeta(x, alpha0 + y, beta0+n-y), from=0, to=1, n=1e5, col='darkblue', lwd=2, xlab='day 1 retention', ylab='density')
```
Changing our beliefs (n=40)
========================================================
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
n <- 40
y <- round(n * p)
curve(dbeta(x, alpha0 + y, beta0+n-y), from=0, to=1, n=1e5, col='darkblue', lwd=2, xlab='day 1 retention', ylab='density')
```
Changing our beliefs (n=50)
========================================================
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
n <- 50
y <- round(n * p)
curve(dbeta(x, alpha0 + y, beta0+n-y), from=0, to=1, n=1e5, col='darkblue', lwd=2, xlab='day 1 retention', ylab='density')
```
Changing our beliefs (n=100)
========================================================
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
n <- 100
y <- round(n * p)
curve(dbeta(x, alpha0 + y, beta0+n-y), from=0, to=1, n=1e5, col='darkblue', lwd=2, xlab='day 1 retention', ylab='density')
```
Changing our beliefs (n=1000)
========================================================
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
n <- 1000
y <- round(n * p)
curve(dbeta(x, alpha0 + y, beta0+n-y), from=0, to=1, n=1e5, col='darkblue', lwd=2, xlab='day 1 retention', ylab='density')
```
Comparing two groups
========================================================
$$ \begin{align} \displaystyle P(\theta_B > \theta_A) &= \iint_{\theta_B > \theta_A} P(\theta_A, \theta_B \mid D) \, d\theta_A \, d\theta_B \\
&= \iint_{\theta_B > \theta_A} P(\theta_A \mid D) \times P(\theta_B \mid D) \, d\theta_A \, d\theta_B
\end{align}$$
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
y <- c(5,20)
n <- c(100,100)
grid.resolution <- 512
pgrid <- seq(0, .8, length=grid.resolution+2)[-c(1,grid.resolution+2)] #(0,1) grid
dposterior <- function(pa, pb, alpha0,beta0, y, n) dbeta(pa,alpha0 + y[1],n[1]-y[1]+beta0) * dbeta(pb,alpha0 + y[2],n[2]-y[2]+beta0)
pdf_arr <- outer(pgrid, pgrid, dposterior, alpha0,beta0, y,n)
image(pdf_arr, xlab='True Day 1 retention (group A)', ylab='True Day 1 retention (group B)', xlim=c(0,.6), ylim=c(0,.6))
points(seq(0,1,length=10000), seq(0,1,length=10000), pch='.')
```
```{r, figs.only=TRUE, echo=FALSE,results='hide'}
y <- c(5,20)
n <- c(100,100)
alpha0 <- 22
beta0 <- 29
for (g in 1:2)
curve(dbeta(x,alpha0 + y[g], n[g]-y[g]+beta0), from=0, to=1, n=1e3, col='darkblue', lwd=2, xlab='retention', ylab='density',add=g>1)
#beta.binomial.ab.test(y,n)
```
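Because the two posteriors are independent Beta distributions, this integral is easy to estimate by Monte Carlo: draw from each posterior and count how often B beats A. A sketch under the same prior and data as the plots above:

```{r}
# Monte Carlo estimate of P(theta_B > theta_A) under independent Beta posteriors
set.seed(1)
alpha0 <- 22; beta0 <- 29
y <- c(5, 20); n <- c(100, 100)
m <- 1e5
theta.a <- rbeta(m, alpha0 + y[1], beta0 + n[1] - y[1])
theta.b <- rbeta(m, alpha0 + y[2], beta0 + n[2] - y[2])
prob.b.beats.a <- mean(theta.b > theta.a)  # fraction of draws where B wins
```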
When do we end a test?
========================================================
* **Risk:** How many users/conversions, or how much money, do we expect to lose if we are wrong?
* The test ends as soon as the risk falls below a threshold
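The risk can be made concrete as the expected loss from shipping B when A might in fact be better: $E[\max(\theta_A - \theta_B, 0)]$, estimated from posterior draws as on the previous slide. The threshold value below is an illustrative assumption, not a recommendation:

```{r}
# Expected loss (in retention-rate points) from choosing B over A
set.seed(1)
alpha0 <- 22; beta0 <- 29
y <- c(5, 20); n <- c(100, 100)
m <- 1e5
theta.a <- rbeta(m, alpha0 + y[1], beta0 + n[1] - y[1])
theta.b <- rbeta(m, alpha0 + y[2], beta0 + n[2] - y[2])
risk.of.choosing.b <- mean(pmax(theta.a - theta.b, 0))
threshold <- 0.001             # assumed business tolerance
stop.test <- risk.of.choosing.b < threshold  # TRUE means we can end the test
```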
References
========================================================
- [Bayesian witch - Agile A/B Testing with Bayesian Statistics and Python](http://www.bayesianwitch.com/blog/2014/bayesian_ab_test.html)
- [Richrelevance - Bayesian A/B tests](http://engineering.richrelevance.com/bayesian-ab-tests/)
- [Richrelevance - Bayesian analysis of Normal distributions with Python](http://engineering.richrelevance.com/bayesian-analysis-of-normal-distributions-with-python/)
- [Richrelevance - Bayesian A/B testing with a Lognormal model](http://engineering.richrelevance.com/bayesian-ab-testing-with-a-log-normal-model/)
- [Swrve - A/B testing for game design iteration: a Bayesian approach](http://www.gdcvault.com/play/1020201/A-B-Testing-for-Game)
- [Probabilistic Programming and Bayesian Methods for Hackers](http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/)