# The Yak

Posted on
dicemagic go learning My friend Andrew introduced me to an idiom I had never heard before, which i’m quite fond of. Yak Shaving looks a lot like this. I’m tired, and staying up with Samuel while Natalie sleeps, so I’ll share my most recent Yak Shaving story.

## The problem

My side project Dice Magic has super-bad code coverage, some modules are around 90%, but most are in the 50s. Well, it should be easy to write a few unit tests and bump up those numbers. Maybe I’ll even run gofmt and fix all my lint warnings.

### Step 1: Decide where to start

Dice Magic has become a rather complex beast. We’ve got 2 parsers, integrations with multiple chat systems and DialogFlow. A DAL, a caching layer, logging, a pubsub queue, on and on.

I should probably start with the most basic, core function in the app. It rolls dice, so let’s start with the `Roll()` function.

``````//Roll creates a random number that represents the roll of some dice
func Roll(numberOfDice int64, sides int64) (int64, error) {
//error handing removed for brevity
result := int64(0)
for i := int64(0); i < numberOfDice; i++ {
x, err := generateRandomInt(1, int64(sides))
if err != nil {
return 0, err
}
result += x
}
return result, nil
}``````

So, the nut of this function is another function call to `generateRandomInt()`. As you might expect, the results from that function are random. Not much to test here… Or is there?

``````func generateRandomInt(min int64, max int64) (int64, error) {
size := max - min
//error handing removed for brevity
//rand.Int does not return the max value, add 1
if err != nil {
err = fmt.Errorf("Couldn't make a random number. Out of entropy?")
return 0, err
}
n := nBig.Int64()
return n + int64(min), nil
}``````

### Step 2: Learn about statistical analysis

I could mock this out and move on, but I actually do want to know if the dice this whole house of cards is built on are fair. The problem is, I have no idea how to validate my assumptions here. A quick Google search led me to the concept of a Chi-squared test with some big phrases like “goodness of fit” and pictures that I did not comprehend.

After a rather deep Wikipedia hole, I decided this was, indeed what I wanted to do. And after some further Googling, and reading lots of sites that look like this I learned that a chi-squared test is real easy to use! R has an implementation, surely Go does too. Go does not

### Step 3: Find chi-squared tests for Go

But Go DOES have lots of modules people have built. I found gonum, which implemented something called a chi-squared distribution, but no chi-squared test. Ok, back to the internet and more academic papers. Maybe I can implement this myself.

### Step 4: Eureka!

After much reading I found this nugget on Wikipedia

Accordingly, since the cumulative distribution function (CDF) for the appropriate degrees of freedom (df) gives the probability of having obtained a value less extreme than this point, subtracting the CDF value from 1 gives the p-value.
- https://en.wikipedia.org/wiki/Chi-squared_distribution

And what’s even better, is after 2 days of this adventure, I knew that that meant!

I can use gonum to create the chi-squared distribution from my degrees of freedom, calculate the cumulative distribution based on the distance between my expected and observed results! Then all I have to do is subtract from 1 to get p

First I’ll read a whole bunch of results into buckets

``````    m := make(map[int64]int)
for i := 0; i < numberOfLoops; i++ {
x, _ := generateRandomInt(1, numberOfBuckets)
m[x]++
}``````

It’s pretty easy to calculate the “expected” result for each bucket, it’s just the average.

``````// iterate over buckets of random numbers
expv := float64(int64(numberOfLoops) / numberOfBuckets)
for e := range m {
obs = append(obs, float64(m[e]))
exp = append(exp, expv)
}
c := stat.ChiSquare(obs, exp)
p := 1 - distuv.ChiSquared{K: float64(df), Src: nil}.CDF(c)
t.Logf("chi2=%v, degOfFreedom=%v, p=%v", c, df, p)``````

looks like the random number generator that feeds `Roll()` is perfectly fair.

``````--- PASS: Test_generateRandomInt (3.94s)
roll_test.go:33: chi2=170.2808, df=199, p=0.9307575285063816``````

But what about `Roll()` itself? It loops over `generateRandomInt()` once for each die. We should get a cool curve of probabilities. It should be pretty easy to calculate the expected value of each dice roll. I play a lot of craps, I sort of know this stuff.

I do not know this stuff.

### Step 5: Go to sleep, wake up, try again Calculating probabilities for 2 n-sided dice is pretty simple. Just grid out all possibilities and the number of possibilities that equal your target number divided by the total number of possibilities is the probability.

But what about more complex dice throws?

I decided to start by creating a go routine that outputs all possible combinations of a dice roll to a channel so I could process them.

Except… How do you do that?

### Step 6: Back to Wikipedia Initial google searches led me to a number of possible solutions. Some were yielding combinations and some were yielding permutations, and I didn’t know the difference.

As I was halfway through implementation, I realized that combinations weren’t working as I was expecting. Combinations are over one set, I have 2 sets.

### Step 7: Learn about set theory

As an aside, I never really went to school much. Only 1.5 years of high school and some collage courses here and there. There are big gaps in my understanding of math and algorithms. Nevertheless, I love figuring it out, but I often have to start from square 1. Which looks like this: Some more Wikipedia digging led me to the concept of a Cartesian product. More specifically the n-ary Cartesian product. Once I could search for that, StackOverflow to the rescue!

‘Course, I’m here to learn go, so I’m going to implement it with channels and range

``````func diceProbability(t *testing.T, numberOfDice int64, sides int64, target int64) float64 {
rollAmount := math.Pow(float64(sides), float64(numberOfDice))
targetAmount := float64(0)
var possibilities []int64
for i := int64(1); i <= sides; i++ {
possibilities = append(possibilities, i)
}
c := make(chan []int64)
go GenerateProducts(c, possibilities, numberOfDice)
for product := range c {
if sumInt64(product...) == target {
targetAmount++
}
}
p := (targetAmount / rollAmount)
return p
}``````

`diceProbability()` will start a goroutine `GenerateProducts()` that outputs each possible dice roll to channel c. Then I’ll range over that channel and increment a counter if the sum of the product is the target.

Finally, we can calculate the probability of an arbitrary number coming up on an arbitrary dice roll.

### Step 8: Put it all together

So we can calculate dice probabilities, and we can calculate chi-squared tests, now we just need to put it together!

``````    --- PASS: TestRoll/8d4 (2.80s)
roll_test.go:109: Rolling 8d4 100000 times
roll_test.go:110: Bucket : Probability : Expected : Observed
roll_test.go:111: ------------------------------------------
roll_test.go:116:      8 :  0.0015259% :   1.5259 :        0
roll_test.go:116:      9 :   0.012207% :   12.207 :       12
roll_test.go:116:     10 :   0.054932% :   54.932 :       56
roll_test.go:116:     11 :    0.18311% :   183.11 :      180
roll_test.go:116:     12 :    0.49133% :   491.33 :      482
roll_test.go:116:     13 :     1.1108% :   1110.8 :     1153
roll_test.go:116:     14 :      2.179% :     2179 :     2194
roll_test.go:116:     15 :      3.772% :     3772 :     3791
roll_test.go:116:     16 :     5.8334% :   5833.4 :     5755
roll_test.go:116:     17 :     8.1299% :   8129.9 :     8048
roll_test.go:116:     18 :     10.266% :    10266 :    10426
roll_test.go:116:     19 :     11.792% :    11792 :    11750
roll_test.go:116:     20 :     12.347% :    12347 :    12238
roll_test.go:116:     21 :     11.792% :    11792 :    11760
roll_test.go:116:     22 :     10.266% :    10266 :    10363
roll_test.go:116:     23 :     8.1299% :   8129.9 :     8103
roll_test.go:116:     24 :     5.8334% :   5833.4 :     5947
roll_test.go:116:     25 :      3.772% :     3772 :     3714
roll_test.go:116:     26 :      2.179% :     2179 :     2202
roll_test.go:116:     27 :     1.1108% :   1110.8 :     1128
roll_test.go:116:     28 :    0.49133% :   491.33 :      477
roll_test.go:116:     29 :    0.18311% :   183.11 :      164
roll_test.go:116:     30 :   0.054932% :   54.932 :       38
roll_test.go:116:     31 :   0.012207% :   12.207 :       19
roll_test.go:121: chi2=25.17996518137826, df=23, p=0.34107352659248535
PASS``````

### Step 9: Celebrate success

2 days and 10k dirty diapers later, I’ve imporved code coverage by 1%!

Full code available here, if you’re interested.  