Friday, August 7, 2015

On the Even Distribution of Dice Rolls

An APBA/Stat Preamble
As anyone who speaks to me for more than 12 minutes typically finds out, I’m hopelessly addicted to a game called APBA (or “geek baseball” to anyone who doesn’t play). As a refresher, the core concepts of the game are to take a full season’s stats, create player cards based on said stats, and then roll dice to simulate games. From a baseball fan standpoint it’s a great game, but from a statistical standpoint it’s fascinating - we’re effectively able to use random events (dice rolls) to simulate outcomes based on previous real results.

One of the core components of the game is rolling two six-sided dice to come up with one of 36 possible results ranging from 11 to 66. The results of the individual dice are not added, just concatenated in a specific order. In “face to face” play, a red die and a white die are used, with the red die providing the first digit and the white die providing the second.
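
To make the "concatenate, don't add" idea concrete, here is a minimal sketch that lists every possible result. The variable names are just for illustration.

```python
from itertools import product

# Each result is "red digit, then white digit", never a sum.
outcomes = [10 * red + white for red, white in product(range(1, 7), repeat=2)]

print(len(outcomes))   # 36
print(outcomes[:7])    # [11, 12, 13, 14, 15, 16, 21]
print(outcomes[-1])    # 66
```

Because order matters, 34 and 43 are two separate results, which is why there are 36 of them instead of the 11 sums you get from adding two dice.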

As any stat geek will know, an even distribution of these results is critical to simulating previous results. Skewed rolls will not only impact the results of the game, but could give players an unfair competitive advantage. In a 12 team league like the one I play in, this could impact results at best and cause drama and accusations at worst.


Some Contextual Whining
In my current APBA league it’s widely acknowledged that I’ve lucked into having a very good team this year. Talk to any baseball fan about an outfield of Stanton, McCutchen and Trout, Abreu at 1B and a variety of very strong pitchers and they’d be impressed. Talk to APBA players about it and they’ll grin in jealousy.


Oddly, this roster has produced surprisingly poor results for me this season. Their actual stats are notably below what they should be. In a game where large sample sizes are readily available, these guys all had a surprisingly poor first half of the season. Typically this is blamed on “cold dice” - just pure bad luck in the dice rolls, which is widely accepted as a part of the game.


At this point I’m just whining...my really good team isn’t hitting as many home runs as they should. Waaaaahhhhhhh. But I started noticing a few other things:
  • My opponents seemed to be hitting a lower than normal number of home runs (in APBA terms, rolls of 11s and 66s seemed to be down)
  • Everyone seemed to be hitting a lot more “secondary power numbers”, singles and walks (APBA terms: rolling lots of 33s, 44s, 45s, etc)
  • The dice distributions just *seemed to be off*. (APBA terms: I had no proof other than a gut feeling)


After half a season things just weren’t sitting well with me. Something seemed off, but I had no real evidence to point to.


As Quick As Possible: Dice Rolling
In the old days our league members all lived in or around Washington, DC and we all played face-to-face games. That’s right, we’d schedule times, drive to each other’s houses and roll dice/talk about baseball/eat pizza for 3 hours and have a grand old time. Unfortunately lives happen, people move and years later we have people scattered across the entire US, so face-to-face games are not possible.


Luckily, the internet exists! Almost everyone knows about AIM and chat windows. Luckily AIM has a dice rolling function that can be used! Unfortunately...
  1. Fewer and fewer AIM clients now support this functionality
  2. AIM Express has been decommissioned, meaning one less client supports it
  3. Mac fan boys like me are nearly SOL for using AIM to roll (note - it turns out that OS X Messenger can somewhat support rolling)


Worry not though, the internet still exists! As do some (seemingly) great dice rolling sites. This is where things get a bit curious.


The (Worrisome) Data
The goal of this post is not to knock down a pet project, so I’ll not call out the site I’ve been using this year, but I will point out the dice results I received from some simulations. Take a look at the below table.

(Update: Udo from Rolz.org has requested I point out this is their site, so here it is. Many thanks to him for helping with the analysis!)

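| Roll | # Occurrences | Roll | # Occurrences | Roll | # Occurrences |
|------|---------------|------|---------------|------|---------------|
| 11 | 13 | 31 | 22 | 51 | 22 |
| 12 | 25 | 32 | 43 | 52 | 47 |
| 13 | 29 | 33 | 52 | 53 | 48 |
| 14 | 36 | 34 | 46 | 54 | 35 |
| 15 | 24 | 35 | 50 | 55 | 52 |
| 16 | 8 | 36 | 20 | 56 | 21 |
| 21 | 28 | 41 | 27 | 61 | 17 |
| 22 | 51 | 42 | 47 | 62 | 27 |
| 23 | 49 | 43 | 47 | 63 | 18 |
| 24 | 49 | 44 | 53 | 64 | 23 |
| 25 | 45 | 45 | 43 | 65 | 22 |
| 26 | 25 | 46 | 25 | 66 | 11 |
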
This is based on 1200 rolls (n=1200), so you’d expect each roll to occur right around 33 times (1200/36 = 33.3333…). A quick look shows that the number of occurrences is all over the place though, ranging from 8 on the low end to 53 on the high end. Naturally chance plays into the number of occurrences, and some spread around 33 is to be expected. Had I paid enough attention in stats my senior year, I’d begin discussing variance and standard deviation, but there’s still a bit more to study up on before I can do that :)
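
For anyone who did pay attention in stats class, here is a rough sketch of that variance and standard deviation math. It assumes each result's count behaves like a binomial variable with probability 1/36 per roll, which is the textbook model for fair dice and not something measured from the roller itself. The names (`n_rolls`, `z_score`) are made up for this example.

```python
import math

n_rolls = 1200   # sample size from the table above
p = 1 / 36       # chance of any one concatenated result on fair dice

expected = n_rolls * p                       # about 33.3
std_dev = math.sqrt(n_rolls * p * (1 - p))   # binomial standard deviation

def z_score(observed):
    """How many standard deviations an observed count sits from the expected count."""
    return (observed - expected) / std_dev

print(f"expected={expected:.2f}, std dev={std_dev:.2f}")
print(f"z(8)={z_score(8):.2f}, z(53)={z_score(53):.2f}")
```

Under that model the standard deviation is a little under 6, so the low of 8 sits roughly 4.5 standard deviations below the expected count and the high of 53 roughly 3.5 above it.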


Needless to say, something smells a bit fishy here. It could come down to the randomness of the dice, but let me present the results in a different way...ordered by number of occurrences:
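
| Roll | # Occurrences | Roll | # Occurrences | Roll | # Occurrences |
|------|---------------|------|---------------|------|---------------|
| 16 | 8 | 12 | 25 | 34 | 46 |
| 66 | 11 | 26 | 25 | 42 | 47 |
| 11 | 13 | 46 | 25 | 43 | 47 |
| 61 | 17 | 41 | 27 | 52 | 47 |
| 63 | 18 | 62 | 27 | 53 | 48 |
| 36 | 20 | 21 | 28 | 23 | 49 |
| 56 | 21 | 13 | 29 | 24 | 49 |
| 31 | 22 | 54 | 35 | 35 | 50 |
| 51 | 22 | 14 | 36 | 22 | 51 |
| 65 | 22 | 32 | 43 | 33 | 52 |
| 64 | 23 | 45 | 43 | 55 | 52 |
| 15 | 24 | 25 | 45 | 44 | 53 |
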
Notice anything odd? I certainly do.
  • Let’s start first with dice rolls in the rightmost column. Not a single one includes a 1 or a 6. That seems rather odd.
  • Let’s jump to the leftmost column. Ok, that’s really odd...EVERY single one involves a 1 or a 6 AND even the highest count there (24) is barely HALF of the lowest count in the right hand column (46).
  • And now the middle column. Remember the expected number of occurrences for any one roll should be right around 33? Well the only rolls with either a 1 or a 6 in them that are anywhere near that are 14 (36 occurrences) and 13 (29 occurrences).


I could likely go on if I wanted, but the case is pretty clear that this dice roller does NOT like 1s or 6s. Who knows the reasoning (I can almost guarantee it’s not nefarious), but it’s safe to say that anything resembling a completely random and somewhat even distribution will not result from this particular dice roller.
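
One way to put numbers on the "does NOT like 1s or 6s" claim is to collapse the 36 results back into individual die faces. The sketch below does that with the counts copied from the first table. The names (`rolz_counts`, `first_die`, `second_die`) are hypothetical, and it says nothing about why the skew exists.

```python
from collections import Counter

# Counts copied from the first table above (roll -> occurrences, n=1200).
rolz_counts = {
    11: 13, 12: 25, 13: 29, 14: 36, 15: 24, 16: 8,
    21: 28, 22: 51, 23: 49, 24: 49, 25: 45, 26: 25,
    31: 22, 32: 43, 33: 52, 34: 46, 35: 50, 36: 20,
    41: 27, 42: 47, 43: 47, 44: 53, 45: 43, 46: 25,
    51: 22, 52: 47, 53: 48, 54: 35, 55: 52, 56: 21,
    61: 17, 62: 27, 63: 18, 64: 23, 65: 22, 66: 11,
}

first_die = Counter()   # how often each face showed up as the first digit
second_die = Counter()  # how often each face showed up as the second digit
for roll, count in rolz_counts.items():
    first, second = divmod(roll, 10)  # e.g. 43 -> (4, 3)
    first_die[first] += count
    second_die[second] += count

for face in range(1, 7):
    print(face, first_die[face], second_die[face])
```

With a fair die each face should appear about 200 times (1200/6) in each position. In these counts the 1s and 6s each land between 110 and 135 times, while faces 2 through 5 each land between 225 and 247 times.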


The (Presumed) Gold Standard
The entire goal of an online dice roller is simple: provide results that resemble what you’d expect from real, physical dice. In the real world we’d expect a uniform distribution of results (remember, we’re not adding the totals, so rolling a 43 is different than rolling a 34). In other words, we’d expect 11 to show up as many times as 12, as 13, …, as 65, as 66. Naturally chance comes into play, so this won’t always happen, but it should be somewhat close.

To date, our league just assumed the gold standard for dice rolling was AIM. It was THE dice roller to use as far back as 1997 (save your comments ICQ, IRC, etc fans...AIM was the first chat client used on a massive scale), supported dice rolling for a long, long time and never seemed to raise eyebrows.


Unfortunately rolling a lot of dice at once is hard on AIM - they throttle the number of requests (7 req/10 secs from what I can tell), so it’s a tedious exercise of data entry. Even with that being said, I was able to get n=360 rolls in (expected number of occurrences = 10) with the following results:
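
| Roll | # Occurrences | Roll | # Occurrences | Roll | # Occurrences |
|------|---------------|------|---------------|------|---------------|
| 11 | 13 | 31 | 15 | 51 | 8 |
| 12 | 6 | 32 | 7 | 52 | 17 |
| 13 | 12 | 33 | 9 | 53 | 10 |
| 14 | 12 | 34 | 10 | 54 | 10 |
| 15 | 10 | 35 | 9 | 55 | 8 |
| 16 | 8 | 36 | 6 | 56 | 7 |
| 21 | 9 | 41 | 9 | 61 | 13 |
| 22 | 10 | 42 | 14 | 62 | 15 |
| 23 | 15 | 43 | 7 | 63 | 5 |
| 24 | 12 | 44 | 7 | 64 | 9 |
| 25 | 12 | 45 | 13 | 65 | 7 |
| 26 | 12 | 46 | 9 | 66 | 5 |
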
Overall it’s kind of boring. There are a few outliers (17 rolls of 52, 6 rolls of 12), but those can be attributed to a smaller sample size. Yet again, remembering some more from that stats class in college would be helpful here.
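
The usual stats-class tool for "is this table boring enough?" is Pearson's chi-square goodness-of-fit test. Here is a minimal sketch on the AIM counts. It is not part of the original analysis, it assumes the rolls are independent, and the 49.80 threshold is the tabulated 5% critical value for 35 degrees of freedom (36 results minus 1). The names are illustrative.

```python
# Counts copied from the AIM table above (roll -> occurrences, n=360).
aim_counts = {
    11: 13, 12: 6, 13: 12, 14: 12, 15: 10, 16: 8,
    21: 9, 22: 10, 23: 15, 24: 12, 25: 12, 26: 12,
    31: 15, 32: 7, 33: 9, 34: 10, 35: 9, 36: 6,
    41: 9, 42: 14, 43: 7, 44: 7, 45: 13, 46: 9,
    51: 8, 52: 17, 53: 10, 54: 10, 55: 8, 56: 7,
    61: 13, 62: 15, 63: 5, 64: 9, 65: 7, 66: 5,
}

n = sum(aim_counts.values())       # 360
expected = n / len(aim_counts)     # 10.0 per result if the dice are fair

# Sum of squared deviations from the expected count, scaled by the expected count.
chi_square = sum((obs - expected) ** 2 / expected for obs in aim_counts.values())

# Tabulated 5% critical value for 35 degrees of freedom.
CRITICAL_5PCT_DF35 = 49.80

print(round(chi_square, 1))              # 33.2
print(chi_square < CRITICAL_5PCT_DF35)   # True
```

A statistic below the critical value means this test finds no evidence against a uniform distribution at the 5% level, which lines up with the "kind of boring" read of the table. It does not prove the roller is fair.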


If we’re to sort by number of occurrences we get:
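
| Roll | # Occurrences | Roll | # Occurrences | Roll | # Occurrences |
|------|---------------|------|---------------|------|---------------|
| 63 | 5 | 21 | 9 | 14 | 12 |
| 66 | 5 | 33 | 9 | 24 | 12 |
| 12 | 6 | 35 | 9 | 25 | 12 |
| 36 | 6 | 41 | 9 | 26 | 12 |
| 32 | 7 | 46 | 9 | 11 | 13 |
| 43 | 7 | 64 | 9 | 45 | 13 |
| 44 | 7 | 15 | 10 | 61 | 13 |
| 56 | 7 | 22 | 10 | 42 | 14 |
| 65 | 7 | 34 | 10 | 23 | 15 |
| 16 | 8 | 53 | 10 | 31 | 15 |
| 51 | 8 | 54 | 10 | 62 | 15 |
| 55 | 8 | 13 | 12 | 52 | 17 |
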
Again, a bit inconclusive and random, but that’s what you want. It may look a little weird due to the sample size, but there appears to be no bias against any one number or roll in particular. Based simply on gut feel and a long history with AIM alone, I’m not opposed to continuing to use it.


An Alternative (Maybe)?
Being that I’m a tech consultant by training, it’s been pounded into my mind that it’s useless to whine about an issue without proposing a solution. This one is particularly tricky for a few reasons though:
  1. AIM doesn’t work for certain users wanting to roll dice (particularly me)
  2. The above rolling site is obviously not ideal for simulating real dice rolls
  3. Findings like the above will naturally make anyone concerned about similar technologies.
  4. Other members of my baseball league may challenge new technologies or change of any kind. It’s simple human nature and very understandable in this case.
So what’s one to do? Easy - pause in frustration and browse reddit. It’s what I do on an hourly basis anyway, so I might as well try it here.

Luckily for me:
  1. Reddit has a great community
  2. Other people need dice rollers
  3. Google has awesome engineers who are Mac fan boys and probably got irritated when AIM Express got shut down and left them without a dice roller


The result? Google Hangouts has a random dice roller!!! That’s right, simply type /roll 2d6 in a Hangout window and a result similar to “Chad rolls 2d6 and gets 7. [ 2 5 ]” will pop up. Now the real question...will it burn when I roll as much as that other site did? To the analysis we go!
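
Since Hangouts reports both the sum and the individual faces, each line has to be turned back into an APBA-style roll before tallying. Below is a small sketch of one way to do that. The pattern is modeled only on the single sample line quoted above, and it assumes the first number in the brackets plays the part of the red die. Both are assumptions, and `to_apba_roll` is a made-up helper name.

```python
import re

# Modeled on the one sample line quoted above; real output may vary.
ROLL_LINE = re.compile(r"rolls 2d6 and gets (\d+)\. \[ ([1-6]) ([1-6]) \]")

def to_apba_roll(line):
    """Return the concatenated roll (e.g. 25), or None if the line doesn't parse."""
    match = ROLL_LINE.search(line)
    if match is None:
        return None
    total, first, second = (int(group) for group in match.groups())
    if first + second != total:
        return None  # sanity check: the faces should add up to the reported sum
    # Assumption: the first listed face stands in for the red die.
    return 10 * first + second

print(to_apba_roll("Chad rolls 2d6 and gets 7. [ 2 5 ]"))  # 25
```

Lines that don't match, or whose faces don't add up to the reported sum, come back as None so they can be set aside rather than silently miscounted.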


The (Hopeful) Data
Let’s start where we did with the previous data dump...a list of occurrences by dice rolls. One small note though - due to tester (aka me) miscounting, the sample size and expected number of occurrences will be a bit different. In this case it will be n=1080 with an expected number of occurrences at 30 on the dot. Now to the table:
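
| Roll | # Occurrences | Roll | # Occurrences | Roll | # Occurrences |
|------|---------------|------|---------------|------|---------------|
| 11 | 22 | 31 | 26 | 51 | 38 |
| 12 | 38 | 32 | 25 | 52 | 22 |
| 13 | 30 | 33 | 30 | 53 | 24 |
| 14 | 26 | 34 | 26 | 54 | 42 |
| 15 | 44 | 35 | 39 | 55 | 27 |
| 16 | 46 | 36 | 41 | 56 | 24 |
| 21 | 22 | 41 | 43 | 61 | 30 |
| 22 | 33 | 42 | 20 | 62 | 24 |
| 23 | 20 | 43 | 29 | 63 | 22 |
| 24 | 29 | 44 | 37 | 64 | 29 |
| 25 | 31 | 45 | 30 | 65 | 28 |
| 26 | 25 | 46 | 26 | 66 | 32 |
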
A few things immediately jump out here:
  1. The range of occurrences changes dramatically with the low being 20 and the high being 46. Compare this to the range of 8 to 53 on the previous roller and it’s quite an improvement.
  2. The bias against 1s and 6s seems to be gone. In fact, 16 was the most rolled number this time around.
  3. The “gut feel” is a bit better about this table. Sure there are extremes, but most of the number of occurrences *seem* to be in the right range. Again, remembering a bit about variance and standard deviation would probably help my argument a lot here.
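
One way to calibrate that gut feel without the formulas is to simulate fair dice and see how wide the low-to-high range usually is at this sample size. The sketch below does that with Python's built-in pseudo-random generator standing in for fair dice. The seed, the 200 trials, and the function name are arbitrary choices made for this example.

```python
import random
from collections import Counter

def simulate_range(n_rolls, rng):
    """Roll two fair dice n_rolls times; return (lowest, highest) of the 36 result counts."""
    counts = Counter(
        10 * rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls)
    )
    # Results that never came up still count, as zero.
    cells = [counts.get(10 * a + b, 0) for a in range(1, 7) for b in range(1, 7)]
    return min(cells), max(cells)

rng = random.Random(2015)  # fixed seed so the run is repeatable
trials = [simulate_range(1080, rng) for _ in range(200)]

lows = sorted(low for low, _ in trials)
highs = sorted(high for _, high in trials)
print("median low:", lows[len(lows) // 2])
print("median high:", highs[len(highs) // 2])
```

Comparing the observed low of 20 and high of 46 against the simulated lows and highs gives a rough sense of whether that spread is ordinary for 1080 fair rolls. Changing 1080 to 1200 does the same for the earlier table's range of 8 to 53.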


Now let’s look at the numbers as we did in the previous data dump - sorted by occurrences.
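
| Roll | # Occurrences | Roll | # Occurrences | Roll | # Occurrences |
|------|---------------|------|---------------|------|---------------|
| 23 | 20 | 31 | 26 | 25 | 31 |
| 42 | 20 | 34 | 26 | 66 | 32 |
| 11 | 22 | 46 | 26 | 22 | 33 |
| 21 | 22 | 55 | 27 | 44 | 37 |
| 52 | 22 | 65 | 28 | 12 | 38 |
| 63 | 22 | 24 | 29 | 51 | 38 |
| 53 | 24 | 43 | 29 | 35 | 39 |
| 56 | 24 | 64 | 29 | 36 | 41 |
| 62 | 24 | 13 | 30 | 54 | 42 |
| 26 | 25 | 33 | 30 | 41 | 43 |
| 32 | 25 | 45 | 30 | 15 | 44 |
| 14 | 26 | 61 | 30 | 16 | 46 |
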
See any patterns here? Seriously - I’m asking because I don’t. Sure there is a range with a few outliers on both ends, but nothing that a little math probably couldn’t explain. And there’s no obvious bias like before. Naturally I could be missing something, but in terms of gut instincts, this type of distribution just *seems* better.


The Fallout
This data is certainly some combination of interesting, compelling and daunting. Over the next few days I anticipate concern/confusion among league members, an unknown response when I attempt to provide some feedback to the person who’s turned a pet project into a nice little rolling site, and hope for an adoption of Google Hangouts for rolling so I don’t have to go out and buy a Windows machine.


What is sure is this...as new rolling technologies inevitably must be adopted in the future, there is a 100% chance that I’ll do some basic distribution analysis to determine if they’re skewed in a way that could be considered flawed.

TL;DR: Use AIM or Google Hangouts to roll dice; they seem legit.