Coin Flip Conundrum

I watched this video,

Very interesting.

So, I managed to prove it through some scripting.

#!/usr/bin/env python
import random
# 0 = Head, 1 = Tail
def start_flipping_coin(target, saved):
return flip_until(target, 0, 0, saved)
def flip_coin():
return random.randint(0, 1)
# This is curcial part, doesn't mean you just go back one step.
# Because it assumes you are tossing the coin consecutively.
# So, if the your target is {H,T} or {T,H}, then
# If the second toss is not correct, you will still continue with the 2nd toss.
# If the target size is 3, it will be totally different
def step_back(target, step, coin):
if target[0] != target[1]:
if step == 1 and target[0] == coin:
return 1
return 0
if step == 0 or target[step] == coin:
return step
return step_back(target, step 1, coin)
def flip_until(target, step, count, saved):
if step >= len(target):
return count
coin = flip_coin()
count += 1
if saved is not None:
if target[step] == coin:
step += 1
step = step_back(target, step, coin)
return flip_until(target, step, count, saved)
def main():
target1 = [0, 1]
target2 = [0, 0]
total_turns = 10000
tossed1 = 0
coins1 = []
for i in range(total_turns):
tossed1 += start_flipping_coin(target1, coins1)
if coins1[2:] != target1:
print("Error1!") # This is a check to make sure the coin set is correct
coins1 = []
tossed2 = 0
coins2 = []
for i in range(total_turns):
tossed2 += start_flipping_coin(target2, coins2)
if coins2[2:] != target2:
coins2 = []
print("Target: {} average steps = {}".format(target1, tossed1 / total_turns))
print("Target: {} average steps = {}".format(target2, tossed2 / total_turns))

view raw
hosted with ❤ by GitHub

And I get some result like this,

Target: [0, 1] average steps = 3.987
Target: [0, 0] average steps = 5.9505

P/S: Wrote a robust flip coin script, which can accept the coin tossing sequence with any length. [here]

Monty Hall problem and frog riddle

Monty Hall paradox

Probability topic is the fundamental concept of the statistics. And machine learning is closely related to statistics. That is why, understand the probability very important if you are doing research, statistics, and machine learning.

Monty Hall is a very interesting problem. It says, if you are given 3 doors to choose. One of them contains a car (which you want), the other two are goats (which you don’t want). After you made your choice, before opening the door, the host will open the door that you didn’t choose yet contains the goat (he knows which door has the goat). Now, if you are given an opportunity to change your choice to another door (which you didn’t choose earlier), are you going to change?

In the first glance, you will feel that whatever you choose, the probability is always 1/3. However, the conditional probability tells you that, if you always make the switch after the host opened the door that has a goat, your probability to win the car will increase to 2/3. What??

In order to prove this, I wrote a Python script.

#!/usr/bin/env python
# This is simulating Monty Hall Paradox

import random

def monty(switch):
    # random shuffle
    doors = [0, 0, 1]  # one of the door contains the car

    openDoor = None

    # choose the first door (not open)
    # if the first door is 1, randomly open the other
    if doors[0] == 1:
        # open the door
        openDoor = random.randint(1, 2)
    else:  # open the door that contains goat
        if doors[1] == 1:
            openDoor = 2
            openDoor = 1

    # now open the last door
    if not switch:
        return doors[0]
        if openDoor == 2:
            return doors[1]
            return doors[2]

def main():
    total = 10000
    car = 0
    for i in range(total):
        car += monty(True)

    print("Always switch the door. Total: {}, car: {}. P = {}".format(total, car, car / total))

    car = 0
    for i in range(total):
        car += monty(False)

    print("No switch the door. Total: {}, car: {}. P = {}".format(total, car, car / total))


Run the code, you will always get the probabilty close to 0.6667 if you always switch the door.

Always switch the door. Total: 10000, car: 6625. P = 0.6625
No switch the door. Total: 10000, car: 3309. P = 0.3309

Frog riddle

Recently I just watched a Youtube about frog riddle.

It also mentions about the conditional probability. Interestingly, quite a lot of comments mentioned that the author is wrong.

In order to prove that the author is correct, I wrote another Python script.

#!/usr/bin/env python

import random

# Frog 0 for female, 1 for male

def create_frog():
    return random.randint(0, 1)

def has_croak(pairs):  # also male
    return 1 in pairs

def has_female(frogs):
    return 0 in frogs

def choose_without_croak(choose_two):
    frogs = [create_frog() for i in range(3)]
    # first frog at the right side
    # second and third at the left side

    if choose_two:
        return has_female(frogs[1:])  # choose two frogs

    return has_female(frogs[0:1])

def main():
    total = 10000
    correct = 0
    for i in range(total):
        correct += choose_without_croak(True)
    print('Just choose two frogs. Total: {}, correct: {}. P = {}'.format(total, correct, correct / total))

    correct = 0
    for i in range(total):
        correct += choose_without_croak(False)
    print('Just choose one frog. Total: {}, correct: {}. P = {}'.format(total, correct, correct / total))

# The exact question is,
# "What is the probability of the frogs in the pair has female,
# given that one of them is male?"
def exact_calculation():
    total = 10000
    croak = 0
    correct = 0
    for i in range(total):
        frogs = [create_frog() for i in range(3)]
        if has_croak(frogs[1:]):
            croak += 1
            if has_female(frogs[1:]):
                correct += 1
    print('Total croak: {}, correct: {}. P = {}'.format(croak, correct, correct / croak))


Running the script, you will get

Just choose two frogs. Total: 10000, correct: 7498. P = 0.7498
Just choose one frog. Total: 10000, correct: 4974. P = 0.4974
Total croak: 7474, correct: 4998. P = 0.6687182231736687

Based on the result, if you choose two frogs, the probability of survive is close to 0.75. If you choose one frog, the probability is 0.5.

Now, the tricky part is the probability 0.67 mentioned in the video. The question should be “What is the probability of the frogs in the pair has female, given that one of them is male?”

So, based on the question, my similuation needs to get the total count of the male (that has croak), and within these pairs, count the female frogs.

To convert this into mathematical expression,

P(\text{female frog}) = 0.5

P(\text{at least one male frog}) = 0.75

P(\text{female frog} | \text{at least one male frog}) = \frac{0.5}{0.75} = 0.6667

Then, based on the simulation and calculation, you will get the 0.6667.

Statistics and functional programming languages

Recently, I feel fervent to learn functional programming, because i) (in my opinion), it will become a trend, and ii) the interpreter can be used as an advanced calculator.

Since I am teaching Statistics, I want to do some calculation of the normal distribution probability.

Before I begin, I need to mention, in order to calculate the normal distribution probability something like P(x < X), this can be done by using a spreadsheet software with NORMDIST() and NORM.INV() for the inverse of the former function.

Spreadsheet is good for calculation, but not good as a calculator. My primary calculator is SpeedCrunch, which allows entering expression. But the drawback of SpeedCrunch is the lack of statistical functions.

Therefore, the functional programming comes to my mind.

Firstly, let me introduce the usage of Python. Make sure SciPy is installed. Run the Python interpreter,

from scipy.stats import norm

So, the two functions are norm.cdf() (cumulative distribution function) and norm.ppf() (percent point function).

Now, let me introduce the usage of R language.


R language is used for statistics, thus, no further module or library is required.

Both Python and R languages are not pure functional programmings. I wanted to try Emacs Lisp, but the arithmetic syntax is not convenient. The syntax is using Polish notation. In order to perform arithmetic calculation,

(* 2 (+ 3 6))

Therefore, I tried the pure functional programming language, that is, Haskell.

In order to calculate the normal distribution probability, the Statistics module is required. After installation,

import Statistics.Distribution
import Statistics.Distribution.Normal
let d = normalDistr 0 1
cumulative d 0
quantile d 0.5

“normalDistr 0 1” is to create a normal distribution with mean 0 and standard deviation 1. Then “cumulative d 0” is the calculation of CDF (cumulative distribution function), which produces 0.5. And “quantile” is the inverse function of CDF.

So, enjoy the functional programming in mathematics and statistics.