The threshold I set before starting the experiments was 0.01. It's quite arbitrary, but I always feel that the standard 0.05 is a bit too loose. The null hypothesis was 'the Envoy Staff will give the expected probabilities'.
Let's work out the 10/10/10 case. Those experiments were done with only Nature's Renewal up, so 0x, 2x and 4x HCT can be observed, but the 8x case, even though it is calculated in the background, gets capped to 4x. So I assumed it was acceptable to group the two together: total expected probability of 4x + 8x = 2.7% + 0.1% = 2.8%. The 76 observations include both 4x and 8x, since they are indistinguishable.
The chi-squared value was calculated by putting the counts for each column separately into (observed - expected)^2 / expected and then summing the results. This value was then used to calculate the right-tailed P with the Excel formula =CHISQ.DIST.RT().
There are effectively 3 categories of data (since 4x and 8x are indistinguishable in the observations), so 2 degrees of freedom (number of categories minus 1).
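For anyone who wants to check the method outside Excel, here is a minimal Python sketch of the same calculation. The function names are my own, and the counts and probabilities in it are made-up placeholders, not the experimental data. It relies on the fact that for 2 degrees of freedom the right-tail probability of the chi-squared distribution is simply exp(-x/2), which is what =CHISQ.DIST.RT(x, 2) returns.

```python
import math


def chi_squared_statistic(observed, expected_probs):
    """Sum of (observed - expected)^2 / expected over all categories."""
    total = sum(observed)
    stat = 0.0
    for obs, prob in zip(observed, expected_probs):
        expected = total * prob  # expected count for this category
        stat += (obs - expected) ** 2 / expected
    return stat


def right_tail_p_df2(stat):
    """Right-tailed P for exactly 2 degrees of freedom.

    Only valid for df = 2, where the survival function is exp(-x / 2).
    """
    return math.exp(-stat / 2.0)


# PLACEHOLDER numbers for illustration only (not the real observations).
# Three categories, e.g. 0x, 2x, and 4x+8x merged into one.
observed = [480, 440, 80]
expected_probs = [0.5, 0.4, 0.1]

stat = chi_squared_statistic(observed, expected_probs)
p_value = right_tail_p_df2(stat)

threshold = 0.01  # the threshold chosen before the experiments
print(f"chi-squared = {stat:.3f}, P = {p_value:.4f}")
print("reject null" if p_value < threshold else "cannot reject null")
```

To use it on the real data, replace the placeholder lists with the three observed counts and the three expected probabilities (with 4x and 8x merged into one entry). For any other number of categories the exp(-x/2) shortcut no longer applies and a general chi-squared right-tail function would be needed.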
Random functions on computers aren't truly random (they are pseudo-random), so I don't think that chi-squared is the way to go here, but anyway ~~
That's a very good point I hadn't thought about. Do you know if there are any standard tests or corrections we can use instead?
Many thanks for your help.