One of the key questions in experimental design is “How many measurements should be included in the sample?”
To answer this, four parameters are required:
- Type I error rate, α (a): the risk of rejecting a true null hypothesis (risk of a false positive). It sets the confidence level; for example, α = 0.05 corresponds to the usual 95% confidence interval. Under a standard normal distribution, the two-sided upper critical value zα/2 for this is 1.96.
- Type II error rate, β (b): the risk of failing to reject a false null hypothesis when a particular value of the alternative hypothesis is true (risk of a false negative). In practice, a risk of 10% is commonly accepted, which corresponds to an upper critical value zβ of 1.282.
- Population standard deviation (σ, sd). This value can be obtained from a pilot study or from the literature.
- Margin of error, m, for example 5% (0.05). If the true mean is μ and we want the estimated mean (ȳ) to fall within 5% of it, the margin is m = 0.05μ. The expression is
ȳ − m < μ < ȳ + m
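The two critical values quoted in the list can be checked in R with `qnorm()`, the standard normal quantile function in base R. This is only a quick check, assuming a two-sided value for α and a one-sided value for β; the variable names are illustrative.

```r
alpha <- 0.05  # type I error rate (two-sided)
beta  <- 0.10  # type II error rate

z_alpha <- qnorm(1 - alpha / 2)  # upper critical value for alpha/2
z_beta  <- qnorm(1 - beta)       # upper critical value for beta

# Rounded to three decimals: about 1.96 and 1.282
round(c(z_alpha = z_alpha, z_beta = z_beta), 3)
```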
The standard deviation or variance can be estimated from a pilot study. For a binary trait, the variance is p(1 − p), where p is the probability of one of the events, for example disease. For a continuous trait, such as body weight or height, the pilot estimate serves as an initial assessment. It should be standardised for simplicity. The simple formula for a two-sided test is
N = [(z_{α/2} + z_β) / (m / sd)]^2
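As a rough illustration for a binary trait, the sketch below plugs the square root of p(1 − p) into the formula as the standard deviation. The prevalence p = 0.2 and the absolute margin m = 0.05 are made-up values chosen for illustration, not figures from a study.

```r
p <- 0.2                 # assumed prevalence (hypothetical)
s <- sqrt(p * (1 - p))   # standard deviation of a binary trait
m <- 0.05                # assumed absolute margin of error (hypothetical)

z_alpha <- qnorm(1 - 0.05 / 2)  # two-sided, alpha = 0.05
z_beta  <- qnorm(1 - 0.10)      # beta = 0.10

n <- ((z_alpha + z_beta) / (m / s))^2
ceiling(n)  # round up to a whole number of measurements
```

Rounding up rather than to the nearest integer keeps the risk levels at or below the chosen α and β.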
Here is a simple R function:
> pow(0.05, 0.10, 10, 3)
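The body of `pow` is not shown here. A minimal sketch of such a function might look like the following, assuming the four arguments are α, β, sd and m in that order. Both the argument order and the name `sample_size` are assumptions for illustration, not the original definition.

```r
# Hypothetical helper; the argument order (alpha, beta, sd, m) is assumed.
sample_size <- function(alpha, beta, sd, m) {
  z_alpha <- qnorm(1 - alpha / 2)  # two-sided critical value for alpha
  z_beta  <- qnorm(1 - beta)       # critical value for beta
  n <- ((z_alpha + z_beta) / (m / sd))^2
  ceiling(n)                       # round up to a whole sample size
}

sample_size(0.05, 0.10, 10, 3)  # 117 under these assumptions
```

If the last two arguments were meant the other way round (m = 10, sd = 3), the result would be very different, so the order should be checked against the original function.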
To be continued.
This is an interesting and useful topic. To be followed.