Some Notes about Stratified Sampling

Stratified random sampling involves first selecting a variable to stratify on, such as age, race, gender, profession, time of day, or course of study. Each category of that variable is one stratum. Then look up the percentage of the population that belongs to each stratum. Finally, multiply these percentages by the sample size to determine how many people from each stratum must be interviewed.

Here is an example.  Suppose one wanted to survey 50 students at Lake Tahoe Community College (LTCC) and use race as the strata.  Student information for the year 2005 (hopefully a newer version will be available soon) can be found in Graphically Speaking.  According to this document, the racial makeup of LTCC is:  70.5% White, 12.7% Hispanic, 2.8% Asian, 9.5% unknown, and 3.5% from other races.  We can adjust these numbers by taking out the unknowns and recalculating to get

78.8% White, 14.2% Hispanic, 3.1% Asian, 3.9% Other

(The calculation for the adjusted White percentage was 70.5 / (70.5 + 12.7 + 2.8 + 3.5) = 70.5 / 89.5, which is about 78.8%.  The other calculations are similar.)
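The adjustment above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names (raw, known_total, adjusted) are made up for illustration and the percentages are the ones quoted in the example.

```python
# Percentages from the example, with the 9.5% "unknown" group left out.
raw = {"White": 70.5, "Hispanic": 12.7, "Asian": 2.8, "Other": 3.5}

# Total of the groups whose race is known (about 89.5).
known_total = sum(raw.values())

# Rescale each group so that the known groups add up to 100%,
# rounding to one decimal place as in the notes.
adjusted = {group: round(100 * pct / known_total, 1) for group, pct in raw.items()}

for group, pct in adjusted.items():
    print(f"{group}: {pct}%")
```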

Now to find the quotas, we multiply each adjusted percentage by the sample size and round to the nearest whole number:

White:  78.8% of 50  =  39

Hispanic:  14.2% of 50  =  7

Asian:  3.1% of 50 = 2

Other:  3.9% of 50  =  2
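The same quota arithmetic, written as a short Python sketch. The names (adjusted, sample_size, quotas) are illustrative. Simple rounding happens to give quotas that add up to 50 here; with other percentages the rounded quotas might be off by one and would need a small manual adjustment.

```python
# Adjusted percentages from the example and the desired sample size.
adjusted = {"White": 78.8, "Hispanic": 14.2, "Asian": 3.1, "Other": 3.9}
sample_size = 50

# Quota for each stratum: its share of the sample, rounded to a whole person.
quotas = {group: round(pct / 100 * sample_size) for group, pct in adjusted.items()}

for group, quota in quotas.items():
    print(f"{group}: {quota}")

# Check that the quotas account for the whole sample.
print("Total:", sum(quotas.values()))
```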

Next, survey LTCC students, asking them first what their race is and then your question of interest.  If one of the quotas, say Hispanic, has already been met and a new person says he or she is Hispanic, then do not ask that person the question of interest.  Keep asking more students until every quota is exactly met.
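The interviewing rule can be sketched as a small Python function. Everything here is hypothetical: the function name fill_quotas, the tiny quotas, and the made-up stream of (race, answer) pairs are only meant to show the logic of skipping people whose quota is already full.

```python
def fill_quotas(respondents, quotas):
    """Keep answers only from people whose stratum still has room."""
    remaining = dict(quotas)  # how many more people each stratum needs
    answers = []              # one combined list; the stratum is not recorded
    for group, answer in respondents:
        if remaining.get(group, 0) > 0:
            # Quota still open: ask the question and keep the answer.
            answers.append(answer)
            remaining[group] -= 1
        # Otherwise the quota is full and this person is skipped.
        if not any(remaining.values()):
            break  # every quota is exactly met, so stop surveying
    return answers, remaining


# Made-up example: small quotas and a short stream of people met on campus.
example_quotas = {"White": 2, "Hispanic": 1}
stream = [("White", 3), ("Hispanic", 5), ("Hispanic", 9), ("White", 4), ("White", 8)]

answers, remaining = fill_quotas(stream, example_quotas)
print(answers)
print(remaining)
```

In this made-up run the second Hispanic student is skipped because that quota was already met, and the last student is never reached because all quotas were full by then.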

When you have completed the survey, you can cite that a stratified sample based on race was taken.  Then you can put all of the data in one graph or statistic and validly claim that the data was taken from a stratified sample.  Note that when you put the data into the computer, you will have only one column (not 4); the race is not entered into the computer.  Stratified sampling is not used for comparing one group to another, but rather to make sure that the sample is representative of the entire population.

Common  Mistakes

1.  Surveying many people and noticing that they seemed to be from many different backgrounds, races, etc. and deducing that this is stratified sampling. 

        If you have not established quotas before you begin surveying, you are not using stratified sampling.

 

2.  Using stratified sampling by collecting data from say 4 different races and then presenting 4 different histograms to compare the races. 

        Stratified sampling is not the same thing as a comparison study.  Once the data is collected, the calculations should not reflect which stratum each number came from.  There should be only one histogram, one standard deviation, one mean, etc.  The goal is to gather these statistics for the population, not for each group.

 

3.  Surveying an equal number of people from each stratum.

        Stratified sampling involves making sure that the proportion of each group within the sample matches the proportion of that group in the population.  For example, if there are twice as many Caucasians in the community as there are Hispanics, then the sample should also contain twice as many Caucasians as Hispanics.