Sampling 101: Getting the Basics Right

ConveGenius Insights
7 min read · Oct 27, 2023

By: Manjusha, Monika & Sandeep Kavety

28th October 2023

I wake up in the morning, open the door, and see the packet of milk at the doorstep. It happens every day, and I conclude that the vendor is punctual and provides the best service. That conclusion is based solely on my personal experience, with no knowledge of the feedback provided by all the other customers. My personal experience is a sample of all the deliveries the vendor makes.

Image 1: Milk delivered before I am awake

Similarly, much of what we know and do is based on samples. These sets of experiences drive our conclusions. The samples may be small or large, but they ultimately play an important role in our everyday lives.

Just as we make judgements or draw conclusions about aspects of everyday life, much of scientific research is also based on a subset of observations from the larger reality. This subset is called a “sample”. You might ask, “Shouldn’t we gather data from the complete universe or population to make the right conclusions?”

My response to that would be, “Ummm… If there is an efficient way to get this done, would you still want to reach out to the complete population?”

Yes, the power of statistics helps us efficiently derive conclusions by conducting sample studies. Sampling in education research can range in complexity from relatively simple to quite complex, depending on the specific research question being addressed and the characteristics of the population being studied.

In some cases, researchers may choose to use simple random sampling, where every member of the population has an equal chance of being selected for the sample. This can be a straightforward and efficient way to select a representative sample, particularly if the population is relatively small and homogeneous.

However, in other cases, researchers may need to use more complex sampling methods in order to ensure that the sample is truly representative of the larger population. For example, stratified sampling may be used if the population is heterogeneous and the researcher wants to ensure that different subgroups are adequately represented in the sample. Similarly, cluster sampling may be used if it is logistically difficult or impractical to study the entire population, and the researcher wants to ensure that different geographical areas or other subgroups are adequately represented in the sample.

Several sampling methods can be used in education research, each with its own strengths and limitations. They fall into two main types: Probability Sampling and Non-Probability Sampling.

Probability Sampling:

As the name suggests, Probability Sampling involves a random selection process, where each element in the population has a known, non-zero chance of being selected (an equal chance, in the simplest designs). Some common types of probability sampling include:

Simple random sampling:

This involves randomly selecting a sample from the entire population, ensuring that every individual in the population has an equal probability of being chosen. For instance, a researcher might employ simple random sampling to select students from a specific school district for a study.

One advantage of this approach is its ability to generate a representative sample from the larger population, as every population member has an equal chance of selection, minimizing potential biases in the results.

However, a drawback of simple random sampling is its potential for being time-consuming and resource-intensive, particularly with large samples. Additionally, ensuring that the sample truly represents the entire population can be challenging, especially when dealing with a highly diverse or stratified population.

Image 2: Representation of Random Sample Selected from the Sample Frame
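To make simple random sampling concrete, here is a minimal sketch in Python. The sampling frame of 500 student IDs and the sample size of 50 are made up for illustration; a real study would draw from the actual enrolment list.

```python
import random

# Hypothetical sampling frame: 500 student IDs from one school district.
sampling_frame = [f"student_{i:03d}" for i in range(1, 501)]

# A fixed seed is used only so the example is reproducible.
rng = random.Random(42)

# random.sample draws without replacement, so every student has the
# same chance of selection and nobody is picked twice.
srs_sample = rng.sample(sampling_frame, k=50)

print(len(srs_sample), "students selected, for example:", srs_sample[:3])
```

Note that the draw needs a complete list of the population up front, which is part of why this method can become resource-intensive for large populations.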

Stratified sampling:

This method involves the process of grouping the population into subgroups, known as strata, and then selecting a sample from each of these strata. This approach proves beneficial when we aim to guarantee a well-rounded representation of various subgroups within the overall population. For instance, we can employ stratified sampling to choose a sample of students from various socio-economic backgrounds to examine how poverty affects student achievement.

Nevertheless, there are some drawbacks associated with stratified sampling. It can be tricky to decide how many people to pick from each stratum so that the sample mirrors the overall population. It can also be hard to be sure the sample really represents everyone if the population is highly diverse or divided into many strata.

Image 3: The Sample Frame is divided into strata, ensuring representation from each.
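One common answer to the "how many from each stratum" question is proportional allocation, where each stratum's share of the sample matches its share of the population. The sketch below assumes that rule and uses an invented population of 1,000 students in three socio-economic groups; other allocation rules exist and may suit a given study better.

```python
import random
from collections import defaultdict

rng = random.Random(7)  # fixed seed only for reproducibility

# Hypothetical population: (student_id, socio-economic group) pairs.
population = (
    [(f"low_{i}", "low") for i in range(600)]
    + [(f"middle_{i}", "middle") for i in range(300)]
    + [(f"high_{i}", "high") for i in range(100)]
)

def stratified_sample(units, sample_size, rng):
    """Proportional allocation: sample each stratum in line with its population share."""
    strata = defaultdict(list)
    for unit_id, stratum in units:
        strata[stratum].append(unit_id)

    chosen = {}
    for stratum, members in strata.items():
        # Rounding can make the total drift slightly for awkward sizes.
        n = round(sample_size * len(members) / len(units))
        chosen[stratum] = rng.sample(members, n)
    return chosen

stratified = stratified_sample(population, sample_size=100, rng=rng)
for stratum, members in stratified.items():
    print(stratum, len(members))
# low 60
# middle 30
# high 10
```

Within each stratum the draw is still a simple random sample; the stratification only controls how many units come from each group.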

Cluster sampling:

Cluster sampling involves splitting the population into groups, or clusters, and then randomly picking some of these clusters to gather samples from. This method is useful when it’s too hard or expensive to study everyone in the whole population, and you want to make sure different regions or groups are included in your sample. For example, if you want to see how well a new teacher training program works, you might use cluster sampling to select schools from various areas.

One good thing about cluster sampling is that it’s practical and cost-effective when dealing with large populations or logistical challenges. It’s also great for looking at specific areas or groups within the population.

However, there’s a downside to cluster sampling. It can introduce errors in your results if the clusters you choose don’t represent the whole population accurately. Plus, if the population is very diverse or split into different categories, it can be tricky to make sure your sample truly represents everyone.

Image 4: source
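Here is a rough sketch of one-stage cluster sampling for the teacher training example. The 20 schools with 30 teachers each are hypothetical, and so is the choice to survey everyone in the selected schools; a two-stage design would instead sample teachers within each chosen school.

```python
import random

rng = random.Random(11)  # fixed seed only for reproducibility

# Hypothetical population: 20 schools (clusters) with 30 teachers each.
schools = {
    f"school_{s:02d}": [f"school_{s:02d}_teacher_{t:02d}" for t in range(30)]
    for s in range(20)
}

# Randomly pick whole clusters rather than individual teachers.
selected_schools = rng.sample(sorted(schools), k=4)

# One-stage cluster sampling: include everyone inside the chosen clusters.
cluster_sample = [
    teacher for school in selected_schools for teacher in schools[school]
]

print(selected_schools)
print(len(cluster_sample), "teachers in the sample")
```

Only 4 schools need to be visited to reach 120 teachers, which is where the cost saving comes from. The flip side is that the whole sample depends on which 4 schools happen to be drawn.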

Non-Probability Sampling:

On the other hand, we have Non-Probability Sampling, in which the selection is based on a non-random method, so the chance of any given element being selected is unknown. Some common types of non-probability sampling include:

Convenience Sampling:

With this method, we choose participants based on what’s easiest or most convenient for us, rather than trying to get a diverse or random sample. This method is often used when there’s limited time or resources. For example, if we want to gather quick feedback on a new product, we might ask for opinions from whoever is nearby or available at the moment, like colleagues in the same office or friends and family.

The main drawback of convenience sampling is that it can lead to biased results. Because participants aren’t chosen randomly, they might not represent the broader population’s views. It’s a bit like asking only our friends about their favourite food — the answers won’t show what everyone else thinks.

Purposive Sampling:

In this method, we purposefully select specific participants whom we believe will provide the most valuable insights. It’s not about choosing the easiest or most random individuals, but rather those who align with the study’s objectives. For example, if we are examining the effects of professional training on chefs, we would deliberately choose chefs who have undergone such training. The selection is not random but is driven by a clear intent.

However, a potential limitation of this approach is its subjectivity. Just because we believe certain participants are ideal doesn’t necessarily mean our choice is accurate. Consequently, the findings might not represent the broader population’s perspectives.

Snowball Sampling:

Here, one participant helps us find more participants. This chain continues, growing like a snowball. It’s especially useful when studying groups that are hard to reach.

Imagine studying a community that’s very private. You interview one person, and they introduce you to two more. Those two introduce you to others, and so on.

The challenge is that the initial choices can heavily influence the entire sample. If the first few participants are similar or have biased views, everyone they introduce might share those views.
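The referral chain can be sketched as a walk through a network of contacts. The names and the referral network below are invented, and the wave-by-wave recruitment rule is just one simple way to model the process.

```python
from collections import deque

# Hypothetical referral network: who each participant can introduce us to.
referrals = {
    "Asha": ["Bilal", "Chitra"],
    "Bilal": ["Dev", "Asha"],
    "Chitra": ["Esha"],
    "Dev": [],
    "Esha": ["Farid", "Bilal"],
    "Farid": [],
    "Gita": ["Hari"],  # nobody in Asha's chain knows Gita
    "Hari": [],
}

def snowball_sample(seeds, referrals, max_size):
    """Follow referrals wave by wave until max_size people are recruited."""
    recruited = list(seeds)
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        for contact in referrals.get(person, []):
            if contact not in seen and len(recruited) < max_size:
                seen.add(contact)
                recruited.append(contact)
                queue.append(contact)
    return recruited

print(snowball_sample(["Asha"], referrals, max_size=10))
# ['Asha', 'Bilal', 'Chitra', 'Dev', 'Esha', 'Farid']
```

Starting from Asha, the sample never reaches Gita or Hari, however long we keep going. That is the influence of the initial choices in miniature.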

Quota Sampling:

In this method, we decide in advance how many participants from each category or group we want. We’ll then pick participants until we fill those “quotas.” If we aim to understand toy preferences among kids, we might decide we want feedback from 20 kids who like action figures, 20 who like dolls, and 20 who like board games.

While this ensures diversity, it doesn’t guarantee randomness. The “quotas” might not accurately represent the sentiments or behaviours of the broader group.
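A small sketch of how quotas get filled in practice. The list of children and the quota of 2 per group are made up (the example in the text uses 20 per group), and the rule of accepting whoever turns up first is an assumption about how the fieldwork is run.

```python
# Hypothetical children, in the order we happen to meet them.
arrivals = [
    ("kid_1", "dolls"),
    ("kid_2", "dolls"),
    ("kid_3", "action figures"),
    ("kid_4", "dolls"),
    ("kid_5", "board games"),
    ("kid_6", "action figures"),
    ("kid_7", "dolls"),
    ("kid_8", "board games"),
    ("kid_9", "action figures"),
]

def quota_sample(arrivals, quotas):
    """Accept people in arrival order until every group's quota is full."""
    remaining = dict(quotas)
    accepted = []
    for person, group in arrivals:
        if remaining.get(group, 0) > 0:
            accepted.append((person, group))
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled, stop recruiting
    return accepted

quotas = {"dolls": 2, "action figures": 2, "board games": 2}
accepted = quota_sample(arrivals, quotas)
print([person for person, _ in accepted])
# ['kid_1', 'kid_2', 'kid_3', 'kid_5', 'kid_6', 'kid_8']
```

The group sizes come out exactly as planned, but who fills each quota depends entirely on who we met first. Nothing in the procedure is random.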

Voluntary Response Sampling:

In this method, we don’t select participants; individuals step forward on their own initiative. This often happens because they have strong feelings about the topic at hand. Consider online polls or call-in radio shows as examples. Participants in these scenarios typically have strong opinions they are eager to voice. However, the challenge lies in the fact that only those with the most fervent views might come forward. As a result, this can produce extreme or biased outcomes that might not genuinely represent the more moderate perspectives of the wider population.

Image 5: People Voluntarily choose to participate and speak
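A quick simulation can show how self-selection skews the picture. Every number here is an assumption chosen for illustration: a population where 20% hold a strong view, and response rates of 50% for strong views versus 2% for moderate ones.

```python
import random

rng = random.Random(2023)  # fixed seed only for reproducibility

# Hypothetical population of 10,000: 20% hold a strong view, 80% a moderate one.
population = ["strong"] * 2000 + ["moderate"] * 8000

# Assumed response rates: people with strong views are far more likely to respond.
response_rate = {"strong": 0.50, "moderate": 0.02}

# Each person decides independently whether to take part.
volunteers = [view for view in population if rng.random() < response_rate[view]]

def share_strong(views):
    """Fraction of the given group that holds a strong view."""
    return sum(view == "strong" for view in views) / len(views)

print(f"Strong views in the population: {share_strong(population):.0%}")
print(f"Strong views among volunteers:  {share_strong(volunteers):.0%}")
```

Under these assumed rates, strong views make up a far larger share of the volunteers than of the population, even though no one set out to bias the poll.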

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

Overall, the choice of sampling method in education research will depend on the specific goals and needs of the study, as well as the characteristics of the population being studied. It’s important to carefully consider the strengths and limitations of different sampling methods in order to ensure that the sample group is representative of the larger population and that the results of the study are reliable and valid.

On a lighter note, sampling can also be a bit like trying to pick the perfect ice cream flavour at the shop: there are so many options to choose from, and it can be tough to decide which one will be the best fit for your research needs. Just be sure to sample responsibly, or you might end up with a brain freeze!

ConveGenius Insights

CGI is a leader in educational assessment and is the only Indian company that can measure student learning on a continuum using its proprietary scale, PinAcLe.