Previously, in "ABCs of Statistics (1) - Overview", we briefly mentioned:
▶︎ Concept 1: Significant Difference in Means
▶︎ Concept 2: Confidence Interval
We even marked the three different test scenarios—two-tailed, right-tailed, and left-tailed tests—in the visualization for Confidence Interval.
Today, we will continue with a more detailed introduction to these two concepts.
1. What is random sampling error?
First, let's look at this case:
Case Introduction:
▶︎ We now have a simple random sample with a sample size of 25, and the sample mean is 102.68. The data reflects the French test scores of 25 students this year.
▶︎ It is assumed that the French test scores follow a normal distribution. Historically, the population mean (μ) = 100 and the population variance (σ²) = 5.
Now, we need you to perform a hypothesis test at the 5% significance level to determine whether there is a significant difference between this year's French test scores and those of past years.
When we encounter such a problem, we first observe that the average of the sample data set is 102.68, which is obviously 2.68 points higher than the historical population mean.
This average is calculated from the test scores of 25 randomly selected students.
If we were to sample another 25 students, the average could be different; it might be 99, 109, 96, etc.
This variation, caused by different individuals being drawn into each sample, is called "random sampling error."
However, if the sample average turns out to be X̅ = 109, then it's 9 points higher than the population mean (μ) = 100. Simply attributing such a large difference to sampling error would be hard to justify.
In this case, it is likely that there is a significant difference between this year's test scores and previous years' French test scores.
Since the points above are only conjectures, how can we demonstrate this statistically?
At this point, we can use a two-tailed test to conduct a hypothesis test.
Before formally starting, this discussion introduces the basic idea of hypothesis testing and lays a solid theoretical and conceptual foundation for the detailed exploration of the different types of hypothesis tests.
Of course, we will continue to use our convenient online visualization and analysis tool, Bayeslab.
Building on the concept of the sampling distribution, and given that the French test scores follow a normal distribution, we will use the hypothesis test for a population mean as an example to introduce the basic idea of hypothesis testing.
We know that the population X follows a normal distribution, X ~ N(μ, σ²). If we take a simple random sample (X₁, X₂, ..., Xₙ) from this population, then the sample mean X̅ also follows a normal distribution:

X̅ ~ N(μ, σ²/n), i.e., E(X̅) = μ and Var(X̅) = σ²/n.

2. AI visualization example
Referencing the distribution curve drawn by the Bayeslab AI visualization block, we see that:
For a sample of size n taken from a normal population with mean μ and variance σ², the sample mean X̅ will vary. It can be higher or lower than μ, but it will always fluctuate around μ, with a higher probability of being near μ and a lower probability of being farther away.
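To see this fluctuation concretely, here is a minimal simulation sketch (not part of the original case; it uses NumPy outside of Bayeslab and an arbitrary random seed) that draws many samples of size n = 25 from N(μ = 100, σ² = 5) and checks that the sample means cluster around μ with standard error σ/√n:

```python
import numpy as np

# Population assumed in the case: N(mu = 100, sigma^2 = 5); samples of size n = 25
rng = np.random.default_rng(42)   # arbitrary seed for reproducibility
mu, sigma2, n = 100, 5, 25
sigma = np.sqrt(sigma2)

# Draw 100,000 simple random samples and compute each sample mean
sample_means = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)

print("mean of sample means  :", round(sample_means.mean(), 3))   # ~ 100
print("std  of sample means  :", round(sample_means.std(), 3))    # ~ sigma / sqrt(n)
print("theoretical std error :", round(sigma / np.sqrt(n), 3))    # ~ 0.447
```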
According to the standard normal distribution table, the probability that X̅ falls within (μ ± 1.96σ/√n) is 0.95, which means the probability that it falls outside this interval — the red shaded tails of the plot — is 0.05.
In probability statistics, we generally say:
A probability between 0.05 and 0.01 is considered a “low probability event”.
A probability less than 0.01 is considered an “almost impossible event”.
This means that, for a sample of size n taken from a normal population with mean μ and variance σ², it is very unlikely — a low probability event — for the sample mean X̅ to fall outside the range (μ ± 1.96σ/√n).
Therefore, if we plug in the historical parameters and find that the observed X̅ falls outside the range (μ ± 1.96σ/√n), it suggests that the population represented by the sample — this year's French test scores — has a mean significantly different from that of previous years' French test scores.
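Plugging the numbers from the case into this decision rule gives the following minimal sketch (it uses SciPy rather than a normal distribution table; the acceptance region bounds follow directly from μ = 100, σ² = 5, n = 25):

```python
import numpy as np
from scipy import stats

x_bar, mu, sigma2, n, alpha = 102.68, 100, 5, 25, 0.05
se = np.sqrt(sigma2 / n)                      # standard error sigma / sqrt(n) ≈ 0.447

z_crit = stats.norm.ppf(1 - alpha / 2)        # ≈ 1.96 for a two-tailed test
lower, upper = mu - z_crit * se, mu + z_crit * se
print(f"95% acceptance region for the sample mean: ({lower:.2f}, {upper:.2f})")

z = (x_bar - mu) / se                         # observed test statistic ≈ 5.99
p_value = 2 * (1 - stats.norm.cdf(abs(z)))    # two-tailed p-value
print(f"z = {z:.2f}, p = {p_value:.2g}")
print("Reject H0" if abs(z) > z_crit else "Fail to reject H0")
```

Since 102.68 lies well outside the interval of roughly (99.12, 100.88), the observed sample mean falls in the rejection region, which matches the reasoning above.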

3. Two-tailed or one-tailed test?
3.1 Two-tailed test
The hypothesis test mentioned above uses a two-tailed test.
There are three main types of tests, chosen according to the direction of the research question. Specifically:
The two-tailed test is used to detect whether the sample mean differs significantly from the population mean, without regard to the direction of the difference. The hypotheses are formulated as:
➢ H₀: μ = μ₀
➢ H₁: μ ≠ μ₀
For example: A company wants to determine whether its production line needs adjustment by testing whether the mean weight of a product sample differs significantly from the nominal weight (see the sketch after the hypotheses).
The null hypothesis H₀ and the alternative hypothesis H₁ are:
● H₀: μ = 50 grams (nominal product weight)
● H₁: μ ≠ 50 grams (significant difference in product weight)
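As a hedged sketch of how such a check might look in Python (the `weights` array below is made-up illustrative data, and a one-sample t-test is used because in practice the line's population variance would usually be unknown):

```python
import numpy as np
from scipy import stats

# Hypothetical sample of product weights (grams) pulled from the production line
weights = np.array([49.8, 50.3, 50.1, 49.5, 50.6, 50.2, 49.9, 50.4, 50.0, 49.7])

# Two-tailed one-sample t-test against the nominal weight of 50 g
t_stat, p_value = stats.ttest_1samp(weights, popmean=50, alternative="two-sided")
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
# Reject H0 (mu = 50 g) if p < 0.05; otherwise there is no evidence of a weight shift
```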
3.2 Left-tailed test
Used to determine if the sample mean is significantly lower than the population mean. Hypotheses are formulated as:
➢ H₀: μ ≥ μ₀
➢ H₁: μ < μ₀
For example: A school wants to determine if a new teaching method has led to a significant decrease in students’ average test scores.
The null hypothesis H₀ and the alternative hypothesis H₁ are:
● H₀: μ ≥ 75 points (previous average score)
● H₁: μ < 75 points (new average score significantly decreased)
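A corresponding sketch for the left-tailed case (the `scores` array is hypothetical data; `alternative="less"` places the entire rejection region in the left tail):

```python
import numpy as np
from scipy import stats

# Hypothetical test scores observed under the new teaching method
scores = np.array([72, 68, 75, 70, 74, 69, 73, 71, 66, 77])

# Left-tailed one-sample t-test: H0: mu >= 75, H1: mu < 75
t_stat, p_value = stats.ttest_1samp(scores, popmean=75, alternative="less")
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
# A p-value below 0.05 supports H1: the average score has significantly decreased
```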

3.3 Right-tailed test
Used to determine if the sample mean is significantly higher than the population mean. Hypotheses are formulated as:
➢ H₀: μ ≤ μ₀
➢ H₁: μ > μ₀
For example: A pharmaceutical company wants to determine if a new drug significantly increases the average recovery days for patients.
The null hypothesis H₀ and the alternative hypothesis H₁ are:
● H₀: μ ≤ 14 days (average recovery days with existing drug)
● H₁: μ > 14 days (new drug significantly increases recovery days)
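And a sketch for the right-tailed case (the `recovery_days` array is made-up data), this time deciding with the critical value instead of the p-value:

```python
import numpy as np
from scipy import stats

# Hypothetical recovery times (in days) observed with the new drug
recovery_days = np.array([15, 16, 14, 17, 15, 18, 16, 15, 17, 16])

alpha = 0.05
n = len(recovery_days)
# Test statistic for H0: mu <= 14 vs H1: mu > 14
t_stat = (recovery_days.mean() - 14) / (recovery_days.std(ddof=1) / np.sqrt(n))
t_crit = stats.t.ppf(1 - alpha, df=n - 1)    # the entire alpha sits in the right tail

print(f"t = {t_stat:.3f}, critical value = {t_crit:.3f}")
print("Reject H0" if t_stat > t_crit else "Fail to reject H0")
```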

Conclusion: 7 related statistical concepts for hypothesis testing (1)
We have previously introduced the concepts of Statistic, Confidence Level, and Confidence Interval in "2.1 Descriptive Statistics vs Inferential Statistics" and "2.2 Parameter Estimation_Inferential Statistics (1)".
Now, let’s list the new concepts involved in the example above:
▶︎ Low probability event:
An event with a very small likelihood of occurring, considered practically impossible in a single trial.
Two thresholds are commonly used for low probability events: 0.05 and 0.01.
A probability between 0.01 and 0.05 is considered a low probability event, while a probability below 0.01 is considered an almost impossible event.
▶︎ Acceptance region:
The range of values of the test statistic for which the null hypothesis is accepted (not rejected).
▶︎ Rejection region:
The range of values of the test statistic for which the null hypothesis is rejected; it lies outside the acceptance region.
▶︎ Significance level:
The probability, under the null hypothesis, that the test statistic falls into the rejection region, denoted α. Typically, α is set at 0.05 or 0.01.
▶︎ Two-tailed test:
Divides α equally into two parts, with one rejection region on each side. Each rejection region corresponds to a probability of α/2.
▶︎ Left-tailed test:
Places the entire rejection region corresponding to α on the left side.
▶︎ Right-tailed test:
Places the entire rejection region corresponding to α on the right side.
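To make the α vs. α/2 split concrete, here is a minimal sketch (using SciPy) that computes the standard normal critical values for the three test types at α = 0.05:

```python
from scipy import stats

alpha = 0.05

# Two-tailed: alpha is split equally, alpha/2 = 0.025 in each tail
z_two_tailed = stats.norm.ppf(1 - alpha / 2)   # ≈ 1.96   -> reject if |z| > 1.96
# Left-tailed: the entire alpha sits in the left tail
z_left_tailed = stats.norm.ppf(alpha)          # ≈ -1.645 -> reject if z < -1.645
# Right-tailed: the entire alpha sits in the right tail
z_right_tailed = stats.norm.ppf(1 - alpha)     # ≈ 1.645  -> reject if z > 1.645

print(z_two_tailed, z_left_tailed, z_right_tailed)
```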
These are the seven concepts we’ve introduced through this case.
In the next template, we’ll start a hands-on step-by-step demonstration of inferential statistics cases, incorporating today’s content.
We will introduce:
● Sample vs. Population
● Two independent samples
● Two related samples
We aim to help you master professional statistics and data analysis using the simplest approach, combining Bayeslab with specific business cases.
About Bayeslab
Bayeslab: Website
The AI First Data Workbench
X: @BayeslabAI
Documents: https://bayeslab.gitbook.io/docs
Blogs: https://bayeslab.ai/blog

Bayeslab is a powerful web-based AI code editor and data analysis assistant designed to cater to a diverse group of users, including:
👥 data analysts, 🧑🏼🔬 experimental scientists, 📊 statisticians, 👨🏿💻 business analysts, 👩🎓 university students, 🖍️ academic writers, 👩🏽🏫 scholars, and ⌨️ Python learners.