###### Analyze Phase

# 2-Sample T

Tests whether we have a difference between the means of 2 groups

Why to use this tool?

To find out whether our X is a vital X or not when it comes to the mean.

How to use this tool?

You must have a Y with continuous data and an X with discrete data. The X should cover exactly 2 population samples (groups). For example, yes/no, hot/cold, green/red, etc.

Step 1: Collect at least 30 data points for Y along with 30 data points for X in Minitab (or Excel) and put Y in one column and X in another.

Step 2: Open Minitab and select: Stat > Basic Statistics > 2-Sample T…

Step 3: Enter Y in “Samples” and X in “Sample-IDs.”

Step 4: Select Graphs and choose the graph of your preference (boxplot or individual value plot).

Step 5: Select “OK” and view the results. If your P-Value is 0.05 or greater, you accept your null hypothesis and conclude that it is not a vital X. If the P-Value is below 0.05, you reject the null hypothesis and conclude that it is a vital X.
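For readers who want to see what the test computes outside Minitab, the following is a minimal Python sketch, not Minitab's implementation. It calculates the two-sample t statistic without assuming equal variances (Welch's form). As a stated simplification, it approximates the P-Value with the standard normal distribution, which is reasonable only when each group has roughly 30 or more data points. The function name `two_sample_t` and the cycle-time values are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def two_sample_t(group_a, group_b, alpha=0.05):
    """Return (t statistic, approximate P-Value, is_vital_x) for two groups."""
    n_a, n_b = len(group_a), len(group_b)
    # Standard error of the difference between the two sample means (Welch form).
    std_error = math.sqrt(variance(group_a) / n_a + variance(group_b) / n_b)
    t_stat = (mean(group_a) - mean(group_b)) / std_error
    # ASSUMPTION: normal approximation to the t distribution (about 30+ points
    # per group). Statistical software uses the exact t distribution instead.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value, p_value < alpha


# Hypothetical cycle times (Y) for the two levels of X, 30 data points each.
system_a = [10 + (i % 5) for i in range(30)]
system_b = [12 + (i % 5) for i in range(30)]

t_stat, p_value, is_vital_x = two_sample_t(system_a, system_b)
print(f"t = {t_stat:.2f}, P-Value = {p_value:.4f}, vital X: {is_vital_x}")
```

The decision in the last returned value follows the rule in Step 5: a P-Value below 0.05 marks the X as a vital X.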

When to use this tool?

During the analyze phase, to find out which of the potential Xs are the vital Xs.

3 Do’s and 3 Don’ts:

Do: Verify if your data is reliable.

Do: Collect at least 30 data points.

Do: Include the results in your project documentation.

Don’t: Forget to interpret the P-Value results in your project documentation.

Don’t: Forget to also test the spread before determining if it is a vital X or not.

Don’t: Consider this X in the improve phase if it is not a vital X.

# Alternative Hypothesis

This statement is accepted when P-Value is below 0.05

Why to use this tool?

It is always important to write the null and alternative hypothesis statements when performing statistical tests as this will make it easy for everyone to interpret and understand the test results.

How to use this tool?

Step 1: Brainstorm with the process SMEs (subject matter experts) what critical factors can influence the process outcomes (potential root causes).

Step 2: After identifying the potential root cause, make sure it is measurable.

Step 3: Check whether historical data exists or should be collected by creating a data collection plan.

Step 4: After collecting at least 30 data points, you can look at the hypothesis testing decision tree and decide what tests to run.

Step 5: Write the Null Hypothesis (Ho). For example: there is no difference between system A and system B in terms of the mean of the process cycle time Y.

Step 6: Write the alternative hypothesis (Ha), which is exactly the opposite of the (Ho). For example: there is a difference between the use of system A and system B in terms of the mean of the process cycle time Y.

Step 7: Run the statistical test and look at the results of the P-Value. If the P-value is below 0.05, it is statistically significant, and we need to reject the (Ho) and accept the (Ha) statement. This means that we identified a vital X.

When to use this tool?

While performing normality and stability tests, we will write the alternative hypothesis statements in the measure phase.

When running any other tests, this will most likely happen in the analyze phase.

3 Do’s and 3 Don’ts:

Do: Write the (Ho) null hypothesis and (Ha) alternative hypothesis statements in your project documentation for each statistical test.

Do: Explain at the end of each statistical test if Ho is rejected or accepted.

Do: Include the Y and the X in each of the (Ho) and (Ha) sentences.

Don’t: Forget that when you have a continuous Y and a discrete X you will get the (Ha) sentence 2 times. Once for the mean and once for the spread.

Don’t: Forget that the alternative hypothesis for the normality test can be written like this: compared to the Gaussian population, there is a problem with the data distribution.

Don’t: Forget that the alternative hypothesis for the run chart can be formulated like this: there is an issue with stability/cluster/oscillation/trend in the data.

# 2 Proportions and Chi Square Test

It compares the observed results with the expected results

Why to use this tool?

To find out whether our X is a vital X or not when it comes to correlation with Y.

If there is a difference between the observed data and the expected data, then we will find out if it is a coincidence or a relationship between the variables.

How to use this tool?

You need to have Y with discrete data and X with discrete data as well.

Step 1: Collect at least 30 data points for the variables you want to test against the hit or miss of your Y. After you have summarized your data, you will have 3 columns that look like this: Column 1 is titled Target, and here you have the summarized number of “hits” and “misses” of your Y. Column 2 is variable 1 with the summed-up “hit” and “miss” data, and column 3 is variable 2 with the summed-up “hit” and “miss” data.

Step 2: Open Minitab and select: Stat > Tables > Cross Tabulation and Chi Square…

Step 3: In the drop-down box, select “Summarized data in a two-way table.” In “Columns containing the table”, select the variables that are summarized in columns 2 and 3. In “Rows”, select your “Target”, which is column 1. In the “Display” section, place a check mark on “Counts.” Click “Chi-Square” and place a check mark on “Chi-square test” and “Expected cell counts”, then select “OK”.

Step 4: Select “OK” again and view the results. If your P-Value is 0.05 or greater, you accept your null hypothesis and conclude that it is not a vital X. If the P-Value is below 0.05, you reject the null hypothesis and conclude that it is a vital X.
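To illustrate what “observed versus expected” means, here is a minimal Python sketch of the chi-square calculation for a summarized 2x2 table (rows are “hit” and “miss”, columns are the two variables). It is a sketch under assumptions, not Minitab's implementation: the function name `chi_square_2x2` and the counts are hypothetical, and the P-Value formula used here is valid only for a 2x2 table (1 degree of freedom).

```python
import math


def chi_square_2x2(table, alpha=0.05):
    """Return (expected counts, chi-square, P-Value, is_vital_x) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count of a cell = row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # ASSUMPTION: 2x2 table, so 1 degree of freedom. In that case the
    # upper-tail chi-square probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p_value, p_value < alpha


# Hypothetical summarized counts: rows = hit, miss; columns = variable 1, variable 2.
observed = [[40, 20],
            [10, 30]]

expected, chi2, p_value, is_vital_x = chi_square_2x2(observed)
print("Expected counts:", expected)
print(f"Chi-square = {chi2:.3f}, P-Value = {p_value:.5f}, vital X: {is_vital_x}")
```

The larger the gap between the observed and expected counts, the larger the chi-square value and the smaller the P-Value.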

When to use this tool?

During the analyze phase, to find out which of the potential Xs are the vital Xs.

3 Do’s and 3 Don’ts:

Do: Verify if your data is reliable.

Do: Collect 30 data points or more.

Do: Include the results in your project documentation.

Don’t: Forget to interpret the P-Value results in your project documentation.

Don’t: Forget about using other statistical tools as well. The Chi-square test does not create a graphical output, and management usually likes to see the data presented with graphical tools too (for example, Pareto charts, bar graphs, pie charts, etc.).

Don’t: Consider this X in the improve phase if it is not a vital X.

# Data Collection Plan

To be clear what type of data needs to be collected

Why to use this tool?

To create a roadmap of how to collect what kind of data so that we can conduct statistical tests for our projects.

How to use this tool?

Step 1: Inform the project team what data needs to be collected.

Step 2: With their help, find out where this type of data can be collected.

Step 3: Create a data collection plan using the format shown in the template. You can enter the data either into Excel or directly into your statistical software (for example, Minitab).

Step 4: Make your measurement system for collecting your data accurate and reliable by using tools such as Gage R&R. Then set standards for everyone who collects the data.

Step 5: Collect at least 30 data points.

When to use this tool?

When running the normality, stability, and capability tests, we will need a data collection plan in the measure phase.

As we go further into our project and need to collect further data for the analyze phase, then a data collection plan in the analyze phase is needed.

In most projects, the data collection plan is mainly used in the analyze phase because data for Y already exists and then additional data is needed for individual potential Xs.

Once we are in the control phase and creating a control plan, we will need a data collection plan that is linked to the data needed for the control plan.

3 Do’s and 3 Don’ts:

Do: Collect at least 30 data points (the more data points the better).

Do: Make sure that everyone collecting data follows the same standard. This will ensure that we have stable, accurate and reliable data.

Do: Put the data in columns (vertical order) and follow the correct sequence. For example: the first data point is shown first, the second data point appears second on the data collection plan, and so on. This is especially important when running the run chart (stability test).

Don’t: Try to collect data for each X mentioned in the brainstorming session. We should only focus on the most important factors that we believe affect the performance of the process. If we try to cover too many Xs, the data collection plan will be too extensive, and the project team will have neither the motivation nor the time to collect that much data.

Don’t: Start collecting data before checking if historical data already exists.

Don’t: Use historical data until you verify its validity and reliability.

# FMEA

Failure Mode and Effects Analysis to prioritize potential Xs

Why to use this tool?

This tool can be used for different reasons. Most of the time, we use this tool as a prioritization tool right after the fishbone diagram. Once the fishbone is complete, we want to make sure we prioritize all Xs properly and create a data collection plan using only the most important Xs. It would not be efficient to create a data collection plan for all Xs coming from the fishbone.

How to use this tool?

Use the template called FMEA and fill in all the columns step by step.

Step 1: Use the information from the Fishbone and the process maps to fill in the information.

Step 2: Using the SME experience, fill in the scores for SEV (severity), OCC (occurrence) and DET (detection), to create the RPN (risk priority number) score (SEV x OCC x DET = RPN).

Step 3: Sort all Xs in the FMEA from the highest RPN score to the lowest score.

Step 4: Highlight the top Xs to be used in your data collection plan. They are the ones that will be tested using statistical tools.
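Steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration of the RPN arithmetic and the sorting: the class `FmeaRow`, the function `prioritize`, the cut-off of the top 2 Xs, and all scores are hypothetical, not values from the FMEA template.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    potential_cause: str  # the X taken from the fishbone
    sev: int              # severity score
    occ: int              # occurrence score
    det: int              # detection score

    @property
    def rpn(self) -> int:
        # Risk priority number: SEV x OCC x DET = RPN
        return self.sev * self.occ * self.det


def prioritize(rows, top_n=2):
    """Sort rows from the highest to the lowest RPN and return (ranked, top Xs)."""
    ranked = sorted(rows, key=lambda row: row.rpn, reverse=True)
    return ranked, ranked[:top_n]


# Hypothetical Xs and scores agreed with the SMEs.
rows = [
    FmeaRow("Operator training level", sev=7, occ=4, det=3),
    FmeaRow("System used (A or B)", sev=8, occ=6, det=5),
    FmeaRow("Time of day", sev=3, occ=5, det=2),
]

ranked, top_xs = prioritize(rows)
for row in ranked:
    marker = "<- data collection plan" if row in top_xs else ""
    print(f"{row.potential_cause}: RPN = {row.rpn} {marker}")
```

How many Xs to carry into the data collection plan is a team decision; the `top_n` parameter simply makes that cut-off explicit.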

When to use this tool?

The FMEA is done in the analyze phase of your project.

3 Do’s and 3 Don’ts:

Do: Use the “mode” sentence that comes from the highest-level bone from your X.

Do: Use the “Potential Causes” sentence that comes from the (measurable) lowest level bone of the Fishbone.

Do: Use the text from the head of the Fishbone and put it in the column labeled “Potential Failure Mode”.

Don’t: Forget that the DET (detection) score is inverted: the better the detection in a process, the lower the score. A process that has no control and low detectability gets a high score.

Don’t: Forget to enter the P-Value results for the mean and for the spread.

Don’t: Forget to update the FMEA at the end of the project and highlight the vital Xs. If one day the process gets out of control, the team can look at the FMEA to check for vital Xs and hope to be able to stabilize the process with little effort.

# Null Hypothesis

This statement is accepted when P-Value is 0.05 or greater

Why to use this tool?

It is always important to write the null and alternative hypothesis statements when performing statistical tests as this will make it easier for everyone to interpret and understand the test results.

How to use this tool?

Step 1: Brainstorm with the process SMEs (subject matter experts) what critical factors can influence the outcomes of the process (potential root causes).

Step 2: After identifying potential root cause, make sure it is measurable.

Step 3: Check whether historical data exists or should be collected by creating a data collection plan.

Step 4: When you have at least 30 data points collected, you can look at the hypothesis testing decision tree and decide what tests to run.

Step 5: Write the (Ho) null hypothesis. For example: there is no difference between system A and system B when it comes to the mean of the process cycle time Y.

Step 6: Write the (Ha) alternative hypothesis, which is exactly the opposite of the (Ho). For example: there is a difference between using system A and system B when it comes to the mean of the process cycle time Y.

Step 7: Run the statistical test and look at the P-Value results. If the P-value is below 0.05, it is statistically significant, and we need to reject the (Ho) and accept the (Ha) statement. This means we have identified a vital X.

When to use this tool?

When performing the normality and stability tests, we will write the null hypothesis statements in the measure phase.

When running any other tests, this will most likely happen in the analyze phase.

3 Do’s and 3 Don’ts:

Do: Write the (Ho) null hypothesis and (Ha) alternative hypothesis statements in your project documentation for each statistical test.

Do: Explain at the end of each statistical test, whether the (Ho) is rejected or accepted.

Do: Include Y and X in each of the (Ho) and (Ha) sentences.

Don’t: Forget that if you have a continuous Y and a discrete X you will have the (Ho) sentence twice, once for the mean and once for the spread.

Don’t: Forget that the (Ho) null hypothesis for the normality test can be written as follows: compared to the Gaussian population, there is no problem with the distribution of data.

Don’t: Forget that the (Ho) null hypothesis for the run chart can be formulated as follows: there is no issue with stability/cluster/oscillation/trend in the data.

# One-Way ANOVA

Tests whether we have a difference between the means of more than 2 groups (population samples)

Why to use this tool?

To find out whether our X is a vital X or not when it comes to the mean.

How to use this tool?

You need to have Y with continuous data and X with discrete data. Data should cover more than 2 population samples. For example, yes/no/maybe, or hot/medium/cold, or green/yellow/red, etc.

Step 1: Collect at least 30 data points for your Y along with 30 data points for your X in Minitab (or Excel) and put Y in one column and the X in another.

Step 2: Open Minitab and select: Stat > ANOVA > One-Way…

Step 3: Enter your Y in the “Response” and your X in the “Factor” field.

Step 4: Select Graphs and choose the graph of your preference (for example, boxplot or individual value plot).

Step 5: Select “OK” and view the results. If your P-Value is 0.05 or greater, you accept your null hypothesis and conclude that it is not a vital X. If the P-Value is below 0.05, you reject the null hypothesis and conclude that it is a vital X.
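The calculation behind the One-Way ANOVA can be sketched in Python as follows. This is a simplified illustration, not Minitab's implementation: it is deliberately restricted to exactly 3 groups (such as hot/medium/cold), because in that case the P-Value of the F statistic has a simple closed form. The function name `one_way_anova_3_groups` and the data are hypothetical.

```python
from statistics import mean


def one_way_anova_3_groups(groups, alpha=0.05):
    """Return (F statistic, P-Value, is_vital_x) for exactly 3 groups."""
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly 3 groups")
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]
    df_between = len(groups) - 1                # 2 for 3 groups
    df_within = len(all_values) - len(groups)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    # Variation of the data points around their own group mean.
    ss_within = sum(
        (value - m) ** 2 for group, m in zip(groups, group_means) for value in group
    )
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # ASSUMPTION: 3 groups, so 2 numerator degrees of freedom. Only then is the
    # upper-tail F probability equal to (1 + 2F/df_within) ** (-df_within / 2).
    p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)
    return f_stat, p_value, p_value < alpha


# Hypothetical Y values for the three levels of X, 30 data points each.
hot = [20 + (i % 5) for i in range(30)]
medium = [22 + (i % 5) for i in range(30)]
cold = [21 + (i % 5) for i in range(30)]

f_stat, p_value, is_vital_x = one_way_anova_3_groups([hot, medium, cold])
print(f"F = {f_stat:.2f}, P-Value = {p_value:.6f}, vital X: {is_vital_x}")
```

For any other number of groups, use the P-Value reported by your statistical software.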

When to use this tool?

During the analyze phase to find out which of the potential Xs are the vital Xs.

3 Do’s and 3 Don’ts:

Do: Verify if your data is reliable.

Do: Collect at least 30 data points.

Do: Include the results in your project documentation.

Don’t: Forget to interpret the P-Value results in your project documentation.

Don’t: Forget to also test the spread before concluding if it is a vital X or not.

Don’t: Consider this X in the improve phase if it is not a vital X.

# P-Value

Stands for probability value

Why to use this (tool)?

It is not really a tool. It is the outcome of your statistical tests. The P-Value will indicate the final results.

How to use this tool?

Step 1: Run your statistical test.

Step 2: Check whether the P-Value is above or below 0.05.

If it is less than 0.05, you reject the null hypothesis. If it is 0.05 or greater, you accept the null hypothesis statement (see the 2 additional videos on null and alternative hypothesis).

Step 3: Record the test results in your project documentation. You can use the templates featured in this video for a better overview.
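The decision rule from Step 2 is simple enough to write down as a small Python helper, which can be handy when listing many P-Values in the project documentation. The function name `interpret_p_value`, the X names, and the P-Values below are hypothetical examples.

```python
def interpret_p_value(p_value, alpha=0.05):
    """Apply the decision rule: below alpha rejects Ho, otherwise Ho is accepted."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-Value must lie between 0 and 1")
    if p_value < alpha:
        return "reject Ho, accept Ha: vital X"
    return "accept Ho: not a vital X"


# Hypothetical test results for three potential Xs.
results = {
    "X1 (system A vs. system B)": 0.012,
    "X2 (shift)": 0.05,
    "X3 (operator)": 0.431,
}

for name, p_value in results.items():
    print(f"{name}: P-Value = {p_value} -> {interpret_p_value(p_value)}")
```

Note that a P-Value of exactly 0.05 falls on the “accept Ho” side, in line with the rule stated in Step 2.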

When to use this tool?

You will be able to use the P-Value test results in the measure phase while using the normality, stability, and capability test.

You will use the P-Value the most in the analyze phase of your project. Here you will be testing all your Xs with a variety of statistical tools.

3 Do’s and 3 Don’ts:

Do: If you obtain a P-value of exactly 0.05, collect further data if possible. As you gather more data, you will see whether it tends to go above or below that mark. This is an optional recommendation. If there is no time or budget to collect further data, you will need to accept the (Ho) null hypothesis.

Do: List all P-value results in your project documentation and explain the results.

Do: Move all Xs from the analyze phase to the improve phase if they reached a P-value of less than 0.05.

Don’t: Rely solely on the graphical results, as the naked eye will not be able to see the exact end result. Always rely on the statistical test results according to your P-Value.

Don’t: Spend time testing an X that is a low-hanging fruit where the results are obvious.

Don’t: Forget to illustrate the P-value results on the before and after Y data at the end of your project.

# Regression Test

Tests whether a correlation exists

Why to use this tool?

To find out if our X is a vital X or not when it comes to correlation with Y.

How to use this tool?

You need to have Y with continuous data and X with continuous data as well.

Step 1: Collect at least 30 data points for your Y along with 30 data points for your X in Minitab (or Excel) and put Y in one column and the X in another.

Step 2: Open Minitab and select: Stat > Regression > Fitted Line Plot…

Step 3: Enter your Y in the “Response (Y)” and your X in the “Predictor (X)” field.

Step 4: Select “OK” and view the results. If your R-sq score is 80% or higher, you have a strong correlation, and the X can be taken as a vital X. If it is below 80%, then it is not a vital X.

Note: In the Fitted Line Plot you will see a red line. The closer the blue dots are to the red line, the stronger the correlation. The further the blue dots are from the red line, the weaker the correlation.
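To show where the R-sq value comes from, here is a minimal Python sketch that fits a straight line by least squares and computes R-sq as the share of the variation in Y explained by the line. It is an illustration only, not Minitab's implementation; the function name `fitted_line` and the data are hypothetical.

```python
def fitted_line(x, y, threshold=0.80):
    """Return (slope, intercept, R-sq, is_vital_x) for a least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variation: distance of each point from the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation of Y around its mean.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_sq = 1 - ss_res / ss_tot
    return slope, intercept, r_sq, r_sq >= threshold


# Hypothetical data: 30 points where Y follows X with some scatter.
x = list(range(1, 31))
y = [2 * xi + (3 if i % 2 else -3) for i, xi in enumerate(x)]

slope, intercept, r_sq, is_vital_x = fitted_line(x, y)
print(f"Y = {intercept:.2f} + {slope:.2f} X, R-sq = {r_sq:.1%}, vital X: {is_vital_x}")
```

The `threshold` parameter encodes the 80% rule from Step 4, so the same function can be reused if your organization applies a different cut-off.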

When to use this tool?

During the analyze phase to find out which of the potential Xs are the vital Xs.

3 Do’s and 3 Don’ts:

Do: Verify if your data is reliable.

Do: Collect at least 30 data points.

Do: Include the results in your project documentation.

Don’t: Forget to interpret the R-sq results in your project documentation.

Don’t: Look at the P-Value as the first reference. In this test, first look at the R-sq results.

Don’t: Consider this X in the improve phase if it is not a vital X.

# Test for Equal Variances

Tests whether we have a difference between the spread (variance) of 2 or more groups (population samples)

Why to use this tool?

To find out whether our X is a vital X or not when it comes to the spread.

How to use this tool?

You need to have Y with continuous data and X with discrete data.

Step 1: Collect at least 30 data points for your Y along with 30 data points for your X in Minitab (or Excel) and put Y in one column and the X in another.

Step 2: Open Minitab and select: Stat > ANOVA > Test for Equal Variances…

Step 3: Make sure your drop-down box illustrates that your response data is in one column for all cell factors. Then enter your Y in the “Response” and your X in the “Factors” field.

Step 4: Select Graphs and choose the graph of your preference (for example, summary plot, individual value plot or boxplot).

Step 5: Select “OK” and view the results. If your P-Value is 0.05 or greater, you accept your null hypothesis and conclude that it is not a vital X. If the P-Value is below 0.05, you reject the null hypothesis and conclude that it is a vital X.

Note: You will get more than one P-Value. Use the Levene’s Test result when your data is not normally distributed. Use the other P-Value result (F-Test or Bartlett’s Test) when your data is normally distributed. The F-Test is used when you have 2 population samples, and Bartlett’s Test when you have more than 2 population samples.
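The selection rule in the note can be written as a small Python helper. It only encodes the rule as stated in this section and does not run any test. The function name `choose_variance_test` is hypothetical.

```python
def choose_variance_test(is_normal, n_groups):
    """Pick which P-Value to read from the Test for Equal Variances output."""
    if n_groups < 2:
        raise ValueError("at least 2 population samples are needed")
    if not is_normal:
        # Data that is not normally distributed: read the Levene's Test result.
        return "Levene's Test"
    # Normally distributed data: F-Test for 2 samples, Bartlett's Test for more.
    return "F-Test" if n_groups == 2 else "Bartlett's Test"


print(choose_variance_test(is_normal=False, n_groups=3))
print(choose_variance_test(is_normal=True, n_groups=2))
print(choose_variance_test(is_normal=True, n_groups=4))
```

Whether the data counts as normal is decided beforehand with the normality test from the measure phase.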

When to use this tool?

During the analyze phase to find out which of the potential Xs are the vital Xs.

3 Do’s and 3 Don’ts:

Do: Verify if your data is reliable.

Do: Collect at least 30 data points.

Do: Include the results in your project documentation.

Don’t: Forget to interpret the P-Value results in your project documentation.

Don’t: Forget to also test the mean before concluding if it is a vital X or not.

Don’t: Consider this X in the improve phase if it is not a vital X.