Twenty things you need to know

Donald J. Wheeler

 

Donald J. Wheeler, Ph.D.
Fellow of the American Statistical Association
Fellow of the American Society for Quality

Copyright © 2009 SPC Press, All Rights Reserved
The reproduction of any portion of this material in any form without prior written permission from the publisher is expressly prohibited.
Copies of this book may be obtained directly from SPC Press.
5908 Toole Drive, Suite C Knoxville, Tennessee 37919 (865) 584-5005 Fax (865) 588-9440 (800) 545-8602
www.spcpress.com
ISBN 978 - 0 - 945320 - 68 - 5
viii + 150 pages
58 figures, 6 reference tables, 13 data tables
Cover: Messier 45 The Pleiades
Copyright © 2000-2002 Anglo-Australian Observatory / David Malin Images Used with permission

Contents:

About the Author
Preface
1. The Two Mistakes of Data Analysis
2. Two Types of Data
3. The Origin and Structure of Variation
4. Three Questions for Success
5. Where Do the Scaling Factors Come From?
6. Why Not Just Use the Standard Deviation Statistic?
7. Don’t We Need Normal Data?
8. But the Software Transforms the Data!
9. Why Use Three-Sigma Limits?
10. How Many Data Do I Need for Limits?
11. When Do I Need to Revise the Limits?
12. But the Limits Are Too Wide!
13. When Do We Use a Median Moving Range?
14. What Can Be Considered a Signal?
15. Why a Process Behavior Chart Is the First Step in Data Analysis
16. When Do We Use Subgrouped Data?
17. What About p-Charts?
18. Don’t We Need to Remove the Outliers?
19. Don’t We Need Good Measurements?
20. The Axioms of Data Analysis
For Further Reading
Tables for Charts for Subgrouped Data
Index

 

About the author

Donald J. Wheeler is a statistician who had the good fortune to work with both Dr. W. Edwards Deming and David S. Chambers. Dr. Wheeler graduated from the University of Texas, Austin, with a B.A. Degree in Physics and Mathematics, and holds M.S. and Ph.D. Degrees in Statistics from Southern Methodist University. From 1970 to 1982 he taught in the Statistics Department at the University of Tennessee, Knoxville, where he was Associate Professor. Between 1981 and 1993 he periodically assisted Dr. Deming with his four-day seminars. He is a Fellow of the American Statistical Association and a Fellow of the American Society for Quality. He has conducted over 1000 seminars in sixteen countries on five continents. He is author or co-author of 24 books and over 160 articles. Through these seminars and books Dr. Wheeler has had a profound impact on companies and organizations around the world.

 

Preface

The full title of this book ought to be “Twenty Things You Need to Know about SPC and Data Analysis Based on Lessons Learned in over Forty Years of Study, Practice, and Collaboration,” but the space on the cover was a couple of light-years short to get it all in.
This book was conceived as a companion book for Understanding Variation, The Key to Managing Chaos. It provides brief answers to many of the commonly occurring questions that arise when people begin to use process behavior charts. While a few chapters will recap material from Understanding Variation, others will complement and complete the message of that book. In every case the objective was to enable you to better and more easily use process behavior charts to get the most out of your processes and operations.
Since not all sources of information about SPC are equally reliable, some of the chapters will, of necessity, be focused on mistakes and misinformation that are currently in circulation. In the interest of completeness, these chapters will also include enough of the background material to justify the answer given. For additional documentation on the topics of each chapter of this book see the For Further Reading section of the Appendix. The books cited there include: Understanding Statistical Process Control; Making Sense of Data; The Six Sigma Practitioner’s Guide to Data Analysis; Advanced Topics in Statistical Process Control; Normality and the Process Behavior Chart; and EMP III, Evaluating the Measurement Process.
As always, it is my hope that you will find this book to be useful.

There is light enough for those who wish to see, and darkness enough for those who are otherwise inclined.

Blaise Pascal

In this world two plus two is only equal to four on the average.

David S. Chambers

 

Chapter one


The two mistakes of data analysis

In Understanding Variation the case was made that while some data may contain signals, all data contain noise. As a consequence of this reality, we cannot ever make sense of our data until we filter out the noise. And the primary tool for this filtration was shown to be a simple process behavior chart known as an XmR Chart.
If we fail to filter out the noise, we may interpret routine variation as if it is a signal. When we do this, we end up attributing the perceived change to nonexistent causes, with the inevitable consequence that our explanations will not fit the reality in which we have to work.
On the other hand, if we realize that some variation is merely routine, and do not properly separate the routine variation from the exceptional variation, then we may well end up missing some signals of change. When this happens we will fail to understand why the change has occurred, and an opportunity to gain knowledge will be missed.
Thus we have the two mistakes that we can make when we attempt to use data. Either we can interpret noise as a signal and chase after explanations that do not exist. Or, we can interpret a signal as a bit of noise and fail to learn that which could be learned.

As a consequence of these two mistakes, we find that the separation of signals from noise is fundamental to any and every form of data analysis. Analysis requires this separation, and any attempt to interpret data without analysis is an example of numerical naivete. It is because of such naivete that Mark Twain popularized the phrase, “Lies, damned lies, and statistics.” When people confuse noise with signals, anybody can make anything out of the data. If we do not filter out the noise, confusion is inevitable.
Unfortunately, many who are numerically naive do not recognize their naivete. They think that because they can do arithmetic they are qualified to interpret data. Their approach can be summarized as: “Two numbers that are not the same are different!” This theorem of the unconsciously naive will turn everything into a signal, two points will always define a trend, and explanations will be required for all of the (unfavorable) noise in the monthly report.
The first step in avoiding this numerical naivete is to understand the nature of the two mistakes of data analysis.

Mistake One: Interpreting the routine variation of noise as if it amounted to a signal of a change in the underlying process, thereby sounding a false alarm.

Mistake Two: Thinking that a signal of a change in the underlying process is merely the noise of routine variation, thereby missing a signal.

You can avoid Mistake One by refusing to believe that anything is a signal. While Dilbert’s boss is never distracted by the data, and hence never makes Mistake One, he makes Mistake Two every day.
On the other hand, you can avoid Mistake Two by simply interpreting everything as a signal. The numbers go up, and that is a signal; the numbers go down, and that is a signal; the numbers stay the same, and that is a signal too because you expected them to go up or down. While this may keep you from ever missing a signal, you are very likely to be overwhelmed by Mistake One. A friend of mine once described a newspaper as having predicted 15 of the last three downturns in the stock market.
The trick is to strike a balance between the two mistakes by filtering out the noise of routine variation so that we can detect the potential signals within our data. This is the essence of effective data analysis.
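To make the idea of filtering out the noise concrete, the following is a minimal sketch of how the limits of an XmR Chart might be computed. It is an illustration only, not a reproduction of any worked example from this book. The scaling factors 2.66 and 3.268 are the customary constants for charts of individual values and moving ranges; they are not stated in this chapter and are assumed here. The function names and the data values are hypothetical.

```python
from statistics import mean

# Customary XmR scaling factors (assumed; not given in this chapter).
LIMIT_FACTOR = 2.66    # scales the average moving range into limits for X
RANGE_FACTOR = 3.268   # scales the average moving range into an upper range limit


def xmr_limits(values):
    """Return the central line and limits for a series of individual values."""
    if len(values) < 2:
        raise ValueError("need at least two values to form a moving range")
    # Moving ranges are the absolute differences between successive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    avg_mr = mean(moving_ranges)
    return {
        "center": center,
        "lower": center - LIMIT_FACTOR * avg_mr,
        "upper": center + LIMIT_FACTOR * avg_mr,
        "range_limit": RANGE_FACTOR * avg_mr,
    }


def potential_signals(values, limits):
    """Return the positions of values that fall outside the limits."""
    return [i for i, x in enumerate(values)
            if x < limits["lower"] or x > limits["upper"]]


# Hypothetical monthly values: mostly routine variation, one exceptional point.
data = [10, 12, 11, 13, 12, 11, 10, 12, 25, 11, 12]
limits = xmr_limits(data)
print(potential_signals(data, limits))  # prints [8]
```

Under these assumptions, only the value 25 falls outside the limits. The remaining ups and downs are treated as routine variation, which guards against Mistake One, while the one exceptional value is still flagged, which guards against Mistake Two.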

 

Other Books by Donald J. Wheeler

  • Understanding Variation, the Key to Managing Chaos
  • Understanding Statistical Process Control
  • Making Sense of Data
  • EMP III, Evaluating the Measurement Process
  • The Six Sigma Practitioner’s Guide to Data Analysis
  • Advanced Topics in Statistical Process Control
  • SPC at the Esquire Club
  • Normality and the Process Behavior Chart
  • Short Run SPC
  • Beyond Capability Confusion
  • The Process Evaluation Handbook
  • Range Based Analysis of Means
  • Understanding Industrial Experimentation

You can get information about these and other books by Dr. Wheeler, or place an order at www.spcpress.com.