Baldrige Scoring Guidelines and other Scoring System Guides and Tools
The scoring of responses to Criteria Items (Items) and Award applicant feedback are based on two evaluation dimensions: (1) Process and (2) Results. Criteria users need to furnish information relating to these dimensions. Specific factors for these dimensions are described below. Links to all Scoring Guidelines versions are provided on this page.

Process Scoring
Process refers to the methods your organization uses and improves to address the Item requirements in Categories 1–6. The four factors used to evaluate process are Approach, Deployment, Learning, and Integration (ADLI).

Approach refers to
- the methods used to accomplish the process
- the appropriateness of the methods to the Item requirements and the organization's operating environment
- the effectiveness of your use of the methods
- the degree to which the approach is repeatable and based on reliable data and information (i.e., systematic)

Deployment refers to the extent to which
- your approach is applied in addressing Item requirements relevant and important to your organization
- your approach is applied consistently
- your approach is used (executed) by all appropriate work units

Learning refers to
- refining your approach through cycles of evaluation and improvement
- encouraging breakthrough change to your approach through innovation
- sharing refinements and innovations with other relevant work units and processes in your organization

Integration refers to the extent to which
- your approach is aligned with your organizational needs identified in the Organizational Profile and other Process Items
- your measures, information, and improvement systems are complementary across processes and work units
- your plans, processes, results, analyses, learning, and actions are harmonized across processes and work units to support organization-wide goals

Results Scoring
Results refers to your organization's outputs and outcomes in achieving the requirements in Items 7.1–7.6 (Category 7). The four factors used to evaluate results are Levels, Trends, Comparisons, and Integration (LeTCI).

Levels refers to
- your current level of performance

Trends refers to
- the rate of your performance improvements or the sustainability of good performance (i.e., the slope of trend data)
- the breadth (i.e., the extent of deployment) of your performance results

Comparisons refers to
- your performance relative to appropriate comparisons, such as competitors or organizations similar to yours
- your performance relative to benchmarks or industry leaders

Integration refers to the extent to which
- your results measures (often through segmentation) address important customer, product, market, process, and action plan performance requirements identified in your Organizational Profile and in Process Items
- your results include valid indicators of future performance
- your results are harmonized across processes and work units to support organization-wide goals

2012 Baldrige Scoring Guidelines

Previous Versions

Business

2009 - 2010 2008 2007 2006 2005

Health Care
2009 - 2010 2008 2007 2006 2005

Has the Baldrige Award gone out of business?

For the first time in the 25-year history of the award, in 2013 there are no Baldrige business applicants, and Criteria degradation and impracticality are looking more and more like the primary causes.

To put this in perspective, not one of the 20 million for-profit businesses in the United States applied for the Baldrige Award this year. Even worse, the number of applicants has declined 73% for Health Care and 80% for Education since 2010.
Source: NIST Baldrige Website

2013 Baldrige Scoring Guidelines Improvements Needed

- Reduce the total number of words in the Scoring Guidelines by more than half to make them more practical and effective. For example, is anyone going to read the 400 words in the Results Scoring Guidelines and remember what they read each time they score? . . . not everyone, for sure. If the objective is to assess trends, comparisons, segmentation, and whether the right results are being presented, there is no need for more than 200 words.
- Add a "Segmentation" scoring dimension to the Results Scoring Guidelines to reflect the popularity of this type of feedback comment.
- Delete the "Levels" scoring dimension from the Results Scoring Guidelines . . . it is redundant with the Comparisons dimension in that you can't assign a score for Levels unless you have some relative basis for doing so . . . this is long overdue.
- Remove all reference to non-results (e.g., projections) from the Results Scoring Guidelines. The outcry when "projections" were first introduced was unanimous among the many users I encountered, including judges . . . yet they still remain.
- Remove the word "align" from the Scoring Guidelines and replace it with "integration" as appropriate.
- The terms "Multiple Requirements", "Overall Requirements", and "Basic Requirements" are confusing to most users and contribute to assessment variation. Guidance that the "requirements" don't really mean "requirements" doesn't help either. Advice to take a holistic view and not hold applicants accountable to the "requirements" . . . well, you get the picture.
- Results are quantitative by nature. So why use judgmental terms (e.g., "important", "poor", "good", "good relative", "very good", "good to excellent", "excellent", or my personal long-time favorite, "early good")? They are not needed. They introduce variation into the assessment. Get rid of them.
- How is "fully deployed without significant gaps" different from "fully deployed with significant gaps"? . . . one of several examples where the Scoring Guidelines can be improved through more careful word selection, simplification, and word count reduction. "Sustained over time" is another.
- Improve the coherence of the Results Scoring Guidelines language, including the use of "few", "little", "little to no", "limited", "limited or no", "some", "many", "many to most", "most", "majority", "fully", or my personal favorite, "mainly". Examples: Is "majority" closer to "many" or is it closer to "most"? Is "majority" a simple majority? Is "mainly" more or less than "majority"? Is "majority" between "many" and "many to most" or between "many to most" and "most"? How does "many" relate to "mainly"? . . . this act needs to be cleaned up, folks.
- Why does the "accomplishment of Mission" verbiage switch from the Trends scoring dimension to the Integration scoring dimension in the middle scoring range?
- Eliminate confounded terminology. For example, how should the terms "important", "high priority", and "key" be used in scoring results? Which of them is most important? Which should be given the highest priority? Are they all key terms? This variation in terminology is unnecessary, confusing, and contributes to scoring variation if not error.
- The Results Scoring Guidelines reference customers directly but not other key stakeholders such as workforce, suppliers, and community.

Baldrige Results Scoring Guidelines Quiz
Q: Which American document has approximately 150 more words than the other? a) the Baldrige Results Scoring Guidelines or b) Abraham Lincoln's "Gettysburg Address"?
A: The Results Scoring Guidelines have 400 words. The Gettysburg Address has 256.

Q: Which of these three terms are not used to assess results? a) "early good", b) "on time good", or c) "late good"
A: If you thought this was a trick question, unfortunately you're wrong. "Early good" is used in the 0 to 5% scoring range. Silly me. I thought getting good results early would have scored higher.

Q: Results are quantitative by nature. However, qualitative and/or judgmental guideline terms are used to assess them in all scoring bands. Is this: TRUE? or FALSE?
A: TRUE. The judgmental terms "poor", "good", "good relative", "very good", "good to excellent", "excellent", and my personal favorite, "early good", are used to assess the quantitative results. For 2011, the terms "good for nothing", "good enough", and "too good to be true" will be added . . . not true. Also not true is that, because some people do not understand what "early good" means, "on-time good" and "late good" will also be added.

Q: "World class" was once part of the Results Scoring Guidelines: TRUE? or FALSE?
A: TRUE. The early guidelines required winners to demonstrate "world class" results to score in the highest scoring range.

Q: Which of the following terms are not used to assess the quantity of results? "no", "any", "few", "little", "little or no", "limited", "limited or no", "some", "some to many", "mainly", "many", "many to most", "majority", "most to fully", or "fully"
A: Believe it or not, "some to many" and "most to fully" are not in the scoring guidelines. "Mainly"???

Q: Not one Examiner knows how to interpret the relative meaning of these results assessment terms: "important", "high priority", and "key". TRUE? or FALSE?
A: I don't know how many, but I do know that there is at least one who has never been able to figure it out (LOL).

