“Evaluation, Reliability, and Validity: How Credible are Your Diversity Initiative Assessments of Progress and Results?”

Evaluation is a task that every Diversity Practitioner will face at one time or another. Whatever your role, whether Trainer, Consultant, Chief Diversity Officer (CDO), Council Member, or ERG/BRG Leader, conducting an evaluation to assess key aspects of your Diversity and Inclusion initiatives is inevitable.

Two Definitions of Evaluation

People do not always agree on one definition of evaluation. Following are statements that reflect two different definitions:

  • “Evaluation is the systematic process of collecting and analyzing data in order to determine whether and to what degree objectives have been or are being achieved.”
  • “Evaluation is the systematic process of collecting and analyzing data in order to make a decision.”

Notice that the first ten words in each of the definitions are the same. However, the reasons (the “Why!”) for collecting and analyzing the data reflect a notable difference in the philosophies behind each definition. The first reflects a philosophy that, as an evaluator, you are interested only in knowing whether something worked, that is, whether it was effective in doing what it was supposed to do. The second statement reflects the philosophy that evaluation makes claims about the value of something in relation to the overall operation of a Diversity intervention, project, or event. Many experts agree that an evaluation should not only assess program results but also identify ways to improve the program being evaluated. A Diversity program or initiative may be effective but of limited value to the client or sponsor. For that reason, you may still use an evaluation to make a decision (the second definition) even when a program has reached its objectives (the first definition).

For some, endorsing Diversity Evaluation is a lot like endorsing regular visits to the dentist. People are quick to endorse both activities, but when it comes to doing either one, many Diversity Practitioners are very uncomfortable.

Evaluation: An Essential Element of Success

Evaluation is an absolutely essential ingredient when you are attempting to close performance gaps or improve performance. It is the only way to determine the connections between performance gaps, improvement programs, and cost-effectiveness. Evaluation is one of the most cost-effective activities in diversity performance improvement, because it is the one activity that, if applied correctly, can ensure success. It is often resisted, however, because of the fear that it could document failure. Evaluation is the process that helps us make decisions about the value of all the activities we have been engaged in and whether they are a worthwhile investment for the organization. Without systematic evaluation we are left with “wishful thinking” or self-serving impressions that are often wrong and sometimes dangerous.

All evaluation studies must satisfy two criteria: reliability and validity. Establishing these criteria up front will help you communicate your expectations to the C-Suite and any vendors who deliver programs and assist in your Diversity initiatives. Reliability, the simpler of the two, requires that all evaluation methods give the same results each time we measure. This protects you against measures whose results change from one use to the next because of the measuring instrument itself rather than because of any real change in what is being measured. Reliability is relatively easy to achieve, yet its importance is often overlooked. To achieve it, you must use specific Diversity science procedures and instruments for measuring the aspects of Diversity performance and goal achievement that are reflected in the initiative’s objectives, strategies, and the organization’s performance gaps. Next, you have to standardize these procedures so that they measure in the same way every time. These activities can be perfectly compatible with the way correctly designed Diversity initiatives are structured and administered.

The second criterion, validity, requires that all evaluations measure exactly and only what they are supposed to measure. This criterion is one of the requirements most often violated in Diversity performance and other assessments. For example, if we attempt to measure the amount of knowledge employees gained in a Diversity Competency Training program using a “Reaction” form that asks them how much they learned, the results will indicate how much employees “think” they learned, not how much they “actually” learned. Reaction forms too often report high amounts of learning when little occurred and vice versa (Clark, 1982). Consequently, training reaction evaluation could be reliable but not valid in these cases, because the actual results were the opposite of what the invalid instrument reliably reported! If the instrument reports the same invalid result each time it is used, it is still reliable, which is why we need both reliability and validity for all evaluation activities.
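The difference between the two criteria can be illustrated with a small numerical sketch. The Python example below is an assumption-laden illustration, not a prescribed method: the scores for six participants are invented, and the Pearson correlation is used as one common (but not the only) way to express test-retest reliability and agreement with an objective measure. The names `reaction_first`, `reaction_second`, and `test_gain` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for six participants (invented for illustration only).
# Self-rated learning on a 1-10 "Reaction" form, administered twice.
reaction_first = [9, 8, 9, 7, 8, 9]
reaction_second = [9, 8, 8, 7, 8, 9]
# Actual learning: post-test minus pre-test score, in percentage points.
test_gain = [10, 25, 5, 30, 20, 15]

# Reliability check: does the form give similar results each time it is used?
reliability = pearson(reaction_first, reaction_second)

# Validity check: does the form agree with what participants actually learned?
validity = pearson(reaction_first, test_gain)

print(f"Test-retest correlation of the reaction form: {reliability:.2f}")
print(f"Correlation of the reaction form with test gains: {validity:.2f}")
```

With these invented numbers, the reaction form agrees closely with itself across the two administrations, yet it runs opposite to the measured learning gains. That is the "reliable but not valid" pattern described above; real data would of course need a larger sample and a properly designed knowledge test.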

An example of a valid measurement of learning would be a Diversity competency problem-solving exercise or memory test (provided it represents the knowledge and skills the participants learned during the training). The more you make use of Diversity sciences and research evidence about the event being measured, the better your chances of achieving validity. Performance evaluation systems such as the Hubbard 7-Level Evaluation Methodology integrate these approaches in the process.

Conducting a comprehensive Diversity Evaluation is the only true way to know if Diversity and Inclusion programs or initiatives are delivering the results expected by key stakeholders. It is essential that Diversity Practitioners master critical Diversity and Inclusion evaluation methods using technologies that are rooted in Diversity ROI® science. Why? Because our perceived value and credibility as true Business Partners and Professionals depend on it!