
How to Build Your Own Norm Group for Hiring

Generic norm groups say little about your specific context. Learn how to build your own norm group that makes your hiring data truly valuable.

By Ingmar van Maurik · Founder & CEO, Making Moves


Why generic norm groups fall short

When you use assessments in your recruitment process, the results are always compared to a norm group: a reference population that determines what constitutes a high, average, or low score. Most assessment vendors offer generic norm groups based on thousands of people who have taken the test.

That sounds solid, but there is a fundamental problem. That generic norm group contains people from all sorts of industries, functions, levels, and countries. A score of 70 on a cognitive test might be excellent for one role but subpar for another. It is like comparing a marathon runner's performance with the average of all athletes, including swimmers, weightlifters, and chess players.

For companies serious about data-driven hiring, building your own norm group is not a luxury but a necessity. In this article we explain how to build one, step by step.

What exactly is a norm group

A norm group is a collection of scores from a specific population that serves as a reference point for interpreting new scores. When a candidate takes an assessment and receives a score, that score alone says little. Only when you compare that score to a relevant norm group do you know whether it is good or bad.

The elements of a norm group

| Element | Description | Example |
|---------|-------------|---------|
| Population | The group you compare against | Senior developers at tech companies in the Netherlands |
| Size | The number of people in the norm group | Minimum 100, preferably 200+ |
| Relevance | How well the norm group fits your target audience | Industry, function level, experience |
| Recency | How recent the data is | Scores from the past 2 years |
| Validation | Whether the norm group is linked to performance outcomes | Correlation between scores and job performance |

Generic vs. specific

A generic norm group is broad and readily available. A specific norm group is narrow and must be built yourself. The difference in value is enormous:

Generic norm group: tells you a candidate scores better than 70 percent of all people who have ever taken this test. Useful as a baseline, but not informative enough for good decisions.

Own norm group: tells you a candidate scores better than 70 percent of successful senior developers at comparable companies. Now you actually know something.

How to build your own norm group

Step 1: define your target population

Start by clearly defining who you are building the norm group for. Be as specific as useful:

  • Function category: all developers, or specifically backend developers, or even Python backend developers
  • Seniority: junior, mid-level, senior, lead
  • Industry: tech, finance, consulting, government
  • Company size: startup, scale-up, enterprise
  • Region: Netherlands, Benelux, Europe
The more specific, the more valuable the norm group, but also the longer it takes to collect sufficient data. Start broad and refine as you gather more data.

Step 2: collect assessment data

The core of your norm group is assessment data from candidates who have gone through your process. Ideally you collect:

  • Assessment scores from all candidates who took the test, not just those who were hired
  • Demographic context such as experience level and background, to be able to segment the norm group
  • Process outcomes such as whether the candidate was hired and how that person subsequently performed
It is essential that you collect data from all candidates, not just hired ones. If your norm group consists only of hires, it is skewed and not representative of the actual distribution.
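As a rough illustration of what such a record could look like, here is a minimal Python sketch. The `AssessmentRecord` schema, its field names, and the `norm_group_scores` helper are all hypothetical; they are not taken from any particular assessment tool. The point it shows is that the norm group is built from every candidate in a category, regardless of the hiring outcome.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AssessmentRecord:
    """One candidate's assessment result plus context (hypothetical schema)."""
    candidate_id: str
    score: float                                 # raw assessment score
    function_category: str                       # e.g. "backend developer"
    seniority: str                               # e.g. "senior"
    hired: bool                                  # process outcome
    performance_rating: Optional[float] = None   # added later, only known for hires


def norm_group_scores(records: List[AssessmentRecord], function_category: str) -> List[float]:
    """Return the scores of ALL candidates in a category, hired or not."""
    return [r.score for r in records if r.function_category == function_category]


# Made-up example records.
records = [
    AssessmentRecord("c1", 62.0, "backend developer", "senior", hired=True, performance_rating=4.0),
    AssessmentRecord("c2", 48.0, "backend developer", "senior", hired=False),
    AssessmentRecord("c3", 71.0, "product manager", "mid-level", hired=True),
]

scores = norm_group_scores(records, "backend developer")
print(scores)  # the rejected candidate c2 is included on purpose
```

Keeping the outcome fields on the same record makes later segmentation and validation possible without joining separate spreadsheets.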

Step 3: determine the minimum size

A norm group must be large enough to be statistically reliable. The rules of thumb:

  • Minimum 30 scores for a first indication, but not yet reliable
  • 100 scores for a usable norm group with reasonable reliability
  • 200 to 300 scores for a solid norm group with high reliability
  • 500+ scores for an excellent norm group you can use with confidence
At 50 assessed candidates per year for a specific function category, you reach a usable norm group of 100 scores after 2 years and a solid one of 200 scores after 4 years. Too long? Then start broader, for example all technical functions, and refine later.
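The rules of thumb and the time estimate above can be written down as a small Python sketch. The function names and the "insufficient" label for fewer than 30 scores are assumptions made for illustration, and the range between 300 and 500 scores, which the list does not name, is treated here as "solid".

```python
import math


def reliability_label(n_scores: int) -> str:
    """Map a norm group size to the rule-of-thumb labels from the list above."""
    if n_scores >= 500:
        return "excellent"
    if n_scores >= 200:
        return "solid"          # assumption: 300-499 is also treated as solid
    if n_scores >= 100:
        return "usable"
    if n_scores >= 30:
        return "first indication"
    return "insufficient"       # assumption: label for fewer than 30 scores


def years_to_reach(target_size: int, scores_per_year: int) -> int:
    """Whole years needed to collect target_size scores at a constant yearly rate."""
    if scores_per_year <= 0:
        raise ValueError("scores_per_year must be positive")
    return math.ceil(target_size / scores_per_year)


print(reliability_label(120))    # usable
print(years_to_reach(100, 50))   # 2
print(years_to_reach(200, 50))   # 4
```

A check like `reliability_label` is the kind of signal that can tell you when a segment has grown large enough to split off from a broader norm group.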

Step 4: calculate norm statistics

With sufficient data, you calculate the statistics that define your norm group:

Mean and standard deviation form the foundation. They tell you what the typical score is and how large the spread is.

Percentile scores translate raw scores into a position relative to the group. A percentile score of 75 means the candidate scores better than 75 percent of the norm group.

Score bands define what is low, average, and high for your context:

| Category | Percentile range | Interpretation |
|----------|------------------|----------------|
| Low | 0-25 | Below expectations for this role |
| Below average | 25-40 | Point of attention, investigate further |
| Average | 40-60 | On level for this role |
| Above average | 60-75 | Strong candidate |
| High | 75-100 | Exceptional candidate for this role |
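These statistics can be sketched in a few lines of Python using only the standard library. The function names and the example scores are made up for illustration, and the ten-score list is far too small for a real norm group. One assumption to note: the band boundaries in the table overlap (25 appears in two rows), so this sketch treats the lower bound of each band as inclusive.

```python
from statistics import mean, stdev


def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores strictly below the given score."""
    if not norm_scores:
        raise ValueError("norm group is empty")
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


def score_band(percentile):
    """Map a percentile to a band; the lower bound of each band is inclusive (assumption)."""
    if percentile < 25:
        return "Low"
    if percentile < 40:
        return "Below average"
    if percentile < 60:
        return "Average"
    if percentile < 75:
        return "Above average"
    return "High"


# Made-up norm group; real norm groups need far more scores.
norm_scores = [40, 45, 50, 52, 55, 58, 60, 63, 67, 70]

norm_mean = mean(norm_scores)   # typical score
norm_sd = stdev(norm_scores)    # sample standard deviation (spread)

candidate_percentile = percentile_rank(64, norm_scores)
candidate_band = score_band(candidate_percentile)

print(norm_mean, round(norm_sd, 1))
print(candidate_percentile, candidate_band)  # 80.0 High
```

Other percentile definitions exist, for example counting ties as half; whichever you pick, apply it consistently across candidates.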

Step 5: validate against performance outcomes

A norm group becomes truly valuable when you validate it against actual performance. This means investigating the relationship between assessment scores and:

  • Performance reviews after 6 and 12 months
  • Retention: did high scorers stay longer
  • Promotion speed: were high scorers promoted faster
  • Objective performance metrics: sales figures, code quality, customer satisfaction
If you discover that a score of 65 on a specific assessment strongly correlates with success in the role, you know that 65 is a meaningful threshold for your norm group. This is the foundation of predictive hiring.
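One common way to quantify this relationship is the Pearson correlation between assessment scores and a later performance measure. The sketch below is a minimal illustration in plain Python; the `pearson_r` helper and the five data points are made up, and a real validation needs far more observations.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Made-up data: assessment scores of hires and their 12-month review ratings.
scores = [55, 60, 65, 70, 75]
ratings = [2.5, 3.0, 3.0, 4.0, 4.5]

r = pearson_r(scores, ratings)
print(round(r, 2))  # 0.96
```

Keep in mind that performance data exists only for people you hired, so a correlation computed this way is affected by range restriction and will tend to understate the relationship in the full candidate population.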

Step 6: maintain and recalibrate

A norm group is not a static product. You need to regularly update it:

  • Add new data as you assess more candidates
  • Remove outdated data that is no longer representative, for example after a significant change in your company or market
  • Recalibrate periodically by re-examining the relationship between scores and performance
  • Segment when you have enough data to create subgroups
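Removing outdated data can be as simple as filtering on the assessment date. The sketch below assumes each score is stored together with the date it was taken; the `current_norm_scores` function and the 730-day window (roughly the two-year recency mentioned earlier) are illustrative choices, not fixed rules.

```python
from datetime import date, timedelta


def current_norm_scores(dated_scores, today, max_age_days=730):
    """Keep only scores taken within the last max_age_days days."""
    cutoff = today - timedelta(days=max_age_days)
    return [score for taken_on, score in dated_scores if taken_on >= cutoff]


# Made-up (date taken, score) pairs.
dated_scores = [
    (date(2021, 3, 1), 51.0),
    (date(2023, 6, 15), 64.0),
    (date(2024, 1, 10), 58.0),
]

recent = current_norm_scores(dated_scores, today=date(2024, 6, 1))
print(recent)  # [64.0, 58.0]
```

After a significant change in the company or the role, the same filter can be used with a fixed cutoff date instead of a rolling window.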

The role of technology

Building and maintaining your own norm group manually is an enormous task. You need data from multiple sources, statistically correct calculations, and continuous monitoring. This is where a custom hiring system proves its value.

A good system:

  • Automatically collects all relevant data with every assessment
  • Calculates norm statistics in real time as the dataset grows
  • Signals when the norm group is large enough for reliable conclusions
  • Automatically displays candidate scores in the context of the right norm group
  • Continuously recalibrates the relationship between scores and performance
Without such a system, you depend on spreadsheets, manual calculations, and the goodwill of someone who understands statistics. That is not scalable and not reliable.

Common mistakes

Only including hires in the norm group: this leads to range restriction. You only see the top of the distribution and miss the full picture.

Using too small a norm group: with fewer than 50 scores, your conclusions are statistically unreliable. Wait until you have enough data or start broader.

Never recalibrating: your norm group becomes outdated as the labor market changes, your company grows, or the role evolves. Plan at least an annual review.

Using one norm group for everything: different roles require different norm groups. A norm group for developers is not usable for product managers.
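The first mistake, range restriction, is easy to see with a small made-up example. In the Python sketch below the same candidate score is compared against all candidates and against hires only, under the simplifying assumption that only the top scorers were hired. All names and numbers are illustrative.

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores strictly below the given score."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


# Made-up scores; assume only the four highest scorers were hired.
all_candidates = [35, 40, 45, 50, 55, 60, 65, 70, 75, 80]
hires_only = [65, 70, 75, 80]

candidate_score = 68
vs_all = percentile_rank(candidate_score, all_candidates)
vs_hires = percentile_rank(candidate_score, hires_only)

print(vs_all)    # 70.0
print(vs_hires)  # 25.0
```

The same score looks above average against the full candidate pool and low against hires only, which is why the population behind a norm group always needs to be stated alongside the percentile.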

Key takeaways

  • Generic norm groups are insufficient for companies serious about data-driven hiring
  • Your own norm group compares candidates with a relevant reference group specific to your context
  • A usable norm group requires at least 100 scores; 200 to 300 make it solid, and 500+ is excellent
  • Validation against performance outcomes makes the norm group truly predictive
  • Maintenance and recalibration are essential to keep the norm group current
  • A custom hiring system makes building and maintaining norm groups scalable

Want to start building your own norm group? Get in touch or read how generic assessments fall short and why customization is the future.

