Versions

no_ci_v1

  • Description: Assessment version without any Confidence Intervals (CIs).

ci_v2

  • Description: Assessment version that introduces Confidence Intervals (CIs) at the skill, question, and interview levels.

pf_domain_v3

  • Description: Assessment version that adds a Pass/Fail outcome and domain grouping, with Confidence Intervals (CIs) at the skill, question, and interview levels.
  • Features:
    • Structured by assessment domains.
    • Role-based Pass/Fail scoring for improved candidate filtering.
    • Domain-level scoring with weighted importance.
    • Core domain coverage requirement for passing.

pf_domain_3p_v4

  • Description: Most recent version, with a Pass/Fail outcome, domain grouping, and Confidence Intervals (CIs) at the skill, question, and interview levels.
  • Features:
    • Assessment is grouped by domains.
    • Role-based Pass/Fail outcome for each candidate.
    • Enhanced with a new tweak for better candidate evaluation.
    • Domain-level scoring with weighted importance.
    • Core domain coverage requirement for passing.

Role-Based Pass/Fail Scoring Approach

The Pass/Fail versions (pf_domain_v3 and pf_domain_3p_v4) use a role-based scoring approach that helps determine whether a candidate passes or fails an interview based on the skills relevant to the role. This methodology consists of three main steps:

Step 1: Map Skills to Role Domains

  1. Define the role – For example, Backend Engineer.
  2. Create a mapping between skills and domains:
    • Group related skills into domains (e.g., Data & DB – SQL, NoSQL, Database Design)
    • Assign a domain type:
      • Core – Essential skills for the role
      • Secondary – Important but not critical
      • Optional – Nice-to-have skills
    • Assign a domain weight to reflect importance (determined by subject-matter experts):
      • Core → higher weight
      • Secondary → medium weight
      • Optional → lower weight
    Note: Weights are scaled so that the weights of all included domains sum to 1 when the interview score is calculated. For reference, see the domain and skills mapping spreadsheet.
    Example domain configuration:
    Domain                  Category   Weight
    API Design              core       0.30
    Data & DB               core       0.26
    System Design           secondary  0.19
    Languages & Frameworks  secondary  0.15
    Performance             optional   0.07
    Cloud & DevOps          optional   0.04
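
The example configuration can also be written down in code. The following is a minimal sketch, not the system's actual implementation: the `DomainConfig` class and the `normalize_weights` helper are hypothetical names introduced here for illustration. It stores the example rows and rescales the weights so that they sum to 1, as described in the note.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainConfig:
    """Hypothetical container for one row of the domain configuration."""
    name: str
    category: str  # "core", "secondary", or "optional"
    weight: float


# Values copied from the example domain configuration.
BACKEND_ENGINEER_DOMAINS = [
    DomainConfig("API Design", "core", 0.30),
    DomainConfig("Data & DB", "core", 0.26),
    DomainConfig("System Design", "secondary", 0.19),
    DomainConfig("Languages & Frameworks", "secondary", 0.15),
    DomainConfig("Performance", "optional", 0.07),
    DomainConfig("Cloud & DevOps", "optional", 0.04),
]


def normalize_weights(domains):
    """Return {domain name: weight}, rescaled so the weights sum to 1."""
    total = sum(d.weight for d in domains)
    if total <= 0:
        raise ValueError("at least one domain with a positive weight is required")
    return {d.name: d.weight / total for d in domains}


weights = normalize_weights(BACKEND_ENGINEER_DOMAINS)
```

As listed, the example weights add up to 1.01 (presumably a rounding effect), so the rescaling step changes each value slightly while preserving their relative order.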

Step 2: Map Candidate Skills to Domains

For each interview:
  1. Collect the candidate’s demonstrated skills from the interview.
  2. Map each skill to its corresponding domain based on the mapping table.
  3. Calculate the domain score:
    • Domain Score = average(skill scores within that domain)
    • If the candidate did not demonstrate any skills in a domain, mark the domain as “Not Mentioned” rather than penalizing the candidate.
    • In the PDF, leave an undemonstrated domain blank.
    • Domain-level confidence intervals use the same approach, applied to the skills pooled within the domain.
  4. Calculate Core Domain Coverage:
    • Core Domain Coverage = (Number of Core Domains with Demonstrated Skills) ÷ (Total Number of Core Domains)
    • To be considered PASSED, a candidate must demonstrate skills in 100% of Core Domains
    • If any Core Domain is not demonstrated, the candidate is automatically marked FAILED, regardless of the overall interview score
    Example domain scoring:
    Domain                  Skill names        Observations  Mean score across skills  Confidence interval (95%)  Stability
    Data & DB               SQL, NoSQL         9             80                        75-83                      Reliable
    API Design              REST               7             75                        73-80                      Reliable
    Languages & Frameworks  Git, Docker, Java  6             70                        65-77                      Moderately Uncertain
    Performance             Not Mentioned
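
The domain score and Core Domain Coverage calculations can be sketched as follows. This is an illustration under stated assumptions: the function names, the use of `None` to stand for “Not Mentioned”, and the individual skill scores in the usage example are all hypothetical. The skill scores were chosen only so that the domain means match the example table.

```python
from statistics import mean


def domain_scores(skill_scores, skill_to_domain, all_domains):
    """Average skill scores per domain; None stands for "Not Mentioned"."""
    grouped = {domain: [] for domain in all_domains}
    for skill, score in skill_scores.items():
        domain = skill_to_domain.get(skill)
        if domain in grouped:  # skills without a known domain are ignored
            grouped[domain].append(score)
    return {d: (mean(s) if s else None) for d, s in grouped.items()}


def core_domain_coverage(scores, core_domains):
    """Fraction of core domains with at least one demonstrated skill."""
    if not core_domains:
        raise ValueError("the role must define at least one core domain")
    demonstrated = [d for d in core_domains if scores.get(d) is not None]
    return len(demonstrated) / len(core_domains)


# Hypothetical skill scores for one interview.
skill_scores = {"SQL": 82, "NoSQL": 78, "REST": 75}
skill_to_domain = {"SQL": "Data & DB", "NoSQL": "Data & DB", "REST": "API Design"}
all_domains = ["Data & DB", "API Design", "Performance"]
core_domains = ["Data & DB", "API Design"]

scores = domain_scores(skill_scores, skill_to_domain, all_domains)
coverage = core_domain_coverage(scores, core_domains)
```

Here `scores["Performance"]` is `None` rather than 0, so the undemonstrated domain is not penalized, and `coverage` is 1.0 because both core domains have at least one demonstrated skill.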

Step 3: Calculate Interview Score and Determine Pass/Fail

  1. Combine domain scores using the domain weights to calculate a new interview score based on domains.
  2. Keep confidence intervals to reflect uncertainty in the scoring.
  3. Only include domains with demonstrated skills in the score calculation.
  4. Determine Pass/Fail:
    • Initial baseline:
      • PASS → top 50% of scores
      • FAIL → bottom 50%
    • Absolute threshold: After the first evaluation batch, an absolute threshold for Pass/Fail is introduced (calibrated using real interview data). This threshold helps ensure domain-level reliability and is iteratively refined over time as more data becomes available.
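
Step 3 can be sketched in code as well. This is a minimal illustration, not the production logic: the function names are hypothetical, `None` stands for “Not Mentioned”, the threshold of 70 in the usage example is an invented value, and treating "top 50%" as "at or above the batch median" is an assumption about how ties are handled.

```python
from statistics import median


def interview_score(domain_scores, weights):
    """Weighted mean over demonstrated domains, weights rescaled to sum to 1."""
    included = {d: s for d, s in domain_scores.items()
                if s is not None and d in weights}
    total_weight = sum(weights[d] for d in included)
    if total_weight == 0:
        raise ValueError("no demonstrated domain with a positive weight")
    return sum(weights[d] * s for d, s in included.items()) / total_weight


def baseline_threshold(batch_scores):
    """Initial baseline (assumption): the median of the first batch's scores."""
    return median(batch_scores)


def pass_fail(score, threshold, core_coverage):
    """Any missing core domain fails the candidate regardless of the score."""
    if core_coverage < 1.0:
        return "FAILED"
    return "PASSED" if score >= threshold else "FAILED"


# Domain means from the Step 2 example and weights from the Step 1 example.
domain_scores = {"Data & DB": 80, "API Design": 75,
                 "Languages & Frameworks": 70, "Performance": None}
weights = {"API Design": 0.30, "Data & DB": 0.26, "System Design": 0.19,
           "Languages & Frameworks": 0.15, "Performance": 0.07,
           "Cloud & DevOps": 0.04}

score = interview_score(domain_scores, weights)
outcome = pass_fail(score, threshold=70.0, core_coverage=1.0)
```

With these inputs only three domains are included, their weights total 0.71, and the score works out to about 75.8. The core coverage check runs before the threshold comparison, so a high score cannot compensate for a missing core domain.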

Limitations

If a candidate demonstrates only one skill within a domain, the domain is still considered assessed. However, the resulting domain score may be biased by limited evidence from a single skill.
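
One way to make this limitation visible to reviewers is to flag domains whose score rests on a single skill. The helper below is a hypothetical sketch, not a described feature of any assessment version; its name and input shape are assumptions.

```python
def single_skill_domains(skills_by_domain):
    """Return the domains whose score is based on exactly one demonstrated skill."""
    return sorted(d for d, skills in skills_by_domain.items() if len(skills) == 1)


# Skill names taken from the Step 2 example.
skills_by_domain = {
    "Data & DB": ["SQL", "NoSQL"],
    "API Design": ["REST"],
    "Languages & Frameworks": ["Git", "Docker", "Java"],
}
flagged = single_skill_domains(skills_by_domain)
```

In this example only API Design is flagged, since its score depends on REST alone.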