Ask most L&D teams what they can tell you about their training, and you'll get a familiar answer: who completed it. Maybe a satisfaction score. Possibly an assessment average if the LMS is feeling generous.
Then someone in senior leadership asks the question every L&D lead dreads — "how do we know this is working?" — and the completion dashboard suddenly looks very thin.
This piece is about what to track instead, why most ROI calculations don't survive a finance review, and how to build a measurement approach that holds up when the budget conversation happens.
Effectiveness is the degree to which training changes something that matters to the business — capability, behaviour, or a downstream business outcome. That's it.
It's not how many people finished. It's not how learners rated the course. It's not how much content you shipped. Those are inputs and signals, not outcomes.
The reason this distinction matters: most L&D measurement frameworks were designed in the era of classroom training, when delivery was the expensive part. In eLearning, delivery is cheap. The expensive part is the decision to invest — which means the only measurement that actually justifies the spend is measurement tied to outcomes.
Kirkpatrick's four levels (reaction, learning, behaviour, results) get cited in every L&D measurement article ever written, including the one we just replaced. They're a useful taxonomy. They are not, on their own, a measurement plan.
Here's what actually happens in most organisations:
So the practical question isn't "should we use Kirkpatrick?" — it's "how do we actually get past Level 2?" Two things help:
If you want the modern, less-prescriptive evolution of this thinking, the New World Kirkpatrick model and Brinkerhoff's Success Case Method are both worth a look. They start from the same idea — measure outcomes, not effort — but with sharper tools.
The formula is the easy part:
ROI (%) = [(Benefits − Costs) / Costs] × 100
If a training programme costs $50,000 and delivers $125,000 in measurable benefit, you've got a 150% return. Simple.
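As a quick check on the arithmetic, the formula can be sketched in a few lines of Python. The function name `roi_percent` is ours, purely for illustration, and the figures are the ones from the example above.

```python
def roi_percent(benefits: float, costs: float) -> float:
    """Return ROI as a percentage: ((benefits - costs) / costs) * 100."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return (benefits - costs) / costs * 100


# Figures from the worked example: $50,000 cost, $125,000 benefit.
example_roi = roi_percent(benefits=125_000, costs=50_000)
print(f"{example_roi:.0f}%")  # prints 150%
```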
The hard part is "measurable benefit." Specifically: how much of that $125,000 can you defensibly attribute to the training, versus the new manager who started the same month, the system upgrade that went live in Q2, or the seasonal lift in sales that happens every year regardless?
This is where most L&D ROI stories fall over in front of a CFO. Three things make them more defensible:
1. Pick outcomes you can actually isolate. Compliance error rates on a specific process. Time-to-competency for new hires. Customer service handling time on a specific interaction type. The narrower and more directly trained-for the outcome, the easier the attribution conversation.
2. Use a control group where you can. Roll the training out to one region before another. Compare cohorts who have completed the training against cohorts who haven't. It's not a randomised trial, but it gives you something to point to.
3. Be honest about contribution, not credit. "This training contributed to a 12% reduction in customer complaints over six months, alongside the new escalation process and the team restructure" is a stronger story than a confident "150% ROI" that a CFO can poke holes in. Finance teams respect L&D leaders who don't oversell.
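One rough way to make the control-group comparison concrete is to ask how far the trained cohort's metric moved beyond the movement in a comparison cohort over the same period. The sketch below assumes you have before-and-after readings for both groups. The function names and the weekly complaint counts are hypothetical, and the result is an indication of contribution, not proof of cause.

```python
from statistics import mean


def cohort_change_pct(before: list[float], after: list[float]) -> float:
    """Percentage change in a cohort's mean from the 'before' to the 'after' period."""
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline mean must be non-zero")
    return (mean(after) - baseline) / baseline * 100


def change_beyond_comparison(
    trained_before: list[float],
    trained_after: list[float],
    comparison_before: list[float],
    comparison_after: list[float],
) -> float:
    """Trained cohort's change minus the comparison cohort's change, in percentage points."""
    trained = cohort_change_pct(trained_before, trained_after)
    comparison = cohort_change_pct(comparison_before, comparison_after)
    return trained - comparison


# Hypothetical weekly complaint counts for two regions.
trained_before = [40.0, 44.0, 36.0]     # mean 40
trained_after = [33.0, 35.0, 34.0]      # mean 34, a 15% drop
comparison_before = [50.0, 48.0, 52.0]  # mean 50
comparison_after = [48.0, 49.0, 47.0]   # mean 48, a 4% drop

gap = change_beyond_comparison(
    trained_before, trained_after, comparison_before, comparison_after
)
print(f"{gap:.1f} percentage points")  # prints -11.0 percentage points
```

Reporting the gap between the two cohorts, not the trained cohort's raw change, is what keeps the seasonal lift and the system upgrade out of your headline number.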
Beyond completion, here are the metrics that actually tell you something useful. Roughly in the order you should care about them.
Behaviour change in the workplace. The metric that matters most, and the hardest to capture. Sources include manager observation surveys, peer feedback, work-sample analysis, or direct performance data from the system the work happens in. It is usually measured 3–6 months after training, once people have had time to apply what they learned. If you only set up one Level 3 measurement, make it a 90-day manager pulse check on a specific observable behaviour.
Assessment performance by question. Not the average score, which hides everything. Look at which questions a high proportion of learners get wrong. That tells you whether the issue is content, question design, or a genuine capability gap. A 70% average score where everyone misses the same two questions is a very different problem from a 70% average where the misses are scattered.
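A minimal sketch of that question-level view, assuming you can export each learner's responses as a mapping of question ID to correct or incorrect. The data shape, the function names, and the 50% threshold are all assumptions for illustration, not the export format of any particular LMS.

```python
from collections import Counter


def miss_rates(responses: list[dict[str, bool]]) -> dict[str, float]:
    """Fraction of learners who answered each question incorrectly."""
    misses: Counter = Counter()
    attempts: Counter = Counter()
    for learner in responses:
        for question, correct in learner.items():
            attempts[question] += 1
            if not correct:
                misses[question] += 1
    return {question: misses[question] / attempts[question] for question in attempts}


def concentrated_misses(
    responses: list[dict[str, bool]], threshold: float = 0.5
) -> list[str]:
    """Questions missed by at least `threshold` of the learners who attempted them."""
    rates = miss_rates(responses)
    return sorted(q for q, rate in rates.items() if rate >= threshold)


# Hypothetical responses from four learners (True means answered correctly).
responses = [
    {"q1": True, "q2": False, "q3": True, "q4": False},
    {"q1": True, "q2": False, "q3": True, "q4": True},
    {"q1": True, "q2": False, "q3": False, "q4": True},
    {"q1": True, "q2": True, "q3": True, "q4": True},
]

print(concentrated_misses(responses))  # prints ['q2']
```

Here the misses pile up on one question, which points at the content or the question wording, not at the learners.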
Drop-off points within the course. Where do learners abandon the content? At what interaction, on which slide, after what video? Module-level completion data won't tell you. Interaction-level data will. If half your learners drop off at the same scenario, you have a specific thing to fix, not a vague "engagement problem."
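If your platform can tell you the last interaction each learner reached, finding the drop-off point is a simple count. The sketch below assumes exactly that input; the interaction names, learner IDs, and function name are hypothetical.

```python
from collections import Counter


def drop_off_share(
    interaction_order: list[str], last_seen: dict[str, str]
) -> dict[str, float]:
    """Share of all learners whose last recorded interaction was each non-final step."""
    if not interaction_order or not last_seen:
        raise ValueError("need at least one interaction and one learner")
    final_step = interaction_order[-1]
    # Learners whose last interaction is the final step finished the course.
    exits = Counter(step for step in last_seen.values() if step != final_step)
    total = len(last_seen)
    return {step: exits[step] / total for step in interaction_order[:-1]}


# Hypothetical course structure and last interaction reached per learner.
interaction_order = ["intro", "scenario_1", "scenario_2", "quiz"]
last_seen = {
    "learner_a": "quiz",
    "learner_b": "scenario_2",
    "learner_c": "scenario_2",
    "learner_d": "intro",
    "learner_e": "quiz",
    "learner_f": "scenario_2",
}

shares = drop_off_share(interaction_order, last_seen)
worst = max(shares, key=shares.get)
print(worst, shares[worst])  # prints scenario_2 0.5
```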
Time on task. Long times can mean engagement or confusion; you have to look at them next to assessment performance to know which. Very short times usually mean skipping. Useful as a diagnostic alongside other metrics; rarely useful on its own.
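To show what "next to assessment performance" might look like, here is a deliberately crude sketch that reads time and score together. The cut-offs (half and one-and-a-half times the expected duration, a 0.7 pass score) are placeholder assumptions you would need to tune for your own content.

```python
def read_time_signal(
    minutes: float,
    score: float,
    expected_minutes: float,
    pass_score: float = 0.7,
) -> str:
    """Label a learner's time on task, using their assessment score to disambiguate."""
    if minutes < 0.5 * expected_minutes:
        # Very short times usually mean the content was skipped.
        return "likely skipping"
    if minutes > 1.5 * expected_minutes:
        # A long time is only informative alongside the score.
        return "engaged" if score >= pass_score else "possibly confused"
    return "typical"


# Hypothetical learners on a module expected to take 20 minutes.
print(read_time_signal(minutes=5, score=0.9, expected_minutes=20))   # prints likely skipping
print(read_time_signal(minutes=35, score=0.9, expected_minutes=20))  # prints engaged
print(read_time_signal(minutes=35, score=0.4, expected_minutes=20))  # prints possibly confused
```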
Repeat visits. If learners come back to a specific section repeatedly, either it's a reference resource (good) or it's confusing (bad). Cross-reference with assessment data on the related topic to figure out which.
Satisfaction scores. A signal of design quality, not of learning impact. Useful for spotting problems (if learners hate a course, they probably won't apply it), but a high score doesn't prove anything. People rate things they enjoy highly. Enjoyment ≠ behaviour change.
We're not anti-completion. We're anti-completion-as-the-only-metric. As Chris Dwyer, Learning Systems Manager at Specsavers, put it in a conversation with us: "I wouldn't call them outdated, I'd call them unfashionable." His more useful framing: completion identifies the top of your measurement funnel — the people you now need to track to understand whether their capability, clarity, or motivation actually shifted. Treat it as a starting point for measurement, not the finish line.
Leading indicators are the early signals — engagement, time on task, assessment performance, satisfaction. You can measure them during or immediately after training. They tell you whether the programme is on track and let you fix things in real time.
Lagging indicators are the outcomes — behaviour change, performance improvement, business metrics. They show up months later. They tell you whether the training actually mattered.
The mistake L&D teams make is reporting one or the other, not both. Leading indicators on their own look like activity ("learners are engaged!"). Lagging indicators on their own look like luck ("sales went up, possibly because of training, possibly not"). Together, they tell a story: engagement was strong → assessment performance shifted → 90-day behaviour observation shows X → the business metric moved Y.
If your leading indicators are strong but your lagging indicators are flat, the problem isn't the learning — it's the transfer environment. Manager support, workflow design, available resources. That's worth knowing too. It's also a useful thing to tell stakeholders, because it shifts the conversation from "did the training work?" to "what's stopping it from sticking?" — which is a much more productive question.
We've gone deeper on this distinction in our leading and lagging indicators guide if you want the longer version.
Most L&D measurement plans fail not because the metrics are wrong, but because no one set them up before the training launched. By the time someone asks "did it work?", there's no baseline to compare against.
A workable plan needs five things:
1. A specific outcome you're trying to change. "Improve customer service" isn't measurable. "Reduce average call handling time on Tier 1 enquiries by 15% over 90 days" is. If you can't name it specifically, you don't have a training problem yet — you have a scoping problem.
2. A baseline. Whatever metric you're trying to move, capture its current state before training starts. Pre-assessment scores, current performance data, existing error rates. Without a baseline, post-training data is uninterpretable.
3. Multiple measurement points. Immediately after training (reaction, knowledge), four to six weeks later (early behaviour signals), and three to six months out (sustained behaviour change and business impact). Different metrics need different timelines.
4. Named owners for each measurement. Who is running the 90-day manager pulse? Who is pulling the business metric in Q3? If it's not owned, it won't happen.
5. A reporting rhythm. Stakeholders want a regular drumbeat, not an annual report. Quarterly is usually right. The format matters less than the consistency.
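The first two items on that list, a specific outcome and a baseline, can be written down precisely enough to check later. The sketch below uses the call handling example from above; the class name, the baseline of 400 seconds, and the later readings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OutcomeTarget:
    """A named outcome with a pre-training baseline and a target reduction."""

    name: str
    baseline: float
    target_reduction_pct: float

    def reduction_pct(self, current: float) -> float:
        """Percentage reduction of the current reading relative to the baseline."""
        if self.baseline == 0:
            raise ValueError("baseline must be non-zero")
        return (self.baseline - current) / self.baseline * 100

    def is_met(self, current: float) -> bool:
        """True when the reduction reaches or exceeds the target."""
        return self.reduction_pct(current) >= self.target_reduction_pct


# Hypothetical baseline captured before training starts.
target = OutcomeTarget(
    name="Average handling time, Tier 1 enquiries (seconds)",
    baseline=400.0,
    target_reduction_pct=15.0,
)

print(target.is_met(current=360.0))  # prints False (a 10% reduction)
print(target.is_met(current=332.0))  # prints True (a 17% reduction)
```

Without the `baseline` value there is nothing to pass to the check, which is the whole point of capturing it before launch.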
This is unglamorous work. It's also the difference between L&D that gets the next budget request approved and L&D that has to defend its existence every year.
The honest version: most of what we've covered above doesn't require a specific tool. You can measure behaviour change with a Google Form sent to managers. You can do baseline-vs-post analysis with a spreadsheet.
What a tool like Chameleon Analytics gives you is the bit that's genuinely hard to do manually: visibility into what's happening inside the course. Which interactions lose people. Which questions are tripping learners up. Where engagement drops off. Whether learners are actually watching the video or skipping to the assessment.
This is the data that turns "we think the module on de-escalation isn't landing" into "82% of learners drop off at the second scenario, and the ones who finish miss the same two questions about emotional regulation." One of those is an opinion. The other is something you can fix.
If you want to see what interaction-level analytics actually looks like in practice, our analytics page walks through the dashboards, and the learner drop-off article shows how to read the data once you have it.
Measuring eLearning effectiveness isn't actually about measurement. It's about positioning L&D as a function that takes outcomes seriously — that doesn't oversell, doesn't hide behind activity metrics, and can talk about its work in the same language as the rest of the business.
The teams that do this well don't necessarily measure more than everyone else. They just measure the right things, with the right rigour, and they tell honest stories about what the data does and doesn't show.
That's the version that gets a seat at the table.
Kirkpatrick's four levels (reaction, learning, behaviour, results) remain the most widely used framework for evaluating training. They're a useful taxonomy for thinking about what to measure at each stage, from immediate learner feedback through to business impact. But Kirkpatrick is a framework, not a measurement plan. The teams that actually get past Level 2 are the ones who define their behaviour and business outcomes before the training is built, not after.
The formula is ROI (%) = [(Benefits − Costs) / Costs] × 100. The hard part is the benefit number — specifically, defending attribution in front of a finance team. Pick outcomes you can isolate (a specific compliance error rate, time-to-competency for a defined role, a specific customer service metric), use a control group where possible, and be honest about contribution versus credit. "Training contributed to X alongside Y and Z" is more defensible than a confident headline ROI number that doesn't survive scrutiny.
In rough order of usefulness: behaviour change in the workplace (the metric that actually matters), assessment performance broken down by individual question, drop-off points within the course at the interaction level, time on task as a diagnostic, repeat visits, satisfaction as a quality signal, and completion rate as a top-of-funnel input. Completion isn't useless — it's just not the destination.
Three to six months after training is the standard window for measuring behaviour change, but it depends on how often the behaviour you're trying to change actually occurs in someone's role. A frontline service interaction can be assessed in weeks. A leadership behaviour that comes up quarterly might take a year. Match your measurement window to the cadence of the behaviour, not the convenience of the calendar.
Leading indicators are early signals — engagement, time on task, assessment performance, satisfaction — that you can measure during or immediately after training. They tell you whether the programme is on track. Lagging indicators are outcomes — behaviour change, performance improvement, business metrics — that emerge months later. They tell you whether the training actually mattered. You need both, reported in the same conversation, to tell a credible impact story.
Chameleon Analytics is built into the authoring platform and tracks learner behaviour at the interaction level — drop-off points, time on each section, replay frequency, and which questions learners get wrong. That data is what lets you move from "we think this module isn't working" to a specific, fixable diagnosis. It doesn't replace Level 3 and 4 measurement (you still need to track behaviour and business outcomes in the workplace), but it makes the engagement and learning levels far more diagnostic than completion rates alone.