All of this and you didn't explain how things can either be measured precisely or none at all.
You're explaining the consequences of being wrong, which I would mostly agree with, but not how it can't be possible to be wrong.
Yes, if the evaluator estimates the fantasy world to be a difficulty 1 and it turns out to be a level 10, people will die. Which is why they should have contingency plans; not having any would be irresponsible. But it is possible to be wrong, even if only at a very low rate.
To make a parallel with real life, do you know what an ETA is? When you have, say, a delivery service, they often give you an ETA... an estimated time of arrival. (Sometimes in the form of a time range rather than an exact time, but still.) And it can be wrong, either because something happens that was impossible to foresee (e.g. a road is blocked by an accident) or because you simply happen to fall in the 0.1% of late cases, as deliveries are estimated statistically.
So, here is a case where it is possible to estimate something, but also be wrong about it.
You want matters of life and death? Let's get more dramatic.
Doctors can diagnose you and be wrong. And it can cost you your life... or give you a scare over nothing.
They can diagnose a cold when you have pneumonia.
Or they can test you positive for HIV when you have nothing. (Though this is generally tested a second time for confirmation.)
(Edit: for additional comparison, both have happened in my family. And the worst part is that my relative who got misdiagnosed with a cold never doubted their doctor. Not even after this happened.)
The senior group never doubted the evaluation? So what? They might not have been warned of the possible errors. Or the evaluator gave a wrong estimate on purpose. Or any number of other options. Remember, the whole system was originally based on sending random unpowered individuals to random magical worlds and hoping they get a power strong enough to deal with the threat there. And those who survive go to this school / boot camp for heroes. The MC himself was sent to a world where the demon lord was so powerful it had already ended the world.
So, I'm sorry but the base premise of this story is already about an unreliable organisation supported by whimsical divine patrons. So failing an evaluation definitely fits within the logic of this story.
I don't get what sort of explanation you want from me regarding "how things can either be measured precisely or none at all".
My speculation about how the evaluation device ought to work is that it reads the entire source code of that world 24/7 and updates instantaneously. Even if the boss hides its power level, that will only work on the people of that world, who can't see the source code, not on this device from our world looking at its source code. It's like in a shooting game: the players may not know where other players are if they're behind cover, but the game server and client must know exactly where everyone is, and if you modify the client you can make it display that info, giving you a wallhack.
So I'm arguing from the perspective that if the device CAN be wrong, then the whole system that relies on it to score points wouldn't exist. How would you deal with a full raid group taking on a Level 10 boss, only for it to turn out to be Level 5, so that half of them didn't even have to do anything to defeat it? Do they all still get an undeserved Level 10 credit? Or do they get Level 5 credit that they didn't sign up for, having wasted everyone's time, and then have to sign up for another Level 10 and hope it won't be wrong this time? But since this system does exist and is being used, the evaluator HAS to be accurate, otherwise there'd be uproar and this whole "points and ranking" system would collapse. OR the author is intentionally writing the organization as doing this on purpose to fuck over their patrons, and it'll be a plot point later on, while in the meantime they're all just begrudgingly tolerating it. That I would be glad to see, even though it makes no sense: why would they care that much about ranking up if the mission they consciously picked for the purpose of ranking up could either be more difficult than indicated and get them killed, or be less difficult and thus give them no rank? If a video game did that, nobody would play it unless forced to by some outside pressure.
To me, all your examples describe how the people living in the fantasy world can be wrong when they evaluate the boss, not the people from beyond their world looking in: the people who receive cheat skills or information only available to gods, which gives them a huge advantage that mere mortals of that world cannot have. You might have an ETA for when your download is going to finish, but the computer has the exact time calculated based on its current download speed, and it updates if your speed fluctuates while still giving you a precise time; if your download speed is constant, it will finish on the second.
The way doctors diagnose things is that they take your "readings" and "guess", based on all these symptoms and whatnot, what the cause could be. That's where the misdiagnosis comes from.
Meaning they could just tell you your readings: your blood pressure is at [X], your temperature, your symptoms, you have [X] chemicals in your body at a level some % higher or lower than normal. All of that would be what the devices measured, but it means nothing to you. What you want to know is what these readings mean. And the doctor would say, based on his experience and studies, that when your readings check these boxes (but not all these other boxes), it is probably X and you should take X medicine, or it could be Y and you should take Y medicine. And since it's very likely they don't have every possible permutation of cases in history, that's where the misdiagnosis comes from.
If we're talking about the diagnosis being wrong because the readings are wrong, meaning you don't have X chemical but the device said you did, and thus the doctor prescribed X medicine because he trusted the reading (though you did say they would do multiple tests to make sure), then it goes back to my previous point: it'd be harrowing to think this could happen without it being a huge scandal. A doctor does all the tests and retests, the readings show what they show, he prescribes treatment based on the test results, and it turns out badly. No idea why. All the test results led to X, so why isn't it X? If that evaluator came back with Y after the fact, then you have to ask why the fuck it didn't show Y all those other times but shows Y now. Human mistake? (Fired.) Faulty equipment? (Tossed.) If this randomly-X, randomly-Y behavior is just naturally how it is, then we have to come up with a different method altogether.
To reiterate my point with one last example: if you play WoW and set your raid to "Normal" difficulty, and for whatever reason (unless purposefully done by the devs) it can sometimes turn out to be "Mythic" difficulty, there'd be bug reports and an uproar, and the devs would have to fix it within a day. So the logic is: if this bug isn't fixed, that raid would be dead and no one would play it (not even Mythic raiders, if they'd sometimes end up in Normal mode, blast through the bosses, and get loot they don't need, wasting their time and prep). But if people ARE playing it, I'd think the bug must have been fixed. UNLESS, of course, this whole thing isn't a bug but a feature, in which case the devs had better explain why and be able to convince the players to keep playing going forward.