• merc@sh.itjust.works
    9 days ago

    Say John is a student in med school. As a professor, I want to know if John truly understands everything he has been taught. I can do this one of two ways:

    1. I can give John a difficult exam.
    2. I can send John to the Emergency Room in the hospital and see if he saves lives or kills people.

    Number 2 is obviously a much more accurate way to determine what John has learned. It’s much harder for him to cheat that assessment. It’s a real-world scenario instead of some words on a page. The only slight drawback is that people might die if John didn’t study hard enough. It’s going to be essential to eventually do number 2, but it’s probably better to do number 1 first.

    A while ago I took a course in teaching English to adults. One of the things they talked about was assessments, specifically restricted vs. freer questions. A restricted question might be a true/false or multiple-choice question. A freer question might be an essay-type question. There’s a lot of value in restricted questions, even if a student can sometimes get them right just by flipping a coin or guessing. The value is that they can help focus in on areas of difficulty, like verb tenses, spelling, etc. An essay-type question tests students differently, but it’s still an artificial construct. Even an open-book test with no time limit isn’t assessing a student’s performance in the real world.

    Tests and homework may be annoying, and they’re not foolproof, but they’re very useful tools for a teacher to assess progress in learning. People cheat on them because we don’t know of a way to assess learning that’s fun and doesn’t demand way too much of the teacher.

    Also, the whole format of this argument is stupid:

    Thesis: forklifts are capable of lifting heavy weights, and supposed weightlifters are using forklifts instead of lifting weights.

    Antithesis: forklifts do not have muscles.

    Synthesis: lifting weights does not develop muscles.

    Humans are not LLMs. Just because an LLM can give the correct answers for a test without understanding anything doesn’t mean that a human can also pass that test without understanding what’s on it.