No, I’m saying the Turing test is a philosophical hypothetical from the time before computers, and doesn’t actually show anything, because it relies on the least accurate tool at our disposal: the human pattern recognition machine, one that is oh so happy to be fooled by ELIZAs of various sophistication. Chatbots have been passing the Turing test since the invention of the chatbot. Yeah, modern chatbots are better at that, but it’s more of a damnation of our perception.
OK, sounds like we broadly agree then.
But as you can see in the paper I linked, ELIZA passes the Turing test in their experiment about 20% of the time (that is to say, it doesn’t pass; passing is 50% in this test) whereas the best LLMs pass about 70% of the time (that is to say, they are significantly more convincing at being human than real humans).
That 20% figure is just a clear indication of how shit people are at conducting such a test, and that was basically my original point. 2 in 10 times people were convinced by a particularly echoey room.
If an LLM is correct 2 in 10 times, would you call it “reliably correct”?
If a person murders people only two days out of 10, they’re a murderer; in order to not be a murderer they need to never do that.
Reliably correct is when you’re correct always. Demonstrably incorrect is when you’re incorrect even sometimes.
Agreed, except I add “almost”. “My car reliably starts” means it starts “almost always”: far more than 2 in 10 times. “You reliably turn up on time” doesn’t mean you’re late 8 in 10 times, it means you almost always turn up on time. To do a thing “almost always”, or “reliably”, means you fail 1 in 100, in 1,000, in 10,000 times. 10k is hyperbole, but the idea is clear, right? Almost always/reliably != failing 8 out of 10 times.
Your original point that these bots, which pass 2 in 10 times, reliably pass was wrong. Because: they don’t “always pass”, they don’t “almost always pass”, they don’t even “pass the majority of the time”; they rarely pass.
Let’s add our reliable = always substitution to the quote:
Turing test can be [always] passed by a bot that repeats last part of the previous sentence with a question mark at the end […]
You see how that’s wrong not just in fact, but in spirit too?
If a person murders people only two days out of 10, they’re a murderer; in order to not be a murderer they need to never do that.
Relevance? Who says “Fegenerate is reliably a murderer”?
Demonstrably incorrect is when you’re incorrect even sometimes.
Relevance? You didn’t use the words “demonstrably passed”. I’d have no problem if you did.
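To put rough numbers on the “reliably” point, here is a minimal sketch in Python. The function name `describe_pass_rate` and the 99% cutoff are my own assumptions (the cutoff comes from the “fail 1 in 100” reading of “almost always”); the 20% and 70% figures are the approximate ones quoted from the paper, and the 50% bar is the pass mark for that test.

```python
def describe_pass_rate(rate: float, reliable_cutoff: float = 0.99) -> str:
    """Label a pass rate (0.0 to 1.0) in everyday terms.

    reliable_cutoff is an assumed threshold for "almost always",
    not a figure taken from the paper.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    if rate >= reliable_cutoff:
        return "reliably passes"
    if rate > 0.5:
        return "passes the majority of the time"
    return "does not pass the majority of the time"


# Approximate pass rates as quoted in the thread.
for name, rate in [("ELIZA", 0.20), ("best LLM", 0.70)]:
    print(f"{name} ({rate:.0%}): {describe_pass_rate(rate)}")
```

With that cutoff, neither figure counts as “reliably passes”: 20% is below the 50% bar, and 70% is a majority but nowhere near “almost always”.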