Texas is replacing thousands of human exam graders with AI
(www.theverge.com)
Ugh... I'm deep in the AI sphere, and this seems like a bad idea to me. GPT (let's face it, they are probably using OpenAI) can be deeply biased and arbitrary in its evaluations.
For example, "Two apples and four oranges" might score better than "4 oranges and 2 apples." for inscrutable reasons. Say the question spelled out the numbers and the LLM has a weighted bias toward overall textual consistency: it might produce a reason to dock points that looks unrelated to that weight, such as "incomplete sentence," for the second answer but not the first.
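If you wanted to check a grader for this kind of thing, one rough way is to feed it answers that mean the same thing and see how far the scores drift apart. A minimal sketch, with everything in it made up for illustration: `toy_grader` is a hypothetical stand-in with a deliberately planted bias against digits (it is not how any real grading system works), and `score_spread` is just a helper name I picked.

```python
from typing import Callable, List


def toy_grader(question: str, answer: str) -> int:
    """Hypothetical stand-in for an LLM grader.

    Deliberately biased: it docks points when the answer uses digits,
    mimicking a hidden preference for spelled-out numbers.
    """
    score = 10
    if any(ch.isdigit() for ch in answer):
        score -= 2  # planted bias, unrelated to correctness
    return score


def score_spread(
    grade: Callable[[str, str], int],
    question: str,
    equivalent_answers: List[str],
) -> int:
    """Return max minus min score across answers that should be graded equally."""
    scores = [grade(question, answer) for answer in equivalent_answers]
    return max(scores) - min(scores)


question = "How many apples and oranges are in the basket?"
answers = ["Two apples and four oranges", "4 oranges and 2 apples."]

# A fair grader should give a spread of 0 for equivalent answers.
spread = score_spread(toy_grader, question, answers)
print(f"score spread across equivalent answers: {spread}")
```

Any nonzero spread on answers a human would treat as identical is a red flag. With a real model you would swap in the actual grading call and run many paraphrase sets, since a single pair can pass by luck.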
Students may also receive lower scores due to cultural biases towards certain phrases, and factors as straightforward as their name.
Finally, AI will hallucinate errors constantly if you ask it to evaluate text without any errors. Constantly. Consistently.