If you are in the vicinity of the University of Rochester this Thursday, please join us:
Monthly Archive for April, 2023
I believe I am now caught up on replying to the many comments on my post about GPT-4 failing my economics exam.
My responses are quite out of order, but are labeled with the numbers of the posts I am responding to. So if you search in the page for the number of your post and/or the name you posted under, you ought to be able to find my reply.
Of course, if any new comments appear on the GPT-4 post, the current post will become at least temporarily inoperative.
Edit: Here is a new and improved version of the post below. I am leaving the original up for historical reasons, but I recommend following the link instead of reading this post.
*****************
Yesterday, I announced on this blog that ChatGPT had failed my economics exam. (All of the questions on this exam are taken from recent final exams in my sophomore-level course at the University of Rochester.) Multiple commenters suggested that perhaps the problem was that I was using an old version of ChatGPT.
I therefore attempted to upgrade to the state-of-the-art GPT-4, but upgrades are temporarily unavailable. Fortunately, our commenter John Faben, who has an existing subscription, offered to submit the exam for me.
The result: Whereas the older ChatGPT scored a flat zero (out of a possible 90), GPT-4 scored four points (out of the same possible 90). [I scored the 9 questions at 10 points each.] I think my students can stop worrying that their hard-won skills and knowledge will be outstripped by an AI program anytime soon.
(One minor note: On the actual exams, I tend to specify demand and supply curves by drawing pictures of them. I wasn’t sure how good the AI would be at reading those pictures, so I translated them into equations for the AI’s benefit. This seems to have had no deleterious effect. The AI had no problem reading the equations; all of its errors are due to fundamental misunderstandings of basic concepts.)
Herewith the exam questions, GPT-4’s answers (in typewriter font) and the scoring (in red):
I am pleased (I think) to announce that I have just submitted to ChatGPT an exam, consisting entirely of questions taken from recent final exams in my sophomore-level intermediate economics class, and it has earned a score of zero. Not only did it earn a score of zero, but several of its answers would have merited negative scores if I were allowed to give them. The answers are in every case egregiously wrong, showing absolutely zero understanding of the basics.
I am frankly a little surprised; I had expected it to get at least a few things right.
Edited to add: If you are looking for the details, look here.