Edit: Here is a new and improved version of the post below. I am leaving the original up for historical reasons, but I recommend following the link instead of reading this post.
*****************
Yesterday, I announced on this blog that ChatGPT had failed my economics exam. (All of the questions on this exam are taken from recent final exams in my sophomore-level course at the University of Rochester.) Multiple commenters suggested that perhaps the problem was that I was using an old version of ChatGPT.
I therefore attempted to upgrade to the state-of-the-art GPT-4, but upgrades were temporarily unavailable. Fortunately, our commenter John Faben, who has an existing subscription, offered to submit the exam for me.
The result: Whereas the older ChatGPT scored a flat zero (out of a possible 90), GPT-4 scored four points (out of the same possible 90). [I scored the 9 questions at 10 points each.] I think my students can stop worrying that their hard-won skills and knowledge will be outstripped by an AI program anytime soon.
(One minor note: On the actual exams, I tend to specify demand and supply curves by drawing pictures of them. I wasn’t sure how good the AI would be at reading those pictures, so I translated them into equations for the AI’s benefit. This seems to have had no deleterious effect. The AI had no problem reading the equations; all of its errors are due to fundamental misunderstandings of basic concepts.)
Herewith the exam questions, GPT-4’s answers (in typewriter font), and the scoring (in red):