Argue with the LLM: A Generative AI Class Assignment
Have students find and answer weaknesses in a position
I have been experimenting with ungraded GenAI exercises in class since Spring 2023, and during this time I have also talked with a variety of student groups about GenAI inside and outside my university. But in Summer 2024, I decided to go deep on this in a graduate class as part of what I came to describe as my “Hot AI Summer.”
The class was a graduate elective in Marketing and Strategy, comprising 18 students from a variety of programs (MS Management, MBA, MS Business Analytics) and levels of business experience (zero to lots). My impression is that we are hitting an inflection point in student use (or at least acknowledged use!) of GenAI. I assigned 12% of the final grade in this class across three GenAI exercises. These were obviously business applications, but I think the concepts are generally applicable across a broad range of disciplines. Here is one exercise that worked.
Arguing with the AI as a Critical Thinking Exercise
One of the more engaging assignments I have used with students over the years is to have them debate a societal topic from a company perspective (my current version is “how much should an airline do about climate change”). Students are forced to assemble their own arguments, present and defend them, and try to find flaws in their opponents’ approaches. Students typically find this a very motivating and invigorating class. They describe the experience of watching or participating in a debate over a complex topic as intellectually eye-opening, and often say it changed how they thought about the topic.
However, when I don’t force students to argue, I often find them reluctant to do so. In some cases this is due to fear of embarrassment, in some cases discomfort with English, and in some cases the fear of giving offense in a world where offense seems easily given and taken. (In some cases, of course, it is lack of preparation . . . )
Arguing with a Large Language Model (LLM) helps with all of these issues. First, it’s in private, so no one sees the argument except the student and me. Second, students with weaker English can take time to construct their thoughts rather than improvise in a fast-moving classroom discussion. And third, LLMs are pretty hard to offend, and if you do, who cares?
A good argument with an LLM requires a student to know something about the topic, to think about which parts of an LLM’s argument are stronger and weaker, and to consider how best to rebut whatever ideas the LLM presents. The interactive conversation can go far deeper than a static report, essay, or position paper a student might write on a topic. I think this is educational.
I will argue(!) that this is also more representative of what we experience in real life. On a day-to-day basis, we have conversational arguments with people much more than we write position papers. In much the same way that some suggest LLMs as a good practice space for language skills or job interviews, practicing arguing with an LLM is good preparation for life.
Disclosure: I have done this in my professional life. LLMs are infinitely patient advocates or critics of whatever idea is assigned to them. While they will come up with “obvious” ideas, they come up with a lot of them, meaning I may still experience that head-slapping “of course” moment on one I have neglected. And I can have these arguments with the time to marshal my thoughts and without the fear of embarrassing myself in public. . .
The Assignment
For this assignment, I was inspired partly by Wharton professor Ethan Mollick’s observation that LLMs can be quite good at what are called “pre-mortems.” This is an exercise in which you write a fictional story, set in the future, explaining why a particular initiative failed. It’s designed to surface (and maybe force you to look at) risks in a project that need to be addressed. (To learn more, this article outlines the idea.)
I have talked about pre-mortems occasionally in class over the years, as I find them an intriguing technique for trying to counter the excessive optimism that can accompany projects. And, bonus, LLMs will briskly invent stories on any topic you ask. This seemed like a nice opportunity to teach a concept and teach a way of working with LLMs at the same time.
The twist I put on pre-mortems was to have students argue with the pre-mortem.
For this exercise, I had written a business case on Perplexity, a company providing an LLM-based search engine, as of summer 2024. After an in-class discussion of the case, by which point students were very familiar with the company’s situation, I released these instructions:
Assignment Instructions
Perform the following exercise:
· Open your GPT account (or an alternative; see Notes/Tips) and start a new chat
· Upload the Perplexity AI case (click the paperclip icon)
· Copy and paste the following prompt: “Analyze the attached business case.”
· The AI will produce an analysis
· Copy and paste the following prompt: “Imagine you are a market analyst. You have been asked to write a ‘pre-mortem’ on Perplexity. This is an article set two years in the future that imagines that Perplexity failed and explains why.”
· The AI will produce an article
· Pick one of the reasons the AI identifies for failure. Suggest a solution to that problem. Ask the AI why that solution would not work.
· The AI will explain why your solution would not work.
· Take one of the reasons the AI says your solution would not work and propose a solution to that problem. Ask why that solution would not work.
· The AI will explain why your solution would not work.
· Iterate at least two rounds in this fashion.
Write a reflection of less than one page about this interaction. Address the following points:
What?
· What was good or bad about the AI’s answers?
· What, if anything, did the AI misrepresent from the case (i.e., hallucinations)?
· Note: hallucination is particularly likely if your LLM accesses the internet, as it will draw on sources beyond the case. You may need to change a setting or explicitly instruct it not to search beyond the case.
Why?
· Why do you think the AI suggested what it did? (Do not ask the AI; what it tells you is not necessarily true.)
· Why did you choose what you did from the AI’s output?
How?
· How, if at all, did working with the AI deepen your understanding of the case?
· How might you use an AI pre-mortem in the real world?
I would also like to see your transcript. If you have a paid GPT account, you should be able to share a link to your chat with me. If not, copy and paste the chat into a Word document or PDF and submit that. While most of your grade will be based on the reflection, I will be looking at how you prompted and engaged with the GPT as well. You may submit one document or two as you wish.
This assignment will be worth approximately 4% of your grade. It will be graded on a 0–2 scale.
Notes/Tips
· You are not required to use GPT-4o for this assignment. If you have a model you like better, use that. If you do not use GPT-4o, please tell me what you used.
· Having the AI tell you something is wrong with your idea can be depressing. It’s nothing personal, and in real life no one will know.
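(If you want to preview what the pre-mortem loop produces for your own case before assigning it, the same sequence of prompts can be prototyped outside the chat interface. Below is a minimal sketch, assuming the OpenAI Python SDK, an API key in the environment, and a plain-text copy of the case saved as case.txt; the file name, model choice, and the bracketed reason/solution text are placeholders, not part of the assignment as given to students.)

```python
# Minimal sketch of the pre-mortem argument loop using the OpenAI Python SDK
# (openai>=1.0). Assumptions: OPENAI_API_KEY is set in the environment, the
# case is saved locally as case.txt, and gpt-4o is the model; all three are
# placeholders you can swap for your own setup.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

with open("case.txt") as f:
    case_text = f.read()

messages = []  # running chat history, so each prompt sees all prior turns


def ask(prompt: str) -> str:
    """Send a prompt, record the model's reply in the history, and return it."""
    messages.append({"role": "user", "content": prompt})
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    return answer


# Step 1: the analysis prompt (the case text is pasted in rather than uploaded)
print(ask(f"Analyze the following business case.\n\n{case_text}"))

# Step 2: the pre-mortem prompt from the assignment
print(ask(
    "Imagine you are a market analyst. You have been asked to write a "
    "'pre-mortem' on Perplexity. This is an article set two years in the "
    "future that imagines that Perplexity failed and explains why."
))

# Steps 3 and beyond: pick a failure reason, propose a fix, ask why it would
# fail, and repeat. The bracketed text stands in for the student's own ideas.
print(ask(
    "Take this failure reason: <reason you picked>. My proposed solution is "
    "<your solution>. Explain why that solution would not work."
))
```

Keeping the full message history is the point of the sketch: as in the chat interface, each rebuttal is evaluated in the context of the whole argument so far rather than as a standalone question.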
The Results
Wow, did this work.
I was heavily influenced in the structure of this assignment by Mike Kentz’s perspective on “grading the chats” as a way of developing an assignment around GenAI. One of the fears I think a lot of faculty have is how to grade an interaction with a piece of software. I had certainly never done this. I used Mike’s “What-Why-How” framework as a good starting point for my experiment. All errors are mine!
All students used some version of OpenAI’s GPT product, which likely reflects brand familiarity and the fact that I asked them to open a free account for another exercise (some had paid accounts). Students went anywhere from two to five rounds with the pre-mortem.
As I was reading the chats, a number of things stood out. First, the pre-mortems all bore a strong resemblance to one another. They weren’t exactly the same, but they “rhymed” at some level. This is reasonable. Of course they were all working from the same case, but I expect this also reflected uniform GPT usage. I have found more variance across models in my own use.
I was also very satisfied that I was seeing how students thought through (or didn’t) a problem. When I asked them why they chose what they did from the AI’s output, the two most common responses were (a) it was the weakest objection and (b) it was the most interesting objection. Many came up with creative ways to overcome their LLM’s objection. (One, I suspect, may have used an LLM to come up with ideas. Sigh.) And to my delight, students who had been virtually silent in class blossomed in an interaction with their LLM. I gleefully read the chat from one heretofore quiet student who tore apart his LLM.
The What and How answers were straightforward to grade. The most difficult part of the grading was the “Why do you think the AI suggested what it did?” I got a lot of speculative answers and anthropomorphism. Students more familiar with LLMs gave technical explanations. I later talked with Mike Kentz about this challenge, and he suggested this does tend to be difficult depending on the context and task.
The Feedback
As these were experiments — and presented as such to students — I collected feedback via an anonymous survey at the end of the course. (Note my official teaching ratings did not appear influenced by these experiments.) Sixteen students responded.
This exercise was the hit of the summer: fourteen of sixteen rated the exercise either “useful” or “very useful” on a five-point scale. Informal feedback suggested it was also highly motivating. Students were fascinated by the conversations. One student did the entire exercise twice just to see how it would differ if she changed her prompts. Two shared with me how happy they were when they felt they “beat” their AI. In retrospect, there is a gamification aspect of this that I had not appreciated in advance. This was the LLM equivalent of a “boss battle” in many online games, and students wanted to beat the boss.
Discussion
This was a business exercise, but I would argue pretty much anytime there are two (or more) sides to an argument, this could be useful. I haven’t committed to doing this in the fall at this point, but with the right topic I would. I would not replace my live debate exercise with this, but it may be a useful alternative, especially if you have a class in which a live debate is difficult (e.g., online).
A limitation of my version of the exercise is that the LLM has to have the same source material as the student (allowing that the LLM may search beyond it). I gave students permission to upload my copyrighted case, but I would not tell students to upload copyrighted or confidential files. (Though some clearly do. Sigh.)
With that, I will wish all teachers and professors very good luck this Fall. . .
Resources
· Ethan Mollick has become an important resource for educational change for business schools in this area, and he frequently reviews new developments in GenAI. His substack is well worth a look if you are interested in more ideas and examples: https://www.oneusefulthing.org/.
· I will also recommend Jason Gulya (substack: https://higherai.substack.com/ ) and Mike Kentz (substack: https://mikekentz.substack.com/ ) for thoughtful ideas on integrating AI in the classroom. (Disclosure: Mike and I corresponded about some of my exercises this summer — which doesn’t mean he endorses them!)
Bruce Clark is an Associate Professor of Marketing at the D’Amore-McKim School of Business at Northeastern University where he has been a teaching mentor for both online and on-ground teaching. He researches, writes, speaks, and consults on managerial decision-making, especially regarding marketing and branding strategy, and how managers learn about their markets, though increasingly he is engaged with GenAI in business and higher ed. You can find him on LinkedIn at https://www.linkedin.com/in/bruceclarkprof/.
Beyond the headline image, no AI was used in the writing of this article!