What is Generative AI, and why does it suddenly seem to be changing everything?
Generative Artificial Intelligence (GenAI) technologies have garnered significant media attention for their ability to produce high-quality text and images in response to written prompts. Since the public launch of several large language models in 2022, GenAI has generated considerable concern about academic integrity and the future of writing as a learning tool. While some of the initial hype and panic may be abating, navigating the landscape of writing in the era of GenAI will involve new ways of thinking about authorship, intellectual property, and the ways our writing technologies shape our writing lives.
What is GenAI? Is it like what we see in the movies?
Generative artificial intelligence is based on foundational technologies called large language models (LLMs). While ChatGPT, Google Gemini, and Microsoft Copilot have garnered the most media attention, several large tech companies (Meta, Anthropic, Apple, and others) have LLMs that are increasingly integrated into the tech tools we use every day. Most word processing, document reading, and search tools now offer opportunities to use AI for search, brainstorming, writing, revision, and reinvention of written text. Many companies are developing specialized models for customer service, marketing communication, law, medicine, and other fields.
Like a parrot that can mimic human speech on cue, LLMs produce outputs based on probabilities, and those outputs are limited by the characteristics of their training sets. While the rapid development of increasingly complex models has improved the quality of responses, model output becomes less accurate and less plausible as tasks grow more complicated. These limits produce a range of adverse outcomes, including the generation of false and counterfactual information, often termed hallucinations.
In the effort to increase output quality and reduce error, most AI developers are leaning heavily into the LARGE of large language models. While the initial models used millions of statistical parameters to identify relationships among words, the largest models now use trillions. This growth depends on the ongoing expansion of computing power and brings with it rising demands for electricity, water, and other resources.
Emily Bender and colleagues described LLMs as “Stochastic Parrots” in their 2021 article on the dangers of these technologies, and it remains a valuable metaphor. The models are stochastic, meaning they rely on statistical prediction, with all the affordances and challenges that statistical estimation entails. Unlike human text production, which is grounded in meaning and memory, LLMs convert the ‘natural language’ of a written input into numerical tokens, which are then processed according to the model’s parameters, eventually yielding an output in the form of written text or a generated image.
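To make “stochastic” concrete, here is a toy sketch of next-word prediction by weighted sampling. Everything in it is invented for illustration: the candidate words and probabilities below stand in for what a real LLM computes, over its entire vocabulary, from billions of learned parameters.

```python
import random

# Invented next-word probabilities for the prompt "The sky is".
# A real LLM computes a distribution like this over its whole vocabulary;
# these numbers are made up for illustration.
next_word_probs = {
    "blue": 0.55,
    "clear": 0.25,
    "falling": 0.15,
    "quantum": 0.05,  # unlikely continuations are rare, not impossible
}

prompt = "The sky is"
words = list(next_word_probs)
weights = list(next_word_probs.values())

# The model samples rather than looks up facts: repeated runs can differ.
for _ in range(3):
    print(prompt, random.choices(words, weights=weights, k=1)[0])
```

Run it a few times and the continuations vary. That variability, scaled up across an enormous vocabulary and context window, is the “stochastic” in stochastic parrot.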
Movie representations of Generative AI (whether in the form of glowing supercomputers or anthropomorphic androids) tend to ascribe these technologies vast, superhuman powers that extend far beyond practical realities. While current GenAI systems are powerful, they are not yet independent thinking agents capable of general intelligence.
Where did GenAI come from? Does it matter?
Generative AI has evolved from two areas of computational research: natural language processing and machine learning. Natural language processing research aims to enable more effective human/computer interaction by helping technologies interpret how ordinary humans use speech and writing. Machine learning, by contrast, aims to create technologies that can operate more or less independently of human intervention by developing systems that can test and improve themselves without the direct involvement of a programmer. These roots are clear in the development of GitHub Copilot, an AI code-writing assistant conceived to help programmers create new tools and technologies.
A recent breakthrough in these technologies emerged with the development of transformer architectures (in fact, the GPT in ChatGPT stands for Generative Pretrained Transformer). Transformers “learn” to create generalizations from a training dataset by encoding and decoding samples. The first generation of transformers was trained to recognize and sort items into classes (so-called discriminative models). In contrast, generative transformers use the conventions learned from examples to generate new, probable examples. The GPT-3 model behind the original ChatGPT was pre-trained, using 175 billion statistical parameters, on a gigantic quantity of text. Based on its analysis of previous human-written discourse, it generates text that mimics the plausible output of an actual writer. Most GenAI models operate with a similar architecture but vary in the number of parameters used and the content of their training sets.
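For readers who want to see a generative pretrained transformer at work without a ChatGPT account, the sketch below runs GPT-2, a small, openly released predecessor of today’s chat models, through the Hugging Face transformers library. The prompt and sampling settings are illustrative choices, not recommendations.

```python
# Requires: pip install transformers torch
from transformers import pipeline

# GPT-2 is a small, publicly available generative transformer.
generator = pipeline("text-generation", model="gpt2")

result = generator(
    "Large language models generate text by",
    max_new_tokens=30,  # how much text to append to the prompt
    do_sample=True,     # sample from the distribution rather than always taking the top token
    temperature=0.8,    # lower values make output more predictable
)
print(result[0]["generated_text"])
```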
While the details of how GenAI tools work quickly become highly technical, it’s valuable for faculty to understand that GenAI technologies are not entirely new, nor are they entirely different from the computational work that brought about ubiquitous low-cost computing, the internet, and the algorithm-driven attention economy. In contrast to the utopian visions of AI marketing teams and the nightmare scenarios imagined by critics, researchers Arvind Narayanan and Sayash Kapoor suggest it is wisest to think of AI as a “normal technology.” In most cases, GenAI tools promise increased productivity and efficiency through automation, intending to augment or replace human labor.
Is it weird that it feels like GenAI is talking to me?
Yes, it is kind of weird, and it’s no accident. Because GenAI tools are trained on the conventions of written language, their interfaces often employ familiar language patterns and even use first-person pronouns. Researchers have identified a clear explanation for this design choice: anthropomorphizing AI agents increases users’ trust in those tools. For example, here’s how ChatGPT explains its use of writing conventions:
ChatGPT uses first-person pronouns as a linguistic convention to create a more conversational and user-friendly experience for users like yourself. It helps to establish a sense of interaction and connection between the AI and the user. However, it's important to note that as an AI language model, I don't possess personal experiences or consciousness. I am a machine learning model developed by OpenAI, and my responses are generated based on patterns and information from the training data. (ChatGPT query response, 2023)
The irony of this quotation lies in its content, which clearly states that an AI language model lacks personal experience and consciousness (and introduces ChatGPT in the third person as the subject of its first sentence). Nevertheless, the output still exhibits conversational features typically associated with human dialogue (it says “I don’t possess…” rather than “ChatGPT does not possess…”). While sequences of prompts can appear to generate a conversation, the responses are mostly independent predictions of appropriate replies, not a dialogue with a thinking, feeling, knowing being. The term chatbot is a helpful reminder here: the bot is the presumptive ‘speaker’ or ‘writer’ responding to human input.
The most recent GenAI models are becoming increasingly adept at adapting their output to the language choices of prompt writers, which resembles the linguistic accommodations humans often unconsciously make in conversation. It’s valuable to remember, however, that while a user may feel like they are interacting with a GenAI tool, what actually occurs is that their natural language is converted into numerical tokens, analyzed according to the complex probabilities of an extensive training set, and a response is calculated to offer the best approximation of a human reply. The sketch below illustrates that first, tokenization step.
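This sketch uses tiktoken, a tokenizer library OpenAI has released publicly, to show how a sentence becomes a list of integers before any “conversation” happens. The encoding name is one of the published tokenizers, chosen here purely for illustration.

```python
# Requires: pip install tiktoken
import tiktoken

# cl100k_base is one of the tokenizers OpenAI has published.
enc = tiktoken.get_encoding("cl100k_base")

prompt = "Is it weird that it feels like GenAI is talking to me?"
token_ids = enc.encode(prompt)

print(token_ids)  # a list of integers, one per token
# Decoding each ID separately shows the text fragment it stands for.
print([enc.decode([t]) for t in token_ids])
```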
Limitations of Large Language Models
While large language models can produce writing that resembles human output, they face significant limitations.
- Most GenAI models work by predicting the statistically likely next word or token, despite their apparent “choices” about sentences, paragraphs, and overall response length. Without repeated prompting, models are likely to keep responses short and vague.
- Because they operate according to statistical patterns rather than a meaningful understanding of text, GenAIs do not comprehend, read, select, recognize, or quote material like human readers.
- This “meaning gap” produces counterfactual claims, where a GenAI tool reports false things as fact (e.g., describing a Hillary Clinton presidency) or gathers small pieces of relevant information but organizes them in ways that don’t correspond to reality.
- GenAI tools generate false outputs, including fabricated citations (what AI researchers call “hallucinations” or “confabulations”). For example, an author’s real name might be paired with a real journal in a citation, but the “article” being referenced doesn’t exist (see the sketch after this list).
- The training sets of large language models are limited by what is available to programmers, so older material and content behind paywalls are typically excluded. Numerous publishers and content creators are suing LLM companies for unfair use of their copyrighted materials.
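To see how purely statistical generation can fabricate a plausible-looking citation, consider this deliberately tiny sketch: a bigram (word-pair) model “trained” on three invented citation strings. Nothing here comes from a real model or real publications; the point is that recombining statistically likely word sequences yields output with the right form but no guarantee of a real referent.

```python
import random
from collections import defaultdict

# Three invented citation strings; none refer to real publications.
corpus = [
    "Smith J. (2019). Learning and memory. Journal of Cognitive Science.",
    "Lee K. (2021). Writing with machines. Journal of Writing Research.",
    "Smith J. (2021). Machines and memory. Journal of Writing Research.",
]

# Count which word follows which across the whole "training set".
follows = defaultdict(list)
for line in corpus:
    words = line.split()
    for a, b in zip(words, words[1:]):
        follows[a].append(b)

# Generate by repeatedly sampling a statistically plausible next word.
word, output = "Smith", ["Smith"]
while word in follows and len(output) < 12:
    word = random.choice(follows[word])
    output.append(word)

# The result *looks* like a citation, but the article need not exist.
print(" ".join(output))
```

Depending on the random draws, the output may splice one author’s name onto another’s title and journal: a citation that looks right and cites nothing.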
Because these statistical models generate text differently from how humans compose written language, LLMs will at times produce output with no relationship to reality. In one now-famous instance, an attorney asked ChatGPT to identify relevant legal precedents for an upcoming case. Because the technology could reproduce the form of legal citations, it generated text that appeared plausible but referred to cases that did not exist. Human content experts can detect these hallucinations, but the fluent, grammatical, confidently declarative output of generative AI can fool casual readers.
Several models have introduced fact-checking features, and tools will occasionally flag misconceptions. Those selling the technology are quick to point out strides in accuracy and reductions in hallucinations. At the same time, critics and users can still observe how confidently GenAI tools express nonsensical content.
Is all this technical detail important for instructors?
Some technical understanding of how LLMs work is vital both for users of GenAI and for anyone who may encounter AI-generated content online, for several reasons:
- Materials provided by companies about their tools may assume a certain level of technical knowledge, and understanding the basics of training, prompting, and data collection can make users more informed and effective.
- Users who understand how Large Language Models work can distinguish between tools and products more effectively.
- While Grammarly, Khanmigo, and Perplexity offer their services for a fee, each builds on an underlying LLM (such as OpenAI’s GPT models) as its engine for generating results. The University of Minnesota provides enterprise access to two of the largest LLMs, Google Gemini and Microsoft Copilot.
- Resisting the temptation to anthropomorphize Generative AI can help us recognize that these tools are powerful, yet limited.
- While users are often initially impressed by the tools' ability to provide reasonably accurate answers to general questions, the credibility of their output relies on statistical prediction rather than an understanding of meaning or concepts.
- Understanding how these tools operate will be crucial for determining when AI use may be beneficial to students and when it may hinder learning.
- Few instructors bemoan the decline of students’ spelling competency in the wake of spellcheck. At the same time, learning researchers are very concerned about the propensity of these tools to shortcut the critical thinking and skill acquisition that are at the core of higher education. Students require practice in information seeking, reasoning, and writing to have transferable skills for work and life.