The transcript says one thing. The knowledge says another.
Since the public release of ChatGPT in late 2022, "excellent" grades in AI-compatible courses — English composition, coding, essay-based humanities — have risen by 30% at one large U.S. research university, according to a study by Igor Chirikov, a professor at UC Berkeley. In courses where AI offers no practical advantage, like sculpture and lab-based sciences, grades stayed flat. The divergence is not subtle. As Chirikov told Axios, the change is not a B-plus student rounding up to an A. "We have a C student who is now an A student."
That sentence should stop every university administrator cold. Because what it describes is not a grading quirk or an honor-code problem. It is the functional collapse of the transcript as a meaningful document — and with it, the core promise that higher education has made to students, employers, and society for the better part of a century.
The university Chirikov studied is unnamed in the research. He says he chose it because its grade distribution data is publicly available, and that he withheld its identity because he believes the pattern is sector-wide, not institution-specific. The school is described as a selective Texas research university with more than 50,000 students. The decision not to name it is itself telling: there is no single institution to blame here, and Chirikov knows it. This is a structural failure, not a local scandal.
The structural failure predates AI by decades. Grade inflation — the long, slow drift of average grades upward — has been climbing since the early 2000s. Chirikov's research found that courses weighting homework assignments more heavily than in-class exams already showed higher rates of inflation before ChatGPT existed. The logic is straightforward: unsupervised work is easier to outsource, whether to a tutoring service, a ghostwriter, or, now, a large language model. AI did not invent the vulnerability. It industrialized it.
There is a second force compounding the problem, and it runs through the faculty rather than the students. Chirikov notes that professors are sometimes incentivized to grade leniently because student evaluations of instructors are tied to promotion decisions. A professor who grades hard risks lower evaluations. Lower evaluations risk career consequences. The result is a quiet, rational pressure toward softness — not corruption, exactly, but a misalignment of incentives that makes rigorous grading professionally costly. AI arrives into a system that was already tilted toward leniency, and tilts it further.
This matters beyond the individual transcript. The credential — the degree, the GPA, the letter grade — functions as a signal. Employers use it to sort applicants. Graduate programs use it to filter candidates. Students use it to justify the cost of attendance, which at many private universities now exceeds $80,000 a year. When the signal degrades, the costs do not. Students pay full price for a credential that communicates less and less. The people who pay most dearly for credential inflation are the ones who cannot afford to supplement it: first-generation college students, students from under-resourced high schools, students who lack the professional networks that allow the credential to be bypassed entirely.
The institutions bearing responsibility here are not acting with urgency. Some professors have moved to handwritten or oral exams, a response that Chirikov acknowledges but notes is neither scalable nor universally accessible. As Tinsel News has reported, the shift to handwritten assessments has already created serious problems for students with disabilities, who rely on accommodations that blue-book exams do not easily support. The fix for one failure is producing another.
University administrations, meanwhile, have largely responded with policy language rather than structural change. Academic integrity statements have been updated. AI disclosure requirements have been added to syllabi. But the enforcement infrastructure does not exist. AI detection tools are unreliable, frequently flagging human-written work as AI-generated and missing AI-assisted work that has been lightly edited. The detection arms race is one the institutions are losing, and most know it.
The errors are not evenly distributed. Documented false-positive rates fall disproportionately on non-native English speakers and students with certain writing styles, while AI-assisted work that has been substantively edited goes undetected. Chirikov's study found the grade inflation pattern in aggregate data, not through detection of individual instances, which is why the scale of the problem is visible while individual cases remain invisible to enforcement.
Chirikov's proposed solution is more honest than most institutional responses: build AI-integrated assignments that require students to document how they used language models, treating AI as a tool to be used transparently rather than a cheat to be caught. "We need to be creative and think of AI-integrated assignments, and that students can use [LLMs], but they should properly document that," he told Axios. "That's not an easy process, but we definitely should invest in that more than we do right now."
That framing is worth taking seriously — but it also sidesteps a harder question. If AI can complete the assignment well enough to earn an A, what exactly is the assignment measuring? The answer, in many cases, is nothing that couldn't be measured differently. Essay assignments in large undergraduate courses were never primarily about writing — they were about demonstrating comprehension, analysis, and the ability to construct an argument. If AI can simulate all three at an A level, the assignment has failed on its own terms. Redesigning assignments to be AI-integrated is not just a detection workaround. It requires faculty to ask what they are actually trying to assess, which is a question many institutions have not seriously asked in a generation.
The economic dimension is one that the reporting so far gestures at but does not pursue. There is a growing divide between students who are fluent in AI tools and those who are not, and that divide maps, with uncomfortable consistency, onto existing economic and racial inequalities. Research on AI fluency has found that experienced AI users are meaningfully more productive than newcomers, and that gap is already stratifying labor markets. In higher education, the dynamic runs in a more troubling direction: the students most likely to use AI fluently to inflate their grades are not necessarily the students who will be most capable in the workforce. The credential is decoupling from the competence it was supposed to certify.
What this produces, at scale, is a cohort of graduates with transcripts that overstate their subject-matter knowledge and employers with no reliable way to know it. Chirikov is direct about this: universities must worry that graduates are leaving AI-proficient rather than knowledgeable about their fields. The two are not the same. AI proficiency is a real and marketable skill. But a nursing student who used AI to pass pharmacology coursework, or an engineering student who used it to complete structural analysis assignments, carries a risk that their transcript does not disclose.
The accountability question is not only about students. Faculty who are structurally incentivized toward leniency, administrators who have responded with policy statements instead of resources, accreditation bodies that have not updated standards to address AI-assisted work, and the technology companies that released tools specifically designed to complete academic assignments without building any institutional partnership to manage the consequences — all of them are actors in this failure. The student submitting an AI-generated essay is the most visible node in the system. They are not the only one responsible for it.
The 30% grade increase in AI-compatible courses is not a data point about student dishonesty. It is a measurement of institutional unreadiness. Higher education spent the last two decades building an assessment architecture optimized for scale — large lecture courses, standardized assignments, rubric-based grading that could be delegated to teaching assistants. That architecture was already fragile. AI exposed how fragile. The question now is whether universities will treat this as an enforcement problem, which is a losing frame, or as a design problem, which is the only frame that has a solution. The institutions with the resources to redesign are the ones whose students need it least. Everyone else is waiting.