Beep boop - this is a robot. A new show has been posted to TWiT…
What are your thoughts about today’s show? We’d love to hear from you!
The conversation about Grok’s weekend meltdown (IM Episode 827 starting at 1:19:05 in the transcript) ignored several details that were well-known at the time of the recording. They can be found – could have been found – on a variety of social media sites. Interestingly, they are not available on any established news sites. I asked a friend who’s a journalist to do a writeup on the story; he said one is in the works.
AFAICT, the best starting point for context on this story is Grok itself. Here’s the question I asked (“the weekend” refers to the weekend of July 5-6, 2025):
when exactly was the prompt for grok changed over the weekend? Does it look fishy that the prompt was changed over a holiday? Was there any exigent reason for the prompt to be amended when many employees were out of the office? Why do you think this was done, and done at that time. Use ludicrous mode.
Grok’s prompt was altered at 4pm PDT on Sunday, July 6. That modification to the prompt was undone with a second GitHub commit on July 8.
Why would a prompt be changed on the weekend? Why would any American company deliberately change the public behavior of their software over the 4th of July holiday? Why was an X.com employee fired this week? Why was CEO Linda Yaccarino asked to resign? What is the relationship between these two job terminations and the holiday weekend code modification?
As Gizmodo noted, Elon did tweet about upgrades to Grok on July 4. What was he talking about? Almost certainly, he was talking about the announcement of Grok 4, which is now publicly available. Grok 4 scored 44.4% on Humanity’s Last Exam, a significant jump over any previous AI’s performance. Grok 4 Heavy hit a score just north of 50% on the exam. I’ll defer to Matthew Berman’s video Grok 4 is really smart… Like REALLY SMART for the details, but it’s safe to say this is a mightily impressive upgrade. Side note: Matthew Berman would be a great guest for Intelligent Machines.
After viewing this segment and various sources online, there are several obvious questions:
Someone made a modification to Grok’s system prompt at 4pm PDT on Sunday. The simplest explanation is that this was an unauthorized modification of Grok. The vandalism certainly detracted from xAI’s announcements this week: many news sources completely ignored the revolutionary advances and instead talked about the e-vandalism.
Controls should have been in place such that no individual could make a change like this without managerial/executive sign-off. If there was a failure of those controls, I could easily imagine that a CEO could get fired (i.e., asked to resign) for that failure.
Does anyone have an alternative explanation for this week’s X.ai events?
It sounds like Grok is still erring on the side of wild speculation and conspiracy theories.
What use is the “smartest AI in the world” if you can’t trust it?