Internet Archive breached again through stolen access tokens
The Internet Archive was breached again, this time through its Zendesk email support platform, after repeated warnings that threat actors had stolen exposed GitLab authentication tokens.
Since last night, GeekFeed has received numerous messages from people who received replies to their old Internet Archive removal requests, warning that the organization has been breached as they did not correctly rotate their stolen authentication tokens.
“It’s dispiriting to see that even after being made aware of the breach weeks ago, IA has still not done the due diligence of rotating many of the API keys that were exposed in their gitlab secrets,” reads an email from the threat actor.
“As demonstrated by this message, this includes a Zendesk token with perms to access 800K+ support tickets sent to info@archive.org since 2018.”
“Whether you were trying to ask a general question, or requesting the removal of your site from the Wayback Machine your data is now in the hands of some random guy. If not me, it’d be someone else.”
The headers of these emails also pass DKIM, DMARC, and SPF authentication checks, indicating they were sent by an authorized Zendesk server at 192.161.151.10.
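For readers who want to check a message they received, receiving mail servers usually record the outcome of these checks in an Authentication-Results header. The following is a minimal sketch, not the method GeekFeed used: the function name `auth_results` and the sample message with its example.org and example.net domains are made up for illustration. It only reads what the receiving server reported and does not re-verify signatures, so the header should be trusted only when it was added by your own mail provider.

```python
import re
from email import message_from_string

# Hypothetical helper: extract the SPF, DKIM, and DMARC verdicts that the
# receiving mail server recorded in the Authentication-Results header.
def auth_results(raw_message):
    msg = message_from_string(raw_message)
    # Only the first Authentication-Results header is inspected in this sketch.
    header = msg.get("Authentication-Results", "")
    pairs = re.findall(r"\b(spf|dkim|dmarc)=(\w+)", header, re.IGNORECASE)
    return {method.lower(): verdict.lower() for method, verdict in pairs}

# Made-up sample message; the domains are placeholders, not real headers.
raw = (
    "Authentication-Results: mx.example.net;\n"
    " dkim=pass header.d=example.org;\n"
    " spf=pass smtp.mailfrom=example.org;\n"
    " dmarc=pass header.from=example.org\n"
    "From: support@example.org\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

print(auth_results(raw))
```

A message with all three verdicts set to "pass" was accepted by the recipient's server as coming from infrastructure authorized to send for that domain, which is why a stolen but valid support-platform token produces email that looks legitimate.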
After this story was published, a recipient of these emails told GeekFeed that they had to upload personal identification when requesting the removal of a page from the Wayback Machine.
The threat actor may now also have access to these attachments, depending on the API access they had to Zendesk and whether they used it to download support tickets.
These emails come after GeekFeed repeatedly tried to warn the Internet Archive that their source code was stolen through a GitLab authentication token that was exposed online for almost two years.
Exposed GitLab authentication tokens
On October 9th, GeekFeed reported that the Internet Archive was hit by two different attacks at once last week: a data breach in which the data of 33 million users was stolen, and a DDoS attack by an alleged pro-Palestinian group named SN_BlackMeta.
While both attacks occurred over the same period, they were conducted by different threat actors. However, many outlets incorrectly reported that SN_BlackMeta was behind the breach rather than just the DDoS attacks.
This misreporting frustrated the threat actor behind the actual data breach, who contacted GeekFeed through an intermediary to claim credit for the attack and explain how they breached the Internet Archive.
The threat actor told GeekFeed that the initial breach of Internet Archive started with them finding an exposed GitLab configuration file on one of the organization’s development servers, services-hls.dev.archive.org.
The threat actor says this GitLab configuration file contained an authentication token allowing them to download the Internet Archive source code.
GeekFeed was able to confirm that this token had been exposed since at least December 2022 and had been rotated multiple times since then.
The hacker says that this source code contained additional credentials and authentication tokens, including the credentials to Internet Archive’s database management system. This allowed the threat actor to download the organization’s user database, further source code, and modify the site.
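The exact contents of the exposed file have not been published, but one common way a token ends up in a Git configuration file is as part of a remote URL. As an illustration only, the sketch below scans configuration text for URLs that embed a secret. The function name `find_embedded_credentials`, the sample configuration, and the gitlab.example.org host are hypothetical and do not reflect the Internet Archive's actual setup.

```python
import re
from urllib.parse import urlsplit

# Matches http(s) URLs up to the first whitespace character.
URL_RE = re.compile(r"https?://\S+")

# Hypothetical helper: report URLs of the form https://user:secret@host/...
# It returns the host and username only, so the secret itself is not echoed.
# Limitation: URLs that carry a bare token with no username are not flagged.
def find_embedded_credentials(config_text):
    findings = []
    for match in URL_RE.finditer(config_text):
        parts = urlsplit(match.group(0))
        if parts.password:
            findings.append((parts.hostname, parts.username))
    return findings

# Made-up configuration text; the token and host are placeholders.
sample = """
[remote "origin"]
    url = https://deploy-bot:EXAMPLE-TOKEN@gitlab.example.org/team/project.git
    fetch = +refs/heads/*:refs/remotes/origin/*
[remote "mirror"]
    url = https://gitlab.example.org/team/mirror.git
"""

print(find_embedded_credentials(sample))
```

A check like this is only a first pass. As the breach described here shows, any token found this way should be revoked and replaced, along with every credential reachable through the code it unlocked.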
The threat actor claimed to have stolen 7TB of data from the Internet Archive but would not share any samples as proof.
However, now we know that the stolen data also included the API access tokens for Internet Archive’s Zendesk support system.
GeekFeed attempted to contact the Internet Archive numerous times, as recently as Friday, offering to share what we knew about how the breach occurred and why it was done, but we never received a response.
Breached for cyber street cred
After the Internet Archive was breached, conspiracy theories abounded about why they were attacked.
Some said Israel was responsible, while others blamed the United States government or corporations engaged in an ongoing battle with the Internet Archive over copyright infringement.
However, the Internet Archive was not breached for political or monetary reasons but simply because the threat actor could.
There is a large community of people who traffic in stolen data, whether they do it for money by extorting the victim, selling it to other threat actors, or simply because they are collectors of data breaches.
This data is often released for free to gain cyber street cred, boosting the leaker's reputation among other threat actors in this community as they all compete over who has the most significant and most publicized attacks.
In the case of the Internet Archive, there was no money to be made by trying to extort the organization. However, because it is a well-known and extremely popular website, breaching it would definitely boost a person's reputation within this community.
While no one has publicly claimed this breach, GeekFeed was told it was done while the threat actor was in a group chat with others, with many receiving some of the stolen data.
This database is now likely being traded among other people in the data breach community, and we will likely see it leaked for free in the future on hacking forums like Breached.
Update 10/20/24: Added information about how some people had to upload personal IDs when requesting removal from Internet Archive.