ClimateGate: A hacker or…an anonymous American benefactor-cum-CRU-researcher?
What has hacking got to do with the ClimateGate saga? As we shall see, most likely very little. “Hacking” is of course the hypothesis (often presented as fact) under which the affair has been portrayed in the mainstream media, and hacking is something looked upon with disdain and suspicion by all but the most hardened computer buffs. After all, who’s not scared of seeing their data fall into malignant hands?
That’s the way it should have been at UEA too, and instead that’s where the “crime” is said to have happened. But let’s not forget that blaming an external agent for the ethical and actual crime of stealing information may also be the easiest excuse to avoid talking at all about the contents of what has been leaked. “We have been the victim of a malicious attack”, that’s what people say, “and that is so outrageous, everybody should pretend nothing has been made public”. Hard to argue against, in principle, but a practical impossibility.
On the internet, once the stable’s doors have swung open there’s no way to get the cattle back inside. And still, “shouldn’t have seen it, will not talk about it” is an argument as weak as it is popular among those directly involved, and among those who evidently base their thoughts upon acts of faith in the crystal-clean nature of CRU.
How about the “romantic” hypothesis then? The story of a principled insider, increasingly at odds with the rude, disdainful and omertà-based conspiratorial attitude of his or her colleagues, an insider pushed finally towards publishing the misdeeds for the world to see. Well, that’s more difficult to explain to the media without going into details, and it is also hard to believe.
Let’s try to examine then the third possibility: a mix of carelessness, amateurish hacking and what can only be described as self-harm.
There are many ways to share a file on the internet: large sites such as Rapidshare and Megaupload, virtual disk providers (Dropbox), torrent-based P2P sites (e.g. ThePirateBay and many others), the eMule P2P network, anonymous sharing on Freenet or Tor.
The HTTP-based solutions can be anonymized (via a proxy or Tor) and are safe for the “uploader”. On P2P platforms, instead, one is exposed to the risk of being traced due to the very nature of peer-to-peer, where the IP address of the “sharer” can be captured.
Freenet and Tor “hidden services” are also inherently safe and anonymous, and practically impossible to intercept or geolocate. On the other hand, they are technically more complex: the mere fact of using either of them already shows the computing expertise of the “sharer”.
For all we know, in ClimateGate an anonymous FTP site was used. This suggests an “old school” user, somebody familiar with techniques very popular in university computer-science departments at least two decades ago, when the only way to do file sharing was precisely to use anonymous FTP sites (yes, the writer is an “old school” guy himself 😉 ).
Given that the people involved are no spring chickens, but rather expert professors and researchers with a proven reputation for integrity, this seems the most plausible scenario (unless one thinks the hacking was so shrewd that all hints were left to point in the direction of one of the researchers being at fault).
Again according to what is now believed, the FTP site of choice was tomcity.ru (now closed to anonymous access). Why? A Google search doesn’t suggest anything in particular (the site, “automatically scanned” in July 2009, appears full of diverse and at times deplorable material). Note that the automated scanner is the first link appearing for a search on “russian ftp scan”, but that is likely irrelevant as the search is done in hindsight. :)
Leaving aside allegations of a RealClimate hacking to publish the link there first, the original comment making the link public appeared on the Air Vent blog (and on WUWT as well, so the choice of that particular site does not appear very significant). The language used is quite unusual for a hacker/cracker, lacking the usual slang and boasting about technical skills. To this untrained foreign eye, the phraseology appears to come from England rather than the USA (this is pure speculation, though).
There is a recent post on ClimateAudit with an IP address for the original comment, which seems to be in Russia. But again, it looks like a “transparent HTTP proxy”, and as such might reveal nothing about the location of the “sharer”.
Time to move on and see if any clue about the “hacking” is provided by the actual contents of the FOI2009.zip file. As is well known, the files are organized in two main directories, “documents” and “email”. In the former, documents appear to have been collected unsystematically, among them tree-ring raw data files presumably already being scrutinized by a number of experts, and some well-written Fortran code (as well as Fortran code could ever be :P) with well-known explicit comments.
It is plausible, and my opinion is shared by many, that most of those documents had been readied in case an FOI request was accepted.
How e-mails are formatted
Contrary to what appears to be the case for the documents, the messages do look as if they were carefully chosen for the purpose of publicly disclosing a series of subterfuges. Let’s focus on the style in which they were written and on their formatting.
The quoting style often suggests the use of Microsoft Outlook, with the original text placed at the bottom of the message, in a way that was however uncommon with the Unix mail readers widely used in scientific circles in the 90s. Some messages, on the other hand, carry a line typical of Eudora:
Attachment Converted: "c:\eudora\attach\Phil_letter_draft_091109.doc"
However, we have carried out some tests indicating that Eudora has not been used to export the messages to text format, as indicated in the headers:
Content-Type: application/msword; name="prescient.doc"
Content-Disposition: attachment; filename="prescient.doc"
Messages exported from Outlook look the same as what we find in the CRU messages:
Date: Tue, 23 Nov 2009 21:40:23 +0100Test--
Where the messages come from
Taking it as a given that the messages have been exported from Outlook, is there any indication that they were “stolen” rather than chosen by an insider?
The fact is that hacking the CRU mailserver (or whatever mailserver is involved) is technically quite complex (again implying profound expertise on the part of the perpetrator). It would also expose all uea.ac.uk mailboxes (or those of other domains), potentially including many students’. The process of selecting individual mailboxes and, within those, specific messages would certainly involve a monumental effort.
A hacker would have simply released all mailboxes in their entirety. Notably, when messages reside in a mailserver, they will always contain normally-invisible headers:
Received: by 10.140.158.18 with HTTP; Mon, 23 Nov 2009 12:40:23 -0800 (PST)
Date: Mon, 23 Nov 2009 21:40:23 +0100
Subject: test 3
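Whether a saved message still carries such transport headers can be checked mechanically. Here is a minimal sketch using Python's standard `email` package. The helper name `received_hops` is hypothetical, and the two sample messages are assembled from the test headers shown above with an invented one-line body.

```python
from email import message_from_string

def received_hops(raw_message: str) -> list:
    """Return the values of all Received: headers in a raw message (empty list if none)."""
    msg = message_from_string(raw_message)
    return msg.get_all("Received", [])

# A copy as it might sit on a mail server, with a transport header.
server_copy = (
    "Received: by 10.140.158.18 with HTTP; Mon, 23 Nov 2009 12:40:23 -0800 (PST)\n"
    "Date: Mon, 23 Nov 2009 21:40:23 +0100\n"
    "Subject: test 3\n"
    "\n"
    "Test\n"
)
# The same message with the transport header missing.
exported_copy = "Date: Mon, 23 Nov 2009 21:40:23 +0100\nSubject: test 3\n\nTest\n"

print(len(received_hops(server_copy)))    # number of Received: headers found
print(len(received_hops(exported_copy)))  # no Received: headers in this copy
```

Running the same check over a whole directory of text files would show at a glance which ones look like raw server copies and which do not.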
Depending on the software configuration of the mail server, mailboxes are saved in different formats: some servers keep all messages in one single file, others use a separate file for each message. In the first case, the extraction of individual messages would have been even more complex.
Finally, let’s keep in mind that the persistence of messages on the mail server depends on the client configuration. In most default configurations, email clients physically delete a message on the server after fetching it (Gmail is a notable exception, ignoring the deletion). Given that some messages are from 1996, it is doubtful that they have been exported by hacking a mailserver.
An indication in message filenames
Saving thousands of messages to text files from Outlook can be a tedious process, with individual file names to be provided interactively. The work might have been done using a simple VB macro.
Interestingly, the file names correspond to the messages’ “unixtime” (i.e. the number of seconds since 1/1/1970 UTC, the way Unix systems store timestamps). To simplify the exposition we only use emails written in England that don’t have a timezone specification in the Date: line.
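To make the naming scheme concrete, here is a minimal sketch in Python (not VB, and certainly not the macro actually used, which is unknown). The function name `export_name` and the `assumed_offset_hours` parameter are hypothetical. The function turns a Date: header into a unixtime-based file name. When the header carries no timezone, the result depends on the offset the exporting machine assumes.

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime

def export_name(date_header: str, assumed_offset_hours: int = 0) -> str:
    """Return '<unixtime>.txt' for the value of a Date: header.

    assumed_offset_hours is used only when the header has no timezone;
    it stands in for the clock setting of the exporting PC (an assumption).
    """
    parsed = parsedate_to_datetime(date_header)
    if parsed.tzinfo is None:
        # No timezone in the header: fall back to the exporter's assumed offset.
        parsed = parsed.replace(tzinfo=timezone(timedelta(hours=assumed_offset_hours)))
    return f"{int(parsed.timestamp())}.txt"

# The Outlook test message quoted earlier, which does carry a timezone.
print(export_name("Mon, 23 Nov 2009 21:40:23 +0100"))
```

The point of the second parameter is that two machines set to different timezones would give the same timezone-less message two different file names.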
Take file 1199466465.txt, representing a mail written by Phil Jones and likely sent from East Anglia, UK. A unixtime of 1199466465 corresponds to 17:07:45 UTC on January 4, 2008, but the text contains “Date: Fri Jan 4 12:07:45 2008”. That’s 5 hours apart, the difference between England (GMT, UTC+0) and the U.S. east coast (EST, UTC-5).
Similarly for 1248790545.txt, by Phil Jones too. A unixtime of 1248790545 corresponds to 14:15:45 UTC on July 28, 2009, but in the text the message says “Date: Tue Jul 28 10:15:45 2009” without timezone specification. This at first sight seems a 4-hour difference but, as the timezone is not specified and due to daylight saving time, the delta is still 5 hours between England (BST, UTC+1) and the U.S. east coast (EDT, UTC-4). Strangely, Jones’ PC did not automatically change timezone during the summer daylight saving period.
It is true that the same result could be achieved by changing the PC’s time in order to make it appear as if the processing had been done on the American East Coast. Or file names may have been changed with a Unix script to mix up the evidence.
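The arithmetic for both files can be reproduced with a few lines of Python. This is only a sketch: the helper name `header_offset_hours` is hypothetical, and it assumes the Date: line holds a local time with no timezone, as in the two messages above.

```python
from datetime import datetime, timezone

def header_offset_hours(filename_stamp: int, date_header: str) -> float:
    """Hours between the UTC time encoded in the file name and the naive Date: header."""
    # Decode the file name as seconds since 1970-01-01 UTC, then drop the tzinfo
    # so it can be subtracted from the timezone-less header time.
    utc_time = datetime.fromtimestamp(filename_stamp, tz=timezone.utc).replace(tzinfo=None)
    header_time = datetime.strptime(date_header, "%a %b %d %H:%M:%S %Y")
    return (utc_time - header_time).total_seconds() / 3600

samples = {
    1199466465: "Fri Jan 4 12:07:45 2008",
    1248790545: "Tue Jul 28 10:15:45 2009",
}
for stamp, header in samples.items():
    print(stamp, header_offset_hours(stamp, header))
```

The January message comes out 5 hours behind UTC and the July one 4 hours behind, which matches EST (UTC-5) and EDT (UTC-4) respectively.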
Summing up, given that:
– File sharing was not done the way a hacker would have done it;
– The original Air Vent comment with the hyperlink is not in the classical hacker style;
– The contents of documents and code appear to have been chosen with care;
– The messages have been carefully selected, going back to 1996;
– The mail export seems to have been done on the U.S. east coast,
my current best guess is that there has been no violation of the mail server of the University of East Anglia, but rather that the mail and documents were made public by a CRU researcher, presumably American. Second-best is the hypothesis that the PC of a CRU member has been hacked into, compromised for an extremely long time.