I have a similar problem to the one you wrote about in How to read message headers in Google Gmail: if I have received an email from my friend, how can I tell whether it is spoofed or not?
This is a very interesting question and I feel like we’re moving just a little bit into the realm of exciting police procedural TV show stories. This is like “CSI: Internet” or something!
Alright, I can dream, right?
Nonetheless the question you ask is a tough one and it does involve some of what we can call forensic data analysis: how do you prove that a message received in your mailbox is legitimate, not a fake or “spoof”?
The short answer is you can’t.
The long answer (before you panic and lose all trust in the Internet email system) is that mail programs and transport agents do tend to leave fingerprints, so even though a very smart person could spoof just about every facet of a legitimate email message, almost all bad email has obvious marks that tell you it’s not real.
To see how this works, I’ll pull out a piece of spam I received this morning. Here are just the headers:
Received: from k127.smtproutes.com (k127.smtproutes.com [22.214.171.124])
by limbo3.aplonis.com (126.96.36.19960614/8.13.6) with ESMTP id m3UJ0HEL046908
for firstname.lastname@example.org; Wed, 30 Apr 2008 19:00:19 GMT
Received: from ccm01.constantcontact.com ([188.8.131.52])
by k127.smtproutes.com ([192.168.1.127])
with ESMTP via TCP; 30 Apr 2008 19:00:08 -0000
Received: from p1-ws008 (unknown [10.250.0.102])
by ccm01.constantcontact.com (Postfix) with ESMTP id F2EDD510102
for email@example.com; Wed, 30 Apr 2008 13:45:30 -0400 (EDT)
Date: Wed, 30 Apr 2008 15:00:08 -0400 (EDT)
From: Cell Labs <firstname.lastname@example.org>
Subject: Cell Labs Wants to Purchase Blackberry 6000/7000 Series Phones
X-Mailer: Roving Constant Contact 0 (http://www.constantcontact.com)
This message is from an email list management application (Constant Contact). Notice all the weird X- headers at the bottom, for example. More important, and this is a key characteristic of spoofed email, compare the From address to the Message-ID domain. The From address is email@example.com, but instead of matching that address’s domain, the Message-ID domain is “@scheduler”, which isn’t a valid domain at all.
On a message that’s spoofed and not really from the person it claims to be from, this mismatch is the most common way you can tell that it’s not legit. If I send a message, for example, from “spamtest.com”, then the Message-ID should be some sort of unique message identifier followed by “@spamtest.com”. Go look at the email in your inbox and you’ll see what I mean.
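If you’d rather not eyeball this by hand, here’s a rough sketch of the From versus Message-ID comparison in Python, using only the standard library. The function names and the sample addresses are made up for illustration, and the “subdomains count as a match” rule is my own assumption rather than any standard.

```python
from email import message_from_string
from email.utils import parseaddr


def domain_of(address: str) -> str:
    """Return the lowercased text after the last '@', or '' if there is no '@'."""
    return address.rpartition("@")[2].lower() if "@" in address else ""


def from_matches_message_id(raw_headers: str) -> bool:
    """True when the From domain and the Message-ID domain look related."""
    msg = message_from_string(raw_headers)
    from_domain = domain_of(parseaddr(msg.get("From", ""))[1])
    # A Message-ID looks like <unique-string@domain>; drop the angle brackets.
    id_domain = domain_of(msg.get("Message-ID", "").strip().strip("<>"))
    if not from_domain or not id_domain:
        # A missing header on either side is treated as "no match".
        return False
    # Assumption: an exact match or a subdomain relationship both count.
    return (
        id_domain == from_domain
        or id_domain.endswith("." + from_domain)
        or from_domain.endswith("." + id_domain)
    )


# Hypothetical sample headers, not taken from a real message.
good = "From: Dave <dave@spamtest.com>\nMessage-ID: <abc123@spamtest.com>\n\n"
odd = "From: Dave <dave@spamtest.com>\nMessage-ID: <12345@scheduler>\n\n"
print(from_matches_message_id(good), from_matches_message_id(odd))
```

A False result here is only a reason to look closer, not proof of anything, since plenty of legitimate mail fails this test too.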
If a message doesn’t get a Message-ID, then one of the email transport agents (we mail geeks call them MTAs, by the way) will automatically add one en route, but that’s extraordinarily unusual: just about every legit email program I know (that is, one not built for bulk mail or spamming) adds a Message-ID in the standard format as a matter of good practice. A missing Message-ID is instantly highly suspicious.
In the above message, the From: and Reply-To: match. That’s another thing to examine: if you get a message “From” your friend, but the Reply-To is a different address, the second address might well be the sender and the “from” is just a spoofed value. Be suspicious.
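The Reply-To comparison is just as easy to sketch. This is a minimal example, assuming you’ve saved the raw headers as text; the function name and the addresses are hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr


def reply_to_differs(raw_headers: str) -> bool:
    """True when a Reply-To header exists and names a different address than From."""
    msg = message_from_string(raw_headers)
    from_addr = parseaddr(msg.get("From", ""))[1].lower()
    reply_to = parseaddr(msg.get("Reply-To", ""))[1].lower()
    # No Reply-To at all is normal, so only flag an actual mismatch.
    return bool(reply_to) and reply_to != from_addr


# Hypothetical sample headers for illustration only.
suspect = (
    "From: Your Friend <friend@friend.example>\n"
    "Reply-To: someone@elsewhere.example\n\n"
)
print(reply_to_differs(suspect))
```

Keep in mind that mailing lists and newsletters often set a different Reply-To on purpose, so treat a mismatch as a prompt to be suspicious rather than a verdict.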
Here’s another header excerpt, this time of a message that was spoofed:
(SMTPD-8.22) id AC0D0DC4; Thu, 10 Apr 2008 22:25:17 -0400
From: “hezekiah nancy” <firstname.lastname@example.org>
Subject: X-IMail-SPAM-Statistical Medications Coupon for holliecantuauhqg
Date: Fri, 11 Apr 2008 00:37:58 +0000
X-Mailer: Microsoft Outlook Express 6.00.2900.3138
What you should notice here is the nonsensical Message-ID domain (“xwhhef”) and, more importantly, the jarring mismatch between the email address (which implies “taylor” should be part of the name) and the actual user name shown (“hezekiah nancy”, generated by a spam tool that randomly pairs names out of a dictionary). If the address were email@example.com or even nhezek@ or anything even vaguely related to the given name, maybe it would seem legit, but these sorts of inconsistencies are the mark of spoofed email.
A more advanced thing to consider is that the originating domain doesn’t appear anywhere in the breadcrumb trail of machines that saw this email message, as detailed in the Received: header values. They’re hard to read, but generally you can look to see if the sender’s domain shows up somewhere in the chain, ideally as the “received by” or “received from” in the top Received header.
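Since Received headers are such a pain to read, here’s a small sketch that does the looking for you. It assumes a crude substring match is good enough for a first pass (so “example.com” would also match “notexample.com”), and the function name and sample hosts are invented for illustration.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_domain_in_received(raw_headers: str) -> bool:
    """True when the From domain appears in at least one Received header."""
    msg = message_from_string(raw_headers)
    address = parseaddr(msg.get("From", ""))[1]
    if "@" not in address:
        return False
    domain = address.rpartition("@")[2].lower()
    # get_all returns every Received header in file order, newest hop first.
    hops = msg.get_all("Received", [])
    # Assumption: a plain substring test, not a strict hostname comparison.
    return any(domain in hop.lower() for hop in hops)


# Hypothetical sample headers for illustration only.
sample = (
    "Received: from mail.spamtest.com (mail.spamtest.com [192.0.2.10])\n"
    "\tby mx.example.net with ESMTP; Wed, 30 Apr 2008 19:00:19 GMT\n"
    "From: Dave <dave@spamtest.com>\n\n"
)
print(sender_domain_in_received(sample))
```

People who send through a third-party mail service will fail this check even though their mail is perfectly legitimate, so it’s one more clue rather than a test you pass or fail.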
To be fair, none of these by themselves prove that a message you’ve received is spoofed or legitimate. None of these rules applies in every case. For example, a lot of web-based email systems have mismatched From and Message-ID values. Nonetheless, these should help you investigate a message you’re finding suspicious.
One more thought: if you have a message from the person in question that you know is legit, compare its headers and routing to that of the message you’re unsure about. They should be pretty darn similar.
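Here’s one way that side-by-side comparison might look in code: pull the machine names out of the Received headers of both messages and list any relays in the suspicious message that never appear in the known-good one. The function names are hypothetical, and the regular expression is a deliberately simple assumption about how “from” and “by” hosts are written, so expect it to miss unusual formats.

```python
import re
from email import message_from_string

# Assumption: hosts follow the words "from" or "by" in each Received header.
HOST_PATTERN = re.compile(r"\b(?:from|by)\s+([A-Za-z0-9.-]+)")


def relay_hosts(raw_headers: str) -> set:
    """Collect the lowercased host names named in all Received headers."""
    msg = message_from_string(raw_headers)
    hosts = set()
    for hop in msg.get_all("Received", []):
        hosts.update(host.lower() for host in HOST_PATTERN.findall(hop))
    return hosts


def unfamiliar_relays(known_good: str, suspect: str) -> set:
    """Hosts seen in the suspect message but not in the known-good one."""
    return relay_hosts(suspect) - relay_hosts(known_good)


# Hypothetical sample headers for illustration only.
known_good = (
    "Received: from out.friend.example (out.friend.example [192.0.2.1])\n"
    "\tby mx.me.example with ESMTP; Wed, 30 Apr 2008 19:00:19 GMT\n\n"
)
suspect = (
    "Received: from dialup.bad.example (unknown [198.51.100.7])\n"
    "\tby mx.me.example with SMTP; Thu, 10 Apr 2008 22:25:17 -0400\n\n"
)
print(unfamiliar_relays(known_good, suspect))
```

An empty result means both messages took a route through the same machines, which is what you’d hope to see for two messages from the same person.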
Good luck, I hope this is helpful!