I’m skeptical! The FBI found 650,000 emails on a server and in two weeks managed to go through them all and decide there was nothing damning? How could someone possibly review all those messages in so little time?
In early October 2016, FBI agents discovered 650,000 emails on a computer owned by Anthony Weiner, the husband of top Clinton aide Huma Abedin. A week or so later, FBI Director James Comey announced that the bureau was investigating the messages to see whether they revealed breaches of confidentiality related to the private email server Hillary Clinton used while she was US Secretary of State. On November 6th, the FBI announced that the investigation was complete and exonerated Clinton of any wrongdoing.
In response, Donald Trump insisted at a rally on Nov 6 that it would have been impossible for the FBI to review what has been reported to be as many as 650,000 emails in such a short time.
That’s what you’re asking about, and it’s a legitimate question. But the FBI completing this task in the allotted time is not only entirely believable, it’s likely that the primary research and investigation were done quite a bit faster. How? Because researching 650,000 email messages doesn’t mean each and every one has to be read word-for-word. That’s why we have search engines, and they’re incredibly fast at finding needles in haystacks.
To demonstrate, I’ll turn to my own Gmail archive, which contains over 100,000 email messages. To get to it, I simply clicked on “All Mail”:
Gmail shows me that my email archive actually contains 112,092 messages:
That’s not 650,000 email messages, but it’s about 1/6th of the Weiner email archive. Now, how fast can I search for a specific name? Let’s try Obama:
Before I’ve even finished typing the word, Google’s Gmail service is searching the archive and showing me potential matches, including two messages from February 2013.
Once I press Enter, Google completes the search:
There are matches from mailing lists I’m on that can and should be removed from the results. No problem: I can get rid of those by excluding terms. In Gmail, prefacing a word with a dash (a minus sign) removes any message containing that word from the results.
See the numeric result above? In just a few seconds of work, I weeded through 112,000 email messages, those sent and those received, to isolate the 128 that reference Obama.
But the Weiner email archive likely encompasses much more than just the date range under investigation, so restricting the search by date is another fast way to limit the results. Gmail has that covered as well: Let’s take the exact same search as above, but isolate the matches to only those sent or received between Jan 1, 2015 and July 4, 2015.
That’s done by adding after:2015/1/1 before:2015/7/4 to the search string. That chops things down so efficiently that I don’t even need the excluded words used above:
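For readers who want to see the same filtering logic outside Gmail, here is a minimal Python sketch. It is not how Gmail works internally; the `Message` class, the `search` function and the sample archive are all made up for illustration. It assumes `after` is inclusive and `before` is exclusive, and it uses simple substring matching rather than a real search index.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Message:
    sent: date
    subject: str
    body: str


def search(messages, term, exclude=(), after=None, before=None):
    """Return messages containing `term`, minus any containing an excluded
    word, limited to dates on or after `after` and strictly before `before`."""
    term = term.lower()
    excluded = [word.lower() for word in exclude]
    hits = []
    for msg in messages:
        text = f"{msg.subject} {msg.body}".lower()
        if term not in text:
            continue  # the search word is missing
        if any(word in text for word in excluded):
            continue  # an excluded word (the "dash" words) is present
        if after is not None and msg.sent < after:
            continue  # too early
        if before is not None and msg.sent >= before:
            continue  # too late
        hits.append(msg)
    return hits


# Made-up sample archive, for illustration only.
archive = [
    Message(date(2013, 2, 11), "Newsletter", "Obama speech recap for subscribers"),
    Message(date(2015, 3, 2), "Lunch?", "Did you watch the Obama interview?"),
    Message(date(2015, 9, 9), "Re: trip", "Obama was in town, traffic was awful"),
    Message(date(2015, 4, 1), "Invoice", "Payment due Friday"),
]

in_range = search(archive, "obama", after=date(2015, 1, 1), before=date(2015, 7, 4))
no_lists = search(archive, "obama", exclude=["newsletter"])
print(len(in_range), len(no_lists))  # prints: 1 2
```

Each message is checked exactly once, so even this naive loop scales linearly with the size of the archive.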
Now, if I can do this and find the 44 matching messages in an archive of 112,000 email messages in less than 30 seconds, do you think it’s possible that the FBI, with better tools, faster computers, highly trained agents and a dataset of hundreds of key words, names, agencies, and even nicknames and code names, could apply all of that to the 650,000 email messages from Weiner’s server?
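To make the "hundreds of key words" idea concrete, here is a rough Python sketch of one common approach: build a word index once, then look up every term on a watchlist. This is only an illustration, not a description of the FBI’s actual tools; the function names, messages and watchlist are all hypothetical.

```python
import re
from collections import defaultdict


def build_index(messages):
    """Map each lowercase word to the set of message ids that contain it."""
    index = defaultdict(set)
    for msg_id, text in messages.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(msg_id)
    return index


def flag_messages(index, watchlist):
    """Return {message id: list of watchlist terms found in that message}."""
    flagged = defaultdict(list)
    for term in sorted(watchlist):
        # .get() avoids creating empty entries for words that never appear
        for msg_id in index.get(term.lower(), ()):
            flagged[msg_id].append(term)
    return dict(flagged)


# Made-up messages and watchlist, for illustration only.
messages = {
    1: "Schedule for the state department briefing",
    2: "Dinner on Friday?",
    3: "Forwarding the classified memo from the department",
}
watchlist = {"classified", "department", "memo"}

index = build_index(messages)
flagged = flag_messages(index, watchlist)
print(sorted(flagged))  # prints: [1, 3]
```

Once the index exists, each watchlist term is a single dictionary lookup, which is why adding more search terms costs very little.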
Even if the intention was for every single email message to be read and analyzed, the FBI could do that in a few weeks too. The Federal Bureau of Investigation has “nearly 35,000 employees, including special agents and support professionals such as intelligence analysts, language specialists, scientists, and information technology specialists.”
Let’s say that only 10% of them are trained to do this sort of research. That still divides 650,000 messages across 3,500 employees, which works out to a modest 186 or so messages per agent to review in three weeks. Spread over the 15 working days in those three weeks, even if they only did it on lunch break, that’s a mere 12 messages per day they need to read and review before the entire dataset is covered. That’s few enough that it wouldn’t even adversely affect their lunch plans.
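Here is that back-of-the-envelope math as a few lines of Python, so you can change the assumptions yourself. The 10% share comes from the paragraph above; treating three weeks as 15 working days is my assumption.

```python
total_messages = 650_000
employees = 35_000
trained_share = 0.10   # assumption: only 10% do this kind of review
working_days = 15      # assumption: three weeks of weekdays

reviewers = round(employees * trained_share)
per_reviewer = total_messages / reviewers
per_day = per_reviewer / working_days

print(reviewers, f"{per_reviewer:.1f}", f"{per_day:.1f}")  # prints: 3500 185.7 12.4
```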
To be fair, if you’re thinking that the research team is just a couple of agents who are manually printing out, reading, analyzing and cross-checking 650,000 email messages, then you’re right to be skeptical. That’s a daunting task. But that’s not how it works, and to this tech expert, at least, three weeks to review and analyze the entire dataset is entirely believable.
Addendum: Turns out that the whole problem was quickly reduced to a much smaller dataset anyway. As the New York Times reported: “law enforcement officials said, there was no need to review all of the emails, only Ms. Abedin’s. Those emails numbered in the thousands, and even many of those were duplicates of messages that had been looked at previously”.
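Spotting duplicates of previously reviewed messages is also something software does well. The Python sketch below shows one simple way to do it, by hashing a normalized copy of each message and comparing digests. It is an illustration under my own assumptions, not the method the FBI used; the function names and sample data are made up.

```python
import hashlib


def fingerprint(sender, subject, body):
    """Hash normalized fields so identical messages map to the same digest."""
    normalized = "\n".join(part.strip().lower() for part in (sender, subject, body))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def split_new_and_seen(messages, already_reviewed):
    """Separate messages into unseen ones and duplicates of reviewed ones."""
    new, duplicates = [], []
    seen = set(already_reviewed)
    for msg in messages:
        digest = fingerprint(*msg)
        if digest in seen:
            duplicates.append(msg)
        else:
            seen.add(digest)  # later copies in the same batch count as duplicates
            new.append(msg)
    return new, duplicates


# Made-up data, for illustration only.
reviewed = {fingerprint("a@example.com", "Schedule", "Meeting at 3pm")}
incoming = [
    ("a@example.com", "Schedule", "Meeting at 3pm"),  # already reviewed
    ("b@example.com", "Lunch", "Noon works"),         # new
    ("B@example.com", "lunch ", "Noon works"),        # duplicate within the batch
]

new, duplicates = split_new_and_seen(incoming, reviewed)
print(len(new), len(duplicates))  # prints: 1 2
```

Only the messages in `new` would need a human to look at them.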