US election 2016: How to review 650,000 emails in eight days


"You can't review 650,000 emails in eight days," says Mr Trump, but is he right?

To the outrage of Donald Trump and his supporters, the FBI says it has found no evidence of criminality in a newly discovered trove of emails linked to Hillary Clinton.

Late last month, with election day within touching distance, FBI director James Comey said the bureau was examining new emails potentially connected to its investigation into Mrs Clinton's private email server.

He has since faced a backlash from leading Democrats, with President Obama saying investigations should not operate on "innuendo" and the party's leader in the US Senate, Harry Reid, even suggesting Mr Comey may have broken the law.

There was little sign that US voters would see a conclusion before the final vote.

But now, in another letter, Mr Comey has effectively concluded they have found nothing new. And Mr Trump has made his displeasure clear.

"You can't review 650,000 emails in eight days," Mr Trump told a rally in Michigan.

"Hillary Clinton is guilty, she knows it, the FBI knows it, the people know it and now it's up to the American people to deliver justice at the ballot box on 8 November."

Several computing experts, though, say otherwise.

"That's taking a rather naive view of it," the University of Surrey's Alan Woodward said of Mr Trump's claim. "The investigators don't go through each email manually."

The emails themselves were found on a device belonging to Anthony Weiner, the estranged husband of Clinton aide Huma Abedin. Mr Weiner, a former congressman, is subject to a separate FBI investigation.

Details about the fresh FBI inquiry remain scant. Several reports say that the emails discovered were simply duplicates of ones already examined.

In the latest letter, Mr Comey said investigators had "reviewed all of the communications that were to or from Hillary Clinton while she was Secretary of State", leaving open the possibility they were still looking into some of the emails.

For Steven Murdoch, a research fellow at the University of London, the key word is "review".

"It doesn't mean they have been read," he said, adding that privacy considerations and the sheer volume of data would have been prohibitive.

Despite the seemingly intimidating size of the email cache, there are several ways it could have been narrowed down, experts say, such as using the "to" and "from" fields to determine which messages were sent to or from Mrs Clinton, filtering out duplicate emails, or using search parameters.
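As a rough illustration of that kind of narrowing, the sketch below filters raw messages by the address fields, drops exact duplicates and applies an optional keyword search. It is only a minimal example using Python's standard library: the function names, the sample address and the choice of headers are assumptions made here, not a description of the tools the FBI actually used.

```python
from email import message_from_string
from email.utils import getaddresses


def involves(raw_message: str, address: str) -> bool:
    """Return True if the address appears in the From, To or Cc headers."""
    msg = message_from_string(raw_message)
    headers = []
    for name in ("From", "To", "Cc"):
        headers.extend(msg.get_all(name, []))
    found = {addr.lower() for _, addr in getaddresses(headers)}
    return address.lower() in found


def narrow(raw_messages, address, keywords=()):
    """Keep messages involving the address, skip exact duplicates,
    then apply an optional case-insensitive keyword search."""
    seen = set()
    kept = []
    for raw in raw_messages:
        if not involves(raw, address):
            continue  # not sent to or from the person of interest
        if raw in seen:
            continue  # exact copy of a message already handled
        seen.add(raw)
        text = raw.lower()
        if keywords and not any(k.lower() in text for k in keywords):
            continue  # none of the search terms appear
        kept.append(raw)
    return kept
```

Each step only discards messages, so the pile a human would need to read can shrink sharply before anyone opens a single email.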

Dr Murdoch compared the process to how officials might root through vast amounts of court documents.

Using these techniques, investigators would probably have been left with few emails that they had to read with their own eyes.

"Very quickly you would find that the haystack becomes the needle," as Prof Woodward said.

Fugitive US intelligence leaker Edward Snowden offered a few more tips to the authorities on how they might go about their search.

Mr Snowden suggested they may have used hashing, which reduces each email to a short, fixed-length fingerprint so that two sets of emails can be compared quickly - something the authorities presumably had a head start on given the months of investigation into Mrs Clinton's email use.
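A minimal sketch of how such a comparison might work is shown below. It assumes SHA-256 as the hash and a simple whitespace clean-up before hashing; both choices, and the function names, are illustrative assumptions rather than details reported about the FBI's process.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return the SHA-256 hex digest of the text after collapsing whitespace."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def unseen(new_emails, reviewed_emails):
    """Return the new emails whose fingerprint matches none already reviewed."""
    known = {fingerprint(e) for e in reviewed_emails}
    return [e for e in new_emails if fingerprint(e) not in known]
```

Because the fingerprints of emails already examined can be computed once and stored, checking a fresh batch against them is a quick set lookup rather than a message-by-message comparison.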

Speaking anonymously, one former FBI expert told Wired he had processed much larger sets faster.

"We'd routinely collect terabytes of data in a search," he said. "I'd know what was important before I left the guy's house."

For the Errata Security blog, "the question isn't whether the FBI could review all those emails in eight days, but why the FBI couldn't have reviewed them all in one or two days. Or even why they couldn't have reviewed them before Comey made that horrendous announcement that they were reviewing the emails."
