WikiLeaks reports having received 2.5m emails from Syrian government officials, which it began releasing Thursday night. Founder Julian Assange: "The material is embarrassing to Syria, but it is also embarrassing to Syria's opponents."
The database comprises 2,434,899 emails from 680 domains. The emails were sent from 678,752 distinct addresses to 1,082,447 distinct recipients. The set spans several languages, including around 400,000 emails in Arabic and 68,000 in Russian. It is more than eight times the size of 'Cablegate' by number of documents and more than 100 times its size by volume of data. Around 42,000 emails were infected with viruses or trojans. To cope with these complexities, WikiLeaks built a general-purpose, multi-language political data-mining system that can process massive data sets like the Syria Files.