AMITIAE - Tuesday 15 December 2015
Cassandra: Metadata - Saviour and Scourge
By Graham K. Rogers
As part of the assessment process necessary in almost any course that is taught, I ask the students to summarise what they have learned and to comment both on the specifics and on the value to them of the course as a whole. Students tend to recognise the potential that metadata has, and most recycle the comment of General Hayden, which I show in a video clip, "We kill people using metadata", although they omit the following comment, "but that's not what we do with this metadata." I am never sure if I am reassured by that parenthetical addition.
As I was reading through the student submissions this term, a number of questions came to mind about the purpose of metadata and its ethical uses. I also considered ways in which metadata should be used.
Some of the EXIF metadata for one photograph
The normal Spotlight search provides a list of files that might be suitable depending on the user's search parameters: usually keywords. Deeper searching may also be needed if the basic search does not produce a suitable result. On a Mac this is done using a Finder panel and the Search box (or Command + F). This immediately causes a problem, as a user may not remember the file name, and Contents may not be valid if the file needed does not contain text.
The Other option offers a wider selection of attributes, some of which may only apply to specific file types. When Other is selected, an A-Z list of attributes is opened, from which the user may select one (or more). To illustrate the strength of such search methods and the variety of choices, I show the students the item concerning (photographic) Flash: "Whether the picture was taken using a flash".
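As a rough illustration of attribute-based searching of this kind, the Python sketch below filters a handful of made-up photo records by whether the flash fired. The record layout, file names, camera names and function names are all assumptions for illustration; the only detail taken from the EXIF specification is that the lowest bit of the Flash tag value is set when the flash fired. It does not show how Spotlight itself works.

```python
from dataclasses import dataclass


@dataclass
class PhotoRecord:
    """Hypothetical metadata record for one image file."""
    name: str
    exif_flash: int  # raw value of the EXIF Flash tag
    camera: str


def flash_fired(record: PhotoRecord) -> bool:
    # In EXIF, bit 0 of the Flash tag is set when the flash fired.
    return bool(record.exif_flash & 0x1)


def search(records, predicate):
    """Return the names of all records whose metadata satisfies predicate."""
    return [r.name for r in records if predicate(r)]


# Made-up sample data, for illustration only.
photos = [
    PhotoRecord("beach.jpg", 0x00, "Camera A"),   # flash did not fire
    PhotoRecord("party.jpg", 0x01, "Camera B"),   # flash fired
    PhotoRecord("street.jpg", 0x10, "Camera A"),  # flash off, did not fire
    PhotoRecord("dinner.jpg", 0x19, "Camera B"),  # flash fired, auto mode
]

with_flash = search(photos, flash_fired)
print(with_flash)
```

Any other attribute could be searched in the same way by passing a different predicate, for example one that compares the camera field.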
The range of metadata contained in some files is therefore considerable, so what is the purpose? Metadata is a type of information that contains "an underlying definition or description" (What.is). This may not, however, illustrate the concept sufficiently. Lower down on the What.is page, the suggestion is (as noted above) that it assists in "finding and working with particular instances of data". This is aimed at specific users and the management of data on their computers. However, the page also adds the important idea of metadata within webpages and the obvious function of search, although with the more sophisticated algorithms used nowadays, web metadata may be less significant than (say) 10 years ago.
Unfortunately, we are not in a perfect world, and metadata may become part of evidence in criminal trials. While photographic evidence was originally presented with the declaration that the experts had examined the untouched negatives which were in their possession, the advent of digital images has changed this. It was acceptable for an expert to point out if a photograph had been altered from the original, so there would seem to be little difference with the digital counterparts, especially when any alteration may be evidence of criminal intent.
The same would apply to digital versions of typed documents. Before the computer, comparisons might be made of handwriting, or of the specific key impressions on a page to prove which specific typewriter had been used (attributes that the Romanian Department of State Security under Nicolae Ceaușescu found invaluable). While some printers do produce different output, this is usually hard to detect: far more reliable would be the use of metadata, showing which computer or operating system had been used, the times of creation or alteration and other attributes of a specific file in question.
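Some of the file attributes mentioned above can be read with a few lines of code. The Python sketch below uses the standard library's os.stat to report the size and last-modified time of a file; the helper name describe_file and the throwaway sample file are assumptions for illustration. Such timestamps can be changed by anyone with write access to the file, so they are a starting point for an examiner rather than proof in themselves.

```python
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Return a few file-system attributes for path (hypothetical helper)."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        # Time of last content modification, as a timezone-aware UTC datetime.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }


# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("draft letter")
    sample_path = handle.name

details = describe_file(sample_path)
print(details["size_bytes"], details["modified"].isoformat())

# Clean up the sample file.
os.remove(sample_path)
```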
Where the ethical becomes unethical is when metadata is used or altered to conceal the potential criminal activity that the expert witness would otherwise uncover. Of late, however, there has also been the valuable use of DNA in investigations. This most personal of metadata can be used both to prove a person's involvement in a crime and to show that a suspect was not connected. While police have been keen to add this to the crime-detection armoury, some have been less willing to allow its use to prove the innocence of someone already convicted.
There are clear-cut reasons for the use of DNA. As with fingerprints, no two examples are alike. Even samples from identical twins, which were previously thought of as indistinguishable, can now be analysed to reveal subtle epigenetic changes.
The grey areas appear when governments (by way of their proxy agencies) not only collect data on their own citizens but store it as a form of insurance: in case a crime comes to light in the future, or one of those persons whose metadata is collected turns out to be some form of criminal. It is clear that more information than is needed has been collected by legal but questionable means (e.g. when internet packets are conveniently routed outside the country, even for milliseconds).
It has long been suspected that the security agencies share information on their respective citizens to circumvent the long-held notion that "we do not spy on our own." At the reading of the long-expected Investigatory Powers Bill in Parliament it was revealed by the UK Home Secretary, Theresa May, that she and her predecessors had approved the bulk collection of communication data in the UK since 2001 by MI5 and GCHQ. This was apparently not illegal, but the laws that applied had holes so large a coach and horses could be driven through them. It is with this in mind that the new law will apparently provide oversight. Some are less than reassured by this.
The dichotomy is captured succinctly by Amulya Gopalakrishnan of the Economic Times of India:
He perhaps meant it to be a witty rejoinder to David Cole's summary of the NSA's operational capability and the risks that metadata posed, but the effect was chilling. The 4-minute clip is available on YouTube.
Whether General Hayden was referring to location data (the metadata that might be available when a phone links to a specific carrier tower), or to building a picture of suspects and their contacts, or more, was not made clear. Nonetheless, the overall effect perhaps did more damage to the security case than was imagined. The idea that metadata is collected on thousands of persons with no criminal background, no intent, and no contact with terrorist organisations is repellent; but the authorities sell the concepts in such a way that it seems unreasonable (for some) to object.
Architecture of Radio - Mapping Signals that Surround us
The differences occur after the attacks, when evidence is collected, data is retrieved and the dots are joined using some of that stored data: a time when data could be retrieved by legal means such as warrants. Law enforcement agencies would claim that resorting to such legal niceties would slow them down; but speed is no longer of the essence as the attacks have already taken place. A thorough investigation with a methodical approach would be preferable.
Where the retention of metadata is undesirable is when it is used as a source for a fishing expedition months or years after its initial collection. That leads to the surveillance state with all the implications of a society that is constantly looking over its collective shoulder. That fear erodes all ideas of a free society and in itself is as unsafe as any attack.
Graham K. Rogers teaches at the Faculty of Engineering, Mahidol University in Thailand. He wrote on IT subjects in the Bangkok Post supplement, Database. For the last seven years of Database he wrote a column on Apple and Macs. He is now continuing that in the Bangkok Post supplement, Life.