Metadata is a general description of what a collection or repository of data contains. It can range from technical specifications, such as resolution or color depth for images, to the general contents of a GIS data set (region, latitude and longitude, etc.), to where and how the data was collected. With the right model, metadata can itself become the workable data for determining trends. Combining the metadata with the actual contents of the data can provide even more information with the right process.
There are countless examples of metadata, but if you’ve worked with any modern operating system you’ve seen metadata in practice. The creation date, size, resolution, etc. all contribute to the potential understanding of what a piece of data represents. By knowing when a file was created, you know which file came first. You can combine and collate metadata to paint a picture of the hierarchy of data on different metrics.
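As a rough illustration of collating file metadata, here is a minimal Python sketch that orders the files in a directory by timestamp. The function name `files_oldest_first` is made up for this example, and it uses the modification time rather than the creation date, on the assumption that creation time is not exposed the same way on every operating system.

```python
from datetime import datetime
from pathlib import Path


def files_oldest_first(directory):
    """Return (name, size_bytes, mtime) tuples sorted by modification time."""
    rows = []
    for entry in Path(directory).iterdir():
        if entry.is_file():
            info = entry.stat()  # metadata only; the file contents are never read
            rows.append((entry.name, info.st_size, info.st_mtime))
    # Oldest timestamp first, so the order shows which file came first.
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    for name, size, mtime in files_oldest_first("."):
        print(f"{datetime.fromtimestamp(mtime):%Y-%m-%d %H:%M}  {size:>10}  {name}")
```

Nothing here opens a file, which is the point: the ordering comes entirely from the metadata.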
Metadata is more complex than just the parameters of a piece of data. Abstract descriptions of the rough contents exist as well. Some metadata is most efficiently used in digital analysis, while other metadata is made for human consumption. Human-readable metadata can vary heavily in objective quality and accuracy, but once you can interpret it, it tells you more. The contrast between quantifiable dimensions and subjective classifications makes interpreting metadata less predictable but leaves more room for further insight.
Metadata in Use
Metadata is a form of data which describes other data. Some of the data will paint a picture while other data just describes the format. Some metadata is automatically populated, while other pieces are user supplied. There isn’t really a standard for metadata, but different formats or types of data have different rules and standards.
For instance, many common image formats, including JPEG and TIFF, support embedding EXIF data. EXIF stands for Exchangeable Image File Format and describes how an image was produced. This can include shutter speed, aperture, etc., but also things like GPS coordinates on most modern devices. I try to always strip my EXIF data to prevent leaking identifying information because of the disasters I’ve witnessed from it.
EXIF, and metadata in general, isn’t inherently compromising; the problem comes when you give up too much information unintentionally. I’ve used EXIF data to prove I was where I said I was at a specified time, basically for bureaucratic purposes. On the flip side, that means this data can be used by others to show where you were at a specific time.
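To make "stripping EXIF" a little more concrete, here is a minimal sketch using only the Python standard library. It assumes the common JPEG layout, where EXIF lives in an APP1 segment whose payload begins with `Exif\0\0`, and it copies everything else through untouched. The function name `strip_exif` is made up for this example, and a dedicated tool is likely the safer choice for real photos, since this sketch ignores other metadata blocks such as XMP.

```python
import struct

EXIF_HEADER = b"Exif\x00\x00"  # identifier at the start of an EXIF APP1 payload


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of the JPEG bytes with EXIF APP1 segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    out = bytearray(jpeg[:2])
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a segment marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: the rest is compressed image data, copy it as-is.
            out += jpeg[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:].startswith(EXIF_HEADER)
        if not is_exif:
            out += segment
        pos += 2 + length
    out += jpeg[pos:]
    return bytes(out)
```

A typical use would be to read a photo with `open(path, "rb")`, pass the bytes through `strip_exif`, and write the result to a new file before sharing it.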
Fields like GIS use metadata to make searching for and selecting the right data easier. You need a general picture of what the data is to know what it’s useful for. A set of shapefiles or a geodatabase of Germany does absolutely nothing for me when I’m mapping a specific basin in North America. How the data was compiled (its coordinate system, resolution, type of information such as topology, oil and gas wells, or legal boundaries, etc.) is just as important as where the data was actually collected. Metadata gives you a general picture, which means you don’t need to waste time reinventing the wheel.
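A search like that can be sketched as a filter over a metadata catalog. Everything below is hypothetical: the `DatasetMeta` fields, the dataset names, and the rough bounding boxes are invented for illustration and do not come from any real GIS catalog or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetMeta:
    name: str
    bbox: tuple  # (min_lon, min_lat, max_lon, max_lat) in degrees
    crs: str     # coordinate reference system identifier
    theme: str   # type of information the data set holds


def overlaps(a, b):
    """True when two (min_lon, min_lat, max_lon, max_lat) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def find_datasets(catalog, area, theme):
    """Select data sets by metadata alone, without opening any of the data."""
    return [m for m in catalog if m.theme == theme and overlaps(m.bbox, area)]


# Illustrative entries with approximate, made-up extents.
CATALOG = [
    DatasetMeta("germany_boundaries", (5.9, 47.3, 15.0, 55.1),
                "EPSG:4326", "legal boundaries"),
    DatasetMeta("example_basin_wells", (-104.5, 30.5, -100.5, 33.5),
                "EPSG:4326", "oil and gas wells"),
]

study_area = (-103.0, 31.0, -102.0, 32.0)
matches = find_datasets(CATALOG, study_area, "oil and gas wells")
```

The German data set drops out on its bounding box alone, so nothing has to be downloaded or opened to rule it out.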
Taking Metadata Further
Whether you’re releasing data intentionally or not, the metadata can reveal more about your plans than you intend. It can even exacerbate geopolitical situations. A photo you didn’t want released may be harmless in its actual content, but the time and place can reveal parts of your schedule or give definitive proof of what you’re doing and when. While this doesn’t sound damaging on its own, it can lead to exposure of your address, where you work, where you shop, etc.
The metadata attached to certain business filings or even intentional media releases can provide traces of internal processes or confidential work. Analyzed the same way, it can show business alliances, proof of locations, etc. Cross-referencing the right data (either leaked or intentionally released) with metadata can lead to full exposure of delicate business plans, or at least give a business rival enough knowledge to work from. Certain non-optional filings can require you to show a hint of your hand before you announce anything to the public.
If you announce on social media that you’re driving somewhere at 7:00 PM, then take a picture with EXIF data at a location at 7:30 PM, it probably shows you live within about 30 minutes’ drive of that location. There may be exceptions, but they are ironed out by continual analysis of the metadata. A few trips with potential driving times could narrow down where you live without anyone ever being given the address. This information is further used to derive whom you work with, whom you may know, etc. in conjunction with other metadata from other individuals and sources. See this tongue-in-cheek analysis of how to use metadata to find Paul Revere to see just how far metadata can take you.
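The driving-time reasoning can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the 50 km/h average speed is a guess, straight-line distance stands in for road distance (so each radius is an upper bound), and the function names are invented for this example.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def max_radius_km(announced, photographed, avg_speed_kmh=50.0):
    """Upper bound on distance travelled between the post and the photo."""
    hours = (photographed - announced).total_seconds() / 3600
    if hours < 0:
        raise ValueError("photo timestamp is earlier than the announcement")
    return hours * avg_speed_kmh


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def consistent_with_trips(candidate, trips):
    """True if a (lat, lon) candidate lies inside every trip's circle.

    Each trip is (photo_lat, photo_lon, radius_km).
    """
    lat, lon = candidate
    return all(
        haversine_km(lat, lon, t_lat, t_lon) <= radius
        for t_lat, t_lon, radius in trips
    )
```

Each additional trip adds another circle, and only the area where all of the circles overlap remains a plausible home location, which is why a handful of posts can narrow things down so quickly.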
Why Privacy Matters
Metadata helps both people and machines make sense of data without having to analyze the actual data itself. At the same time, it provides a large amount of extra information about the contents of the data. In fields like GIS, this just makes it easier to search for data with specific parameters. As data sets ballooned, it became impractical to just grab everything and sort through it.
You need to be aware of the privacy implications of metadata. This same extra data can help invade privacy or paint a different picture of individuals. For instance, EXIF data provides both the time and the coordinates on most smartphones. The EXIF data on its own doesn’t say much except that the photo was taken a certain way at a certain location. The privacy invasion comes from what happens in the photo, or from combining this data with other sources.
This is a tool that stalkers and burglars have used in the wild. Wait until you get back from your trip to post those photos in a public forum showing you’re away from your house. Scrub the exact times and similar details to reduce the chance of an unwanted guest picking up on your schedule. Consider what information goes into the public record as publicly accessible metadata (or data) and take appropriate precautions depending on how it can impact your privacy.
The erosion of privacy breaks down the walls between your internal world and the general public. This can feel uncomfortable or even downright invasive. Metadata is a powerful tool, but it can also open a window too far into your private life. You may not be able to fully control your privacy online, even if you leave social media, but you can prevent leaking more than you want.
Image by Pete Linforth from Pixabay