Update: Apple responded with a press release on April 27, 2011
The Row Over Location
Following an inquiry into Apple’s use of location data by US Congressmen Edward Markey and Joe Barton in June 2010, Apple publicly clarified that it actively collects cell phone tower locations, Wi-Fi access point information, and GPS coordinates in order to “provide location-based services” as well as to “help Apple update and maintain its database with known location information.”
Judging from the tool that Allan and Warden developed to visualize the location data in consolidated.db, together with the purposes disclosed in Apple’s reply to Markey and Barton, it seems that the information collected is not directly personalized (“None of the information transmitted to Apple is associated with a particular user or device,” Markey & Barton, p 7). In fact, because it comprises only “visible” wireless access points and cell phone towers, the location is likely to be coarser than GPS, especially in areas with fewer sources to triangulate against. Researcher Peter Batty has even entertained the suggestion that the file could be a predictive download from Apple, storing on your phone the location details of places you may soon visit (towers and Wi-Fi hotspots in your vicinity). Since Batty’s investigation suggests that only the latest log of a given set of towers and access points remains in the database, the file is less helpful for deriving a history of how often and when someone was in an area over time. Still, it could be quite helpful in further attempts to de-anonymize other datasets or provide additional context for them.
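To illustrate why a latest-only log limits historical analysis, here is a minimal Python sketch. It assumes, for illustration only, that each tower or access point is keyed by an identifier and that a new sighting overwrites the previous one; the function name, identifiers, and timestamps are hypothetical and do not reflect the actual schema of consolidated.db.

```python
from datetime import datetime

def record_sighting(latest, source_id, seen_at):
    """Keep only the most recent sighting per tower or access point.

    `latest` maps a hypothetical source identifier to the last time
    it was seen; earlier sightings of the same source are overwritten.
    """
    previous = latest.get(source_id)
    if previous is None or seen_at > previous:
        latest[source_id] = seen_at
    return latest

latest = {}
record_sighting(latest, "tower-A", datetime(2011, 4, 1, 9, 0))
record_sighting(latest, "tower-A", datetime(2011, 4, 8, 9, 0))   # replaces the April 1 entry
record_sighting(latest, "tower-B", datetime(2011, 4, 3, 18, 30))

# Only one timestamp per source survives, so the number of visits
# to the area around "tower-A" cannot be recovered from the log.
visits_to_a = [ts for sid, ts in latest.items() if sid == "tower-A"]
```

Under this assumption the log can show that a device was near a tower at some point, and when it was last there, but not how many times it returned.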
Update: Apple confirmed that the information was not tracking the phone’s location:
“Rather, it’s maintaining a database of Wi-Fi hotspots and cell towers around your current location, some of which may be located more than one hundred miles away from your iPhone, to help your iPhone rapidly and accurately calculate its location when requested. Calculating a phone’s location using just GPS satellite data can take up to several minutes. iPhone can reduce this time to just a few seconds by using Wi-Fi hotspot and cell tower data to quickly find GPS satellites, and even triangulate its location using just Wi-Fi hotspot and cell tower data when GPS is not available (such as indoors or in basements).”
As Apple describes the process, users’ devices collect information about nearby towers and access points, transmit it to Apple, and get a rough location in return when finer-grained location services like GPS are unavailable. A comparison of the results from the android-locdump tool with those from the more widely publicized iPhone Tracker tool shows that the primary difference is that Android regularly purges content in its cache.cell and cache.wifi logs; cache.cell stores only 50 records, and cache.wifi only 200 (according to the android-locdump author’s review of old source code). What Allan and Warden found was that the iOS consolidated.db file was not regularly purged, allowing their visualization to extend as far back as the creation of the file (previous iOS versions used a different file name and path).
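The practical effect of a record limit can be sketched in Python. The limits of 50 and 200 come from the figures above; everything else is an assumption made for illustration, including the `replay` helper, the synthetic observations, and the oldest-first eviction order, which may not match how either platform actually manages its cache.

```python
from collections import deque

CELL_CACHE_LIMIT = 50    # record limit reported for cache.cell
WIFI_CACHE_LIMIT = 200   # record limit reported for cache.wifi

def replay(observations, limit=None):
    """Return the records that remain after logging every observation.

    limit=None models a log that is never purged; an integer limit
    models a cache that evicts its oldest record once it is full
    (the eviction order is an assumption).
    """
    log = deque(maxlen=limit)
    for obs in observations:
        log.append(obs)
    return list(log)

# 1000 synthetic cell observations, oldest first.
observations = [f"cell-{i}" for i in range(1000)]

bounded = replay(observations, CELL_CACHE_LIMIT)
unbounded = replay(observations)
```

Here `bounded` retains only the 50 most recent observations, while `unbounded` still reaches back to the very first one, which is the property that let the iPhone Tracker visualization span the lifetime of the file.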
Problem #1: Indefinite Retention
If the users’ own log is not used to provide them with offline location information (and from Apple’s description of the process, it appears this is not the case) then there is a potential for abuse if the file remains. Android seems to handle this much better, by limiting the locally stored records by quantity. In the event of either abuse or forensic analysis, an investigation will then uncover only a limited view into a device’s location history. While Apple is certainly in a much better position than I am to determine what must remain on the device for current functionality, I would imagine that a reasonable trade-off could be found in either quantity-based or age-based pruning of records to limit the exposure. Alternatively, some method could be provided to purge this history manually, with both Apple and the user accepting a tolerable loss of fidelity.
Update: Apple has decided to regularly prune this information, keeping only 7 days’ worth of cache on the user’s device. The cache will also not be included as part of a device backup. These changes will take effect after a forthcoming free software update. A later major version update will cause the cache to be encrypted on the device itself.
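The two pruning strategies discussed in this section, age-based and quantity-based, can be sketched as follows. This is a hypothetical illustration rather than Apple’s or Google’s implementation: the `prune` function, the (timestamp, source) record layout, and the sample data are all assumptions; only the 7-day window is taken from Apple’s announcement.

```python
from datetime import datetime, timedelta

def prune(records, now, max_age=None, max_records=None):
    """Return the records to keep, oldest first.

    records: list of (timestamp, source_id) tuples (hypothetical layout).
    max_age: drop records older than this timedelta (age-based pruning).
    max_records: keep at most this many of the newest records
                 (quantity-based pruning).
    """
    kept = sorted(records, key=lambda r: r[0])
    if max_age is not None:
        cutoff = now - max_age
        kept = [r for r in kept if r[0] >= cutoff]
    if max_records is not None:
        kept = kept[max(0, len(kept) - max_records):]
    return kept

now = datetime(2011, 4, 27, 12, 0)
# One synthetic record per day, going back 30 days.
records = [(now - timedelta(days=d), f"tower-{d}") for d in range(30)]

by_age = prune(records, now, max_age=timedelta(days=7))
by_count = prune(records, now, max_records=5)
```

Either limit bounds how far back an examiner can look. An age limit gives a predictable time window regardless of how much the device moves, whereas a count limit gives a predictable file size but a window that shrinks as the device sees more towers and access points.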
Smart Devices and the Storage of Forensically Significant Data
This issue is not Apple’s alone. Google’s Android has been shown, via the recently developed android-locdump tool, to collect location information. This issue is also not about location alone. Location is of somewhat increased significance when discussing mobile devices, because they are both personal and designed to communicate bi-directionally, but with multi-gigabyte storage capacity they also are capable of archiving a significant amount of information.
Forensic tools are capable of extracting mountains of information from mobile devices. Multiple books exist on the topic, as do many tools for performing investigations. Screenshots, SMS records, call logs, location information, documents, photos, and more can be extracted and plotted on timelines to give a comprehensive record of a device’s usage. This is not new, but it may be poorly understood by users who are more concerned with the ease and freedom that these powerful and capable gadgets provide to them.
Problem #2: Awareness
Being aware of the capabilities and side effects of technology use is key to making informed risk decisions. In this regard, I am thankful for the work that Allan and Warden did. While Alex Levinson, a forensics expert, has taken exception to the claim that the collection of this location information is new, what we have is a key opportunity to understand that there is a difference between Awareness and Education. Education is a deep understanding that is typically sought only by a few intensely interested parties, whereas Awareness is a general notice that is accessible to those with even a cursory interest. Education on the matter has existed, as evidenced by the work of Levinson and others to write books on the topic; however, Allan and Warden have succeeded, through an arguably sensational visualization of the information, in making the otherwise uninterested cognizant of the capabilities and activities of their devices.
Valuation and Accessibility of Forensically Significant Data
Understanding what is being collected and transmitted is an important step in assessing the risks associated with location data. Many have rightly provided the caveat that, in order to retrieve the information, someone would need to have physical access to a device (in order to use a hardware forensic collection tool), access to a user’s computer (in order to have access to an unencrypted device backup), or privileged access on the device (e.g. through a web-based security vulnerability that allows an attacker to read any content on the device). In each of these scenarios, the attacker gains considerable access to the user’s private information and is capable of doing much more than reading location history. Given these caveats, the exposure of coarse location data may pale in comparison to other details that would likely be exposed simultaneously.
Comparing coarse location data to other information on the device, then, leads many to make the statement that location data is not important. In light of the other details available, location is small potatoes, one might say; after all, they can now read email, view photos of the kids, or peruse the calendar to see when vacations are planned.
A similar review could be made for all information collected by smart devices. Users should consider their threat profiles and investigate whether smart device data collection could lead to unacceptable exposure. For some kinds of data, such a review will often be unnecessary, either because threat profiles do not warrant concern or because detailed analysis is more costly than simple avoidance. That is, when one is sensitive to location history, it may be possible to simply travel without any electronics capable of recording or transmitting one’s location. But for some kinds of information it may not be quite so clear-cut, and further review may be warranted.
Problem #3: One-Size-Fits-All Security
For some, even coarse location detail is too sensitive. It may be even more sensitive than every other piece of information on the device, and may mean that using a device that keeps such a persistent log locally is a bridge too far. Yes, it’s probably true that this is not a high-priority risk for most people, but without awareness and an understanding of the data’s scope, not everyone is able to make a complete evaluation and assessment of risk. For this reason, the recent row over location information on smart phones is significant. Not only has it raised awareness of location data, but it should also prompt further investigation into the other risks that may arise from the increased usage of smart devices. One cannot assume that one size fits all when it comes to security and dismiss this research as unimportant. For many, these risks will continue to be low and often outweighed by the utility these devices provide. But for a few, such discoveries are an invaluable insight.