We reported earlier that the UK’s Information Commissioner’s Office (ICO) had reopened its Google Street View investigation following an FCC report. The search giant has now responded to the ICO with a lengthy letter from its Global Privacy Counsel, Peter Fleischer, addressing the specific assumptions and questions the ICO put to the company. Google even insists that it did not touch the collected data, nor did it tamper with it.
Here is what Google said in its letter to the ICO (via Telegraph):
18 June 2012
Dear Mr. Eckersley,
Google Street View Wi-Fi Collection
Thank you for your letter dated 11 June 2012.
Google has always said that we are profoundly sorry for having collected payload data from unencrypted networks. As soon as we realised what had happened, we stopped collecting all Wi-Fi data from our Street View cars and immediately informed the authorities. We have done our best to provide clear, comprehensive and accurate information to all investigating in the two years since then.
In the UK, we cooperated fully with the Information Commissioner’s Office (ICO) investigation, including by making payload data available to the ICO for inspection. And, at the conclusion of this investigation, we entered into signed Undertakings with your office committing to enhancements to our privacy practices with a view to preventing a similar incident in the future.
In the months since then, we have improved our training programmes and product development processes. As you know, we expanded privacy and security trainings for new employees, developed new privacy and security training for all staff, and enhanced core training for engineers to focus on the responsible collection, use and handling of data. At the same time, we improved our internal procedures by introducing new documentation for engineering projects and by creating a team to review, and where appropriate audit, those projects.
Your office audited these changes and your team concluded that they had “reasonable assurance that Google have implemented the privacy process changes outlined in the Undertaking”. Your subsequent report verified the improvements we’ve made to our internal privacy structures, training programmes and internal reviews, and identified some scope for continued work. We have worked to implement these recommendations and, as agreed, plan to update your office on our progress on 22 June.
In light of the foregoing, Google is surprised that the ICO has decided to re-open its investigation into this matter. However, as in its previous dealings with the ICO, Google intends to cooperate fully and to respond to the ICO’s questions in an open manner.
* * *
We note at the outset that your letter of 11 June contains a number of statements and assumptions that incorrectly suggest that the disk made available to the ICO for analysis was “pre prepared” and not representative of the payload collection, and that Google had greater knowledge about payload collection prior to its May 2010 blogpost than previously had been disclosed, apparently based on the findings of the United States Federal Communications Commission (FCC). We address those points up front before answering the questions in your letter.
(a) Google did not “pre prepare” data for the inspection.
With respect to the ICO’s inspection of the payload data in July 2010, the data was not “pre prepared”. A hard drive used by one of the Google Street View vehicles that drove in the UK was “mounted” at Google’s Belgian data centre where it, along with other Street View drives, was physically located at the time. “Mounting” the drive merely refers to the process of connecting it to a computer in the Belgian data centre so that the data on the drive may be accessed through the computer’s file system. This process of allowing the ICO to inspect data collected in the UK remotely was agreed with the ICO’s representatives in advance of the inspection. It also is a process that was used for inspections by other European data protection authorities.
As you know, data is stored in binary format on a computer hard drive; that is, in a form which is not human-readable. In order to ascertain whether the hard drive contained any personal data or not, including that of the kind referred to in your letter (emails, URLs and passwords), it was necessary for the data to be viewed in a “text” format, rather than as an indecipherable (to the human brain) series of “1s” and “0s”. This being the case, Google employed a proprietary piece of software called the “Codex” that merely converts binary data (stored in a particular format) into human readable text. Where the underlying binary data does not represent text, such as where it represents an image, the Codex still converts the binary data, but it appears as a meaningless string of alphanumeric characters. To be clear, without Google’s use of the Codex, the data on the hard drive would not have been human-readable or searchable using key-words, which the ICO representatives specifically requested. This Codex was the same one used to convert the binary payload files to human-readable text for other data protection authorities that inspected the payload.
Other than through using the Codex described above, the data on the hard drive inspected by the ICO was not “pre prepared” in any way. Indeed, until the ICO’s inspection, Google had not viewed or analysed the payload data on the hard drive used, and nor has it since.
(b) An Erroneous View of the Extent of Knowledge about Payload Collection within Google
Your letter raises questions about the extent of knowledge of the payload collection in the Company prior to Google’s public disclosure of the activity two years ago. The FCC Report and recent media coverage suggests that there was widespread knowledge. That is not the case. The documents we produced to the FCC, the salient portions of which we have provided to you, show that, at most, a few people early in the project could have seen some red flags in a document or an email and inquired further. But that assumes too much. These few individuals are unequivocal that they did not learn about the payload collection until May 2010. As Google’s submissions to the FCC made clear, the red flags in these handful of documents were missed, as the individuals’ sworn declarations confirm, but this is a far cry from suggesting that Google’s managers knew about the payload collection.
Google searched several million, and manually reviewed over 500,000, documents for indications of knowledge about payload collection, yet only a few were discovered that could have raised a red flag about the collection. In hindsight, had those been recognised, the collection might have been discovered. Both FCC and US Department of Justice attorneys interviewed individuals who saw or could have seen these red flags, and each individual signed a declaration under oath, confirming that each didn’t learn about the payload collection until May 2010.
Google has acknowledged that there were opportunities missed along the way to catch and stop the payload collection. However, it is important to recognise that the purpose of the Wi-Fi collection was to identify wireless access points for location-based services; no project leader asked for or wanted the payload data; and no payload data was ever used in any product or service. That’s the context in which the documents Google has disclosed should be viewed.
(c) Responses to your questions 1 to 7.
1. Please list precisely what type of personal data and sensitive personal data was captured within the payload data collected in the UK?
As explained in paragraph (a) above, Google’s only sight of payload data collected in the UK was the same as that of the ICO’s representatives during the inspection at Google’s offices. It has not further viewed or analysed the UK payload data on this hard drive, nor that stored on any other hard drive. Therefore, Google cannot definitively list what types of personal data and/or sensitive personal data were captured within the payload collected in the UK.
However, as the FCC report notes, at least two European data protection authorities that conducted extensive inspections with technical personnel indicated that some percentage of the payload data did include entire emails and URLs, as well as passwords. This is referred to in the Undertakings agreed with the ICO, as well as publicly in a blogpost that we posted on 22 October 2010, four weeks prior to entering into our Undertakings. Other data protection authorities reported that they found only a small percentage of the payload data collected in their jurisdictions contained such personal information.
While Google is not in a position to inspect the payload data collected in the UK, it has no reason to suppose that the categories of payload data collected in the UK were substantively different to those referenced in its 22 October 2010 blogpost or those discovered by other data protection authorities. Google also believes that the proportion of that payload data which amounted to personal data was very small.
2. Please confirm at what point Google managers became aware of the type of payload data being captured during operations in the UK and what technological or organisational measures were introduced to limit further data collection prior to the admissions made by Google Inc on the blogpost dated 14th May 2010?
As Google has previously said, the purpose of the project was to identify Wi-Fi access points for use in location-based services. No project leader asked for or wanted the collection of payload data (and as noted many times, the proof is that payload data was not used in any product or service). Google manually reviewed over 500,000 documents and identified only a few documents that arguably could have raised a red flag about the payload collection. The FCC Report accurately concludes that “[m]anagers of the Street View project and other employees who worked on Street View have uniformly asserted in declarations and interviews that they did not learn the Street View cars were collecting payload data until April or May 2010”.
Google has released the documentation it provided to the FCC now, rather than waiting for the full Freedom of Information Act process to be completed, to promptly address suggestions in the FCC Report about the extent to which others at the Company knew about the payload collection. The documents Google released show some isolated references to payload collection early in the project that could have been seen as red flags by the recipients with the benefit of hindsight. But in context, at the time, the red flags were missed or not understood. The individuals themselves have said under oath that they did not learn of the payload collection until it became public in May 2010.
By way of example, the FCC Report makes much of the fact that the Engineer sent the design document by email to the Street View team, announcing that he had completed the Wi-Fi system design. In the overall technical description of the configuration necessary to identify Wi-Fi access points, the document does include an ancillary reference to the collection of “user traffic patterns”. But Google was unable to identify anyone who read the document, let alone who would have read the technical document in detail and understood the reference.
Google has acknowledged that there were missed opportunities to learn about the payload data collection as part of this project. But Google managers confirmed that they did not learn of the payload data collection, let alone the type of payload data being captured during operations in the UK, until May 2010 at or around the time Google made its blogpost.
3. Please provide a substantial explanation as to why this type of data was not included in the pre prepared data sample presented to and viewed by staff from the Information Commissioner’s Office.
As explained under (a) above, there was no “pre preparation” of the payload data viewed by the ICO’s staff; it was merely rendered “readable” using Google’s Codex software so that it could be understood and queried using search terms of the ICO’s choosing. It also is not correct to describe the data presented as a “sample” as this incorrectly suggests that Google only provided the ICO with an extract of the data stored on the hard drive. In fact, Google made available the complete hard drive (albeit remotely mounted), containing 700 GB of data, which was collected in towns in the East of England.
The minutes of the ICO’s inspection of this hard drive record that numerous keyword searches were made, but only a minimal amount of data that was even recognisable as English language words was found. The most likely reason why the ICO failed to find any significant personal data is because it was only present in very small quantities on the hard drive concerned. As also set out in the ICO inspection minutes, the composition of data on the Google Street View hard drive inspected by the ICO was estimated to be as follows:
Approximately 0.0131% of the data stored on the hard drive was Wi-Fi data (the remainder of the data consisting primarily of Street View images and related data which occupies the majority of the space on the hard drive); and
Approximately 1.5% of the Wi-Fi data was payload data (as opposed to network data such as SSIDs, etc).
Having not analysed any of the payload data collected by the Google Street View vehicles, we have no information on what proportion (if any) of payload data may have been “personal data.” However, we consider it likely to be a small percentage having regard to the manner in which the payload was collected — namely, using Street View vehicles that were on the move and using Wi-Fi equipment that automatically changed channels five times per second.
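For a sense of scale, the two percentages quoted from the inspection minutes can be converted into rough absolute sizes. The following is only a back-of-the-envelope sketch and is not a calculation that appears in the letter. It assumes that "700 GB" means decimal gigabytes (1 GB = 1000 MB) and that the 0.0131% figure applies to the full 700 GB; the variable names are illustrative.

```python
# Rough sizes implied by the percentages in the ICO inspection minutes.
# Assumptions (not stated in the letter): decimal units, and the Wi-Fi
# share is taken as a fraction of the full 700 GB on the drive.
DRIVE_GB = 700                 # data on the inspected hard drive
WIFI_SHARE = 0.0131 / 100      # share of the drive that was Wi-Fi data
PAYLOAD_SHARE = 1.5 / 100      # share of the Wi-Fi data that was payload

wifi_mb = DRIVE_GB * 1000 * WIFI_SHARE   # Wi-Fi data in megabytes
payload_mb = wifi_mb * PAYLOAD_SHARE     # payload data in megabytes

print(f"Wi-Fi data:   about {wifi_mb:.1f} MB")
print(f"Payload data: about {payload_mb:.2f} MB")
```

Under those assumptions, the Wi-Fi data on the inspected drive comes to roughly 92 MB, of which about 1.4 MB would be payload data.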
4. At what point had the senior managers within Google seen the software design documents and been briefed about the code and precisely what type of data it could capture during the development process and actual capture of payload data?
Google has not identified any senior managers who reviewed the software design document or who were “briefed” by the Engineer or anyone else about the collection of payload data. The FCC documents we provided to you contain the communications that we uncovered where, in hindsight, a red flag should have been raised, but again these individuals have stated in sworn declarations that they did not learn about the payload collection until May 2010.
5. Please provide copies of the original software design document and any subsequent version control software documentation and associated logs used to record managerial decisions and rationale?
A copy of the relevant design document is included with this response. There are no associated logs used to record managerial decisions in regard to the design documents or otherwise.
6. Please outline in full the privacy concerns identified by Google Managers once the engineer revealed the practice, including details of how this threat was managed and what decisions were made to continue or terminate this practice?
We understand the reference in this question to “the engineer revealed the practice” as referring to the communications involving the engineer that are described in the FCC Report. However, as already noted above, at most, a few people early in the project could have seen some red flags and inquired further, but these individuals are unequivocal and have signed sworn declarations that they did not learn of payload collection before May 2010.
To ask what privacy concerns were identified by Google Managers erroneously presupposes that managers recognised the red flags identified in the documents Google has produced. As described above, however, this was not the case. And since there was no such recognition, the questions of how to manage the threat and/or whether to continue or terminate the practice simply did not arise.
7. Please outline what measures were introduced to prevent breaches of the Data Protection Act of 1998 at each stage of the Google Street View process.
As a result of this incident, and as noted in this letter above, Google has introduced a series of enhancements to our training programmes and product development processes, many of which are articulated in our Undertakings and many of which were audited by your office in July 2011. As mentioned above, we expanded privacy and security trainings for new employees, developed new privacy and security training for all Googlers, and enhanced core training for engineers to focus on the responsible collection, use and handling of data. At the same time, we improved our internal procedures by introducing new documentation for engineering projects and by creating a team to review, and where appropriate audit, those projects.
Your office audited these changes and your team concluded that they had “reasonable assurance that Google have implemented the privacy process changes outlined in the Undertaking.” Your subsequent report verified the improvements we’ve made to our internal privacy structures, training programmes and internal reviews, and identified some scope for continued work. We have worked to implement these recommendations and, as agreed, plan to update your office on our progress on the 22 June.
* * *
We hope that having considered the above information and our responses to your questions, your office will share our view that the recent publication of the FCC’s findings does not in any way change the position from that at the time that Google and the ICO agreed Undertakings in November 2010.
Finally, as requested, we enclose copies of the certificates of destruction in respect of payload data collected in the UK, which were originally provided to your office in November and December 2010.
Global Privacy Counsel