Who's responsible for personal data in cloud computing?
You and your SaaS, PaaS and IaaS providers
Published 16:03, 23 May 11
A business holding personal data about other people, eg its customers, is "controller" of that data under the EU Data Protection Directive (DPD).
If it chooses to store or work on that personal data in the cloud, it remains controller. It can't offload its data protection law responsibilities just by putting the data into the cloud.
That much is clear.
(Note. I'll assume that all cloud customers discussed below are "controllers" under the DPD, and aren't exempt eg because they're holding personal data for purely personal, not business, purposes. Also, only controllers with certain EEA connections are within the DPD's scope. I'll cover the required connections in a future article, but when controllers are mentioned below, I assume that they have that connection.)
Cloud services providers
What about providers of cloud services?
Now, a cloud provider is "controller" of its human customers' personal data, whether obtained in the sign-up process or from their use of its service. (I say "human" because most EU states give data protection law rights only to humans, not non-individuals like companies.)
The more interesting and difficult question is, what's the provider's position if its customer uses its service to process other people's personal data, eg of the customer's own customers?
The now well-known key categories of these services are IaaS, PaaS and SaaS. But it's important to note that cloud services can be "stacked" or layered.
An internet startup offering SaaS applications or services online, eg contacts management or photo sharing, could develop and deliver its services using a third party's IaaS or PaaS behind the scenes, instead of its own servers. Many have.
One question there is, to what extent is the SaaS provider responsible for personal data processed via its service by its own customers? But a further question is, to what extent is the IaaS or PaaS provider responsible for personal data processed via its services by the SaaS provider, or indeed by the SaaS provider's customers?
Personal data processed in the cloud
Take, as a concrete example, a SaaS provider of a webmail service, whose customers store other people's contact details, and of course emails to and from the customer, using the provider's service. It in turn uses an IaaS or PaaS provider to offer its service to end users. The SaaS provider itself might (or might not) be mining or running ads against that personal data, etc. What are the data protection law responsibilities of the SaaS provider there? Of the IaaS/PaaS provider?
Is the IaaS/PaaS provider merely a "processor" of that personal data on behalf of its customer, with contractual obligations to the customer to process data only as they instruct and to take certain security measures? Or, is it a "controller" of that data, with concomitant full EU data protection responsibilities?
The answer's not straightforward, because the distinction between "controller" and "processor" isn't always clear. To complicate matters, there can be more than one controller for a single processing operation, or for different operations on the same personal data.
A "controller" determines the "purposes and means" of personal data processing. A "processor" processes the data for a controller, as in outsourced data processing. But processors might well determine "means" of processing, eg software used, so how do you draw the line?
Even before cloud computing, the binary controller/processor distinction was increasingly blurring, in the face of complicated modern multi-partite data processing arrangements. EU data protection regulators collectively tried to provide guidance, suggesting a concept of "effective means": you're "controller" if you control the "effective means" of processing, not just minor technical means. But the guidance, while helpful, leaves some grey areas.
It's still not easy to decide who's a "controller". With the possible layers of providers and sub-providers in cloud computing, it's often unclear which party determines (and to what extent) the "means" of processing personal data in the cloud, such as security measures.
Controller, processor - neither?
Most discussions to date in this area have focused on whether cloud providers are processors or controllers. But we at the Cloud Legal Project suggest there's another possibility: Neither.
If I sell you a computer, which you use on your premises to process personal data in your business, you're controller of that personal data. I'm not your processor; I've just provided you with a processing tool. Even if I've pre-loaded software on that computer for your use, I'm not a processor. Similarly, if I rent out my computer to you, I'm not a processor (assuming I've not planted spyware on it; if I have, I'm probably even a controller!). Maintaining and upgrading hardware and software shouldn't make me a processor.
Going further, if I keep a computer on my own premises, which I let you come and use, again I'm providing you with facilities and tools - but that doesn't make me a processor. Why should it be any different if I rent out computing hardware and software, which customers use over the internet, but which I keep and maintain on my own premises?
Arguably, in that case I'm not even a "processor", just a resources provider - especially if customers use my services for raw processing power to crunch data, and never store the resulting data permanently on my servers, but on their own local computers.
However, if customers store personal data in persistent storage in my data centres, then I'm likely to be a "processor". That's because, under the DPD, "processing" includes almost anything that can be done with data using a computer, including storage and transmission.
So, under current laws, someone who stores personal data for a controller is a "processor" - even if the storage provider has no idea whether the data are "personal data" or not, and possibly even if the data are securely encrypted and it doesn't have the key (we've argued previously that securely-encrypted data shouldn't be considered "personal data" in the hands of someone who doesn't have the key, but the issue's not clear).
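The reasoning in this section can be read as a rough decision rule. The sketch below only illustrates that argument and is not a statement of the law: the `Provider` fields and the `role_under_current_dpd` function are hypothetical names invented for this example, and real cases turn on facts that two yes/no flags cannot capture.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    CONTROLLER = "controller"
    PROCESSOR = "processor"
    NEITHER = "neither (resources provider)"


@dataclass
class Provider:
    # Hypothetical flags, simplified from the examples in the text.
    uses_data_for_own_purposes: bool  # eg has planted spyware on a rented computer
    stores_data_persistently: bool    # customer data kept in the provider's data centres


def role_under_current_dpd(p: Provider) -> Role:
    """Rough reading of the argument above; an assumption, not legal advice."""
    if p.uses_data_for_own_purposes:
        # The text says such a provider is "probably" a controller.
        return Role.CONTROLLER
    if p.stores_data_persistently:
        # Storage counts as "processing" under the DPD.
        return Role.PROCESSOR
    # Tools or raw processing power only, with nothing stored permanently.
    return Role.NEITHER


examples = {
    "computer seller": Provider(False, False),
    "compute-only IaaS": Provider(False, False),
    "persistent storage provider": Provider(False, True),
    "lessor with spyware": Provider(True, False),
}

for name, provider in examples.items():
    print(name, "->", role_under_current_dpd(provider).value)
```

Note that the rule never asks whether the provider knows the data are personal, or whether it holds a decryption key. That omission reflects the position under current law described above.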
Defences for lack of knowledge/control
Now, let's consider specifically what I'll call infrastructure providers: IaaS and PaaS providers, and providers of utility storage as a service.
To encourage online services and e-commerce, the EU E-Commerce Directive gave internet hosts, caches and "mere conduits" certain defences against liability for content they store or pass through. This was because web hosts, for example, don't necessarily know what kind of content their customers store on their servers. Hosts generally aren't liable unless they get notice or know that the content is illegal, and fail to take it down.
However, this Directive specifically excluded the DPD, and therefore "personal data" content.
Surely, as with web hosts, infrastructure providers generally won't know what kind of data their customers store or process using their services. That's one reason why we call it the "cloud of unknowing".
Is particular stored information "personal", or not? Without actually looking, they won't know. (They probably do have the legal right to look - most cloud providers, in our earlier survey of standard TOS, reserved contractual rights to access data stored with them by their customers. Depending on the service's design, many may technically be able to login as the customer and read the data, or may have the decryption key. In practice, particularly given the huge amounts of data stored, they may not actually do that unless, for instance, law enforcement authorities ask them to.)
Going further, with securely-encrypted data where the provider doesn't have the key, they can't know whether the data's "personal data", or not.
Shouldn't infrastructure providers be allowed similar defences in relation to "personal data" content?
Although the E-Commerce Directive specifically excluded data protection law, we propose it should now be included. Just as web hosts lose their defences on acquiring the appropriate knowledge and control, so too we suggest infrastructure providers should not be treated as "processors" of any personal data processed using their services, unless and until they gain sufficient knowledge and control (access).
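As a rough illustration of the proposal, the test for infrastructure providers could be sketched as follows. The class, field and function names are hypothetical, and treating "sufficient knowledge and control" as two simple flags is an assumption made for brevity; the proposal itself does not define thresholds.

```python
from dataclasses import dataclass


@dataclass
class InfrastructureProvider:
    # Hypothetical flags standing in for "sufficient knowledge and control (access)".
    knows_data_is_personal: bool  # eg has been given notice of what is stored
    can_access_data: bool         # eg can log in as the customer, or holds the key


def is_processor_under_proposal(p: InfrastructureProvider) -> bool:
    """Processor status arises only once both knowledge and access exist."""
    return p.knows_data_is_personal and p.can_access_data


# Storage of securely-encrypted data, key held only by the customer.
encrypted_store = InfrastructureProvider(knows_data_is_personal=False,
                                         can_access_data=False)

# Provider that could read the data but has never looked.
unknowing_host = InfrastructureProvider(knows_data_is_personal=False,
                                        can_access_data=True)

# Provider that has been told the data are personal and can read them.
notified_host = InfrastructureProvider(knows_data_is_personal=True,
                                       can_access_data=True)

for label, p in [("encrypted store", encrypted_store),
                 ("unknowing host", unknowing_host),
                 ("notified host", notified_host)]:
    print(label, "-> processor?", is_processor_under_proposal(p))
```

The design mirrors the web host defences: status changes when the provider's knowledge changes, not when the customer's data changes.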
Infrastructure providers are qualitatively different from SaaS providers like social networking sites, where the service by its nature involves processing data known to include "personal data". The latter kind of provider might well be a processor - or even controller, if eg it uses personal data for non-agreed purposes such as allowing third parties access to it. But, for infrastructure providers, we argue that the cloud of unknowing should not be the cloud of DPD "processing".
End to end accountability
One might ask, wouldn't it reduce protection for personal data if infrastructure providers aren't made to commit to processor-style contractual obligations, eg to secure data? The answer is two-fold.
Yes, of course, security is important. But, firstly, it's the cloud customer holding personal data, who knows the nature and sensitivity of data it wishes to process in the cloud; and it's the customer who is, and remains, responsible for the personal data, and for the data's proper protection.
A prospective cloud customer needs to consider the situation proactively, and think through it fully, before processing the data in the cloud. If it's not confident about the security of a cloud provider, it shouldn't use it; if the provider can access the data and won't promise not to give it up to third parties, the customer should encrypt the data first, or use another provider; if it's not sure the provider will back up the data adequately, the customer should back up the data internally, or to another cloud service.
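The three checks in that paragraph can be written down as a simple pre-flight checklist. This is a minimal sketch under stated assumptions: `ProviderAssessment` and `actions_before_using` are hypothetical names, and the yes/no answers stand in for what would in practice be a fuller risk assessment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProviderAssessment:
    # Hypothetical answers a prospective customer records about one provider.
    security_trusted: bool
    provider_can_access_data: bool
    promises_no_third_party_disclosure: bool
    backups_trusted: bool


def actions_before_using(a: ProviderAssessment) -> List[str]:
    """Return the steps the customer should take before putting data in the cloud."""
    if not a.security_trusted:
        # No further checks matter if security itself is in doubt.
        return ["do not use this provider"]
    actions = []
    if a.provider_can_access_data and not a.promises_no_third_party_disclosure:
        actions.append("encrypt the data first, or use another provider")
    if not a.backups_trusted:
        actions.append("back up the data internally, or to another cloud service")
    return actions


assessment = ProviderAssessment(
    security_trusted=True,
    provider_can_access_data=True,
    promises_no_third_party_disclosure=False,
    backups_trusted=False,
)

for step in actions_before_using(assessment):
    print("-", step)
```

An empty list means none of the three concerns applies; it does not mean the customer's own responsibilities as controller fall away.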
In other words, if the risk of a provider compromising or losing data is a concern, arguably it's the prospective cloud customer who should very clearly be made responsible for considering and addressing the risks.
Secondly, we suggest that the current simplistic binary controller/processor distinction, which is very difficult to apply in the real world, should be abolished.
Instead, we propose a principle of end to end accountability for personal data, such as in the Canadian Personal Information Protection and Electronic Documents Act 2000.
Of the many parties involved through the data life cycle, only some should be considered to be processing personal data, with varying degrees of obligations and liabilities under data protection law, based on issues such as the risk of harm, as we have suggested previously.
That approach would, we feel, help achieve a more appropriate balance between commercial and privacy considerations in modern complex relationships, given the potential for stacking of services and the increasing integration of supporting services in cloud computing.
Review and consultation
The Data Protection Directive is being updated, with a draft new Directive due out later in 2011; and the European Commission recently launched a public consultation on cloud computing, including data protection, closing on 31 August 2011. The law here is therefore in a state of flux, and it will be interesting to see how issues like the above are addressed.
The full paper discussing the above in detail, along with other related issues, "Who is Responsible for 'Personal Data' in Cloud Computing? The Cloud of Unknowing, Part 2", is available for free download.
"'Personal Data' in Cloud Computing - What Information is Regulated? The Cloud of Unknowing, Part 1" dealt with "personal data" in the cloud.