Making sense of the GDPR as a software engineer (and social human being)
Article 25 of the General Data Protection Regulation (GDPR) establishes data protection by design and by default as a general obligation for data controllers. From a software engineer’s perspective, distilling the relevant information and concrete action points from this new regulation is genuinely challenging, and it is easy to get stuck.
Software engineers need concrete, actionable answers about what to do and how to do it. Unfortunately, legislation and legislators are hardly ever concrete, either because the law must be inclusive enough to cover many cases, or because of the risk of it being misinterpreted. In very simple terms, however, software is binary and cannot cope with answers such as “it depends” (nor can most software engineers). For instance, data is either collected or not; the decision cannot depend on a context that requires human interpretation.
My common sense understands why the law is formulated in such a way, but my software engineering side struggles with this uncertainty. When considering point 1 of Art. 25, my engineering side raises a lot of questions.
The ambiguity of Art. 25 of the GDPR
The first point of Art. 25 makes several statements, which are quoted and commented on below.
“taking into account the state of the art”
Well, for starters, it would be useful to state what the working group means by “state of the art”. What is the state of the art when it comes to enforcing data minimisation, or lawfulness, or transparency, or even accuracy?
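To make the ambiguity concrete, here is a minimal sketch of one possible reading of “data minimisation”: keep only the fields the stated processing purpose actually requires. The field names and the purpose are purely illustrative assumptions, not anything the regulation prescribes.

```python
# One naive interpretation of data minimisation: whitelist the fields
# needed for the processing purpose and drop everything else.
# Hypothetical purpose: per-country usage statistics.
REQUIRED_FIELDS = {"user_id", "country"}

def minimise(record: dict) -> dict:
    """Drop every field not needed for the stated purpose."""
    return {k: v for k, v in record.items() if k in REQUIRED_FIELDS}

raw = {
    "user_id": 42,
    "country": "PT",
    "email": "alice@example.com",   # not needed for the purpose
    "birthdate": "1990-01-01",      # not needed for the purpose
}
print(minimise(raw))  # {'user_id': 42, 'country': 'PT'}
```

Even this trivial sketch hides the hard question: who decides, and on what basis, which fields a purpose “requires”? That decision is exactly the kind of human interpretation software cannot make on its own.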
As far as I understand, information security, more specifically confidentiality and integrity, does have a state of the art: it is called encryption and message authentication codes/signatures. Although most software engineering education programs do not explore information security in much detail, software developers do know how to enforce data confidentiality and integrity. What about data accuracy? What is the state of the art to enforce accuracy?
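The integrity half of that state of the art can be shown with nothing but the Python standard library. This is a sketch, not a complete design: the key is generated in place for illustration, whereas a real system would need proper key management (and, for confidentiality, an actual encryption scheme, which the standard library does not provide).

```python
# Integrity via a message authentication code (MAC), stdlib only.
import hashlib
import hmac
import secrets

key = secrets.token_bytes(32)          # shared secret key (illustrative handling)
message = b"date_of_birth=1990-01-01"  # some personal data to protect in transit

# Sender computes an authentication tag over the message.
tag = hmac.new(key, message, hashlib.sha256).digest()

# Receiver recomputes the tag and compares in constant time.
assert hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).digest())

# Any tampering with the message invalidates the tag.
tampered = b"date_of_birth=1980-01-01"
assert not hmac.compare_digest(tag, hmac.new(key, tampered, hashlib.sha256).digest())
```

Note how unambiguous this is compared to “accuracy”: the tag either verifies or it does not. There is no equally mechanical check for whether a date of birth is *accurate*.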
Please don’t get me wrong: I do know that there is a lot of research on Privacy Enhancing Technologies (PETs), but more often than not these techniques are not the solution, because they tend to be very context-specific.
“taking into account the risks of varying likelihood”
Right, but this could hardly be more vague. What is the actual risk of not being transparent when processing personal data? What type of personal data would actually pose a risk if processed without transparency? I think regulators have a lot to say here, but unfortunately there are no concrete answers yet.
“the data controller shall implement appropriate technical and organisational measures to implement data protection principles…in an effective manner…to integrate safeguards into the processing”
What is considered technically appropriate? What are effective ways of enforcing the data protection principles? For instance, consider pseudonymisation: is it enough to hash personal data with a collision-resistant hash function? Does the hash need to be salted? Does it need a secret key? Is hashing actually a good mechanism for enforcing pseudonymisation?
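These questions are not rhetorical; they have concrete security consequences. The sketch below (stdlib only, illustrative values) shows why a plain hash of a low-entropy identifier such as an email address is weak pseudonymisation: anyone who can guess candidate inputs can recompute the hash and match it, whereas a keyed hash (HMAC) resists that attack as long as the key stays secret.

```python
# Plain hash vs. keyed hash as pseudonymisation mechanisms.
import hashlib
import hmac
import secrets

email = b"alice@example.com"

# Plain SHA-256: deterministic, so an attacker with a list of candidate
# emails can simply hash each one and match against the stored value.
plain = hashlib.sha256(email).hexdigest()
assert hashlib.sha256(b"alice@example.com").hexdigest() == plain  # trivially re-identified

# Keyed hash (HMAC with a secret key): without the key, recomputing
# the pseudonym from a guessed email is infeasible.
key = secrets.token_bytes(32)
keyed = hmac.new(key, email, hashlib.sha256).hexdigest()

# An attacker guessing the email but not the key gets a different value.
attacker_key = secrets.token_bytes(32)
assert hmac.new(attacker_key, email, hashlib.sha256).hexdigest() != keyed
```

Even this leaves open questions the regulation does not answer: where is the key stored, who may use it, and is keyed hashing pseudonymisation at all once the key holder can re-identify everyone?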
My main concern is this: if a single article raises so many questions and so much uncertainty, how can we expect all systems to be GDPR compliant by May 2018? Remember: the GDPR has more than 90 articles.
Note, however, that I am not arguing that the GDPR will fail to bring about more privacy-friendly software systems; quite the opposite. From my perspective, the GDPR is and will be spreading awareness of privacy and data protection, and it must be enforced by the regulators. Nevertheless, we need a more constructive and concrete way of effectively implementing the GDPR.
In my opinion, shallow and superficial data protection assessments are not the solution to making software GDPR compliant. Education, awareness, training, and research are, and we need to start from there.