Ethical, Scalable Survey Design for GDPR Impact Study

Table of Links

Background to the GDPR
Literature Review

3.1 Consumer awareness and knowledge of the regulation

3.2 Consumer awareness and knowledge of the regulator

3.3 Consumer perceptions of privacy

3.4 Business response to Data Protection regulation

3.5 Employee awareness of their employer’s Data Protection regulator

3.6 Employee perception of benefit of the GDPR to their employer

3.7 The research goal is the consumer/employee perception of the GDPR

3.8 Summary
Methods

4.1 Design

4.2 Data Analysis and 4.3 Ethical considerations
Analysis and Results

5.1 Background demographics and 5.2 Hypothesis 1: Consumers are aware and knowledgeable about the GDPR

5.3 Hypothesis 2: Consumers lack awareness and knowledge about the regulator

5.4 Hypothesis 3: Consumers feel their privacy is better since GDPR was introduced

5.5 Hypothesis 4: Companies have responded to GDPR and made changes

5.6 Hypothesis 5: Employees lack awareness of the GDPR regulator at work

5.7 Hypothesis 6: Employees have seen little benefits to their company from GDPR

5.8 Research question: GDPR: Is it worth it? and 5.9 A regression model based on the dual professional-consumer perspective
Discussion and 6.1 High consumer awareness and knowledge of the GDPR

6.2 Respondents lacked a formed opinion and 6.3 GDPR has driven changes

6.4 Perceptions of privacy have improved and 6.5 The profile of the regulator may not matter

6.6 Regulator Enforcer and 6.7 GDPR is worth it if...

6.8 Implications

6.9 Limitations and future work
Conclusion, Funding and Disclosure Statement, and References

4 METHODS

The hypotheses underlying the research question lend themselves to qualitative and quantitative analysis. Interview and experimental methods were considered and discounted: recruiting, at scale, individuals from diverse organisations who have been in continuous employment for over five years is unfortunately infeasible with those methods. A survey-based method was therefore both a sensible and a realistic option.

An early key decision was to limit the survey to the UK. The GDPR may be the same across the EU, but the composition of the national regulators and how they implement the regulation differ considerably for historic reasons. Expanding the survey to mainland Europe might have offered a broader cross-cultural perspective, but it would have come at the cost of introducing too many variables into the study.

4.1 Design

The survey was developed in three phases: a test to check interest, a test to check potential population size, and the final study. Participants were recruited via Prolific, an on-demand platform for connecting researchers with volunteers worldwide. The data was collected using the Qualtrics survey platform between the 30th of May and the 16th of June 2022.

In phase #1 (N=10), we used a fast one-minute survey to test the strength of interest in the topic, as we had noted that some research topics lay ignored for weeks on the platform. In phase #2 (N=273), we used a longer three-minute survey to confirm there were enough individuals with relevant experience on Prolific to warrant a full study. To ensure respondents had worked both pre- and post-GDPR, we pre-screened for at least 5 years of tenure with the same organisation. We also asked whether they had heard of the initials GDPR. Respondents who answered that the initials were unfamiliar to them (only 7%) were paid, thanked and dropped from the survey; this share nonetheless provides a measure of basic unawareness of the GDPR. We found that respondents answered the survey suspiciously quickly, so we redesigned it to add more nonsense and attention checks, and repeated and reworded some questions to measure response consistency. The final survey design can be found in Appendix C.

Based on the responses in phase #2, we expected to find a medium-sized effect (≈ 0.5) in the main study. For one-tailed repeated-measures t-tests at a significance criterion of 𝛼 = 0.05, the minimum sample size is 45 participants. To leave room for Bonferroni corrections, a sample of 90+ should allow us to achieve statistically significant and generalisable results.
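The sample-size reasoning above can be approximated with a short calculation. The following is a minimal sketch, not the authors' procedure: it uses the normal approximation to the paired t-test, and the target power of 0.95 is an assumption, since the text does not state it. The function name and the choice of six tests for the Bonferroni correction (one per hypothesis) are likewise illustrative.

```python
from math import ceil
from statistics import NormalDist


def paired_t_sample_size(effect_size, alpha=0.05, power=0.95, one_tailed=True):
    """Approximate the sample size for a repeated-measures t-test.

    Uses the normal approximation n = ((z_alpha + z_power) / d) ** 2.
    This slightly underestimates the exact noncentral-t result.
    """
    z = NormalDist()
    tail_alpha = alpha if one_tailed else alpha / 2
    z_alpha = z.inv_cdf(1 - tail_alpha)
    z_power = z.inv_cdf(power)
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


# Effect size and alpha come from the text; power=0.95 is an assumption.
n_uncorrected = paired_t_sample_size(0.5, alpha=0.05, power=0.95)

# Bonferroni correction: divide alpha by the number of tests.
# Six tests (one per hypothesis) is an assumption for illustration.
n_bonferroni = paired_t_sample_size(0.5, alpha=0.05 / 6, power=0.95)

print(n_uncorrected, n_bonferroni)
```

Under these assumptions the uncorrected estimate is 44, one short of the 45 reported, which is consistent with the approximation being slightly optimistic. The Bonferroni-corrected estimate is larger but stays below the 90+ target.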

Thus, in phase #3, we recruited N=102 participants. A representative demographic distribution of the UK was enforced in sampling from the phase #2 database to make the research findings more generalisable. The final sample shows an appropriate distribution in terms of gender, age and education compared to census data and consists of 51 female and 51 male respondents with an average age of 45 years for both sexes. Answering the final survey took 10 minutes on average. Seven participants failed exactly 1 of the 7 nonsense, attention and consistency checks. No one failed more than one, so we did not exclude any responses from our analysis. A statistical analysis of mouse and keyboard browser events showed that participants paused and answered questions thoughtfully. Together, these checks indicate that the responses are of high quality.
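The exclusion rule implied above (retain a respondent who fails at most one of the seven checks) can be sketched as follows. The function name and the threshold parameter are hypothetical, and the list of failure counts is reconstructed only from the totals reported in the text, not from the actual dataset.

```python
def passes_quality_screen(failed_checks, max_failures=1):
    """Return True if a respondent failed no more than max_failures checks."""
    return failed_checks <= max_failures


# Failure counts reconstructed from the reported totals:
# 7 of 102 respondents failed exactly one check, nobody failed more.
failure_counts = [1] * 7 + [0] * 95

retained = [count for count in failure_counts if passes_quality_screen(count)]
excluded = len(failure_counts) - len(retained)

print(len(retained), excluded)
```

With a threshold of one failure, all 102 responses are retained, matching the decision not to exclude anyone.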

4.2 Data Analysis

The survey consisted of open, closed, slider and multiple-choice questions on a 7-point Likert scale. It focused on the six hypotheses before finishing with the central research question. While the main part of the survey was framed neutrally, we ended with a series of provocative statements to draw out participants' emotional reactions to the GDPR.

The qualitative analysis of the free-text responses was informed by Braun and Clarke's six-step process (2006) and Williams and Moser's art of coding and thematic exploration (2019). The first author coded all the data, following an open-axial-selective coding process. An open, primarily inductive process was used to develop an initial codebook. The codes were discussed and refined with the second author in weekly meetings. Any inter-coder differences of interpretation were resolved by discussion. This process eventually led to the themes presented in this work. The codebooks and statistical distribution are in Appendix A.

The quantitative analysis was executed in Python. The precise statistical tests for each question are described in the Analysis and Results section. The data and reproducible analysis pipeline are available at osf.io[1].

4.3 Ethical considerations

The authors’ departmental Research Ethics Committee approved this study. The online survey was designed to include pseudonymity, confidentiality and informed consent. The study does not identify individual participants, and we did not ask questions that could identify the individuals or their organisations. The participants were aware of the research’s purpose, the researchers involved, and their role in it. Participants were offered compensation at a rate of £10 per hour for participating.

Authors:

(1) Gerard Buckley, University College London, UK;

(2) Tristan Caulfield, University College London, UK;

(3) Ingolf Becker, University College London, UK.

This paper is available on arXiv under the CC BY 4.0 DEED license.