The European Data Protection Supervisor (EDPS) has developed open source software tools for the automation of privacy and personal data protection inspections of websites.
The EDPS releases its tool Website Evidence Collector under the European Union Public License (EUPL-1.2). The software is available for download via this webpage (see download link below), on the European Commission’s collaborative platform Joinup and on the popular development platform GitHub.
The EDPS welcomes feedback and suggestions for improvement, which can be sent to: email@example.com
Website Evidence Collector
The tool collects evidence of personal data processing, such as cookies or requests to third parties. The collection parameters are configured ahead of the inspection, and the collection is then carried out automatically. The collected evidence, structured in human- and machine-readable formats (YAML and HTML), allows website controllers, data protection officers and end users to better understand which information is transferred and stored during a visit to a website, i.e. the consecutive loading of a number of web pages without giving consent or logging in.
The tool starts Chromium, i.e. the open source browser on which the Chrome browser is based, with a new user profile and loads all web pages included in the visit one after another with no further user interaction. During the visit, the tool collects, among other things:
1. web page screenshots
2. list of HTTP links from the entry web page, categorised as:
- a. internal links (same website)
- b. external links
- c. links to social networks and collaboration services
3. list of visited web pages
4. information stored in HTML5 local storage (including the responsible web page and component causing processing)
5. all cookies in the browser profile (including the responsible web page and component causing processing)
6. the HTTP traffic between the browser and the Internet as a HAR file, in particular:
- a. list of requests identified by the EasyPrivacy filter list as causing behavioural tracking (including the responsible web page)
- b. list of requested first- and third-party hosts
7. all messages exchanged via WebSockets (an alternative transmission method to HTTP requests)
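As an illustration only, the link categorisation in item 2 and the split between first- and third-party hosts in item 6 can be sketched in a few lines of Python. This is a minimal sketch and not the tool's own implementation: the function names, the suffix-based same-site test, the short SOCIAL_HOSTS list and the treatment of the three link categories as mutually exclusive are all assumptions made for the example. The only external convention relied on is the HAR layout, in which each request URL is stored under log.entries[].request.url.

```python
from urllib.parse import urlparse

# Hypothetical, deliberately short list for illustration; the list used by
# the actual tool is not reproduced here and may differ.
SOCIAL_HOSTS = {"facebook.com", "twitter.com", "linkedin.com", "youtube.com"}


def host_of(url: str) -> str:
    """Return the lower-cased host name of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def is_same_site(host: str, site: str) -> bool:
    """Assumed rule: a host belongs to a site if it equals it or is a subdomain."""
    return host == site or host.endswith("." + site)


def categorise_link(url: str, first_party_site: str) -> str:
    """Classify a link as 'internal', 'social' or 'external' (simplified)."""
    host = host_of(url)
    if is_same_site(host, first_party_site):
        return "internal"
    if any(is_same_site(host, social) for social in SOCIAL_HOSTS):
        return "social"
    # Everything else, including URLs without a host, counts as external here.
    return "external"


def split_hosts(har: dict, first_party_site: str) -> tuple[list[str], list[str]]:
    """Split the hosts requested in a parsed HAR document into first and third party."""
    first, third = set(), set()
    for entry in har.get("log", {}).get("entries", []):
        host = host_of(entry["request"]["url"])
        if not host:
            continue  # skip entries without a usable host, e.g. data: URLs
        (first if is_same_site(host, first_party_site) else third).add(host)
    return sorted(first), sorted(third)
```

A HAR file is plain JSON, so it can be loaded with json.load and passed to split_hosts together with the inspected website's host name. A production-grade classifier would more likely compare registrable domains using the Public Suffix List than rely on a simple suffix match.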
Compatibility and Installation
The Website Evidence Collector should be compatible with Windows, macOS, Linux and all other platforms that support Node.js and Chromium. However, the EDPS has run the Website Evidence Collector only on Linux and macOS. For an installation on macOS and Linux without administrator privileges, please follow our advice in the FAQ, available on GitHub or as part of the software download.
To use the tool, you first need to install Node.js and its package manager (npm). You then have two options:
a) obtain the package file linked below, extract the archive and follow the instructions in the README.md file;
b) install the package directly from the command line by following the instructions on our dedicated GitHub page. The video tutorial below gives a demonstration.
Please read the EUPL licensing conditions carefully. As stipulated in its Section 7, the EDPS provides this tool on an ‘as is’ basis and without warranties of any kind, including fitness for a particular purpose, absence of defects or errors, or accuracy.
Since the new EU data protection legislation became applicable, in particular Regulation (EU) 2016/679 (General Data Protection Regulation, GDPR) and the Data Protection Regulation (EU) 2018/1725 for Union institutions, bodies, offices and agencies, many websites have updated their privacy consent management mechanisms and rethought their personal data processing operations. This change, together with personal data breaches on websites, has led to increasing public awareness of website privacy issues and to a growing number of complaints to supervisory authorities.
After a brief introduction, this EDPS tool allows laypersons to gather evidence on the personal data processing operations of websites using a reproducible, reliable and fast method. No third-party cloud service is involved in gathering the evidence. The tool is self-contained and can be used in intranets without internet access. The open source licence allows experts to adapt the tool to their own needs.