Privacy and GDPR

Klervi edited this page Jul 3, 2019 · 2 revisions


The General Data Protection Regulation (GDPR), the European regulation on the protection of personal data, was adopted on April 27, 2016 after four years of negotiations. Because it is a regulation, it applies directly in each Member State without requiring a national law to implement it. It is meant to harmonize the statutes on personal data protection within the European Union and bring the principles of protection into line with the realities of the digital era. It has applied since May 25, 2018.

See this article about GDPR

Vocal Assistant and AI

In order to process “personal data,” companies must obtain opt-in consent from users:

Consent must be clear and distinguishable from other matters and provided in an intelligible and easily accessible form, using clear and plain language. It must be as easy to withdraw consent as it is to give it. Explicit consent is required only for processing sensitive personal data; in this context, nothing short of “opt in” will suffice. However, for non-sensitive data, “unambiguous” consent will suffice.

It’s safe to say that these devices will be “processing sensitive personal data” and that explicit consent will be required in every case.

There’s no explicit mention of smart speakers in the GDPR documentation. However, artificial intelligence is addressed to some degree in Article 22 (“Automated individual decision-making, including profiling”), which says:

The data subject shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her [unless explicit consent is provided].

Most consumer-facing AI technologies, including smart speakers and self-driving cars, will require explicit opt-in consent in Europe. For Echo or Home, it might be as simple as a verbal statement played upon setup, which asks for the owner to OK use of his or her personal data. Alternatively, there might need to be ongoing or periodic disclosures and consent.

There’s currently a lack of clarity about what will be specifically required from smart speaker makers.

Server

BackOffice

The conversation flow is designed with the IBM Node-RED UI.

  • The UI is served over HTTPS.
  • The UI is protected by a password stored as a bcrypt hash.
  • The UI can be disabled in production.
  • Access to the UI can be restricted to specific IP addresses.
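
The four protections above map onto options in Node-RED's settings.js. The fragment below is a minimal sketch, not the project's actual configuration: the certificate file names, the NODE_RED_ADMIN_HASH variable and the allowlist values are assumptions made for illustration.

```javascript
// Sketch of a Node-RED settings.js fragment; every value is a placeholder.
const fs = require("fs");

// Hypothetical allowlist of addresses permitted to reach the editor.
const ALLOWED_IPS = ["127.0.0.1", "::1"];

// Express-style middleware: reject editor/admin requests from other addresses.
function ipAllowlist(req, res, next) {
  const ip = req.ip || (req.socket && req.socket.remoteAddress);
  if (ALLOWED_IPS.includes(ip)) {
    return next();
  }
  res.statusCode = 403;
  res.end("Forbidden");
}

const settings = {
  // Serve the editor over HTTPS (file names are assumptions).
  // Written as a function so the files are only read when Node-RED starts.
  https: () => ({
    key: fs.readFileSync("privkey.pem"),
    cert: fs.readFileSync("cert.pem"),
  }),
  // Editor login; the password field holds a bcrypt hash, never plain text.
  adminAuth: {
    type: "credentials",
    users: [
      {
        username: "admin",
        password: process.env.NODE_RED_ADMIN_HASH, // hypothetical variable name
        permissions: "*",
      },
    ],
  },
  // Hide the editor UI in production.
  disableEditor: process.env.NODE_ENV === "production",
  // Restrict the editor and admin API to the allowlist above.
  httpAdminMiddleware: ipAllowlist,
};

module.exports = settings;
```

Behind a reverse proxy the address seen by the middleware is the proxy's, so the allowlist would then have to be enforced at the proxy or derived from a trusted forwarding header.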

Backup

All server backups should follow current best practices.

The conversation flow (and other data) should be stored in a secured Git repository to track revisions and to ease migration from development to qualification to production.

All credential data is encrypted in dedicated files scoped to a given server state. The cryptographic key MUST NOT be stored in the Git repository.
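
One way to keep the key out of Git is to read it from the environment when the server starts. The sketch below assumes Node-RED's standard credentialSecret setting is the key in question; the variable name NODE_RED_CREDENTIAL_SECRET and the minimum length are illustrative choices, not project conventions.

```javascript
// Sketch: load the credential encryption key from the environment at startup.
function loadCredentialSecret(env) {
  const secret = env.NODE_RED_CREDENTIAL_SECRET; // hypothetical variable name
  if (!secret || secret.length < 16) {
    // Fail fast rather than start with a missing or weak key.
    throw new Error("NODE_RED_CREDENTIAL_SECRET is missing or too short");
  }
  return secret;
}

// In settings.js this could be used as:
//   credentialSecret: loadCredentialSecret(process.env),
```

Failing at startup makes a missing key visible immediately instead of leaving the credential files unreadable later.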

Files

Do not serve "secret" files as-is from webapp/static/. Instead, either:

  • generate a tokenized filename and/or check that the user is logged in,
  • or use cloud buckets such as Amazon S3.
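
A tokenized link can be built by signing the file name and an expiry time with a server-side secret. This is a minimal sketch of that idea; signFile, verifyFile and the FILE_TOKEN_SECRET variable are hypothetical names, and the route that serves the file is left out.

```javascript
const { createHmac, timingSafeEqual } = require("crypto");

// Hypothetical secret; in practice it should come from the environment only.
const FILE_TOKEN_SECRET = process.env.FILE_TOKEN_SECRET || "change-me";

// Produce a token that binds a file name to an expiry time (ms since epoch).
function signFile(name, expiresAt, secret = FILE_TOKEN_SECRET) {
  return createHmac("sha256", secret)
    .update(`${name}:${expiresAt}`)
    .digest("hex");
}

// Accept the request only if the token matches and has not expired.
function verifyFile(name, expiresAt, token, now = Date.now(), secret = FILE_TOKEN_SECRET) {
  if (Number(expiresAt) < now) {
    return false;
  }
  const expected = Buffer.from(signFile(name, expiresAt, secret), "hex");
  const given = Buffer.from(String(token), "hex");
  // Compare in constant time; lengths must match before timingSafeEqual.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

A download URL would then carry the name, the expiry and the token as parameters, and the handler would call verifyFile before reading anything from disk.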

User Data

The conversation flow relies on variable information retrieved from third-party APIs.

  • To query these APIs, the server MUST store some credential data.
  • Some data is cached to reduce API calls (limited to 2000ms by DialogFlow).

The bot designer should design a solution:

  • to display the terms of use (CGU),
  • to let users view and delete their own data,
  • to allow customers to delete a user's data.
  • Data can also be deleted automatically after a given number of days of inactivity.
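
The inactivity rule in the last bullet can be sketched as a pure function that a scheduled job would call. The lastSeen field (a millisecond timestamp) and the function name are assumptions; the actual storage layer is not shown.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Split user records into those to keep and those to erase.
// `lastSeen` is a hypothetical field holding a millisecond timestamp.
function splitInactive(users, maxIdleDays, now = Date.now()) {
  const cutoff = now - maxIdleDays * DAY_MS;
  const keep = [];
  const erase = [];
  for (const user of users) {
    (user.lastSeen < cutoff ? erase : keep).push(user);
  }
  return { keep, erase };
}
```

Keeping the decision separate from the deletion makes the retention period easy to test and to report on before any record is removed.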

Logs

The Node-RED server generates a lot of logs:

  • from Debug Nodes
  • from console.log()

Node-RED's settings provide options to control the log level, log files, etc.

  • Developers should be careful about the data written to logs.
  • IT should rotate or delete log files.

PM2 provides a log management system. Logs can be flushed from the CLI.
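
As an illustration of both points, the sketch below lowers the console log level through Node-RED's logging setting and masks obvious personal data before it reaches console.log(). The redact and safeLog helpers and the two patterns are hypothetical and deliberately simple; they are not a complete personal-data filter.

```javascript
// Node-RED settings.js fragment: log warnings and errors only.
const logging = {
  console: { level: "warn", metrics: false, audit: false },
};

const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const LONG_DIGITS = /\d{6,}/g; // phone numbers, account numbers, etc.

// Replace e-mail addresses and long digit runs with neutral markers.
function redact(text) {
  return String(text).replace(EMAIL, "[email]").replace(LONG_DIGITS, "[number]");
}

// Drop-in replacement for console.log() in function nodes.
function safeLog(message) {
  console.log(redact(message));
}
```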

Database

  • The NeDB database file should be encrypted on the server.
  • MongoDB has an option to encrypt its database files.
  • Trusted third-party databases such as Cosmos DB (Microsoft) or Firebase (Google) are also secured.
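
For NeDB, one possible approach is to use its afterSerialization and beforeDeserialization hooks so that every line written to the data file is encrypted. The sketch below uses AES-256-GCM from Node's crypto module; makeLineCodec is a hypothetical helper, and how the 32-byte key is stored and loaded is left as an assumption.

```javascript
const { createCipheriv, createDecipheriv, randomBytes } = require("crypto");

// Build the two NeDB hooks from a 32-byte key.
function makeLineCodec(key) {
  return {
    // Called by NeDB on each serialized document before it is written.
    afterSerialization(plain) {
      const iv = randomBytes(12);
      const cipher = createCipheriv("aes-256-gcm", key, iv);
      const body = Buffer.concat([cipher.update(plain, "utf8"), cipher.final()]);
      // Layout: 12-byte IV, 16-byte auth tag, ciphertext; base64 has no newlines.
      return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64");
    },
    // Called by NeDB on each stored line before it is parsed.
    beforeDeserialization(stored) {
      const raw = Buffer.from(stored, "base64");
      const decipher = createDecipheriv("aes-256-gcm", key, raw.subarray(0, 12));
      decipher.setAuthTag(raw.subarray(12, 28));
      return Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString("utf8");
    },
  };
}

// Usage with NeDB (not executed here; file name is an assumption):
//   const Datastore = require("nedb");
//   const db = new Datastore({ filename: "users.db", autoload: true, ...makeLineCodec(key) });
```

The key has the same constraint as the credential key above: it must live outside the repository and outside the data directory.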

Statistics

Third-party APIs provide statistics and analytics dashboards, BUT the data should be anonymized.

  • Facebook Analytics directly stores Messenger statistics and can receive events.
  • Microsoft Azure Application Insights and Google Chatbase provide an API.
  • Elasticsearch can also be used for on-premises analytics (with development work).
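
Before an event is sent to any of these services, the user identifier can be replaced by a keyed hash and free text dropped. The sketch below shows one way to do this; the event fields and function names are hypothetical. Strictly speaking a keyed hash is pseudonymization rather than anonymization, so the key must stay on the server and the remaining fields should still be reviewed.

```javascript
const { createHmac } = require("crypto");

// Derive a stable, non-reversible identifier from the real user ID.
function pseudonymize(userId, secret) {
  return createHmac("sha256", secret)
    .update(String(userId))
    .digest("hex")
    .slice(0, 32);
}

// Keep only the fields needed for statistics; message text is not forwarded.
function toAnalyticsEvent(event, secret) {
  return {
    user: pseudonymize(event.userId, secret),
    intent: event.intent,
    timestamp: event.timestamp,
  };
}
```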

REST APIs

Communication between third-party APIs and Node-RED MUST use HTTPS. The data privacy policy of each service must be available and accepted when the service is set up.

Microsoft

Most Microsoft APIs can be hosted in a European region.

Microsoft will treat personal information or data it receives from you or your App Integration as described in the Microsoft Privacy Statement, as updated from time to time. With respect to any personal information subject to the European Union General Data Protection Regulation (GDPR) that is processed in connection with these Terms, for purposes of the GDPR, you agree that you and Microsoft are independent data controllers. You agree to comply with all relevant parts of the GDPR.

Effective Dec. 13, 2017, Azure Services terms, including the Online Services Terms apply to Bot Framework.

Here is the Microsoft statement about GDPR

Bot Framework

  • Communication between the Node-RED server and Microsoft is secured with HTTPS.
  • A token is exchanged in internal APIs to prevent man-in-the-middle attacks.
  • The Microsoft front end is designed to handle various attacks such as DDoS.
  • Microsoft servers are scalable, so there is no dedicated IP address to filter with a firewall.

Bot Service will (before the GDPR date in May) operate as a Data Processor from a GDPR perspective.

  • Neither Bot Framework nor Bot Service uses chat data for service improvement. A record of success or failure, timing, and similar metrics for messages sent is kept through the service, but none of the message content itself, with the exception of DirectLine/Webchat.
  • DirectLine/Webchat caches conversations for up to 24 hours for reliable delivery, in the region of the DirectLine instance. For example, Europe.directline.botframework.com would cache those results in Europe, encrypted on a per-bot, per-conversation basis.
  • Bot Service is a global Azure service (from a channel connector data perspective). Bot registration data and metadata about conversations, for example, are kept in the US and cached next to the regional deployments.

LUIS

  • All LUIS conversations are stored in Microsoft data centers. Microsoft is following the Bot Framework steps and working on being a data processor from a GDPR perspective.
  • Add &log=false to every endpoint request to disable logging.
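
A small helper can make sure the log=false parameter is never forgotten. This is a sketch that takes the endpoint URL as given (its exact form depends on the LUIS region and application, which are not specified here); withLoggingDisabled is a hypothetical name.

```javascript
// Add the utterance and log=false to a LUIS endpoint URL.
function withLoggingDisabled(endpointUrl, utterance) {
  const url = new URL(endpointUrl);
  url.searchParams.set("q", utterance);
  url.searchParams.set("log", "false"); // ask LUIS not to store the query
  return url.toString();
}
```

Using searchParams.set rather than string concatenation also overrides any log=true already present in a copied endpoint URL.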

Cognitive Services

Data that is sent to the Cognitive Services is treated differently than other customer data. Microsoft may use Cognitive Services data to improve Microsoft products and services. For example, we may use content that you provide to the Cognitive Services to improve our underlying algorithms and models over time. To do that, we may retain Cognitive Services data after you are no longer using the services. Further information is provided on this page and the Cognitive Services section of the Online Services Terms.

Google

Google explains in a dedicated document all the efforts made to comply with GDPR for Google Cloud Platform (GCP) and G Suite. Google has also updated its data processing terms (see section 10).

DialogFlow

Terms of DialogFlow Standard Edition

As of 22/11/2017, Google introduced Dialogflow Enterprise Edition, covered by the Google Cloud Platform Terms of Service, including the Data Privacy and Security Terms. Enterprise Edition users are also eligible for Cloud Support packages, and the Enterprise Edition will soon provide SLAs with committed availability levels.

Depending on your agent, conversation logs could include personally identifiable or confidential information and your agent may need to comply with legal or other restrictions. To help with this, you can disable logging for your agent in its settings.

ChatBase

To be written.