Get started with PrivacyStreams

Installing PrivacyStreams

To use PrivacyStreams in your Android app, you can either install it with Maven/Gradle or from source.

Installing with Gradle

This is the most convenient way to install PrivacyStreams for most Android developers. Simply add the following line to the build.gradle file under your app module.

dependencies {
    compile 'io.github.privacystreams:privacystreams-android-sdk:0.1.7'
    ...
}

Installing from source

If you are a contributor or want more flexibility, you can install PrivacyStreams from source code. For example, in Android Studio, the installation involves the following steps:

  1. Create a new project from Github in Android Studio.
    • Click File -> New -> Project from version control -> GitHub;
    • In the Git repository URL field, enter https://github.com/PrivacyStreams/PrivacyStreams-Android and click Clone;
  2. In the new project, create a new module (your app module).
    • Click File -> New -> New module....
  3. Open the build.gradle file of the new module, add the following line to dependencies:
    • compile project(':privacystreams-core')

Quick examples

Accessing and processing personal data with PrivacyStreams is simple. Let me show you two examples.

Querying random data

void foo(Context context) {
    UQI uqi = new UQI(context);  // Initialize a UQI (unified query interface) instance.
    
    // Get random MockItem stream for testing.
    uqi.getData(MockItem.asRandomUpdates(10, 10.0, 100), Purpose.TEST("Testing first data query."))
       .limit(10)                // Limit the number of Items to at most 10.
       .debug();                 // Print the items for debugging.
}

UQI stands for "unified query interface", which is the most important class in PrivacyStreams.

The above code constructs a UQI instance, accesses a mock data stream and prints 10 items. The data being accessed is a stream of randomized data specified by MockItem.asRandomUpdates(10, 10.0, 100). Each item in this stream is a map of some random values. The definition of MockItem's format can be found here.

The second parameter of uqi.getData() specifies the purpose of the data access. Explaining the purpose helps users understand your permission request, which is better for privacy. We suggest you define the purpose carefully in your app. Several purpose categories are available (such as Purpose.ADS(..), Purpose.HEALTH(..), etc.) for you to choose from.

If you execute the above code (for example, calling foo(MainActivity.this)), you will get some output in logcat like:

D/PrivacyStreams: {y=1, z=5.245425734292725, x=1, id=0, time_created=1489529999937}
D/PrivacyStreams: {y=8, z=5.4303601061807, x=8, id=1, time_created=1489528265617}
D/PrivacyStreams: {y=4, z=0.7657566725249387, x=4, id=2, time_created=1489528265718}
D/PrivacyStreams: {y=5, z=0.49851207499276406, x=5, id=3, time_created=1489528265819}
D/PrivacyStreams: {y=0, z=3.1471844164445564, x=0, id=4, time_created=1489528265920}
D/PrivacyStreams: {y=6, z=6.541989969401724, x=6, id=5, time_created=1489528266021}
D/PrivacyStreams: {y=1, z=5.484224955776141, x=1, id=6, time_created=1489528266122}
D/PrivacyStreams: {y=8, z=0.01880078580959288, x=8, id=7, time_created=1489528266224}
D/PrivacyStreams: {y=3, z=5.170116507338301, x=3, id=8, time_created=1489528266325}
D/PrivacyStreams: {y=2, z=3.3222939911622795, x=2, id=9, time_created=1489528266427}

Getting recently called contacts

Here is a more realistic example: getting a list of recently called phone numbers.

List<String> recentCalledNumbers = 
    uqi.getData(Phonecall.asLogs(), Purpose.SOCIAL("finding your recent called contacts."))
       .filter(Phonecall.TYPE, "outgoing")  // Only keep the outgoing call logs
       .sortBy(Phonecall.TIMESTAMP)         // Sort the call logs according to timestamp, in ascending order
       .reverse()                           // Reverse the order, now the most recent call log comes first
       .limit(10)                           // Keep the most recent 10 logs
       .asList(Phonecall.CONTACT);          // Output the CONTACT field (the phone number) to list

The above code accesses the call logs with Phonecall.asLogs() and processes the call logs with functions like filter, sortBy, etc.

The functions operate on the item fields. The fields of a Phonecall item are as follows:

| Reference | Name | Type | Description |
|---|---|---|---|
| Phonecall.TIMESTAMP | "timestamp" | Long | The timestamp of when the phone call happened. |
| Phonecall.CONTACT | "contact" | String | The contact (phone number or name) of the phone call. |
| Phonecall.DURATION | "duration" | Long | The duration of the phone call, in milliseconds. |
| Phonecall.TYPE | "type" | String | The phone call type, which can be "incoming", "outgoing" or "missed". |

Note that "Reference" is equivalent to "Name", i.e. filter(Phonecall.TYPE, "outgoing") is the same as filter("type", "outgoing").
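To make the semantics of each step concrete, here is a minimal sketch of the same filter, sort, reverse, limit and collect sequence written with plain java.util.stream over map-based items. It does not use PrivacyStreams at all; the class name RecentCallsSketch and the helper call(...) are hypothetical and exist only for illustration, and the keys are the field names from the table above.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Plain-Java illustration only; none of these names belong to PrivacyStreams.
public class RecentCallsSketch {
    // Builds a call-log item as a map, keyed by the field names from the table above.
    static Map<String, Object> call(long timestamp, String contact, long duration, String type) {
        Map<String, Object> item = new HashMap<>();
        item.put("timestamp", timestamp);
        item.put("contact", contact);
        item.put("duration", duration);
        item.put("type", type);
        return item;
    }

    // Keeps outgoing calls, orders them newest first, and returns up to `limit` contacts.
    static List<String> recentOutgoing(List<Map<String, Object>> logs, int limit) {
        Comparator<Map<String, Object>> byTimestamp =
                Comparator.comparingLong(item -> (Long) item.get("timestamp"));
        return logs.stream()
                .filter(item -> "outgoing".equals(item.get("type"))) // like filter(TYPE, "outgoing")
                .sorted(byTimestamp.reversed())                       // like sortBy(...).reverse()
                .limit(limit)                                         // like limit(n)
                .map(item -> (String) item.get("contact"))            // like asList(CONTACT)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> logs = Arrays.asList(
                call(1000L, "111", 30000L, "outgoing"),
                call(3000L, "222", 5000L, "incoming"),
                call(2000L, "333", 12000L, "outgoing"));
        System.out.println(recentOutgoing(logs, 10)); // prints [333, 111]
    }
}
```

The incoming call is dropped by the filter, and the two remaining calls come out newest first.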

  • About permissions. Accessing call logs requires the READ_CALL_LOG permission in Android. To use the above code, you need to declare the permission in AndroidManifest.xml and handle the exception that is thrown if the user denies the permission. For example:

    In your AndroidManifest.xml:

    ...
      <uses-permission android:name="android.permission.READ_CALL_LOG" />
      
      

    In your Java code:

      try {
          List<String> recentCalledNumbers = 
              uqi.getData(Phonecall.asLogs(), Purpose.SOCIAL("finding your closest friends."))
                 . ...  // filter, sortBy, etc.
                 .asList(Phonecall.CONTACT);
      } catch (PrivacyStreamsException e) {
          if (e.isPermissionDenied()) {
              String[] deniedPermissions = e.getDeniedPermissions();
              ...
          }
      }

That's it! Exception handling is discussed in more detail in the Permissions and exception handling section.

PrivacyStreams API

This section explains the PrivacyStreams pipeline in detail, using a more complicated example.

Suppose we want to do the following programming task with PrivacyStreams:

  • Get the phone number that has the most phonecalls with the user in the past year.

The code to do the task with PrivacyStreams is as follows:

String mostCalledContact = 
     uqi.getData(Phonecall.asLogs(), Purpose.SOCIAL("finding your closest contact."))                      // Get a stream of call logs
        .transform(Filters.keep(TimeOperators.recent(Phonecall.TIMESTAMP, Duration.days(365))))            // keep the call logs in recent 365 days
        .transform(Groupers.groupBy(Phonecall.CONTACT))                                                    // group by contact (phone number)
        .transform(Mappers.mapEachItem(ItemOperators.setGroupField("#calls", StatisticOperators.count()))) // set "#calls" to the number of logs in each group
        .transform(Selectors.select(ItemsOperators.getItemWithMax("#calls")))                              // select the item with largest "#calls"
        .output(ItemOperators.<String>getField(Phonecall.CONTACT));                                        // get the contact field of the item

Looks messy? Don't worry, next I will show you what is happening and how to simplify it.

Unified query interface (UQI)

In PrivacyStreams, all types of personal data can be accessed and processed through the unified query interface (UQI).

UQI.getData(Provider, Purpose)[.transform(Transformation)]*.output(Action)

The query describes a PrivacyStreams pipeline, which is a sequence of three types of functions, including:

  • 1 data providing function (i.e. Provider) that gets raw data from data sources and converts it to a stream in a standard format.
    • For example, Phonecall.asLogs() converts raw call logs in the Android database to a stream of Phonecall items;
  • N (N=0,1,2,...) transforming functions (i.e. Transformations), each of which takes a stream as input and produces another stream as output.
    • For example, filter(Phonecall.TYPE, "outgoing") filters the stream and only keeps the items whose TYPE is "outgoing";
  • 1 data outputting function (i.e. Action), which outputs the stream as the result needed by the app.
    • For example, asList(Phonecall.CONTACT) collects the CONTACT field of the items into a list.

The Transformation and Action functions are built on a set of operators, including comparators, arithmetic operators, etc. For example, Filters.keep() is a Transformation, and it accepts the operator TimeOperators.recent() as a parameter, meaning it only keeps the items whose TIMESTAMP field falls within a recent time window.

  • The full list of available data types and corresponding providers is here;
  • The full list of available providers, transformations, actions and operators is here.
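The provider, transformation and action roles described above can be modelled in a few lines of plain Java. The sketch below is only an analogy for the shape of a query: the types Provider, Transformation, Action and Query are hypothetical stand-ins defined here, not PrivacyStreams classes, and a "stream" is simplified to a list of map-based items.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Plain-Java analogy of getData(...).transform(...)*.output(...); not the library's API.
public class PipelineSketch {
    // Hypothetical stand-ins for the three roles in a query.
    interface Provider extends Supplier<List<Map<String, Object>>> {}
    interface Transformation extends Function<List<Map<String, Object>>, List<Map<String, Object>>> {}
    interface Action<R> extends Function<List<Map<String, Object>>, R> {}

    static final class Query {
        private final Provider provider;
        private final List<Transformation> transformations = new ArrayList<>();

        Query(Provider provider) {
            this.provider = provider;
        }

        // Registers one more transformation; zero or more may be chained.
        Query transform(Transformation transformation) {
            transformations.add(transformation);
            return this;
        }

        // Runs provider -> transformations (in order) -> action and returns the action's result.
        <R> R output(Action<R> action) {
            List<Map<String, Object>> stream = provider.get();
            for (Transformation transformation : transformations) {
                stream = transformation.apply(stream);
            }
            return action.apply(stream);
        }
    }

    static Map<String, Object> item(String contact, String type) {
        Map<String, Object> item = new HashMap<>();
        item.put("contact", contact);
        item.put("type", type);
        return item;
    }

    static List<String> outgoingContacts() {
        return new Query(() -> Arrays.asList(
                        item("111", "outgoing"), item("222", "missed"), item("333", "outgoing")))
                .transform(items -> items.stream()
                        .filter(i -> "outgoing".equals(i.get("type")))
                        .collect(Collectors.toList()))
                .output(items -> items.stream()
                        .map(i -> (String) i.get("contact"))
                        .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(outgoingContacts()); // prints [111, 333]
    }
}
```

Nothing runs until output(...) is called, which mirrors the idea that the Action is what produces the result the app needs.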

Simplifying the code

In practice, the nested functions can be verbose, so we wrap some common function combinations into a single function for simplicity. For example:

  • .transform(Filters.keep(xxx)) can be simplified as .filter(xxx);
  • .transform(Groupers.groupBy(xxx)) can be simplified as .groupBy(xxx);
  • .transform(Mappers.mapEachItem(ItemOperators.setGroupField(xxx))) can be simplified as .setGroupField(xxx);
  • ...

With the simplification, the code in the above example can be written as:

String mostCalledContact = 
     uqi.getData(Phonecall.asLogs(), Purpose.SOCIAL("finding your closest contact."))
        .filter(TimeOperators.recent(Phonecall.TIMESTAMP, Duration.days(365)))
        .groupBy(Phonecall.CONTACT)
        .setGroupField("#calls", StatisticOperators.count())
        .select(ItemsOperators.getItemWithMax("#calls"))
        .getField(Phonecall.CONTACT);

If you use static imports, the code can be even shorter. For example, with import static io.github.privacystreams.commons.time.TimeOperators.recent;, you can write recent(xxx) instead of TimeOperators.recent(xxx). With static imports, the above code can be simplified as:

String mostCalledContact = 
     uqi.getData(Phonecall.asLogs(), Purpose.SOCIAL("finding your closest contact."))
        .filter(recent(Phonecall.TIMESTAMP, Duration.days(365)))
        .groupBy(Phonecall.CONTACT)
        .setGroupField("#calls", count())
        .select(getItemWithMax("#calls"))
        .getField(Phonecall.CONTACT);
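For readers who want to check what the query computes, the sketch below expresses the same logic (keep recent calls, group by contact, count each group, pick the largest) with plain java.util.stream. It is an illustration under assumptions, not the library's implementation: MostCalledSketch, call(...) and the explicit now and windowMillis parameters are hypothetical names introduced here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

// Plain-Java illustration of "recent -> groupBy -> count -> max"; not PrivacyStreams code.
public class MostCalledSketch {
    // Builds a minimal call-log item with only the two fields this query needs.
    static Map<String, Object> call(long timestamp, String contact) {
        Map<String, Object> item = new HashMap<>();
        item.put("timestamp", timestamp);
        item.put("contact", contact);
        return item;
    }

    static long timestampOf(Map<String, Object> item) {
        return (Long) item.get("timestamp");
    }

    // Returns the contact with the most calls inside the time window, if any call qualifies.
    static Optional<String> mostCalledContact(List<Map<String, Object>> logs, long now, long windowMillis) {
        Map<String, Long> callsPerContact = logs.stream()
                .filter(item -> now - timestampOf(item) <= windowMillis)   // keep recent calls
                .collect(Collectors.groupingBy(
                        item -> (String) item.get("contact"),              // group by contact
                        Collectors.counting()));                           // count per group
        return callsPerContact.entrySet().stream()
                .max(Map.Entry.comparingByValue())                         // largest count
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> logs = Arrays.asList(
                call(9000L, "111"), call(8000L, "222"), call(7000L, "111"),
                call(1000L, "222"), call(500L, "222"));
        // "222" has more calls overall, but only one of them is inside the window.
        System.out.println(mostCalledContact(logs, 10000L, 5000L)); // prints Optional[111]
    }
}
```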

PrivacyStreams pipeline

The figure below shows the overview of a PrivacyStreams pipeline:

PrivacyStreams overview

The basic data types in PrivacyStreams are Item and Stream.

  • Item is an element in a Stream. All Items are in a map format, in which each key-value pair represents the name and value of a field.

    • Each kind of personal data has a list of pre-defined fields. Below is an example of call log Item:
      // An example of call log Item.
      {
          "timestamp": 1489528267720,
          "contact": "14120001234",
          "type": "outgoing",
          "duration": 30000
      }
  • Stream is what is produced, transformed and output in a PrivacyStreams pipeline. A Stream is a sequence of Items. PrivacyStreams has two kinds of Streams:

    1. MStream (short for multi-item stream) contains multiple items.
      • For example, the “call log stream” (Phonecall.asLogs()) contains many phonecall items, and the stream of location updates contains many location items;
    2. SStream (short for single-item stream) contains only one item.
      • For example, the “last-known location stream” (GeoLocation.asLastKnown()) only contains one location item.

    The fine-grained data processing state machine is as follows:

PrivacyStreams data processing state machine.

The pipeline of the running example is illustrated as follows (note that some field names are simplified and the field values are mocked):

A pipeline illustration of the code in the example.

Reusing streams

Sometimes you may need to reuse a stream for different actions. For example, in the above example, if we also want to get the phone number that has the longest total phonecall duration, we may need to reuse the call log stream.

We provide a method fork(int) to support stream reuse, where the int parameter is the number of times the stream will be reused.

MStreamInterface streamToReuse = 
              uqi.getData(Phonecall.asLogs(), Purpose.SOCIAL("finding your closest contact."))
                 .filter(recent(Phonecall.TIMESTAMP, Duration.days(365)))
                 .groupBy(Phonecall.CONTACT)
                 .fork(2);  // fork current stream to reuse twice.
        
String mostCalledContact = 
    streamToReuse.setGroupField("#calls", count())
                 .select(getItemWithMax("#calls"))
                 .getField(Phonecall.CONTACT);
                 
String longestCalledContact = 
    streamToReuse.setGroupField("durationOfCalls", sum(Phonecall.DURATION))
                 .select(getItemWithMax("durationOfCalls"))
                 .getField(Phonecall.CONTACT);
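The same idea, computing the shared part once and deriving two results from it, can be sketched in plain Java. A java.util.stream.Stream cannot be consumed twice either, so the sketch materializes the grouped calls into a map and reuses that map. ReuseSketch and its helpers are hypothetical names for illustration and are not related to how fork(int) is implemented.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.ToLongFunction;
import java.util.stream.Collectors;

// Plain-Java illustration of reusing one grouped result for two different outputs.
public class ReuseSketch {
    static Map<String, Object> call(String contact, long duration) {
        Map<String, Object> item = new HashMap<>();
        item.put("contact", contact);
        item.put("duration", duration);
        return item;
    }

    // The shared step: group the call logs by contact exactly once.
    static Map<String, List<Map<String, Object>>> groupByContact(List<Map<String, Object>> logs) {
        return logs.stream().collect(Collectors.groupingBy(item -> (String) item.get("contact")));
    }

    // Picks the contact whose group has the highest score under the given scoring function.
    static Optional<String> contactWithMax(Map<String, List<Map<String, Object>>> groups,
                                           ToLongFunction<List<Map<String, Object>>> score) {
        return groups.entrySet().stream()
                .max(Comparator.comparingLong(entry -> score.applyAsLong(entry.getValue())))
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> logs = Arrays.asList(
                call("111", 10000L), call("111", 10000L), call("222", 60000L));
        Map<String, List<Map<String, Object>>> groups = groupByContact(logs); // computed once

        Optional<String> mostCalled = contactWithMax(groups, calls -> calls.size());
        Optional<String> longestCalled = contactWithMax(groups,
                calls -> calls.stream().mapToLong(c -> (Long) c.get("duration")).sum());

        System.out.println(mostCalled);    // prints Optional[111]
        System.out.println(longestCalled); // prints Optional[222]
    }
}
```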

Non-blocking pipeline

So far I have shown how to build a blocking pipeline (the pipeline will block the execution until the result returns).

In Android, non-blocking pipelines might be more common. A non-blocking pipeline will NOT pause the code execution, and its result will be returned asynchronously.

PrivacyStreams provides many callback Actions (in the Callbacks class) and callback-based collector Actions (in the Collectors class) for building non-blocking pipelines.

  • For example, the following code will not block, and each item will be printed asynchronously.

    • .debug() is equivalent to .output(Callbacks.forEach(ItemOperators.debug())).
       uqi.getData(MockItem.asRandomUpdates(10, 10.0, 100), Purpose.TEST("Testing mock data query."))
          .debug();
  • The "most-called contact" example can also be implemented as non-blocking.

    • .output(getField(), callback) is equivalent to .output(Collectors.collectItem(getField(), callback))
       uqi.getData(Phonecall.asLogs(), Purpose.SOCIAL("finding your closest contact."))
          .filter(recent(Phonecall.TIMESTAMP, Duration.days(365)))
          .groupBy(Phonecall.CONTACT)
          .setGroupField("#calls", count())
          .select(getItemWithMax("#calls"))
          .output(ItemOperators.<String>getField(Phonecall.CONTACT), new Callback<String>() {
              @Override
              protected void onInput(String contact) {
                  System.out.println("Most-called contact: " + contact);
              }
          });

That's it. When developing your app, choose a blocking or a non-blocking pipeline according to your needs.

Permissions and exception handling

Sometimes a pipeline may fail due to exceptions, such as InterruptedException, PermissionDeniedException, etc.

In PrivacyStreams, exception handling is straightforward for both blocking and non-blocking pipelines.

Handling exceptions in blocking pipelines

For blocking pipelines, simply put your query in a try block and catch PrivacyStreamsException. For example:

    try {
        result = uqi.getData(...).transform(...).output(...);
    } catch (PrivacyStreamsException e) {
        System.out.println(e.getMessage());
    }

Handling exceptions in non-blocking pipelines

For non-blocking pipelines, simply override the onFail(PrivacyStreamsException e) method in your result handler. For example:

     uqi.getData(...)
        .transform(...)
        .output(..., new Callback<Object>() {
            @Override
            protected void onInput(Object result) {
                ...
            }
            
            @Override
            protected void onFail(PrivacyStreamsException e) {
                System.out.println(e.getMessage());
            }
        });
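The callback pattern above, one hook for the result and one for the failure, can be tried out in plain Java with CompletableFuture. The sketch below is an analogy only: ResultCallback, runAsync and demo are hypothetical names defined here, and the call to join() exists purely so the demonstration can print a deterministic result; a real non-blocking caller would not wait.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;

// Plain-Java analogy of a non-blocking query with onInput/onFail hooks; not PrivacyStreams code.
public class AsyncCallbackSketch {
    // Hypothetical callback with the same two hooks as the Callback shown above.
    interface ResultCallback<T> {
        void onInput(T result);
        void onFail(Throwable error);
    }

    // Runs the query on another thread and routes the outcome to exactly one of the hooks.
    static <T> CompletableFuture<Void> runAsync(Supplier<T> query, ResultCallback<T> callback) {
        return CompletableFuture.supplyAsync(query).handle((result, error) -> {
            if (error != null) {
                // Failures from supplyAsync arrive wrapped in CompletionException; unwrap them.
                Throwable cause = (error instanceof CompletionException && error.getCause() != null)
                        ? error.getCause() : error;
                callback.onFail(cause);
            } else {
                callback.onInput(result);
            }
            return null;
        });
    }

    // Demonstration helper: records which hook fired. join() is used only to make the demo deterministic.
    static List<String> demo(Supplier<String> query) {
        List<String> events = new CopyOnWriteArrayList<>();
        runAsync(query, new ResultCallback<String>() {
            @Override
            public void onInput(String result) {
                events.add("input:" + result);
            }

            @Override
            public void onFail(Throwable error) {
                events.add("fail:" + error.getMessage());
            }
        }).join();
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo(() -> "14120001234")); // prints [input:14120001234]
        System.out.println(demo(() -> {
            throw new IllegalStateException("permission denied");
        })); // prints [fail:permission denied]
    }
}
```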

Permission configuration

In Android, access to personal data is controlled by a permission-based mechanism. Android apps need to declare permissions in AndroidManifest.xml. On Android 6.0+, apps must also request permissions at runtime, which includes checking whether permissions are granted, prompting users to grant them, and handling users' decisions. With the standard Android APIs, handling all of this is often tedious.

In PrivacyStreams, configuring permissions can be much easier. Follow the steps below:

  1. Write your pipeline, and catch the exception;
  2. Print the exception, and you will see which permissions are needed;
  3. Add the needed permissions to AndroidManifest.xml.

That's it. PrivacyStreams will automatically generate a dialog asking users to grant the permissions. If they are not granted, a PrivacyStreamsException will be thrown.

Debugging and testing

PrivacyStreams provides some simple interfaces to support debugging and testing.

Mocking data source

You can mock a data source using MockItem class for debugging and testing. For example:

  • Mocking a stream with random items.
    • MockItem.asRandomUpdates() can provide a live MStream that produces random items periodically;
    • MockItem.asRandomList() can provide an MStream that produces a list of random items in a batch;
    • MockItem.asRandomInstance() can provide an SStream that contains a random item.
  • Mocking a stream from a file.
    • uqi.getData(...).transform(...).archiveTo("/sdcard/data.json") will record the stream to a file;
    • uqi.getData(MockItem.fromArchive("/sdcard/data.json"), ...) will load and replay the stream from the file.
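When a test runs on a plain JVM without the library, a seeded generator can play a similar role to the random mock providers listed above. The sketch below is a hypothetical helper, not MockItem itself: the class MockItemsSketch, the method randomList and the meaning given to its parameters (upper bounds for an integer field and a double field, an item count and a seed) are assumptions made for illustration, with field names borrowed from the logcat output shown earlier.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical stand-in for a random mock data source; not the MockItem implementation.
public class MockItemsSketch {
    // Produces `count` map-based items with a sequential id, an int x in [0, maxInt)
    // and a double z in [0, maxDouble). The same seed always yields the same items.
    static List<Map<String, Object>> randomList(int maxInt, double maxDouble, int count, long seed) {
        Random random = new Random(seed);
        List<Map<String, Object>> items = new ArrayList<>();
        for (int id = 0; id < count; id++) {
            Map<String, Object> item = new LinkedHashMap<>();
            item.put("id", id);
            item.put("x", random.nextInt(maxInt));
            item.put("z", random.nextDouble() * maxDouble);
            items.add(item);
        }
        return items;
    }

    public static void main(String[] args) {
        // Values depend on the seed, so no specific output is shown here.
        for (Map<String, Object> item : randomList(10, 10.0, 3, 42L)) {
            System.out.println(item);
        }
    }
}
```

Fixing the seed keeps test runs repeatable, which is the main reason to prefer this over unseeded randomness in assertions.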

Printing the streams

Most data types support serialization, so you can print a stream at any point and inspect its contents.

  • For example, if you have an N-step pipeline uqi.getData(...).step1(...).step2(...)....stepN(...), you can print the output of any step to see what is going on.
    • uqi.getData(...).debug();
    • uqi.getData(...).step1(...).debug();
    • uqi.getData(...).step1(...).step2(...).debug();
    • ...
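The habit of inspecting a pipeline after each step has a close analogue in plain Java streams, where peek(...) observes items between stages without changing them. The sketch below only illustrates that debugging technique; DebugStepsSketch and tracedPipeline are hypothetical names, and peek is a java.util.stream method, not a PrivacyStreams one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java illustration of observing items between pipeline steps.
public class DebugStepsSketch {
    // Runs a two-step pipeline and records what each step sees into `trace`.
    static List<String> tracedPipeline(List<Integer> input, List<String> trace) {
        return input.stream()
                .peek(value -> trace.add("source:" + value))   // items leaving the source
                .filter(value -> value % 2 == 0)
                .peek(value -> trace.add("filtered:" + value)) // items that survived the filter
                .map(value -> "item-" + value)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        System.out.println(tracedPipeline(Arrays.asList(1, 2, 3, 4), trace)); // prints [item-2, item-4]
        System.out.println(trace);
        // prints [source:1, source:2, filtered:2, source:3, source:4, filtered:4]
    }
}
```

The trace is interleaved because a sequential stream pushes each item through all steps before pulling the next one.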

Read more

For more information about PrivacyStreams APIs, please refer to: