Wednesday, January 6, 2016

A note on sensor2 dequeue performance

I've examined sensor2 dequeue performance. Some interesting observations indeed!
  • A single dequeue loop (1250 samples covering 25 seconds) takes a bit over 7 seconds
  • A little under 1 second of this time is getting data from CMSensorRecorder
  • Around 4 seconds is required to prepare this data
  • The time to send the samples to DynamoDB depends on the network configuration:
    • 3 - 6 seconds when the Watch is proxying the network through iPhone using LTE network (with a few bars of signal strength)
    • 2 - 4 seconds when the Watch is proxying the network through iPhone (6s plus) and my home WiFi
    • Around 1.5 seconds when the Watch is connecting to the network directly using home WiFi
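As a back-of-the-envelope check (sketched in Python rather than the app's Swift), the measured pieces roughly account for the loop time in the best network case:

```python
# Rough per-loop time budget from the measurements above (seconds).
fetch_from_recorder = 1.0    # pulling samples out of CMSensorRecorder
prepare_payload = 4.0        # pivoting, JSON serialization, request signing
network_direct_wifi = 1.5    # best case: Watch directly on home WiFi

total = fetch_from_recorder + prepare_payload + network_direct_wifi
samples, span = 1250, 25.0

# ~6.5 s in the best case; the "bit over 7 seconds" figure includes
# slower network paths and loop overhead.
assert total == 6.5
# Even so, the dequeue runs well ahead of the 50 Hz recording rate.
assert samples / total > samples / span == 50.0
```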

Speeding up the data preparation will help some.  I will set a goal of 1 second:
  • Hard-coded JSON serializer
  • Improvements to the payload signer
  • Reduce the HashMap operations (some clever pivoting of the data)
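The first item can be sketched like this (Python rather than the app's Swift, and the record shape is illustrative, not the app's actual schema): a hard-coded serializer emits the DynamoDB JSON wire format with plain string operations, avoiding a generic serializer's per-node dispatch and dictionary churn.

```python
import json

def item_json_handcoded(ts, samples):
    """Hand-rolled serializer for one per-second DynamoDB row.

    ts      -- ISO-8601 timestamp for the second (the row key)
    samples -- list of (fraction_name, x, y, z) tuples, e.g. ("020", x, y, z)

    Field names and value packing here are illustrative only.
    """
    parts = ['{"ts":{"S":"%s"}' % ts]
    for name, x, y, z in samples:
        parts.append(',"%s":{"S":"%.6f %.6f %.6f"}' % (name, x, y, z))
    parts.append("}")
    return "".join(parts)

# Sanity check: the hand-built string must still be valid JSON.
doc = item_json_handcoded("2016-01-04T05:01:54Z", [("020", 0.1, -0.2, 0.98)])
assert json.loads(doc)["020"]["S"].startswith("0.100000")
```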

Monday, January 4, 2016

New "serverless" site to explore sensor data

I have updated the UI to parse and render the data from the new data model. You can try it out here.


Recall the data flow is:

  • CMSensorRecorder is activated directly on the Watch
  • When the application's dequeue is enabled, the dequeued events are:
    • Parsed directly into our DynamoDB record format
    • Directly sent to DynamoDB from the Watch
And this pure static website fetches those records directly and pivots the data into a format that vis.js and d3.js can display.
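The pivot step on the website side can be sketched roughly like this (Python for brevity; the site itself does this in JavaScript, and the column naming is my illustration of the per-second record layout):

```python
def pivot_rows(rows):
    """Pivot per-second records into a flat (t, value) series for charting.

    rows -- list of dicts: a "ts" key for the second, plus fraction-named
    columns ("000", "020", ...) holding sample values. Names are
    illustrative; the real table layout may differ.
    """
    series = []
    for row in rows:
        base = row["ts"]
        for col, value in row.items():
            if col == "ts":
                continue
            # The column name encodes the fraction of the second: "020" -> 0.020
            t = base + int(col) / 1000.0
            series.append((t, value))
    series.sort()
    return series

rows = [{"ts": 0.0, "000": 0.10, "020": 0.11},
        {"ts": 1.0, "000": 0.12}]
assert pivot_rows(rows) == [(0.0, 0.10), (0.02, 0.11), (1.0, 0.12)]
```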

Next up:

  • Get AWS Cognito into the loop to get rid of the long-lived AWS credentials
  • Work on the iOS framework memory leaks
  • Speed up the dequeue (or, resort to a Lambda raw data processor)

Sunday, January 3, 2016

Progress: CMSensorRecorder directly to DynamoDB

Relative to before, the pendulum has swung back to the other extreme: a native WatchOS application directly writing to AWS DynamoDB.  Here, we see a screen grab with some events being sent:



This has been an interesting exercise. Specifically:
  • With iOS 9.2 and WatchOS 2.1, the development experience has improved
  • However, I can't yet get the AWS iOS SDK to work on the Watch directly
  • So, I have instead written code that writes directly to DynamoDB
    • Including signing the requests
    • Including implementing the low-level APIs for batchWriteItem and updateItem
  • I have also redone the application data model to have a single DynamoDB row represent a single second's worth of data with up to 50 samples per row
    • Initially, samples are indexed using named columns (named by the fraction of a second the sample is in)
    • Later this should be done as a more general documentDB record
    • This approach is a more efficient use of DynamoDB -- provisioning required is around 2 writes/second per Watch that is actively dequeuing (compared to 50 writes/second when a single sample is stored in a row)
  • This application also uses NSURLSession directly
  • This means that the Watch can send events to DynamoDB using configured WiFi when the iPhone is out of range!
  • I have also redone the command loop using GCD dispatch queues (instead of threads)
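The request signing involved here is AWS Signature Version 4. Its key-derivation chain can be sketched in Python (rather than the app's Swift); the string-to-sign shown is a truncated placeholder, and building the canonical request is a separate step per the SigV4 specification:

```python
import hashlib, hmac

def _hmac(key, msg):
    """One HMAC-SHA256 step in the SigV4 key-derivation chain."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def sigv4_signature(secret_key, date_stamp, region, string_to_sign,
                    service="dynamodb"):
    """Derive the SigV4 signing key and sign the string-to-sign.

    date_stamp is YYYYMMDD; string_to_sign is built from the hashed
    canonical request, which is not shown here.
    """
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    k_signing = _hmac(k_service, "aws4_request")
    return hmac.new(k_signing, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()

sig = sigv4_signature("EXAMPLEKEY", "20160103", "us-east-1",
                      "AWS4-HMAC-SHA256\n20160103T210511Z\n...")
assert len(sig) == 64 and set(sig) <= set("0123456789abcdef")
```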
Anyway, it appears to be doing the right thing: data is being recorded in CMSensorRecorder, the dequeue loop is processing data and transmitting up to 1250 samples (25 seconds) of data per network call, and the custom request generator and call signing are working correctly. Perhaps a step in the right direction? Not quite sure:
  • I see that the actual on-Watch dequeue processing takes about 6 seconds for 25 seconds worth of data. Since all of the data preparation must occur on the Watch (there is no middle man), the additional work of pivoting the data and preparing the DynamoDB request is borne by the Watch.
  • Profiling shows the bulk of this processing time is in JSON serialization!
  • Another approach would be minimal processing on the Watch.  e.g. "dump the raw data to S3" and let an AWS Lambda take care of the detailed processing. This is probably the best approach although not the cheapest for an application with many users.
  • I'm now running tests long enough to see various memory leaks! I've been spending a bit of time with the memory allocator tools lately...
    • I have run into a few with the NSURLSession object
    • The JSON serializer also appears to leak memory
    • Possibly NSDateFormatter also is leaking memory
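The S3-plus-Lambda alternative would move the parsing off the Watch entirely. A sketch of the Lambda side's parsing step (Python; the raw dump line format and field names are hypothetical):

```python
def parse_dump(text):
    """Parse a raw sample dump (one hypothetical 'iso_ts,x,y,z' line per
    sample) into DynamoDB-ready items."""
    items = []
    for line in text.splitlines():
        if not line.strip():
            continue
        ts, x, y, z = line.split(",")
        items.append({"ts": ts, "x": float(x), "y": float(y), "z": float(z)})
    return items

# In the Lambda, an S3 put-object trigger would fetch the dumped object,
# run its body through parse_dump(), and batch-write the items to DynamoDB.
dump = "2016-01-04T05:01:54.557Z,0.1,-0.2,0.98\n"
assert parse_dump(dump)[0]["z"] == 0.98
```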
Here's what a dequeue loop looks like in the logs. You can see the blocks of data written and the loop processing time:

Jan  3 21:05:11 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: dequeueLoop(1)
Jan  3 21:05:11 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: flush itemCount=23, minDate=2016-01-04T05:01:54.557Z, maxDate=2016-01-04T05:01:54.998Z, length=2621
Jan  3 21:05:12 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: data(Optional("{}"))
Jan  3 21:05:13 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: commit latestDate=2016-01-04 05:01:54 +0000, itemCount=23
Jan  3 21:05:13 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: dequeueLoop(2)
Jan  3 21:05:13 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: flush itemCount=49, minDate=2016-01-04T05:01:55.018Z, maxDate=2016-01-04T05:01:55.980Z, length=5343
Jan  3 21:05:14 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: data(Optional("{}"))
Jan  3 21:05:14 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: commit latestDate=2016-01-04 05:01:55 +0000, itemCount=72
Jan  3 21:05:15 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: dequeueLoop(3)
Jan  3 21:05:20 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: flush itemCount=1250, minDate=2016-01-04T05:01:56.000Z, maxDate=2016-01-04T05:02:20.988Z, length=88481
Jan  3 21:05:23 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: data(Optional("{\"UnprocessedItems\":{}}"))
Jan  3 21:05:23 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: commit latestDate=2016-01-04 05:02:20 +0000, itemCount=1322
Jan  3 21:05:23 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: dequeueLoop(4)
Jan  3 21:05:30 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: flush itemCount=1249, minDate=2016-01-04T05:02:21.008Z, maxDate=2016-01-04T05:02:45.995Z, length=88225
Jan  3 21:05:32 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: data(Optional("{\"UnprocessedItems\":{}}"))
Jan  3 21:05:32 Gregs-AppleWatch sensor2 WatchKit Extension[152] <Warning>: commit latestDate=2016-01-04 05:02:45 +0000, itemCount=2571

And here is what a record looks like in DynamoDB. This shows the columnar encoding of a few of the X accelerometer samples:
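A row of that shape might be assembled like this (Python for illustration; the column names and exact layout are my guesses from the data-model description above), along with the back-of-envelope capacity arithmetic:

```python
def build_row(second_ts, samples):
    """Assemble one per-second row: up to 50 samples keyed by the fraction
    of the second they fall in (column naming is illustrative)."""
    row = {"ts": second_ts}
    for frac, xyz in samples.items():
        row["%03d" % int(round(frac * 1000))] = xyz
    return row

row = build_row("2016-01-04T05:02:20Z", {0.00: (0.1, -0.2, 0.98),
                                         0.02: (0.1, -0.2, 0.97)})
assert set(row) == {"ts", "000", "020"}

# Capacity back-of-envelope: the logs show ~88 KB per 50-row batch, i.e.
# ~1.7 KB per row -- around 2 one-kilobyte write units per second per
# actively dequeuing Watch, versus 50 writes/second with one sample per row.
assert round(88481 / 50 / 1024, 1) == 1.7
```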


I have a checkpoint of the code here. Note that this code is somewhat hard-coded: it writes only to my one table, using AWS credentials that are authorized only for writes.

TODO:
  • Update the UI to help explore this data
  • See if there is a more efficient use of the JSON serializer
  • Examine some of the framework memory leaks
  • Try to speed up the dequeue to be better than 6 seconds of wall clock for 25 seconds of data.