Tuesday, February 23, 2016

Wow: Multiple Identity Providers and AWS Cognito

I've finally found time to experiment with multiple identity providers for Cognito. Mostly to understand how a CognitoId is formed, merged, invalidated. It turns out this is a significant finding, especially when this Id is used, say, as a primary key for data storage!

Recall, the original sensor and sensor2 projects were plumbed with Login With Amazon as the identity provider to Cognito. This new experiment adds GooglePlus as a second provider. Here you can see the test platform on the iPhone:

Keep in mind that for this sensor2 application, the returned CognitoId is used as the customer's key into the storage databases. Both for access control and as the DynamoDB hash key.

The flow on the iPhone goes roughly as follows:
  • A user can login via one or both of the providers
  • A user can logout
  • A user can also login using same credentials on a different devices (e.g. another iPhone with the application loaded)
Now here's the interesting part. Depending on the login ordering, the CognitoId returned to the application (on the watch in this case) can change! Here's how it goes with my test application (which includes "Logins" merge)
  • Starting from scratch on a device
  • Login via Amazon where user's Amazon identity isn't known to this Cognito pool:
    • User will get a new CognitoId allocated
  • If user logs out and logs back in via Amazon, the same Id will be returned
  • If the user now logs into a second device via Amazon, the same Id will be returned
  • (so far this makes complete sense)
  • Now, if the user logs out and logs in via Google, a new Id will be returned
  • Again, if the user logs out and in again and same on second device, the new Id will continue to be returned
  • (this all makes sense)
  • At this point, the system thinks these are two users and those two CognitoIds will be used as different primary keys into the sensor database...
  • Now, if the user logs in via Amazon and also logs in via Google, a CognitoId merge will occur
    • One, or the other of those existing Ids from above will be returned
    • And, the other Id will be marked via Cognito as disabled
    • This is a merge of the identities
    • And this new merge will be returned on other devices from now on, regardless of whether they log in solely via Amazon or Google
    • (TODO: what happens if user is logged into Amazon, has a merged CognitoId and then they log in using a second Google credential?)
This is all interesting and sort of makes sense -- if a Cognito context has a map of logins that have been associated, then Cognito will do the right thing. This means that some key factors have to be considered when building an app like this:
  • As with my application, if the sensor database is keyed by the CognitoId, then there will be issues of accessing the data indexed by the disabled CognitoId after a merge
  • TODO: will this happen with multiple devices going through an anonymous -> identified flow?
  • It may be that additional resolution is needed to help with the merge -- e.g. if there is a merge, then ask the user to force a join -- and then externally keep track of the merged Ids as a set of Ids -> primary keys for this user...
Anyway, I'm adding in a couple more providers to make this more of a ridiculous effort. After which I'll think about resolution strategies.


Wednesday, February 17, 2016

TestFlight version of sensor2 is available in Apple Store

The current version of sensor2 is available as a test flight application if you'd like to try it out. This will give you a quick try of the basic sensor capture and display in the minimal website.

Recall it uses Amazon Login as the sole identity provider right now.

Drop me a note, or comment here if you'd like to try the TestFlight version.


Sunday, February 7, 2016

sensor2 code cleanup -- you can try it too

After a bit of field testing, I've re-organized the sensor2 code to be more robust. Release tag for this change is here. Major changes include:
  • The Watch still sends CMSensorRecorder data directly to DynamoDB
  • However, the Watch now asks the iPhone for refreshed AWS credentials (since the AWS SDK isn't yet working on Watch, this avoids having to re-implement Cognito and login-with-amazon). This means that with today's code, the Watch can be untethered from the iPhone for up to an hour and can still dequeue records to DynamoDB (assuming the Watch has Wi-Fi access itself)
  • If the Watch's credentials are bad, empty or expired and Watch can't access the iPhone or the user is logged out of the iPhone part of the app, then Watch's dequeuer loop is stopped
  • Dependent libraries (LoginWithAmazon) are now embedded in the code
  • A 'logout' on the phone will invalidate the current credentials on the Watch
This code should now be a bit easier to use for reproducing my experiments. Less moving parts, simpler design. I'll work on the README.md a bit more to help list the steps to set up.

And finally, this demonstrates multi-tenant isolation of the data in DynamoDB. Here's the IAM policy for logged in users:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "mobileanalytics:PutEvents",
                "cognito-sync:*",
                "cognito-identity:*"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Sid": "Stmt1449552297000",
            "Effect": "Allow",
            "Action": [
                "dynamodb:BatchWriteItem",
                "dynamodb:UpdateItem",
                "dynamodb:Query"
            ],
            "Resource": [
                "arn:aws:dynamodb:us-east-1:499918285206:table/sensor2"
            ],
            "Condition": {
                "ForAllValues:StringEquals": {
                    "dynamodb:LeadingKeys": [
                        "${cognito-identity.amazonaws.com:sub}"
                    ]
                }
            }
        }
    ]
}

In the above example the important lines are the condition -- this condition entry enforces that only rows with HashKey the same as the logged in user's cognitoId will be returned. This is why we can build applications with direct access to a data storage engine like DynamoDB!

You can read the details of IAM+DynamoDB here.

Anyway, back to performance improvements of the dequeue process. Everything is running pretty good, but the Watch still takes a long time to get its data moved.