End-to-end encryption with CloudKit

It seems more and more Apple-provided iCloud features are using end-to-end encryption. In iOS 13, for example, protection is extended to Safari history and open tabs. I'm wondering if there's a good way to support end-to-end encryption in our own apps built on CloudKit.


As explained in the iCloud security overview:

End-to-end encryption provides the highest level of data security. Your data is protected with a key derived from information unique to your device, combined with your device passcode, which only you know. No one else can access or read this data.

One approach that comes to mind would be to generate a symmetric key (for use with AES-GCM, for example) on the device and put it in the user's keychain to safely share it with the user's other devices (avoiding the need for a separate passphrase or other custom key-sharing mechanism). You could then use that key to encrypt CKRecord fields and CKAssets. That means you lose the type safety of individual fields and can no longer rely on CloudKit's support for indexes or CKReferences, but I don't see a way around that.
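A minimal sketch of the encryption half of that idea, assuming Apple's CryptoKit framework is available (iOS 13 / macOS 10.15 or later). The helper names `encryptField` and `decryptField` are hypothetical, and how the key reaches the user's other devices is left out here.

```swift
import CryptoKit
import Foundation

// Generate a random 256-bit symmetric key on the device.
let key = SymmetricKey(size: .bits256)

/// Encrypts one field value with AES-GCM.
/// The returned blob is nonce + ciphertext + authentication tag.
func encryptField(_ plaintext: Data, using key: SymmetricKey) throws -> Data {
    let box = try AES.GCM.seal(plaintext, using: key)
    // `combined` is only nil when a non-default nonce size is used.
    guard let combined = box.combined else {
        throw CryptoKitError.incorrectParameterSize
    }
    return combined
}

/// Reverses `encryptField`; throws if the key is wrong or the blob was modified.
func decryptField(_ blob: Data, using key: SymmetricKey) throws -> Data {
    let box = try AES.GCM.SealedBox(combined: blob)
    return try AES.GCM.open(box, using: key)
}

let secret = Data("meet at noon".utf8)
let blob = try encryptField(secret, using: key)
let roundTrip = try decryptField(blob, using: key)
```

The resulting `blob` is plain `Data`, so it could be stored in a bytes field of a record or written to a file that backs an asset. The server only ever sees the opaque bytes.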


This is mostly out of curiosity, but would this approach make sense? Does it resemble Apple's use of end-to-end encryption, or does that rely on CloudKit features that aren't available to third-party developers?

Replies

> Does it resemble Apple's use of end-to-end encryption, or does that rely on CloudKit features that aren't available to third-party developers?


I suspect that Apple would decrypt the data before storing it in CloudKit so that queries would work. They would do that with private APIs.

Well, if they do that it wouldn't be end-to-end encryption I think, since the whole idea of that is that even Apple wouldn't have access to the key or the decrypted data.

This is my understanding of encryption - I am only an amateur here.


End-to-end encryption encrypts between 'ends'. If you are storing information in CloudKit and wish to use queries in CloudKit, then CloudKit becomes one of your 'ends'. Data that is subject to queries must be decrypted. I see you have referenced that in your original post. (Interestingly, it may be possible to keep certain fields unencrypted, and therefore queryable and sortable, while keeping other fields encrypted.)
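The mixed approach in that parenthetical could look roughly like the sketch below. It assumes CryptoKit, and it uses a plain dictionary as a stand-in for a record's fields so that it runs without a CloudKit container. The field names `recipient` and `encryptedBody` and both function names are made up for illustration.

```swift
import CryptoKit
import Foundation

// A plain dictionary stands in for a record's fields here (assumption).
func makeMessageFields(recipient: String, body: String,
                       key: SymmetricKey) throws -> [String: Any] {
    let box = try AES.GCM.seal(Data(body.utf8), using: key)
    guard let blob = box.combined else {
        throw CryptoKitError.incorrectParameterSize
    }
    return [
        "recipient": recipient,   // cleartext: the server can index and query it
        "encryptedBody": blob     // opaque bytes: the server cannot read them
    ]
}

// Decrypts the private part again on a device that holds the key.
func readBody(from fields: [String: Any], key: SymmetricKey) throws -> String? {
    guard let blob = fields["encryptedBody"] as? Data else { return nil }
    let box = try AES.GCM.SealedBox(combined: blob)
    let plaintext = try AES.GCM.open(box, using: key)
    return String(data: plaintext, encoding: .utf8)
}
```

The trade-off is that anything left in cleartext for querying is visible to whoever can read the database, so only fields that are acceptable to expose should go there.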


Encryption and decryption techniques are available in OpenSSL. You can encrypt end-to-end using a private key and its associated public key. You send the public key to anyone and ask them to use that key to encrypt their message to you. They do that. Only you can decrypt such a message using the private key associated with that public key. They are one end, you are the other. In the cases you reference, you are both ends but the same techniques are used because you need to transport your (public) key from one device to another.


I think Apple transports CloudKit data from a device to CloudKit using end-to-end encryption. But it decrypts the data before storing it in the database; it must do that in order to allow queries and sorts. It does the same thing when returning data from CloudKit to a device: encrypting it before it leaves CloudKit and decrypting it when it arrives at the device. If you add your own encryption to that, the data will be doubly encrypted in transit and stored on CloudKit singly encrypted.


Security of databases is one thing. Security of transmissions from database to devices is another. The latter is commonly hacked, the former less so.

Thanks for taking the time to write down your thoughts on this. As far as I know, end-to-end encryption is different both from encryption in transit (which is where SSL would come in) and at rest on the server. CloudKit always uses transport security and applies various levels of encryption to the data stored on the server. But ultimately, Apple still holds the keys and has the ability to decrypt stored data. Users worry more and more about the implications of that, because it leaves them vulnerable to hacking and government interference, which is where end-to-end encryption comes in. The idea there is that no one besides the owner of the data (or sender and recipients in the case of messages) holds the key, so not even Apple can decrypt it. That fits with Apple's focus on privacy, which I believe is why protection is extended to Safari history and open tabs in iOS 13 for example.


I came across a pretty thorough iOS Security Guide that explains in detail how this works for iCloud Keychain and iMessages, and which also includes a discussion of CloudKit end-to-end encryption:

Many Apple services, listed in the Apple Support article “iCloud security overview” (https://support.apple.com/HT202303), use end-to-end encryption with a CloudKit Service Key protected by iCloud Keychain syncing. For these CloudKit containers, the key hierarchy is rooted in iCloud Keychain and therefore shares the security characteristics of iCloud Keychain—the keys are available only on the user’s trusted devices, and not to Apple or any third party. If access to iCloud Keychain data is lost (see “Escrow security” section later in paper), the data in CloudKit is reset; and if data is available from the trusted local device, it is re-uploaded to CloudKit.
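The CloudKit Service Key mechanism described in that passage is Apple's own. The closest public building block that seems available to third-party apps is a keychain item marked as synchronizable, which iCloud Keychain then syncs between the user's devices. The sketch below assumes the Security framework; the service string and both function names are hypothetical, and whether this matches the security properties of Apple's key hierarchy is not something the quoted guide confirms.

```swift
import Foundation
import Security

/// Builds the attributes for a synchronizable generic-password item
/// that holds the raw bytes of an encryption key.
func syncedKeyAttributes(keyData: Data, account: String) -> [String: Any] {
    [
        kSecClass as String: kSecClassGenericPassword,
        kSecAttrService as String: "com.example.recordkey", // hypothetical identifier
        kSecAttrAccount as String: account,
        // Ask for the item to be synced through iCloud Keychain.
        kSecAttrSynchronizable as String: true,
        // "ThisDeviceOnly" accessibility classes would prevent syncing.
        kSecAttrAccessible as String: kSecAttrAccessibleAfterFirstUnlock,
        kSecValueData as String: keyData
    ]
}

/// Adds the item to the keychain and returns the raw status code.
func storeSyncedKey(_ keyData: Data, account: String) -> OSStatus {
    let attributes = syncedKeyAttributes(keyData: keyData, account: account)
    return SecItemAdd(attributes as CFDictionary, nil)
}
```

A caller would check the returned status against `errSecSuccess` and handle `errSecDuplicateItem` when the key already exists, for instance because another device created it first.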

That still leaves me with questions however, because it seems encrypting individual fields makes working with records cumbersome and loses type safety. So I'm wondering if there's another (private) mechanism that allows encryption of whole records with a client-held key.
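Without a private mechanism, one way to get back some type safety on the client is to encrypt a whole `Codable` value as a single blob instead of encrypting fields one by one. This is only a sketch of that idea, assuming CryptoKit; `NotePayload`, `sealRecord`, and `openRecord` are hypothetical names.

```swift
import CryptoKit
import Foundation

// Hypothetical payload type: everything that should stay private.
struct NotePayload: Codable, Equatable {
    var title: String
    var body: String
    var pinned: Bool
}

/// Serializes a value to JSON and encrypts it as one AES-GCM blob.
func sealRecord<T: Encodable>(_ value: T, using key: SymmetricKey) throws -> Data {
    let json = try JSONEncoder().encode(value)
    let box = try AES.GCM.seal(json, using: key)
    guard let combined = box.combined else {
        throw CryptoKitError.incorrectParameterSize
    }
    return combined
}

/// Decrypts a blob and decodes it back into a typed value.
func openRecord<T: Decodable>(_ type: T.Type, from blob: Data,
                              using key: SymmetricKey) throws -> T {
    let box = try AES.GCM.SealedBox(combined: blob)
    let json = try AES.GCM.open(box, using: key)
    return try JSONDecoder().decode(type, from: json)
}
```

The server still sees a single untyped bytes field, so indexes and references on the encrypted content remain unavailable. The typing only exists on the devices that hold the key.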

OpenSSL can be used to encrypt from any one point to any other point. The problem is that database query and sort functions cannot operate as expected on encrypted data - they require decrypted data. As I wrote above (".....maintain certain fields as unencrypted and query-able and sort-able while keeping other fields encrypted....") you could divide the data into two packets leaving one encrypted in the database and the other decrypted in the database and therefore available for queries and sorts. Since Apple would encrypt both packets in transit to CloudKit, you do not need extra encryption for that second data packet.


So what you could do is:

1) device 1 would generate a public and private key pair and post the public key to CloudKit

2) device 2 would download device 1's public key

3) device 2 would encrypt data packet A (the message) using the public key and send it along with data packet B (e.g. the To: field) to CloudKit.

4) Apple encrypts both A and B in transit to CloudKit. On arrival, Apple removes that transport encryption and stores packet B readable on CloudKit. Packet A is still encrypted at that point because of step 3. Apple would be able to read packet B but not packet A.

5) device 1 would query CloudKit and detect data packet B

6) device 1 would download the encrypted data packet A and decrypt it with the private key.
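The steps above can be sketched with CryptoKit instead of OpenSSL. CryptoKit has no direct "encrypt with a public key" call, so this sketch swaps in a common hybrid scheme: Curve25519 key agreement with a one-time sender key, HKDF to derive a symmetric key, then AES-GCM for packet A. `SealedMessage`, `encryptForRecipient`, `decryptMessage`, and the `packet-A` label are hypothetical, and the upload and download through CloudKit are left out.

```swift
import CryptoKit
import Foundation

// What device 2 would upload as packet A (hypothetical shape).
struct SealedMessage {
    let ephemeralPublicKey: Data
    let ciphertext: Data
}

// Turns the raw key-agreement result into a 256-bit AES key.
func deriveKey(from secret: SharedSecret) -> SymmetricKey {
    secret.hkdfDerivedSymmetricKey(using: SHA256.self, salt: Data(),
                                   sharedInfo: Data("packet-A".utf8),
                                   outputByteCount: 32)
}

// Step 3: device 2 encrypts using only device 1's public key.
func encryptForRecipient(_ message: Data, recipientPublicKey: Data) throws -> SealedMessage {
    let recipient = try Curve25519.KeyAgreement.PublicKey(rawRepresentation: recipientPublicKey)
    let ephemeral = Curve25519.KeyAgreement.PrivateKey() // one-time sender key
    let secret = try ephemeral.sharedSecretFromKeyAgreement(with: recipient)
    let box = try AES.GCM.seal(message, using: deriveKey(from: secret))
    guard let combined = box.combined else {
        throw CryptoKitError.incorrectParameterSize
    }
    return SealedMessage(ephemeralPublicKey: ephemeral.publicKey.rawRepresentation,
                         ciphertext: combined)
}

// Step 6: device 1 decrypts with its private key.
func decryptMessage(_ sealed: SealedMessage,
                    with privateKey: Curve25519.KeyAgreement.PrivateKey) throws -> Data {
    let sender = try Curve25519.KeyAgreement.PublicKey(rawRepresentation: sealed.ephemeralPublicKey)
    let secret = try privateKey.sharedSecretFromKeyAgreement(with: sender)
    let box = try AES.GCM.SealedBox(combined: sealed.ciphertext)
    return try AES.GCM.open(box, using: deriveKey(from: secret))
}

// Step 1: device 1 creates its key pair; only the public half is published.
let device1Key = Curve25519.KeyAgreement.PrivateKey()
let packetA = try encryptForRecipient(Data("hello".utf8),
                                      recipientPublicKey: device1Key.publicKey.rawRepresentation)
let received = try decryptMessage(packetA, with: device1Key)
```

Note that this only protects packet A. Device 2 still has to trust that the public key it downloaded really belongs to device 1, which the steps above do not address.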


End-to-end differs from 'encryption in transit' because encryption in transit usually refers to what my internet service provider is doing with my transmissions from their port to the port on your internet provider. That means the guy sitting next to me (or you) in Starbucks can intercept my transmission and read it. In end-to-end the data is encoded as it goes out of my device and so that guy can't read it. Where a storage database is in 'end-to-end' is unclear. I believe it is an 'end' not a 'to'.