Managing Duplicate Objects in Core Data (or SwiftData) with CloudKit Sync When Devices Were Offline During Object Creation

Suppose I have two iPhones that are offline. On the first iPhone, at 1 PM, I create a Person object with the details name: "John", lastName: "Smith", and age: 40. Then, at 2 PM, on the second iPhone, which is also offline, I create another Person object with the same name: "John" and lastName: "Smith", but with a different age: 30.

Both iPhones come online at 3 PM and sync with CloudKit. I would expect CloudKit to reconcile these two records and end up with a single record—specifically, Person(name: "John", lastName: "Smith", age: 30), assuming a "last writer wins" approach.

Any guidance or best practices for handling this situation would be greatly appreciated!

My idea is to generate a 128-bit UUID as a hash of the first name and last name, and then force that UUID to be used as the recordName of the CKRecord. This would trigger a conflict on the CloudKit side and prevent two instances from being created. But how do I accomplish this with SwiftData or Core Data?
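
For reference, the hashing step of this idea could look roughly like the sketch below. The function name dedupKey and the normalization rules (trimming and lowercasing) are assumptions made for illustration. The result is just a stable 128-bit value; on its own it does not control the recordName of any CKRecord, but it could be stored as an ordinary attribute and used as a matching key.

```swift
import Foundation
import CryptoKit

// Hypothetical helper: derives a stable 128-bit identifier from a person's name.
// The normalization (trim + lowercase) is an assumption; adjust it to your data.
func dedupKey(firstName: String, lastName: String) -> UUID {
    let normalized = [firstName, lastName]
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines).lowercased() }
        // The unit separator keeps "ab" + "c" distinct from "a" + "bc".
        .joined(separator: "\u{1F}")

    let digest = SHA256.hash(data: Data(normalized.utf8))
    let b = Array(digest.prefix(16)) // keep only the first 128 bits of the hash

    // Note: the version/variant bits are not set, so this is a 128-bit value
    // in UUID form rather than a standards-compliant name-based UUID.
    return UUID(uuid: (b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
                       b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]))
}
```

Both devices would compute the same key for "John" / "Smith" without needing to communicate, which is the property the idea relies on.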

Answered by rkhamilton in 799614022

You will need to handle this yourself in your code to remove the duplicate record. You can search for Core Data “deduplication” in your favorite search engine.

I posted this thread asking for advice on how to solve this in SwiftData, but at the time it was not possible. With the new iOS 18 history processing it may be possible, but I haven’t looked into it yet. That session mentions identifying changes that originate in other processes, such as widgets, but does not explicitly mention cloud-originated changes. If anyone has tried this, I’d like to hear whether it works for identifying CloudKit changes.

If you are using Core Data there are good tools to process data changes that originate on another device. Apple’s best article on the topic is here and it should give you all of the code you need to solve this for a Core Data application.
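
As a rough sketch of what that Core Data tooling involves, the snippet below turns on persistent history tracking and reads back the object IDs inserted after a given history token. The two function names are hypothetical, and deciding which transactions matter (for example, by inspecting each transaction's author) is left to the app; treat Apple's article as the authoritative version.

```swift
import CoreData

// Hypothetical helper: call before loadPersistentStores so that the store
// records history and posts a notification when another process or a
// CloudKit import changes it.
func enableHistoryTracking(on container: NSPersistentCloudKitContainer) {
    for description in container.persistentStoreDescriptions {
        description.setOption(true as NSNumber,
                              forKey: NSPersistentHistoryTrackingKey)
        description.setOption(true as NSNumber,
                              forKey: NSPersistentStoreRemoteChangeNotificationPostOptionKey)
    }
}

// Hypothetical helper: returns the IDs of objects inserted after `token`
// (passing nil fetches all available history). Call it from inside
// context.perform or performAndWait so it runs on the context's queue.
func insertedObjectIDs(after token: NSPersistentHistoryToken?,
                       in context: NSManagedObjectContext) throws -> [NSManagedObjectID] {
    let request = NSPersistentHistoryChangeRequest.fetchHistory(after: token)
    let result = try context.execute(request) as? NSPersistentHistoryResult
    let transactions = result?.result as? [NSPersistentHistoryTransaction] ?? []

    return transactions
        .flatMap { $0.changes ?? [] }
        .filter { $0.changeType == .insert }
        .map { $0.changedObjectID }
}
```

The returned IDs are the candidates to check for duplicates; a typical trigger for running this is the NSPersistentStoreRemoteChange notification.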

The short version is that in your example scenario both records are created, both are synced to CloudKit, and both are then sent to both devices. You can set up rules that process incoming data that originated from a CloudKit sync and decide whether it needs handling in some way (such as removing duplicates according to whatever business logic is appropriate for your app).
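
To make the "business logic" part concrete, here is a minimal sketch of one possible rule: group records by name and last name, keep the most recently modified one, and mark the rest for deletion. PersonSnapshot, its modifiedAt field, and resolveDuplicates are hypothetical names, not part of Core Data or SwiftData, and the rule assumes the app stores its own modification date (device clocks can disagree, so this is only an approximation of "last writer wins").

```swift
import Foundation

// Hypothetical value type standing in for a fetched Person record.
struct PersonSnapshot: Equatable {
    let id: UUID
    let name: String
    let lastName: String
    let age: Int
    let modifiedAt: Date // assumed app-maintained attribute
}

// Groups records by (name, lastName). For each group, keeps the most recently
// modified record and returns the others as candidates for deletion.
func resolveDuplicates(_ people: [PersonSnapshot])
    -> (keep: [PersonSnapshot], delete: [PersonSnapshot]) {
    let groups = Dictionary(grouping: people) {
        "\($0.name.lowercased())\u{1F}\($0.lastName.lowercased())"
    }

    var keep: [PersonSnapshot] = []
    var delete: [PersonSnapshot] = []
    for group in groups.values {
        // Newest first; ties are broken on id so that every device
        // independently picks the same winner.
        let sorted = group.sorted {
            ($0.modifiedAt, $0.id.uuidString) > ($1.modifiedAt, $1.id.uuidString)
        }
        keep.append(sorted[0])
        delete.append(contentsOf: sorted.dropFirst())
    }
    return (keep, delete)
}
```

Applied to the scenario in the question, the 2 PM record (age 30) would be kept and the 1 PM record (age 40) would be deleted. Choosing the winner deterministically matters because both devices run the same cleanup and must not delete different records.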

Given this constraint, I believe there could be a more elegant sync approach than the "create duplicates in CloudKit and delete one" method, if only I could set recordName myself.

Unfortunately, there isn't. This topic is discussed in "Remove duplicate data". SwiftData with CloudKit sync is based on NSPersistentCloudKitContainer, so the discussion applies to SwiftData as well.

Best,
——
Ziqiao Chen
 Worldwide Developer Relations.

I would think you would want the duplicate behavior in this example? You can easily have two people with the same first and last name who are two different people, and the different age could be the distinguishing factor. I'm trying to think: when would you want these merged and not duplicated? Duplicating here seems to be the safe and correct behavior, no?
