Overlapping Segments and Duplicate Start Times in Workout Data

Hello,

We're encountering overlapping segments and discrepancies when analyzing running/workout data for a 5-mile run. The expected splits for the run are:

  1. 8:39
  2. 9:06
  3. 8:30
  4. 8:39
  5. 8:43
  6. 0:08

However, the raw data includes segments where start times begin before the previous segment ends, and there are duplicate start times. Below is a sample of the raw data:

                        "startDate": "2024-09-09T19:32:00.308-0400",
                        "eventType": "segment",
                        "eventTypeInt": 7,
                        "endDate": "2024-09-09T19:37:56.135-0400"
                    },
                    {
                        "startDate": "2024-09-09T19:32:00.308-0400",
                        "eventType": "segment",
                        "eventTypeInt": 7,
                        "endDate": "2024-09-09T19:41:08.476-0400"
                    },

// Here's an example where the second segment's start time falls inside the first segment's startDate and endDate

"startDate": "2024-09-09T19:54:22.658-0400",
                        "eventType": "segment",
                        "eventTypeInt": 7,
                        "endDate": "2024-09-09T19:59:41.215-0400"
                    },
                    {
                        "startDate": "2024-09-09T19:58:44.624-0400",
                        "eventType": "segment",
                        "eventTypeInt": 7,
                        "endDate": "2024-09-09T20:07:23.216-0400"

The splits you provided seem to be the time per mile, which are different from the segments in a workout event (HKWorkoutEvent). In our documentation, we define segments in the API reference as follows:

"Segment events mark important periods during the workout, while markers identify important points in time."

That being said, segments are not necessarily split by mile, so it is not surprising that the periods overlap.

If you need the time per mile, or other by-distance metrics (like average heart rate per mile or average cadence per mile), you will need to calculate them in your own code. Concretely, I'd consider the following steps:

  1. Find your target workout (HKWorkout).
  2. Use the workout to find the .distanceWalkingRunning samples.
  3. Go through the samples to calculate the time frames per mile.

The time frames you get at step 3 should be the data you are looking for. You can give it a try and share if that is the case.

Best,
——
Ziqiao Chen
 Worldwide Developer Relations.

Hello, the HKQuantityTypeIdentifierDistanceWalkingRunning is grouped into one and not broken down into laps or segments or markers

Below, please find our query and results.

Here is our query

      guard HKHealthStore.isHealthDataAvailable() else {
          completion(nil, NSError(domain: "HealthKit", code: 1, userInfo: [NSLocalizedDescriptionKey: "HealthKit is not available on this device."]))
          return
      }

      guard let distanceType = HKObjectType.quantityType(forIdentifier: .distanceWalkingRunning) else {
          completion(nil, NSError(domain: "HealthKit", code: 2, userInfo: [NSLocalizedDescriptionKey: "Unable to create distanceWalkingRunning type."]))
          return
      }

      let predicate = HKQuery.predicateForSamples(withStart: startDate, end: endDate, options: .strictStartDate)

      let query = HKSampleQuery(sampleType: distanceType, predicate: predicate, limit: HKObjectQueryNoLimit, sortDescriptors: nil) { _, samples, error in
          if let error = error {
              completion(nil, error)
              return
          }
        
          let runningDistances: NSMutableArray = []
        
          let distanceSamples = samples as? [HKQuantitySample]
          if let distanceSamples = distanceSamples {
              for sample in distanceSamples {
                  let distance = sample.quantity.doubleValue(for: HKUnit.meter())
                  let eventStartDate = self._dateFormatter.string(from: sample.startDate)
                  let eventEndDate = self._dateFormatter.string(from: sample.endDate)
                
                  let dict: [String: Any] = [
                    "startDate": eventStartDate,
                    "endDate": eventEndDate,
                    "distance": distance
                  ]

                  runningDistances.add(dict)
              }
          }
        
          completion(runningDistances, nil)
          return
      }

      let healthStore = HKHealthStore()
      healthStore.execute(query)
  }

Our results are attached.

"the HKQuantityTypeIdentifierDistanceWalkingRunning is grouped into one and not broken down into laps or segments or markers"

A sample is not supposed to give you a breakdown list, so you will need to calculate it with your own code. In your example, you get the following data:

"endDate": "2024-09-19T12:10:36.356Z",
"startDate": "2024-09-19T10:56:18.464Z",
"totalQuantity": 16156.587324428696,
"quantityType": "HKQuantityTypeIdentifierDistanceWalkingRunning"

You would be able to calculate the time per kilometer by doing: (12:10:36.356 - 10:56:18.464) / 16.156587324428696
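To make that arithmetic concrete, here is a minimal sketch using only Foundation. The helper name averagePacePerKilometer is hypothetical and not part of HealthKit; the dates and distance are the values from the JSON above. The elapsed time is 1:14:17.892, or 4457.892 seconds, so the average comes to roughly 276 seconds per kilometer.

```swift
import Foundation

// Hypothetical helper (not a HealthKit API): average pace in seconds per
// kilometer for a single cumulative distance covering [start, end].
func averagePacePerKilometer(start: Date, end: Date, meters: Double) -> TimeInterval? {
    let elapsed = end.timeIntervalSince(start)
    guard elapsed > 0, meters > 0 else { return nil }
    return elapsed / (meters / 1000)
}

let formatter = ISO8601DateFormatter()
formatter.formatOptions = [.withInternetDateTime, .withFractionalSeconds]

// Values copied from the JSON above; force-unwrapped only because this is a sketch.
let start = formatter.date(from: "2024-09-19T10:56:18.464Z")!
let end = formatter.date(from: "2024-09-19T12:10:36.356Z")!
let pace = averagePacePerKilometer(start: start, end: end, meters: 16156.587324428696)!

print(String(format: "%.1f seconds per km", pace))
```

This gives only one average for the whole run, which is why a per-split breakdown needs the individual samples rather than the total.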

I'd expect that you would get more samples of the .distanceWalkingRunning type if you run long enough.

Regarding the following:

"Our results are attached."

It seems that the JSON data you provided contains the statistics tied to a workout (HKWorkout), and not the samples that you retrieved using HKSampleQuery, doesn't it?

Best,
——
Ziqiao Chen
 Worldwide Developer Relations.

I don't think we are on the same page. That run was 10 miles, so it should have been long enough to retrieve information.

Better summary of what we are trying to do: We are trying to get a split-by-split breakdown of each mile or kilometer. I attached the Apple UI of what we are looking to re-create.

The query above creates the JSON provided above. Appreciate your help. Thank you.

Indeed. I guess I didn't read the decimal point correctly and thought it was just around one mile. Sorry for that.

Still, your workout having only one .distanceWalkingRunning sample, as shown in your JSON file, is quite different from what I get from my HealthKit store. For example, I used the following code to retrieve the .distanceWalkingRunning samples of a workout I did yesterday:

let startDateSort = NSSortDescriptor(key: HKSampleSortIdentifierStartDate, ascending: true)
let query = HKSampleQuery(sampleType: HKQuantityType(.distanceWalkingRunning),
                          predicate: HKQuery.predicateForObjects(from: workout),
                          limit: HKObjectQueryNoLimit,
                          sortDescriptors: [startDateSort]) { (_, results, error) in
    guard let distanceSamples = results as? [HKQuantitySample],
            distanceSamples.count > 0 else {
        return // Error handling
    }
    ... // Do `po` here.
}
healthStore.execute(query)
}

And here is what I got:

(lldb) po distanceSamples
▿ 1014 elements
  - 0 : ... 84.5248 m ..., (10.6.1), "Watch4,2" (10.6.1) "Apple Watch"  (2024-09-20 19:33:43 -0700 - 2024-09-20 19:34:14 -0700)
  - 1 : ... 77.9179 m ..., (10.6.1), "Watch4,2" (10.6.1) "Apple Watch"  (2024-09-20 19:34:14 -0700 - 2024-09-20 19:34:44 -0700)
  - 2 : ... 75.562 m ..., (10.6.1), "Watch4,2" (10.6.1) "Apple Watch"  (2024-09-20 19:34:44 -0700 - 2024-09-20 19:35:15 -0700)
  - 3 : ... 76.1746 m ..., (10.6.1), "Watch4,2" (10.6.1) "Apple Watch"  (2024-09-20 19:35:15 -0700 - 2024-09-20 19:35:46 -0700)
...

You can see that my workout has 1014 .distanceWalkingRunning samples and the distance of every sample is quite small. With this kind of data, you can add up the distances to calculate the time per mile.
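As a rough illustration of that adding-up, here is a minimal sketch. DistanceSample is a hypothetical stand-in for the start date, end date, and meters you would read from each HKQuantitySample, and splitDurations is an illustrative name, not a HealthKit API. It assumes the pace is constant within each sample, so the moment a mile boundary is crossed is linearly interpolated, and it counts any gap between samples toward the split it falls in.

```swift
import Foundation

// Hypothetical stand-in for HKQuantitySample: one distance reading over an interval.
struct DistanceSample {
    let start: Date
    let end: Date
    let meters: Double
}

// Accumulates distance in chronological order and records the elapsed time
// each time the running total crosses another `splitLength` meters.
// Default split length is one mile expressed in meters.
func splitDurations(samples: [DistanceSample], splitLength: Double = 1609.344) -> [TimeInterval] {
    let sorted = samples.sorted { $0.start < $1.start }
    guard let first = sorted.first, splitLength > 0 else { return [] }

    var splits: [TimeInterval] = []
    var splitStart = first.start   // when the current split began
    var covered = 0.0              // meters accumulated in the current split
    var lastEnd = first.start

    for sample in sorted where sample.meters > 0 {
        var remaining = sample.meters
        var cursor = sample.start
        // Assumption: constant pace inside a single sample.
        let secondsPerMeter = sample.end.timeIntervalSince(sample.start) / sample.meters

        // One sample may cross one or more split boundaries.
        while covered + remaining >= splitLength {
            let needed = splitLength - covered
            let boundary = cursor.addingTimeInterval(needed * secondsPerMeter)
            splits.append(boundary.timeIntervalSince(splitStart))
            splitStart = boundary
            cursor = boundary
            remaining -= needed
            covered = 0
        }
        covered += remaining
        lastEnd = sample.end
    }

    // Trailing partial split, like the short final entry in the original question.
    if covered > 0 {
        splits.append(lastEnd.timeIntervalSince(splitStart))
    }
    return splits
}
```

Passing splitLength: 1000 gives per-kilometer splits instead. With samples as short as the 30-second ones shown above, the interpolation error at each boundary should be small.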

Your data is under the allStatistics section of your workout, so I am wondering whether it is the summed distance of all your .distanceWalkingRunning samples...

Best,
——
Ziqiao Chen
 Worldwide Developer Relations.

Thank you, this is exactly what we needed.

@sgonser which application is the source of the workout?

Some apps save segments for both kilometers and miles so that they are quick and easy to fetch later based on the user's preference and locale. To your earlier comments and question, this might be why you are seeing segments overlap.

Namely, the Apple Activity app on watchOS does this. I recorded a walk on a beach in Ireland earlier this year. You can see my pace wasn't very high...there were many interesting shells, rocks, and sea life.

So from the image above, the first segment is the first 1 KM, the second is the first 1 MI, the third is the 2nd KM (remainder), and the fourth is the 2nd MI (remainder).

The HKStatisticsQuery API is very powerful. Another approach you could take, given another workout app's segments, is to compute the summed distance over each segment's interval. If it is near 1 mile, you know the segment is a mile; if it is near 1 km, you know it is a kilometer. To identify that last segment, look for the earlier segment whose end date matches, or roughly matches, the start date of the last segment. Don't do math that looks for exactly 1 KM or 1 MI, because the odds of the sample data adding up to exactly a round number are slim at best.
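A minimal sketch of that classification, under a few assumptions: DistanceSample and Segment are hypothetical stand-ins for HKQuantitySample and a segment HKWorkoutEvent's date interval, the distance is summed by hand here (in an app it could instead come from an HKStatisticsQuery with the .cumulativeSum option over the segment's interval), and the 5% tolerance is an arbitrary starting point rather than a recommended value.

```swift
import Foundation

// Hypothetical stand-ins for HKQuantitySample and a segment event's interval.
struct DistanceSample { let start: Date; let end: Date; let meters: Double }
struct Segment { let start: Date; let end: Date }

enum SegmentUnit { case kilometer, mile, partial }

// Sums the distance inside the segment. A sample that straddles a segment
// boundary contributes in proportion to the overlapping time.
func distance(in segment: Segment, samples: [DistanceSample]) -> Double {
    samples.reduce(0) { total, sample in
        let duration = sample.end.timeIntervalSince(sample.start)
        guard duration > 0 else { return total }
        let overlapStart = max(sample.start, segment.start)
        let overlapEnd = min(sample.end, segment.end)
        let overlap = overlapEnd.timeIntervalSince(overlapStart)
        guard overlap > 0 else { return total }
        return total + sample.meters * overlap / duration
    }
}

// Compares the summed distance against 1 mile and 1 km within a relative
// tolerance instead of testing for an exact round number.
func classify(_ segment: Segment, samples: [DistanceSample], tolerance: Double = 0.05) -> SegmentUnit {
    let meters = distance(in: segment, samples: samples)
    if abs(meters - 1609.344) <= 1609.344 * tolerance { return .mile }
    if abs(meters - 1000) <= 1000 * tolerance { return .kilometer }
    return .partial
}
```

Segments that come back as .partial are the remainder candidates; the end-date matching described above can then pair each one with the segment series it belongs to.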
