I am working on a personal use app, to transcribe audio files. I have over 1000 Voice Memos of ideas for a dog training app and book, recorded while... walking dogs, of course.
I don't seem to have the built-in transcription option, either because Sonoma doesn't support it or because my region doesn't, but I have learned a lot of Swift building an app that works great for files in a folder in Documents.
I have also found the path to all the Voice Memo recordings. But when I try to read the contents of the folder to build the queue for transcription, I get this error: The file “Recordings” couldn’t be opened because you don’t have permission to view it.
I expected this to be locked down, and some searching brought me to this and I have added Access User Selected Files (Read Only) = YES to the entitlements file, but I am not seeing where in the TARGETS editor I would assign com.apple.security.files.user-selected.read-only. If I add it as a key under info I don't get a popup to select, either in Xcode or when running the app. If I try to add that key to the entitlements file it doesn't allow for selection either.
I am sure I am just missing something in the documentation, likely as a result of being an Xcode & Swift noob. So, if I CAN do this and I am just missing something, can someone point the way? And if a folder inside another app is just verboten, manually copying those files to a documents folder for processing won't be the end of the world.
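A minimal sketch of one way this might work, assuming the app is sandboxed. As far as I understand it, the entitlement never shows a prompt by itself; it only lets the app read whatever the user picks in an open panel. In Xcode it is normally set on the target's Signing & Capabilities tab, under App Sandbox > File Access > User Selected File, rather than as a key under Info. The helper names `audioFiles(in:)` and `chooseFolder()` are hypothetical, and whether the Voice Memos container needs additional privacy permission beyond this is something I have not verified.

```swift
import Foundation
#if canImport(AppKit)
import AppKit
#endif

// Returns the .m4a files directly inside `folder`, sorted by path.
// `audioFiles(in:)` is a hypothetical helper name.
func audioFiles(in folder: URL) throws -> [URL] {
    let contents = try FileManager.default.contentsOfDirectory(
        at: folder,
        includingPropertiesForKeys: nil,
        options: [.skipsHiddenFiles])
    return contents
        .filter { $0.pathExtension.lowercased() == "m4a" }
        .sorted { $0.path < $1.path }
}

#if canImport(AppKit)
// Asks the user to pick a folder (macOS only). In a sandboxed app, the folder
// picked here is what the user-selected read-only entitlement lets the app read.
// `chooseFolder()` is a hypothetical helper name.
@MainActor
func chooseFolder() -> URL? {
    let panel = NSOpenPanel()
    panel.canChooseDirectories = true
    panel.canChooseFiles = false
    panel.allowsMultipleSelection = false
    return panel.runModal() == .OK ? panel.url : nil
}
#endif
```

On macOS the two pieces would be combined roughly as `if let folder = chooseFolder() { let queue = try audioFiles(in: folder) }`, for example from a button action.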
I am attempting to do batch transcription of audio files exported from Voice Memos, and I am running into an interesting issue. If I transcribe a single file it works every time, but if I try to batch it, only the last one works, and the others fail with "No speech detected". I assumed it must be something about concurrency, so I implemented what I think should remove any chance of transcriptions running in parallel. With a mocked-up unit of work, everything looked good. So I added the transcription back in, and
1: It still fails on all but the last file. This happens whether I am processing 10 files or just 2.
2: It no longer processes in order; any file can be the last one that succeeds. It also seems unrelated to file size. I have had paragraph-sized notes finish last, but also a single short sentence.
I left the mocked processFiles() for reference.
Any insights would be greatly appreciated.
import Speech
import SwiftUI

struct ContentView: View {
    @State private var processing: Bool = false
    @State private var fileNumber: String?
    @State private var fileName: String?
    @State private var files: [URL] = []
    let locale = Locale(identifier: "en-US")
    let recognizer: SFSpeechRecognizer?

    init() {
        self.recognizer = SFSpeechRecognizer(locale: self.locale)
    }

    var body: some View {
        VStack {
            if files.count > 0 {
                ZStack {
                    ProgressView()
                    Text(fileNumber ?? "-")
                        .bold()
                }
                Text(fileName ?? "-")
            } else {
                Image(systemName: "folder.badge.minus")
                Text("No audio files found")
            }
        }
        .onAppear {
            files = getFiles()
            Task {
                await processFiles()
            }
        }
    }

    private func getFiles() -> [URL] {
        do {
            let documentsURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask).first!
            let path = documentsURL.appendingPathComponent("Voice Memos").absoluteURL
            let contents = try FileManager.default.contentsOfDirectory(at: path, includingPropertiesForKeys: nil, options: [])
            let files = (contents.filter {$0.pathExtension == "m4a"}).sorted { url1, url2 in
                url1.path < url2.path
            }
            return files
        }
        catch {
            print(error.localizedDescription)
            return []
        }
    }

    private func processFiles() async {
        var fileCount = files.count
        for file in files {
            fileNumber = String(fileCount)
            fileName = file.lastPathComponent
            await processFile(file)
            fileCount -= 1
        }
    }

//    private func processFile(_ url: URL) async {
//        let seconds = Double.random(in: 2.0...10.0)
//        await withCheckedContinuation { continuation in
//            DispatchQueue.main.asyncAfter(deadline: .now() + seconds) {
//                continuation.resume()
//                print("\(url.lastPathComponent) \(seconds)")
//            }
//        }
//    }

    private func processFile(_ url: URL) async {
        let recognitionRequest = SFSpeechURLRecognitionRequest(url: url)
        recognitionRequest.requiresOnDeviceRecognition = false
        recognitionRequest.shouldReportPartialResults = false
        await withCheckedContinuation { continuation in
            recognizer?.recognitionTask(with: recognitionRequest) { (transcriptionResult, error) in
                guard transcriptionResult != nil else {
                    print("\(url.lastPathComponent.uppercased())")
                    print(error?.localizedDescription ?? "")
                    return
                }
                if ((transcriptionResult?.isFinal) == true) {
                    if let finalText: String = transcriptionResult?.bestTranscription.formattedString {
                        print("\(url.lastPathComponent.uppercased())")
                        print(finalText)
                    }
                }
            }
            continuation.resume()
        }
    }
}
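One thing that stands out in the code above: `continuation.resume()` runs right after `recognitionTask` is started, not inside its result handler, so `await processFile(file)` returns immediately and the loop starts the next request while the earlier ones are still running. In the mocked version the resume sits inside the delayed callback, which would explain why that one looked sequential. Whether this is the whole cause of the "No speech detected" failures is an assumption I have not verified. The sketch below shows the pattern with a stand-in callback API so it runs anywhere; `fakeTranscribe`, `transcribe` and `transcribeAll` are hypothetical names, not part of the Speech framework.

```swift
import Foundation

// Hypothetical stand-in for a callback-based API such as
// recognitionTask(with:resultHandler:). It returns immediately and
// delivers its result later on a background queue.
func fakeTranscribe(_ name: String, delay: Double,
                    completion: @escaping @Sendable (String?, Error?) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: .now() + delay) {
        completion("text of \(name)", nil)
    }
}

// Wraps the callback API so that `await` only returns after the callback fires.
func transcribe(_ name: String, delay: Double) async -> String? {
    return await withCheckedContinuation { continuation in
        fakeTranscribe(name, delay: delay) { text, error in
            // Resume inside the handler, on the failure path as well as the
            // success path, and exactly once per continuation.
            continuation.resume(returning: error == nil ? text : nil)
        }
    }
}

// Processes the items strictly one after another, in array order.
func transcribeAll(_ items: [(name: String, delay: Double)]) async -> [String] {
    var results: [String] = []
    for item in items {
        if let text = await transcribe(item.name, delay: item.delay) {
            results.append(text)
        }
    }
    return results
}
```

Applied to `processFile(_:)`, this would mean moving `continuation.resume()` into the handler, calling it both in the `guard` failure branch and when `isFinal` is true, while making sure it is never called twice for the same continuation.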
First off, given that I didn't find a tag for Code Review, I hope I am not out of scope for the forums here.
Second, some background. I am a long-time Windows PowerShell developer, moving to Swift because I don't like self-loathing. :)
Currently I am trying to get my head around SwiftData, and experimenting with creating a Service to handle the actual SwiftData functionality, and a Manager to handle various tasks that relate to instances of the Model. I am doing this realizing that it MAY NOT be the best approach, but it gives me reps both producing code and thinking about how to solve a problem, which I think is useful even if the actual product is throwaway. That said, I am hoping someone with more experience than I have can comment on this approach, especially with respect to expanding to more models, more complex models, lots of data, and a desire to use ModelActor eventually.
DataManagerApp.swift
import SwiftData
import SwiftUI

@main
struct DataManagerApp: App {
    let container: ModelContainer

    init() {
        let schema = Schema([DataModel.self])
        let config = ModelConfiguration("SwiftDataStore", schema: schema)
        do {
            let modelContainer = try ModelContainer(for: schema, configurations: config)
            DataService.instance.assignContainer(modelContainer)
            container = modelContainer
        } catch {
            fatalError("Could not configure SwiftData ModelContainer.")
        }
    }

    var body: some Scene {
        WindowGroup {
            ContentView()
                .modelContainer(container)
        }
    }
}
DataModel.swift
import Foundation
import SwiftData

@Model
final class DataModel {
    var date: Date

    init(date: Date) {
        self.date = date
    }
}

final class DataService {
    static let instance = DataService()
    private var modelContainer: ModelContainer?
    private var modelContext: ModelContext?

    private init() {}

    func assignContainer(_ container: ModelContainer) {
        if modelContainer == nil {
            modelContainer = container
            modelContext = ModelContext(modelContainer!)
        } else {
            print("Attempted to assign ModelContainer more than once.")
        }
    }

    func addModel(_ dataModel: DataModel) {
        modelContext?.insert(dataModel)
    }

    func removeModel(_ dataModel: DataModel) {
        modelContext?.delete(dataModel)
    }
}

final class ModelManager {
    static let instance = ModelManager()
    let dataService: DataService = DataService.instance

    private init() {}

    func newModel() {
        let newModel = DataModel(date: Date.now)
        DataService.instance.addModel(newModel)
    }
}
ContentView.swift
import SwiftData
import SwiftUI

struct ContentView: View {
    @Environment(\.modelContext) var modelContext
    @State private var sortOrder = SortDescriptor(\DataModel.date)
    @Query(sort: [SortDescriptor(\DataModel.date)]) var models: [DataModel]

    var body: some View {
        VStack {
            addButton
            List {
                ForEach(models) { model in
                    modelRow(model)
                }
            }
            .listStyle(.plain)
        }
        .padding()
        .toolbar {
            ToolbarItem(placement: .topBarTrailing) {
                addButton
            }
        }
    }
}

private extension ContentView {
    var addButton: some View {
        Button("+ Add") {
            ModelManager.instance.newModel()
        }
    }

    func modelRow(_ model: DataModel) -> some View {
        HStack {
            Text(model.date.formatted(date: .numeric, time: .shortened))
            Spacer()
        }
    }
}
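For the ModelActor direction mentioned above, here is a rough sketch of what the service might look like as an actor instead of a singleton class. It is an illustration under assumptions, not a recommendation: the `DataService` below is a rewrite rather than the class from the post, `modelCount()` is a hypothetical helper, and the explicit `save()` is there because, to my knowledge, a `ModelContext` you create yourself does not autosave by default the way the container's main context does.

```swift
import Foundation
import SwiftData

@Model
final class DataModel {
    var date: Date

    init(date: Date) {
        self.date = date
    }
}

// The @ModelActor macro supplies init(modelContainer:) plus the
// `modelContainer` and `modelContext` properties used below.
@ModelActor
actor DataService {
    // Creates the model inside the actor, so no model instance
    // has to cross an actor boundary.
    func addModel(date: Date) throws {
        modelContext.insert(DataModel(date: date))
        try modelContext.save()
    }

    // Hypothetical helper: number of stored DataModel records.
    func modelCount() throws -> Int {
        try modelContext.fetchCount(FetchDescriptor<DataModel>())
    }
}
```

A caller would create the service once with the app's container, `let service = DataService(modelContainer: container)`, and then call `try await service.addModel(date: .now)`. Passing plain values such as `Date` into the actor, rather than model objects, is the design choice this sketch is meant to show.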
I am trying to get my head around SwiftData, and specifically some more "advanced" ideas that I have not seen covered in the various tutorials.
Specifically, I have a class that includes a collection that may or may not contain elements. For now I am experimenting with a simple array of Date, and I don't know if I should make it an optional or an empty array. Without SwiftData in the mix it seems like it's probably programmer's choice, but I wonder if SwiftData handles those two scenarios differently in a way that would suggest one over the other.
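As a point of comparison, here is a minimal sketch of the empty-array variant. The model name `Note` and its `title` property are hypothetical, and I can't say for certain whether SwiftData stores an optional array differently from an empty one. What the sketch does show is that a non-optional array with a default value can be declared on a model, and that call sites then need no unwrapping.

```swift
import Foundation
import SwiftData

@Model
final class Note {                  // hypothetical model name
    var title: String
    var dates: [Date] = []          // empty array stands for "no dates yet"

    init(title: String) {
        self.title = title
    }

    // No optional handling is needed to add or count entries.
    func addDate(_ date: Date) {
        dates.append(date)
    }
}
```

With the optional form, `var dates: [Date]?`, the same append would need something like `dates = (dates ?? []) + [date]`, so the optional seems worth it mainly when "never set" and "set but empty" have to be told apart.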