In our use case, a background Mac app (running on an M1 Mac) receives data from a companion iOS app over a WebSocket connection (Apple's Swift API on the client side, Vapor 4 on the server side) and performs computations using the Metal compute APIs and our custom kernels.
To optimize these compute kernels, we are looking for a way to profile their execution time, i.e. how much combined GPU execution time (compute and memory accesses) is taken by each kernel instance. Our primary focus is not the time spent waiting in the kernel scheduling queues before execution begins, although that information would be a helpful extra.
We are not sure whether Instruments in Xcode will be helpful in this scenario (partly on iOS, partly a third-party WebSocket API, and partly a background Mac command-line app). Also, does the Metal frame-capture method depend on the presence of the Metal graphics APIs, and hence not work for background apps? Can we get the desired information using GPU counter sample buffers, or are we looking in the wrong places?
Any assistance with the above (measuring Metal compute kernel execution times in the context of a Mac background app) would be highly appreciated.
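For reference, a minimal sketch of one approach we are evaluating, using MTLCommandBuffer's gpuStartTime/gpuEndTime timestamps; commandQueue and pipelineState are placeholders for our real objects, and the dispatch sizes are arbitrary:

import Metal

func timeOneDispatch(commandQueue: MTLCommandQueue,
                     pipelineState: MTLComputePipelineState) {
    guard let commandBuffer = commandQueue.makeCommandBuffer(),
          let encoder = commandBuffer.makeComputeCommandEncoder() else { return }

    encoder.setComputePipelineState(pipelineState)
    encoder.dispatchThreadgroups(MTLSize(width: 64, height: 1, depth: 1),
                                 threadsPerThreadgroup: MTLSize(width: 32, height: 1, depth: 1))
    encoder.endEncoding()

    commandBuffer.addCompletedHandler { buffer in
        // gpuStartTime/gpuEndTime bracket the actual GPU execution of this
        // command buffer, excluding time spent queued before scheduling.
        let gpuMillis = (buffer.gpuEndTime - buffer.gpuStartTime) * 1000
        print("GPU execution time: \(gpuMillis) ms")
    }
    commandBuffer.commit()
}

This only gives per-command-buffer granularity rather than per-kernel, which is part of why we are asking about counter sample buffers.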
The description of XPC says it can be used as an inter-process communication mechanism. What exactly does "inter-process" mean here?
Can it be used to create shared memory between any types of processes (e.g. a Swift app process and a process written in any other language), or is it strictly for app-to-app (Swift-to-Swift) communication?
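For context, the kind of call we have in mind looks like this minimal sketch, assuming a Mach service named "com.example.worker" published by the other process; WorkerProtocol is a hypothetical protocol shared between both sides:

import Foundation

@objc protocol WorkerProtocol {
    func process(_ data: Data, reply: @escaping (Data) -> Void)
}

let connection = NSXPCConnection(machServiceName: "com.example.worker")
connection.remoteObjectInterface = NSXPCInterface(with: WorkerProtocol.self)
connection.resume()

if let proxy = connection.remoteObjectProxy as? WorkerProtocol {
    // Send bytes to the other process and receive a reply asynchronously.
    proxy.process(Data([0x01, 0x02])) { result in
        print("Received \(result.count) bytes back")
    }
}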
For several reasons we need to call Python functions from parallel threads; currently the PythonKit library is being used for this.
While one Python module is running, if I try to execute another module from a different thread, I get an EXC_BAD_ACCESS error. The error comes from PythonKit's internal code.
Please guide me if I am missing something.
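For reference, a minimal workaround sketch we are considering, on the assumption that the crash comes from concurrent entry into the Python runtime (which is guarded by a global interpreter lock): funnel every Python call through one serial queue.

import Foundation
import PythonKit

let pythonQueue = DispatchQueue(label: "com.example.python-serial")

func sqrtViaPython(_ value: Double) -> Double {
    pythonQueue.sync {
        // All Python work happens on this single queue, so two modules
        // can never be inside the interpreter at the same time.
        let math = Python.import("math")
        return Double(math.sqrt(value)) ?? 0
    }
}

This serializes the Python work, of course, so it trades parallelism for stability.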
For various reasons we need to send binary files from an iOS app to a macOS command-line tool (CLT) for data computation, and then send the computed results back from the macOS CLT to the iOS app.
a) What is the preferred way to do this (using Apple's tools and frameworks)?
b) If this is not easy to achieve with Apple's tools and frameworks, would "Vapor" be useful for the task?
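For option (b), a minimal sketch of what we imagine, assuming the macOS CLT embeds a Vapor 4 server and the iOS app POSTs the file over HTTP; the route name and file path are placeholders:

import Vapor

func routes(_ app: Application) throws {
    // Collect the request body in memory up to a size limit, then persist it.
    app.on(.POST, "upload", body: .collect(maxSize: "50mb")) { req -> HTTPStatus in
        guard let buffer = req.body.data else { throw Abort(.badRequest) }
        try Data(buffer.readableBytesView)
            .write(to: URL(fileURLWithPath: "/tmp/incoming.bin"))
        return .ok
    }
}

The computed results could presumably be returned the same way, in the body of an HTTP response.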
To use some libraries that are readily available in Python, we need to run Python files from the macOS app and capture their output. What is the preferred (Apple-recommended) way to call Python functions from Swift?
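One out-of-process sketch, using Foundation's Process to run a Python script and capture its stdout; the script path is a placeholder:

import Foundation

func runPythonScript(at path: String) throws -> String {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: "/usr/bin/env")
    process.arguments = ["python3", path]

    let stdout = Pipe()
    process.standardOutput = stdout

    try process.run()
    process.waitUntilExit()

    // Everything the script printed to stdout comes back as a String.
    let data = stdout.fileHandleForReading.readDataToEndOfFile()
    return String(data: data, encoding: .utf8) ?? ""
}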
While trying to set a texture with memoryless storage from an MTLComputeCommandEncoder, I get an error that memoryless storage mode cannot be used with MTLComputeCommandEncoder (parallel computation).
The documentation (https://developer.apple.com/documentation/metal/setting_resource_storage_modes/choosing_a_resource_storage_mode_in_ios_and_tvos) is not clear about this.
Can memoryless storage mode be used in this case? If yes, how?
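For reference, the fallback we assume is needed if memoryless is off the table: a .private texture, which is GPU-only but backed by real memory. The pixel format and dimensions are placeholders:

import Metal

func makeComputeTexture(device: MTLDevice) -> MTLTexture? {
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .r32Float, width: 1024, height: 1024, mipmapped: false)
    descriptor.usage = [.shaderRead, .shaderWrite]
    // .private keeps the texture resident in GPU memory, unlike .memoryless,
    // whose contents exist only in tile memory during a render pass.
    descriptor.storageMode = .private
    return device.makeTexture(descriptor: descriptor)
}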
Consider an MTLTexture of type .type2DArray with some number of slices.
To calculate the histogram of a specific slice of this texture, how can we pass a reference to just one slice to the MPSImageHistogram kernel?
In general, how does one slice a texture in a Swift environment?
texture.makeTextureView() is not our preferred way because it creates a new texture object, which we suspect costs extra memory and time.
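For reference, the view-based approach mentioned above, with sliceIndex as a placeholder; as far as we can tell, a texture view aliases the parent texture's storage rather than copying pixel data:

import Metal

func makeSliceView(of texture: MTLTexture, sliceIndex: Int) -> MTLTexture? {
    // A 2D view over one slice of the 2D-array texture; the result can be
    // passed as the sourceTexture of MPSImageHistogram's encode call.
    return texture.makeTextureView(
        pixelFormat: texture.pixelFormat,
        textureType: .type2D,
        levels: 0..<1,
        slices: sliceIndex..<(sliceIndex + 1))
}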
There are some MTLTextures whose histograms need to be calculated. These textures change at every loop iteration, so the histogram calculation is carried out in a loop using MPSImageHistogram.
There is a single command queue, on which a new command buffer is created at every loop iteration to perform the histogram calculation.
The memory footprint (per the Allocations instrument) keeps increasing because of the command buffers created in the loop.
The question is: how do we release the memory allocated for a command buffer once it has executed?
Or is there a way to restructure the calculation scheme?
In short, how do we deallocate the memory consumed by Metal objects such as command buffers, compute pipelines, encoders, etc.?
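For reference, a sketch of the loop wrapped in autoreleasepool, on the assumption that the growth comes from autoreleased command-buffer objects that are only reclaimed when a pool drains; all names are placeholders for our real objects:

import Metal
import MetalPerformanceShaders

func runHistograms(commandQueue: MTLCommandQueue,
                   histogram: MPSImageHistogram,
                   histogramBuffer: MTLBuffer,
                   textures: [MTLTexture]) {
    for texture in textures {
        autoreleasepool {
            guard let commandBuffer = commandQueue.makeCommandBuffer() else { return }
            histogram.encode(to: commandBuffer,
                             sourceTexture: texture,
                             histogram: histogramBuffer,
                             histogramOffset: 0)
            commandBuffer.commit()
            // Completing the work inside the pool lets each buffer's
            // memory be reclaimed before the next iteration allocates more.
            commandBuffer.waitUntilCompleted()
        }
    }
}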
Indirect command buffers can be useful for repetitive commands where only the input buffers/textures change (please correct this if wrong!).
The Apple documentation (https://developer.apple.com/documentation/metal/indirect_command_buffers/encoding_indirect_command_buffers_on_the_cpu) only gives the example in Objective-C, not Swift!
Can anyone point to an example or tutorial of using ICBs from a Swift environment?
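For reference, our best guess at a Swift translation: encode one compute dispatch into an ICB on the CPU. pipelineState must come from a MTLComputePipelineDescriptor with supportIndirectCommandBuffers set to true, and all names and sizes here are placeholders:

import Metal

func makeDispatchICB(device: MTLDevice,
                     pipelineState: MTLComputePipelineState,
                     inputBuffer: MTLBuffer) -> MTLIndirectCommandBuffer? {
    let descriptor = MTLIndirectCommandBufferDescriptor()
    descriptor.commandTypes = .concurrentDispatch
    descriptor.inheritPipelineState = false
    descriptor.maxKernelBufferBindCount = 1

    guard let icb = device.makeIndirectCommandBuffer(descriptor: descriptor,
                                                     maxCommandCount: 1,
                                                     options: []) else { return nil }

    // Encode command 0: bind the pipeline and input buffer, then dispatch.
    let command = icb.indirectComputeCommand(at: 0)
    command.setComputePipelineState(pipelineState)
    command.setKernelBuffer(inputBuffer, offset: 0, at: 0)
    command.concurrentDispatchThreadgroups(
        MTLSize(width: 8, height: 8, depth: 1),
        threadsPerThreadgroup: MTLSize(width: 16, height: 16, depth: 1))
    return icb
}

// Later, inside a compute pass:
//   encoder.executeCommandsInBuffer(icb, range: 0..<1)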
While profiling an app (which uses Metal code for GPU acceleration), the Time Profiler instrument gives the error shown in the screenshot:
I want to know what kind of configuration is missing and how to set it up. I have also observed that Allocations, Leaks, and VM Tracker show no issues.
I am creating a framework that includes several Metal files. When the framework is embedded in a project and the default library is created with device?.makeDefaultLibrary(), the application crashes. It turns out that without a Bundle argument, makeDefaultLibrary() only searches for the library in the main bundle, but in our case it should search the embedded framework's bundle (a .metallib file is generated when the framework is built).
I have tried specifying the bundle as below:
A.
let frameworkBundle = Bundle(for: type(of: self))
let bundleLib = try device?.makeDefaultLibrary(bundle: frameworkBundle)
B.
let frameworkBundle = Bundle(identifier: "com.myframework")
let bundleLib = try device?.makeDefaultLibrary(bundle: frameworkBundle!)
The application still crashes, and I have also noticed that in both of the above methods frameworkBundle comes back nil.
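A fallback sketch we are also considering: loading the framework's compiled .metallib directly by URL. "default.metallib" is an assumption for the library's file name inside the framework bundle, and MyFrameworkClass stands for any class compiled into the framework:

import Metal

func loadFrameworkLibrary(device: MTLDevice) throws -> MTLLibrary? {
    let frameworkBundle = Bundle(for: MyFrameworkClass.self)
    guard let libURL = frameworkBundle.url(forResource: "default",
                                           withExtension: "metallib") else {
        return nil
    }
    // Loads the compiled Metal library straight from the framework bundle.
    return try device.makeLibrary(URL: libURL)
}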