I have achieved what you’re thinking of, with real-time texturing of 3D objects - scene color reconstruction. It only runs on devices with LiDAR and isn’t exportable, but it could give you some ideas.
Yes, that is (partially) possible. You can project any geometry of an anchor back into the camera image to reason about the texture. However, this requires multiple viewpoints and some form of higher-level logic to decide which projection to apply to which part of your geometry.
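As a minimal sketch of the projection step, assuming a vertex pulled from the anchor's mesh, a known viewport size, and landscape-right orientation (all things your app would supply), ARCamera's projectPoint tells you which camera pixel a point of the geometry falls on:

```swift
import ARKit
import simd

// A minimal sketch, assuming `localVertex` is one vertex of the anchor's mesh,
// the device is in landscape-right orientation, and `viewportSize` matches the
// camera image you sample from. Returns the 2D coordinate to read the color
// from, or nil if the point is behind the camera or off-screen.
func imageCoordinate(of localVertex: simd_float3,
                     anchor: ARAnchor,
                     frame: ARFrame,
                     viewportSize: CGSize) -> CGPoint? {
    // Anchor-local space -> world space.
    let world4 = anchor.transform * simd_float4(localVertex.x, localVertex.y, localVertex.z, 1)
    let worldPosition = simd_float3(world4.x, world4.y, world4.z)

    // Reject vertices behind the camera (ARKit cameras look down -Z).
    let cameraSpace = frame.camera.transform.inverse
        * simd_float4(worldPosition.x, worldPosition.y, worldPosition.z, 1)
    guard cameraSpace.z < 0 else { return nil }

    // World space -> 2D viewport coordinates.
    let point = frame.camera.projectPoint(worldPosition,
                                          orientation: .landscapeRight,
                                          viewportSize: viewportSize)
    guard (0..<viewportSize.width).contains(point.x),
          (0..<viewportSize.height).contains(point.y) else { return nil }
    return point
}
```

Running this for the same vertex across several frames is where the higher-level logic comes in: you have to decide which viewpoint's color wins for each part of the mesh.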
I took care of that problem, although it was extremely difficult to do so.
You could build on existing research into scene color reconstruction. I achieved the goal you're asking about, and used it to reconstruct the real world in VR, turning Google Cardboard into an AR headset.
I already pulled off such an approach with scene color reconstruction and you can access it through ARHeadsetKit. It renders the true color of the objects instead of using semantic classification though (I’ve never found a use for the object categories).
It may not be well suited to your project, but there definitely is a tutorial that fits your criterion - ARHeadsetKit’s tutorial series.
Google killed the Google Cardboard project in March 2021 (alongside Swift for TensorFlow). The remnants of that code are all Objective-C and OpenGL, neither of which I'm a fan of. Their VR Expeditions app gets stuck spinning nonstop on iPhone 12 and up, and I can guarantee it's because it was last updated in 2015.
However, I have a solution that might help. ARHeadsetKit is geared toward AR, but it can become VR if you fill the entire world with virtual objects. Furthermore, you'd be the first person to go through its tutorials while owning Google Cardboard!
I’ve always struggled to reach my target audience with ARHeadsetKit, and I’d be extremely excited if someone was willing to incorporate my past work into an app that goes on the iOS App Store or something like that. If you find this worthwhile and want to bypass the wait time of me responding on Apple Developer Forums, please contact me through the email linked on my GitHub profile.
Research on scene color reconstruction may give you some answers. In my opinion, you’re taking on way more than a single developer can handle, unless you have thousands of hours to spare.
Furthermore, depth buffers are only available on devices with a LiDAR scanner. That’s a small portion of your consumer market, so keep that in mind. Your eagerness to experiment with AR is on the right track though.
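If it helps, the depth-dependent code path can be gated at runtime. Here is a minimal sketch; the surrounding ARSession setup is assumed to be whatever your app already does:

```swift
import ARKit

// Minimal availability check, assuming you only want scene depth when a
// LiDAR scanner is present. On non-LiDAR devices the configuration simply
// runs without the depth frame semantics.
let configuration = ARWorldTrackingConfiguration()
if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
    // Only LiDAR-equipped devices pass this check.
    configuration.frameSemantics.insert(.sceneDepth)
}
// session.run(configuration) — `session` is whatever ARSession your app owns.
```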
Definitely a major player, with a dedicated $3000 AR headset at the end of this year.
ARHeadsetKit's tutorial series - you don't even need iOS or Swift experience to try them. If you have 5 hours to spare, they cover a lot of content, like a crash course.
No, and I don't think they'll respond. My last bug report was for a 2-minute bug fix, and it took them 3 months to respond. I even called them on the phone and pointed out that they were ignoring a developer forums thread.
[this comment was accidentally placed here and relocated to a different position on the thread by the author]
I'm working on a Metal/DirectX backend for Swift for TensorFlow while simultaneously resurrecting that archived project. It should be released in a few months, and will be more optimized than Apple's PluggableDevice implementation. It will also allow training on an iPhone or iPad, so you can experience fast GPU-accelerated machine learning on an Apple chip even if you have a low-end Intel Mac. Furthermore, it will support training on the integrated GPUs on Intel Macs, which tensorflow-macos doesn't.
@BigfootLives if this sounds like a solution to your question, I can provide links to GitHub repositories about it.
I just published over a week's worth of work on implementing Fast Fourier Transforms in Metal: MetalFFT. @CaptainHarlock worked with me throughout the process, and this thread is effectively resolved.
I have one more request for the MPS team, which is listed in my repository's README. I have no way of knowing whether any Apple engineers review a specific issue in the Feedback Assistant, and I especially do not want this one to be ignored. Graphics and Games Engineer, please relay this development directly to the MPS team (sorry for this being the third time you are asked that on this thread). I would like them to carry on my work and integrate it into Metal Performance Shaders, but we must establish communication first.
If this helps any, there might be a Metal backend for Swift for TensorFlow in a few months. It would work without any prior configuration, because you would just need to add a Swift package dependency to an Xcode project.
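For anyone unfamiliar with that workflow, the dependency would look something like the sketch below. The URL and product name are placeholders (the real repository link would come with the release), and in an Xcode project you could also use File ▸ Add Packages… instead of a manifest:

```swift
// swift-tools-version:5.5
import PackageDescription

// Placeholder manifest — the package URL and product name below are
// hypothetical stand-ins for the eventual Metal backend repository.
let package = Package(
    name: "MyTrainingApp",
    dependencies: [
        .package(url: "https://github.com/example/metal-s4tf-backend", branch: "main"),
    ],
    targets: [
        .executableTarget(
            name: "MyTrainingApp",
            dependencies: [
                .product(name: "MetalS4TFBackend", package: "metal-s4tf-backend"),
            ]
        ),
    ]
)
```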
Another thing I'd like to see in MPS is support for encoding into indirect compute commands. I recently drew up plans for adding a Metal backend to DL4S, a deep learning framework for Swift. This requires commands to be dispatched semi-eagerly, where you can't pre-compile them into graphs like with MPSGraph. Being able to utilize indirect command buffers in a JIT compiler like XLA (tensorflow.org/xla) would provide opportunities to reduce encoding overhead.
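For context, plain compute kernels can already be encoded into an indirect command buffer; the sketch below shows that workflow, with `device`, `library`, the hypothetical kernel name "scaleKernel", the buffer, and the dispatch sizes all standing in as assumptions. MPS kernels have no equivalent entry point, which is what this request is about:

```swift
import Metal

// A sketch of the requested workflow, using a plain compute kernel since MPS
// currently offers no such entry point. All inputs here are assumptions.
func makeIndirectDispatch(device: MTLDevice,
                          library: MTLLibrary,
                          buffer: MTLBuffer) throws -> MTLIndirectCommandBuffer {
    // The pipeline must opt into indirect command buffers.
    let pipelineDescriptor = MTLComputePipelineDescriptor()
    pipelineDescriptor.computeFunction = library.makeFunction(name: "scaleKernel")!
    pipelineDescriptor.supportIndirectCommandBuffers = true
    let pipeline = try device.makeComputePipelineState(descriptor: pipelineDescriptor,
                                                       options: [],
                                                       reflection: nil)

    // Describe and allocate the indirect command buffer (ICB).
    let icbDescriptor = MTLIndirectCommandBufferDescriptor()
    icbDescriptor.commandTypes = .concurrentDispatch
    icbDescriptor.maxKernelBufferBindCount = 1
    let icb = device.makeIndirectCommandBuffer(descriptor: icbDescriptor,
                                               maxCommandCount: 1,
                                               options: [])!

    // Encode one dispatch into the ICB. The ICB can be replayed many times
    // (or filled from the GPU), which is what cuts down encoding overhead.
    let command = icb.indirectComputeCommandAt(0)
    command.setComputePipelineState(pipeline)
    command.setKernelBuffer(buffer, offset: 0, at: 0)
    command.concurrentDispatchThreadgroups(
        MTLSize(width: 32, height: 1, depth: 1),
        threadsPerThreadgroup: MTLSize(width: 64, height: 1, depth: 1))
    return icb
}

// Replaying later on a compute encoder (resources referenced only by the ICB
// must be made resident first):
// encoder.useResource(buffer, usage: [.read, .write])
// encoder.executeCommandsInBuffer(icb, range: 0..<1)
```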
This isn't encouraged by Apple, but I found a way to load the raw MPS shaders by peering into a private Metallib directory accessible from public APIs. I'll go into as little detail as possible for obvious reasons, but it was possible to create compute pipeline states from MPS shaders. If I had studied them longer, I could have made an indirect command buffer workflow using them. However, there are numerous details about MPS's internals that I don't know, so I might accidentally do something unsafe. I'm mentioning this because it proves the MPS team can theoretically pull this off - they just need to expose a safe public API for it. There is also a precedent for unique features geared toward rare performance use cases - MTLCommandQueue.makeCommandBufferWithUnretainedReferences().
I ended up scrapping plans for ICBs because I would need entirely custom shaders to securely execute GPU work, and Apple's MPS shaders far outperformed mine. With that restriction gone, I readily changed my plans to use MPS. For more context on how this played out, you can check out some of the closed issues under the DL4S Evolution repository. I later shifted my efforts to Swift for TensorFlow, so that repo shouldn't experience major updates in the future.
I'm debating whether I should jump-start MetalFFT now, while I wait for the S4TF project to gain momentum in the Swift community (also to help out @CaptainHarlock). I would structure its API similarly to MPS, but you would pass in either an MTLComputeCommandEncoder or an MTLIndirectComputeCommand instead of calling kernel.encode(commandBuffer:...). Perhaps the completion of MetalFFT will help the MPS team better understand my suggestion about ICBs. To the Graphics and Games Engineer responding to this post - could you route the info about MetalFFT and ICBs to the MPS team?
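To illustrate the difference in API shape, here is a hypothetical sketch (not MetalFFT's actual interface): the caller owns the encoder, so the same kernel object can also target an indirect compute command.

```swift
import Metal

// Hypothetical interface sketch — names and signatures are illustrative only.
protocol FFTKernelSketch {
    // MPS-style, for comparison: the kernel creates and owns its encoder.
    // func encode(commandBuffer: MTLCommandBuffer, input: MTLBuffer, output: MTLBuffer)

    // Encoder-based style described above: the caller supplies the encoder...
    func encode(into encoder: MTLComputeCommandEncoder,
                input: MTLBuffer, output: MTLBuffer)

    // ...or an individual indirect compute command, enabling ICB reuse.
    func encode(into command: MTLIndirectComputeCommand,
                input: MTLBuffer, output: MTLBuffer)
}
```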
There are two solutions. First, Metal has purgeable resources, so you could allocate an arbitrarily large cache and let the OS reclaim it under memory pressure (see the sketch after the next paragraph).
Second, I’m working on a Metal backend for Swift for TensorFlow, which should be multipurpose and might be interesting to you. I recommend going over the most recent issue under tensorflow/swift-apis on GitHub for more context. Also look at the closed issues under dl4s-team/dl4s-evolution for my last exploration into the topic.
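Here is a minimal sketch of the first option (the purgeable cache). The cache size is an arbitrary assumption - pick whatever your workload can tolerate losing:

```swift
import Metal

// A minimal sketch of the purgeable-cache idea. The cache size here is an
// arbitrary assumption.
func makePurgeableCache(device: MTLDevice) -> MTLBuffer {
    let cache = device.makeBuffer(length: 512 * 1024 * 1024,
                                  options: .storageModePrivate)!
    // While the cache sits idle, let the OS discard it under memory pressure.
    _ = cache.setPurgeableState(.volatile)
    return cache
}

// Before reusing the cache, pin it again and check whether the contents survived.
func reuseCache(_ cache: MTLBuffer) {
    if cache.setPurgeableState(.nonVolatile) == .empty {
        // The OS reclaimed the memory; recompute or reload the cached data here.
    }
}
```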
If you're talking about access to L1/L2 cache memory, I may have misunderstood you. Using threadgroup memory might serve your purposes - each threadgroup of up to 1024 threads can share and communicate through its own 32 KB of low-latency memory, which is insanely large. Metal Performance Shaders (among other implementations) uses it in matrix multiplication to reduce main memory accesses.
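Here is a minimal sketch of that pattern - a hypothetical reduction kernel that stages data in threadgroup memory before writing back to device memory (the kernel name, buffer layout, and dispatch sizes are all assumptions):

```swift
import Metal

// A minimal sketch of staging data in threadgroup memory. The kernel below is
// hypothetical; the host-side dispatch is shown in comments.
let kernelSource = """
#include <metal_stdlib>
using namespace metal;

// Each threadgroup sums its slice of `input` using shared scratch memory,
// touching device memory only once on the way in and once on the way out.
kernel void partialSums(device const float *input  [[buffer(0)]],
                        device float *partials     [[buffer(1)]],
                        threadgroup float *scratch [[threadgroup(0)]],
                        uint tid       [[thread_index_in_threadgroup]],
                        uint gid       [[thread_position_in_grid]],
                        uint groupID   [[threadgroup_position_in_grid]],
                        uint groupSize [[threads_per_threadgroup]])
{
    scratch[tid] = input[gid];
    threadgroup_barrier(mem_flags::mem_threadgroup);
    for (uint stride = groupSize / 2; stride > 0; stride /= 2) {
        if (tid < stride) { scratch[tid] += scratch[tid + stride]; }
        threadgroup_barrier(mem_flags::mem_threadgroup);
    }
    if (tid == 0) { partials[groupID] = scratch[0]; }
}
"""

// Host side: reserve 4 KB of the 32 KB threadgroup allotment (1024 floats),
// then dispatch with 1024 threads per threadgroup.
// let library = try device.makeLibrary(source: kernelSource, options: nil)
// encoder.setThreadgroupMemoryLength(1024 * MemoryLayout<Float>.stride, index: 0)
// encoder.dispatchThreadgroups(gridGroups,
//     threadsPerThreadgroup: MTLSize(width: 1024, height: 1, depth: 1))
```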
There is no equivalent to this on the CPU or ANE, although the ML accelerators (AMX) for the CPU have one insanely large 1024-word register (you can only access this indirectly through Accelerate). Apple has tried hard to hide this from the public, but you can find out about the AMX from a Google search.
I'm thinking of adding 1D, 2D, and 3D FFTs to an open-source project. They'll either end up in a Metal backend for Swift for TensorFlow, or in a related project. I am wondering whether the MPS team could use my open-source work to save time: they could postpone writing the FFT shaders for now and spend that time on another project, then use my implementation as a reference once it's open-sourced, jump-starting their own efforts.
From my experience with bug FB9653639, the Metal team is very slow to implement changes. In addition, they may need to rigorously test the shaders for bugs, which are very frequent and difficult to solve in GPGPU contexts. @CaptainHarlock, my open-source effort might meet your needs before FFT shaders are added to MPS. We could discuss this more off the developer forums if it's time-sensitive - my GitHub account is "philipturner".