I have a model that uses 'flatten', and when I converted it to a Core ML model and profiled it in Xcode on an iPhone XR, I noticed that 'flatten' was automatically converted to 'reshape'. However, the NPU does not support 'reshape'.
However, when I took the ResNet50 model from Apple's model page and profiled it in Xcode on the same iPhone XR, I can see a 'flatten' operator that runs on the NPU.
On the other hand, when I used the following code to convert ResNet50 from PyTorch and ran it through the Xcode performance report, the 'flatten' operation was converted to 'reshape', which then ran on the CPU.
So how do I keep the 'flatten' operator when converting to a Core ML model?
coremltools 7.1
iPhone XR
iOS 17.5.1
from torchvision import models
import coremltools as ct
import numpy as np  # needed for the dtype arguments below
import torch

network_name = "my_resnet50"

torch_model = models.resnet50(pretrained=True)
torch_model.eval()

width = 224
height = 224
example_input = torch.rand(1, 3, height, width)
traced_model = torch.jit.trace(torch_model, (example_input,))

model = ct.convert(
    traced_model,
    convert_to="neuralnetwork",
    inputs=[
        ct.TensorType(name="data", shape=example_input.shape, dtype=np.float32)
    ],
    outputs=[
        ct.TensorType(name="output", dtype=np.float32)
    ],
    compute_units=ct.ComputeUnit.CPU_AND_NE,
    minimum_deployment_target=ct.target.iOS14,
)
model.save("my_resnet.mlmodel")
ResNet50 on Resnet50.mlmodel
My Conversion of ResNet50
In Swift,
CVMetalTextureCacheCreateTextureFromImage returns a CVMetalTexture, and CVMetalTexture is a Swift class managed by ARC, so there is no need to call CVBufferRelease manually.
My question is: should I use a variable to keep a strong reference to it until the GPU has finished (i.e., until the addCompletedHandler callback fires)? A sketch of the pattern I mean is below.
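For reference, a minimal Objective-C sketch of the pattern I'm asking about (the encoding step and names like textureCache are placeholders; in Swift the equivalent would simply be capturing the CVMetalTexture in the completion closure, which keeps it alive the same way):

#import <CoreVideo/CoreVideo.h>
#import <Metal/Metal.h>

// Keep the CVMetalTexture alive until the command buffer completes.
static void EncodeWithPixelBuffer(id<MTLCommandBuffer> commandBuffer,
                                  CVMetalTextureCacheRef textureCache,
                                  CVPixelBufferRef pixelBuffer)
{
    size_t width  = CVPixelBufferGetWidth(pixelBuffer);
    size_t height = CVPixelBufferGetHeight(pixelBuffer);

    CVMetalTextureRef cvTexture = NULL;
    CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault, textureCache, pixelBuffer,
                                              NULL, MTLPixelFormatBGRA8Unorm,
                                              width, height, 0, &cvTexture);
    id<MTLTexture> texture = CVMetalTextureGetTexture(cvTexture);

    // ... encode GPU work that samples `texture` here ...

    // The Create call returned a +1 reference; release it only after the GPU is done.
    [commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> cb) {
        CVBufferRelease(cvTexture);
    }];
}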
Platform: iPhone XR
System: iOS 17.3.1
Using the iPhone front camera (the normal camera), I configure the data output format as 'kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange' ('420v', video range).
I found that Cb and Cr stay inside [16, 240], but Y falls outside the range [16, 235], e.g. values of 240 or 255.
This means that after converting to RGB, some RGB components can be negative; clamping r, g, b to [0, 255] and then converting the clamped RGB back to YUV gives a YUV that differs from the original.
The maximum difference in the Y channel can be as large as 20.
Both processing on the pure CPU and using a Metal shader give this result.
CVPixelBuffer.h
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange = '420v', /* Bi-Planar Component Y'CbCr 8-bit 4:2:0, video-range (luma=[16,235] chroma=[16,240]). baseAddr points to a big-endian CVPlanarPixelBufferInfo_YCbCrBiPlanar struct */
// ... some code ...
// configure camera data output format
NSDictionary* options = @{
    (__bridge NSString*)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange),
    //(__bridge NSString*)kCVPixelBufferMetalCompatibilityKey : @(YES),
};
[_videoDataOutput setVideoSettings:options];
// ... some code ...
- (void)captureOutput:(AVCaptureOutput *)output didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferRef pixelBuffer = imageBuffer;
    CVPixelBufferLockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
    uint8_t* yBase = (uint8_t*)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
    uint8_t* uvBase = (uint8_t*)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1);
    int imageWidth = (int)CVPixelBufferGetWidth(pixelBuffer); // 720
    int imageHeight = (int)CVPixelBufferGetHeight(pixelBuffer); // 1280
    int y_width = (int)CVPixelBufferGetWidthOfPlane (pixelBuffer, 0); // 720
    int y_height = (int)CVPixelBufferGetHeightOfPlane(pixelBuffer, 0); // 1280
    int uv_width = (int)CVPixelBufferGetWidthOfPlane (pixelBuffer, 1); // 360
    int uv_height = (int)CVPixelBufferGetHeightOfPlane(pixelBuffer, 1); // 640
    int y_stride = (int)CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
    int uv_stride = (int)CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1); // 768
    // check the Y plane
    if (TRUE) {
        for (int i = 0; i < imageHeight; i++) {
            for (int j = 0; j < imageWidth; j++) {
                uint8_t nv12pixel = *(yBase + y_stride * i + j);
                if (nv12pixel < 16 || nv12pixel > 235) { // nominal range [16, 235]
                    NSLog(@"%s: Y plane out of range, coord (x:%d, y:%d), h-coord (x:%d, y:%d) ; nv12 %u "
                          ,__FUNCTION__
                          ,j ,i
                          ,j/2, i/2
                          ,nv12pixel );
                }
            }
        }
    }
    // unlock pairs with the lock above
    CVPixelBufferUnlockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
}
// ... some code ...
How to deal with this case ?
Hoping for a reply, thanks.
After installing Xcode,
llvm-dwarfdump, llvm-objdump, etc. can be found under
"/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/",
but llvm-symbolizer is NOT there.
Should I install LLVM myself if I need llvm-symbolizer? Trying brew gives:
brew install llvm
Error: llvm: the bottle needs the Apple Command Line Tools to be installed.
You can install them, if desired, with:
xcode-select --install
If you're feeling brave, you can try to install from source with:
brew install --build-from-source llvm
Device: iPhone XR (iOS 14.2)
Xcode: 13.4.1
Tool: Instruments -- Allocations
When I use 'Allocations' to check for memory leaks,
I found that VM: Stack retains many allocations from pthread_create or pthread_join that have not been released (pthread_create is called either through the POSIX API directly or via std::thread).
I made sure that pthread_create and pthread_join are called in pairs for every thread (no thread exits without being joined); a minimal sketch of the pattern is below.
But the 'Allocations' VM: Stack category shows that something created in pthread_create and something created in pthread_join is not released.
So, will the things created by pthread_create or pthread_join be recycled by the system later?
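The create/join pairing I mean, as a minimal sketch (Worker is a hypothetical thread function):

#include <pthread.h>
#include <stddef.h>

static void *Worker(void *arg)
{
    // ... do some work ...
    return NULL;
}

static void RunOneThread(void)
{
    pthread_t thread;
    if (pthread_create(&thread, NULL, Worker, NULL) == 0) {
        // Every successfully created thread is joined, so nothing should leak.
        pthread_join(thread, NULL);
    }
}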
Xcode 13.4.1, Instruments: System Trace.
In the Narrative view,
it shows data with wall-clock time, like
"00:05.082.832 Called "psynch_cvwait" for 16.32 ms.".
In "Summary: System Calls",
it only summarizes the CPU time of "psynch_cvwait", with no wall-clock time.
But sometimes I want to know the wall-clock time, and for now I have to filter "psynch_cvwait" in the Narrative and add the durations up manually.
Is it possible to add a sum of wall-clock time to "Summary: System Calls"?
Thanks
Following the document and demo
mixing_metal_and_opengl_rendering_in_a_view,
the section "Select a Compatible Pixel Format" only shows MTLPixelFormatBGRA8Unorm-based combinations, as listed below.
If I want to use MTLPixelFormatRGBA8Unorm, how can I find the CoreVideo pixel format and GL format that match MTLPixelFormatRGBA8Unorm? (My untested guess is after the table.)
Thanks in advance.
// Table of equivalent formats across CoreVideo, Metal, and OpenGL
static const AAPLTextureFormatInfo AAPLInteropFormatTable[] =
{
// Core Video Pixel Format, Metal Pixel Format, GL internalformat, GL format, GL type
{ kCVPixelFormatType_32BGRA, MTLPixelFormatBGRA8Unorm, GL_RGBA, GL_BGRA_EXT, GL_UNSIGNED_INT_8_8_8_8_REV },
#if TARGET_IOS
{ kCVPixelFormatType_32BGRA, MTLPixelFormatBGRA8Unorm_sRGB, GL_RGBA, GL_BGRA_EXT, GL_UNSIGNED_INT_8_8_8_8_REV },
#else
{ kCVPixelFormatType_ARGB2101010LEPacked, MTLPixelFormatBGR10A2Unorm, GL_RGB10_A2, GL_BGRA, GL_UNSIGNED_INT_2_10_10_10_REV },
{ kCVPixelFormatType_32BGRA, MTLPixelFormatBGRA8Unorm_sRGB, GL_SRGB8_ALPHA8, GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV },
{ kCVPixelFormatType_64RGBAHalf, MTLPixelFormatRGBA16Float, GL_RGBA, GL_RGBA, GL_HALF_FLOAT },
#endif
};
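My untested guess, following the sample's pattern (assuming kCVPixelFormatType_32RGBA, which is byte-order R,G,B,A, is accepted by both texture caches):

// Untested guess following the AAPLTextureFormatInfo pattern above:
// kCVPixelFormatType_32RGBA should pair with MTLPixelFormatRGBA8Unorm
// and plain GL_RGBA / GL_UNSIGNED_BYTE.
static const AAPLTextureFormatInfo AAPLInteropFormatTableRGBA[] =
{
    // Core Video Pixel Format,  Metal Pixel Format,       GL internalformat, GL format, GL type
    { kCVPixelFormatType_32RGBA, MTLPixelFormatRGBA8Unorm, GL_RGBA,           GL_RGBA,   GL_UNSIGNED_BYTE },
};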
Target Platform: iPhone XR
Xcode: 12.4
After setting
"Enable Malloc Scribble"
"Malloc Guard Edges"
"Guard Malloc"
in Diagnostics and
"MallocCheckHeapEach=1"
"MallocCheckHeapSleep=100"
"MallocCheckHeapStart=100000"
in Environment Variables,
I start up the app on the iPhone and get the following information:
xxxx(1394,0x16f933000) malloc: *** MallocCheckHeap: FAILED check at operation #7444968
Stack for last operation where the malloc check succeeded: 0x1aefaed70 0x1aefa2f94 0x112f20540 0x1e904e76c 0x1e905a5e8 0x1e9054bf4 0x1e9035fc0 0x1b5ec57c4 0x112f216c0 0x112f25000 0x112f24e7c 0x1b5ec5268 0x1b5ed1348 0x1b5ed0e40 0x1a0b103f8 0x1a0b0e9a4 0x1a07a751c 0x1a0aef310 0x1a07afb74 0x1a07b6d38 0x1a0b1511c 0x1a0b12b28 0x1a02c2cc8 0x1a02bbac4 0x1a02bc7b0 0x1a0336028 0x1a02bb3c0 0x1a0336b60 0x1a0335344 0x1a03354c0 0x112f1fbcc 0x112f216c0 0x112f29354 0x112f2a0f4 0x112f2b5e4 0x112f36644 0x1e901c804 0x1e902375c
(Use 'atos' for a symbolic stack)
xxxx(1394,0x16f933000) malloc: *** Will sleep for 100 seconds to leave time to attach
xxxx(1394,0x16f933000) malloc: *** check: incorrect tiny region 44, counter=28255155
*** invariant broken for tiny block 0x13628fea0 this msize=0 - size is too small
xxxx(1394,0x16f933000) malloc: *** set a breakpoint in malloc_error_break to debug
xxxx(1394,0x16f933000) malloc: *** sleeping to help debug
Q.1 "Stack for last operation where the malloc check succeeded" means what ?
Q.2 the address is 'stack address' ? e.g 0x1aefaed70.
Following the hints "(Use 'atos' for a symbolic stack) ", I get nothing for 0x1aefaed70
$atos -o ./DerivedData/Build/Products/Debug-iphoneos/xxxx.app.dSYM/Contents/Resources/DWARF/xxxx -arch arm64 -l 0x10225c000 0x10225c000
0x0000000100000000 (in xxxx)
$atos -o ./DerivedData/Build/Products/Debug-iphoneos/xxxx.app.dSYM/Contents/Resources/DWARF/xxxx -arch arm64 -l 0x10225c000 0x1aefaed70
0x1aefaed70
(nothing)
0x10225c000 is load adress getting from AppDelegate after app start up.
// Log the load addresses of the app binary and our dylib among all loaded images.
uint32_t numImages = _dyld_image_count();
for (uint32_t i = 0; i < numImages; i++) {
    const struct mach_header *header = _dyld_get_image_header(i);
    const char *name = _dyld_get_image_name(i);
    const char *p = strrchr(name, '/');
    if (p && (strcmp(p + 1, "xxxx") == 0 || strcmp(p + 1, "libXxx.dylib") == 0)) {
        NSLog(@"module=%s, address=%p", p + 1, header);
    }
}
Following the page
MallocDebug:
after enabling Malloc Scribble,
freed buffers are filled with 0x55,
and newly malloc'd buffers are filled with 0xAA.
Can I change 0x55 and 0xAA to other values, e.g. 0xFF?
I have a simple vertex shader in a .metal file,
and then use
metal.exe -std=ios-metal1.1 -mios-version-min=8.0 -c test.metal -o test.air
metallib.exe test.air -o test.metallib
metal.exe/metallib.exe are under the folder "Metal Developer Tools"/ios/bin/ ("Metal Developer Tools for Windows").
I found that the .metallib file (3434 bytes) is bigger than the original .metal file (995 bytes).
Is that right? How is that explained?
iOS: 14.2, iPhone XR
I downloaded the demo from "https://developer.apple.com/documentation/metal/mixing_metal_and_opengl_rendering_in_a_view?language=objc"
and it plays normally.
Then I changed the code as follows;
the purpose is to switch between "mixed rendering" and "OpenGL-only rendering", and to re-create the AAPLMetalRenderer each time "mixed rendering" is re-entered.
A 'bug' phenomenon appears: each time I go from "OpenGL only" to "mixed rendering", the first frame displays an old picture (the last picture from just before the previous switch from "mixed rendering" to "OpenGL only").
On that first frame the code re-creates the AAPLMetalRenderer and calls drawToInteropTexture, but it seems the interop texture has not been 'updated' yet (or OpenGL's draw does not wait for Metal to finish rendering into the interop texture?).
So my question is: how do Metal and OpenGL synchronize? (The kind of CPU-side wait I have in mind is sketched at the end of this post.)
int counter = 0;
bool currentMetal = false;

- (void)draw:(id)sender
{
    [EAGLContext setCurrentContext:_context];
    counter++;
    counter = counter % 180;
    if (counter < 90)
    {
        bool waitForFinish = false;
        if (!currentMetal) // re-entry into "mixed rendering"
        {
            // re-create the Metal renderer
            _metalRenderer = nil;
            _metalRenderer = [[AAPLMetalRenderer alloc] initWithDevice:_metalDevice colorPixelFormat:AAPLOpenGLViewInteropPixelFormat];
            [_metalRenderer useTextureFromFileAsBaseMap];
            [_metalRenderer resize:AAPLInteropTextureSize];
        }
        currentMetal = true;
        [_metalRenderer drawToInteropTexture:_interopTexture.metalTexture waitForFinish:waitForFinish];
        [_openGLRenderer draw];
    }
    else
    {
        [_metalRenderer justUpdate]; // no Metal rendering
        [_openGLRenderer justClear]; // just clear OpenGL's FBO
        currentMetal = false;
    }
    glBindRenderbuffer(GL_RENDERBUFFER, _colorRenderbuffer);
    [_context presentRenderbuffer:GL_RENDERBUFFER];
}
The _openGLRenderer justClear method is below:
- (void)justClear
{
    glBindFramebuffer(GL_FRAMEBUFFER, _defaultFBOName);
    glClearColor(0.5, 0.5, 0.5, 1);
    glClear(GL_COLOR_BUFFER_BIT);
}
The _metalRenderer justUpdate method is below:
- (void)justUpdate
{
    [self updateState];
}
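For reference, the kind of CPU-side wait I have in mind, as a sketch. It assumes drawToInteropTexture commits a Metal command buffer internally (this is presumably what its waitForFinish: parameter controls; I pass false above), and _commandQueue plus the encoding step are placeholders for whatever the sample actually does:

// Sketch of a CPU-side fence between Metal and OpenGL.
- (void)drawToInteropTextureThenWait
{
    id<MTLCommandBuffer> commandBuffer = [_commandQueue commandBuffer];
    // ... encode the render pass that renders into _interopTexture.metalTexture ...
    [commandBuffer commit];
    // Block the CPU until Metal has finished writing the texture, so the
    // OpenGL draw issued afterwards samples the updated interop texture.
    [commandBuffer waitUntilCompleted];
}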