It turns out Metal still supports a very straightforward way to access elements that are not 2-byte aligned.
Define the element layout first:
typedef struct __attribute__((__packed__))
{
    unsigned char r;
    unsigned char g;
    unsigned char b;
} TryRGB888;
Then the compute kernel works like a charm:
_tryBufferSize = _inputTexture.width * _inputTexture.height * 3; // CPU side memory alloc size, *3 for BGR888
...
device TryRGB888 *outBuffer [[buffer(0)]], // shader side
...
uint index = gid.y * inTexture.get_width() + gid.x;
outBuffer[index].r = 1;
outBuffer[index].g = 1;
outBuffer[index].b = 1;
I was having a very similar problem. I was trying to use Metal to convert from BGRA8Unorm format to a compact BGR888 format.
I created an MTLBuffer for compact BGR888:
_tryBufferSize = _inputTexture.width * _inputTexture.height * 3; // note *3 for BGR888
size_t aligned_size = ((_tryBufferSize + 4095) / 4096) * 4096;
int ret = posix_memalign(&_tryBufferRaw, 4096, aligned_size);
NSAssert(ret == 0 && _tryBufferRaw != NULL, @"posix_memalign failed.");
_tryBuffer = [_device newBufferWithBytesNoCopy:_tryBufferRaw
length:aligned_size
options:MTLResourceStorageModeShared
deallocator:^(void * _Nonnull pointer, NSUInteger length) { free(pointer); }];
and ran the compute kernel (inTexture BGRA8888 -> outBuffer BGR888, reduced demo code):
kernel void
tryKernel(texture2d<half, access::read> inTexture [[texture(0)]],
          device uchar3 *outBuffer [[buffer(0)]],
          uint2 gid [[thread_position_in_grid]])
{
    if((gid.x >= inTexture.get_width()) || (gid.y >= inTexture.get_height()))
    {
        return;
    }
    uint index = gid.y * inTexture.get_width() + gid.x;
    outBuffer[index] = uchar3(1, 1, 1); // note uchar3 for BGR888
}
This code failed.
The following variant, however, runs fine:
_tryBufferSize = _inputTexture.width * _inputTexture.height * 4;
...
device uchar4 *outBuffer [[buffer(0)]],
...
outBuffer[index] = uchar4(1, 1, 1, 1); // uchar2 also worked fine.
Then I came upon this 4-year-old question.
The Metal Shading Language Specification (MSLS), section 2.2 Vector Data Types, specifies these sizes (Table 2.3):
uchar2 size 2B
uchar3 size 4B
uchar4 size 4B
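which is not what I expected: uchar3 occupies 4 bytes, not 3, so an array of uchar3 has a 4-byte stride and does not match a 3-bytes-per-pixel buffer.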