Sample code for MPSMatrixDecompositionCholesky?

Question

I am trying to get my code to use MPSMatrixDecompositionCholesky correctly but it is giving me an incorrect matrix result.

Any help would be greatly appreciated!

let device = MTLCreateSystemDefaultDevice()!

let commandQueue = device.makeCommandQueue()

let M = 2

let row = M * MemoryLayout<Float>.stride

let matlength = M * row

let mdesc = MPSMatrixDescriptor(

dimensions: M, columns: M, rowBytes: row, dataType: MPSDataType.float32)

let arrayA: [Float] = [ 2.0 , 1.0 , 1.0 , 2.0 ]

let buffA = device.makeBuffer(bytes: arrayA, length: matlength)

let matA = MPSMatrix(buffer: buffA!, descriptor: mdesc)

let arrayB: [Float] = [ 1.0 , 0.0 , 0.0 , 0.0 ]

let buffB = device.makeBuffer(bytes: arrayB, length: matlength)

let matB = MPSMatrix(buffer: buffB!, descriptor: mdesc)

let arrayX: [Float] = [ 0.0 , 0.0 , 0.0 , 0.0 ]

let buffX = device.makeBuffer(bytes: arrayX, length: matlength)

let matX = MPSMatrix(buffer: buffX!, descriptor: mdesc)

let arrayL: [Float] = [ 0.0 , 0.0 , 0.0 , 0.0 ]

let buffL = device.makeBuffer(bytes: arrayL, length: matlength)

let matL = MPSMatrix(buffer: buffL!, descriptor: mdesc)

let decomp = MPSMatrixDecompositionCholesky(device: device, lower: true, order: M )

let commandBuffer = commandQueue?.makeCommandBuffer()

decomp.encode(commandBuffer: commandBuffer!,

sourceMatrix: matA, resultMatrix: matL, status: buffX)

commandBuffer?.commit()

let rawPointerL = matL.data.contents()

let countL = matL.rows * matL.columns

let typedPointerL = rawPointerL.bindMemory(to: Float.self, capacity: countL)

let bufferedPointerL = UnsafeBufferPointer(start: typedPointerL, count: countL)

print(" ")

print(" ",bufferedPointerL[0], bufferedPointerL[1])

print(" ",bufferedPointerL[2], bufferedPointerL[3])

print(" ")

Metal

4k

Posted by

richardson_bay

Reply

Add a Comment

Answer 1

Just for fun I tried your code and it also gives the wrong output on my iMac running 10.13.3. It actually gives the exact same output whether padding or not is used. However, I noticed that the MPSMatrixDecompositionStatus result is -3, which is nonPositiveDefinite.

However, on my iPhone 7 running iOS 11.2 the status code is 0 and the output is:

1.41421  1.0
0.707107  1.22474

which does not seem correct because of that 1.0 in the upper-right corner (which should be a 0).

Posted by

kerfuffle

Add a Comment

Answer 2

Non positive definite? That is strange. That matrix has a positive determinant and the trace is positive.

I get the same answer on my MacBook Pro as you have found but not on my iMac.

I am running osMac 10.13.3 on all my machines and they are all 2017 'vintage'.

Just don't understand what I am doing wrong.

Still, I will assume I am incorrectly entering the matrix to the buffer and keep searching.

Posted by

richardson_bay

Add a Comment

Answer 3

According to the API description, the result is only written to the lower or uppor triangular portion of the result.

It looks like you're creating the array correctly for the given row byte padding.

Can you please file a bug report with your project and note which devices you've observed the failure on?

Posted by

jvoo

Add a Comment

Answer 4

Ok. Will try to do that.

Posted by

richardson_bay

Add a Comment

Answer 5

Here is the code I am running which yields these results:

1.41421 1.0

0.707107 1.22474

I don't get good results with MPSMatrixDescriptor.rowBytes

the results above employ MemoryLayout.stride

If I used the above results in a strictly mathematical matrix operation of L* L transposed it would give me the

incorrect A matrix, but if I 'filter' out the upper off diagonal and replace it by zero it would give me the correct answer for A.

My real issue is that my iMac doesn't give the same answer as my MacBook Pro does with the code below. (But I'll have to check that again.)

import MetalPerformanceShaders

let device = MTLCreateSystemDefaultDevice()!

let commandQueue = device.makeCommandQueue()

let M = 2

let row = M * MemoryLayout<Float>.stride

let mdesc = MPSMatrixDescriptor(dimensions: M, columns: M, rowBytes: row, dataType: MPSDataType.float32)

let matlength = M * row

let arrayA: [Float] = [ 2 , 1 , 1 , 2 ]

let buffA = device.makeBuffer(bytes: arrayA, length: matlength)

let matA = MPSMatrix(buffer: buffA!, descriptor: mdesc)

let arrayL: [Float] = [ 0 , 0 , 0 , 0 ]

let buffL = device.makeBuffer(bytes: arrayL, length: matlength)

let matL = MPSMatrix(buffer: buffL!, descriptor: mdesc)

let decomp = MPSMatrixDecompositionCholesky(device: device, lower: true, order: M )

let commandBuffer = commandQueue?.makeCommandBuffer()

let buff = device.makeBuffer(length: matlength)

decomp.encode(commandBuffer: commandBuffer!,

sourceMatrix: matA, resultMatrix: matL, status: buff)

commandBuffer?.commit()

commandBuffer?.waitUntilCompleted()

let rawPointerL = matL.data.contents()

let countL = matL.rows * matL.columns

let typedPointerL = rawPointerL.bindMemory(to: Float.self, capacity: countL)

let bufferedPointerL = UnsafeBufferPointer(start: typedPointerL, count: countL)

print(" ")

print(" ",bufferedPointerL[0], bufferedPointerL[1])

print(" ",bufferedPointerL[2], bufferedPointerL[3])

print(" ")

Posted by

richardson_bay

Add a Comment

Answer 6

I have filed a report.

Posted by

richardson_bay

Add a Comment

Answer 7

The best result I found (ignoring the value of 1.0 in the upper diagonal) was without the padding and when I used only MemoryLayout.stride rather than MPSMatrixDescriptor.rowBytes. Of course, I am using the smallest matrix size so it may work otherwise with different size matrices, different coding, and/or different osMac or iOS configurations.

Anyway with the current results, I will filter out the upper quadrant and attempt to pass it into MPSMatrixSolveCholesky.

Posted by

richardson_bay

Add a Comment

Answer 8

I replaced the 1.0 in the upper off diagonal of the Cholesky lower decomposition with a zero and ran the MPSMatrixSolveCholesky. The solution is correct for a left hand side B matrix of:

1 0

0 0

The solution for X is:

2/3 0

-1/3 0

Guess what? If you just use the lower decomposition without the zero replacement ... or with the 1.0 you will get the same result!

In fact, you can put any number in the upper off diagonal and it will give you the correct result!

Apple just ignores the upper off diagonal. I should have thought of this before. (But if you were to use this 'Apple decomposition' in matrix multiplication I think you would need those zeros in the upper off diagonal.)

Still, haven't gotten the correct answer on my iMac ... only the MacBook Pro ... so far.

-----------------------------------------

// Metal_LA : Cholesky factorization and solver test

// Metal Linear Algebra solver via Cholesky factorization

// This code solves: A * X = B for a 2 x 2 test matrix

// where A is M x M, X is M x M, and B is M x M and M = 2

import MetalPerformanceShaders

let device = MTLCreateSystemDefaultDevice()!

let commandQueue = device.makeCommandQueue()

// memory layout and matrix descriptor

let M = 2

let row = M * MemoryLayout<Float>.stride

let matlength = M * row

let mdesc = MPSMatrixDescriptor(

dimensions: M, columns: M, rowBytes: row, dataType: MPSDataType.float32)

// assign arrays

let arrayA: [Float] = [ 2.0 , 1.0 , 1.0 , 2.0 ]

let buffA = device.makeBuffer(bytes: arrayA, length: matlength)

let matA = MPSMatrix(buffer: buffA!, descriptor: mdesc)

let arrayB: [Float] = [ 1.0 , 0.0 , 0.0 , 0.0 ]

let buffB = device.makeBuffer(bytes: arrayB, length: matlength)

let matB = MPSMatrix(buffer: buffB!, descriptor: mdesc)

let arrayX: [Float] = [ 0.0 , 0.0 , 0.0 , 0.0 ]

let buffX = device.makeBuffer(bytes: arrayX, length: matlength)

let matX = MPSMatrix(buffer: buffX!, descriptor: mdesc)

var arrayL: [Float] = [ 0.0 , 0.0 , 0.0 , 0.0 ]

var buffL = device.makeBuffer(bytes: arrayL, length: matlength)

var matL = MPSMatrix(buffer: buffL!, descriptor: mdesc)

// Create the device 'kernel' to perform decomposition.

let decomp = MPSMatrixDecompositionCholesky(device: device, lower: true, order: M )

// Create a command buffer in the queue.

let commandBuffer = commandQueue?.makeCommandBuffer()

// Encode the kernel to the command buffer.

decomp.encode(commandBuffer: commandBuffer!,

sourceMatrix: matA, resultMatrix: matL, status: buffX)

// Create the device 'kernel' to perform the Cholesky solution.

let solve = MPSMatrixSolveCholesky(device: device,

upper: false, order: M, numberOfRightHandSides: 1 )

// Encode the kernel to the command buffer.

solve.encode(commandBuffer: commandBuffer!,

sourceMatrix: matL, rightHandSideMatrix: matB, solutionMatrix: matX)

// Commit the buffer and wait for completion.

commandBuffer?.commit()

commandBuffer?.waitUntilCompleted()

// bind Metal matrices to pointers for printing

// all matrices are the same size

let count = matA.rows * matA.columns

let rawPointerA = matA.data.contents()

let typedPointerA = rawPointerA.bindMemory(to: Float.self, capacity: count)

let bufferA = UnsafeBufferPointer(start: typedPointerA, count: count)

print(" ")

print(" Solutions to: A * X = B ")

print(" ")

print(" A matrix")

print(" ",bufferA[0]," ",bufferA[1])

print(" ",bufferA[2]," ",bufferA[3])

print(" ")

let rawPointerB = matB.data.contents()

let typedPointerB = rawPointerB.bindMemory(to: Float.self, capacity: count)

let bufferB = UnsafeBufferPointer(start: typedPointerB, count: count)

print(" B matrix")

print(" ",bufferB[0])

print(" ",bufferB[2])

print(" ")

let rawPointerL = matL.data.contents()

let typedPointerL = rawPointerL.bindMemory(to: Float.self, capacity: count)

let bufferL = UnsafeBufferPointer(start: typedPointerL, count: count)

print(" Lower decomposition from MPSMatrixDecompositionCholesky ")

print(" ",bufferL[0]," ", bufferL[1])

print(" ",bufferL[2]," ", bufferL[3])

print(" ")

let rawPointerX = matX.data.contents()

let typedPointerX = rawPointerX.bindMemory(to: Float.self, capacity: count)

let bufferX = UnsafeBufferPointer(start: typedPointerX, count: count)

print(" Solutions from MPSMatrixSolveCholesky ")

print(" ",bufferX[0])

print(" ",bufferX[2])

print(" ")

Posted by

richardson_bay

Add a Comment

Answer 9

> I don't get good results with MPSMatrixDescriptor.rowBytes

In case it wasn't clear, you'd also need to do this on the output matrix, so if the input buffer comes [ x, x, 0, 0, x, x, 0, 0 ] where x is real numbers and 0 is padding, then the output buffer is also [ x, x, 0, 0, x, x, 0, 0 ] and not [ x, x, x, x ]. In any case, it's an optional thing to do anyway and it doesn't change the (wrong) answer given by MPSMatrixDecompositionCholesky.

Posted by

kerfuffle

Add a Comment

Answer 10

Thank you. I'll make sure I try to set that up correctly. I tried posting my lastest code including the MPSMatrixSolveCholesky solution but it says it is being moderated. I don't know what else further I can post on this thread, but I'll try to clear up any issues with the differences of operation on devices or machines and post results. I am mostly concerned that I am not coding things correctly on my part, however, I did submit a bug report.

Posted by

richardson_bay

Add a Comment

Answer 11

I got back to my iMac and I checked it one last time. It doesn't seem to work for my simple 2 x 2 matrix with my coding.

But it does work on my MacBook Pro.

I then changed the size of the matrix to a lucky 13 x 13 matrix. It worked perfectly on the MacBook Pro but not the iMac.

So it consistently fails on the iMac if the matrix size is changed. Again, this is with my coding ... so that could be the incorrect. I will look at how I load the data into the buffers again.

Posted by

richardson_bay

Add a Comment

Answer 12

I added a couple of example files to the bug report. It is strictly a problem with the MPSMatrixDecompositionCholesky on the iMac (I am running it on a 2017 iMac). When I input the correct lower decomposition matrix for the MPSMatrixSolveCholesky (bypassing the MPSMatrixDecompositionCholesky) the iMac returns the correct solution matrix.

Posted by

richardson_bay

Add a Comment

Answer 13

Can you tell us the GPU in your iMac and the GPU in your MacBook Pro?

Posted by

amunshi

Add a Comment

Answer 14

The Intel Iris Pro 650 in the 2017 MacBook Pro.

The Radeon 555 in the 2017 iMac.

Something happens with the calculation on the 555. At first it seems to give the same answer as with 650 ... for the first row ... but then the second row of the decomposition doesn't make sense: inf -inf

Wait, I just had a thought ... I've been performing only the lower decomposition ... maybe I should try the upper?

Posted by

richardson_bay

Add a Comment

Answer 15

Nope. Doesn't give the correct answer whether setting the lower or the upper decomposition. I keep thinking I am not loading the buffer correctly for the 555 compared to the 650. Maybe it is these trivially small matrices I am using! I haven't tried the code on the iPhone but kerfuffle (above) says it works on the iPhone 7 with iOS 11.2 .

Posted by

richardson_bay

Add a Comment

Sample code for MPSMatrixDecompositionCholesky?

Replies