It seems that Apple Silicon devices (even M1/M2 Max) can only use a certain percentage of their total unified memory for GPU/Metal work. The limit appears to be tied to Metal's recommendedMaxWorkingSetSize.
This is quite odd, because even M1 Mac minis and MacBook Airs run totally fine with 8GB of total memory shared between the OS and the GPU, so why limit this in the first place?
It also seems like false advertising to me that Apple does not clearly state this limitation.
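For anyone who wants to check the limit on their own machine, here is a minimal sketch in Swift. It only reads two values that I know exist, MTLDevice.recommendedMaxWorkingSetSize and ProcessInfo.physicalMemory, and prints their ratio. The helper name gpuShare is my own and purely for illustration, and I am assuming that comparing against physical memory is a fair way to express the "percentage" described above.

```swift
import Foundation
#if canImport(Metal)
import Metal
#endif

/// Hypothetical helper: fraction of physical memory that Metal
/// recommends as the maximum GPU working set.
func gpuShare(workingSet: UInt64, physical: UInt64) -> Double {
    guard physical > 0 else { return 0 }
    return Double(workingSet) / Double(physical)
}

#if canImport(Metal)
if let device = MTLCreateSystemDefaultDevice() {
    // Both values are reported in bytes.
    let limit = device.recommendedMaxWorkingSetSize
    let total = ProcessInfo.processInfo.physicalMemory
    let gib = 1024.0 * 1024.0 * 1024.0

    print("Device: \(device.name)")
    print(String(format: "Physical memory: %.1f GiB", Double(total) / gib))
    print(String(format: "recommendedMaxWorkingSetSize: %.1f GiB", Double(limit) / gib))
    print(String(format: "Share usable by GPU: %.0f%%",
                 gpuShare(workingSet: limit, physical: total) * 100))
} else {
    print("No Metal device found")
}
#endif
```

Saved as main.swift, this should run on a Mac with `swift main.swift`. It only reports the recommended value; it does not change it.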
I am asking this with regard to the following open source project (though of course other software will be affected by the same limitation): https://github.com/ggerganov/llama.cpp/pull/1826
Another resource I've found: https://developer.apple.com/videos/play/tech-talks/10580/?time=546
If anyone has any ideas on how this limitation can be overcome and how to get apps to use more memory for the GPU (Metal), I (and the open source community) would be truly grateful! Thanks in advance!