I'm on tensorflow-macos 2.9 and tensorflow-metal 0.5, M2 Max 96GB.
I ran into this issue using the Hugging Face DistilBERT model to train on my own dataset. My batch size is just 128 (less than the 512 you reported, but the impact probably depends on the model).
I suspect this may be a memory issue (or memory mismanagement/misalignment caused by framework bugs). I will try reducing the batch size and see if that improves things.
But even so, this would be quite a disappointment, since I got 96GB specifically so I could push the batch size up in my local environment.
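For reference, this is roughly what I plan to try, just passing a smaller batch to model.fit (a rough sketch; train_encodings and train_labels are placeholders for my own tokenized data and labels):
import tensorflow as tf
from transformers import TFDistilBertForSequenceClassification

model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-cased')
model.compile(optimizer=tf.keras.optimizers.Adam(3e-5),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# train_encodings / train_labels stand in for my own tokenized dataset
train_ds = tf.data.Dataset.from_tensor_slices((dict(train_encodings), train_labels))
model.fit(train_ds.batch(32), epochs=3)  # try 32 instead of 128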
I found out this has something to do with the variation in input token length from one inference to the next. The GPU doesn't seem to like receiving lengths that vary greatly; maybe this causes some sort of fragmentation in GPU memory? Here's code that only extracts the IMDB sentences that have >= 512 tokens, and with those it is able to sustain GPU utilization at ~30 it/s.
import numpy as np
from tqdm import tqdm
from transformers import AutoTokenizer, TFDistilBertForSequenceClassification
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")
model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-cased')

imdb = load_dataset('imdb')

# keep only reviews that hit the 512-token cap, so every input has the same length
print('starting collecting sentences with tokens >= 512')
sentences = [sentence for sentence in imdb['train']['text']
             if tokenizer(sentence, truncation=True, return_tensors='tf')['input_ids'].shape[-1] >= 512]
print('finished collecting sentences with tokens >= 512')

for k, sentence in tqdm(enumerate(sentences)):
    inputs = tokenizer(sentence, truncation=True, return_tensors='tf')
    output = model(inputs).logits
    pred = np.argmax(output.numpy(), axis=1)
    if k % 100 == 0:
        print(f"len(input_ids): {inputs['input_ids'].shape[-1]}")
Output:
7it [00:00, 31.12it/s]
len(input_ids): 512
107it [00:03, 32.38it/s]
len(input_ids): 512
...
...
3804it [02:00, 31.85it/s]
len(input_ids): 512
3904it [02:03, 32.50it/s]
len(input_ids): 512
3946it [02:04, 31.70it/s]
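If variable lengths really are the trigger, the other thing I want to try is padding every input to a fixed 512 tokens so the GPU always sees the same shape (untested sketch, reusing the tokenizer, model, and sentences from the code above):
for sentence in tqdm(sentences):
    # padding='max_length' keeps the input shape constant at (1, 512)
    inputs = tokenizer(sentence, truncation=True, padding='max_length',
                       max_length=512, return_tensors='tf')
    output = model(inputs).logits
    pred = np.argmax(output.numpy(), axis=1)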
Got an M2 Max here (2023). I tried to run inference (one sample at a time, no batching) using the Hugging Face "distilbert-base-cased" model after fine-tuning it on my dataset. It runs at ~10 it/s in the beginning, but after a few minutes GPU utilization drops to less than 1%, and it now takes more than 1 s per iteration. That's a huge disappointment, and I don't know what I have done wrong. I tried turning on an external fan, thinking it might be thermal throttling, but utilization never goes back up.
How can I debug this?
Were you able to follow up with the "Frameworks Engineer" and get this issue resolved?
As of March 2023, I was advised to use tensorflow-macos 2.9.0 and tensorflow-metal 0.5.
This seems to have no such error:
import tensorflow as tf
from tensorflow.keras.layers import RandomFlip, RandomRotation

data_augmentation = tf.keras.Sequential([
    RandomFlip("horizontal"),
    RandomRotation(0.1),
])
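For anyone who wants to reproduce, here's a quick way to exercise those layers on the GPU (the 128x224x224x3 batch shape is just an arbitrary example):
# random images, just to confirm the augmentation layers run without the slowdown
images = tf.random.uniform((128, 224, 224, 3))
augmented = data_augmentation(images, training=True)
print(augmented.shape)  # (128, 224, 224, 3)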