getalp/Flaubert

RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 237414383616 bytes. Error code 12 (Cannot allocate memory)

keloemma opened this issue · 3 comments

Environment info

  • transformers version: 2.5.1
  • Platform: linux
  • Python version: 3.7
  • PyTorch version (GPU?): 1.4
  • Tensorflow version (GPU?):
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help

Model I am using (FlauBERT):

The problem arises when trying to produce features with the model; the generated output causes the system to run out of memory.

  • the official example scripts: (I did not change much; it is pretty close to the original)
import torch
from transformers import FlaubertModel, FlaubertTokenizer
# Choose among ['flaubert/flaubert_small_cased', 'flaubert/flaubert_base_uncased', 
#               'flaubert/flaubert_base_cased', 'flaubert/flaubert_large_cased']
modelname = 'flaubert/flaubert_base_cased' 

# Load pretrained model and tokenizer
flaubert, log = FlaubertModel.from_pretrained(modelname, output_loading_info=True)
flaubert_tokenizer = FlaubertTokenizer.from_pretrained(modelname, do_lowercase=False)
# do_lowercase=False if using cased models, True if using uncased ones

sentence = "Le chat mange une pomme."
token_ids = torch.tensor([flaubert_tokenizer.encode(sentence)])

last_layer = flaubert(token_ids)[0]
print(last_layer.shape)
# torch.Size([1, 8, 768])  -> (batch size x number of tokens x embedding dimension)

# The BERT [CLS] token corresponds to the first hidden state of the last layer
cls_embedding = last_layer[:, 0, :]
  • My own modified scripts: (give details below)
import numpy as np
import torch
from transformers import FlaubertModel, FlaubertTokenizer

def get_flaubert_layer(texte):

    modelname = "flaubert-base-uncased"
    path = './flau/flaubert-base-unc/'

    flaubert = FlaubertModel.from_pretrained(path)
    flaubert_tokenizer = FlaubertTokenizer.from_pretrained(path)
    # Tokenize every sentence in the pandas Series, truncating to 512 tokens
    tokenized = texte.apply(lambda x: flaubert_tokenizer.encode(x, add_special_tokens=True, max_length=512))
    # Pad all sequences to the length of the longest one
    max_len = 0
    for i in tokenized.values:
        if len(i) > max_len:
            max_len = len(i)
    padded = np.array([i + [0] * (max_len - len(i)) for i in tokenized.values])
    token_ids = torch.tensor(padded)
    with torch.no_grad():
        # Keep only the hidden state of the first token of the last layer for each sentence
        last_layer = flaubert(token_ids)[0][:, 0, :].numpy()

    return last_layer, modelname

The task I am working on is:

  • Producing vectors/features from a language model and passing them to other classifiers

To reproduce

Steps to reproduce the behavior:

  1. Install the transformers library, along with scikit-learn, pandas, numpy, and PyTorch
  2. Run the last lines of code below
import os
import pandas as pd

# Reading the file
filename = "corpus"
sentences = pd.read_excel(os.path.join(root, filename + ".xlsx"), sheet_name=0)
data_id = sentences.identifiant
print("Total phrases: ", len(data_id))
data = sentences.sent
label = sentences.etiquette
emb, mdlname = get_flaubert_layer(data)  # corpus is a dataframe of approximately 40 000 lines

Apparently this line produces something huge which takes a lot of memory:
last_layer = flaubert(token_ids)[0][:,0,:].numpy()

I would have expected it to run, but I think passing the whole dataset to the model at once is causing the system to break. I wanted to know whether it is possible to tell the model to process the dataset maybe 500 or 1,000 lines at a time instead of passing everything in one go. I know there is a batch_size parameter, but I am not training a model, merely using it to produce embeddings as input for other classifiers.
Do you perhaps know how to modify the batch size so that the whole dataset is not processed at once? I am not really familiar with this type of architecture. In the example they just encode one single sentence, but in my case I load a whole dataset (dataframe).

My expectation is for the model to process all the sentences and produce the vectors I need for the classification task.

I found the solution.

Could you indicate what the problem was? (For people who might run into the same problem.) Thanks in advance.

It was a problem linked to insufficient memory when running the model on the whole dataset at once. I passed small batches to the model to avoid this error: I created a loop over i in range(0, len(padded), batch_size), passed padded[i:i + batch_size] to the model, and then concatenated the predictions back together.
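
For reference, here is a minimal sketch of that batching loop (my own illustration, not code from the issue), reusing the padded array and flaubert model from the snippet above; batch_size = 64 is an arbitrary value to adjust to the available memory:

import numpy as np
import torch

batch_size = 64  # arbitrary choice; pick a value that fits in memory
features = []
for i in range(0, len(padded), batch_size):
    # Take one slice of the padded token matrix at a time
    batch = torch.tensor(padded[i:i + batch_size])
    with torch.no_grad():
        # First hidden state of the last layer for each sentence in the batch
        out = flaubert(batch)[0][:, 0, :].numpy()
    features.append(out)

# Concatenate the per-batch embeddings back into a single array
emb = np.concatenate(features, axis=0)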