Training a model without a GPU

Matburnx@sh.itjust.works · 11 months ago

rufus@discuss.tchncs.de · 11 months ago (edited)

How did you determine the dataset size? If it’s just a few megabytes of French books, I’m not surprised you don’t get any results out of that. It also depends on how you feed the data in and on the training parameters and model architecture you choose. There are several scientific papers on, for example, how much training data you need for a given parameter count.
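To get a feel for the numbers, here’s a minimal back-of-the-envelope sketch. It assumes two things that are not from your setup: roughly 4 bytes of text per token (a loose figure for subword tokenizers, French with your tokenizer may differ) and about 20 training tokens per parameter, which is the rule of thumb usually quoted from the Chinchilla scaling-law paper (Hoffmann et al., 2022). The function names are made up for illustration.

```python
def estimate_tokens(corpus_bytes: int, bytes_per_token: float = 4.0) -> int:
    """Rough token count for a plain-text corpus.

    bytes_per_token is an assumption; measure it with your own tokenizer.
    """
    return int(corpus_bytes / bytes_per_token)


def suggested_max_params(
    corpus_bytes: int,
    tokens_per_param: float = 20.0,  # assumed rule-of-thumb ratio
    bytes_per_token: float = 4.0,    # assumed, see estimate_tokens
) -> int:
    """Largest parameter count the corpus can roughly support under that ratio."""
    return int(estimate_tokens(corpus_bytes, bytes_per_token) / tokens_per_param)


if __name__ == "__main__":
    corpus = 5 * 1024 * 1024  # hypothetical: 5 MiB of text
    print("tokens:", estimate_tokens(corpus))
    print("params:", suggested_max_params(corpus))
```

With those assumptions, 5 MiB of text comes out at about 1.3 million tokens, which would only support a model in the tens of thousands of parameters. Treat it as an order-of-magnitude check, not a rule.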

Once you’ve settled on a suitable dataset size, have a look at your loss curves. Do they converge? Did you run training long enough? I’d expect it to take weeks (to months?) on an (old) laptop CPU before you see any results, even at that model size.
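If you log the loss per step, a rough sketch like the one below can tell you whether it has flattened out and give a ballpark for training time. The plateau check just compares the mean of the last window of losses with the window before it. The time estimate uses the common approximation of about 6 FLOPs per parameter per training token; the sustained FLOP/s of your CPU is something you’d have to measure yourself, and all names, window sizes and thresholds here are assumptions for illustration.

```python
def stopped_improving(losses, window=100, rel_tol=0.01):
    """True if the mean loss improved by less than rel_tol between the
    previous window and the most recent window (also True if it got worse)."""
    if len(losses) < 2 * window:
        return False  # not enough history to judge
    prev = sum(losses[-2 * window:-window]) / window
    last = sum(losses[-window:]) / window
    if prev <= 0:
        return True  # nothing left to improve in relative terms
    return (prev - last) / prev < rel_tol


def estimate_training_days(n_params, n_tokens, flops_per_sec):
    """Ballpark wall-clock days, using ~6 FLOPs per parameter per token.

    flops_per_sec is the *sustained* throughput you actually get, which is
    usually well below the CPU's advertised peak.
    """
    total_flops = 6 * n_params * n_tokens
    return total_flops / flops_per_sec / 86400  # 86400 seconds per day


if __name__ == "__main__":
    # Hypothetical numbers: 1M parameters, 20M tokens, 1 GFLOP/s sustained.
    print(estimate_training_days(1_000_000, 20_000_000, 1e9))
    print(stopped_improving([2.0] * 300))
```

If the check reports no improvement while the loss is still high, that points more towards data or hyperparameters than towards training time.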