Monday, September 18, 2023

Finetuning

Finetuning in deep learning can be thought of as retuning an already calibrated instrument so it better suits new data. Consider a simple example: a neural network with a single parameter that predicts people's heights under mean squared error (MSE) loss. For this model, the best possible prediction is the average height. Suppose you train it on a vast dataset; you now have a pre-trained model that has "seen" many examples and has effectively learned the average height of that population.

Now imagine introducing a new dataset. If you train the model solely on the new data, it forgets the old data: the learned average can shift drastically if the new population's heights differ from the old one's. If instead you incorporate knowledge from both datasets, your prediction lands somewhere between the two averages. From a Bayesian standpoint, the ideal balance weights the old and new datasets in proportion to their sizes, so if the new dataset has more data points, it influences the prediction more heavily. On the other hand, if your primary interest is the new dataset, you might prioritize it and disregard the old data entirely.

This is the core challenge of finetuning: how do you balance retaining useful knowledge from prior data against adapting to new information? The question becomes crucial as models venture into new domains or deal with evolving data streams.
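Here is a minimal sketch of the one-parameter example. The dataset sizes, means, learning rate, and step count are all made-up illustrative values, not anything from a real experiment. It shows that finetuning on the new data alone drags the parameter toward the new mean (the old data is forgotten), while a size-weighted average of the two means reflects the Bayesian-style balance described above.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical populations: heights (cm) for an "old" and a "new" dataset.
old_heights = rng.normal(170.0, 7.0, size=10_000)   # large pretraining set
new_heights = rng.normal(180.0, 7.0, size=500)      # smaller finetuning set

def finetune(theta, data, lr=0.01, steps=200):
    """Gradient descent on the MSE loss L(theta) = mean((data - theta)**2)."""
    for _ in range(steps):
        grad = 2.0 * (theta - data.mean())   # dL/dtheta
        theta -= lr * grad
    return theta

theta_pretrained = old_heights.mean()                 # "pretraining" = fit the old data
theta_finetuned = finetune(theta_pretrained, new_heights)

# Bayesian-style balance: weight each dataset's mean by its size.
n_old, n_new = len(old_heights), len(new_heights)
theta_weighted = (n_old * old_heights.mean() + n_new * new_heights.mean()) / (n_old + n_new)

print(f"pretrained (old mean):       {theta_pretrained:.2f}")
print(f"finetuned (drifts to new):   {theta_finetuned:.2f}")
print(f"size-weighted combination:   {theta_weighted:.2f}")

Running this prints a pretrained value near 170, a finetuned value near 180, and a weighted combination close to 170 because the old dataset is twenty times larger, which is exactly the tension the paragraph describes.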


AI

Despite the benefits of AI, we are starving for humanity.