Date: 2024-01-21

Researchers from the University of Washington and the Allen Institute for AI have introduced proxy-tuning, a method for adapting large language models without modifying their internal weights. It is particularly useful for models whose weights are private, such as OpenAI's GPT-4, where direct fine-tuning is either infeasible or prohibitively resource-intensive.

The Principle of Proxy-Tuning

Proxy-tuning works at decoding time. It compares the predictions of a smaller, fine-tuned language model with those of the same small model before tuning, then shifts the large base model's output predictions in the direction of that difference, simulating the effect of tuning the large model directly. Because the base model's weights are never touched, the customized model retains the benefits of its extensive pre-training.
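
The shift described above can be sketched as simple arithmetic on next-token scores (logits). The snippet below is a minimal illustration, not the authors' implementation: the function names, the four-token vocabulary, and every number are made up for the example, and it assumes all three models share the same vocabulary so their scores line up token by token.

```python
import math


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def proxy_tuned_logits(base, tuned_small, untuned_small):
    """Shift the base model's logits by (tuned small - untuned small).

    Hypothetical helper: each argument is one decoding step's logits
    over the same vocabulary.
    """
    return [b + (t - u) for b, t, u in zip(base, tuned_small, untuned_small)]


# Toy logits over a 4-token vocabulary (illustrative values only).
base_logits = [2.0, 1.0, 0.5, 0.0]           # large, untuned base model
tuned_small_logits = [0.5, 2.0, 0.0, 0.0]    # small fine-tuned model
untuned_small_logits = [1.0, 1.0, 0.0, 0.0]  # same small model, untuned

shifted = proxy_tuned_logits(base_logits, tuned_small_logits, untuned_small_logits)
probs = softmax(shifted)

# Fine-tuning raised the small model's score for token 1 and lowered it
# for token 0, so the base model's preference moves the same way.
print(shifted)
print(probs.index(max(probs)))
```

In a real decoder this step would be repeated for every generated token, with each model conditioned on the same prefix. Only the base model's output scores are needed, not its weights.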

Proxy-Tuning in Practice

In the reported experiments, proxy-tuning substantially improved model performance. With a 70B-parameter base model, it achieved an 88.0% win rate on AlpacaFarm and 32.0% on GSM. It also reduced toxicity to 0% on the Toxigen dataset and surpassed the directly tuned chat models in truthfulness in TruthfulQA's open-ended setting.

Closing the Performance Gap

Proxy-tuning closed 91.1% of the performance gap between the base model and its directly tuned chat version at the 13B scale, and 88.1% at the 70B scale, showing that a model's behavior can be steered effectively without access to its internal parameters. The researchers encourage organizations that produce models to expose output probabilities so that methods like this can be used more widely.
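
One natural reading of "gap closed" is the share of the distance from the untuned base model to the directly tuned model that proxy-tuning recovers. The sketch below assumes that definition. The function name and the scores are hypothetical and are not taken from the reported results.

```python
def gap_closed(base_score, proxy_score, tuned_score):
    """Percentage of the base-to-tuned gap recovered by the proxy-tuned model.

    Assumed definition: 100 * (proxy - base) / (tuned - base).
    """
    gap = tuned_score - base_score
    if gap == 0:
        raise ValueError("base and tuned scores are equal; the gap is undefined")
    return 100.0 * (proxy_score - base_score) / gap


# Hypothetical scores on one benchmark (not figures from the study).
example = gap_closed(base_score=20.0, proxy_score=38.0, tuned_score=40.0)
print(example)  # 90.0
```

Under this reading, a value of 100% would mean proxy-tuning matched the directly tuned model, and 0% would mean it did no better than the untuned base model.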

Separately, work on Self-Rewarding Language Models explores having the language model provide its own rewards during training. The reported results show that this approach improves both the model's instruction-following ability and its ability to assign high-quality rewards to its own outputs. The resulting fine-tuned model outperforms many existing systems on the AlpacaEval 2.0 leaderboard. The work points toward models that can keep improving along both axes, a notable step toward recursive self-improvement.