#sampling

Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models

arxiv.org/abs/2412.15287

Recent studies have indicated that effectively utilizing inference-time compute is crucial for attaining better performance from large language models (LLMs). In this work, we propose a novel inference-aware fine-tuning paradigm, in which the model is fine-tuned in a manner that directly optimizes the performance of the inference-time strategy. We study this paradigm using the simple yet effective Best-of-N (BoN) inference strategy, in which a verifier selects the best out of a set of LLM-generated responses. We devise the first imitation learning and reinforcement learning (RL) methods for BoN-aware fine-tuning, overcoming the challenging, non-differentiable argmax operator within BoN. We empirically demonstrate that our BoN-aware models implicitly learn a meta-strategy that interleaves best responses with more diverse responses that might be better suited to a test-time input, a process reminiscent of the exploration-exploitation trade-off in RL. Our experiments demonstrate the effectiveness of BoN-aware fine-tuning in terms of improved performance and inference-time compute. In particular, we show that our methods improve the Bo32 performance of Gemma 2B on Hendrycks MATH from 26.8% to 30.8%, and pass@32 from 60.0% to 67.0%, as well as the pass@16 on HumanEval from 61.6% to 67.1%.
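For readers new to the term, the Best-of-N inference step the abstract mentions (sample N responses, let a verifier pick the best) can be sketched roughly as follows. This is only an illustration of plain BoN selection, not the paper's fine-tuning method. The names `best_of_n`, `toy_generate` and `toy_verifier` are hypothetical stand-ins: a real setup would call an LLM and a learned or rule-based verifier instead.

```python
import random
from typing import Callable, Tuple


def best_of_n(
    prompt: str,
    generate: Callable[[str, random.Random], str],
    verifier: Callable[[str, str], float],
    n: int = 32,
    seed: int = 0,
) -> Tuple[str, float]:
    """Sample n candidate responses and return the one the verifier scores highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)
    # Draw n independent candidates from the (stochastic) generator.
    candidates = [generate(prompt, rng) for _ in range(n)]
    # Score each candidate; the argmax below is the non-differentiable step.
    scores = [verifier(prompt, c) for c in candidates]
    best_index = max(range(n), key=scores.__getitem__)
    return candidates[best_index], scores[best_index]


# Hypothetical toy stand-ins so the sketch runs without a model.
def toy_generate(prompt: str, rng: random.Random) -> str:
    """Pretend LLM: guesses an integer answer at random."""
    return str(rng.randint(0, 20))


def toy_verifier(prompt: str, response: str) -> float:
    """Pretend verifier: scores closeness to the correct answer, 12."""
    return -abs(int(response) - 12)


if __name__ == "__main__":
    for n in (1, 4, 32):
        response, score = best_of_n("What is 7 + 5?", toy_generate, toy_verifier, n=n)
        print(f"N={n:2d}  best response={response}  verifier score={score}")
```

With a fixed seed, the larger candidate pools contain the smaller ones, so the best verifier score can only stay the same or improve as N grows. That is the inference-time compute trade-off the abstract refers to.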

Favourite unusual real-life sounds used as a rhythmic element in music?

- Cough (6klop): 6klop-electronix.bandcamp.com/
- Running in gravel (Ez3kiel): ez3kiel.bandcamp.com/track/aki
- Table tennis (Flying Lotus and Laura Darlington): flyinglotus.bandcamp.com/track
- Basketball and shoe squeaks (Memory Tapes): youtube.com/watch?v=WacIBze4YT

What's yours? I adore those and want to find more. I also love randomly hearing music in the world when there isn't any.

Jet Set Radio

Year: 2000
By: Hideki Naganuma, Richard Jacques, Toronto

In the early 2000s, SEGA was in the middle of a creative golden age. Artsy experiments such as Rez were released alongside huge, innovative and expensive masterpieces such as Shenmue. At the same time, the studio was dying, plagued by the poor sales of both the Saturn and the Dreamcast and by terrible business practices. Jet Set Radio came out at this very peculiar moment in time. An arcade game about graffiti, youth and subculture, it popularized cel-shading (a technique that makes 3D graphics look like cartoons) and influenced video games for decades to come. Ever heard of Splatoon?

A big part of JSR’s attitude came from its crazy soundtrack, Hideki Naganuma’s very first work as lead composer. Taking inspiration from funk and big beat, and making use of advanced sampling techniques, he made a soundtrack that sounds like no other, perhaps because sampling had barely been used in video games before (Sonic CD being a notable exception).

In-house composers Tomonori Sawada (under the alias Toronto) and Richard Jacques also contributed to the original soundtrack under the supervision of Naganuma, and SEGA added a few licensed tracks to the final game, making the titular “radio” of the game a tasteful, underground blend of hip-hop, rock and electro.

Best picks

Let Mom Sleep
Humming the Bassline
Everybody Jump Around
Rock It On
Grace and Glory

Full soundtrack on streaming services