
Today, I found a great presentation by @okias about running GPU-accelerated LLMs on @postmarketOS (see: fosdem.org/2024/events/attachm). @okias showed a benchmark with a OnePlus 6 device. I wonder whether it would be possible to use even older hardware to run a functional LLM. Is there maybe even a list of supported devices? 📱 🧠 🤓

@LukasBrausch @postmarketOS Thanks. The Adreno 630 is a pretty powerful high-end mobile GPU, and I was only running a GPT-2-level LLM. I think older GPUs exposing OpenCL could still be leveraged to accelerate some simpler tasks (e.g. translators), but bigger LLMs would be out of scope.
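
[Editor's note, not part of the thread: a minimal sketch of what "running a GPT-2-level model via the GPU" might look like from the tinygrad side, assuming the tinygrad Python package is installed; whether an OpenCL "GPU" device shows up at all depends on the phone's Mesa/driver stack discussed below.]

# Hedged sketch: check which accelerator tinygrad selects and run a tiny
# workload on it. The "GPU" device corresponds to tinygrad's OpenCL backend;
# on an unsupported phone it falls back to a CPU backend instead.
from tinygrad import Tensor, Device

print("default device:", Device.DEFAULT)  # e.g. "GPU" if OpenCL is usable

# A small matrix multiply, enough to confirm the backend actually executes.
a = Tensor.rand(256, 256)
b = Tensor.rand(256, 256)
out = (a @ b).numpy()  # .numpy() forces evaluation and copies back to host
print("matmul ok, result shape:", out.shape)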

@okias @postmarketOS I see. Thanks a lot for the clarification. What about the other extreme? What's the most potent LLM that can currently be run on a phone (preferably using only free solutions, such as @postmarketOS)? To me, the idea of having an LLM on a phone instead of a large server farm is just very intriguing.

@LukasBrausch @postmarketOS I think someone mentioned some success with a simple, small GPT-3-level model. I didn't continue pursuing it, because I want to address a hack I had to do in Mesa for Tinygrad to work. We also need to get the driver part finished and reviewed. I see the "future" more in smart translators and simple one-language speech recognition on these low-power devices.
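
[Editor's note, not part of the thread: as a rough illustration of the "driver part" dependency, before any acceleration can happen the phone's Mesa stack has to expose an OpenCL device at all. One way to check is a small script using pyopencl, an assumed, commonly available binding (clinfo from the shell would show the same information).]

# Hedged sketch: list the OpenCL platforms/devices the driver stack exposes.
# Assumes the pyopencl package is installed; on postmarketOS the GPU would
# typically appear via Mesa's OpenCL implementation once the driver work lands.
import pyopencl as cl

for platform in cl.get_platforms():
    print("platform:", platform.name)
    for dev in platform.get_devices():
        print("  device:", dev.name, "|", cl.device_type.to_string(dev.type))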

@okias @postmarketOS Thanks a lot for your insights and the hard work you're doing here. Much appreciated. 👍 💪