Selfhosted
A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.
Rules:
- Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.
- No spam posting.
- Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.
- Don't duplicate the full text of your blog or github here. Just post the link for folks to click.
- Submission headline should match the article title (don't cherry-pick information from the title to fit your agenda).
- No trolling.
Resources:
- selfh.st Newsletter and index of selfhosted software and apps
- awesome-selfhosted software
- awesome-sysadmin resources
- Self-Hosted Podcast from Jupiter Broadcasting
Any issues with the community? Report them using the report flag.
Questions? DM the mods!
Is that still true, though? My impression is that AMD works just fine for inference with ROCm and llama.cpp nowadays, and you get much more VRAM per dollar, which means you can fit a bigger model. You might get fewer tokens per second compared with a similar Nvidia card, but I believe that shouldn't really be a problem for a home assistant. Even an Arc A770 should work with IPEX-LLM. Buy two Arc or Radeon cards with 16 GB of VRAM each and you can fit a Llama 3.2 11B or a Pixtral 12B without any quantization. Just make sure ROCm supports that specific Radeon card if you go for team red.
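To put rough numbers on the "fits without quantization" claim, here's a quick back-of-envelope sketch (my own assumptions, not benchmarks): fp16/bf16 weights take roughly 2 GB per billion parameters, and I'm adding about 20% headroom for the KV cache and runtime buffers.

```python
# Rough VRAM estimate for running a model unquantized (fp16/bf16).
# Assumptions, not measurements: ~2 bytes per weight, ~20% overhead
# for KV cache, activations, and runtime buffers.

def fits_in_vram(params_billion: float, vram_gb_per_card: float, num_cards: int) -> bool:
    weights_gb = params_billion * 2   # fp16: ~2 GB per billion parameters
    needed_gb = weights_gb * 1.2      # assumed ~20% overhead
    total_vram = vram_gb_per_card * num_cards
    print(f"{params_billion}B model: ~{needed_gb:.0f} GB needed, {total_vram:.0f} GB available")
    return needed_gb <= total_vram

fits_in_vram(11, 16, 2)   # Llama 3.2 11B on two 16 GB cards: ~26 GB vs 32 GB, fits
fits_in_vram(12, 16, 2)   # Pixtral 12B: ~29 GB vs 32 GB, fits (a bit tight)
fits_in_vram(12, 16, 1)   # a single 16 GB card does not fit it unquantized
```

It's only a sanity check on sizing; actual headroom depends on context length and the runtime you use.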
I'm also curious. I've heard good things this past year about AMD and ROCm too. Obviously still not at Nvidia's level (and maybe it never will be), but given the price, I've been thinking about trying it.