New ask Hacker News story: Ask HN: 2x Arc A770 or 1x Radeon 7900 XTX for llama.cpp
Ask HN: 2x Arc A770 or 1x Radeon 7900 XTX for llama.cpp
2 by danielEM | 1 comment on Hacker News.
I can't find an "apples-to-apples" comparison of performance on QwQ 32B (4-bit). Can anyone help me decide which solution to pick? From what I've dug up so far, it looks like dual Arc A770s are supported by llama.cpp, and I've seen some reports that llama.cpp on top of IPEX-LLM is the fastest way to run inference on Intel cards. On the other hand, there is the more expensive 7900 XTX, for which AMD claims (Jan '25) that inference is faster than on a 4090. So, what is the state of the art as of today, and how does one compare to the other (apples to apples)? What is the tokens/s difference?
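For reference, this is roughly the kind of apples-to-apples check I have in mind: a minimal sketch assuming llama-cpp-python built with the matching GPU backend on each box (SYCL/IPEX-LLM for the Arc A770s, HIP/ROCm or Vulkan for the 7900 XTX); the model path and prompt are placeholders.

    # Rough tokens/s check; run the same script unchanged on both setups.
    import time
    from llama_cpp import Llama

    llm = Llama(
        model_path="qwq-32b-q4_k_m.gguf",  # same 4-bit quant file on both machines
        n_gpu_layers=-1,                   # offload all layers to the GPU(s)
        n_ctx=4096,
    )

    prompt = "Explain the difference between latency and throughput."
    start = time.time()
    out = llm(prompt, max_tokens=256, temperature=0.0)  # greedy, so runs are comparable
    elapsed = time.time() - start

    n_tokens = out["usage"]["completion_tokens"]
    print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tok/s")

Numbers from something like this (same quant, same prompt, same generation length) on both setups would answer the question directly.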