GPU-accelerated Llama3.java inference in pure Java using TornadoVM

github.com

46 points by pjmlp 3 days ago