
Can you elaborate a bit more on why you chose llama.cpp? From their docs:

> "The main goal of llama.cpp is to run the LLaMA model using 4-bit integer quantization on a MacBook"

I'm not interested in running an LLM on embedded devices; the NN I want to deploy is really tiny and used for audio signal processing. It extracts information out of a small chunk of audio. The code needs to run on various embedded ARM chips.

What would be the advantage of using llama.cpp for this?
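
For reference, my understanding is that llama.cpp is a thin layer over ggml, a small dependency-free C tensor library with NEON kernels for ARM, so in principle it can run arbitrary small graphs, not just LLaMA. A minimal sketch of what I mean, assuming the classic ggml C API (function names have moved around between versions) and a hypothetical one-layer model mapping a 64-sample frame to 16 features:

    /* Tiny inference graph with ggml (the tensor library under llama.cpp).
       Sketch only: assumes the classic ggml API; sizes/weights are toy values. */
    #include "ggml.h"
    #include <stdio.h>

    int main(void) {
        /* One fixed arena up front; ggml does no hidden allocation afterwards,
           which is convenient on embedded targets. */
        struct ggml_init_params params = {
            .mem_size   = 16 * 1024 * 1024,
            .mem_buffer = NULL,
            .no_alloc   = false,
        };
        struct ggml_context *ctx = ggml_init(params);

        /* Hypothetical model: one 64 -> 16 linear layer over an audio frame. */
        struct ggml_tensor *x = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 64);
        struct ggml_tensor *w = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 64, 16);
        ggml_set_f32(x, 1.0f);   /* toy input frame */
        ggml_set_f32(w, 0.01f);  /* toy weights     */

        struct ggml_tensor *y = ggml_mul_mat(ctx, w, x); /* y = W x, shape [16] */

        struct ggml_cgraph *gf = ggml_new_graph(ctx);
        ggml_build_forward_expand(gf, y);
        ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/1);

        printf("y[0] = %f\n", ggml_get_f32_1d(y, 0));
        ggml_free(ctx);
        return 0;
    }

But for a model that small, I don't see what that buys over a hand-rolled implementation, hence the question.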


