As for LLMs, there is probably some constant cost added once the model can fit on a single GPU, but the scaling should probably be almost linear.