Application Programming Interface? Are you talking about something on the internet? On a GPU driver? On your phone?
Then also, what size model are you using? Defined with int32? fp4? Somewhere in between? That’s where RAM requirements come in.
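To make the precision point concrete, here is a rough sketch of how weight memory scales with the data type. The 7B parameter count and the dtype table are assumptions picked for illustration, and the estimate covers the weights only (no activations, cache, or runtime overhead).

```python
# Rough weight-memory estimate: parameters x bits per parameter.
# Covers the weights only; activations and runtime overhead are ignored.
BITS_PER_PARAM = {"int32": 32, "fp16": 16, "int8": 8, "fp4": 4}


def weight_gib(num_params, dtype):
    """Approximate size of the weights alone, in GiB."""
    return num_params * BITS_PER_PARAM[dtype] / 8 / 2**30


if __name__ == "__main__":
    params = 7_000_000_000  # hypothetical 7B-parameter model
    for dtype in BITS_PER_PARAM:
        print(f"{dtype}: {weight_gib(params, dtype):.1f} GiB")
```

Same model, an 8x difference in RAM between int32 and fp4, which is why the question matters.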
I get that you’re trying to do a mic drop or something, but you’re not being very clear.
Anyways, the important thing is the “TOPS”, aka trillions of operations per second. Having enough RAM is important, but if you don’t have a fast processor then you’re wasting RAM when you could just stream the model from a fast SSD.
One such case is when your system can’t handle more than 50 TOPS, like the Apple M systems. Try an old GPU and enjoy 1000s of TOPS.
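If you want to plug in your own numbers for the compute-versus-streaming question, a back-of-the-envelope sketch could look like this. It assumes roughly 2 operations per parameter per generated token and that every weight is read once per token from wherever you stream it; both are simplifications, and the function and parameter names are made up for this example.

```python
def seconds_per_token(num_params, bytes_per_param, tops, stream_gb_per_s):
    """Return (compute_s, stream_s) for one generated token.

    Assumptions (rules of thumb, not measurements):
      - about 2 operations per parameter per token
      - every weight is read once per token from the streaming source
    """
    # Time the processor needs if it sustains `tops` trillion ops/second.
    compute_s = 2 * num_params / (tops * 1e12)
    # Time to move all the weights at `stream_gb_per_s` GB/second.
    stream_s = num_params * bytes_per_param / (stream_gb_per_s * 1e9)
    return compute_s, stream_s
```

Whichever of the two numbers is larger is the part of the system holding you back.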