Mar 25, 2024

Derek Au on AI: "training is computation heavy, inferencing is computation light"

Inferencing is computationally light; training is computationally heavy. Kind of like crypto: mining takes more resources than validation.

At a very high level, training a model means ingesting and analyzing high volumes of data in order to derive a mathematical representation of the patterns and correlations across all of that data.  For example, to train a model to identify cats, you expose the training algorithm to hundreds of thousands of pictures of cats.  You iterate the training across all of the pictures until the process converges on a mathematical function (in practice, a set of learned parameters), which becomes the trained model.  Training is very computationally intensive because you are computing across the entire training set, i.e. every single one of the cat pictures, over and over.  GPUs are uniquely suited for this because many of those computations can be performed in parallel.
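As a rough illustration (not Derek's setup), here is a minimal sketch of a training loop for a tiny binary "cat / not cat" classifier in PyTorch. The architecture, sizes, and hyperparameters are hypothetical, and random tensors stand in for real labeled pictures; the point is simply that every epoch churns through the whole training set.

```python
# Minimal sketch: training a tiny binary cat classifier.
# Random tensors stand in for hundreds of thousands of labeled pictures.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"  # use a GPU if one is present

# Hypothetical stand-in data: flattened 64x64 RGB images, label 1 = cat, 0 = not cat.
images = torch.randn(10_000, 3 * 64 * 64)
labels = torch.randint(0, 2, (10_000, 1)).float()

model = nn.Sequential(
    nn.Linear(3 * 64 * 64, 128),
    nn.ReLU(),
    nn.Linear(128, 1),
).to(device)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Each epoch touches the entire training set; this is where the heavy,
# highly parallel computation lives, and why GPUs shine here.
for epoch in range(10):
    for start in range(0, len(images), 256):
        x = images[start:start + 256].to(device)
        y = labels[start:start + 256].to(device)
        loss = loss_fn(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

The matrix multiplications inside each batch are exactly the kind of work a GPU executes in parallel.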

Inferencing is simply pattern matching.  It is the process of making predictions, decisions, or conclusions from the data given to the trained model.  You take an unknown picture, run it through the trained model, and it returns a probability that the picture is a cat.  Inferencing is computationally light relative to training because the computation only involves running the single picture in question through the trained model (as opposed to deriving a mathematical model of a cat by computing through hundreds of thousands of pictures of various cats).
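Continuing the hypothetical sketch above, inference is a single forward pass through the already-trained model:

```python
# Inference: one unknown picture, one forward pass, one probability out.
unknown_image = torch.randn(1, 3 * 64 * 64).to(device)  # stand-in for a new photo

model.eval()
with torch.no_grad():                      # no gradients needed; we are only predicting
    logit = model(unknown_image)
    p_cat = torch.sigmoid(logit).item()    # probability that the picture is a cat

print(f"P(cat) = {p_cat:.2f}")
```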

Inferencing can be done on existing CPUs, which are better suited for sequential computing.  A CPU is fine for this task because you are only processing the single image in question.  Still, a number of startups are creating inference chips that aim to be faster and more energy efficient (not for climate change, but for battery life!).
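To make that concrete, here is a CPU-only sketch in plain NumPy: for one image, inference reduces to a couple of matrix multiplies and a sigmoid, which an ordinary CPU handles easily. The weights below are random stand-ins; in practice you would load the parameters of the trained model.

```python
import numpy as np

# Random stand-ins for the trained model's parameters (same shapes as the sketch above).
W1, b1 = np.random.randn(128, 3 * 64 * 64), np.random.randn(128)
W2, b2 = np.random.randn(1, 128), np.random.randn(1)

def predict_cat(image: np.ndarray) -> float:
    """Forward pass for one flattened image: two matrix multiplies and a sigmoid."""
    hidden = np.maximum(image @ W1.T + b1, 0.0)    # hidden layer + ReLU
    logit = hidden @ W2.T + b2                     # output layer (one logit)
    return float(1.0 / (1.0 + np.exp(-logit[0])))  # sigmoid -> probability of "cat"

print(predict_cat(np.random.randn(3 * 64 * 64)))   # one picture, one cheap prediction
```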

Over many years, Nvidia has developed CUDA, its proprietary programming platform for its GPUs, and the developer community has built its skills and tools around it.  Nvidia has the lead and the mindshare here, and I believe this is a competitive moat.  Competitors may introduce rival GPUs, but will developers adopt them if those GPUs lack the programming resources the community is already accustomed to?

~ Derek Au, technology analyst, Orange Silicon Valley, February 14, 2024


