The Uses of AI

 

A key principle of value investing is that the investor should understand what the company does. An analysis of generative AI is therefore an exploration of how investments in generative AI can be profitable in the future. It is also an exploration of how people make sense of their world. With these goals in mind, we first discuss how generative AI can extend present practice while imposing its own limits. We then discuss profits.

 

Computers classify data patterns. The pattern that generative AI seeks is the “appropriate” next word of an answer in response to a question, analyzed from the standpoint of each word’s context and each word’s meaning (value). What is important in both cases is context. To take an example from “Understanding Query, Key, Value in Transformers and LLMs” by Charles Chi, consider the phrase: dog plays fetch. How does the computer understand the phrase? How does a reader understand it? Assuming that we have described the process well, a reader’s understanding is also a useful guide to what generative AI does.

Every word, word fragment, or punctuation mark in a query (reduced to a series of numeric tokens) performs the function of asking a question (query) and responding by providing context (keys) and meaning (values) for each word. Keys tell the query what the context of a particular word is, and values tell the exact meaning of a particular word. Human cognition is hierarchical and therefore more efficient: it starts with the verb and uses grammatical syntax (the way the language is organized) to expand meaning, as high school sentence diagramming demonstrates. Machine cognition, in contrast, processes all the tokens in parallel: “…tokens are both seeking relevant context and offering their own context to others…” We also illustrated the transformer process in our essay, “The Democratization of Generative AI.”

 

 

Clearly, the human processing of syntax is more efficient than the bottom-up generation of vectors that the machine performs. Each token (word or word fragment) in the query produces three vectors, Q, K, and V. These are generated using projection matrices learned through error-correcting training; a softmax function then converts similarity scores into attention weights that determine which tokens provide the most relevant context for the question.
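The projection step can be sketched as follows. This is a minimal illustration, not the author’s code: the embedding values and weight matrices here are random stand-ins, whereas in a real transformer the W_Q, W_K, W_V matrices are learned during training.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_k = 4, 4  # toy dimensions for illustration only

# One embedding vector per token; the values are arbitrary stand-ins.
embeddings = {tok: rng.standard_normal(d_model) for tok in ["dog", "plays", "fetch"]}

# Projection matrices; in a trained model these are learned, here random.
W_Q = rng.standard_normal((d_model, d_k))
W_K = rng.standard_normal((d_model, d_k))
W_V = rng.standard_normal((d_model, d_k))

# Every token produces all three vectors: it both asks (Q)
# and offers its own context (K) and meaning (V) to the others.
Q = {tok: emb @ W_Q for tok, emb in embeddings.items()}
K = {tok: emb @ W_K for tok, emb in embeddings.items()}
V = {tok: emb @ W_V for tok, emb in embeddings.items()}

query_plays = Q["plays"]
key_dog, key_fetch = K["dog"], K["fetch"]
```

Because every token is projected three ways in parallel, the machine must compute similarities pairwise rather than exploit grammatical hierarchy, which is the inefficiency noted above.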

To determine the context of the phrase dog plays fetch, the author, working in Python, takes the scalar dot product (a measure of vector direction similarity) between the query of “plays” and the keys of the other tokens; this determines the attention “plays” pays to “dog” and “fetch”.

# "plays" is the querying token, "dog" is the key token. Note that the
# computer operates only on token numbers, not alphabetic characters.
similarity_plays_dog = np.dot(query_plays, key_dog)
similarity_plays_fetch = np.dot(query_plays, key_fetch)

print(similarity_plays_dog)    # 0.406…
print(similarity_plays_fetch)  # 0.154…

He then calculates the attention that “plays” pays to “dog” and “fetch” from the similarity scores.

similarity_scores = np.array([similarity_plays_dog, similarity_plays_fetch])
attention_weights = np.exp(similarity_scores) / np.sum(np.exp(similarity_scores))

# Attention weights for "dog" and "fetch" with respect to "plays":
# [0.562…, 0.437…]

Clearly, “plays” will pay more attention to “dog”, the actor.

“The updated context vector for ‘plays’ is a contextualized representation, enriched with relevant information from the other tokens ‘dog’ and ‘fetch.’ These tokens are passed downstream for further processing and refinement.” Thus context becomes relevant in generative AI, in both the question and the answer. Note that vector similarities in both the question and the answer are determined mathematically, by the dot product or, equivalently for normalized vectors, by cosine similarity (which applies in any number of dimensions, not only three).
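The “updated context vector” the quotation describes is simply the attention-weighted sum of the value vectors. A minimal sketch, using the similarity scores printed above; the value vectors for “dog” and “fetch” are made up for illustration:

```python
import numpy as np

# Similarity scores from the dot products computed earlier.
similarity_scores = np.array([0.406, 0.154])

# Softmax turns raw scores into attention weights that sum to 1.
attention_weights = np.exp(similarity_scores) / np.sum(np.exp(similarity_scores))

# Illustrative value vectors for "dog" and "fetch" (invented for this sketch).
value_dog = np.array([0.2, 0.9, -0.1])
value_fetch = np.array([0.7, -0.3, 0.5])

# The updated context vector for "plays": a weighted sum of value vectors.
context_plays = attention_weights[0] * value_dog + attention_weights[1] * value_fetch

# Cosine similarity is the dot product after normalizing each vector to
# unit length; it works in any number of dimensions.
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

Because “dog” carries the larger weight (about 0.562), the context vector for “plays” is pulled mostly toward the value vector of the actor.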

 

Now to the main point of this research article. Humans make sense of the world by using three modes of reasoning: abductive, specific case-by-case reasoning; deductive, reasoning from general premises; and inductive, bottom-up reasoning (with the aid of a top-down H0 hypothesis). An Apple research article dated 10/7/24, “GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models,” concludes that the models are capable only of the statistical vagaries of bottom-up reasoning. “LLMs exhibit notable variance when responding to different instantiations of the same question. Specifically, the performance of all models declines when only the numerical values in the question are altered in the GSM-Symbolic benchmark. Furthermore, we investigate the fragility of mathematical reasoning in these models and demonstrate that their performance significantly deteriorates as the number of clauses in a question increases. We hypothesize that this decline is due to the fact that current LLMs are not capable of genuine logical reasoning; instead, they attempt to replicate the reasoning steps observed in their training data.”

This Apple article says that bottom-up generative AI is fundamentally “illogical”: it cannot be shown to understand first premises and conclusions. Why, then, the Stanford and University of Washington s1 model, which claims vastly improved math performance after training on 16 Nvidia H100s for less than $50 in cloud compute credits?

Logic is not intrinsic to Stanford’s s1 model but is an add-on. In a 2/10/25 paper, “From Brute Force to Brain Power:…,” David Scott Lewis of Executive Consulting (Zaragoza, Spain) summarizes the Stanford “‘s1K’ dataset, a meticulously curated set of 1,000 high-quality step-by-step reasoning examples drawn from challenging math, logic, and science problems. Fine-tuning on this compact dataset required only minutes of GPU time, demonstrating unprecedented sample- and cost-efficiency.” A second improvement comes simply from “…injecting the token ‘Wait’ when the model attempts to terminate its reasoning (too) early, users can prompt additional steps of chain-of-thought. This simple intervention effectively boosts accuracy on difficult questions by letting the model self-correct initial errors.”
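The “Wait” intervention can be sketched as a decoding loop. This is a hedged illustration, not the s1 implementation: model_step is a hypothetical stand-in for one decoding step of a language model, and the sentinel token names are invented for the sketch.

```python
# Sketch of the "Wait" intervention. `model_step` is a hypothetical
# stand-in for an LLM's next-token function; it is not a real API.
def generate_with_budget_forcing(model_step, prompt, min_thinking_tokens=100):
    tokens = list(prompt)
    while True:
        next_token = model_step(tokens)
        if next_token == "<end_of_thinking>" and len(tokens) < min_thinking_tokens:
            # The model tried to stop reasoning too early: inject "Wait"
            # to prompt additional chain-of-thought steps.
            tokens.append("Wait")
            continue
        tokens.append(next_token)
        if next_token == "<end_of_answer>":
            return tokens
```

The point of the sketch is that the logic lives outside the model: a trivial external controller, not the network itself, forces the extra reasoning steps.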

The first takeaway is that the field of generative AI, in the long run, is wide open to innovation, whose markets are not subject to moat-like margin protection. The second takeaway is that the applications are myriad, limited only by human ingenuity, with humans remaining in control. When asked how AI would develop, Sam Altman, CEO of OpenAI, noted that when developers got access to GPT-3 in “the Playground,” they suddenly wanted to converse with the program, so he fine-tuned it to be more conversational. Generative AI cannot really do top-down logic, but it can simulate current general practice well and extend it to new areas. Generative AI may not be able to conceive, as Einstein did, of what would happen to time and mass if the speed of light were held constant for all observers. But it can conceive of new, unexplored materials, medicines, and procedures according to present theory.

Will generative AI be wildly profitable for its sponsors? As a general-use technology it may come into widespread use without being generally profitable. According to Bloomberg (2/7/25), the big four Silicon Valley hyperscalers expect to spend $325 billion in 2025, which will produce additional depreciation expense of $54 billion per year. Adding a 15% financial return on that capital (mixing, slightly, time horizons, accounting, and financial analysis), we get a required cash flow of around $102 billion per year. Value Line® projects total revenue growth for the four of around 12% in 2025, assuming recession does not intervene and climate change does not get appreciably worse; that is growth in total revenues of around $168 billion per year. The required growth in generative AI revenues is far too close to total revenue growth. Although generative AI could be useful to many people, we do not think it can be profitable for many years. To save resources, perhaps a few training runs, each costing millions, could be made per year, with many subsidiary models developed from them.
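The back-of-the-envelope arithmetic can be checked directly, using the figures as given above:

```python
capex = 325             # 2025 hyperscaler capital spending, $ billions
depreciation = 54       # added annual depreciation expense, $ billions
required_return = 0.15  # assumed financial return on the new capital

# Required cash flow: depreciation plus a 15% return on the capital spent.
required_cash_flow = depreciation + required_return * capex  # about $102.75B

# Projected 2025 revenue growth of the four (12% of total revenues).
total_revenue_growth = 168  # $ billions
```

Roughly $103 billion of required cash flow against $168 billion of projected total revenue growth is the comparison that drives the conclusion: AI alone would need to generate the bulk of the four firms’ entire expected growth.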

 

__

 

Does Nvidia have a moat with its CUDA software platform? Likely yes, but it is shrinking. It is possible to work around CUDA, but that takes time and money. The AMD, Amazon, and Intel generative AI software stacks are still works in progress, whereas CUDA is mature. Nvidia is trying to extend its platform from cloud-based infrastructure to the enterprise, and is developing one for robotics. AI progress is likely to be both quick and piecemeal, and more slowly deliberate (FAIR). Eventually, and only eventually, it will make a macro difference; but other economic factors will also affect earnings growth.