The Uses of AI
A key principle of value investing is that the investor should understand what a company does. An analysis of generative AI is therefore an exploration of whether investments in generative AI can be profitable in the future. It is also an exploration of how people make sense of their world. With these goals in mind, we first discuss how generative AI can expand present practice while imposing its own limits. We then discuss profits.
Computers classify data patterns. The pattern that generative AI seeks is the “appropriate” next answer word in response to a question, analyzed from the standpoint of both individual word context and individual word meaning (value). What is important in both cases is context. To give an example from “Understanding Query, Key, Value in Transformers and LLMs” by Charles Chi, consider the phrase: dog plays fetch. How does the computer understand this phrase? How does the reader understand it? Assuming we have described the process well, the reader’s own understanding is also a useful guide to what generative AI does.
Every word, word fragment, or punctuation mark in a query (reduced to a series of numeric tokens) both asks a question and responds to it, providing keys and values for each word. Keys tell the query the context of a particular word; values tell it the exact meaning of that word. Human cognition is hierarchical and therefore more efficient: it starts with the verb and uses grammatical syntax (the way the language is organized) to expand meaning, as in high school sentence diagramming. Machine cognition, in contrast, processes all the tokens in parallel: “…tokens are both seeking relevant context and offering their own context to others…” We also illustrated the transformer process in our essay, “The Democratization of Generative AI.”
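As a toy illustration of “reduced to a series of numeric tokens” (a hypothetical mini-vocabulary of our own; a real tokenizer works on sub-word fragments):

    # Hypothetical mini-vocabulary mapping words to token numbers.
    vocab = {"dog": 1012, "plays": 2207, "fetch": 5833}
    tokens = [vocab[w] for w in "dog plays fetch".split()]
    print(tokens)   # [1012, 2207, 5833]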
Clearly, the human processing of syntax is more efficient than the bottom-up generation of vectors that the query performs. Each token (word fragment) in the query produces three vectors, the Q, K, and V vectors. These are generated by weight matrices that were optimized during training by error correction; a softmax function then converts the query-key similarities into attention weights, determining which tokens provide the most relevant context for the question.
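A minimal sketch of that generation step, assuming a toy embedding x_plays and randomly filled stand-ins for the learned weight matrices W_Q, W_K, and W_V (in a real model these are learned, not random; Chi’s example below uses its own pre-set vectors):

    import numpy as np

    rng = np.random.default_rng(0)
    d = 4                                 # toy embedding dimension
    W_Q, W_K, W_V = (rng.normal(size=(d, d)) for _ in range(3))

    x_plays = rng.normal(size=d)          # hypothetical embedding of "plays"

    query_plays = W_Q @ x_plays           # what "plays" is looking for
    key_plays = W_K @ x_plays             # the context "plays" offers others
    value_plays = W_V @ x_plays           # the meaning "plays" contributes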
To determine the context of the phrase dog plays fetch, the author, working in Python, takes the scalar dot product (a measure of shared vector direction) between the query vector of “plays” and the keys of the other tokens; this determines the attention “plays” pays to “dog” and “fetch.”
    # "plays" is the querying token; "dog" is the key token. Note that the
    # computer operates only on token numbers, not alphabetic characters.
    # (query_plays, key_dog, and key_fetch are the vectors from Chi's example.)
    similarity_plays_dog = np.dot(query_plays, key_dog)
    similarity_plays_fetch = np.dot(query_plays, key_fetch)
    print(similarity_plays_dog)      # 0.406…
    print(similarity_plays_fetch)    # 0.154…
He then calculates the attention that “plays” pays to “dog” and “fetch” from the similarity scores, using the softmax function:

    similarity_scores = np.array([similarity_plays_dog, similarity_plays_fetch])
    attention_weights = np.exp(similarity_scores) / np.sum(np.exp(similarity_scores))
    print(attention_weights)
    # attention weights for "dog" and "fetch" with respect to "plays":
    # [0.562…, 0.437…]

Clearly, “plays” will pay more attention to “dog,” the actor.
“The updated context vector for ‘plays’ is a contextualized representation, enriched with relevant information from the other tokens ‘dog’ and ‘fetch.’ These tokens are passed downstream for further processing and refinement.” Thus context becomes relevant in generative AI, in both the question and the answer. Note that vector similarities in both the question and the answer are determined mathematically, by the vector dot product or, for normalized vectors, by cosine similarity (the angle between vectors, easy to picture geometrically in two or three dimensions).
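A minimal sketch of that last step, with hypothetical value vectors (not Chi’s): the updated context vector is just the attention-weighted sum of the value vectors, and cosine similarity is the dot product of length-normalized vectors:

    # Hypothetical value vectors, for illustration only.
    value_dog = np.array([0.2, 0.7, 0.1])
    value_fetch = np.array([0.5, 0.1, 0.4])

    # Updated context vector for "plays": attention-weighted sum of values.
    context_plays = (attention_weights[0] * value_dog +
                     attention_weights[1] * value_fetch)

    # Cosine similarity: dot product of the length-normalized vectors.
    def cosine_similarity(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))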
Now to the main point of this research article. Humans make sense of the world using three modes of reasoning: abductive, specific case-by-case reasoning; deductive, reasoning from the general case; and inductive, bottom-up reasoning (with the aid of a top-down hypothesis, H0). An Apple research article dated 10/7/24, “GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models,” concludes that the models are capable only of the statistical vagaries of bottom-up reasoning: “LLMs exhibit notable variance when responding to different instantiations of the same question. Specifically, the performance of all models declines when only the numerical values in the question are altered in the GSM-Symbolic benchmark. Furthermore, we investigate the fragility of mathematical reasoning in these models and demonstrate that their performance significantly deteriorates as the number of clauses in a question increases. We hypothesize that this decline is due to the fact that current LLMs are not capable of genuine logical reasoning; instead, they attempt to replicate the reasoning steps observed in their training data.”
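To make “different instantiations of the same question” concrete, here is a toy illustration of the idea (our own construction, not the Apple paper’s code): the benchmark turns a word problem into a template and varies only the names and numbers, which should not change the reasoning required.

    import random

    # Toy GSM-Symbolic-style templating; names and numbers vary, logic does not.
    template = ("{name} has {x} apples and buys {y} more. "
                "How many apples does {name} have now?")

    def instantiate(seed):
        rng = random.Random(seed)
        return template.format(name=rng.choice(["Ava", "Ben", "Carla"]),
                               x=rng.randint(2, 20),
                               y=rng.randint(2, 20))

    print(instantiate(0))
    print(instantiate(1))

Each variant requires identical reasoning; the Apple paper reports that LLM accuracy nevertheless varies across such instantiations.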
This Apple article says that all bottom-up generative AI is fundamentally “illogical,” and certainly cannot be shown to understand first premises and conclusions. But what, then, explains the Stanford and University of Washington s1 model, which claims vastly improved math performance after training on 16 Nvidia H100s at a cost of less than $50 in cloud compute credits?
Logic is not intrinsic to Stanford’s s1 model; it is an add-on. In the 2/10/25 paper “From Brute Force to Brain Power…,” David Scott Lewis of Executive Consulting, Zaragoza, Spain, summarizes the Stanford approach. Its “s1K” dataset is “a meticulously curated set of 1,000 high-quality step-by-step reasoning examples drawn from challenging math, logic, and science problems. Fine-tuning on this compact dataset required only minutes of GPU time, demonstrating unprecedented sample- and cost-efficiency.” A second improvement comes from simply “…injecting the token ‘Wait’ when the model attempts to terminate its reasoning (too) early, users can prompt additional steps of chain-of-thought. This simple intervention effectively boosts accuracy on difficult questions by letting the model self-correct initial errors.”
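As a minimal sketch of this “Wait” injection (our own illustration; the generate_until function and END_THINK marker are hypothetical stand-ins, not the s1 authors’ code):

    # Sketch of "Wait" injection; generate_until() and END_THINK are hypothetical.
    END_THINK = "</think>"           # marker the model emits to end its reasoning

    def generate_with_wait(model, prompt, extra_rounds=2):
        text = model.generate_until(prompt, stop=END_THINK)
        for _ in range(extra_rounds):
            # Withhold the stop marker and append "Wait" so the model
            # re-examines, and can self-correct, its reasoning so far.
            text += " Wait"
            text += model.generate_until(prompt + text, stop=END_THINK)
        return text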
The first takeaway is that the field of generative AI, in the long run, is wide open to innovation, and its markets are therefore not subject to moat-like margin protection. The second takeaway is that the applications are myriad, dependent upon human ingenuity to dream up, with humans remaining in control. When asked how AI would develop, Sam Altman, CEO of OpenAI (maker of ChatGPT), noted that after OpenAI released GPT-3, developers in “the Playground” suddenly wanted to converse with the program, so the company fine-tuned it to be more conversational, which led to ChatGPT in 2022. Generative AI cannot really do top-down logic, but it can simulate current general practice well and extend it to new areas. Generative AI may not be able to conceive, as Einstein did, of what would happen to time and mass if the speed of light were held constant for all observers. But it can conceive of new, unexplored materials, medicines, and procedures according to present theory.
Will generative AI be wildly profitable for its sponsors? As a general-use technology it may come into widespread use, but it may not be generally profitable. The big four hyperscalers, according to a 2/7/25 Bloomberg report, expect to spend $325 billion on capital investment in 2025, which will result in additional depreciation expenses of roughly $54 billion per year. Adding a 15% financial return on that capital (mixing, slightly, time, accounting, and financial analysis), we get a required cash flow of around $102 billion per year. Total revenue growth of the four is projected by Value Line® to be around 12% per year in 2025, if recession does not intervene and climate change does not get appreciably worse. That is a growth in total revenues of around $168 billion per year. The required growth in generative AI revenues is thus uncomfortably close to total revenue growth. Although generative AI could be useful to many people, we don’t think it can be profitable for many years. To save resources, perhaps a few training runs, costing millions each, could be made per year, with many subsidiary models developed from them.
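The arithmetic behind those figures, assuming straight-line depreciation over roughly six years (our inference from the $54 billion figure; the depreciation schedule is not stated in the text):

    capex = 325.0                    # 2025 hyperscaler capex, $ billions (Bloomberg)
    depreciation = 54.0              # added depreciation/yr; ~capex/6, a ~6-year life
    required_return = 0.15 * capex   # 15% financial return on capital = $48.75B/yr
    required_cash_flow = depreciation + required_return   # $102.75B ≈ $102B/yr

    revenue_growth = 0.12                    # Value Line projection for 2025
    revenue_base = 168.0 / revenue_growth    # implied combined revenues: $1,400B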
Does NVIDIA have a moat with its CUDA software platform? Likely yes, but it is shrinking. It is possible to work around CUDA, but that takes time and money. The competing generative-AI software stacks of AMD, Amazon, and Intel are still works in progress, whereas CUDA is mature. Nvidia is trying to extend its platform from cloud-based infrastructure to the enterprise and is developing one for robotics. AI progress is likely to be both quick and piecemeal and, as at deliberate research labs such as Meta’s FAIR, slower and more methodical. Eventually, and only eventually, it will make a macro difference; but there will also be other economic factors affecting earnings growth.