To keep things simple while covering the basics, I’ve written until now as if a single mathematical model governed the relationship of all words to each other. This is certainly not the case. There are potential models for every person in the world, and for every philosophy, every religion, and every viewpoint. There are even valid models for every combination and permutation of every existing model. If a single model is a universe, the set of all models is the multiverse!
In upcoming posts I will explore how different models can interact with each other, merging or clashing. Models are also not static, but are continuously changing. However, in this post we will look at how even using the same training data can result in different models.
Let’s say we have two models, one with 1,000 dimensions and another with 150, and we train them both using the book War and Peace. Both models will contain around 20,000 words. But the model with fewer dimensions will be lower resolution, and its words will lack the nuances they have in the model with more dimensions. If you reduce the resolution too much, the model becomes worthless, like a blurry picture where you can’t make out the details.
Because of diminishing returns, the 850 extra dimensions do not make the larger model six or seven times more accurate, even though it has nearly seven times as many dimensions. But including or omitting an important dimension can make a huge difference. A good example of this is looking up into the night sky and seeing two stars that appear to be right next to each other. Because of their different distances from Earth (the dimension we can’t see), they might really be very far apart.
Discovering a missing dimension like this is what is known as a “paradigm shift.” A paradigm shift is when a single bit of information fundamentally changes your view of a concept or situation. Similarly, in a model, a single dimension can fundamentally change the meaning of a word.
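The night-sky example can be sketched in a few lines of Python. This is only an illustration: the coordinates are made up, and the names `star_a`, `star_b`, and `distance` are hypothetical. The first two numbers stand for where each star appears in the sky, and the third stands for its distance from Earth, the dimension we can’t see.

```python
import math

# Hypothetical, made-up coordinates: (x, y) is the position we see in the sky,
# and the third value is the distance from Earth, which we cannot see.
star_a = (1.0, 2.0, 4.0)
star_b = (1.1, 2.0, 400.0)

def distance(p, q, dims):
    """Euclidean distance using only the first `dims` coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p[:dims], q[:dims])))

apparent = distance(star_a, star_b, dims=2)  # ignoring the hidden dimension
actual = distance(star_a, star_b, dims=3)    # including the hidden dimension

print(f"apparent separation: {apparent:.2f}")
print(f"actual separation:   {actual:.2f}")
```

With only two dimensions the stars look like neighbors. Adding the third dimension reveals how far apart they really are.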
Let’s take, for example, the words “bell pepper” and “jalapeño.” The two words are very similar and share many dimensions, which clusters them close together in the model with other vegetables and fruits. But if you add the dimension “spiciness,” they move much farther apart, though not as far apart as “jalapeño” and “strawberry.”
Without this dimension, the prompt “create a recipe for two-alarm chili” might end up replacing “jalapeño” with “bell pepper,” with disappointing results!
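The pepper example can be sketched the same way. The numbers below are invented for illustration and do not come from any trained model. Real models learn hundreds of dimensions that rarely have tidy labels like “sweetness” or “spiciness,” so treat the three hand-labeled features and the `distance` helper as assumptions.

```python
import math

# Hypothetical hand-made features: [is_plant_food, sweetness, spiciness].
# All values are made up for illustration only.
words = {
    "bell pepper": [1.0, 0.3, 0.0],
    "jalapeño":    [1.0, 0.2, 0.8],
    "strawberry":  [1.0, 0.9, 0.0],
}

def distance(a, b, use_spiciness):
    """Euclidean distance between two words, with or without the last dimension."""
    n = 3 if use_spiciness else 2
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(words[a][:n], words[b][:n])))

for use_spiciness in (False, True):
    label = "with spiciness" if use_spiciness else "without spiciness"
    pepper_gap = distance("bell pepper", "jalapeño", use_spiciness)
    berry_gap = distance("jalapeño", "strawberry", use_spiciness)
    print(f"{label}: bell pepper/jalapeño = {pepper_gap:.2f}, "
          f"jalapeño/strawberry = {berry_gap:.2f}")
```

In this toy setup, bell pepper and jalapeño are nearly interchangeable until the spiciness dimension is added. After that they sit much farther apart, but still closer to each other than jalapeño is to strawberry.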