
“Semantic projection recovers rich human knowledge of multiple object features from word embeddings” (Grand et al., 2022) is a very cool paper.
It shows that simple word embeddings don’t just stand in linear conceptual relations to each other (e.g. v(king) – v(man) + v(woman) ≈ v(queen)), but that there are interpretable subspaces within the embedding spaces of even simple language models. What this means is that for two antonyms*, you can draw a line between their embedding vectors and interpret that line as a scale, a semantic subspace. You can then project your other embedding vectors onto this subspace. To put it simply, if you take the embedding vectors for ‘large’ and ‘small’ and draw a line between them, it turns out that the vectors for animals tend to fall at roughly the right spots along that line.
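For concreteness, here is a minimal sketch of that projection, assuming pretrained GloVe vectors loaded through gensim’s downloader; the word lists are only illustrative, not the ones Grand et al. actually use.

```python
# Minimal sketch of semantic projection onto a 'small'–'large' axis,
# assuming pretrained GloVe vectors from gensim's downloader.
import numpy as np
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-300")  # pretrained GloVe KeyedVectors

# The size axis is the line between the two antonym vectors.
axis = vectors["large"] - vectors["small"]
axis = axis / np.linalg.norm(axis)

# Project other vectors onto the axis: higher scores lean towards 'large'.
for animal in ["mouse", "cat", "horse", "elephant", "whale"]:
    print(f"{animal:10s} {np.dot(vectors[animal], axis):+.3f}")
```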

This is exciting because, again, the embedding vectors are generated by means of a simple language modelling task. Why should a task as simple as predicting which words occur near which other words generate embeddings that contain this much implicit representational content? I also love that this paper came out years after these simple models were released. Eight years passed between the release of the GloVe embeddings used and the official publication of the paper. It really shows the value of trying to understand what’s going on in old models.
Anyway, to wrap up my recent spate of playing around with the Stanford Encyclopedia of Philosophy, I wanted to see if there were any semantic subspaces in it. Specifically, I wanted to know whether the model had an analytic – continental subspace. If I drew a line between the vector for “analytic” and the vector for “continental” and used it to define a new subspace, would the more analytic philosophers fall on one side and the continental philosophers on the other?
This is what we have on the right hand side of this page. Needless to say, the gods of matplotlib continue to curse me; however, the general shape of this space is clear, and I don’t think it is completely unreasonable. There are a few shocks: Whitehead comes out as more continental than Heidegger, and Barcan as more continental than Husserl. But it generally gets something right about the vibes of the philosophers. I suspect that the analytic philosophers are more spread out along the scale because of the analytic bias in the dataset (I describe this here).
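For what it’s worth, the projection behind that plot might look something like the sketch below, assuming a gensim word2vec model trained on the SEP corpus and saved as “sep_word2vec.model”; the file name and the philosopher tokens are hypothetical and depend on how the corpus was tokenised.

```python
# Sketch of projecting philosopher vectors onto an analytic–continental axis,
# assuming a word2vec model trained on the SEP corpus (file name hypothetical).
import numpy as np
from gensim.models import Word2Vec

wv = Word2Vec.load("sep_word2vec.model").wv

# The axis is the line between the two pole vectors.
axis = wv["analytic"] - wv["continental"]
axis = axis / np.linalg.norm(axis)

# Higher scores lean analytic, lower scores lean continental.
philosophers = ["quine", "millikan", "rorty", "whitehead", "heidegger", "husserl"]
scores = {p: float(np.dot(wv[p], axis)) for p in philosophers if p in wv}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {score:+.3f}")
```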
What interests me is how we might challenge the model’s claims when we disagree. Grand et al. evaluate their subspaces against the ratings of people on MTurk, and it’s easy to guess how human ratings might vary for features like danger, intelligence, and gender in ways informed by the biases of the groups surveyed. In our case, I might disagree with the model and claim that Millikan is more of an analytic philosopher than Rorty. But I could just as well imagine that the model has ‘discovered’ something and use this as a prompt to rethink how I situate philosophers in the canon. After all, I doubt that nature bestows an ordinal ranking of analyticity on philosophers. In any case, what would it mean to say that the model ‘discovered’ something? I suspect that using the model as a prop for these reflections, rather than as a pure measuring device, is the best way to handle some applications of machine learning to social phenomena. In fact, I think a more realist reading could be harmful. This is just one way among many in which interpretability is a political issue.
*Technically, they aggregate trios of words at each pole, e.g. {big, large, huge} vs. {small, little, tiny}, and it doesn’t work for all antonym pairs.
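In case it helps, the pooled variant might look roughly like this, continuing the GloVe sketch above (same caveats apply):

```python
# Sketch of the pooled variant: average each word set before taking the
# difference; the sets are the ones from the footnote above.
import numpy as np
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-300")

big_pole = np.mean([vectors[w] for w in ("big", "large", "huge")], axis=0)
small_pole = np.mean([vectors[w] for w in ("small", "little", "tiny")], axis=0)
axis = big_pole - small_pole
axis = axis / np.linalg.norm(axis)
```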