Running this through R again to get its eigenvectors, the first two principal components are easily interpretable:
- PC1 (eigenvalue=7.3) separates Songhay into three low-level subgroups - Western, Eastern, and Northern, in that order - with an obvious longitude effect: it traces a line eastward all the way down the Niger river, jumps further east to In-Gall, and then proceeds back westward through the Sahara.
- PC2 (eigenvalue=1.1) measures the level of Berber/Tuareg influence.
The resulting cluster patterns have a strikingly shallow time depth; as in the Arabic example in my last post, this method's results correspond well to criteria of synchronic mutual intelligibility (Western Songhay is much easier for Eastern Songhay speakers to understand than Northern is), but it completely fails to pick up on the deeper historic tie between Northern Songhay and Western Songhay (they demonstrably form a subgroup as against Eastern). It's nice how the strongest contact influence shows up as a PC, though; it would be worth exploring how good this method is at identifying contact more generally.
* Strictly speaking, this may not quite count as PCA - I'm starting from a similarity matrix generated non-numerically, rather than turning the lexical data into binary numeric data and letting that produce a similarity matrix.
Update, following Whygh's comment below: here's what SplitsTree gives based on the same table: