If you've embedded a dataset with nomic-embed-text-v1.5, you can use the "process SAE" option in the embed step.
This annotates each row with SAE (sparse autoencoder) features from https://enjalot.github.io/latent-taxonomy/articles/about
You can then explore what are essentially the concepts that the embedding model uses to represent each data point.
You can also filter by a particular SAE feature to see which rows strongly activate for that concept.
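
As a rough illustration of what filtering by a feature means, here is a minimal sketch in Python. It assumes each row's SAE annotation can be read as a mapping from feature id to activation strength. That layout, the function name `rows_activating_feature`, and the `min_activation` threshold are all hypothetical and are not taken from the actual implementation.

```python
def rows_activating_feature(row_features, feature_id, min_activation=0.0):
    """Return (row_index, activation) pairs for rows where `feature_id` fires
    above `min_activation`, strongest activation first.

    `row_features` is a hypothetical layout: one dict per row mapping
    SAE feature id -> activation strength.
    """
    hits = []
    for row, feats in enumerate(row_features):
        # A feature missing from a row's annotation counts as not active.
        act = feats.get(feature_id, 0.0)
        if act > min_activation:
            hits.append((row, act))
    # Sort so the rows that activate most strongly come first.
    hits.sort(key=lambda h: h[1], reverse=True)
    return hits


# Toy annotations for four rows (made-up feature ids and activations).
row_features = [
    {5: 0.9, 12: 0.4, 7: 0.1},
    {12: 0.8, 3: 0.3, 9: 0.2},
    {1: 0.5, 2: 0.4, 3: 0.3},
    {7: 0.6, 12: 0.05},
]

print(rows_activating_feature(row_features, feature_id=12, min_activation=0.1))
# prints [(1, 0.8), (0, 0.4)]
```

Raising `min_activation` narrows the result to rows that express the concept more strongly. The values above are toy data chosen only to show the shape of the operation.
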
![Screenshot 2024-12-20 at 11 05 43 AM](https://private-user-images.githubusercontent.com/96189/397796441-1f52ffc3-9ccd-4c6d-89f5-474bdd156dd8.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDAxMTIzNzYsIm5iZiI6MTc0MDExMjA3NiwicGF0aCI6Ii85NjE4OS8zOTc3OTY0NDEtMWY1MmZmYzMtOWNjZC00YzZkLTg5ZjUtNDc0YmRkMTU2ZGQ4LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMjElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjIxVDA0Mjc1NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTUwNmNlOTllZjAwNzQyYjcyZGM5N2FhYzgyY2FiZTM5YTE2NzJlMjVlNWQ4OWQ2NjQ3YmZlNjIzMTVhMDNhNzYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.XnNv_yOgfQ8hCdl1PDTGque9EkMtXia7xJs2cczqyPA)
![Screenshot 2024-12-20 at 11 05 51 AM](https://private-user-images.githubusercontent.com/96189/397796461-c568b020-aa83-47c9-ad3b-597bef6b6533.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDAxMTIzNzYsIm5iZiI6MTc0MDExMjA3NiwicGF0aCI6Ii85NjE4OS8zOTc3OTY0NjEtYzU2OGIwMjAtYWE4My00N2M5LWFkM2ItNTk3YmVmNmI2NTMzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMjElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjIxVDA0Mjc1NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWY2MzY0Y2NlYTA2YTViZDA2MzVmYzMxOGIzMWQyOGYxYmI3OTliZmJkMDc4ODQ0YjMyYjlhYjk1YmMwMTQ2N2EmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.WeRx-5YraVls545hRKUdQ7EazLO15AulkURHSdugx-8)
![Screenshot 2024-12-20 at 11 06 19 AM](https://private-user-images.githubusercontent.com/96189/397796471-9c49e9a7-c47b-41b2-adfb-7963237b0332.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDAxMTIzNzYsIm5iZiI6MTc0MDExMjA3NiwicGF0aCI6Ii85NjE4OS8zOTc3OTY0NzEtOWM0OWU5YTctYzQ3Yi00MWIyLWFkZmItNzk2MzIzN2IwMzMyLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMjElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjIxVDA0Mjc1NlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTg5MjA5NWIzY2YzMDk3ZDY5MjBjNzJjYWRmOTI1MzFiZGY1ZmE1ZWQ4OGMyMDhlYTYxMzYzOTFmMGUwNTU3MWUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.SgUXH4e33E-xvXTyaGlFbnJrH-h_XCAGF1LjCY2eTCQ)