How does text classification work?

Category

1. Text embeddings are generated for the open-ended responses using an AI embeddings model.

2. A k-means cluster analysis is conducted on the responses.

3. A random sample of open-ended responses is drawn from each k-means cluster.

4. A category name is generated for each sample using a generative AI model.

5. The k-means cluster labels are replaced with the generated category names, and the named categories are added to the dataset.
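
The five steps above can be sketched end to end in Python. This is a minimal, dependency-free illustration, not the actual implementation: `embed` is a hypothetical stand-in for the AI embeddings model (it only counts words), `name_category` is a hypothetical stand-in for the generative AI model (it only picks the most common longer word), and the small `kmeans` function is a plain Lloyd's-algorithm loop where a real pipeline would more likely use a library. The example responses, `k=2`, and `sample_size=3` are also assumptions chosen for illustration.

```python
import random
from collections import Counter


def embed(text, vocab):
    # Stand-in for an AI embeddings model (assumption): word counts over a vocabulary.
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]


def kmeans(vectors, k, iterations=20, seed=0):
    # Minimal Lloyd's algorithm; returns one cluster index per vector.
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def name_category(sample):
    # Stand-in for a generative AI model (assumption): the most common longer word.
    words = [w.strip(".,!?") for text in sample for w in text.lower().split()]
    words = [w for w in words if len(w) > 3]
    return Counter(words).most_common(1)[0][0] if words else "uncategorized"


def classify(responses, k=2, sample_size=3, seed=0):
    rng = random.Random(seed)
    vocab = sorted({w for r in responses for w in r.lower().split()})
    vectors = [embed(r, vocab) for r in responses]      # step 1: embeddings
    labels = kmeans(vectors, k, seed=seed)              # step 2: k-means clustering
    names = {}
    for c in sorted(set(labels)):
        members = [r for r, label in zip(responses, labels) if label == c]
        sample = rng.sample(members, min(sample_size, len(members)))  # step 3: sample
        names[c] = name_category(sample)                # step 4: name the category
    # Step 5: replace cluster indices with category names in the dataset.
    return [{"response": r, "category": names[label]}
            for r, label in zip(responses, labels)]


responses = [
    "The delivery was late and the delivery box was damaged",
    "Late delivery again, very slow delivery",
    "Customer support was friendly and support solved my issue",
    "Great support team, quick support reply",
]
dataset = classify(responses)
for row in dataset:
    print(row["category"], "|", row["response"])
```

Because both AI steps are replaced by simple placeholders here, the generated names are only as good as a word count. The sketch shows how data moves between the steps, not the quality of the resulting categories.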

You should review and validate the category names (and their associations with the open-ended responses), since an AI model cannot be held responsible for its work.