A lightweight, business‑ready analytics tool that transforms raw text into visual insights.
Upload customer feedback, support tickets, policies, CVs, or any text dataset — the app generates embeddings, reduces them to 2D, clusters them, and reveals hidden patterns.
Perfect for SMEs, councils, charities, and teams that need fast, AI‑powered text understanding without complex infrastructure.



This tool is intentionally small but delivers real business value.

Install the dependencies and start the app:

    pip install -r requirements.txt
    streamlit run app.py

Project structure:

mini-embedding-explorer/
│
├── app.py              # Streamlit UI
├── screenshots/
├── embeddings.py       # Embedding + clustering engine
├── requirements.txt    # Dependencies
├── README.md           # Project documentation
└── examples/
    └── feedback.csv

How it works:

Embeddings: Text is converted into numerical vectors using a SentenceTransformer model.
Dimensionality Reduction: High‑dimensional vectors are compressed into 2D using PCA or UMAP.
Clustering: KMeans groups similar texts together.
Visualisation: The 2D points are plotted so humans can see patterns instantly.

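
The four stages above can be sketched roughly as follows. This is a minimal illustration, not the contents of embeddings.py: the function names, the model name "all-MiniLM-L6-v2", and the choice to run KMeans on the full-dimensional vectors (rather than on the 2D projection) are all assumptions. Only PCA is shown; UMAP would slot in at the same step.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


def embed_texts(texts, model_name="all-MiniLM-L6-v2"):
    """Encode texts into vectors. The model name is an assumption."""
    # Imported lazily so the rest of the sketch works without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return np.asarray(model.encode(list(texts)))


def reduce_and_cluster(vectors, n_clusters=4, random_state=0):
    """Project vectors to 2D with PCA and group them with KMeans."""
    vectors = np.asarray(vectors, dtype=float)
    # 2D coordinates, used only for plotting.
    coords = PCA(n_components=2, random_state=random_state).fit_transform(vectors)
    # Assumption: cluster on the original vectors, not the 2D projection.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = kmeans.fit_predict(vectors)
    return coords, labels
```

The returned coords and labels can then be passed to any scatter plot, colouring each point by its cluster label.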
Usage:

1. Upload a CSV of customer comments
2. Select the text column
3. Choose PCA or UMAP
4. Pick the number of clusters
5. Generate embeddings
6. Explore clusters visually
7. Download results

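
The upload and download steps could look something like the sketch below, which uses only the standard library. The helper names and the output column names (text, x, y, cluster) are hypothetical; the app itself may handle CSV files differently.

```python
import csv
import io


def read_text_column(csv_text, column):
    """Return the non-empty values of one column from CSV content."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"Column {column!r} not found in CSV header")
    # Skip rows where the chosen column is missing or blank.
    return [row[column].strip() for row in reader if (row[column] or "").strip()]


def results_to_csv(texts, coords, labels):
    """Serialise texts with their 2D coordinates and cluster labels."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["text", "x", "y", "cluster"])  # assumed column names
    for text, (x, y), label in zip(texts, coords, labels):
        writer.writerow([text, x, y, label])
    return buffer.getvalue()
```

Blank rows are dropped before embedding so that empty comments do not form a cluster of their own.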
A business uploads feedback.csv:
- "Delivery was late again"
- "Website login keeps failing"
- "Customer service was excellent"
- "Refund process is confusing"

The tool reveals clusters like:
- Delivery issues
- Website bugs
- Positive service comments
- Refund complaints

This helps teams prioritise improvements.

Advanced Embedding Models — Add support for larger or domain‑specific models (legal, financial, medical) to improve clustering accuracy for specialised industries.
Semantic Search Engine — Allow users to search their dataset using natural language queries powered by embeddings, turning the tool into a mini knowledge explorer.
Topic Labeling — Automatically assign human‑readable labels to clusters (e.g., “Delivery Issues”, “Refund Complaints”), making insights easier for non‑technical teams.
Interactive Cluster Editing — Let users merge, rename, or split clusters directly in the UI, enabling custom business workflows and cleaner reporting.
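
As a rough idea of how the Semantic Search Engine item might work, the sketch below ranks stored embeddings by cosine similarity to a query embedding. It is an assumption about one possible design, not an existing feature: the function name top_matches is hypothetical, and the query vector would need to come from the same embedding model as the documents.

```python
import numpy as np


def top_matches(query_vector, doc_vectors, k=3):
    """Return (index, cosine similarity) for the k most similar rows."""
    docs = np.asarray(doc_vectors, dtype=float)
    query = np.asarray(query_vector, dtype=float)
    # Divide the dot products by the vector norms to get cosine similarity.
    doc_norms = np.linalg.norm(docs, axis=1)
    query_norm = np.linalg.norm(query)
    scores = docs @ query / np.maximum(doc_norms * query_norm, 1e-12)
    # Highest similarity first.
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]
```

The returned indices point back into the uploaded rows, so the matching texts can be shown alongside their scores.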