What is Unsupervised Machine Learning?
<aside>
💡
Unsupervised machine learning trains models on unlabeled data. The model finds patterns or groups in the data on its own, without human-provided labels.
</aside>
Clustering (a key unsupervised technique)
<aside>
💡
Clustering groups unlabeled examples based on how similar they are to each other.
</aside>
- If the examples were labeled, this task would be called classification instead.
Single data point and notation
- A single data point (for example, a document) is written as $X_i$, where $i$ is the index of the example, so $X_i$ is the $i^{th}$ document or example.
Clusters and notation
- In clustering, clusters are unnamed (no human labels), so we usually number them.
- A cluster number is written as $Z_i$, where $i$ means the $i^{th}$ cluster.
- A cluster is defined by its center and its shape/spread.
- The center of a cluster is often called the centroid.
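The ideas of a numbered cluster, a centroid, and a spread can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 2-D points and their cluster numbers are made up, the centroid is taken to be the coordinate-wise mean of a cluster's members, and spread is measured as the mean distance to the centroid (one of several reasonable choices). The helper names `centroid`, `spread`, and `nearest_cluster` are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical 2-D data points (made up for illustration).
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]

# Hypothetical cluster numbers, one per point. Clusters are numbered, not named.
assignments = [0, 0, 0, 1, 1, 1]


def centroid(members):
    """Center of a cluster, assumed here to be the coordinate-wise mean."""
    n = len(members)
    return tuple(sum(coords) / n for coords in zip(*members))


def spread(members, center):
    """One simple measure of spread: mean distance from members to the center."""
    return sum(dist(p, center) for p in members) / len(members)


def nearest_cluster(point, centers):
    """Number of the cluster whose centroid is closest to the point."""
    return min(centers, key=lambda k: dist(point, centers[k]))


# Group the points by their cluster number.
clusters = {}
for p, k in zip(points, assignments):
    clusters.setdefault(k, []).append(p)

# Describe each cluster by its center and its spread.
centers = {k: centroid(members) for k, members in clusters.items()}
spreads = {k: spread(clusters[k], centers[k]) for k in clusters}

for k in sorted(clusters):
    print(f"cluster {k}: centroid={centers[k]}, spread={spreads[k]:.3f}")
```

With the centroids in hand, a new unlabeled point can be given the number of the closest cluster, for example `nearest_cluster((2.0, 2.0), centers)`. No human-provided label is involved at any step: the cluster numbers are arbitrary identifiers, not category names.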

Human annotators