K-means clustering of the MNIST data set using an AWS SageMaker notebook
This is an example walkthrough of K-means clustering in a Jupyter notebook on AWS SageMaker, following a tutorial from a GitHub repository [1].
Summary of the procedure
- Download the MNIST input data set and upload it to S3.
- Load the input data and build the K-means model in the SageMaker Jupyter notebook.
- Retrieve the clustering results and verify the model.
MNIST Input Data Loading
- The mnist.pkl.gz data file is downloaded from the internet and uploaded directly to the notebook instance.
- The training and test data sets are created using the pickle module.
- The input data is then sent to the created S3 bucket.
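The loading step above can be sketched as follows. This is a minimal, runnable illustration, not the tutorial's exact code: it fabricates a tiny stand-in for mnist.pkl.gz (the real file holds much larger train/validation/test splits in the same layout) so the pickle-loading pattern can be shown end to end.

```python
import gzip
import pickle
import numpy as np

# Fabricate a tiny stand-in for mnist.pkl.gz so this sketch runs without
# the real download. The classic file holds (train, valid, test) tuples
# of (images, labels), with images flattened to 784 floats.
rng = np.random.default_rng(0)

def fake_split(n):
    return rng.random((n, 784), dtype=np.float32), rng.integers(0, 10, n)

with gzip.open("mnist.pkl.gz", "wb") as f:
    pickle.dump((fake_split(60), fake_split(10), fake_split(10)), f)

# Load the pickled splits, as the notebook does for the real file.
with gzip.open("mnist.pkl.gz", "rb") as f:
    train_set, valid_set, test_set = pickle.load(f)

train_X, train_y = train_set
print(train_X.shape, train_y.shape)  # (60, 784) (60,)

# Sending the file to the S3 bucket would then use boto3, e.g.
# (bucket_name is a placeholder for your own bucket):
# import boto3
# boto3.client("s3").upload_file("mnist.pkl.gz", bucket_name, "mnist.pkl.gz")
```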

K-Means Model Build
The K-means model attributes are set and the model is trained on the input data. The number of clusters is set to 10, one for each digit from 0 to 9. The trained model is then deployed.
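To make the training step concrete, here is a minimal Lloyd's-algorithm sketch of what K-means with k=10 computes; it runs on synthetic data rather than MNIST, and the SageMaker equivalent is hedged in the trailing comments (parameter names may vary by SDK version).

```python
import numpy as np

def kmeans(X, k=10, iters=20, seed=0):
    """Minimal Lloyd's algorithm: alternate nearest-centroid assignment
    and centroid updates, illustrating what the built-in K-means
    estimator does during training."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid by Euclidean distance.
        d = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                centroids[j] = pts.mean(axis=0)
    return centroids, labels

# Synthetic stand-in for the flattened MNIST images.
X = np.random.default_rng(1).random((200, 784)).astype(np.float32)
centroids, labels = kmeans(X, k=10)
print(centroids.shape)  # (10, 784)

# On SageMaker the equivalent is roughly (an assumption sketched from the
# sagemaker SDK; check your SDK version for exact parameter names):
# from sagemaker import KMeans
# estimator = KMeans(role=role, instance_count=1,
#                    instance_type="ml.m5.large", k=10)
# estimator.fit(estimator.record_set(train_X))
# predictor = estimator.deploy(initial_instance_count=1,
#                              instance_type="ml.m5.large")
```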

Showing the Output Clusters
The top cluster classes are found, and the validation data is shown to indeed belong to its classified cluster.
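The verification step can be sketched like this: assign each validation image to its nearest centroid, then count which true digit labels dominate each cluster. The centroids and labeled validation data below are synthetic placeholders standing in for the deployed model's output and the MNIST validation split.

```python
import numpy as np
from collections import Counter

# Placeholder centroids and labeled validation data (assumptions for
# illustration; in the notebook these come from the trained model and
# the real MNIST split).
rng = np.random.default_rng(2)
centroids = rng.random((10, 784), dtype=np.float32)
valid_X = rng.random((50, 784), dtype=np.float32)
valid_y = rng.integers(0, 10, 50)

# Assign each validation image to its nearest centroid...
assign = np.linalg.norm(valid_X[:, None] - centroids[None], axis=2).argmin(axis=1)

# ...then inspect which true digits dominate each cluster.
for c in range(10):
    top = Counter(valid_y[assign == c].tolist()).most_common(3)
    print(f"cluster {c}: top digits {top}")
```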

The code used in this example is shown in [4].
References
[1] GitHub - mtm12/SageMakerDemo
[2] K-Means Algorithm - Amazon SageMaker
[3] US Census Population Segmentation with PCA and K-Means - Amazon SageMaker Examples, https://sagemakerexamples.readthedocs.io/en/latest/introduction_to_applying_machine_learning/US-census_population_segmentation_PCA_Kmeans/sagemaker-countycensusclustering.html
[4] GitHub - dchoi/mnistKmean: mnistKmean



