diff --git a/src/unsupervised_learning.ipynb b/src/unsupervised_learning.ipynb
new file mode 100644
index 0000000..c060bde
--- /dev/null
+++ b/src/unsupervised_learning.ipynb
@@ -0,0 +1,563 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Unsupervised Learning"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "New R commands\n",
+    "- `prcomp(x, scale = TRUE)` performs PCA on data `x`\n",
+    "- `biplot(pr.out, scale=0)` plots scores and loadings of a PCA `pr.out` in one figure.\n",
+    "- `cumsum(x)` computes the cumulative sum of the elements of a numeric vector.\n",
+    "- `kmeans(x,k,nstart=)` run k-means clustering on data `x` with `nstart` restarts."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## PCA"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this lab, we perform PCA on the USArrests data set, which is part of\n",
+    "the base R package. The rows of the data set contain the 50 states, in\n",
+    "alphabetical order."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "states=row.names(USArrests)\n",
+    "states"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The columns of the data set contain the four variables."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "names(USArrests)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We first briefly examine the data. We notice that the variables have vastly\n",
+    "different means."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "apply(USArrests, 2, mean)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Note that the apply() function allows us to apply a function—in this case,\n",
+    "the mean() function—to each row or column of the data set. The second\n",
+    "input here denotes whether we wish to compute the mean of the rows, 1,\n",
+    "or the columns, 2. We see that there are on average three times as many\n",
+    "rapes as murders, and more than eight times as many assaults as rapes.\n",
+    "We can also examine the variances of the four variables using the apply()\n",
+    "function."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "apply(USArrests, 2, var)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Not surprisingly, the variables also have vastly different variances: the\n",
+    "UrbanPop variable measures the percentage of the population in each state\n",
+    "living in an urban area, which is not a comparable number to the number of rapes in each state per 100,000 individuals. If we failed to scale the\n",
+    "variables before performing PCA, then most of the principal components\n",
+    "that we observed would be driven by the Assault variable, since it has by\n",
+    "far the largest mean and variance. Thus, it is important to standardize the\n",
+    "variables to have mean zero and standard deviation one before performing\n",
+    "PCA.\n",
+    "\n",
+    "We now perform principal components analysis using the prcomp() function, which is one of several functions in R that perform PCA."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pr.out=prcomp(USArrests, scale=TRUE)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "By default, the prcomp() function centers the variables to have mean zero.\n",
+    "By using the option scale=TRUE , we scale the variables to have standard\n",
+    "deviation one. The output from prcomp() contains a number of useful quantities."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "names(pr.out)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pr.out$center"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pr.out$scale"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The rotation matrix provides the principal component loadings; each column of pr.out$rotation contains the corresponding principal component\n",
+    "loading vector."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pr.out$rotation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We see that there are four distinct principal components. This is to be\n",
+    "expected because there are in general min(n − 1, p) informative principal\n",
+    "components in a data set with n observations and p variables.\n",
+    "\n",
+    "This function names the loadings the rotation matrix, because when we matrix-multiply the\n",
+    "X matrix by pr.out$rotation, it gives us the coordinates of the data in the rotated\n",
+    "coordinate system. These coordinates are the principal component scores."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pr.out$x"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To see that we actually recover these scores by multiplying the data X with the loading matrix pr.out$rotation we scale the data and perform the matrix multiplication."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "USArrests.scaled = scale(USArrests)\n",
+    "USArrests.scaled %*% pr.out$rotation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can plot the first two principal components as follows:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "biplot(pr.out, scale=0)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The scale=0 argument to biplot() ensures that the arrows are scaled to\n",
+    "represent the loadings; other values for scale give slightly different biplots\n",
+    "with different interpretations.\n",
+    "\n",
+    "Alternatively we could directly plot the scores alone.\n",
+    "We see clearly the scores of the states California, Nevada and Florida on the left."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plot(pr.out$x[,1:2], xlim = c(-3, 3), ylim = c(-3, 3))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The principal components are only unique up to a sign change, so we could have also obtained the following picture with equivalent information (this figure can be found in the text book as Figure 10.1):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pr.out$rotation=-pr.out$rotation\n",
+    "pr.out$x=-pr.out$x\n",
+    "biplot(pr.out, scale=0)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The prcomp() function also outputs the standard deviation of each principal component. For instance, on the USArrests data set, we can access\n",
+    "these standard deviations as follows:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pr.out$sdev"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The variance explained by each principal component is obtained by squar-\n",
+    "ing these:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pr.var=pr.out$sdev^2\n",
+    "pr.var"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To compute the proportion of variance explained by each principal component, we simply divide the variance explained by each principal component\n",
+    "by the total variance explained by all four principal components:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pve=pr.var/sum(pr.var)\n",
+    "pve"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We see that the first principal component explains 62.0 % of the variance\n",
+    "in the data, the next principal component explains 24.7 % of the variance,\n",
+    "and so forth. We can plot the PVE explained by each component, as well\n",
+    "as the cumulative PVE, as follows:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plot(pve, xlab=\"Principal Component\", ylab=\"Proportion of Variance Explained\", ylim=c(0,1),type='b')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plot(cumsum(pve), xlab=\"Principal Component\", ylab=\"Cumulative Proportion of Variance Explained\", ylim=c(0,1),type='b')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Note that the function cumsum() computes the cumulative sum of the elements of a numeric vector. For instance:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a=c(1,2,8,-3)\n",
+    "cumsum(a)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## K-Means Clustering"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The function kmeans() performs K-means clustering in R . We begin with\n",
+    "a simple simulated example in which there truly are two clusters in the\n",
+    "data: the first 25 observations have a mean shift relative to the next 25\n",
+    "observations."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "set.seed(2)\n",
+    "x=matrix(rnorm(50*2), ncol=2)\n",
+    "x[1:25,1]=x[1:25,1]+3\n",
+    "x[1:25,2]=x[1:25,2]-4"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We now perform K-means clustering with K = 2."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "km.out=kmeans(x,2,nstart=20)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The cluster assignments of the 50 observations are contained in\n",
+    "km.out$cluster."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "km.out$cluster"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The K-means clustering perfectly separated the observations into two clusters even though we did not supply any group information to kmeans() . We\n",
+    "can plot the data, with each observation colored according to its cluster\n",
+    "assignment."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plot(x, col=(km.out$cluster+1), main=\"K-Means Clustering Results with K=2\", xlab=\"\", ylab=\"\", pch=20, cex=2)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Here the observations can be easily plotted because they are two-dimensional.\n",
+    "If there were more than two variables then we could instead perform PCA\n",
+    "and plot the first two principal components score vectors.\n",
+    "\n",
+    "In this example, we knew that there really were two clusters because\n",
+    "we generated the data. However, for real data, in general we do not know\n",
+    "the true number of clusters. We could instead have performed K-means\n",
+    "clustering on this example with K = 3."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "set.seed(4)\n",
+    "km.out=kmeans(x,3,nstart=20)\n",
+    "km.out"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plot(x, col=(km.out$cluster+1), main=\"K-Means Clustering Results with K=3\", xlab=\"\", ylab=\"\", pch=20, cex=2)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "When K = 3, K-means clustering splits up the two clusters.\n",
+    "\n",
+    "To run the kmeans() function in R with multiple initial cluster assignments, we use the nstart argument. If a value of nstart greater than one\n",
+    "is used, then K-means clustering will be performed using multiple random\n",
+    "assignments in Step 1 of the Algorithm on slide 35 (Algorithm 10.1 in the book), and the kmeans() function will\n",
+    "report only the best results. Here we compare using nstart=1 to nstart=20 ."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "set.seed(311)\n",
+    "km.out=kmeans(x,3,nstart=1)\n",
+    "km.out$tot.withinss"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "km.out=kmeans(x,3,nstart=20)\n",
+    "km.out$tot.withinss"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Note that km.out$tot.withinss is the total within-cluster sum of squares,\n",
+    "which we seek to minimize by performing K-means clustering (Equation\n",
+    "10.11). The individual within-cluster sum-of-squares are contained in the\n",
+    "vector km.out$withinss.\n",
+    "\n",
+    "We strongly recommend always running K-means clustering with a large\n",
+    "value of nstart, such as 20 or 50, since otherwise an undesirable local\n",
+    "optimum may be obtained.\n",
+    "\n",
+    "When performing K-means clustering, in addition to using multiple ini-\n",
+    "tial cluster assignments, it is also important to set a random seed using the\n",
+    "set.seed() function. This way, the initial cluster assignments in Step 1 can\n",
+    "be replicated, and the K-means output will be fully reproducible."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "R",
+   "language": "R",
+   "name": "ir"
+  },
+  "language_info": {
+   "codemirror_mode": "r",
+   "file_extension": ".r",
+   "mimetype": "text/x-r-source",
+   "name": "R",
+   "pygments_lexer": "r",
+   "version": "3.6.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}