Given a set of $n$ points and their pairwise distances, the goal of clustering is to partition the points into a ``small'' number of ``related'' sets. Clustering algorithms are used widely to manage, classify, and summarize many kinds of data. In this dissertation, we study the classic \emph{facility location} and \emph{$k$-median} problems in the context of clustering, and formulate and study a new optimization problem that we call the \emph{online median} problem. For each of these problems, it is known to be \NP-hard to compute a solution with cost less than a certain constant factor times the optimal cost. We give simple constant-factor approximation algorithms for the facility location, $k$-median, and online median problems with optimal or near-optimal time bounds. We also study distance functions that are ``approximately'' metric, and show that such distance functions allow us to obtain a faster online median algorithm and to generalize our analysis to other objective functions, such as that of the well-known $k$-means heuristic. Given $n$ points, the associated interpoint distances and nonnegative point weights, and a nonnegative penalty for each point, the facility location problem asks us to identify a set of cluster centers so that the weighted average cluster radii and the sum of the cluster center penalties are both minimized. The $k$-median problem asks us to identify exactly $k$ cluster centers while minimizing just the weighted average cluster radii. We give a simple greedy algorithm for the facility location problem that runs in $O(n^2)$ time and produces a solution with cost at most $3$ times optimal. For the $k$-median problem, we develop and make use of a sampling technique that we call \emph{successive sampling}, and give a randomized constant-factor approximation algorithm that runs in $O(n(k+\log{n}+\log^2{n}))$ time. We also give an $\Omega(nk)$ lower bound on the running time of any randomized constant-factor approximation algorithm for the $k$-median problem that succeeds with even a negligible constant probability. In many settings, it is desirable to browse a given data set at differing levels of granularity (i.e., number of clusters). To address this concern, we formulate a generalization of the $k$-median problem that we call the \emph{online median problem}. The online median problem asks us to compute an ordering of the points so that, over all $i$, when a prefix of length $i$ is taken as a set of cluster centers, the weighted average radii of the induced clusters is minimized. We show that a natural generalization of the greedy strategy that we call \emph{hierarchically greedy} yields an algorithm that produces an ordering such that every prefix of the ordering is within a constant factor of the associated optimal cost. Furthermore, our algorithm has a running time of $\Theta(n^2)$. Finally, we study the performance of our algorithms in practice. We present implementations of our $k$-median and online median algorithms; our experimental results indicate that our approximation algorithms may be useful in practice.