Welcome to the DATA 1010 homepage! In this course, we develop mathematical ideas for data science in a visual and computation-oriented way. Class problem sets, weekly assignments, and other resources are available from the menu bar above.

Course Description

In this course, we will introduce the mathematical methods of data science through a combination of computational exploration, visualization, and theory. We will cover scientific computing basics, topics in numerical linear algebra, mathematical probability (probability spaces, expectation, conditioning, common distributions, the law of large numbers, and the central limit theorem), statistics (point estimation, confidence intervals, hypothesis testing, maximum likelihood estimation, density estimation, bootstrapping, and cross-validation), and machine learning (regression, classification, and dimensionality reduction, including neural networks, principal component analysis, unsupervised learning, Bayesian methods, and graphical models).

For those interested in studying this material without being enrolled in the course, I recommend the following workflow: begin with the Data Gymnasia courses, then work through the in-class exercise Jupyter notebooks and watch the follow-up videos. The weekly problem sets are also available, and practice exams will be posted as well.