This course includes one project, which will be due on TBD. You will create your project as a blog post and submit it via URL.
Your blog post will be an explainer for one idea that is adjacent to the DATA 1010 topic sequence. Here are some ideas:
- Applications of SVD
  - Image processing
  - Genomic data
- An application of a matrix decomposition in numerical linear algebra (see this book, for example).
- Constrained optimization and the KKT conditions
- Gradient Descent Algorithms
- The Mersenne Twister PRNG
- Random matrices and Wigner’s semicircle law
- Common distributions and their applications (other than the ones we did in class)
  - Hypergeometric distribution
  - Benford’s law
  - Cauchy distribution
- Applications of bootstrapping
- Maximum likelihood estimation and Fisher information
- Hypothesis testing in studies on a particular topic (e.g., mask-wearing and COVID-19). In other words, explain the hypothesis tests used in some of the relevant studies in a way that would be comprehensible to someone just beginning to learn about hypothesis tests.
- No-free-lunch theorem: is it valuable?
These topics are just ideas; you’re welcome (and encouraged) to choose your own topic. You can look at the free book Mathematics for Machine Learning for lots more ideas. Feel free to ask on Piazza to get feedback about any particular ideas you might be considering. We will also add to the list above as the term progresses.
When you have chosen a topic, please write a few sentences to describe it and outline a plan for exposition. Submit here. All submissions should be received by November 6th at the latest.
Here are some options for blog hosting:
- Medium. The go-to option for this sort of thing. It has a clean, beautiful writing interface. One major disadvantage is that it doesn’t directly support mathematical typesetting. So you’re going to have to explore options which involve generating images and including them that way (which can be annoying when you want to edit). You can also use the Math Anywhere Chrome extension (which is going to be less work, but it means someone has to install that extension to read your article). If you take the latter option, add a note at the beginning of the article with instructions for downloading the extension.
- FastPages. From FastAI, this option allows you to author content in Jupyter notebook format (or .docx, or Markdown) and have it automatically deployed via GitHub to a blog hosted on GitHub Pages. They’ve done the work of getting all the workflows organized, so all you have to do is a few initial setup steps and then upload your Jupyter Notebook to a specific directory in your repo. Check out this video walkthrough to see how it all works.
The main objectives to keep in mind as you write your article are readability, informativeness, and accuracy. You want to set the stage: explain the motivation and describe what you’re going to accomplish. Then get into the details, explain how the various components work in the context of an illustrative example, and also make some connections to real-world applications.
All of these are loose suggestions. The main goal is to tell a great story. The only specifically required element is some presentation of a nontrivial mathematical idea, even if it isn’t the main overall focus. It’s also very likely that you’re going to need to include some visuals to create a compelling reading experience.
A good example to look at for style and length is this article. At around 3600 words, it’s very much on the longer end of the spectrum; I recommend targeting 50% to 100% of its length, or roughly 1800 to 3600 words (shorter than 1500 words would be too short). The article also includes a code walkthrough, which is not required; whether you want to do that is up to you.
After the blog posts are submitted, each person will be responsible for reading two other students’ blog posts and providing feedback on what they found clear, interesting, helpful, etc. The mapping between readers and blog posts will be chosen uniformly at random.
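For the curious, here is a minimal sketch of one way such a random reader-to-post mapping could be generated. It is an illustration only, not necessarily the procedure that will be used for the course: the function name `assign_reviews`, its parameters, and the roster names are all hypothetical. The sketch shuffles the roster and has each student read the posts of the next two students in the shuffled cycle, which guarantees that nobody reviews their own post and that every post gets exactly two readers. Note that this simple scheme does not sample uniformly from all possible balanced assignments.

```python
import random


def assign_reviews(students, per_reader=2, seed=None):
    """Map each student to `per_reader` classmates whose posts they will read.

    Hypothetical helper: shuffles the roster, then assigns each student
    the next `per_reader` students in the shuffled (cyclic) order.
    """
    if len(students) <= per_reader:
        raise ValueError("need more students than reviews per reader")
    rng = random.Random(seed)
    order = list(students)
    rng.shuffle(order)  # uniformly random roster order
    n = len(order)
    # Offsets 1..per_reader are all smaller than n, so a student never
    # draws their own post and never draws the same post twice.
    return {
        order[i]: [order[(i + k) % n] for k in range(1, per_reader + 1)]
        for i in range(n)
    }


if __name__ == "__main__":
    roster = ["Ada", "Ben", "Chen", "Dee", "Eli"]  # made-up names
    for reader, authors in assign_reviews(roster, seed=0).items():
        print(reader, "reads posts by", authors)
```

Passing a fixed `seed` makes the assignment reproducible; leaving it as `None` gives a fresh shuffle on each run.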
The review deadline will be two weeks after the blog post submission deadline. The review should not take much of your time. Its purpose is twofold: it gives you an incentive to write a post good enough to be helpful to at least your two readers from the class, and it provides you with peer feedback.