Python

Modeling Cycles of Grift with Evolutionary Game Theory

We are in a golden age of grift. Where adventurers once flocked to California or the Yukon because “there was gold in them thar hills,” the fastest way to get rich today is by fleecing suckers. We’ve got crypto rug pulls, meme stocks, nutritional supplements, MLMs—anything to make a quick buck.

The Art and Mathematics of Genji-Kō

November 26, 2024
Math
Visualization
History
Python

You might think it’s unlikely for any interesting mathematics to arise from incense appreciation, but that’s only because you’re unfamiliar with the peculiar character of Muromachi (室町) era Japanese nobles. There has never been a group of people, in any time or place, who were so driven to display their sophistication and refinement.

Let's Play Jeopardy! with LLMs

May 12, 2024
Python
Machine Learning
LLM

How good are LLMs at trivia? I used the Jeopardy! dataset from Kaggle to benchmark ChatGPT and the new Llama 3 models. Here are the results: There you go. You’ve already gotten 90% of what you’re going to get out of this article. Some guy on the internet ran a half-baked benchmark on a handful of LLM models, and the results were largely in line with popular benchmarks and received wisdom on fine-tuning and RAG.

Kaprekar's Magic 6174

February 25, 2024
Math
Python
Visualization

Kaprekar’s routine is a simple arithmetic procedure on four-digit numbers which rapidly converges to the fixed point 6174, known as the Kaprekar constant. Unlike other famous iterative procedures such as the Collatz function, the ad hoc nature of the Kaprekar routine doesn’t hint at fundamental mathematical discoveries yet to be made.
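As a rough illustration of the procedure described above, here is a minimal sketch. It assumes the usual statement of the routine: sort the four digits (keeping leading zeros) in descending and ascending order, subtract the smaller number from the larger, and repeat. The function names `kaprekar_step` and `kaprekar_iterations` are invented for this example and are not taken from the article.

```python
def kaprekar_step(n: int) -> int:
    """Apply one round of the routine to a number padded to four digits."""
    digits = sorted(f"{n:04d}")              # ascending digits, leading zeros kept
    smallest = int("".join(digits))
    largest = int("".join(reversed(digits)))
    return largest - smallest


def kaprekar_iterations(n: int) -> int:
    """Count the rounds needed to reach 6174 from n."""
    if not 0 <= n <= 9999:
        raise ValueError("n must have at most four digits")
    if len(set(f"{n:04d}")) == 1:
        # Repdigits such as 1111 collapse to 0 and never reach 6174.
        raise ValueError("n must contain at least two distinct digits")
    steps = 0
    while n != 6174:
        n = kaprekar_step(n)
        steps += 1
    return steps


if __name__ == "__main__":
    print(kaprekar_iterations(3524))
```

Starting from 3524, for instance, the sequence runs 3087, 8352, 6174.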

Cracking Playfair Ciphers

September 13, 2023
Math
Python
Visualization

In 2020, the Zodiac 340 cipher was finally cracked by amateur code breakers after more than 50 years of attempts. While the effort to crack it was extremely impressive, the cipher itself was ultimately disappointing. It was a homophonic substitution cipher with a minor gimmick of writing diagonally, and the main factor that prevented it from being solved much earlier was the several errors the Zodiac killer made when encoding it.

ML From Scratch, Part 6: Principal Component Analysis

In the previous article in this series we distinguished between two kinds of unsupervised learning (cluster analysis and dimensionality reduction) and discussed the former in some detail. In this installment we turn our attention to the latter. In dimensionality reduction we seek a function $f : \mathbb{R}^n \to \mathbb{R}^m$ where $n$ is the dimension of the original data $\mathbf{X}$ and $m$ is less than or equal to $n$.

A Seriously Slow Fibonacci Function

July 6, 2019
Python
Math
Computer Science

I recently wrote an article which was ostensibly about the Fibonacci series but was really about optimization techniques. I wanted to follow up on its (extremely moderate) success by going in the exact opposite direction: by writing a Fibonacci function which is as slow as possible. This is not as easy as it sounds: any program can trivially be made slower, but this is boring.

ML From Scratch, Part 5: Gaussian Mixture Models

Consider the following motivating dataset: It is apparent that these data have some kind of structure; which is to say, they certainly are not drawn from a uniform or other simple distribution. In particular, there is at least one cluster of data in the lower right which is clearly separate from the rest.

Adaptive Basis Functions

Today, let me be vague. No statistics, no algorithms, no proofs. Instead, we’re going to go through a series of examples and eyeball a suggestive series of charts, which will imply a certain conclusion, without actually proving anything; but which will, I hope, provide useful intuition. The premise is this:

ML From Scratch, Part 4: Decision Trees

So far in this series we’ve followed one particular thread: linear regression -> logistic regression -> neural network. This is a very natural progression of ideas, but it really represents only one possible approach. Today we’ll switch gears and look at a model with a completely different pedigree: the decision tree, sometimes also referred to as Classification and Regression Trees, or simply CART models.

A Fairly Fast Fibonacci Function

February 19, 2019
Python
C++
Math
Computer Science

A common example of recursion is the function to calculate the $n$-th Fibonacci number:

    def naive_fib(n):
        if n < 2:
            return n
        else:
            return naive_fib(n-1) + naive_fib(n-2)

This follows the mathematical definition very closely but its performance is terrible: roughly $\mathcal{O}(2^n)$. This is commonly patched up with dynamic programming. Specifically, either the memoization:
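A minimal sketch of what a memoized variant could look like, assuming the standard library's `functools.lru_cache` is used as the cache. The name `memo_fib` is illustrative, and the article's own version may differ.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def memo_fib(n: int) -> int:
    """Same recurrence as naive_fib, but each value is computed only once."""
    if n < 2:
        return n
    return memo_fib(n - 1) + memo_fib(n - 2)


if __name__ == "__main__":
    print(memo_fib(90))
```

Because every result is cached, the number of distinct calls grows linearly in $n$ rather than exponentially. The recursion is still as deep as $n$, so very large arguments can hit Python's recursion limit.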

ML From Scratch, Part 3: Backpropagation

In today’s installment of Machine Learning From Scratch we’ll build on the logistic regression from last time to create a classifier which is able to automatically represent non-linear relationships and interactions between features: the neural network. In particular I want to focus on one central algorithm which allows us to apply gradient descent to deep neural networks: the backpropagation algorithm.

ML From Scratch, Part 2: Logistic Regression

In this second installment of the Machine Learning From Scratch series we switch the point of view from regression to classification: instead of estimating a number, we will be trying to guess which of two possible classes a given input belongs to. A modern example is looking at a photo and deciding if it’s a cat or a dog.

ML From Scratch, Part 1: Linear Regression

To kick off this series, we’ll start with something simple yet foundational: linear regression via ordinary least squares. While not particularly exciting, linear regression finds widespread use both as a standalone learning algorithm and as a building block in more advanced learning algorithms.

ML From Scratch, Part 0: Introduction

Motivation

“As an apprentice, every new magician must prove to his own satisfaction, at least once, that there is truly great power in magic.” (The Flying Sorcerers, by David Gerrold and Larry Niven)

How do you know if you really understand something? You could just rely on the subjective experience of feeling like you understand.

Craps Variants

July 11, 2018
Python
Math
Statistics

Craps is a surprisingly fair game. I remember calculating the probability of winning craps for the first time in an undergraduate discrete math class: I went back through my calculations several times, certain there was a mistake somewhere. How could it be closer than $\frac{1}{36}$? (Spoiler warning: if you haven’t calculated these odds for yourself then you may want to do so before reading further.
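For readers who would rather check the calculation by machine, here is a small sketch using exact fractions. It assumes the standard pass-line rules: win on a come-out roll of 7 or 11, lose on 2, 3, or 12, and otherwise keep rolling until either the point (a win) or a 7 (a loss) appears. The function name `pass_line_win_probability` is invented for this example. The result is deliberately not printed here, in keeping with the spoiler warning.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def pass_line_win_probability() -> Fraction:
    """Exact probability of winning a pass-line bet with two fair dice."""
    # Probability of each total over the 36 equally likely outcomes.
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    p = {total: Fraction(c, 36) for total, c in counts.items()}

    # Immediate win on the come-out roll.
    win = p[7] + p[11]

    # Otherwise a point is set; only the point and 7 end the round,
    # so the chance of making the point is p[point] / (p[point] + p[7]).
    for point in (4, 5, 6, 8, 9, 10):
        win += p[point] * p[point] / (p[point] + p[7])
    return win


if __name__ == "__main__":
    print(pass_line_win_probability())
```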

Semantic Code

April 30, 2008
Python
Philosophy

se-man-tic (si-man’tik) adj. 1. Of or relating to meaning, especially meaning in language. Programming destroys meaning. When we program, we first replace concepts with symbols and then replace those symbols with arbitrary codes — that’s why it’s called coding. At its worst programming is write-only: the program accomplishes a task, but is incomprehensible to humans.