HapMap Population PCA

Two-dimensional visualization of genotype data, with samples from eleven different ethnic populations collected by the HapMap Consortium:
        ASW: African ancestry in Southwest USA
        CEU: Utah residents of Northern and Western European ancestry
        CHB: Han Chinese in Beijing, China
        CHD: Chinese in Metropolitan Denver, Colorado
        GIH: Gujarati Indians in Houston, Texas
        JPT: Japanese in Tokyo, Japan
        LWK: Luhya in Webuye, Kenya
        MEX: Mexican ancestry in Los Angeles, California
        MKK: Maasai in Kinyawa, Kenya
        TSI: Toscani in Italia
        YRI: Yoruba in Ibadan, Nigeria
The image uses the single nucleotide polymorphism (SNP) data available here (~1,500,000 SNPs). It is produced by multidimensional scaling (MDS) of the samples, using the total number of allele mismatches between two samples as the distance metric. A scatterplot of the first two MDS components is shown. Click here for a scatterplot of the first and third components, and here for a scatterplot of the second and third. For a three-dimensional view of the data, download this Matlab figure file.
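As a rough illustration of the idea, here is a minimal Python sketch of classical (Torgerson) MDS on an allele-mismatch distance matrix. It is not the code behind the image: the function names, the 0/1/2 allele-count encoding, the choice of classical MDS, and the random toy genotypes are all assumptions made for the example.

```python
import numpy as np


def mismatch_distances(genotypes):
    """Pairwise total allele mismatches between samples.

    genotypes: (n_samples, n_snps) array of allele counts (0, 1 or 2).
    Assumption: the mismatch at one SNP is the absolute difference
    in allele counts between the two samples.
    """
    g = np.asarray(genotypes, dtype=np.int64)
    # Broadcasting builds an (n, n, n_snps) array: fine for a toy example,
    # but real data with ~1.5 million SNPs would need to be chunked.
    return np.abs(g[:, None, :] - g[None, :, :]).sum(axis=2)


def classical_mds(dist, n_components=3):
    """Embed samples so Euclidean distances approximate `dist`."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-center the squared distances to get a Gram matrix.
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # eigh returns ascending eigenvalues; keep the largest ones.
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Toy data: 6 hypothetical samples, 50 hypothetical SNPs.
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(6, 50))
coords = classical_mds(mismatch_distances(toy_genotypes), n_components=3)
```

Plotting `coords[:, 0]` against `coords[:, 1]` would correspond to the first-versus-second component scatterplot, and the other column pairs to the alternative views.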


Google Scholar Word Clouds

These word clouds were generated automatically by an R script that pulls information from Google Scholar, as described in this Simply Statistics blog post. The left cloud shows my common co-authors; the right cloud shows words that commonly appear in my publications and citations.


Beer Word Clouds

Separate word clouds for two brands of beer are shown below. Follow this link to see similar word clouds for over 14,000 different beers. These word clouds were generated automatically by scraping over 1.5 million reviews from BeerAdvocate.com. Words that appear more often in reviews for a beer appear larger, but some commonly used words are filtered out. See this R script for the code I used to create these images.
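The word-sizing step can be sketched in a few lines. This is a Python illustration, not the R script linked above; the stop-word list, the tokenizing regular expression, and the two sample reviews are made up for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the words actually filtered out are not given here.
STOP_WORDS = {"the", "a", "an", "and", "of", "is", "it", "this", "with", "beer"}


def word_weights(reviews, stop_words=STOP_WORDS, top_n=50):
    """Count how often each non-stop word appears across a beer's reviews."""
    counts = Counter()
    for text in reviews:
        # Lowercase, then keep runs of letters and apostrophes as words.
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in stop_words:
                counts[word] += 1
    # A larger count would translate to a larger word in the cloud.
    return counts.most_common(top_n)


# Two made-up reviews for a single beer.
sample_reviews = [
    "A hoppy beer with citrus and pine.",
    "Hoppy, bitter, and a citrus finish.",
]
weights = word_weights(sample_reviews)
```

The resulting (word, count) pairs could then be handed to any word cloud renderer, with the count controlling font size.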



Prime pair bounds

The chart below displays the smallest proven bound on a gap between pairs of prime numbers that occurs infinitely often. Beginning with Yitang Zhang's initial bound of 70 million on May 14, 2013, the mathematics community has steadily pushed the bound lower. Progress may lead to a proof of the twin prime conjecture, which states that there are infinitely many pairs of prime numbers with a difference of 2. See this article for a good summary. This chart shows the points at which the smallest known bound has changed, who deserves credit, and a link to their argument. The log-scale chart underneath can be used to zoom in on a particular date range. Created using this Google Chart API, using data mostly from this polymath page.
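Picking out the points at which the smallest known bound changes amounts to a running minimum over dated claims. Here is a small Python sketch of that step. It is not the code behind the chart; the function name is made up, and the dates, values, and credits are toy placeholders rather than the real history.

```python
from datetime import date


def record_points(claims):
    """Return the claims that lower the best known bound, in date order.

    claims: iterable of (date, bound, credit) tuples.
    """
    best = None
    records = []
    for when, bound, credit in sorted(claims, key=lambda c: c[0]):
        # Keep a claim only if it improves on everything before it.
        if best is None or bound < best:
            best = bound
            records.append((when, bound, credit))
    return records


# Toy placeholder values, NOT the actual sequence of results.
toy_claims = [
    (date(2000, 1, 1), 1000, "A"),
    (date(2000, 1, 5), 1200, "B"),  # weaker than the current best, so dropped
    (date(2000, 1, 9), 400, "C"),
]
records = record_points(toy_claims)
```

Each surviving tuple would become one step in the chart, with the bound on a log-scale axis.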