Posts

Let’s continue our discussion about Richard Kerby’s data on race/gender diversity in venture capital from Sunday. I did a touch more cleaning of the data (exact details of which are left as an exercise for the reader), leaving us with a nice tibble.

    x <- read_rds(url("https://www.davidkane.info/files/blog_files/vc.rds"))
    x$title <- as.factor(x$title)

I have assigned this object to x, which is my preferred name for whatever the main object of analysis is within an R session.


Richard Kerby wrote about diversity within venture capital. Interesting stuff. Even better, Kerby made his data public. Sadly, the data is a fairly non-tidy Excel file. The purpose of this post is to process it into something a little nicer.

    raw <- read_csv("https://www.davidkane.info/files/blog_files/kerby.csv",
      # Column names are a mess. So, after running the simple read_csv()
      # command on the raw file, I use spec(x) to get the default column
      # types and then use the col_types argument to set them by hand.
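The workflow described in those comments (read once with defaults, inspect spec(), then re-read with col_types set by hand) can be sketched on a toy file. This is only an illustration under stated assumptions: the inline CSV, its column names, and the chosen types are all invented here and are not Kerby's actual columns. Passing literal data via I() assumes readr 2.0 or later.

```r
library(readr)

# Toy CSV standing in for the real file; these column names are invented.
toy <- I("Firm Name,Title,Count\nAcme Capital,Partner,3\nBeta Ventures,Associate,5\n")

# Step 1: read with defaults and inspect the column types readr guessed.
first_pass <- read_csv(toy, show_col_types = FALSE)
guessed <- spec(first_pass)
print(guessed)

# Step 2: re-read, skipping the messy header row, supplying cleaner
# names, and setting each column type by hand.
clean <- read_csv(
  toy,
  skip = 1,
  col_names = c("firm", "title", "n"),
  col_types = cols(
    firm  = col_character(),
    title = col_character(),
    n     = col_integer()
  )
)
```

Setting the types explicitly means a later change in the raw file triggers a parsing problem instead of a silently different guess.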


I made this SEC comment 12 years ago. Alas, no one cared then and no one cares now. I still think it is a great idea! The SEC should pass a regulation requiring that all publicly traded companies allow their shareholders to vote on the following (binding) resolution each year. “The total compensation of both the CEO and the CFO shall not exceed $1 million in the coming fiscal year.” Those who dislike government meddling in business have little to complain of here since the government isn’t telling any business how to set salaries.


Eleven years ago, R made its first appearance at Boston’s Fenway Park.


I discussed “Patient–physician gender concordance and increased mortality among female heart attack patients” by Brad N. Greenwood, Seth Carnahan, and Laura Huang (henceforth GCH) the other day. Now that their Supplementary Information pdf is available, it is easy to see that their claims about “quasirandom” assignment are nonsense. To review, GCH claim that female physicians are better at treating female heart-attack victims than male physicians are. Their evidence is that female patients have higher survival rates when treated by female physicians.


Let’s use the ncaahoopR package to gather some data about Harvard Men’s Basketball from the 2017-2018 season. Unfortunately, the ncaahoopR package is not very flexible (and/or I don’t understand blogdown well enough), so there was no way for me to suppress its output messages. Apologies!

    knitr::opts_chunk$set(echo = TRUE)
    suppressMessages(library(tidyverse))
    suppressMessages(library(ncaahoopR))

    # One tricky aspect of creating posts with non-trivial run times and which rely
    # on outside data sources is that you don't want to re-run the analysis each
    # time you save a draft of the post, which is what happens with blogdown.
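One common way around the re-run problem mentioned in that comment is to cache the slow result on disk and only recompute when the cache is missing. The sketch below is not taken from the post. The cache path, slow_fetch(), get_games(), and the placeholder numbers are all hypothetical stand-ins for whatever the real data-gathering call would be.

```r
# Hypothetical cache location; any writable path works.
cache_file <- file.path(tempdir(), "harvard_games.rds")

# Stand-in for a slow call to an outside data source.
# The values are placeholders, not real game results.
slow_fetch <- function() {
  Sys.sleep(0.1)
  data.frame(game = 1:3, points = c(70, 65, 81))
}

# Return the cached result if it exists; otherwise fetch and cache it.
get_games <- function(path = cache_file) {
  if (file.exists(path)) {
    readRDS(path)
  } else {
    result <- slow_fetch()
    saveRDS(result, path)
    result
  }
}

games <- get_games()  # first call fetches and writes the cache
```

Deleting the cache file forces a fresh fetch the next time the post is built.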


“Patient–physician gender concordance and increased mortality among female heart attack patients” by Brad N. Greenwood, Seth Carnahan, and Laura Huang was just published in PNAS. It is getting lots of positive press. From The Atlantic:

Women More Likely to Survive Heart Attacks If Treated by Female Doctors

And male doctors do better when they have more female colleagues.

They showed that women are more likely to die when treated by male doctors, compared to either men treated by male doctors or women treated by female doctors.


Is the Earth warming? Let’s take a look at some satellite temperature measurements from the University of Alabama in Huntsville. Roy Spencer provides monthly updates (e.g., here for June 2018) using the UAH data. The purpose of this post is to replicate/improve his main image.

Global area-averaged lower tropospheric temperature anomalies (departures from 30-year calendar monthly means, 1981-2010). The 13-month centered average is meant to give an indication of the lower frequency variations in the data; the choice of 13 months is somewhat arbitrary… an odd number of months allows centered plotting on months with no time lag between the two plotted time series.
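A 13-month centered average of the kind described in that caption can be sketched in base R with stats::filter(). The anomaly values below are made up purely to exercise the code and are not UAH data. The variable names are likewise illustrative.

```r
# Made-up monthly anomalies, only to exercise the code (not UAH data).
set.seed(1)
anomaly <- round(rnorm(36, mean = 0.2, sd = 0.15), 2)

# Equal weights over a 13-month window; sides = 2 centers the window
# on each month (6 months before, the month itself, 6 months after).
window <- 13
smoothed <- stats::filter(anomaly, rep(1 / window, window), sides = 2)

# The first and last 6 months have no full window, so they are NA.
head(smoothed, 8)
```

Because the window length is odd, each smoothed value lines up with a single month, which is why the two series can be plotted together with no time lag.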


    knitr::opts_chunk$set(echo = TRUE)
    options(width = 90)

The World Cup starts today! The tournament, which runs from June 14 through July 15, is probably the most popular sporting event in the world. If you are a soccer fan, you know that learning about the players and their teams and talking about it all with your friends greatly enhances the experience. In this post, I will show you how to gather and explore data for the 736 players from the 32 teams at the 2018 FIFA World Cup.
