I have written a few software packages for Stata. Please feel free to use them, and let me know if you find any bugs. Also, thanks to Andrew Heiss for suggesting that I just add “z” at the end of program names.
Separate Signal from Noise
The program -caterpillar-, co-authored with Paul von Hippel, takes a set of estimates and standard errors. The estimates may represent the effects of different programs or the results of different studies (as in a meta-analysis). The program outputs a "caterpillar plot" containing point estimates (sorted in ascending order), along with 95% pointwise confidence intervals, Bonferroni-corrected 95% confidence intervals, and an estimate of the null distribution. The null distribution represents what the distribution of estimates would look like under homogeneity, i.e., if there were no differences between the effects and the estimates differed only because of random estimation error. The program also prints summary statistics on the plot, including Cochran's Q test (with degrees of freedom and p-value), a method-of-moments estimate of the heterogeneity standard deviation (tau), and the Higgins-Thompson estimate of the reliability (rho). All calculations are described in von Hippel and Bellows (2018). Install it using -ssc install caterpillar, all-.
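To give a feel for the summary statistics, here is a minimal hand calculation in Stata on made-up data. It is a sketch, not the code inside -caterpillar-. It assumes the method-of-moments estimate of tau takes the DerSimonian-Laird form and that rho takes the usual Higgins-Thompson form, (Q - df)/Q; the exact formulas the package uses are the ones in von Hippel and Bellows (2018) and may differ. The five estimates and standard errors are invented for illustration.

```stata
* Made-up estimates and standard errors (illustration only)
clear
input est se
 0.10 0.05
 0.25 0.08
-0.05 0.06
 0.40 0.10
 0.15 0.07
end

* Precision weights and the precision-weighted mean
gen w = 1/se^2
quietly summarize est [aweight = w]
scalar mu = r(mean)

* Cochran's Q: weighted squared deviations from the weighted mean
gen q_i = w*(est - mu)^2
quietly summarize q_i
scalar Q  = r(sum)
scalar df = r(N) - 1
scalar p  = chi2tail(df, Q)

* ASSUMPTION: DerSimonian-Laird method-of-moments estimate of tau^2
quietly summarize w
scalar sw = r(sum)
gen w2 = w^2
quietly summarize w2
scalar sw2 = r(sum)
scalar tau2 = max(0, (Q - df)/(sw - sw2/sw))

* ASSUMPTION: Higgins-Thompson statistic, truncated at zero
scalar rho = max(0, (Q - df)/Q)

display "Q = " %6.2f Q "  df = " df "  p = " %6.4f p
display "tau = " %6.4f sqrt(tau2) "  rho = " %6.4f rho
```

Under homogeneity Q is roughly equal to its degrees of freedom, so both tau and rho are pulled toward zero; the further Q exceeds df, the more of the spread in the estimates is attributed to real differences rather than estimation error.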
Match Names to Nicknames
The program -nicknamez- matches two variables using names and common nicknames. It is meant to pair with -reclink- or -matchit-, two extremely useful user-written programs for matching between messy string data. Most string matching programs rely on string distance, which may not capture good matches between names and nicknames or between closely related names. For example, “Kitty” is a common nickname for “Katharine.”* I have scraped common names/nicknames and names/closely related names from https://www.behindthename.com/ based on a recommendation from my sister, who spends a great deal of time thinking about names. I then wrote a simple wrapper called -nicknamez- that matches between these common names and nicknames/closely related names. To learn more, use -net describe nicknamez, from("https://www.laura-bellows.com/s/")-. To install, type -net install nicknamez, from("https://www.laura-bellows.com/s/")-. Right now, -nicknamez- matches using a datafile of names and nicknames/closely related names that is also stored on this website. If you’d like to download the original file and do the matching yourself, or add nicknames, type -net get nicknamez, from("https://www.laura-bellows.com/s/")-. That will download the datafile of ~28,000 name/nickname matches to your current directory.
* Katharine is my partner’s mother’s name, and one of her nicknames is, indeed, Kitty. My partner also calls her mother “Franny” and “Franella.” As these are not common nicknames for Katharine as identified by Behind the Name, they are not included in this dataset. Since I am primarily matching early education teachers (the vast majority of whom are women), I do not typically deal with “Junior” or “Trip,” which are often nicknames for a II or III. Future iterations of -nicknamez- may also flag “Junior” or “Trip.”
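The lookup idea behind name/nickname matching can be sketched with a toy table. This is only an illustration of the general approach, not the internals or syntax of -nicknamez-. The variable names (name, nickname, first_a, first_b) and the rows are hypothetical and do not describe the layout of the datafile that -net get- downloads.

```stata
* Toy lookup table. Variable names and rows are hypothetical and are
* not the layout of the datafile shipped with -nicknamez-.
clear
input str12 name str12 nickname
"Katharine" "Kitty"
"Katharine" "Kate"
"Margaret"  "Peggy"
end
tempfile lookup
save `lookup'

* Toy pairs of first names to compare
clear
input str12 first_a str12 first_b
"Katharine" "Kitty"
"Peggy"     "Margaret"
"Katharine" "Peggy"
end

* Direction 1: treat first_a as the name and first_b as the nickname
gen str12 name     = first_a
gen str12 nickname = first_b
merge m:1 name nickname using `lookup', keep(master match) generate(m_ab)

* Direction 2: reverse the roles and look the pair up again
replace name     = first_b
replace nickname = first_a
merge m:1 name nickname using `lookup', keep(master match) generate(m_ba)

* Flag the pair if it appears in the lookup in either direction
gen byte nick_match = (m_ab == 3) | (m_ba == 3)
list first_a first_b nick_match
```

In this toy data the first two pairs are flagged and the third is not, even though a string-distance measure would see little resemblance between “Peggy” and “Margaret.” Because it uses a tempfile, the sketch should be run as a whole do-file rather than line by line.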
Additional Name Matches
I also wrote a very simple program, -namez-, that matches between two full names to identify cases in which a first name and last name have been switched, one observation has two names and the other has three (but two out of three names match), or one observation has a first initial and the other has a first name. I have found it useful in looking through a large number of fuzzy matched names. To learn more, use -net describe namez, from("https://www.laura-bellows.com/s/")-. To install, type -net install namez, from("https://www.laura-bellows.com/s/")-.
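The three cases above can be illustrated with a few lines of plain Stata. This is a rough sketch of the kind of check involved, not the code or syntax of -namez-. The variable names and the example names are made up, and the two-of-three check here only covers the simplest case, where the extra name sits in the middle.

```stata
* Made-up name pairs (illustration only)
clear
input str20 name_a str20 name_b
"Ann Lee" "Lee Ann"
"Ann Lee" "A Lee"
"Ann Lee" "Ann Marie Lee"
"Ann Lee" "Beth Cho"
end

* First and last words of each name, lowercased so case is ignored
gen first_a = lower(word(name_a, 1))
gen last_a  = lower(word(name_a, -1))
gen first_b = lower(word(name_b, 1))
gen last_b  = lower(word(name_b, -1))

* Case 1: first and last names switched
gen byte swapped = (first_a == last_b) & (last_a == first_b)

* Case 2: one side has two names, the other three, and the outer names agree
gen byte two_of_three = ///
    (min(wordcount(name_a), wordcount(name_b)) == 2) & ///
    (max(wordcount(name_a), wordcount(name_b)) == 3) & ///
    (first_a == first_b) & (last_a == last_b)

* Case 3: same last name, one side gives only a matching first initial
gen byte initial = (last_a == last_b) & (first_a != first_b) & ///
    (strlen(first_a) == 1 | strlen(first_b) == 1) & ///
    (substr(first_a, 1, 1) == substr(first_b, 1, 1))

list name_a name_b swapped two_of_three initial
```

Each of the first three pairs trips exactly one flag, and the last pair trips none.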