https://www.dropbox.com/s/uvvsjriwdm6u7jm/SPR-Propensity-pc-workshop-slides.pdf

This is likely the best, most comprehensive, and easiest-to-follow discussion of propensity scores that I have seen so far. It works through multiple examples.

Starting on page 12, the author goes deeper into matching, including nearest-neighbor matching. Page 20 provides the best evidence for matching (and the best description of it) that I've seen, comparing both unmatched and matched controls to the "treatment" (drug use) group.

Most matching setups assume a control group that is larger than the treated group. Gu and Rosenbaum (1993: 413) note that optimal algorithms and greedy algorithms (which go through the treatment group G1 only once, assigning each treated unit its best available match) pick roughly the same controls, but the greedy algorithm may not pair those controls with the treated units in the best way.
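To make the greedy vs. optimal distinction concrete, here is a minimal sketch on made-up one-dimensional propensity scores. The function names, the absolute-difference distance, and the toy numbers are all my own assumptions for illustration; they are not taken from the slides or from Gu and Rosenbaum. The optimal matcher is a brute-force search, so it is only usable on tiny examples.

```python
from itertools import permutations


def greedy_match(treated, controls):
    """One pass through the treated units, in the order given.
    Each treated unit takes the closest control that is still unused."""
    available = set(range(len(controls)))
    pairs = []
    for i, t in enumerate(treated):
        j = min(available, key=lambda c: (abs(t - controls[c]), c))
        available.remove(j)
        pairs.append((i, j))
    return pairs


def optimal_match(treated, controls):
    """Brute force: try every assignment of distinct controls to treated
    units and keep the one with the smallest total distance."""
    best = min(
        permutations(range(len(controls)), len(treated)),
        key=lambda perm: sum(abs(t - controls[j]) for t, j in zip(treated, perm)),
    )
    return list(enumerate(best))


def total_distance(pairs, treated, controls):
    """Sum of absolute score differences over all matched pairs."""
    return sum(abs(treated[i] - controls[j]) for i, j in pairs)


# Hypothetical propensity scores (illustrative only).
treated = [0.50, 0.60]
controls = [0.56, 0.40, 0.90]

greedy = greedy_match(treated, controls)
optimal = optimal_match(treated, controls)

print("greedy pairs: ", greedy, "total:", round(total_distance(greedy, treated, controls), 2))
print("optimal pairs:", optimal, "total:", round(total_distance(optimal, treated, controls), 2))
```

In this toy case both methods end up using the same two controls (0.56 and 0.40), but the greedy pass gives 0.56 to the first treated unit and leaves the second with 0.40, while the optimal assignment swaps them and roughly halves the total distance.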

Research Question: How would our matching algorithm perform under a greedy approach versus the current "smallest first" nearest-neighbors approach? What would an optimal approach (smallest total sum of distances) look like? What about increasing the size of the control group (assume Gk) to 2n participants and matching to the best n? And how would allowing multiple controls (matches) per treated participant affect our outcome? [These questions may need actual data. Perhaps use Stuart's data from this talk, if available, to compare.]
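As a starting point for these questions before real data is available, here is a rough sketch. It assumes that "smallest first" means repeatedly accepting the closest remaining (treated, control) pair across the whole table of distances; that reading is my assumption, and our actual algorithm may differ. The parameter `k` (controls per treated unit), the function names, and the scores are hypothetical.

```python
def smallest_first_match(treated, controls, k=1):
    """Assumed 'smallest first' rule: walk through all (treated, control)
    pairs from smallest to largest distance and accept a pair when the
    control is unused and the treated unit has fewer than k controls."""
    candidates = sorted(
        (abs(t - c), i, j)
        for i, t in enumerate(treated)
        for j, c in enumerate(controls)
    )
    matches = {i: [] for i in range(len(treated))}
    used = set()
    for _, i, j in candidates:
        if j not in used and len(matches[i]) < k:
            matches[i].append(j)
            used.add(j)
    return matches


def matched_distance(matches, treated, controls):
    """Total absolute score difference over every matched pair."""
    return sum(
        abs(treated[i] - controls[j]) for i, js in matches.items() for j in js
    )


# Hypothetical propensity scores (illustrative only).
treated = [0.50, 0.60]
small_pool = [0.56, 0.40, 0.90]        # n + 1 controls
large_pool = [0.56, 0.40, 0.90, 0.62]  # 2n controls

one_to_one_small = smallest_first_match(treated, small_pool)
one_to_one_large = smallest_first_match(treated, large_pool)
two_to_one_large = smallest_first_match(treated, large_pool, k=2)

for label, pool, m in [
    ("1:1, small pool", small_pool, one_to_one_small),
    ("1:1, large pool", large_pool, one_to_one_large),
    ("2:1, large pool", large_pool, two_to_one_large),
]:
    print(label, m, round(matched_distance(m, treated, pool), 2))
```

On these toy numbers the larger pool lowers the total 1:1 distance (about 0.08 versus 0.14), while asking for two controls per treated unit forces one poor match (the 0.90 control goes to the 0.50 treated unit). Whether either pattern holds on real data is exactly what the research question asks.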

Consider:

- Existing packages for matching [twang (McCaffrey), Matching (Sekhon), MatchIt (Ho)]
- Paper References: (Smith 1997, Rubin and Thomas 2000)
- Multilevel settings (see slide 155, p38)
- MatchIt R package for matching (http://gking.harvard.edu/matchit)
- Stuart's website: www.biostat.jhsph.edu/~estuart