Improving Reproducibility of Respondent Driven Sampling through Adaptive Design
Respondent driven sampling (RDS) is a recruitment method for hard-to-sample populations that are rare and/or elusive because of highly stigmatized or illicit behaviors. For these groups, traditional probability sampling becomes infeasible: locating eligible persons requires prohibitively high screening costs, and even when eligible persons are located, their desire to remain hidden results in false negatives. Based on the premise that people with similar traits form social networks, RDS exploits these existing networks for recruitment and has been applied in numerous studies. For example, the National HIV Behavioral Surveillance conducted by the CDC uses RDS for its people who inject drugs (PWID) component.
Unlike traditional sampling, where researchers sample and recruit participants, RDS asks participants to recruit other eligible persons from their social networks. The use of organic social networks for sampling is an innovative feature of RDS. This, however, comes with one major challenge: for RDS to work, participants need to cooperate with recruitment requests. This cooperation issue has profound implications for both the inferences and the design of RDS. First, RDS inferences rest on the assumptions that recruitment follows a memoryless Markov chain (i.e., a chain’s overall characteristics do not depend on its seed’s characteristics) and that the chain reaches equilibrium. These assumptions require the recruitment chains formed by individual seeds to be sufficiently long. Noncooperation results in short chains, leaving the assumptions unmet. Existing RDS estimators are largely blind to this reality and are, hence, limited in producing generalizable knowledge. Second, because of noncooperation, RDS fieldwork may not progress as expected. While examples of this are plentiful, they are reported anecdotally and rarely make it into the literature. Hence, the progress of RDS data collection is extremely difficult to predict at the design stage, and when progress deviates from expectations, researchers are left to make unplanned design changes (e.g., increasing the amount of incentives) on the spur of the moment in hopes of making RDS work. This approach is not replicable, the science is suspect, and the missteps are repeated.
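To illustrate why chain length matters for the Markov assumption, the following minimal sketch tracks how strongly the group composition of recruits still depends on the seed after a given number of recruitment waves. The two-group transition matrix `P` and the function names are hypothetical and chosen only for illustration; they are not taken from this study or from any RDS estimator.

```python
# Hypothetical two-group recruitment matrix (an assumption for illustration):
# row i gives the probability that a recruiter in group i recruits a person
# in group 0 or in group 1.
P = [[0.8, 0.2],
     [0.3, 0.7]]


def step(dist):
    """Advance the group distribution by one recruitment wave."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def wave_distribution(seed_group, waves):
    """Group distribution of recruits `waves` waves after a single seed."""
    dist = [1.0 if g == seed_group else 0.0 for g in range(len(P))]
    for _ in range(waves):
        dist = step(dist)
    return dist


def seed_dependence(waves):
    """Total variation distance between chains started from each seed group.

    A value near 0 means the chain has effectively forgotten its seed.
    """
    d0 = wave_distribution(0, waves)
    d1 = wave_distribution(1, waves)
    return 0.5 * sum(abs(a - b) for a, b in zip(d0, d1))


if __name__ == "__main__":
    for waves in (1, 2, 4, 8):
        print(waves, round(seed_dependence(waves), 4))
```

Under this assumed matrix, the dependence on the seed halves with every additional wave, so a chain that stops after one or two waves because of noncooperation still largely reflects who its seed was.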
This study attempts to improve the operational and statistical reproducibility of RDS by proposing adaptive-RDS (A-RDS) as a design framework and by providing practical tools that researchers can rely on to implement RDS successfully, where success is measured by recruitment cooperation.
Michael R Elliott, James Robert Wagner, Sunghee Lee