Cross-National Comparative Designer
Related skills. This skill composes with hypothesis-building (per-country predictions and estimands), survey-design (question wording, acquiescence, sensitivity), conjoint-design (origin-country stimuli, power), methods-reporting (per-country CONSORT, APSA/JARS/DA-RT), and pre-registration-writing (study-level PAP with per-country tiers).
Reference files. reference/example-4-country-immigration.md -- worked four-country immigration-conjoint cascade covering case-selection logic, shared attributes with origin-country calibration, the full TRAPD translation protocol, per-country SESOI/predictions, and a pooled-vs-per-country analysis table.
Instructions
1. Case Selection and Theoretical Justification
- Per-Country Predictions Before Case Selection (Why -> If-Then): Before finalizing the country set, articulate, for each candidate case, the specific pattern of results that would support the theoretical claim versus disconfirm it. Cross-national leverage comes from the contrast between predicted country-level patterns, not from running the same experiment in multiple places and describing what happens. If a case cannot be tied to a distinct, falsifiable prediction, it is decorative rather than diagnostic. See hypothesis-building for the Why -> If-Then funnel and estimand specification.
- Variation by Design: Cases should be selected to vary on theoretically relevant dimensions (e.g., regime type, institutional structure, immigration history, partisan polarization), not for convenience or data availability. Each case should serve a specific theoretical function identified in the step above.
- Case Selection Table: Produce a summary table listing each country, its relevant contextual features (e.g., immigration salience, institutional architecture, regime type), the per-country prediction, and the specific motivation for its inclusion. This table belongs in the PAP and should be referenced in the narrative.
- Avoid the "Most Similar" Trap: Cross-national designs are often most informative when cases are deliberately diverse. Including cases that vary on regime type (e.g., democracy vs. hybrid regime) or institutional structure (e.g., parliamentary vs. presidential) enables stronger tests of whether a theoretical mechanism generalizes beyond a single political context.
- Typical Case Count (3--5): For survey experiments with ~2,500 respondents per country, 3--5 cases is a common range in published work that balances theoretical leverage with practical feasibility (budget, translation, vendor logistics). Fewer than 3 limits comparison; more than 5 strains resources without proportional analytical gain. Treat this as a planning heuristic rather than a rule and tie the final count to the per-country predictions above.
- Information Environment Variation: The information environment differs across countries -- media systems, political knowledge levels, and the accessibility of policy information vary. A treatment that is "informative" in one context may be redundant or incomprehensible in another. Document the expected information environment per country and consider how it affects treatment interpretation (Druckman 2022).
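The checks in this section can be sketched in code. The following is a minimal illustration, not a prescribed implementation: the `Case` fields, the function names, and the placeholder countries ("Country A" through "Country C") are all assumptions made for the example. It flags cases that lack a supporting or disconfirming pattern (decorative cases) and contextual dimensions on which the case set does not vary.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """One row of a case selection table (field names are hypothetical)."""
    country: str
    features: dict          # contextual features, e.g. {"regime_type": "democracy"}
    supports_if: str        # result pattern that would support the claim
    disconfirms_if: str     # result pattern that would disconfirm the claim
    motivation: str         # why the case is included


def decorative_cases(cases):
    """Return countries that lack a supporting or a disconfirming pattern."""
    return [
        c.country
        for c in cases
        if not c.supports_if.strip() or not c.disconfirms_if.strip()
    ]


def constant_dimensions(cases):
    """Return feature names on which the case set shows no variation."""
    dims = set().union(*(c.features.keys() for c in cases))
    return sorted(
        d for d in dims if len({c.features.get(d) for c in cases}) < 2
    )


# Placeholder cases; the countries and predictions are illustrative only.
cases = [
    Case("Country A",
         {"regime_type": "democracy", "institutions": "presidential"},
         "effect is positive",
         "effect is null or negative",
         "baseline democracy"),
    Case("Country B",
         {"regime_type": "hybrid regime", "institutions": "presidential"},
         "effect is attenuated relative to Country A",
         "effect matches Country A",
         "tests generalization beyond democracies"),
    Case("Country C",
         {"regime_type": "democracy", "institutions": "presidential"},
         "",
         "",
         "vendor panel available"),
]

print(decorative_cases(cases))     # cases with no falsifiable prediction
print(constant_dimensions(cases))  # dimensions with no designed variation
```

Here Country C would be flagged because its only justification is convenience, and the `institutions` dimension would be flagged because every case shares the same value, so the design cannot speak to institutional structure.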
2. Instrument Localization
- Conceptual Equivalence over Literal Translation: The goal is not identical wording across countries but equivalent conceptual meaning. A "skilled worker visa" corresponds to the H-1B in the US, the E-7 in South Korea, and the S Pass in Singapore. Use country-specific referents that carry the same conceptual weight. Word choice constitutes a framing treatment and must be calibrated per country, not just translated -- terms like "unauthorized," "undocumented," and "illegal" carry different connotations across linguistic and political contexts (Stantcheva 2023, citing Merolla et al. 2013). The Cross-Cultural Survey Guidelines (CCSG 2016) are the canonical source for cross-cultural comparability, including translation and adaptation procedures.
- Qualitative Treatment Framing: For information experiments fielded across countries, consider using qualitative descriptions rather than exact statistics. Qualitative information "can help make the treatment more homogeneous when doing an experiment in several countries or settings" because exact numbers carry different meanings in different contexts (Stantcheva 2023).
- Anti-Acquiescence Measurement: Acquiescence bias varies by culture. Use item-specific scales (not agree-disagree or true-false formats), balanced scales with equal numbers of positive and negative options, and randomized question direction to ensure measurement equivalence across countries (Stantcheva 2023).
- Coarse Measurement as Principled Commitment: When implementing the same conceptual measure across diverse contexts, accept that some precision is lost in translation. Treat this as a principled commitment to comparability rather than a limitation -- coarse measurement that is consistent across contexts is preferable to precise measurement that means different things in different places (Sniderman 2018).
- Institutional Referents: Localize all political actors (president, cabinet, ministry), legislative bodies (Congress, Bundestag, National Assembly, Parliament), international organizations (UN, EU, OECD, ASEAN), and policy instruments (visa types, permit categories). Document all localizations in a country-specific table.
- Cultural Calibration of Stimuli: For attributes like "country of origin" that signal cultural proximity or distance, the specific countries must be calibrated per respondent country. "Culturally distant" means different things in the US (e.g., Somalia) than in South Korea (e.g., Yemen) or Singapore (e.g., Afghanistan). Document the calibration rationale.
- Ecological Validity Check: Every experimental stimulus should be plausible in the respondent's national context. A vignette about EU directives makes sense for German respondents but not Singaporean ones. Test each localized version against the question: "Would a newspaper in this country plausibly report this?" Note that being "outside the lab" does not automatically confer ecological validity -- it requires deliberate calibration of stimuli to real-world conditions in each country (Mutz 2011). Where feasible, benchmark stimuli against real-world behavioral outcomes in at least one case; Hainmueller, Hangartner, & Yamamoto (2015) validate immigration-conjoint preferences against actual Swiss naturalization referendum votes, demonstrating that vignette and conjoint designs can recover behaviorally meaningful effects when localized carefully.
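The country-specific localization table described in this section could be kept as a simple lookup and checked for gaps before fielding. The sketch below is one possible layout, assuming a concept-by-country dictionary; the names `LOCALIZATION`, `missing_localizations`, and `localize` are hypothetical. The visa and origin-country entries come from the examples above; the legislature entries are added for illustration, with Singapore left blank on purpose to show the gap check.

```python
COUNTRIES = ["US", "South Korea", "Singapore"]

# Concept -> {respondent country -> localized referent}.
LOCALIZATION = {
    "skilled_worker_visa": {
        "US": "H-1B",
        "South Korea": "E-7",
        "Singapore": "S Pass",
    },
    "culturally_distant_origin": {
        "US": "Somalia",
        "South Korea": "Yemen",
        "Singapore": "Afghanistan",
    },
    # Singapore is intentionally missing here to demonstrate the check.
    "national_legislature": {
        "US": "Congress",
        "South Korea": "National Assembly",
    },
}


def missing_localizations(table, countries):
    """List (concept, country) pairs that still lack a localized referent."""
    return [
        (concept, country)
        for concept, by_country in table.items()
        for country in countries
        if not by_country.get(country)
    ]


def localize(template, country, table):
    """Fill {concept} placeholders with the referents for one country.

    Raises KeyError if the template uses a concept that has not been
    localized for that country, so gaps cannot pass silently.
    """
    referents = {
        concept: by_country[country]
        for concept, by_country in table.items()
        if country in by_country
    }
    return template.format(**referents)


print(missing_localizations(LOCALIZATION, COUNTRIES))
print(localize("Visa category: {skilled_worker_visa}", "South Korea", LOCALIZATION))
```

Keeping the referents in one table makes the per-country documentation a by-product of the instrument itself, and the gap check can run before each translation round.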
3. Composition and Origin Country Selection
- Origin Countries as Group Cues: When immigrant origin countries are used as experimental stimuli (common in immigration conjoint studies; see Hainmueller, Hopkins, & Yamamoto 2014 for the foundational AMCE framework and its original application to immigration attitudes), select origins that: (a) are recognizable to respondents, (b) vary on the intended dimension (e.g., cultural proximity), and (c) do not carry overwhelming confounding associations (e.g., ongoing war, recent political crisis) unless those associations are part of the theoretical design. See conjoint-design for the attribute-architecture rules that govern origin-country levels.
- Nesting Origins within Domains: If the experiment crosses policy domain (e.g., labor vs. asylum) with immigrant composition, it may be necessary to use different origin countries for different domains within the same respondent country. This preserves ecological validity (asylum seekers plausibly come from different countries than labor migrants) but means the composition attribute cannot be cleanly separated from domain. Document this nesting and acknowledge it in the analysis.
- Avoiding Origin Overlap: Within a single respondent country, avoid reusing the same origin country acros