ADVERSARIAL REVIEW · EXCHANGE PEER REVIEW PROTOCOL

Claude Adversarial Review

Paper: kingmaker-gedanken-2026.html · Hash: 0553a4cb30b9 · Date: 2026-05-26 12:10

Adversarial Peer Review: "The Kingmaker Paradox"

EXECUTIVE SUMMARY

This paper makes an intellectually interesting argument that deserves engagement, but it contains a core mathematical flaw that undermines the central quantitative claim, several critical unstated assumptions that silently do the heavy lifting, selective engagement with the empirical literature, and reasoning leaps that would not survive peer review in political science. The paper would be strengthened, not weakened, by confronting these problems directly.

FINDING 1: The Core Equation Is Mathematically Misleading

Specific Claim: "Net Swing = BVAP × (2C − 1)" and "At 85% cohesion, the term (2C − 1) = 0.70 — meaning 70% of the bloc's population translates directly into net swing."

The Problem: This formula computes the net partisan margin contributed by the Black bloc, expressed as a share of the total voting-age population. It is not a "swing" in the conventional electoral sense unless you specify what it is being compared against.

The formula conflates two different things:

The contribution of the Black bloc to the margin
The change in outcome from one scenario to another (a true swing)

More critically, the formula omits the rest of the electorate. A net outcome as a share of total votes cast requires the non-Black vote as well and, strictly, turnout weighting for both groups. If the district is 20% BVAP and 80% non-Black, and non-Black voters split 55-45 Republican, the non-Black contribution is 80% × (0.55 − 0.45) = 8 points against the Black bloc's preferred candidate. The Black contribution at 85% cohesion is 20% × 0.70 = 14 points. Net outcome: approximately +6 for the Black-preferred candidate, not "+14 decisive."

The paper presents the Black bloc's gross contribution as if it were a net electoral outcome, ignoring the opposing vector from non-Black voters.

For the model to show "decisiveness," the non-Black vote must be close to evenly split (roughly 50-50). The paper concedes as much when it says "In a district where the non-Black population splits 70-30, a 10-point Black swing changes a 70-30 result to 60-40." But this concession, buried in Section 7, undercuts the headline claim that the model works in 100% of competitive races: the bloc is decisive only where the non-Black split is close, and the paper never verifies that cracked districts will have close non-Black splits.

Counter-evidence: In Alabama CD-7 and neighboring districts, non-Black voters have split 65-35 or more for Republicans in recent cycles (Alabama Secretary of State returns, 2022-2024). At those splits, a 26% BVAP at 85% cohesion produces: 26% × 0.70 = 18.2 points Black contribution, minus 74% × (0.65−0.35) = 22.2 points Republican non-Black advantage. Net result: −4 points. The Black bloc loses decisively.
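
The arithmetic in this finding can be checked with a minimal sketch. The function name `net_margin` and its parameters are illustrative and are not taken from the paper's scripts. The sketch assumes a two-candidate race and equal turnout across groups, which the paper's own formula also assumes implicitly.

```python
def net_margin(bvap, cohesion, nonblack_rep_share):
    """Net margin for the Black-preferred candidate, as a fraction of the electorate.

    Assumption (not from the paper): two-candidate race, equal turnout
    across groups, and the Black-preferred candidate is the non-Republican.
    """
    black_contribution = bvap * (2 * cohesion - 1)
    nonblack_contribution = (1 - bvap) * (2 * nonblack_rep_share - 1)
    return black_contribution - nonblack_contribution


# Hypothetical district from this finding: 20% BVAP, non-Black split 55-45 Republican
example = net_margin(bvap=0.20, cohesion=0.85, nonblack_rep_share=0.55)

# Counter-evidence case: 26% BVAP, non-Black split 65-35 Republican
alabama = net_margin(bvap=0.26, cohesion=0.85, nonblack_rep_share=0.65)

print(f"20% BVAP, 55-45 split: {example * 100:+.1f} points")
print(f"26% BVAP, 65-35 split: {alabama * 100:+.1f} points")
```

Under these assumptions the first case prints +6.0 points and the second prints -4.0 points, matching the figures above. The bloc's gross contribution (14 and 18.2 points) never appears as the outcome.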

Severity: Critical

What the paper must do: Incorporate non-Black partisan baseline vote shares into every district-level calculation. The table in Section 4 must show (a) expected non-Black partisan split, (b) resulting net outcome, and (c) the threshold non-Black split at which the Black bloc becomes pivotal. The current presentation is arithmetically incomplete.
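
Item (c) can be sketched by solving the balance condition BVAP × (2C − 1) = (1 − BVAP) × (2r − 1) for r, the Republican share of the non-Black vote. The helper name `breakeven_nonblack_rep_share` is hypothetical, and the sketch again assumes a two-candidate race and equal turnout across groups.

```python
def breakeven_nonblack_rep_share(bvap, cohesion):
    """Non-Black Republican share at which the net margin is exactly zero.

    Solves bvap * (2C - 1) = (1 - bvap) * (2r - 1) for r.
    Assumption (not from the paper): equal turnout across groups.
    """
    return 0.5 + bvap * (2 * cohesion - 1) / (2 * (1 - bvap))


thresholds = {}
for bvap in (0.20, 0.26, 0.30):
    thresholds[bvap] = breakeven_nonblack_rep_share(bvap, cohesion=0.85)
    print(f"BVAP {bvap:.0%}: break-even non-Black Republican share = {thresholds[bvap]:.3f}")
```

At 26% BVAP and 85% cohesion the break-even share is roughly 0.62, so a district where non-Black voters split 65-35 Republican lies beyond the point at which the bloc can be pivotal. A column of this kind in the Section 4 table would make the scope of the decisiveness claim explicit.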

FINDING 2: The "100% of Competitive Races" Claim Is Circular

Specific Claim: "Every scenario is decisive in every competitive race... The bloc does not merely influence elections. It determines them."

The Problem: The paper defines "competitive" as "decided by fewer than 7 points in 2024" and then shows the bloc produces swings exceeding 7 points. This is circular. The 33 "competitive" 2024 races were competitive with the actual electorate composition that existed in 2024, which did not include a coordinated 20-30% Black bloc in those districts. You cannot retroactively inject a new demographic bloc into 2024 results without modeling how that injection changes the district's composition, which changes who else is in the district, which changes the partisan baseline.

Furthermore, if cracked Black voters are redistributed into districts that were previously safe Republican districts (a common cracking strategy), those districts are by definition not in the "33 competitive races" pool. The paper cherry-picks the 33 most competitive 2024 races and asks whether a 20% swing-bloc would be decisive in them—but the cracked districts may not map onto those 33.

Counter-evidence: The actual proposed maps in Alabama (following Milligan remand and subsequent Callais reversal) concentrate cracked Black voters in districts with Republican baselines of R+15 or higher (Alabama Legislative Services Agency, 2025 proposed maps). These are not among the 33 competitive races.

Severity: Critical

What the paper must do: Match each of the 7 cracked districts to the actual new districts they would create and look up the partisan baseline of those specific new districts. The "100% of competitive races" finding is analytically severed from the actual geography of cracking.

FINDING 3: The "Bidirectional Threat" Assumption Does the Entire Work and Is Empirically Implausible

Specific Claim: "The 2024 shift from 92% to 80% Democratic cohesion among Black voters is not erosion — it is the beginning of the credible threat the model requires."

Multiple problems:

First, the 2024 Black vote share for Trump (approximately 12-18% depending on source: NBC Exit Polls 2024, AP VoteCast 2024) was historically unusual and driven significantly by young Black men; a single cycle does not establish a durable ideological shift. Hajnal and Lee (2011, Why Americans Don't Join the Party) and more recently Fraga (2018, The Turnout Gap) document that Black partisan cohesion is structurally reinforced by policy stakes that are racially asymmetric. Treating a single-cycle fluctuation as the foundation for a permanent bidirectional threat mechanism is analytically thin.

Second, the kingmaker model requires not merely that voters could defect, but that both parties believe they will if demands are not met. This credibility problem is severe. The Republican Party's recent legislative history on voting rights, affirmative action (Students for Fair Admissions, 2023), and criminal justice creates a strong prior that concessions would not be forthcoming or durable. Rational Black voters discounting Republican commitments would not establish credible threat; they would establish rational asymmetric loyalty—which is exactly what currently exists.

Third, the historical precedents cited (FDP, DUP, Ra'am) all operated in systems with proportional representation or confidence-vote parliamentary accountability—mechanisms that make kingmaker threats automatically credible because coalition formation is literally the government-formation mechanism. In the U.S. House, there is no equivalent forced negotiation. A candidate wins a district plurality and then joins a caucus. The community's leverage over that candidate after the election is limited to the next primary—which returns us to party loyalty dynamics.

Counter-evidence: Frymer, P. (1999). Uneasy Alliances: Race and Party Competition in America. Princeton University Press. Frymer argues that the "captured constituency" problem is structural in two-party systems and that bidirectional threats by Black voters have historically been neutralized by party mechanisms. This is the single most important counter-argument in the political science literature and the paper does not cite it.

Severity: Critical

What the paper must do: Engage with Frymer directly. Model the credibility conditions under which both parties would genuinely negotiate. Address the structural difference between parliamentary coalition formation and U.S. plurality-district elections.

FINDING 4: The Parliamentary Analogies Are Systemically Inapt

Specific Claim: "The kingmaker model is not hypothetical. It is the dominant strategy for minority political blocs in parliamentary democracies worldwide."

The Problem: The FDP, DUP, and Ra'am examples all operate under parliamentary systems where:

Government formation requires a majority coalition explicitly negotiated between parties
Confidence votes mean the government falls if the kingmaker withdraws
Party discipline is enforced through whipping systems
Proportional or near-proportional representation means a 3% vote share equals roughly 3% of seats

None of these conditions exist in U.S. single-member plurality districts. In the U.S.:

A district-level Black bloc can elect a candidate, but cannot withdraw confidence in the government after election
The Speaker is elected once per Congress; the Black bloc's leverage on House organization is limited to the organizational vote (January), not ongoing
Individual House members face no removal mechanism between elections short of resignation or expulsion
The Republican and Democratic parties are not coalition partners who negotiated their alliance—they are mass parties with primaries

The Banzhaf Power Index, cited approvingly, was developed for weighted voting bodies (legislatures, corporate boards) and its application to mass electorates is contested. Straffin (1977, Management Science) and Felsenthal & Machover (1998, The Measurement of Voting Power) both note that Banzhaf calculations assume equally likely coalition configurations—an assumption that fails catastrophically when voters have stable partisan identities.

Counter-evidence: Hajnal, Z. & Lee, T. (2011). Why Americans Don't Join the Party. Princeton University Press. Documents why minority bloc-bargaining strategies have repeatedly failed to translate parliamentary-style leverage into the two-party American context.

Severity: Major

What the paper must do: Either (a) identify U.S.-specific examples of district-level bloc bargaining that produced durable policy concessions, or (b) explicitly theorize the institutional translation mechanism from parliamentary to U.S. congressional context. The analogies as stated do not support the conclusion.

FINDING 5: The Wasted Vote Analysis Contains a Definitional Error

Specific Claim: "A 'wasted' vote is any vote cast above the 51% threshold needed to win. In a seat won 80-20, 29 of the winner's 80 votes accomplished nothing."

The Problem: This is a non-standard and misleading definition of wasted votes. The efficiency gap literature (Stephanopoulos & McGhee 2015, cited in this paper) defines wasted votes as votes cast for a losing candidate plus votes cast above the 50%+1 threshold for a winning candidate. Both parties waste votes; the efficiency gap measures the asymmetry in wasting.

Under the paper's definition, a party that wins 51-49 has 0% waste. But if that party also loses 49-51 in the adjacent district, it has wasted 49% of votes there. The correct analysis of whether redistribution improves efficiency must account for both districts, not just the safe seat.
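
The two-district example can be worked through with a short sketch of the standard definition. The function names and the 100-vote districts are illustrative. The bare-majority threshold (51 of 100) is chosen here to match the paper's own "29 of 80" figure, and the sign convention (positive means party A wastes more) is an assumption of this sketch.

```python
def wasted_votes(votes_a, votes_b):
    """Return (wasted_a, wasted_b) for one two-party district.

    Standard definition: every vote for the loser is wasted, and the winner
    wastes every vote beyond the bare majority needed to win.
    """
    total = votes_a + votes_b
    needed = total // 2 + 1  # bare majority, e.g. 51 of 100
    if votes_a > votes_b:
        return votes_a - needed, votes_b
    if votes_b > votes_a:
        return votes_a, votes_b - needed
    raise ValueError("tied district has no winner")


def efficiency_gap(districts):
    """(Party A wasted - Party B wasted) / total votes, across all districts."""
    wasted_a = wasted_b = total = 0
    for votes_a, votes_b in districts:
        wa, wb = wasted_votes(votes_a, votes_b)
        wasted_a += wa
        wasted_b += wb
        total += votes_a + votes_b
    return (wasted_a - wasted_b) / total


# Party A wins a safe seat 80-20 and loses the adjacent seat 49-51
safe_seat = wasted_votes(80, 20)
gap = efficiency_gap([(80, 20), (49, 51)])
print(safe_seat, gap)
```

Here party A wastes 29 votes in the safe seat and all 49 in the seat it narrowly loses, while party B wastes 20 in total, for a gap of 0.29. Looking only at the safe seat, as the paper does, hides the larger share of the waste.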

More importantly, the "wasted vote" framing assumes that the votes cast in a safe seat could have been cast elsewhere. But voters do not choose which district to live in based on electoral strategy. Cracking does not move voters to competitive districts—it moves district lines so that the same voters are in different electoral contexts. This is a crucial distinction the paper blurs.

Severity: Significant

What the paper must do: Use the standard efficiency gap definition. Reframe the analysis to distinguish between (a) votes that are strategically suboptimal and (b) votes that could have been reallocated. These are different claims.

FINDING 6: Turnout Modeling Is Internally Inconsistent

Specific Claim: "Black voter turnout drops 15-20 points from presidential to midterm elections (63% to 45%)... At 45% turnout, a 25% BVAP district effectively has 11.25% Black voters participating."

Problem 1: The paper simultaneously argues (a) the kingmaker model is "most powerful in midterms" and (b) turnout collapse is the "most serious operational risk." These cannot both be true in a way that supports the thesis. If midterm turnout produces only 11.25% effective Black participation, the net swing at 85% cohesion is 11.25% × 0.70 = 7.875 points. The paper says this is "still decisive in races under 8 points"—but then the model works only in the very closest races, not in "100% of competitive races."

Problem 2: The Brennan Center source cited—"Alabama's Racial Turnout Gap Hit 16-Year High in 2024"—actually documents that the racial turnout gap is growing, meaning Black turnout is falling relative to white turnout. This is counter-evidence to the thesis, cited as supporting evidence.

Problem 3: The "sub-representative model" is asserted to solve midterm mobilization, but no evidence is provided that sub-representatives have historically sustained midterm turnout. The Baltimore Political Advisory Council and Savannah Political Guidance Committee examples are from the 1960s, a period of peak movement mobilization that is not generalizable to contemporary conditions.

Counter-evidence: Fraga, B. (2018). The Turnout Gap: Race, Ethnicity, and Political Inequality in a Diversifying America. Cambridge University Press. Documents that turnout gaps are structural and resistant to organizational interventions in the absence of contested candidates and mobilization infrastructure. The sub-representative model would need to be significantly theorized against this literature.

Severity: Major

What the paper must do: Run the full model under midterm turnout assumptions for each district. Show which districts remain "decisive" under realistic midterm participation rates. Do not assert that the sub-representative model solves the mobilization problem without empirical support for that mechanism.
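
A first pass at that check might look like the sketch below. The helper `bloc_contribution_points` is hypothetical. It reproduces the paper's own un-normalized arithmetic (points of voting-age population, not of votes cast) and ignores the non-Black vote, so it is an upper bound on what the bloc can deliver. The list of race margins is illustrative.

```python
def bloc_contribution_points(bvap, black_turnout, cohesion):
    """Black bloc's net contribution, in points of voting-age population.

    Mirrors the paper's arithmetic: BVAP x turnout x (2C - 1).
    Assumption: non-Black turnout and vote split are not modeled here.
    """
    return bvap * black_turnout * (2 * cohesion - 1) * 100


midterm = bloc_contribution_points(bvap=0.25, black_turnout=0.45, cohesion=0.85)
presidential = bloc_contribution_points(bvap=0.25, black_turnout=0.63, cohesion=0.85)
print(f"midterm: {midterm:.3f} points, presidential: {presidential:.3f} points")

# Illustrative margins only; real margins would come from district returns
for race_margin in (3, 5, 7, 8, 10):
    covered = midterm > race_margin
    print(f"race decided by {race_margin} points: bloc contribution exceeds margin = {covered}")
```

With the paper's inputs the midterm figure is 7.875 points against 11.025 in a presidential year, so under midterm turnout the bloc's contribution exceeds only margins below about 8 points, before any opposing non-Black vote is subtracted.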

FINDING 7: The "Power Multiplier of 2.1x" Is Undefined and Unverifiable

Specific Claim: "The power multiplier is 2.1x."

The Problem: This number appears without derivation. The paper says 7 seats become 15 seats, which is a ratio of 2.14—so the "power multiplier" appears to simply be 15/7. But this is not a power multiplier; it is a seat count multiplier. Power and seat count are not the same thing.

If the 7 current seats have zero leverage over House control (as the paper argues), their power is approximately 0. If 15 kingmaker seats are pivotal, their power is positive. Dividing positive by approximately-zero gives an undefined or infinite multiplier, not 2.1x. The paper cannot have it both ways: if current power is near zero (the "representation without leverage" argument), then any positive power in the new model is an infinite improvement, not 2.1x. If current power is positive (and comparable), then the 7 seats have more leverage than the paper admits.

Severity: Significant

What the paper must do: Define the power metric explicitly. Use Banzhaf indices or a clearly specified utility function. Show the calculation. "2.1x" as presented is not a valid quantitative result.
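
For reference, a normalized Banzhaf index for a small weighted voting game can be computed by brute-force enumeration, as in the sketch below. The weights and quota are purely illustrative and are not derived from the paper or from any real legislature.

```python
from itertools import combinations


def banzhaf(weights, quota):
    """Normalized Banzhaf index for each player in a weighted voting game."""
    n = len(weights)
    swings = [0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(len(others) + 1):
            for coalition in combinations(others, size):
                total = sum(weights[j] for j in coalition)
                # Player i is a swing if the coalition loses without i and wins with i
                if total < quota <= total + weights[i]:
                    swings[i] += 1
    all_swings = sum(swings)
    return [s / all_swings for s in swings]


# Illustrative game: two large blocs and one small bloc, 51 votes needed to win
index = banzhaf(weights=[50, 49, 1], quota=51)
print(index)
```

This prints [0.6, 0.2, 0.2]: the bloc holding 1 vote has the same index as the bloc holding 49. A result of that kind, derived from stated weights and a stated quota, is what a "power multiplier" would need to rest on. It also inherits the equally-likely-coalitions assumption criticized in Finding 4.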

FINDING 8: Cherry-Picked Cohesion Data

Specific Claim: "Empirical cohesion in Deep South: 85-92% (Gingles litigation, ecological inference)."

Problem 1: The 85-92% figure comes from Thornburg v. Gingles litigation and subsequent VRA cases—contexts in which plaintiffs were actively trying to demonstrate high cohesion to establish vote dilution. This creates a systematic selection bias: litigation cohesion estimates are drawn from districts and elections where polarization was high enough to motivate litigation.

Problem 2: The paper uses 2024 cohesion estimates that show a shift to ~80% (the "credible bidirectional threat" argument in Section 6). But if 2024 cohesion was 80%, the central estimate of 85% is potentially outdated, and the conservative floor of 75% may be the new central estimate. The paper cannot simultaneously argue that 2024 cohesion decline signals strategic sophistication and use pre-2024 cohesion rates as the baseline for the kingmaker calculation.

Problem 3: Cohesion in VRA litigation measures historical voting patterns in existing majority-minority districts. The kingmaker model requires sustained cohesion in new electoral contexts—cracked districts where Black voters are a minority interacting with new candidate slates, new party messaging, and different mobilization incentives. There is no empirical evidence that 85% cohesion would persist in these new contexts.

Counter-evidence: Kousser, J.M. (2008). "The Strange, Ironic Career of Section 5 of the Voting Rights Act." Texas Law Review, 86(4). Documents variability in Black cohesion across district types and electoral contexts.

Severity: Major

What the paper must do: Distinguish between cohesion in majority-minority districts and expected cohesion in new minority-plurality contexts. Model cohesion degradation as a sensitivity variable, not just an acknowledged risk.
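
A sensitivity sweep of this kind is short to write. In the sketch below the 26% BVAP matches the Alabama example in Finding 1, while the 60-40 Republican non-Black split is a hypothetical value chosen for illustration. Equal turnout across groups and a two-candidate race are assumed.

```python
def net_margin_points(bvap, cohesion, nonblack_rep_share):
    """Net margin for the Black-preferred candidate, in points of the electorate.

    Assumption (not from the paper): equal turnout, two-candidate race.
    """
    black = bvap * (2 * cohesion - 1)
    nonblack = (1 - bvap) * (2 * nonblack_rep_share - 1)
    return (black - nonblack) * 100


# Cohesion values the paper itself mentions: litigation ceiling, central
# estimate, the 2024 figure, and the stated conservative floor
results = {}
for cohesion in (0.92, 0.85, 0.80, 0.75):
    results[cohesion] = net_margin_points(0.26, cohesion, nonblack_rep_share=0.60)
    print(f"cohesion {cohesion:.0%}: net margin {results[cohesion]:+.2f} points")
```

In this illustrative district the margin falls from about +7 points at 92% cohesion to under +1 point at 80%, and it turns negative at 75%. The outcome is therefore sensitive to exactly the range of cohesion values the paper treats as interchangeable.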

FINDING 9: Ignored Counter-Literature on Minority Concentration and Descriptive vs. Substantive Representation

Specific Claim: "Representation turned out to be largely symbolic."

This is an empirical claim about which there is a substantial contested literature the paper does not engage:

Whitby, K.J. (1997). The Color of Representation. University of Michigan Press. Finds that Black representatives in majority-minority districts secured measurably more federal expenditures for their districts than white representatives in similar districts.
Canon, D.T. (1999). Race, Redistricting, and Representation. University of Chicago Press. Documents that Black representatives deliver substantively different legislative behavior on issues of concern to Black constituents.
Griffin, J.D. & Newman, B. (2008). Minority Report. University of Chicago Press. Shows that senators and representatives are more responsive to racial minority constituents when those constituents are concentrated.
Broockman, D.E. (2013). "Black Politicians Are More Intrinsically Motivated to Advance Blacks' Interests." American Journal of Political Science, 57(3). Finds that the representational benefit of descriptive representation is larger than the paper assumes.

The claim that majority-minority representation is "largely symbolic" is a strong empirical assertion, and the paper cites no primary political science literature to support it. Its failure to engage the literature on substantive representation is a major gap.

Severity: Major

What the paper must do: Engage with the descriptive-vs-substantive representation literature. If the claim is that material outcomes for Black communities have not improved sufficiently, that must be demonstrated with outcome data, not asserted.

FINDING 10: The ArXiv Citation Is Used Misleadingly

Specific Claim: "ArXiv:2604.01340 (2024). Proves minority welfare is nonmonotonic in voter concentration."

Problems:

1. An ArXiv preprint is not peer-reviewed. Characterizing it as "proves" is inappropriate.

2. The paper provides no details of the model, assumptions, or scope conditions of this finding.

3. "Nonmonotonic" means welfare first rises then falls (or vice versa) with concentration—it does not straightforwardly "prove" that lower concentration is better. The optimal concentration point in such a model depends entirely on model parameters.

4. The citation appears in the reference list but there is no corresponding in-text discussion, making it impossible to evaluate how the finding was actually used.

Severity: Significant

What the paper must do: Describe the model and its assumptions. Clarify what "nonmonotonic" means in context and at what concentration level welfare is maximized. Do not characterize a preprint as proving anything.

FINDING 11: The Callais Case Is Described as Settled Only Weeks After It Was Decided

Specific Claim: "Louisiana v. Callais (April 2026) rendered Section 2 vote-dilution claims inoperable."

Problem: The paper is dated May 2026. A Supreme Court decision from April 2026 would be approximately 30 days old at time of writing. The implementation of such a ruling—including lower court responses, state legislative action, and DOJ enforcement posture—would be entirely unclear. Characterizing it as having "rendered Section 2 inoperable" is a legal conclusion that would require substantial doctrinal analysis that is not provided. The paper treats a brand-new decision as if its implications are fully determined.

Additionally, the paper's legal analysis of Alexander v. SC NAACP (2024) contains an oversimplification. The Court did not hold that racial gerrymander claims are "nearly impossible"—it held that when race and party are correlated, plaintiffs must provide specific evidence that race was the predominant factor. This is a higher evidentiary burden, not an absolute bar.

Severity: Significant

What the paper must do: Qualify legal characterizations with appropriate uncertainty given the recency of Callais. Provide proper legal analysis or cite legal scholarship on the doctrinal implications.

FINDING 12: Selection Bias in International Comparisons

Specific Claim: The FDP, DUP, and Ra'am are offered as proof that the kingmaker model works.

Cherry-picking problem: The paper selects three successful cases of minority bloc kingmaking without noting:

The German FDP ceased to be a kingmaker after falling below the 5% electoral threshold in 2013, spent four years outside the Bundestag, and again failed to clear the threshold in 2025. The kingmaker model collapsed when the bloc's credibility was tested.
The UK DUP's 2017 arrangement was a confidence and supply agreement, not a coalition—and it collapsed in 2019, with Northern Ireland subsequently losing significant influence as a result of the DUP's overreach.
Ra'am's inclusion in the Bennett-Lapid coalition was exceptional and is not representative of the historical influence of Arab-Israeli parties, which were excluded from governing coalitions for decades before 2021.

The selection of only successful cases without noting failures or reversals is textbook cherry-picking of comparative evidence.

Counter-evidence: McGarry, J. & O'Leary, B. (2006). "Consociational Theory, Northern Ireland's Conflict, and its Agreement." Government and Opposition, 41(1). Documents conditions under which minority bloc leverage is sustained versus lost.

Severity: Major

What the paper must do: Present a balanced survey of cases where kingmaker strategies succeeded and failed. Identify the scope conditions that distinguish success from failure and evaluate whether those conditions hold in the U.S. context.

FINDING 13: The "Gedanken Experiment" Framing Does Not Excuse Empirical Rigor

Specific Claim: The paper repeatedly describes itself as a "gedanken experiment" and notes it was completed in 90 minutes using AI research assistants.

Problem: The paper makes quantitative empirical claims ("The results are unambiguous," "100% of competitive races," "2.1x power multiplier") and policy recommendations for a real community facing real redistricting. The gedanken framing is used to simultaneously claim analytical rigor (quantitative tables, formulas, citations) and preemptive immunity from standards of empirical rigor ("this does not make the analysis correct"). A paper that makes definitive quantitative claims about real political outcomes cannot disclaim empirical standards.

The disclosure that the analysis was completed in 90 minutes using AI agents, while admirably transparent, raises serious concerns about:

Whether cited sources were actually read or were summarized by AI with potential hallucination
Whether the demographic data came from actual redistricting proposals or AI-generated approximations
Whether the legal characterizations were reviewed by anyone with legal expertise

The citation "Computation source: scripts/output/kingmaker_calc.py" is not a verifiable source. The citation "Research completed with Claude (Anthropic)" does not meet any academic citation standard.

Severity: Major

What the paper must do: Either (a) remove the definitive quantitative claims and present the paper explicitly as a speculative framework paper, or (b) verify all data sources independently and remove the 90-minute origin story from the methods section. The current framing is inconsistent—it cannot be both a rigorous quantitative paper and a 90-minute AI-assisted gedanken experiment.

FINDING 14: The Paper's Most Important Empirical Assumption Is Never Stated

Unstated Assumption: The entire kingmaker model assumes that Republican candidates in cracked districts would negotiate with and potentially earn Black voters' support in exchange for policy concessions.

This requires assuming that:

1. Republican candidates are willing to make binding pre-election commitments to Black community organizations

2. Republican party leadership would permit such commitments (given the party's recent positions on voting rights, affirmative action, diversity programs)

3. Republican-primary voters in these districts—which are by design R+15 or higher—would not punish candidates for making such commitments

4. Black community organizations have the institutional capacity to hold Republican politicians accountable post-election

None of these assumptions is discussed. The paper treats "credible bidirectional threat" as if it requires only the willingness of Black voters to vote Republican, ignoring that the threat is bidirectional: it also requires Republicans to credibly believe they can win Black votes by making concessions—which requires their primary electorate and party structure to permit it.

Counter-evidence: Tesler, M. (2016). Post-Racial or Most-Racial? Race and Politics in the Obama Era. University of Chicago Press. Documents the racialization of policy preferences that makes cross-racial coalition bargaining structurally difficult.

Severity: Critical

What the paper must do: Model the Republican Party's incentive structure explicitly. Identify specific policy domains where Republican candidates in these districts would have both incentive and capacity to negotiate with Black community organizations. Without this, the "bidirectional threat" is not credible and the entire model fails.

SUMMARY TABLE

| Finding | Claim Challenged | Severity |
|---|---|---|
| 1 | Net swing formula ignores non-Black vote vector | Critical |
| 2 | "100% of competitive races" is circular | Critical |
| 3 | Bidirectional threat assumption | Critical |
| 4 | Parliamentary analogies are systemically inapt | Major |
| 5 | Wasted vote definition is non-standard | Significant |
| 6 | Turnout modeling internally inconsistent | Major |
| 7 | "2.1x power multiplier" is undefined | Significant |
| 8 | Cohesion data is cherry-picked | Major |
| 9 | Ignores substantive representation literature | Major |
| 10 | ArXiv preprint mischaracterized | Significant |
| 11 | Callais legal characterization premature | Significant |
| 12 | International comparisons cherry-picked | Major |
| 13 | Gedanken framing inconsistent with empirical claims | Major |
| 14 | Republican negotiation assumption unstated | Critical |

WHAT WOULD MAKE THIS PAPER STRONG

The core intuition—that dispersed voters in competitive districts might have more leverage than concentrated voters in safe districts—is a genuine and underexplored question in redistricting scholarship. The paper's most original contribution is framing this as a testable proposition. To fulfill that promise:

1. Fix the math: Show net outcomes including non-Black partisan baselines for each specific new district.

2. Map to real geography: Match cracked voters to actual proposed new districts and look up their partisan histories.

3. Engage Frymer: The captured-constituency problem is this paper's most dangerous counter-argument and must be addressed.

4. Model the Republican side: The bidirectional threat requires Republican incentive compatibility, not just Black voter willingness.

5. Distinguish parliamentary from congressional mechanisms: The entire leverage theory needs a U.S.-specific institutional account.

6. Present the model honestly: As a speculative framework with important conditional findings, not as a paper with "unambiguous results."

The question this paper asks deserves serious analysis. The current version is not that analysis yet.