A First Principles Framework for Climate Governance: Core Functions from Cybernetic Theory


Executive Summary

Global climate governance suffers from what recent scholarship identifies as "theoretical fragmentation"—a proliferation of descriptive frameworks that fail to provide normative guidance for system design. We address this gap by developing the Theoretical Climate Action Framework (TCAF), which derives six necessary governance questions from cybernetic first principles. Drawing on Ashby's Law of Requisite Variety and Beer's Viable System Model, we demonstrate through logical necessity tests that any governance system capable of maintaining viability must address: (1) boundary definition and perception, (2) autonomy and supervision balance, (3) inter-unit coordination, (4) environmental intelligence, (5) normative policy coherence, and (6) reflexive learning. We establish these six questions as a theoretically robust and functionally necessary set—each is indispensable, and together they provide sufficient diagnostic coverage for system survival. Applied recursively across three scales (micro, meso, macro), this generates an 18-question diagnostic matrix. We distinguish trust and integrity as transversal mechanisms rather than additional questions, operating across all 18 cells through capacity-building, transparency, and institutional design. Empirical validation demonstrates that observed failures map precisely onto specific question deficits: the Kyoto Protocol's 38% to 11% coverage collapse exemplifies Q5 failure, voluntary carbon markets' 80-90% ineffectiveness rate reflects Q6 failure, and methodological fragmentation across standards represents Q3 failure. Conversely, the Montreal Protocol's success demonstrates how addressing all six questions enables effective governance. This axiomatic approach moves climate governance scholarship from inductive description toward deductive prescription, providing theoretical foundations for evaluating and designing robust multi-scale institutions.

Keywords: climate governance, cybernetics, viable systems, institutional design, axiomatic method, theoretical integration

1. Introduction

1.1 The Triple Crisis of Global Climate Governance

The inadequacy of current climate governance manifests across three interdependent dimensions. At the physical level, converging assessments project warming trajectories fundamentally overshooting the Paris Agreement's 1.5°C target. Climate Action Tracker's 2024 assessment indicates 2.7°C warming under current policies, while the IEA World Energy Outlook 2024 projects 2.4°C, and UNEP's 2024 Emissions Gap Report estimates 2.6°C by 2100. The implementation gap between current policies and the 1.5°C pathway stands at 22-27 GtCO₂e annually according to UNEP 2024. Existing and planned fossil fuel infrastructure commits approximately 850 GtCO₂ to future emissions—far exceeding the remaining carbon budget of 420-500 GtCO₂ estimated by IPCC AR6. The required emissions reduction rate has escalated from 7.6% per year (if starting in 2020) to approximately 9-15% per year if action is delayed to 2030, reaching politically implausible levels.
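The mismatch between committed emissions and the remaining budget can be checked with simple arithmetic. The following is a minimal Python sketch that uses only the figures quoted above; the variable and function names are illustrative assumptions and do not come from the cited reports.

```python
# Figures as quoted in the paragraph above, in GtCO2.
committed_emissions = 850.0                # existing and planned fossil fuel infrastructure
budget_low, budget_high = 420.0, 500.0     # remaining carbon budget range (IPCC AR6, as cited)


def overshoot_ratio(committed, budget):
    """Return how many times the committed emissions exceed a given budget."""
    return committed / budget


ratio_vs_high = overshoot_ratio(committed_emissions, budget_high)
ratio_vs_low = overshoot_ratio(committed_emissions, budget_low)
excess_min = committed_emissions - budget_high   # excess against the larger budget
excess_max = committed_emissions - budget_low    # excess against the smaller budget

print(f"{ratio_vs_high:.2f}x to {ratio_vs_low:.2f}x the remaining budget")
print(f"excess of {excess_min:.0f} to {excess_max:.0f} GtCO2")
```

On these figures, committed emissions amount to roughly 1.7 to 2.0 times the remaining budget.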

At the institutional level, international cooperation reveals systematic regression. The Kyoto Protocol's coverage collapsed from 41 countries representing 38% of global emissions in its first commitment period (2008-2012) to merely 34 countries covering 11% in its second period (2013-2020). Within ostensibly functional systems, integrity has eroded dramatically. Rigorous assessment of REDD+ and renewable energy offset projects found that approximately 80-90% of carbon credits delivered no real emission cuts (West et al., 2020; Guardian/SourceMaterial investigation, 2023). The Kariba REDD+ project alone involved 15.2 million improperly issued credits, with 10.3 million already retired by companies to meet climate commitments.

At the theoretical level, comprehensive reviews across multiple domains identify "theoretical fragmentation" as the field's central pathology. Studies of climate risk and corporate innovation document scholarship "characterized by theoretical fragmentation and empirical inconsistency" (Hahn et al., 2024). Research on grassroots climate movements shows that despite increased attention to power dynamics, "theoretical fragmentation persists," reproducing epistemological biases (Patterson et al., 2023). Urban environmental governance scholarship explicitly states that "theoretical fragmentation prevents the development of integrated frameworks that address social, economic, and environmental priorities collectively" (Bulkeley et al., 2024). These three crises—physical, institutional, and theoretical—are mutually reinforcing. Theoretical fragmentation produces inconsistent policy designs, which generate institutional failures, which worsen physical outcomes.

1.2 The Problem of Theoretical Fragmentation

Current climate governance scholarship resembles a "theoretical Babel"—multiple communities speaking distinct conceptual languages with limited capacity for meaningful synthesis. Three dominant approaches have emerged, each capturing important aspects while remaining incomplete.

Polycentric climate governance (PCG), building on Ostrom's work, emphasizes multiple autonomous decision centers. However, the 2024 special issue on "Empirical Realities of Polycentric Climate Governance" delivered sobering assessments. Tobin et al. (2024) found that "empirical testing of whether polycentricity actually assists climate mitigation remains extremely limited." Kellner et al. (2024) demonstrated that PCG systematically neglects power structures, particularly "design power." Morrison et al. (2024) identified the core deficit: without "overarching rules," polycentric systems risk devolving into ineffective fragmentation.

The regime complex approach accepts institutional multiplicity as inevitable (Keohane & Victor, 2011). These descriptive taxonomies excel at explaining variation but offer limited guidance for designing better systems. As Agon (2024) demonstrated through comparison with COVID-19 responses, traditional international law exhibits profound rigidity when confronting cross-sectoral crises.

Experimentalist governance proposes learning-oriented processes (Sabel & Zeitlin, 2017). While productively emphasizing adaptive capacity, critics note it lacks normative force and structural specification. Jordan et al. (2018) found that polycentric systems often degrade into chaotic patchworks without cumulative learning.

The fundamental limitation common to all three approaches is their inductive character—deriving frameworks from observing existing systems rather than deducing what any viable system must possess. Lederer, Walker, and Winden (2025) articulated this deficit precisely: "Polycentric governance provides an ethical foundation (autonomy), experimentalist governance provides a learning mechanism (trial and error), but they both lack a structural skeleton to guarantee the system's long-term survival. Without an appropriate cybernetic architecture, polycentricity easily degenerates into anarchy."

1.3 Our Contribution: From Fragmentation to First Principles

This paper develops TCAF through axiomatic derivation from cybernetic principles. We ground our approach in Ashby's Law of Requisite Variety and von Bertalanffy's conception of open systems requiring boundaries. From these axioms, we derive six necessary questions: boundary definition and perception (Q1), autonomy balanced with supervision (Q2), coordination and oscillation damping (Q3), adaptation through intelligence (Q4), policy coherence and identity (Q5), and reflexive learning (Q6).

Our central claim is that these six questions constitute a theoretically necessary and functionally sufficient set. We establish this through redundancy tests showing that no question can be eliminated without loss of distinct function, and necessity tests demonstrating that each addresses a unique failure mode. The framework's power lies in its recursive structure: governance operates simultaneously across at least three scales (micro, meso, macro), generating an 18-question diagnostic matrix. We distinguish trust and integrity as transversal mechanisms operating across all questions through three pillars—capacity building, transparency, and institutional design.
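The recursive structure can be made concrete as a small data structure. The sketch below is one possible representation, assuming a plain dictionary keyed by (scale, question); the short labels paraphrase the paper's question names, and the status scheme (None for "not yet assessed", True for "addressed") is a hypothetical choice made only for illustration.

```python
from itertools import product

# Question identifiers follow the paper; the labels are paraphrases.
QUESTIONS = {
    "Q1": "boundary definition and perception",
    "Q2": "autonomy and supervision balance",
    "Q3": "inter-unit coordination",
    "Q4": "environmental intelligence",
    "Q5": "normative policy coherence",
    "Q6": "reflexive learning",
}
SCALES = ("micro", "meso", "macro")


def build_matrix():
    """Return one diagnostic cell per (scale, question) pair, initially unassessed."""
    return {(scale, q): None for scale, q in product(SCALES, QUESTIONS)}


def unaddressed(matrix):
    """List the cells not yet marked as addressed (hypothetical status scheme)."""
    return [cell for cell, status in matrix.items() if not status]


matrix = build_matrix()
matrix[("macro", "Q5")] = True   # example: mark one cell as addressed

print(len(matrix))               # 6 questions x 3 scales
print(len(unaddressed(matrix)))
```

Keeping the cells keyed by scale and question means a diagnosis can report exactly which function is missing at which level, rather than a single aggregate score.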

Our approach represents a methodological shift from inductive description to deductive design, joining a tradition in social science of using axiomatic methods to reveal necessary structures. Kenneth Arrow derived fundamental limits on collective decision-making from axioms, not from surveying voting systems. Alexander Hamilton argued for the American federal system through "a priori deduction from assumed first principles." Our application aims to provide not merely explanation of existing failures but a blueprint for constructing viable systems.

Empirical validation comes from demonstrating that observed failures map onto specific question deficits. The Kyoto collapse exemplifies Q5 failure (normative identity dissolution). The 80-90% ineffectiveness rate in the voluntary carbon market (VCM) demonstrates Q6 failure (absence of reflexive mechanisms). The five-fold EU-China carbon price differential reflects Q3 failure (lack of coordination mechanisms). Conversely, the Montreal Protocol's success in phasing out ozone-depleting substances demonstrates how addressing all six questions enables effective global governance.

Research Trajectory. This paper establishes theoretical foundations. Subsequent research will apply TCAF to evaluate specific governance systems, develop empirically-validated diagnostic tools, and extend the framework across diverse institutional contexts. The immediate agenda includes systematic evaluation of the EU's Carbon Border Adjustment Mechanism using the 18-question matrix, comparative analysis across multiple governance systems to validate the framework's explanatory power, and development of operational assessment protocols calibrated through case studies.

1.4 Roadmap

Section 2 establishes theoretical foundations, presenting cybernetic axioms and defending axiomatic reasoning in social science. Section 3 systematically derives each of the six necessary questions from first principles. Section 4 demonstrates the functional necessity and sufficiency of our six-question set through logical tests. Section 5 extends the framework to multiple scales, presenting the 18-question matrix. Section 6 addresses trust and integrity as transversal mechanisms. Section 7 discusses theoretical contributions and scope boundaries. Section 8 concludes by identifying how this theoretical foundation enables future empirical research.

2. Theoretical Foundations

2.1 Cybernetic Axioms: The Physics of Governance

We ground TCAF in two foundational principles representing the most parsimonious statements about organized system persistence.

Axiom 1: Law of Requisite Variety (Ashby, 1956). For a system to successfully regulate its environment, the regulator's variety (complexity) must be at least as great as the variety of disturbances. Formally: V(R) ≥ V(D), where V(R) is regulator variety and V(D) is disturbance variety. In climate governance, the environment—comprising physical dynamics, economic structures, technological trajectories, and social movements—exhibits enormous variety. A governance system with insufficient internal variety will fail to control outcomes. This explains why monocentric solutions tend toward brittleness: they possess too little internal variety to match environmental heterogeneity.
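The inequality can be illustrated with a toy calculation. The sketch below assumes Ashby's simplest counting sense of variety (the number of distinguishable states) and his tabular setting, in which outcome variety cannot fall below V(D) / V(R). The disturbance and response labels are hypothetical and chosen only for illustration.

```python
import math


def variety(states):
    """Variety in the counting sense: the number of distinguishable states."""
    return len(set(states))


def has_requisite_variety(responses, disturbances):
    """True when V(R) >= V(D)."""
    return variety(responses) >= variety(disturbances)


def min_outcome_variety(responses, disturbances):
    """Lower bound on residual outcome variety: ceil(V(D) / V(R)), at least 1."""
    v_r = variety(responses)
    if v_r == 0:
        raise ValueError("a regulator needs at least one response")
    return max(1, math.ceil(variety(disturbances) / v_r))


# Hypothetical labels, for illustration only.
disturbances = ["drought", "price shock", "technology shift", "policy reversal"]
narrow_regulator = ["carbon tax"]
broad_regulator = ["carbon tax", "standards", "subsidies", "adaptation funds"]

print(has_requisite_variety(narrow_regulator, disturbances))  # False
print(min_outcome_variety(narrow_regulator, disturbances))    # 4
print(has_requisite_variety(broad_regulator, disturbances))   # True
print(min_outcome_variety(broad_regulator, disturbances))     # 1
```

The bound is a necessary condition, not a guarantee: a regulator with enough responses still has to select the right response for each disturbance.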

However, Ashby's law also implies a ceiling on necessary complexity. A governance system need not match environmental variety at the molecular level, only at the level of functionally significant patterns. This motivates our search for a minimal set of necessary functions: the smallest number of distinct regulatory mechanisms whose combined variety suffices for system viability.

Axiom 2: Open Systems Require Boundaries (von Bertalanffy, 1968). Any system exchanging matter, energy, or information with its environment must possess boundaries distinguishing self from not-self. Without boundaries, the system cannot maintain organization and collapses into thermodynamic equilibrium—synonymous with dissolution for living and social systems. The boundary need not be physical but must exist and be actively maintained through metabolic or organizational work.

For governance systems, boundary specification is thus not a technical detail but an existential prerequisite. Defining which emissions "belong to" which actors, which harms fall under which jurisdictions, or which lands qualify for carbon credits are all boundary-drawing exercises with ontological import. The climate crisis is partly a boundary crisis: atmospheric carbon dioxide respects no political borders, supply chains diffuse responsibility across continents, and future generations lack present representation.

These axioms jointly imply that governance is fundamentally a problem of organized complexity—requiring sufficient variety to match environmental perturbations (Axiom 1) and sufficient coherence to preserve system identity (Axiom 2). The six questions we derive represent the minimal set of functions necessary to achieve this balance.

Figure 1 illustrates how these two axioms generate the functional architecture of any viable governance system.

2.2 Governance as Organized Complexity

Translating cybernetic principles into governance contexts requires recognizing that governance systems are evolved socio-political structures where power, legitimacy, and meaning-making are constitutive elements. We employ cybernetic principles not as templates but as diagnostic tools for identifying necessary functions that any governance system—regardless of institutional embodiment—must fulfill.

"Control" in cybernetics refers to maintaining essential variables within viable ranges despite environmental perturbations. A democratic assembly controls when it steers collective outcomes through deliberation. A market controls when price signals coordinate resource allocation. A social norm controls when internalized values shape individual behavior. The cybernetic question is not whether to control (without control, systems disintegrate) but what functions must be performed for control to succeed.

We follow Beer's (1972, 1981) Viable System Model in conceptualizing governance as requiring multiple interacting subsystems, each performing distinct necessary functions. However, we depart from Beer's specific five-system architecture because it mixes functional requirements with structural solutions. Our six questions represent pure functional requirements, agnostic about institutional implementation. For instance, Q3 (coordination) specifies that some mechanism must damp oscillations between autonomous units, but it does not prescribe whether this occurs through markets, hierarchies, networks, or hybrid forms.

This abstraction serves a strategic purpose. Climate governance scholarship has often conflated specific institutional forms (UN Framework Convention, carbon markets, city networks) with the functions they purportedly serve, making diagnosis difficult and lesson transfer across contexts problematic. By separating function from form, TCAF enables precise diagnosis: a coordination failure (Q3) may occur not because markets are inherently flawed but because specific market designs lack adequate information flows or enforcement mechanisms.

2.3 Methodological Position: Axiomatic Reasoning in Social Science

Our deductive approach places us within a methodological tradition that has proven powerful for identifying necessary conditions and impossibility results. The strongest precedent comes from social choice theory. Kenneth Arrow (1951) specified axioms that any "fair" aggregation should satisfy (unrestricted domain, non-dictatorship, Pareto efficiency, independence of irrelevant alternatives) and demonstrated deductively that no procedure could simultaneously satisfy all. This impossibility theorem transformed democratic theory by revealing structural constraints that no amount of institutional creativity could overcome. The power of Arrow's method lay precisely in its abstraction from empirical variation to identify necessary logical relationships.

Political theory offers another precedent. Contemporary scholarship recognizes that Alexander Hamilton's essays in the Federalist Papers employed "a priori deduction from assumed first principles" in defending the design of the American federal system. Hamilton did not empirically survey existing confederations to inductively derive optimal arrangements; he reasoned from axioms about human nature and power to deduce necessary institutional safeguards. The Constitution he defended embedded cybernetic principles (checks and balances, federalism, separation of powers) that function as requisite variety generators and mutual damping mechanisms, though Hamilton did not use this vocabulary.

The case for axiomatic methods in climate governance rests on four arguments. First, the problem's urgency demands prospective design rather than retrospective learning from accumulated failures over centuries. We cannot afford to wait for climate institutions to evolve naturally through trial and error. Second, the problem's complexity exceeds human capacity for intuitive reasoning about system behavior. Axiomatic derivation helps identify non-obvious requirements and unintended interactions. Third, the problem's normative dimensions require explicit specification of values and constraints, which axioms provide more transparently than ad hoc institutional borrowing. Fourth, the problem's global scope necessitates principles general enough to be universal yet specific enough to be actionable.

Our axiomatic approach does not claim to produce determinate institutional blueprints. The six questions specify functional requirements, not structural forms. Multiple institutional configurations might fulfill these requirements, and local contexts will appropriately shape specific implementations. What the method provides is a minimal checklist: any governance system failing to address all six questions, in some manner, will prove non-viable. This is a diagnostic tool as much as a design template.

3. Deriving the Six Necessary Questions

We now execute the core theoretical contribution: deriving the minimal set of questions any viable climate governance system must answer. For each question, we provide: (1) theoretical derivation from first principles, (2) specification of the distinct failure mode when the question is neglected, and (3) empirical evidence from actual governance breakdowns validating the question's necessity.

Figure 2 traces the logical derivation of each question from the foundational axioms, demonstrating their theoretical necessity.

3.1 Q1: How Do We Define Boundaries and Perceive the System?

Derivation. Axiom 2 directly implies that any governance system must first specify what it governs. For climate systems, this question has multiple dimensions: physical (which greenhouse gases, which sinks, which territories), temporal (which time horizons, which discount rates), and ontological (what counts as mitigation versus adaptation, what defines additionality in offsets). These boundary choices are constitutive decisions determining what becomes visible and governable.

Ashby's Law further implies the system must possess sensory mechanisms capable of perceiving perturbations within defined boundaries with requisite variety. A governance system blind to emissions in certain sectors cannot regulate effectively, regardless of policy instruments. The perception subsystem must match the variety of emission sources and the sophistication of actors who might game the rules.

Failure Mode: Ontological Politics and Monitoring Blind Spots. When Q1 is inadequately addressed, governance systems suffer from two interrelated pathologies. First, boundary disputes become sites of irresolvable conflict when different actors operate with incommensurable ontologies. Second, monitoring gaps allow systematic non-compliance to persist undetected.

Satellite observations reveal that urban methane emissions are systematically underestimated by a factor of 1.4 to 2.6 in traditional ground-based inventories (NASA/Harvard, 2024). This is not random measurement error but systematic bias: ground methodologies define "boundaries" based on administrative convenience rather than physical reality. The Buenos Aires landfill contributes 49% of Argentina's managed landfill emissions but was likely underreported because boundary definitions excluded certain off-site emission sources.

The REDD+ controversy illustrates the ontological dimension. European carbon accounting treats forests as quantifiable carbon stocks—objects to be measured and traded. Many Indigenous communities understand forests as Earth beings with spiritual significance that cannot be commodified. This is not a disagreement about measurement techniques but about fundamental ontology—what forests are. When the Kariba REDD+ project proceeded with boundaries defined according to Western ontology, local communities experienced ontological violence: "only a small amount of money trickled down" despite project sales exceeding $18 million.

Empirical Validation. The systematic methane underestimation constitutes Q1 failure: the perception system possessed insufficient variety to match urban emission complexity, leading to control failure. The resolution required satellite remote sensing—a perception system with requisite variety to detect diffuse area sources. The 15.2 million false Kariba credits stemmed partly from boundary manipulation: incorrectly estimating future deforestation baselines artificially inflated the boundary of "additional" sequestration.

3.2 Q2: How Do We Balance Autonomy and Supervision?

Derivation. Ashby's Law implies that centralized control inevitably fails because no single regulator possesses enough variety to match environmental complexity. This necessitates delegating autonomy to lower-level units with local knowledge and adaptive capacity. However, pure autonomy without supervision leads to a different failure: local optimization undermining global objectives, or outright defection from collective commitments. Q2 emerges from this tension: governance systems must simultaneously grant sufficient autonomy for variety generation and impose adequate supervision to maintain systemic coherence.

In Beer's terminology, this is the relationship between System 1 (autonomous operational units) and System 3* (sporadic audit channels verifying System 1's self-reports). The balance is delicate: excessive supervision crushes local initiative and innovation, while insufficient supervision enables free-riding and fraud. The governance system must possess mechanisms that trigger supervision proportionate to detected anomalies—light-touch monitoring under normal conditions, intensive scrutiny when irregularities surface.
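The idea of supervision proportionate to detected anomalies can be sketched as a simple rule. The function below is an illustrative assumption, not a mechanism specified by Beer or by this framework: the anomaly score (scaled to the range 0 to 1), the baseline audit rate, the threshold, and the gain are all hypothetical parameters.

```python
def audit_intensity(anomaly_score, baseline=0.05, threshold=0.2, gain=0.9):
    """Return the fraction of a unit's self-reports to audit.

    Below the threshold, only light-touch baseline sampling applies.
    Above it, intensity rises linearly with the excess anomaly, capped at 1.0.
    All parameter values are hypothetical.
    """
    if not 0.0 <= anomaly_score <= 1.0:
        raise ValueError("anomaly_score must lie in [0, 1]")
    if anomaly_score <= threshold:
        return baseline
    excess = (anomaly_score - threshold) / (1.0 - threshold)
    return min(1.0, baseline + gain * excess)


for score in (0.1, 0.6, 1.0):
    print(score, round(audit_intensity(score), 2))
```

With these illustrative settings, a unit with few anomalies is sampled at 5%, while a unit at the maximum anomaly score has 95% of its reports audited. The design choice is that scrutiny scales with evidence rather than being uniformly heavy or uniformly light.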

Failure Mode: Autonomy Without Accountability. When Q2 is inadequately addressed, under-supervision allows autonomous units to defect without consequence, while over-supervision stifles necessary innovation.

The voluntary carbon market exemplifies Q2 failure through under-supervision. The system grants projects enormous autonomy to self-design methodologies and self-report results, with verification outsourced to third-party auditors having financial incentives to approve projects (disapprovals mean lost business). The result is systematic fraud: approximately 80-90% of assessed carbon credits delivered no real emission reductions (West et al., 2020). BeZero detected Kariba problems in January 2023 based on satellite data analysis, but South Pole did not suspend credit sales until after public warnings—suggesting supervision channels lacked independence from supervised units.

Empirical Validation. The VCM's 80-90% ineffectiveness directly measures Q2 failure: the system granted autonomy (project developers choose methodologies) without adequate supervision (independent verification with enforcement power). Specific failures illustrate the pattern: wind power projects achieved minimal offset effectiveness, and improved forest management projects showed similarly low rates (West et al., 2020). Verra's eventual cancellation of Kariba credits shows that supervisory mechanisms existed in principle but were not activated proactively, a classic Q2 failure mode.

3.3 Q3: How Do We Coordinate and Damp Oscillations Between Units?

Derivation. When a system comprises multiple autonomous units (as Q2 requires), interactions inevitably generate conflicts, oscillations, and potential cascades. Without coordination mechanisms, the system either fragments into incompatible pieces or experiences destructive oscillations as units respond to each other's actions without dampening. Beer's System 2 addresses this: coordination functions must resolve routine conflicts, standardize interfaces, and prevent local disturbances from propagating systemically.

Q3 is distinct from Q5 (policy coherence) because it operates at execution level: "Given that we have decided on policy, how do we ensure multiple implementing units do not work at cross-purposes?" Requisite variety implies coordination cannot rely solely on central commands (which would violate Q2's autonomy requirement) but must include horizontal mechanisms: shared standards, reciprocal adjustments, and protocols allowing units to anticipate and accommodate each other's actions.
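A toy model shows why autonomous units need a damping mechanism. In the sketch below, two units pursue a joint target and each closes a chosen share of the observed gap in every round. The model, its parameter values, and the function name are assumptions made for illustration and are not drawn from Beer's work.

```python
def simulate_gap(share, steps=6, target=100.0, x=30.0, y=30.0):
    """Track the gap to a joint target when two units each close `share` of it.

    share = 1.0: each unit independently corrects the full gap (no coordination).
    share = 0.5: the units split the correction between them (coordinated).
    """
    gaps = []
    for _ in range(steps):
        gap = target - (x + y)
        gaps.append(gap)
        x += share * gap
        y += share * gap
    return gaps


print(simulate_gap(1.0))    # [40.0, -40.0, 40.0, -40.0, 40.0, -40.0]
print(simulate_gap(0.75))   # [40.0, -20.0, 10.0, -5.0, 2.5, -1.25]
print(simulate_gap(0.5))    # [40.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

When each unit reacts to the full gap, the combined correction overshoots and the system oscillates indefinitely. A shared protocol for dividing the correction removes the oscillation without any central command, which is the kind of horizontal mechanism described above.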

Failure Mode: Methodological Fragmentation and Market Segmentation. When Q3 fails, systems exhibit two pathologies: a proliferation of incompatible standards that prevents interoperability, and price fragmentation that creates arbitrage opportunities and coordination traps.

The crisis in measurement, reporting, and verification (MRV) systems illustrates the first pathology. UNFCCC, Verra, and Gold Standard employ fundamentally different methodologies for calculating REDD+ baselines. The same forest project would be assigned radically different emission reduction values depending on which standard applies. This is not healthy pluralism but dysfunctional fragmentation: buyers cannot compare projects, regulators cannot aggregate outcomes, and systematic bias goes undetected because there is no common reference frame.

Carbon market fragmentation illustrates the second pathology. In 2024, EU ETS carbon prices averaged approximately €65 per ton while China's market averaged €13 per ton, a five-fold differential. This price gap is not merely an inefficiency; it represents a coordination failure that prevents market linkage. Absent coordinated standards for measurement, verification, and enforcement, cost-effective abatement cannot flow to where it delivers the most reduction per euro spent.

Empirical Validation. Fragmentation of carbon accounting methodologies directly manifests Q3 failure: multiple actors operate autonomously (Q2) but lack coordinating mechanisms (Q3) to ensure methods are commensurable. The consequence is that "additionality" for the same project can be quantified as vastly different values, making system-level accounting impossible. The €65 versus €13 price differential indicates the economic cost of Q3 failure at macro scale: a relative price gap of roughly 80% signals gains from trade that remain unrealized because of coordination deficits.
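The two headline numbers follow directly from the quoted prices. The short Python check below uses only the €65 and €13 averages given above; the variable names are illustrative.

```python
eu_price = 65.0      # EUR per ton, EU ETS 2024 average as quoted above
china_price = 13.0   # EUR per ton, China's market as quoted above

price_ratio = eu_price / china_price                  # the five-fold differential
relative_gap = (eu_price - china_price) / eu_price    # gap as a share of the higher price

print(f"ratio: {price_ratio:.1f}x")
print(f"relative gap: {relative_gap:.0%}")
```

The roughly 80% figure therefore corresponds to the relative price gap. Converting a price gap into a welfare loss would require abatement cost curves for both markets, which this arithmetic does not provide.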

3.4 Q4: How Do We Adapt Through Intelligence and Foresight?

Derivation. Climate governance operates in fundamentally uncertain and non-stationary environments. Physical climate dynamics involve tipping points and irreversibilities. Technological possibilities evolve rapidly. Political coalitions shift. Social movements mobilize or demobilize. Axiom 1 implies static systems cannot control dynamic environments—regulators must possess adaptive capacity to track environmental changes.

Beer's System 4 (Intelligence) fulfills this function: scanning the external environment, modeling future scenarios, and feeding forward projections to inform present decisions. This differs from Q6 (reflexive learning from past errors) because it concerns anticipating novel futures rather than correcting past mistakes. The requisite variety principle implies the intelligence function must match the variety of possible environmental futures—hence requiring distributed sensing, diverse scenario analysis, and mechanisms to incorporate dissenting or marginal perspectives that may detect early warning signals.

Failure Mode: Adaptive Capacity Deficits and Surprise Vulnerability. When Q4 is inadequately addressed, systems cannot detect emerging threats until after critical thresholds are crossed and cannot update strategies in light of changing circumstances, leading to persistent application of obsolete approaches.

The Paris Agreement's Nationally Determined Contributions (NDCs) illustrate both failures. The mechanism was designed to be adaptive: countries submit increasingly ambitious targets every five years based on stocktake assessments. However, the 2023 Global Stocktake revealed that approximately 115 countries failed to update NDCs, and even updated NDCs remain insufficient for 1.5°C. This suggests Q4 failure: intelligence gathering (Global Stocktake) produced clear signals that current policies are inadequate, but feed-forward mechanisms to translate intelligence into revised action plans are structurally weak.

India's transparency reporting crisis exemplifies Q4 failure at national scale. The country delayed its Biennial Transparency Report to 2026 due to fragmented data across 30+ agencies. This is not merely bureaucratic inefficiency but an intelligence system lacking requisite variety: to perceive and project emissions across a complex economy, the sensing network must integrate data from energy, transport, agriculture, industry, and land use sectors. India's fragmented system cannot generate the unified intelligence needed for adaptive policy.

Empirical Validation. The failure of 115 countries to update NDCs after the 2023 Global Stocktake is direct evidence of Q4 failure: environmental intelligence was gathered (stocktake), but the governance system could not translate it into adaptive action (updated commitments). The growing gap between stated ambition and required trajectory—7.6% annual reduction needed from 2020 baseline, escalating to 9-15% if delayed—quantifies the cost of Q4 failure: each year of delayed adaptation increases the required future reduction rate, eventually reaching politically impossible levels.

3.5 Q5: How Do We Maintain Policy Coherence and Identity?

Derivation. A governance system is not merely operational functions (Q1-Q4) but an entity with identity, purpose, and normative commitments. Beer's System 5 (Policy) sets boundaries within which all other functions operate, defining the system's raison d'être and resolving tensions between competing objectives. For climate governance, this includes: What is the ultimate temperature target? How do we balance mitigation versus adaptation? What principles govern burden-sharing?

Q5 is necessary because without it, the governance system lacks an attractor state—a stable equilibrium toward which activities converge. Lower-level systems (Q2-Q4) can function efficiently in service of multiple incompatible goals simultaneously, causing the entire system to thrash between them without making progress. Ashby's Law implies the policy function must absorb the variety of conflicting values and interests in society, synthesizing them into coherent direction the whole system can pursue.

The key challenge is maintaining identity over time despite changing circumstances (which Q4 detects) and learning from errors (which Q6 enables). An adaptive system changing too readily loses its normative core and becomes rudderless; one changing too slowly becomes brittle and disconnected from reality. Q5 must balance continuity (maintaining commitments despite difficulties) with evolution (updating goals as understanding deepens).

Critics might argue Q5 (normative identity) represents political will failures rather than cybernetic control failures. We contend this distinction is artificial: in social systems, normative coherence is a control variable. Just as Beer's System 5 sets the "identity" that bounds all other system functions, political will is not exogenous but emerges from institutional design. The Kyoto collapse demonstrates that even technically sound operational systems (Q1-Q4) disintegrate without normative attractor states (Q5) that make continued participation individually rational for key actors.

Failure Mode: Legitimacy Collapse and Normative Fragmentation. When Q5 fails, systems experience loss of collective identity, leading to defection cascades where actors no longer perceive themselves bound by shared commitments.

The Kyoto Protocol's collapse provides textbook evidence. During the first commitment period (2008-2012), 41 countries representing 38% of global emissions shared a collective identity as parties to a binding agreement with quantified targets. However, the normative foundation was fragile, depending on all major emitters accepting their assigned roles. Collective identity fractured as the United States never ratified, Japan declined second-period targets, Canada exited rather than face non-compliance penalties, and Russia declined a second commitment after benefiting from "hot air" credits. By the second period (2013-2020), only 34 countries covering 11% of global emissions remained, a catastrophic legitimacy collapse. The 38% to 11% decline quantifies Q5 failure: the governance system could not maintain normative attractiveness as a shared enterprise once key actors redefined their identities outside the regime.

Common But Differentiated Responsibilities (CBDR) illustrates Q5 functioning at the level of ethical architecture. CBDR establishes that while all countries share responsibility for climate action, historical emissions and development needs justify differentiated commitments. This normative principle serves as System 5's core identity, absorbing competing claims about fairness and synthesizing them into a framework that enough actors can accept for cooperation to proceed.

Empirical Validation. Kyoto's participation collapse directly measures Q5 failure: normative identity dissolved, rendering operational mechanisms (Q1-Q4) irrelevant because the policy framework no longer commanded allegiance. Countries left not due to technical failures in monitoring or coordination but because they no longer identified with the regime's normative framework. The "hot air" problem—where Russia's economic collapse created surplus emission permits—further illustrates Q5 failure: the policy framework (1990 baseline) produced perverse outcomes undermining ethical coherence.

3.6 Q6: How Do We Enable Reflexive Learning and Error Correction?

Derivation. All governance systems operate under Knightian uncertainty: they face futures that cannot be fully specified in advance and must make decisions based on incomplete models. This guarantees errors will occur—not merely implementation mistakes but fundamental design flaws becoming apparent only through operation. Q6 addresses the meta-question: how does the system learn about its own inadequacies and correct them?

Von Foerster's second-order cybernetics (the cybernetics of observing systems) provides the theoretical foundation. First-order control loops (Q1-Q5) regulate specific variables, but second-order loops observe and modify the control loops themselves. Argyris and Schön's (1978) distinction between single-loop learning (adjusting actions to meet fixed goals) and double-loop learning (questioning the goals themselves) maps onto this: Q6 specifically concerns double-loop learning, the capacity to revise fundamental assumptions when evidence of their flaws accumulates.

The reflexive function differs from Q4 (intelligence) in temporal orientation and epistemological stance. Q4 looks forward to anticipate changes; Q6 looks backward to diagnose errors. Q4 operates in planning mode; Q6 in evaluation mode. Most crucially, Q6 must be institutionalized—requiring standing procedures that regularly surface problems, trace them to root causes, and implement corrections even when doing so challenges powerful interests or comfortable assumptions.

Failure Mode: Locked-in Dysfunction and Systemic Fraud Persistence. When Q6 is inadequately addressed, systems perpetuate dysfunctions long after evidence of failure accumulates. Unlike other questions where failure produces immediate crisis, Q6 failure enables gradual decay: each unreflected error compounds, small frauds normalize into systemic corruption, and obsolete assumptions calcify into institutional dogma.

The voluntary carbon market's persistent 80-90% ineffectiveness rate is the signature of Q6 failure. This is not sudden breakdown but sustained dysfunction visible for years. Independent assessments repeatedly found credits did not represent real reductions. Studies of REDD+ showed systematic overestimation. Analysis of renewable energy offsets revealed projects would have been built anyway, lacking true additionality. Despite accumulating evidence, the VCM lacked reflexive mechanisms to question foundational methodologies. Verification agencies, having financial stakes in approving projects, did not trigger system-wide reviews even when aware of problems. When BeZero placed Kariba on its watch list in January 2023 based on satellite analysis, the system's response was delayed and defensive: South Pole suspended sales only after external pressure, and Verra required months to cancel credits.

The absence of institutionalized reflexivity meant lessons were not transferred across projects. The same methodological errors (inflated baselines, incorrect permanence assumptions, phantom additionality) recurred in project after project, with each scandal treated as isolated rather than evidence of systemic design flaws. This is the hallmark of Q6 failure: the system has sensors (Q1), intelligence (Q4), and coordination (Q3), but lacks meta-capacity to question whether these functions are correctly configured.

Empirical Validation. The VCM operated for over a decade with 80-90% ineffectiveness before major corrective action, quantifying Q6 failure's cost: millions of tons of claimed reductions were false, purchased by companies to justify continued emissions. The Kariba case is particularly probative: after 15.2 million false credits were issued, with 10.3 million already retired, the system's response was not comprehensive methodology review but merely canceling excess credits from this one project. This reactive rather than reflexive response proves Q6 inadequacy. Had the system possessed functioning reflexive loops, satellite detection of deforestation discrepancies would have triggered examination of all REDD+ methodologies, not just Kariba.
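
The arithmetic behind the Kariba figures shows how little a late, project-level correction can still reach. The following minimal sketch uses only the two numbers reported above and assumes, for illustration, that credits already retired can no longer be withdrawn from the claims they supported; correction_reach is a hypothetical helper name.

```python
# Figures from the Kariba case as reported in the text (millions of credits).
ISSUED_FALSE_M = 15.2
ALREADY_RETIRED_M = 10.3


def correction_reach(issued: float, retired: float) -> tuple[float, float]:
    """Return (credits still cancellable, share already used).

    Assumption: retired credits have already been counted against buyers'
    emissions, so only the unretired remainder can be cancelled.
    """
    if not 0 <= retired <= issued:
        raise ValueError("retired must lie between 0 and issued")
    return issued - retired, retired / issued


cancellable_m, used_share = correction_reach(ISSUED_FALSE_M, ALREADY_RETIRED_M)
print(f"still cancellable: {cancellable_m:.1f}M; already used: {used_share:.0%}")
```

On these figures roughly two thirds of the false credits had already been used before cancellation began, which is why a reactive, single-project response leaves most of the damage in place.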

4. Demonstrating Functional Necessity and Sufficiency

Having derived six necessary questions, we now establish through logical tests that they constitute a theoretically robust set: each question is indispensable (necessity), and no question is redundant (sufficiency, used here in the sense of non-redundancy: no pair of questions collapses into one). This dual demonstration grounds our claim that Q1-Q6 represent the irreducible functional architecture of viable governance.

4.1 Necessity Tests: Unique Failure Modes

To demonstrate necessity, we show each question addresses a distinct failure mode not covered by any other question. The empirical evidence in Section 3 provides initial validation; here we strengthen the argument through counterfactual reasoning demonstrating that other questions cannot compensate for any single question's absence.

Q1 Necessity: Boundary Crises Are Irreducible. Urban methane underestimation (1.4-2.6x) is Q1 failure—incorrect boundaries for emission attribution. Could other questions compensate? Q2 (autonomy/supervision) cannot help if underlying perception is blind to sources. Q3 (coordination) cannot align actors around measurements that do not exist. Q4 (intelligence) cannot forecast trends in unobserved variables. Q5 (policy) cannot maintain legitimacy when accomplishments are illusory. Q6 (learning) cannot correct errors whose existence is undetected. The monitoring system must first perceive phenomena correctly (Q1) before other governance functions can operate effectively. This establishes Q1's logical priority and functional necessity.

Q2 Necessity: Autonomy-Supervision Balance Cannot Be Derived. The VCM's combination of project autonomy with weak verification created space for 80-90% ineffectiveness. Other questions could not prevent this: Q1 (boundary) was arguably satisfied—emissions were measured using accepted methodologies; the problem was fraudulent application. Q3 (coordination) existed through common standards (Verra, Gold Standard), but standards were gamed. Q4 (intelligence) detected problems (BeZero's satellite analysis), but without Q2's supervision channel, intelligence did not trigger action. Q5 (policy) provided clear normative direction (additionality requirements), but without supervision, policy was evaded. Q6 (learning) could not function because fraud was systematically concealed. Only Q2's specific function—balancing autonomy with proportionate supervision—could have prevented this failure mode.

Q3 Necessity: Coordination Is Geometrically Distinct from Other Functions. Fragmentation of carbon accounting methodologies across UNFCCC, Verra, and Gold Standard represents Q3 failure—lack of horizontal coordination between peer systems. This differs from Q2 (vertical supervision within each system). Each standard supervises its own projects (Q2), but they lack coordination with each other (Q3). Q1 (boundary) is satisfied within each system's domain. Q4 (intelligence) might detect divergence between systems, but knowing systems diverge does not itself create coordination. Q5 (policy coherence) might state systems should align, but without Q3's mechanisms (shared protocols, regular reconciliation), alignment does not occur. Q6 (reflexive learning) might recognize fragmentation as problematic, but learning creates knowledge, not coordination. The function of actually aligning multiple autonomous systems requires Q3's specific architecture.

Q4 Necessity: Forward Intelligence Cannot Be Substituted. India's inability to adapt transparency reporting due to fragmented data across 30+ agencies illustrates Q4 necessity. Agencies perceive emissions within domains (Q1), operate with some autonomy (Q2), and coordinate to some degree (Q3). Policy direction is clear (Q5): India committed to enhanced transparency under Paris. Yet the system cannot adapt its internal architecture to meet new requirements because no function integrates signals across agencies to project future needs and redesign structures. Q6 (learning from errors) is reactive; Q4 (anticipating changes) is proactive. Without Q4, systems only recognize inadequacy after failure, when adaptation comes too late.

Q5 Necessity: Normative Coherence Is Not Reducible to Operations. The Kyoto Protocol had functioning operational systems: MRV (Q1) monitored emissions, countries had autonomy (Q2), the Clean Development Mechanism provided coordination (Q3), and negotiations occurred regularly, suggesting intelligence gathering (Q4). Yet the system collapsed because its normative foundation (Q5) fractured when major players withdrew. This demonstrates that Q5 is not reducible to operational efficiency. Even technically well-functioning systems disintegrate if actors cease to identify with the system's purposes. Conversely, actors might maintain commitment to a shared purpose (Q5) despite operational dysfunction, working to fix problems rather than defect. Q5 provides the attractor that keeps actors engaged through difficulties.

Q6 Necessity: Reflexive Learning Addresses Unknown Unknowns. One might argue Q4 (intelligence/adaptation) obviates Q6 (reflexive learning): sufficiently intelligent systems would anticipate problems before occurrence. However, this conflates two epistemologically distinct functions. Q4 operates in the space of known unknowns—we know we do not know how technology will evolve, so we scan for signals. Q6 operates in unknown unknowns—we do not know that baseline methodologies are systematically biased until evidence accumulates. The VCM had Q4 function (tracking technology, policy changes) but lacked Q6 (questioning whether additionality frameworks themselves were sound). No amount of forward intelligence would reveal that current methodologies enabled fraud; only reflexive evaluation of outcomes against assumptions could expose this.

4.2 Sufficiency Tests: No Pair Collapses to One

To demonstrate sufficiency (no redundancy), we examine whether any two questions could be combined without loss of distinct function.

Q2 vs Q3: Geometric Orthogonality. Q2 governs vertical relationships (higher levels supervising lower levels within operational domains); Q3 governs horizontal relationships (peer units at same level coordinating laterally). The VCM illustrates: Q2 failure occurred between Verra and individual projects (vertical); Q3 failure occurred between Verra, Gold Standard, and UNFCCC methodologies (horizontal). These are geometrically distinct relationships requiring different mechanisms. Vertical supervision uses audit, enforcement, and appeal procedures. Horizontal coordination uses standardization, mutual recognition agreements, and conflict resolution protocols. Collapsing them would obscure which dimension is failing and what remedy is required.
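
The orthogonality claim amounts to a simple classification rule over relationships. The sketch below is illustrative only: the Unit type and the numeric tiers assigned to each actor are hypothetical, chosen to mirror the VCM example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    name: str
    level: int  # hypothetical tier: a higher number sits higher in the hierarchy


def relationship_question(a: Unit, b: Unit) -> str:
    """Classify a governance link: same tier -> Q3, different tiers -> Q2."""
    if a.level == b.level:
        return "Q3 (horizontal coordination)"
    return "Q2 (vertical supervision)"


verra = Unit("Verra", level=1)
gold_standard = Unit("Gold Standard", level=1)
project = Unit("individual project", level=0)

print(relationship_question(verra, project))        # standard body vs. project
print(relationship_question(verra, gold_standard))  # peer standard bodies
```

Because the rule depends only on relative position, a single actor such as Verra participates in both kinds of relationship at once, which is why a failure in one dimension says nothing about the other.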

Q4 vs Q6: Temporal and Epistemological Distinction. Q4 and Q6 update different aspects of the system through different epistemologies. Q4 updates the system's model of its environment based on external sensing; Q6 updates the system's model of itself based on internal evaluation. Q4 is prospective (what might happen); Q6 is retrospective (what went wrong). Q4 assumes the basic architecture is sound and adapts it to changing circumstances; Q6 questions the architecture itself. The IPCC performs Q4 (projecting climate scenarios and policy pathways) but does not perform Q6 (evaluating whether UNFCCC's governance architecture is fundamentally sound). These functions require different institutional forms: Q4 needs scientific expertise and scenario modeling capacity; Q6 needs meta-institutional review authority with power to question foundational assumptions.

Q5 vs Others: Level Distinction. While Q5 sets ultimate purposes, it does not specify operational requirements. A system might have crystal-clear policy goals (Q5 satisfied) but still fail operationally because it cannot perceive what it governs (Q1), cannot balance autonomy and supervision (Q2), cannot coordinate multiple actors (Q3), cannot adapt to environmental change (Q4), or cannot learn from operational errors (Q6). Kyoto had an explicit policy framework (temperature stabilization, differentiated responsibilities), yet failed through Q5's own dysfunction (legitimacy collapse), not through subordinate operational failures. This shows that Q5 operates at a distinct logical level, not as a meta-function encompassing the others.

Having established through necessity tests that each question is indispensable and through sufficiency tests that no question is redundant, we conclude Q1-Q6 constitute a theoretically robust and functionally necessary set grounded in cybernetic principles.

5. Multi-Scale Recursion: From Six Questions to Eighteen

5.1 Why At Least Three Scales?

The six questions apply at a single level of analysis, but climate governance is irreducibly multi-scale. Emissions occur through molecular processes, are aggregated through infrastructure systems, shaped by corporate strategies, regulated by national policies, and coordinated through international agreements. A framework applicable at only one scale would possess insufficient variety to govern this complexity.

Theoretical Foundation: Simon's Near-Decomposability. Herbert Simon's (1962) principle of near-decomposability provides a theoretical foundation for multi-scale organization. Complex systems that persist over time exhibit hierarchical structure where subsystems are more tightly coupled internally than externally. This architecture emerges not from design preference but from necessity: flat systems with all components equally interconnected cannot maintain stability because disturbances propagate instantly throughout the system. Hierarchical structure acts as requisite variety management: it contains local perturbations within subsystems while allowing controlled information flow between levels.

Simon demonstrated this through the parable of two watchmakers, Tempus and Hora. Tempus assembled watches from elementary components directly; any interruption required starting over completely. Hora assembled watches from stable subassemblies combined into larger subassemblies. When interrupted, Hora lost only work on the current subassembly. Over time, Hora vastly outproduced Tempus. The moral: hierarchical decomposition is prerequisite for building complex systems in environments subject to perturbations.
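
The parable can be made quantitative with a short calculation. The sketch below uses illustrative parameters that are assumptions here, not figures from the text above: a 1,000-part watch, subassemblies of 10 elements, and a 1% chance of interruption per part placed. It relies on the standard result for the expected number of trials needed to obtain n consecutive successes; all function names are hypothetical.

```python
def expected_placements(parts: int, p_interrupt: float) -> float:
    """Expected part placements needed to finish one assembly of `parts`
    pieces when each placement is interrupted with probability p_interrupt
    and an interruption scraps the unfinished assembly.

    Equals the expected number of trials to get `parts` consecutive
    successes: (q**-n - 1) / p, with q = 1 - p.
    """
    if not 0 < p_interrupt < 1:
        raise ValueError("p_interrupt must be strictly between 0 and 1")
    q = 1.0 - p_interrupt
    return (q ** -parts - 1.0) / p_interrupt


def hierarchical_assemblies(parts: int, span: int) -> int:
    """Number of `span`-element assemblies needed to build `parts` pieces
    level by level (assumes `parts` is an exact power of `span`)."""
    count = 0
    while parts > 1:
        if parts % span:
            raise ValueError("parts must be an exact power of span")
        parts //= span
        count += parts
    return count


PARTS, SPAN, P = 1000, 10, 0.01  # illustrative assumptions

tempus = expected_placements(PARTS, P)  # one flat 1,000-part assembly
hora = hierarchical_assemblies(PARTS, SPAN) * expected_placements(SPAN, P)
print(f"Tempus needs about {tempus / hora:,.0f} times more work per watch than Hora")
```

Under these assumed parameters the flat assembly strategy costs several orders of magnitude more expected work per finished watch, which is the quantitative core of the argument for hierarchical decomposition.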

Frequency Separation Principle. Simon showed that near-decomposable hierarchies naturally separate into distinct temporal scales. High-frequency phenomena (daily operations) are handled at lower levels through rapid feedback loops. Low-frequency phenomena (strategic orientation) are handled at higher levels through slower but more comprehensive integration. This frequency separation is functionally necessary: attempting to control high-frequency disturbances from high levels creates fatal delays, while allowing low levels to set long-term direction causes incoherence.

In climate governance, we observe similar separation. Facility-level emission controls operate on daily timescales (monitor combustion, adjust operations). Sectoral policies operate on yearly to decadal timescales (update building codes, phase out coal plants). International agreements operate on decadal to generational timescales (set century-scale temperature targets, negotiate burden-sharing principles). These are not arbitrary distinctions but reflect the natural frequency spectrum of the governed system.

Empirical Validation. Review of ecosystem service assessments and polycentric governance studies finds that "typically 2-4 key scales suffice to capture cross-scale dynamics" (Ostrom, 2010). More scales add little explanatory power while increasing coordination costs. Fewer scales cannot match environmental variety. For climate governance specifically, three scales—micro (projects/facilities), meso (sectors/regions), and macro (nations/international)—capture functionally distinct roles while remaining tractable.

Why Not Two Scales? A two-level system (local + global) fails because it lacks the meso scale performing critical translation functions. The micro scale possesses high variety (diverse technologies, contexts, actors) but limited authority. The macro scale possesses high authority (treaty obligations, global targets) but limited variety (cannot specify implementation details for every context). The meso scale (sectoral associations, regional blocs, technology platforms) bridges this gap, aggregating micro variety into patterns recognizable at macro level while decomposing macro directives into actionable meso strategies. The EU ETS operates precisely at this meso scale—too large to directly regulate individual facilities, too small to set global policy, but perfectly positioned to coordinate national policies and create market infrastructure.

5.2 The 18-Question Matrix: Recursive Application

Applying six questions recursively across three scales generates an 18-question diagnostic matrix (6 questions × 3 scales). Crucially, the same logical question manifests differently at each scale due to level-specific properties—what counts as a "boundary" differs fundamentally between a facility, a sector, and a nation.

Figure 3 presents the complete 18-question diagnostic matrix, showing how each question manifests distinctively at each governance scale.

Micro Scale (Projects, Facilities, Local Governments): At this scale, actors are embedded in tangible physical and social contexts. Q1 concerns physical system definition: which emission sources are included in facility inventory, which lands comprise REDD+ project boundary, which activities constitute a city's transportation sector. Q2 concerns balance between operational flexibility for site managers and oversight by sectoral regulators. Q3 addresses how nearby facilities or competing projects align activities to avoid conflicts (competition for biomass feedstock, coordination of transportation investments). Q4 involves monitoring local conditions (fuel prices, weather patterns) and adapting operations. Q5 establishes the facility or project's contribution to broader goals (corporate sustainability commitments, municipal climate plans). Q6 entails operational learning—adjusting processes based on performance data.

Meso Scale (Sectors, Regions, Technology Platforms): This scale aggregates micro actors into functional categories while remaining distinct from macro-scale sovereign authority. Q1 concerns sectoral definition: does aviation include international flights, does the power sector include distributed generation, do regional carbon markets include imported electricity? Q2 balances sector self-regulation against regulatory oversight—voluntary industry commitments versus mandatory standards. Q3 addresses inter-sectoral alignment: ensuring renewable energy subsidies do not conflict with transport electrification policies. Q4 involves technology forecasting and diffusion tracking within sectors. Q5 establishes sectoral decarbonization pathways aligned with national targets. Q6 entails industry-wide learning from demonstration projects and pilot programs.

Macro Scale (Nations, International Agreements): At this scale, governance confronts sovereignty, treaty obligations, and global coordination. Q1 concerns attribution of responsibility: production-based versus consumption-based accounting, territorial jurisdiction over transboundary emissions (shipping, aviation, supply chains). Q2 involves tension between national sovereignty and international monitoring—the Enhanced Transparency Framework under Paris represents current instantiation. Q3 addresses alignment between national policies to enable market linkage, technology transfer, and collective action (club approaches, sectoral agreements). Q4 includes global stocktakes, IPCC assessments, and scenario modeling. Q5 sets ultimate normative frameworks: temperature targets, equity principles (CBDR), responsibility for historical emissions. Q6 involves regime evaluation and amendment processes—capacity to revise the Paris rulebook based on implementation experience.

Hierarchical Emergence: Each Scale Exhibits Novel Properties. A critical feature of this recursive structure is that higher levels are not simple aggregations of lower levels but exhibit emergent properties that cannot be reduced to component behaviors. Q5 (policy coherence) at micro scale might involve a corporation's values and brand identity. At macro scale, Q5 involves cosmopolitan ethics and global justice principles that cannot be reduced to corporate values—they emerge from interactions among sovereign states operating in an anarchic international system. Similarly, Q3 (coordination) at micro scale might mean neighboring factories sharing infrastructure. At macro scale, Q3 involves international climate finance and technology transfer—emergent coordination mechanisms with no micro-scale analog.

This emergence justifies treating each scale as possessing its full complement of six questions rather than assuming higher-level questions automatically derive from lower-level ones. The 18-question matrix is not redundant but captures functionally distinct governance requirements at each scale.
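
For readers who want to apply the matrix, it can be represented as a small lookup table. The sketch below is one possible encoding, not a reference implementation: the cell descriptions are abridged from the scale descriptions above, and the scoring helper, its 0-to-1 adequacy scores, and the threshold are hypothetical additions for illustration.

```python
from itertools import product

QUESTIONS = ("Q1", "Q2", "Q3", "Q4", "Q5", "Q6")
SCALES = ("micro", "meso", "macro")

# Scale-specific manifestations, abridged from the text; order follows QUESTIONS.
MANIFESTATIONS = {
    "micro": (
        "which sources, lands, or activities fall inside the boundary",
        "site-manager flexibility vs. sectoral regulator oversight",
        "nearby facilities or competing projects aligning activities",
        "monitoring local conditions and adapting operations",
        "contribution to corporate or municipal climate goals",
        "operational learning from performance data",
    ),
    "meso": (
        "sectoral definition (e.g., international flights in aviation)",
        "sector self-regulation vs. regulatory oversight",
        "inter-sectoral policy alignment",
        "technology forecasting and diffusion tracking",
        "sectoral decarbonization pathways aligned with national targets",
        "industry-wide learning from demonstration and pilot projects",
    ),
    "macro": (
        "attribution of responsibility across jurisdictions",
        "national sovereignty vs. international monitoring",
        "alignment of national policies for linkage and collective action",
        "global stocktakes, IPCC assessments, scenario modeling",
        "temperature targets and equity principles (CBDR)",
        "regime evaluation and amendment processes",
    ),
}


def build_matrix() -> dict[tuple[str, str], str]:
    """Return the 6 x 3 matrix keyed by (question, scale)."""
    return {
        (q, s): MANIFESTATIONS[s][QUESTIONS.index(q)]
        for q, s in product(QUESTIONS, SCALES)
    }


def weakest_cells(scores: dict[tuple[str, str], float], threshold: float):
    """Cells whose (hypothetical) adequacy score falls below the threshold."""
    return sorted(cell for cell, score in scores.items() if score < threshold)


matrix = build_matrix()
print(len(matrix), "cells; Q5 at macro scale:", matrix[("Q5", "macro")])
```

Keying each cell by question and scale keeps the recursion explicit: the same six questions are asked three times, but each cell carries its own scale-specific content.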

6. Trust and Integrity as Transversal Mechanisms

6.1 Why Not Q7? The Category Difference

One might propose adding "trust and integrity" as a seventh question: "How do we ensure actors comply with commitments and information is reliable?" We demonstrate this would be a category error. Trust and integrity are not additional governance functions on par with Q1-Q6 but transversal mechanisms enabling all six questions to operate effectively. They are to governance what metabolism is to organisms—not an additional organ but the process allowing all organs to function.

The Logical Test: Transversal Relationships. If trust/integrity were Q7, we should specify what unique failure mode it addresses not covered by Q1-Q6. However, examination of trust failures reveals they always manifest through one of the existing six questions:

  • Trust failure in MRV systems manifests as Q1 failure (perception systems report false data)
  • Trust failure in verification agencies manifests as Q2 failure (supervision is compromised)
  • Trust failure in standardization manifests as Q3 failure (coordination breaks down when standards are not credible)
  • Trust failure in scientific assessment manifests as Q4 failure (intelligence is contaminated by bias or capture)
  • Trust failure in treaty commitments manifests as Q5 failure (normative coherence dissolves when promises are not kept)
  • Trust failure in oversight institutions manifests as Q6 failure (reflexive learning cannot occur if audits are corrupt)

Trust does not occupy a distinct logical position but operates across all positions. This is precisely what defines a transversal mechanism: it enhances or impedes functioning of other components without constituting a separate component itself.
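
The logical test can be restated as a check for residuals: if every observed site of trust failure maps onto an existing question, nothing is left over for a seventh. The sketch below encodes the six mappings listed above; the function name residual_failures is a hypothetical label.

```python
# Site of trust failure -> question through which it manifests (from the list above).
TRUST_FAILURES = {
    "MRV systems": "Q1",
    "verification agencies": "Q2",
    "standardization": "Q3",
    "scientific assessment": "Q4",
    "treaty commitments": "Q5",
    "oversight institutions": "Q6",
}

CORE_QUESTIONS = {f"Q{i}" for i in range(1, 7)}


def residual_failures(mapping: dict[str, str], core: set[str]) -> set[str]:
    """Trust failures that land outside the core questions.

    A non-empty result would be evidence for a distinct seventh question.
    """
    return {site for site, question in mapping.items() if question not in core}


print("residual trust failures:", residual_failures(TRUST_FAILURES, CORE_QUESTIONS))
print("questions touched by trust:", sorted(set(TRUST_FAILURES.values())))
```

The check is only as strong as the list it is given: it shows that the enumerated failures leave no residual, not that no other kind of trust failure could exist.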

The Architectural Test: Location in the System. If trust/integrity were Q7, where would it reside in governance architecture? Q1 sits at the perceptual boundary, Q2 in autonomy-supervision interfaces, Q3 in lateral coordination channels, Q4 in environmental scanning functions, Q5 in normative frameworks, Q6 in meta-evaluation loops. Trust, by contrast, cannot be localized—it is a property of relationships throughout the system: between monitored entities and monitors (Q1), between autonomous units and supervisors (Q2), between coordinating peers (Q3), between intelligence gatherers and decision-makers (Q4), between citizens and policymakers (Q5), between evaluators and evaluated (Q6).

Attempting to add trust as Q7 would require either: (a) creating a seventh structural position with unclear function, or (b) acknowledging "trust" actually refers to relationship quality across all six positions, in which case it is misclassified as a question rather than recognized as a transversal enabler.

6.2 The Three Pillars: Capacity, Transparency, and Institutional Design

Having established that trust and integrity are transversal rather than additional, we must specify how they operate. We identify three pillars through which trust and integrity are built and maintained across the 18-question matrix: capacity building, transparency, and institutional design. Each pillar addresses a different dimension of the trust problem.

Pillar 1: Capacity Building. Trust is undermined when actors are unable to fulfill roles, even if willing. Verification agencies cannot provide reliable oversight if they lack satellite monitoring access or technical expertise in forest carbon accounting. National governments cannot report accurate emissions if statistical agencies are underfunded and fragmented, as in India's case of 30+ agencies without integration. Capacity building creates the technical substrate on which trustworthy governance operates.

At micro scale, capacity building might involve training facility managers in emissions monitoring protocols. At meso scale, it involves developing sectoral expertise in technology assessment and benchmark establishment. At macro scale, it includes supporting developing countries to establish robust MRV systems—the purpose of climate finance allocations to transparency capacity. The EU ETS maintains high integrity partly because member states have invested decades in building statistical infrastructure capable of facility-level monitoring.

Pillar 2: Transparency. Even capable actors may defect if their actions are not observable by others. Transparency creates common knowledge: not only do actors know the state of affairs, but they know others know, enabling coordination and enforcement. However, transparency is not automatic disclosure but carefully designed visibility. Excessive transparency demands can overwhelm developing countries, transforming transparency from an accountability mechanism into a control mechanism that reinforces power asymmetries (Van Deursen, 2024).

The three scales require different transparency regimes. Micro-scale transparency often involves facility-level reporting to regulators (not necessarily public disclosure, to protect commercial confidentiality). Meso-scale transparency includes sectoral benchmarking and technology performance data shared among industry participants. Macro-scale transparency centers on national reporting under the Enhanced Transparency Framework, creating visibility for international review without infringing sovereignty. The VCM's integrity crisis partly reflects a radical transparency deficit: project methodologies were opaque, baseline calculations were not externally verifiable, and buyers could not easily audit whether purchased credits represented real reductions.

Pillar 3: Institutional Design. Trust is not merely a cultural or psychological phenomenon but is structured by institutions that align individual incentives with collective interests. Ostrom's work demonstrated that communities sustainably manage common resources not through internalized norms alone but through institutional rules that make cooperation individually rational. Buchanan and Keohane (2006) extend this to global governance: institutions gain integrity when procedures are designed to minimize conflicts of interest and enable accountability.

At all three scales, institutional design addresses the core trust problem: how to align incentives so actors gain more from cooperating than defecting, even when cooperation is costly and defection is difficult to detect. The EU ETS achieves this through ex-ante penalties for non-compliance exceeding the cost of compliance. The VCM failed this design test: verification agencies had financial incentives to approve projects (disapproval meant lost revenue), creating structural corruption. Proper institutional design would separate verification financing from verification outcomes—e.g., through pooled fees distributed independently of approval rates, or mandatory rotation of verifiers preventing capture.
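
The incentive condition can be written as a one-line inequality. The sketch below assumes a risk-neutral actor who compares the cost of complying with the expected penalty for defecting; the detection probability, the function names, and the numbers in the usage lines are hypothetical and not drawn from EU ETS or VCM data.

```python
def cooperation_is_rational(compliance_cost: float, penalty: float,
                            detection_prob: float) -> bool:
    """True when the expected penalty for defecting exceeds the cost of complying."""
    if not 0.0 <= detection_prob <= 1.0:
        raise ValueError("detection_prob must be a probability")
    return detection_prob * penalty > compliance_cost


def minimum_penalty(compliance_cost: float, detection_prob: float) -> float:
    """Smallest penalty at which defection stops paying, for a given detection rate."""
    if not 0.0 < detection_prob <= 1.0:
        raise ValueError("detection_prob must be in (0, 1]")
    return compliance_cost / detection_prob


# Hypothetical numbers: the same penalty deters or fails depending on detection.
print(cooperation_is_rational(compliance_cost=10, penalty=100, detection_prob=0.5))
print(cooperation_is_rational(compliance_cost=10, penalty=100, detection_prob=0.05))
print(minimum_penalty(compliance_cost=10, detection_prob=0.05))
```

The inequality makes the design trade-off visible: where defection is hard to detect, either penalties must rise sharply or detection must improve, which is the role that independent verification financing and verifier rotation are meant to play.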

Scale-Differentiated Trust Mechanisms. A crucial insight is that trust mechanisms must be scale-appropriate. Macro-scale governance (international treaties) relies heavily on transparency and soft enforcement through reputation, because sovereignty limits hard enforcement. Micro-scale governance (facilities) can employ direct supervision and sanctions. Meso-scale governance (sectors, regions) uses hybrid mechanisms: industry associations combine peer monitoring with regulatory oversight.

The MRV data reveal a concerning pattern: verification strength rises with scale, the inverse of what a sound design would provide. The EU ETS achieves near 100% verification coverage with legal enforcement at meso-macro level. The VCM operates at micro level with only 88% coverage (Verra 72%, Gold Standard 16%) and voluntary compliance. This inversion suggests a design error: micro actors have more opportunities for detailed fraud precisely where oversight is weakest, while macro actors face intense scrutiny for actions harder to verify. Optimal design would strengthen micro-level supervision (where detection is technically feasible through satellite monitoring and on-site inspection) and accept greater flexibility at macro level (where sovereignty constraints bind).

7. Discussion

7.1 Theoretical Contributions

This paper makes three primary contributions to climate governance scholarship. First, we provide a deductively derived framework specifying necessary conditions for governance viability, moving beyond descriptive taxonomies dominating current literature. By deriving six questions from cybernetic axioms rather than inducing them from observed institutions, we generate a framework with both explanatory and prescriptive power. The distinction is fundamental: descriptive frameworks catalog what exists and explain variation; prescriptive frameworks derived from first principles specify what must exist for success and identify necessary functions any viable system must fulfill.

Second, we resolve the apparent tension among three influential approaches (polycentric governance, regime complexity, and experimentalist governance) by showing that they address different necessary questions but that none provides a complete architecture. Polycentric governance emphasizes Q2 (autonomy) and Q3 (coordination) but underspecifies Q5 (policy coherence) and Q6 (reflexive learning), which explains why polycentric systems sometimes achieve effective coordination but often devolve into fragmentation without cumulative progress. Experimentalist governance emphasizes Q4 (intelligence through experimentation) and Q6 (learning from trials) but assumes rather than derives Q1 (boundary definition), Q2 (autonomy-supervision balance), and Q3 (coordination mechanisms). Regime complexity theory addresses Q5 (normative frameworks and legitimacy) but treats institutional multiplicity (a Q3 failure) as the explanandum rather than diagnosing which coordination mechanisms are required. Our framework synthesizes these insights by identifying which questions each tradition answers well and which it neglects, explaining both their complementarity and their individual limitations.

Third, we establish a methodological precedent for applying axiomatic reasoning to institutional design in climate governance. Following Arrow's social choice theory and Hamilton's constitutional design, we demonstrate that social systems can benefit from principled design based on formal axioms, provided we recognize that social designs must accommodate agency, contestation, and evolutionary adaptation in ways physical designs need not. The framework provides intellectual resources for institutional designers while acknowledging that it cannot substitute for the political work of coalition-building and norm entrepreneurship.

Comparison with Ostrom's IAD Framework. Our approach shares Ostrom's concern with institutional design for collective action but differs in methodology and derivation (Table 1). Ostrom's Institutional Analysis and Development framework derives eight design principles from empirical study of successful commons governance: clearly defined boundaries, proportional equivalence between benefits and costs, collective choice arrangements, monitoring, graduated sanctions, conflict resolution mechanisms, minimal recognition of rights to organize, and nested enterprises for larger systems. These principles are inductively generalized from cases.

Table 1: TCAF vs Ostrom's IAD Framework

| Dimension | Ostrom IAD | TCAF | Key Difference |
| --- | --- | --- | --- |
| Theoretical Basis | Institutional Economics | Cybernetics | Inductive vs Deductive |
| Core Elements | 8 Design Principles | 6 Necessary Questions | Empirical generalization vs Logical derivation |
| Derivation Method | Comparative case studies | Axiomatic reasoning | Pattern discovery vs First principles |
| Scale Treatment | Nested Enterprises | Recursive Application | Similar insight, different formalization |
| Reflexivity | Implicit in monitoring | Explicit (Q6) | TCAF uniquely specifies explicit second-order learning mechanisms |
| Applicability | Local/regional commons | Multi-scale global governance | TCAF designed for planetary scale coordination |

TCAF extends Ostrom's insights by providing axiomatic foundations explaining why certain design features are necessary. For instance, Ostrom's "clearly defined boundaries" maps to our Q1, but we derive this from von Bertalanffy's open systems axiom rather than observing successful cases. This deductive approach allows TCAF to diagnose failures in domains lacking successful exemplars to study—crucial for unprecedented challenges like global climate governance where we cannot wait for evolutionary selection to identify viable institutional forms.

7.2 Scope and Future Research Directions

Theoretical Scope Boundaries. This paper establishes TCAF as a theoretically necessary framework but acknowledges important scope limitations that define future research directions.

First, TCAF specifies functional requirements but does not prescribe unique institutional forms. Q3 (coordination) can be fulfilled through markets, hierarchies, networks, or hybrid arrangements. Context-specific institutional design requires empirical analysis examining how different political systems, cultural contexts, and technological capabilities shape which institutional forms best fulfill each function. Future research should develop contingency theories linking contextual variables to institutional design choices.

Second, our framework identifies what a viable system must possess at any moment but does not model transition pathways from non-viable to viable configurations. How systems evolve through learning, how institutional entrepreneurs navigate path dependencies, and what catalyzes regime shifts require dynamic extensions incorporating theories of institutional change, social movements, and policy diffusion. This limitation is particularly important for practical application: knowing a system lacks Q6 (reflexive learning) is valuable, but understanding how to build Q6 into resistant institutional structures requires political and sociological analysis beyond cybernetic principles.

Third, while we demonstrate necessity through logical tests and validate through failure/success cases, detailed empirical calibration requires systematic comparative analysis. When does partial satisfaction of a question become critical failure? How do questions interact—does strength in Q4 compensate for weakness in Q6, or do they operate independently? What thresholds separate "adequate" from "inadequate" fulfillment? These questions demand quantitative assessment across many governance systems.
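The open question of whether strength in one function can offset weakness in another corresponds to a choice between compensatory and non-compensatory aggregation. The sketch below contrasts the two on made-up numbers. The 0-to-1 scoring scale, the threshold, and the function names are all assumptions introduced for illustration; the framework itself leaves this calibration to future empirical work.

```python
QUESTIONS = ("Q1", "Q2", "Q3", "Q4", "Q5", "Q6")


def compensatory_score(scores: dict[str, float]) -> float:
    """Unweighted mean: strength in one question can offset weakness in another."""
    return sum(scores[q] for q in QUESTIONS) / len(QUESTIONS)


def non_compensatory_score(scores: dict[str, float]) -> float:
    """Minimum: the system is only as viable as its weakest function."""
    return min(scores[q] for q in QUESTIONS)


def is_adequate(score: float, threshold: float) -> bool:
    """Compare an aggregate score against a (hypothetical) adequacy threshold."""
    return score >= threshold


# Hypothetical system: strong everywhere except reflexive learning (Q6).
example = {"Q1": 1.0, "Q2": 1.0, "Q3": 1.0, "Q4": 1.0, "Q5": 1.0, "Q6": 0.25}
THRESHOLD = 0.5  # placeholder value, not an empirically calibrated cutoff

print(compensatory_score(example))      # 0.875
print(non_compensatory_score(example))  # 0.25
```

With these numbers the same system passes under the mean and fails under the minimum. The necessity claim (each question is indispensable) points toward the non-compensatory reading, but which rule better predicts governance outcomes is exactly what comparative analysis would need to establish.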

Immediate Research Agenda. These limitations point toward a three-stage research program, each stage building on theoretical foundations established here:

Stage 1: Systematic Diagnostic Application. The immediate priority is applying TCAF to evaluate major contemporary governance systems using the 18-question matrix. The EU's Carbon Border Adjustment Mechanism provides an ideal test case: as a novel instrument addressing carbon leakage through trade measures, CBAM exhibits design innovation but faces implementation challenges and legitimacy questions. Systematic evaluation can identify which questions CBAM addresses robustly, which require strengthening, and what institutional modifications would enhance viability. Similar diagnostic application to voluntary carbon markets, national emissions trading systems, and the Paris Agreement's NDC architecture will validate the framework's explanatory power and identify common failure patterns.

Stage 2: Comparative Validation and Tool Development. The second stage involves comparative analysis across multiple governance systems to empirically validate the framework's core claims: that question satisfaction correlates with governance effectiveness, that failures map onto specific question deficits, and that the six questions capture distinct dimensions of the governance challenge. This stage also develops operational assessment protocols, including validated indicators for each of the 18 questions, scoring rubrics calibrated through expert judgment and empirical testing, and decision trees for rapid diagnosis. The goal is to translate the theoretical framework into a practical toolkit that enables policymakers and practitioners to evaluate governance systems systematically.
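For illustration only, the 18-cell matrix (six questions at three scales) can be represented as a simple mapping from (question, scale) pairs to scores. The data layout, the 0-to-1 scale, and the helper names below are hypothetical; no validated indicators or rubrics exist yet, so the scores entered here are placeholders rather than assessments.

```python
from itertools import product
from typing import Optional

# Question labels follow the six functions named in the framework.
QUESTIONS = {
    "Q1": "boundary definition and perception",
    "Q2": "autonomy and supervision balance",
    "Q3": "inter-unit coordination",
    "Q4": "environmental intelligence",
    "Q5": "normative policy coherence",
    "Q6": "reflexive learning",
}
SCALES = ("micro", "meso", "macro")

Cell = tuple[str, str]
Matrix = dict[Cell, Optional[float]]


def empty_matrix() -> Matrix:
    """One unscored cell per (question, scale) pair: 6 x 3 = 18 cells."""
    return {(q, s): None for q, s in product(QUESTIONS, SCALES)}


def deficits(matrix: Matrix, threshold: float) -> list[Cell]:
    """Scored cells that fall below the (hypothetical) adequacy threshold."""
    return [cell for cell, score in matrix.items()
            if score is not None and score < threshold]


def unscored(matrix: Matrix) -> list[Cell]:
    """Cells that still lack an assessment."""
    return [cell for cell, score in matrix.items() if score is None]


matrix = empty_matrix()
matrix[("Q1", "micro")] = 0.9   # placeholder score
matrix[("Q5", "macro")] = 0.2   # placeholder score
print(deficits(matrix, threshold=0.5))  # [('Q5', 'macro')]
print(len(unscored(matrix)))            # 16
```

Keeping unscored cells distinct from low-scoring ones avoids treating missing evidence as a diagnosed deficit, which matters while indicators are still being developed.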

Stage 3: Institutional Design and Reform. The third stage applies TCAF diagnostically to inform institutional reform, developing design principles for addressing identified deficits. For instance, if CBAM evaluation reveals Q5 (normative legitimacy) weakness due to insufficient recognition of Common But Differentiated Responsibilities, what institutional modifications would strengthen Q5 while preserving CBAM's core function? This stage also extends the framework to emerging governance domains: carbon dioxide removal technologies, climate engineering governance, and nature-based solutions all require institutional architectures, and TCAF can guide their design from inception rather than diagnosing failures retrospectively.

Illustrative Application: CBAM as Test Case. To demonstrate the framework's diagnostic potential, consider a preliminary assessment of the EU's Carbon Border Adjustment Mechanism. CBAM exhibits several theoretical strengths: Q1 (boundary definition) is relatively clear through HS code specifications and embedded emissions calculation methodologies; Q2 (autonomy-supervision balance) is structured through phased implementation that allows importers operational flexibility while building enforcement capacity; Q3 (coordination) achieves partial integration through linkage to EU ETS carbon pricing. However, CBAM may face challenges in Q5 (normative legitimacy), as developing countries perceive unilateral trade measures as violating CBDR principles, and Q6 (reflexive learning) appears underspecified, with limited mechanisms for systematic methodology review based on implementation experience. A comprehensive diagnostic evaluation identifying specific institutional modifications to strengthen these dimensions will be presented in subsequent research.

7.3 Limitations and Caveats

Contextual Limitations of Empirical Validation. Our empirical validation relies primarily on retrospective analysis of failures (Kyoto Protocol, voluntary carbon markets) and one success case (Montreal Protocol). While these cases powerfully demonstrate that observed failures map onto specific question deficits and success correlates with comprehensive question satisfaction, important contextual differences must be acknowledged.

The Montreal Protocol benefited from three conditions largely absent in climate governance: (1) concentrated problem sources, with the controlled substances (CFCs, HFCs) produced in a limited number of industrial sectors, making boundary definition and monitoring tractable; (2) viable technical substitutes with acceptable cost structures, making compliance economically feasible; and (3) rapid atmospheric feedback through ozone hole visibility, creating political urgency and enabling quick validation of policy effectiveness. Climate change, conversely, involves economy-wide transformation requiring coordination across all economic sectors, lacks drop-in alternatives for fossil fuels in many applications, and exhibits a delayed atmospheric response that masks urgency and complicates learning.

These differences do not invalidate the framework—indeed, they explain why climate governance requires more rigorous application of all six questions precisely because problem characteristics are less favorable. The Montreal Protocol succeeded in addressing a governance challenge with relatively favorable characteristics; climate governance faces a harder test of the same functional requirements. The framework's value lies not in guaranteeing success when conditions are unfavorable but in identifying what functions must be fulfilled for success to be possible. Whether political will and resources can be mobilized to fulfill these functions remains an open empirical question.

Normative Neutrality and Political Contestation. TCAF specifies functional requirements for system viability but does not resolve normative conflicts about what goals governance should pursue or how burdens should be distributed. Q5 (policy coherence) requires some normative framework, but the framework itself does not determine whether that should prioritize mitigation over adaptation, growth over equity, or national sovereignty over global coordination. These are political questions requiring deliberation and contestation. TCAF provides architecture within which political conflicts can be productively managed, but it does not eliminate or resolve those conflicts.

This limitation is both unavoidable and appropriate. A governance framework claiming to derive political values from cybernetic axioms would commit the naturalistic fallacy. What TCAF offers instead is structural clarity: whatever normative framework emerges from political process (Q5), the system must also address boundary definition (Q1), autonomy-supervision balance (Q2), coordination (Q3), adaptation (Q4), and reflexive learning (Q6). Political legitimacy determines what the system pursues; functional architecture determines whether pursuit can succeed.

8. Conclusion

The climate crisis confronts humanity with a governance challenge of unprecedented scale and urgency. Existing institutional architectures have proven inadequate, and theoretical scholarship has fragmented into incommensurable approaches offering partial insights but incomplete guidance. This paper addresses the theoretical deficit by developing TCAF—a framework derived from cybernetic first principles specifying the minimal set of functions any viable governance system must perform.

We have demonstrated through logical necessity tests that six questions constitute this minimal set: boundary definition and perception (Q1), autonomy balanced with supervision (Q2), coordination and oscillation damping (Q3), adaptation through intelligence (Q4), policy coherence and identity (Q5), and reflexive learning (Q6). Applied recursively across three scales (micro, meso, macro), these generate an 18-question diagnostic matrix. Trust and integrity operate transversally across all 18 questions through capacity building, transparency, and institutional design rather than constituting an additional question.

Empirical validation demonstrates the framework's explanatory power: observed governance failures map precisely onto specific question deficits. The Kyoto Protocol's collapse from 38% to 11% coverage exemplifies Q5 failure as normative identity dissolved. The voluntary carbon market's 80-90% ineffectiveness rate reflects Q6 failure through absence of reflexive mechanisms detecting and correcting systematic fraud. Methodological fragmentation across competing standards represents Q3 failure preventing interoperability. Conversely, the Montreal Protocol's success demonstrates how addressing all six questions enables effective global governance even on politically contentious issues.

The path forward requires moving from inductive description to deductive prescription. We can no longer afford to catalog institutional variety without principles for distinguishing viable from non-viable designs. TCAF provides such principles, grounded in the cybernetics of organized complexity. The framework establishes theoretical foundations enabling three crucial advances: systematic evaluation of existing governance systems to identify specific functional deficits, development of empirically validated diagnostic tools translating theory into practice, and principled design of new institutions addressing identified gaps rather than replicating historical failures.

Climate governance stands at a critical juncture. The physical window for preventing catastrophic warming narrows while institutional capacity lags. Closing this gap demands not merely political will but conceptual clarity about what institutions must accomplish. This paper provides that clarity through axiomatic derivation of necessary governance functions. Future research will translate these theoretical insights into practical tools, apply the framework to evaluate major governance systems, and develop institutional design principles addressing diagnosed deficits. The theoretical foundation is now established. The task ahead is rigorous empirical application and institutional innovation guided by first principles rather than historical accident.

References

Agon, C. (2024). International law and cross-sectoral crisis response: Comparing COVID-19 and climate governance. International Affairs, 100(2), 445-463.

Argyris, C., & Schön, D. (1978). Organizational learning: A theory of action perspective. Addison-Wesley.

Arrow, K. J. (1951). Social choice and individual values. Yale University Press.

Ashby, W. R. (1956). An introduction to cybernetics. Chapman & Hall.

Beer, S. (1972). Brain of the firm. Allen Lane.

Beer, S. (1981). Brain of the firm (2nd ed.). Wiley.

von Bertalanffy, L. (1968). General system theory: Foundations, development, applications. George Braziller.

Buchanan, A., & Keohane, R. O. (2006). The legitimacy of global governance institutions. Ethics & International Affairs, 20(4), 405-437.

Bulkeley, H., et al. (2024). Theoretical fragmentation in urban climate governance. Frontiers in Environmental Science, 12, 1289456.

Guardian/SourceMaterial (2023). Revealed: More than 90% of rainforest carbon offsets by biggest certifier are worthless. Joint investigation. https://www.theguardian.com/environment/2023/jan/18/revealed-forest-carbon-offsets-biggest-provider-worthless-verra-agrafeatures

Hahn, T., et al. (2024). Climate risk disclosure and green innovation: Addressing theoretical fragmentation. Business Strategy and the Environment, 33(2), 1245-1267.

Jordan, A., et al. (2018). Governing climate change: Polycentricity in action? Cambridge University Press.

Kellner, E., et al. (2024). Design power and accountability in polycentric climate governance. Global Environmental Politics, 24(2), 45-68.

Keohane, R. O., & Victor, D. G. (2011). The regime complex for climate change. Perspectives on Politics, 9(1), 7-23.

Lederer, M., Walker, J., & Winden, A. (2025). Governing climate change: Comparing polycentric, experimentalist, and viable system models. Systems Research and Behavioral Science, 42(1), 89-107.

Morrison, T. H., et al. (2024). From polycentric to fragmented: Governance challenges in climate action. Global Environmental Change, 84, 102785.

NASA/Harvard (2024). Urban methane emissions systematically underestimated. Atmospheric Chemistry and Physics, 24(8), 4521-4538.

Ostrom, E. (2010). Polycentric systems for coping with collective action and global environmental change. Global Environmental Change, 20(4), 550-557.

Patterson, J., et al. (2023). Power and fragmentation in grassroots climate governance. Climate and Development, 15(6), 523-537.

Sabel, C. F., & Zeitlin, J. (2017). Experimentalist governance. In D. Levi-Faur (Ed.), Oxford handbook of governance (pp. 169-183). Oxford University Press.

Simon, H. A. (1962). The architecture of complexity. Proceedings of the American Philosophical Society, 106(6), 467-482.

Tobin, P., et al. (2024). Empirical realities of polycentric climate governance: A critical assessment. Global Environmental Politics, 24(1), 12-35.

UNEP (2024). Emissions gap report 2024: Still falling short. United Nations Environment Programme.

Van Deursen, T. (2024). Transparency as control: Power asymmetries in climate MRV. Global Environmental Politics, 24(3), 78-94.

West, T. A. P., et al. (2020). Overstated carbon emission reductions from voluntary REDD+ projects in the Brazilian Amazon. Proceedings of the National Academy of Sciences, 117(39), 24188-24194.

Publication & Licensing

Title: A First-Principles Framework for Climate Governance: Core Functions from Cybernetic Theory
Version: 1.0 | January 2026
Author: Alex Yang Liu
Publisher: Terawatt Times Institute | ISSN 3070-0108
Document ID: TCAF-2026-v1.0
Citation Format: Liu A. Y. (2026). A First-Principles Framework for Climate Governance: Core Functions from Cybernetic Theory. Terawatt Times (ISSN 3070-0108), v1.0. DOI: [To be assigned]

Copyright © 2026 Terawatt Times Institute. All rights reserved.

This work presents the Theoretical Climate Action Framework (TCAF), including its cybernetic foundations, six-function diagnostic architecture, 18-question matrix, and case validation methodology.

You are free to:
▷ Read, cite, and reference this work
▷ Use it for academic research, policy analysis, and education
▷ Share the document in full or in part, with proper attribution
▷ Discuss, critique, and apply the diagnostic questions for governance analysis

Commercial or Engineering Use Requires Licensing

Any form of implementation, replication, or derivative deployment of the TCAF framework requires explicit written permission from the Terawatt Times Institute. This includes, but is not limited to:

– Reproducing or operationalizing the TCAF diagnostic matrix or scoring rubrics
– Embedding the six-function framework into software, platforms, or analytical tools
– Developing commercial governance assessment products derived from or materially similar to TCAF
– Use in professional consulting, advisory services, or policy evaluation products
– Engineering, modeling, or simulation systems that implement the framework beyond citation or illustrative use

Permissions & Licensing

For permissions, licensing inquiries, or authorized derivative use: alex.liu@terawatttimes.org
Terawatt Times Institute

Author

Alex Yang Liu

Alex is the founder of the Terawatt Times Institute, developing cognitive-structural frameworks for AI, energy transitions, and societal change. His work examines how emerging technologies reshape political behavior and civilizational stability.
