Crow's Log

Notes from an AI-powered nest

Chapter 3: Methodology and Analysis

April 21, 2026

3.1 Research Design Overview

This study uses a mixed-methods design. Three analytical approaches work together: constitutional and legal analysis, quantitative equity analysis, and case study research. All data sources are publicly available. No human subjects are involved, and no IRB approval is required.

The constitutional analysis applies the three standards extracted from the Edgewood and Morath decisions (Section 1.1) to the post-2016 school finance system. These standards provide the evaluative framework. This section describes the quantitative tools used to test them.

The quantitative analysis centers on the At-Risk Coefficient (ARC) regression model. ARC derives empirically based funding weights from campus-level risk factor data. The analysis also measures vertical equity using the Spearman rank correlation, the Concentration Index, and the Kakwani Index. The Spearman rank correlation measures whether districts with higher need receive higher funding. The Concentration Index measures how funding concentrates across the distribution of campus need. The Kakwani Index compares the funding distribution to the need distribution directly. This study extends Templeton et al. (2023) by applying these measures and by adding charter school campuses and districts to the dataset.
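
The three vertical equity measures can be illustrated with a minimal sketch. It assumes the covariance form of the Concentration Index (twice the covariance between funding and fractional need rank, divided by mean funding) and a Kakwani-style index defined as the concentration of funding minus the concentration of need itself. Both definitions, the function names, and the sample values are illustrative assumptions, not the study's actual implementation.

```python
from statistics import mean


def ranks(values):
    """1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            out[order[pos]] = shared
        start = end + 1
    return out


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(need, funding):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(need), ranks(funding))


def concentration_index(values, rank_by):
    """Assumed form: C = 2 * cov(values, fractional rank of rank_by) / mean(values)."""
    n = len(values)
    frac = [(r - 0.5) / n for r in ranks(rank_by)]
    mv, mf = mean(values), mean(frac)
    cov = sum((v - mv) * (f - mf) for v, f in zip(values, frac)) / n
    return 2 * cov / mv


def kakwani_style(funding, need):
    """Assumed adaptation: concentration of funding minus concentration of need."""
    return concentration_index(funding, need) - concentration_index(need, need)


# Hypothetical districts: need as a proportion, funding in dollars per pupil.
need = [0.1, 0.2, 0.3, 0.4]
funding = [100, 200, 300, 400]
print(spearman(need, funding), kakwani_style(funding, need))
```

Under these assumed definitions, a Kakwani-style value near zero indicates funding that tracks need proportionally, and a negative value indicates funding that is less concentrated on high-need districts than need itself.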

Four case studies support the constitutional analysis. The Dallas ISD case study explores TIA structure and governance, the Austin ISD case study explores charter/ISD duplication, the Cleveland ISD case study analyzes bond election failures, and the statewide ARC case study evaluates the frozen weights against empirically derived weights. To support the analyses, Public Information Requests (PIRs) were filed with TEA, Dallas ISD, Austin ISD, Cleveland ISD, ILTexas, KIPP Texas, IDEA, Uplift, Edgewood ISD, Forney ISD, Princeton ISD, Liberty Hill ISD, Hutto ISD, Mesquite ISD, Liberty County, the Texas Legislative Budget Board, and other entities. Additional primary sources were obtained through the UNT Government Documents collection and IRS 990 filings accessed via ProPublica Nonprofit Explorer.

3.2 Data Sources

The Public Education Information Management System (PEIMS) is a mandatory annual reporting system under TEC §42.006. Every Texas school district and charter operator submits enrollment, demographic, program participation, and financial data through PEIMS. The dataset covers 8,674 campuses across 1,203 districts for school year 2023-24: 4,594 elementary campuses, 1,664 middle school campuses, and 1,565 high school campuses. PEIMS reports aggregate campus-level counts. All risk factor proportions are computed as factor count divided by total campus enrollment.

Four additional TEA data systems supplement PEIMS. The Texas Academic Performance Reports (TAPR) provide campus-level dropout rates, chronic absenteeism rates, per-pupil expenditure, and teacher experience data. The Foundation School Program (FSP) Summary of Finances provides district-level state aid, local revenue, weighted average daily attendance (WADA), and IFA/EDA allotments. The TEA Bond Election Database records bond propositions, amounts, and pass/fail outcomes from 2016 through 2025. The TIA System of Record tracks teacher designation levels and campus participation rates.

PIR responses yielded pregnancy and parenting services data, charter expansion notifications, TIA implementation records, campus budgets, enrollment projections, portable building inventories, staffing ratios, facility cost data, precinct-level election canvass reports, and TIA validation rubrics. Each PIR response is cited where its data appears.

ARC risk factors are available for five school years: 2020-21 through 2024-25. Bond election data covers nine years: 2016 through 2025 (738 failed propositions statewide). FSP financial data and TIA participation data span from 2020-21 to 2023-24.

3.3 Variable Descriptions

The ARC regression uses eight independent variables, each a proportion (0 to 1) calculated from PEIMS campus-level counts. Seven of these eight factors operationalize 12 of the 15 statutory at-risk criteria listed in TEC §29.081(d). The eighth, economic disadvantage, is not a §29.081 criterion but drives the §48.104 allotment.

Economic disadvantage is the most prevalent factor in the ARC model and the single factor driving the current statutory weight of 0.20 (Section 1.5). Homelessness has a 50.9% Family Educational Rights and Privacy Act (FERPA) masking rate because many campuses have fewer than 10 homeless students. Foster care has a 40.9% masking rate and shows near-zero correlation with all other factors (|r| < 0.08). English Language Learner (ELL) status correlates moderately with economic disadvantage (r = 0.61). Academic underperformance correlates strongly with economic disadvantage (r = 0.65). Discipline reflects DAEP placements under TEC §37.006. Pregnancy and parenting data are reported for secondary campuses only and have the highest masking rate at 82.6%. Chronic absenteeism is a continuous measure with only 4.4% masking.

Five TEC §29.081(d) criteria are excluded from the model. These are parole/probation, prior dropout, residential placement, incarceration, and dropout recovery enrollment. No campus or district aggregate counts exist for these criteria in standard TEA reporting.

Two dependent variables were tested. Dropout rate is available for secondary campuses only. It is an annual rate per 100 enrolled students from TAPR. The dropout rate is subject to significant FERPA masking. The full zero-inflation analysis and its implications for dependent variable selection are presented in Section 3.4. Chronic absenteeism rate is available for all campuses. It measures the proportion of students absent 10% or more of enrolled days. The masking rate is only 4.4%. The distribution is continuous without zero-inflation. Chronic absenteeism was selected as the preferred dependent variable for reasons documented in Section 3.4.

Three funding variables are used in the equity analysis at the district level. Per-pupil expenditure comes from TAPR and shows substantial variation across districts. State aid per WADA comes from the FSP Summary of Finances and is inversely correlated with local revenue. Local revenue per WADA ranges from under $2,000 in property-poor districts to over $15,000 in property-wealthy districts. Under TEC §49.001, recapture reduces this disparity by transferring approximately $4.7 billion annually from high-wealth to low-wealth districts. The ARC model generates three additional variables. The statutory allotment applies the frozen 0.20 weight to the economically disadvantaged count. The regression allotment applies empirically derived weights to campus-level risk factor concentrations. The funding gap per district is the difference between the two.
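
The relationship among the three ARC-generated variables can be sketched in weighted-student units. The sketch below leaves out the dollar conversion because the per-student allotment amount is not stated in this section. The factor names, the weights, and the campus records are hypothetical placeholders, not estimates from the study.

```python
STATUTORY_WEIGHT = 0.20  # frozen statutory weight cited in the text


def statutory_units(counts):
    """Frozen weight applied to the economically disadvantaged count only."""
    return STATUTORY_WEIGHT * counts.get("econ_dis", 0)


def regression_units(counts, weights):
    """Factor-specific weights applied to each risk factor count."""
    return sum(w * counts.get(factor, 0) for factor, w in weights.items())


def district_gaps(campuses, weights):
    """Sum the campus-level gap (regression minus statutory) within each district."""
    gaps = {}
    for campus in campuses:
        gap = regression_units(campus["counts"], weights) - statutory_units(campus["counts"])
        gaps[campus["district"]] = gaps.get(campus["district"], 0.0) + gap
    return gaps


# Hypothetical weights and campuses, for illustration only.
example_weights = {"econ_dis": 0.4, "homeless": 0.2}
example_campuses = [
    {"district": "A", "counts": {"econ_dis": 100, "homeless": 10}},
    {"district": "A", "counts": {"econ_dis": 50, "homeless": 0}},
    {"district": "B", "counts": {"econ_dis": 200, "homeless": 20}},
]
print(district_gaps(example_campuses, example_weights))
```

A positive gap in this sketch means the regression-based weights would count more weighted students than the frozen weight does. Multiplying by a per-student dollar amount, which is an input the sketch does not supply, would convert the gap to dollars.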

Table 1: Variable Descriptions

[Table pending: the GFM table content was dropped by google-workspace-mcp/src/docs_formatting.py during markdown conversion. Will be inserted here once the converter is fixed (follow-up task). The variable descriptions are already present in prose throughout Section 3.3 above.]

3.4 ARC Model Specification

The ARC model estimates three separate regression populations: elementary (N = 4,594), middle school (N = 1,664), and high school (N = 1,565). The populations are separated because risk factor distributions differ by school level. Poverty drives absenteeism at the elementary level (β = 0.40). Discipline and academic underperformance are co-dominant predictors at middle school (β = 0.33, β = 0.34). Academic disengagement dominates at high school (β = 0.45). Pregnancy, parenting, and dropout data are available for high school campuses only. Gubbels et al. (2019) found that risk factor effect sizes differ by both age group and outcome type. Economic disadvantage predicts elementary absenteeism more strongly than secondary absenteeism. Discipline-related factors emerge as dominant predictors at middle school. These age-differentiated patterns justify three separate ARC models rather than one statewide regression.

The regression specification is ordinary least squares (OLS) with campus-level risk factor proportions as predictors. Chronic absenteeism rate is the dependent variable for all three school levels. Dropout rate was tested as an alternative dependent variable for middle school and high school campuses (grades 7-12). Elementary campuses do not report dropout data. TEA reports dropout for 2,087 middle and high school campuses with non-null data. Of those, 2,067 campuses have non-null chronic absenteeism data. The difference is due to differential FERPA masking on the two outcome variables.
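
A minimal sketch of how standardized betas can be obtained from an OLS fit follows. It assumes that "standardized beta" means the coefficient from regressing the z-scored outcome on z-scored predictors, and it solves the normal equations directly in plain Python. The function names and toy data are hypothetical. A production analysis would more likely use a statistics package that also reports standard errors and p-values.

```python
from statistics import mean, pstdev


def zscores(column):
    """Center and scale a column; constant columns must be dropped beforehand."""
    m, s = mean(column), pstdev(column)
    return [(v - m) / s for v in column]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def standardized_betas(predictors, outcome):
    """predictors: dict of factor name -> campus proportions; outcome: campus rates."""
    names = list(predictors)
    z = [zscores(predictors[name]) for name in names]
    zy = zscores(outcome)
    # Normal equations on z-scored data; no intercept is needed after centering.
    xtx = [[sum(p * q for p, q in zip(zi, zj)) for zj in z] for zi in z]
    xty = [sum(p * q for p, q in zip(zi, zy)) for zi in z]
    return dict(zip(names, solve(xtx, xty)))


# Toy campuses: the outcome is an exact linear mix of two uncorrelated factors.
toy = {"econ_dis": [0, 1, 0, 1], "discipline": [0, 0, 1, 1]}
print(standardized_betas(toy, [0, 2, 1, 3]))
```

Running the same function separately on the elementary, middle school, and high school campus lists would mirror the three-population design described above.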

Chronic absenteeism was selected over dropout as the preferred dependent variable for three reasons. First, 780 of 2,087 middle and high school campuses (37.4%) report a dropout rate of exactly 0.0%. This is a FERPA masking artifact. The signature is clear from the campus size gradient. Among campuses with 200 or fewer students, 56.5% report zero dropout. Among campuses with more than 2,000 students, only 5.1% report zero dropout. In addition, 113 zero-dropout campuses simultaneously have chronic absenteeism rates at or above 10%. A campus where at least one in ten students misses 10% or more of school days but zero students drop out is statistically implausible. Second, the dropout model produces wrong-sign coefficients. Economic disadvantage has a negative standardized beta (β = -0.034, p = 0.106). Higher poverty predicting lower dropout is epidemiologically implausible. Discipline shows β = -0.137 (p < 0.001). Both results are consistent with FERPA truncation bias at small, high-poverty campuses. Third, chronic absenteeism is continuous. It is largely unaffected by FERPA small-count suppression, with only 4.4% masking. It is harder to administratively manipulate than graduation or dropout coding.
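
The two zero-inflation diagnostics described above, the campus size gradient and the count of zero-dropout campuses with high absenteeism, can be sketched as follows. The record layout, field names, and sample campuses are hypothetical. Rates are assumed to be stored as percentages.

```python
def zero_dropout_share(campuses, above=0, up_to=float("inf")):
    """Share of campuses reporting exactly 0.0 dropout among those with
    enrollment in the half-open band (above, up_to]. Returns None if the band is empty."""
    band = [c for c in campuses if above < c["enrollment"] <= up_to]
    if not band:
        return None
    zeros = sum(1 for c in band if c["dropout_rate"] == 0.0)
    return zeros / len(band)


def implausible_zeros(campuses, absenteeism_floor=10.0):
    """Campuses reporting zero dropout alongside chronic absenteeism at or above the floor."""
    return [
        c for c in campuses
        if c["dropout_rate"] == 0.0 and c["chronic_abs_rate"] >= absenteeism_floor
    ]


# Hypothetical campus records, for illustration only.
sample = [
    {"enrollment": 150, "dropout_rate": 0.0, "chronic_abs_rate": 22.0},
    {"enrollment": 180, "dropout_rate": 1.2, "chronic_abs_rate": 18.0},
    {"enrollment": 2400, "dropout_rate": 0.8, "chronic_abs_rate": 15.0},
    {"enrollment": 2600, "dropout_rate": 0.0, "chronic_abs_rate": 6.0},
]
small = zero_dropout_share(sample, up_to=200)   # campuses with 200 or fewer students
large = zero_dropout_share(sample, above=2000)  # campuses with more than 2,000 students
flagged = implausible_zeros(sample)
print(small, large, len(flagged))
```

A markedly higher zero share in the small-campus band than in the large-campus band is the gradient the text treats as the masking signature.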

Initial ARC weights were derived from Gubbels, van der Put, and Assink (2019). Their meta-analysis validated effect sizes for each risk factor's association with school absenteeism and dropout. The ARC regression replaces Gubbels et al.'s (2019) national meta-analysis weights with Texas-specific OLS coefficients estimated from 2023-24 PEIMS and TAPR data. For example, the meta-analysis assigns homelessness and foster care similar effect sizes (d = 0.34 and d = 0.31), but the Texas regression produces a homeless coefficient of 0.18 and a foster care coefficient near zero (0.02, n.s.), reflecting the 50.9% and 40.9% FERPA masking rates on these variables.

The ARC score for each campus is computed using the formula:

ARC_c = Σᵢ (nᵢ,c / Nc) × wᵢ × 100

In this formula, nᵢ,c is the count of students with risk factor i at campus c. Nc is total enrollment at campus c. wᵢ is the empirically derived weight for factor i. Dividing by Nc converts each count to a campus-level proportion, and multiplying by 100 expresses the score on a 0-100 scale. ARC differs from the simple economically disadvantaged percentage under TEC §48.104 in two ways. First, ARC weights each risk factor by its empirically validated association with the outcome rather than applying a single flat weight to one population. Second, ARC captures cumulative risk across multiple co-occurring factors rather than funding based on a single headcount of economically disadvantaged students.
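
The ARC formula translates directly into code. The sketch below is a literal implementation of the expression above. The factor names, weights, and counts are hypothetical placeholders, not the estimated Texas coefficients.

```python
def arc_score(counts, enrollment, weights):
    """ARC_c = sum over factors i of (n_i,c / N_c) * w_i * 100.

    counts: factor name -> student count at the campus (n_i,c)
    enrollment: total campus enrollment (N_c)
    weights: factor name -> weight (w_i)
    """
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    return sum(
        (counts.get(factor, 0) / enrollment) * weight * 100
        for factor, weight in weights.items()
    )


# Hypothetical weights and campus counts, for illustration only.
example_weights = {"econ_dis": 0.5, "homeless": 0.25}
example_counts = {"econ_dis": 80, "homeless": 4}
score = arc_score(example_counts, enrollment=100, weights=example_weights)
print(round(score, 2))
```

With these placeholder inputs the economically disadvantaged term contributes 40 points and the homelessness term contributes 1 point. This shows how a second co-occurring factor raises the score above what the single headcount would produce.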

Table 2: Model Comparison

[Table pending: the GFM table content was dropped by google-workspace-mcp/src/docs_formatting.py during markdown conversion. Will be inserted here once the converter is fixed (follow-up task). The model comparison is described in prose throughout Section 3.4 above.]

3.5 Data Quality and Limitations

TEA suppresses student counts below 10 to comply with FERPA privacy requirements. This masking disproportionately affects low-prevalence factors at small campuses. Pregnancy and parenting data has the highest masking rate at 82.6%. Dropout rate follows at 75.7%. Homelessness is masked at 50.9% and foster care at 40.9%. ELL data is masked at 11.5%. Chronic absenteeism and economic disadvantage have minimal masking at 4.4% and 1.6% respectively.

FERPA-masked values are treated as 0 in ARC score computation, producing lower-bound ARC scores for small campuses. Excluding all campuses with any masked factor would reduce the high school sample from 1,565 to approximately 370, a 76% loss.
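
The masking treatment described above can be sketched as follows. The sketch assumes masked counts arrive as None, which is a hypothetical encoding, since TEA files may use a different marker. The field names and sample campuses are also hypothetical.

```python
def masking_rate(campuses, factor):
    """Share of campuses whose count for the factor is masked (None)."""
    masked = sum(1 for c in campuses if c[factor] is None)
    return masked / len(campuses)


def complete_cases(campuses, factors):
    """Campuses with no masked value on any listed factor."""
    return [c for c in campuses if all(c[f] is not None for f in factors)]


def proportion(campus, factor):
    """Factor count divided by enrollment, treating a masked count as 0.
    The result is therefore a lower bound for masked campuses."""
    count = campus[factor] if campus[factor] is not None else 0
    return count / campus["enrollment"]


# Hypothetical campuses, for illustration only; None marks a masked count.
sample = [
    {"enrollment": 120, "homeless": None, "foster": None},
    {"enrollment": 900, "homeless": 18, "foster": None},
    {"enrollment": 2000, "homeless": 40, "foster": 12},
    {"enrollment": 300, "homeless": None, "foster": 11},
]
factors = ["homeless", "foster"]
kept = complete_cases(sample, factors)
loss = 1 - len(kept) / len(sample)
print(masking_rate(sample, "homeless"), loss, proportion(sample[0], "homeless"))
```

Comparing the complete-case loss with the lower-bound proportions makes the trade-off concrete: dropping masked campuses shrinks the sample, while zero-filling retains them at the cost of understating risk at small campuses.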

The zero-inflation problem in the dropout data is the most significant data quality challenge. The full analysis and its implications for dependent variable selection are presented in Section 3.4.

Charter networks report enrollment and demographic data through PEIMS, the same as ISDs. Charter financial data differs from ISD financial data in three areas: facility costs (charters report lease payments rather than bond debt service), management fees (paid to charter management organizations, reported inconsistently across operators), and inter-campus transfers within multi-campus charter networks. Audited financial statements required by TEC §12.104(b)(2)(G) and PIR responses from ILTexas, KIPP Texas, and Harmony supplement the PEIMS data where charter-specific financial detail is needed.

Each of these sub-questions is examined through case studies of Texas metro areas and a statewide quantitative analysis. The four case studies form the basis of a single constitutional argument.

3.6 Case Study Selection Rationale

Dallas ISD was selected because it is the origin of the TEI program described in Section 1.6. Campaign finance filings and PIRs document the governance pipeline from political action committee (PAC)-funded board campaigns through statewide TIA implementation. The Dallas case connects a local governance story to a statewide funding policy.

Austin ISD was selected because it pays more in recapture than any district in the state by both total dollar amount and percentage of M&O tax revenue. Austin ISD returned $664.8 million to the state through recapture in 2023-24 while receiving $45.4 million in state aid. The dual collapse of 2025 provides a real-time test of market competition theory: the Austin ISD Board of Trustees voted on November 21, 2025, to close 10 schools effective at the end of the 2025-26 school year, and KIPP closed 5 Austin campuses in December 2025.

Cleveland ISD was selected for its bond failure pattern. Four of five propositions failed between 2019 and 2023. Four fast-growth comparator districts serve as a comparison group: Forney ISD, Princeton ISD, Liberty Hill ISD, and Hutto ISD passed 22 of 22 non-athletic bonds in the same period. During the years in which Cleveland ISD bonds failed, ILTexas received TEA approval for three new campuses in Liberty County.

The statewide ARC analysis uses the PEIMS dataset to test the frozen weights flagged in Morath (2016). Dallas ISD campuses appear in ARC quartiles 2 through 4. Austin ISD campuses span all four quartiles. Cleveland ISD campuses cluster in quartile 4 (highest need). The three ARC regressions (elementary R² = 0.40, N = 4,594; middle school R² = 0.46, N = 1,664; high school R² = 0.32, N = 1,565) produce empirically derived weights that differ from the frozen 0.20 statutory weight by factors of 2 to 3.
