# GPT Independent Audit - TaxVault

Date: 2026-05-05  
Auditor: Codex (GPT-5)  
Workspace: `/Users/aiagent001/Desktop/Projects/Project TaxVault`  
Application revision tested: `2a9d3ce370377cedc5cefbb33a3f4eb2c5f6f01f`

## Executive Verdict

I independently ran the 100-persona audit plan from `AUDIT_PLAN_FOR_GPT.md`.
Every persona completed the full CLI path:

`validate -> forms -> compute -> export -> PDF text inspection`

The computational core is much stronger than an ordinary prototype, but I do
not agree with a clean "100/100 pass" shipping claim. My audit result is:

| Result bucket | Count |
|---|---:|
| Personas processed | 100 |
| Clean pass under this audit | 41 |
| Warn / blocker found | 59 |
| CLI execution failures | 0 |

The most important blockers are not obscure edge cases. The first three are
user-visible filing packet issues; the last two are corpus and state-data issues:

1. Current IRS Form 1040 mailing addresses do not match TaxVault output for
   47/100 personas.
2. Exported PDFs omit forms that the facts or export metadata say should be in
   the packet: Schedule C, Schedule SE, Form 8889, and Form 8995-A.
3. `persona_007` has a Form 1040 line math inconsistency: stock option income
   is included in total income but is not visible on the component income lines.
4. Two head-of-household personas lack dependent facts but validate and compute.
5. The California DOR scenarios still fail with the double-exemption-credit
   behavior Claude found, and the CA constants appear stale against current FTB
   2025 standard deduction amounts.

My conclusion: TaxVault is a credible internal audit candidate, but it is not
ready to be represented as a mail-ready, filing-packet-ready product until the
output packet and source-of-truth data issues are fixed.

## Method

I built the release workspace and generated fresh artifacts under `/tmp/gpt_audit`.

Commands used:

```bash
cd /Users/aiagent001/Desktop/Projects/Project\ TaxVault/Application
PATH=/Users/aiagent001/.cargo/bin:$PATH \
  CARGO_HOME=/Users/aiagent001/.cargo \
  RUSTUP_HOME=/Users/aiagent001/.rustup \
  cargo build --workspace --release

./target/release/taxvault info

# For each tests/audit/personas/persona_NNN.json:
./target/release/taxvault validate --input persona_NNN.json
./target/release/taxvault forms --input persona_NNN.json
./target/release/taxvault compute --input persona_NNN.json --output result.json
./target/release/taxvault export --input persona_NNN.json --output-dir packet
pdftotext -layout packet/draft_filing_packet.pdf pdf.txt
```
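
The per-persona loop above can be scripted. The following is a minimal sketch,
not the harness I actually ran: the helper names `build_commands` and
`run_persona` and the one-directory-per-persona output layout are assumptions,
while the subcommands, flags, and file names are copied from the commands above.

```python
import subprocess
from pathlib import Path

TAXVAULT = "./target/release/taxvault"  # binary path from the build step above


def build_commands(persona: Path, out_dir: Path) -> list[list[str]]:
    """Return the five commands run for one persona, in pipeline order."""
    packet = out_dir / "packet"
    return [
        [TAXVAULT, "validate", "--input", str(persona)],
        [TAXVAULT, "forms", "--input", str(persona)],
        [TAXVAULT, "compute", "--input", str(persona),
         "--output", str(out_dir / "result.json")],
        [TAXVAULT, "export", "--input", str(persona),
         "--output-dir", str(packet)],
        ["pdftotext", "-layout", str(packet / "draft_filing_packet.pdf"),
         str(out_dir / "pdf.txt")],
    ]


def run_persona(persona: Path, out_root: Path) -> list[int]:
    """Run every step for one persona and collect the exit codes."""
    out_dir = out_root / persona.stem  # hypothetical layout: one dir per persona
    out_dir.mkdir(parents=True, exist_ok=True)
    return [subprocess.run(cmd, check=False).returncode
            for cmd in build_commands(persona, out_dir)]
```

Under this sketch, a persona would count as a CLI execution failure if any
returned exit code is nonzero.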

I also reran selected independent checks:

```bash
./target/release/oracle-classify
python3 scripts/run_dor_scenarios.py --state-dir tests/audit/state_dor/CA \
  --runner target/release/state-dor-runner --out-json /tmp/gpt_ca_dor_report.json
./tools/oracle-python/.venv/bin/python3 scripts/mef_validate.py \
  --persona-dir tests/audit/per-persona \
  --result-dir tests/audit/per-persona \
  --schema tests/audit/mef/_schemas/IRS1040.xsd \
  --out-dir /tmp/gpt_mef_validation/per-persona \
  --report /tmp/gpt_mef_validation/MEF_REPORT.md \
  --summary /tmp/gpt_mef_validation/mef_summary.json
```

## 12 Sanity Checks

| Check | Pass | Fail | Notes |
|---|---:|---:|---|
| Form 1040 income math reconciles | 99 | 1 | `persona_007` line 9 includes $1,931.82 not visible on line 1 or line 8. |
| AGI cascade | 100 | 0 | Line 11 matched line 9 minus line 10 for all personas. |
| Taxable income cascade | 100 | 0 | Line 15 matched max(0, line 11 - line 12 - line 13). |
| Refund/owed exclusivity | 100 | 0 | No return showed both refund and amount owed. |
| Refund/owed math | 100 | 0 | Payments minus tax reconciled to refund/owed outcome. |
| No negative tax | 100 | 0 | Total tax was non-negative for all personas. |
| Threshold compliance | 100 | 0 | Negative-case threshold tests for NIIT, Additional Medicare Tax, and EITC all passed. |
| Filing status fields | 98 | 2 | `persona_087` and `persona_094` are HoH with no dependent facts. |
| PDF presence / size floor | 90 | 10 | All PDFs existed and extracted, but 10 were below the plan's 5 KB floor. |
| PDF forms match facts/metadata | 84 | 16 | Required or claimed forms are absent from packet PDFs. |
| PDF spouse fields | 100 | 0 | MFJ spouse fields were visible when applicable. |
| Current IRS mailing address | 53 | 47 | TaxVault mailing addresses diverge from current IRS Form 1040 page. |
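
The cascade and refund/owed rows in the table reduce to a few integer
comparisons. The sketch below shows one way to express them, assuming all
amounts are held as integer cents; the function names and the sample values are
illustrative and do not come from TaxVault or from any persona.

```python
def agi_cascade_ok(line9: int, line10: int, line11: int) -> bool:
    """AGI (line 11) equals total income (line 9) minus adjustments (line 10)."""
    return line11 == line9 - line10


def taxable_income_ok(line11: int, line12: int, line13: int, line15: int) -> bool:
    """Taxable income (line 15) is AGI less deductions, floored at zero."""
    return line15 == max(0, line11 - line12 - line13)


def refund_owed_ok(payments: int, total_tax: int, refund: int, owed: int) -> bool:
    """Refund and amount owed are exclusive and reconcile to payments minus tax."""
    if refund > 0 and owed > 0:
        return False
    return refund - owed == payments - total_tax


# Hypothetical return, in cents: $50,000 income, $2,000 adjustments,
# $15,750 standard deduction, $5,000 paid against $3,500 of tax.
print(agi_cascade_ok(5_000_000, 200_000, 4_800_000))
print(taxable_income_ok(4_800_000, 1_575_000, 0, 3_225_000))
print(refund_owed_ok(500_000, 350_000, 150_000, 0))
```

Integer cents keep every comparison exact, so a failed check reflects a real
mismatch rather than floating-point rounding.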

## Findings

### GPT-TV-001 - Current IRS Mailing Addresses Are Wrong For 47/100 Personas

Severity: high

TaxVault emits mailing destinations from `Application/core/output/src/irs_mailing_addresses.rs`.
That file describes itself as a "2025 filing season" table and routes many
with-payment returns to Cincinnati, old Charlotte ZIP `28201-1300`, or the
wrong service center.

The current IRS Form 1040 where-to-file page instead gives these 2025/2026
mailing patterns, among others:

- AL/FL/GA/LA/MS/NC/SC/TN/TX with payment: Charlotte, NC 28201-1214.
- CT/DE/DC/IL/IN/IA/KY/ME/MD/MA/MN/MO/NH/NJ/NY/PA/RI/VT/VA/WV/WI without
  payment: Kansas City, MO 64999-0002; with payment: Louisville, KY 40293-1000.
- AK/CA/CO/HI/ID/KS/MI/MT/NE/NV/OH/OR/ND/SD/UT/WA/WY with payment: Louisville,
  KY 40293-1000.
- AR/AZ/NM/OK without payment: Austin, TX 73301-0002; with payment: Louisville,
  KY 40293-1000.

Examples from the fresh run:

| Persona | Expected from current IRS page | TaxVault output |
|---|---|---|
| `persona_063` | Charlotte, NC 28201-1214 | Louisville, KY 40293-1000 |
| `persona_016` | Louisville, KY 40293-1000 | Cincinnati, OH 45280-2501 |
| `persona_017` | Louisville, KY 40293-1000 | Charlotte, NC 28201-1300 |
| `persona_042` | Kansas City, MO 64999-0002 | Ogden, UT 84201-0002 |

Source checked: IRS, "Where to file addresses for taxpayers and tax
professionals filing Form 1040"  
https://www.irs.gov/filing/where-to-file-addresses-for-taxpayers-and-tax-professionals-filing-form-1040

This is a shipping blocker for paper-filing packets.
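
A table-driven lookup makes the payment/no-payment split straightforward to
test. The sketch below is written in Python for brevity rather than the Rust of
`irs_mailing_addresses.rs`, and it transcribes only the patterns listed above:
where this report does not state an address for a case, the entry is `None`
rather than a guess. The names `GROUPS`, `ADDRESSES`, and `mailing_address` are
hypothetical.

```python
from typing import Optional

CHARLOTTE = "Charlotte, NC 28201-1214"
KANSAS_CITY = "Kansas City, MO 64999-0002"
LOUISVILLE = "Louisville, KY 40293-1000"
AUSTIN = "Austin, TX 73301-0002"

# (states, address without payment, address with payment)
GROUPS = [
    ("AL FL GA LA MS NC SC TN TX".split(), None, CHARLOTTE),
    ("CT DE DC IL IN IA KY ME MD MA MN MO NH NJ NY PA RI VT VA WV WI".split(),
     KANSAS_CITY, LOUISVILLE),
    ("AK CA CO HI ID KS MI MT NE NV OH OR ND SD UT WA WY".split(),
     None, LOUISVILLE),
    ("AR AZ NM OK".split(), AUSTIN, LOUISVILLE),
]

# Flatten the groups into one state -> (no_payment, with_payment) mapping.
ADDRESSES = {
    state: (no_pay, pay)
    for states, no_pay, pay in GROUPS
    for state in states
}


def mailing_address(state: str, with_payment: bool) -> Optional[str]:
    """Look up the address for a state; None if not listed in this report."""
    no_pay, pay = ADDRESSES[state]
    return pay if with_payment else no_pay


print(mailing_address("NY", with_payment=False))
print(mailing_address("TX", with_payment=True))
```

A regression test can then iterate over every state group and both payment
modes, which is the coverage the mismatches above suggest is missing.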

### GPT-TV-002 - Exported PDFs Omit Required Or Claimed Forms

Severity: high

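Sixteen persona packets omitted forms that the facts, the `forms` command, or
export metadata indicate should be present. Several packets omit more than one
form, so the per-form counts below sum to 25 across the 16 packets.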

| Missing form | Count | Affected personas |
|---|---:|---|
| Schedule C | 8 | `014`, `016`, `020`, `025`, `029`, `072`, `087`, `095` |
| Schedule SE | 7 | `014`, `016`, `020`, `025`, `072`, `087`, `095` |
| Form 8889 | 4 | `022`, `048`, `058`, `085` |
| Form 8995-A | 6 | `025`, `056`, `063`, `065`, `091`, `095` |

Two forms are especially concerning:

- For HSA cases, `export.json` can include `Form8889`, but the packet manifest
  and extracted PDF text omit Form 8889.
- For QBI cases, `export.json` can include `Form8995A`, but the packet manifest
  and extracted PDF text omit Form 8995-A.

Concrete examples:

- `persona_022`: `export.json` included `Form8889`, but the manifest/PDF only
  listed ForYourRecords, 1040, ScheduleB, and MailingInstructions.
- `persona_063`: `export.json` included `Form8995A`, but the manifest/PDF only
  listed ForYourRecords, 1040, Schedule1, ScheduleD, ScheduleE, Form8949,
  Form1040V, and MailingInstructions.
- `persona_014`: self-employment facts produced Schedule 1 and Form 1040-V in
  the packet, but Schedule C and Schedule SE were absent from the PDF.

This means the audit cannot treat the exported filing packet as complete even
when compute succeeds.
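
The completeness check behind this finding is a set difference between the forms
a packet claims and the forms its manifest lists. A minimal sketch follows; the
function name is hypothetical, and the "claimed" set for `persona_022` is an
assumption built from the manifest entries plus the one form this report says
`export.json` included.

```python
def missing_forms(claimed: set[str], manifest: set[str]) -> set[str]:
    """Forms that are claimed for the packet but absent from its manifest."""
    return claimed - manifest


# persona_022 manifest entries as reported above.
manifest_022 = {"ForYourRecords", "1040", "ScheduleB", "MailingInstructions"}

# Assumption: export.json listed the manifest entries plus Form8889.
claimed_022 = manifest_022 | {"Form8889"}

print(sorted(missing_forms(claimed_022, manifest_022)))
```

Running the same comparison against the `forms` output and the extracted PDF
text would show which of the sources disagree for a given persona.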

### GPT-TV-003 - Persona 007 Has 1040 Line 9 Income Not Shown On Component Lines

Severity: medium-high

`persona_007` has W-2 wages of $11,295.31 and
`income.wage_details.stock_option_income_cents = 193182`.

The computed Form 1040 output is:

| Line | Amount |
|---|---:|
| line 1 wages | $11,295.31 |
| line 8 other income | $0.00 |
| line 9 total income | $13,227.13 |

The difference is exactly $1,931.82, matching stock option income. It appears to
be included in total income but not represented on a visible component income
line. The tax outcome here is zero because the standard deduction exceeds AGI,
but the line presentation is not mathematically reconcilable.
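
The reconciliation can be reproduced directly from the figures above, working in
integer cents. This sketch uses only the two component lines shown in the table;
the function name is hypothetical.

```python
def unexplained_income(total_cents: int, component_cents: list[int]) -> int:
    """Total income not accounted for by the listed component lines."""
    return total_cents - sum(component_cents)


line1_wages = 1_129_531    # $11,295.31
line8_other = 0            # $0.00
line9_total = 1_322_713    # $13,227.13
stock_option_income = 193_182  # income.wage_details.stock_option_income_cents

gap = unexplained_income(line9_total, [line1_wages, line8_other])
print(gap, gap == stock_option_income)
```

The gap equals the stock option amount to the cent, which is why the finding
treats it as a presentation problem rather than a rounding artifact.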

### GPT-TV-004 - Invalid Head-Of-Household Personas Validate And Compute

Severity: medium

`persona_087` and `persona_094` have `filing_status = head_of_household` but no
dependent facts in the persona JSON. Both still validate and compute.

This may be a fixture-generation bug rather than an engine bug, but it weakens
the 100-persona audit claim. HoH requires a qualifying person and carries
residency rules; an audit corpus should not silently include HoH returns whose
facts cannot support the status.
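
A corpus-level guard for this case can be small. The sketch below assumes the
persona JSON exposes `filing_status` at the top level, as quoted above, and a
`dependents` list; the `dependents` field name and the function name are
hypothetical, since this report does not show the persona schema.

```python
def hoh_without_dependents(persona: dict) -> bool:
    """True when a persona claims head of household but lists no dependents."""
    is_hoh = persona.get("filing_status") == "head_of_household"
    dependents = persona.get("dependents") or []  # hypothetical field name
    return is_hoh and len(dependents) == 0


print(hoh_without_dependents({"filing_status": "head_of_household"}))
print(hoh_without_dependents(
    {"filing_status": "head_of_household", "dependents": [{"age": 9}]}
))
```

A flagged persona is not automatically invalid, since a modeled exception may
apply, but it should require an explicit fact rather than pass silently.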

### GPT-TV-005 - CA Double Exemption Credit Confirmed, Plus CA Constants Look Stale

Severity: medium-high

I independently reran the CA DOR scenario harness. Result:

| Scenario | Expected credits | Actual credits | Tax impact |
|---|---:|---:|---:|
| `scenario_01` | $140.00 | $280.00 | Tax understated by $140.00 |
| `scenario_02` | $140.00 | $294.00 | Tax understated by $154.00 |
| `scenario_03` | $280.00 | $560.00 | Tax understated by $280.00 |

This confirms Claude's CA double-exemption-credit mechanism.
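
The table's last column is the actual credit minus the expected credit. The
sketch below recomputes it from the table in integer cents and also tests
whether each actual credit is exactly twice the expected one; the names are
illustrative, and it assumes each excess credit reduces tax dollar for dollar,
as the table implies.

```python
# (expected_credit_cents, actual_credit_cents), transcribed from the table above.
SCENARIOS = {
    "scenario_01": (14_000, 28_000),
    "scenario_02": (14_000, 29_400),
    "scenario_03": (28_000, 56_000),
}


def credit_report(expected: int, actual: int) -> dict:
    """Excess credit in cents and whether the actual credit is an exact doubling."""
    return {
        "excess_cents": actual - expected,
        "exact_double": actual == 2 * expected,
    }


for name, (expected, actual) in SCENARIOS.items():
    print(name, credit_report(expected, actual))
```

On these figures `scenario_01` and `scenario_03` are exact doublings, while
`scenario_02` is $14.00 above twice the expected credit.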

I also checked current FTB 2025 standard deduction amounts. FTB currently lists:

- Single or MFS: $5,706.
- MFJ, HoH, or qualifying widow(er): $11,412.

TaxVault's `Application/core/state_tax/data/ca.toml` uses:

- Single: $5,543.
- MFJ: $11,086.

Source checked: California FTB, "Deductions", 2025 standard deduction amounts  
https://www.ftb.ca.gov/file/personal/deductions/index.html

So the CA issue is broader than the duplicated credit: the 2025 CA data should
be refreshed against current FTB publications.
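
A drift check between published and configured constants could look like the
sketch below. The dictionary keys and the function name are hypothetical; the
dollar amounts are the ones quoted above, and the real `ca.toml` layout is not
shown in this report.

```python
FTB_2025 = {"single": 5706, "mfj": 11412}      # published amounts, in dollars
TAXVAULT_CA = {"single": 5543, "mfj": 11086}   # amounts read from ca.toml


def stale_constants(published: dict, configured: dict) -> dict:
    """Map each mismatched key to published minus configured."""
    return {
        key: published[key] - configured[key]
        for key in published
        if configured.get(key) != published[key]
    }


print(stale_constants(FTB_2025, TAXVAULT_CA))
```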

### GPT-TV-006 - Thin PDFs Trip The Plan's Size Floor

Severity: low-medium

All 100 PDFs were generated and `pdftotext` could extract text. However, the
plan explicitly asked for a minimum 5 KB packet sanity floor. Ten packets were
below that threshold:

`persona_001`, `persona_002`, `persona_007`, `persona_015`, `persona_036`,
`persona_040`, `persona_053`, `persona_056`, `persona_068`, `persona_069`.

This is not by itself proof of a wrong return, but it is consistent with a
packet generator that is producing very skeletal PDFs.
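
The size floor itself is a one-line comparison per packet. In the sketch below
the directory layout (`<persona>/packet/draft_filing_packet.pdf` under one root)
and the reading of "5 KB" as 5 * 1024 bytes are both assumptions, and
`thin_packets` is a hypothetical helper name.

```python
from pathlib import Path

SIZE_FLOOR_BYTES = 5 * 1024  # assumption: "5 KB" read as 5 * 1024 bytes


def thin_packets(root: Path, floor: int = SIZE_FLOOR_BYTES) -> list[str]:
    """Return persona directory names whose packet PDF is below the floor."""
    thin = []
    for pdf in sorted(root.glob("*/packet/draft_filing_packet.pdf")):
        if pdf.stat().st_size < floor:
            # pdf.parent is "packet"; its parent is the persona directory.
            thin.append(pdf.parent.parent.name)
    return thin
```

Calling `thin_packets(Path("/tmp/gpt_audit"))` would list the personas to
inspect by hand, assuming the artifacts follow that layout.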

### GPT-TV-007 - IRS ATS / Pub 5078 Business Claims Are Bootstrapped

Severity: audit-process blocker

I agree with Claude's ETIN/ATS gate finding. The local business ATS manifest
states that Pub. 5078 is a guidelines document, and that actual ATS scenario
data lives behind the IRS MeF ATS portal requiring an IRS ETIN/e-Services
account. The business scenarios in the repo are explicitly "engine_derived" and
"NOT IRS-verified ATS scenarios."

I also found stale/inconsistent ATS artifact summaries:

- `Application/tests/audit/irs_ats/business/manifest.json` says business
  scenarios are engine-derived.
- `Application/tests/audit/irs_ats/business/report.json` reports 8/9 business
  pass with 1 Form 1120 skip.
- `Application/tests/audit/irs_ats/ATS_REPORT.md` and
  `Application/tests/audit/irs_ats/ats_summary.json` present older/partial
  summaries.

This does not invalidate the engine regression value, but it does prevent a
claim that the business ATS set is IRS-verified.

## Comparison With Claude Audit

Claude's public 100-persona status reported 100/100 pass. My independent
classification is 41 clean pass and 59 warn/blocker.

This divergence is itself useful audit output. It does not mean the compute
engine collapsed; it means Claude's pass criteria were narrower than the
user-facing filing packet criteria in `AUDIT_PLAN_FOR_GPT.md`.

### Oracle Disagreement Classifications

While this audit was running, Chain 9 produced:

- `Application/tests/audit/oracle/classifications.json`
- `Application/tests/audit/DISAGREEMENT_CLASSIFICATIONS.md`

Those classify all 26 TaxVault-vs-oracle disagreements over $5,000:

| Classification | Count |
|---|---:|
| `ORACLE_BUG` | 10 |
| `FEATURE_GAP` | 8 |
| `TAXVAULT_CORRECT_PER_IRS_RULES` | 4 |
| `LEGITIMATE_MODEL_DIFFERENCE` | 2 |
| `IRREDUCIBLE_ORACLE_DISAGREEMENT` | 2 |

I independently agree with the new framing: comparing to a median of three
mutually divergent oracles is not a good proof of correctness. Classifying each
large disagreement is a more honest bar.

However, those classifications do not address the output-packet findings above:
wrong mailing addresses, omitted PDF forms, HoH fixture invalidity, and visible
1040 line reconciliation.

### Public Audit State Observed

At the time I wrote this report:

- `Application/tests/audit/PUBLIC_AUDIT/merkle_root.txt` existed:
  `b673df5a5a71ce5ea4f51a198b7ec955b8642fa792ba3a66f28ec8dcc15d9fea`
- `Application/tests/audit/PUBLIC_AUDIT_FINAL/merkle_root.txt` did not exist.
- Chain 9 runner state showed `currentFile = 40-sign-iter3.md`, status running.

## Verification Of Claude-Specific Findings

### TVBUG-CA-DOUBLE-EXEMPTION

Confirmed. The CA DOR harness passes 0 of 3 scenarios: actual exemption credits
exceed expected credits by $140, $154, and $280 respectively. I also found stale
CA 2025 standard deduction constants versus current FTB published amounts.

### OBBBA Standard Deduction Correction

Confirmed. TaxVault's 2025 federal standard deduction constants are:

- Single/MFS: $15,750.
- MFJ/QSS: $31,500.
- HoH: $23,625.

These match the current IRS values for the OBBBA amendments and Pub. 501. Sources checked:

- IRS, "IRS releases tax inflation adjustments for tax year 2026, including
  amendments from the One, Big, Beautiful Bill"  
  https://www.irs.gov/newsroom/irs-releases-tax-inflation-adjustments-for-tax-year-2026-including-amendments-from-the-one-big-beautiful-bill
- IRS, Publication 501 (2025), standard deduction tables  
  https://www.irs.gov/publications/p501

### Oracle Gap Diagnosis

Mostly confirmed. Chain 9's 26-disagreement classification artifact is a more
honest standard than a three-oracle median.

I also reran a five-persona oracle sample. `persona_063` is notable: fresh
TAXSIM returned $155,548.78 while the stored TAXSIM artifact showed
$224,818.56. The TAXSIM translator caps 2025 to TAXSIM's supported 2023 year,
so TAXSIM is not an authoritative 2025 source for high-stakes claims.

Sample fresh oracle rerun:

| Persona | TAXSIM | Tax-Calculator | PolicyEngine |
|---|---:|---:|---:|
| `persona_001` | $0.00 | $0.00 | $0.00 |
| `persona_020` | $21,021.13 | $22,869.56 | $18,693.78 |
| `persona_050` | $2,987.54 | $4,127.65 | $1,138.81 |
| `persona_063` | $155,548.78 | $151,635.65 | $148,629.27 |
| `persona_100` | $746,951.43 | $757,804.67 | $739,490.94 |
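
To show how far the three oracles diverge on this sample, the sketch below
computes the per-persona median and the max-minus-min spread in integer cents.
The values are transcribed from the table above; the variable and function names
are illustrative.

```python
from statistics import median

# (TAXSIM, Tax-Calculator, PolicyEngine) in cents, from the table above.
FRESH = {
    "persona_001": (0, 0, 0),
    "persona_020": (2_102_113, 2_286_956, 1_869_378),
    "persona_050": (298_754, 412_765, 113_881),
    "persona_063": (15_554_878, 15_163_565, 14_862_927),
    "persona_100": (74_695_143, 75_780_467, 73_949_094),
}


def spread(values: tuple) -> int:
    """Gap between the highest and lowest oracle result, in cents."""
    return max(values) - min(values)


for persona, values in FRESH.items():
    print(persona, "median", median(values), "spread", spread(values))
```

On this sample the spread runs from $0.00 for `persona_001` to $18,313.73 for
`persona_100`, so a median match alone says little about which oracle is right.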

### ETIN / Pub 5078 Gate

Confirmed. The repo's own extraction report states that Pub. 5078 is a
guidelines document, actual scenario data requires IRS MeF ATS portal access,
and the local business facts are engine-derived.

### Hand-Built MeF Schema

Confirmed with caveat. I reran MeF validation and got 100/100 valid against the
repo's hand-built `IRS1040.xsd`, SHA-256:

`c25b71cf96a4a4b8e9f3637940dc411ffbf94f209cf2a6ced4bb66c9f11ccb8c`

That proves internal XML consistency against the local schema. It does not prove
production MeF readiness, because IRS publishes official TY2025 MeF schema and
business-rule releases. The current IRS page lists TY2025 Form 1040 series
versions through 2025v5.3, with production dates. The next audit should validate
against the official IRS schema package, not a hand-built approximation.

Source checked: IRS, "Tax year 2025 Modernized e-File schema and business rules
for individual tax returns and extensions"  
https://www.irs.gov/tax-professionals/tax-year-2025-modernized-e-file-schema-and-business-rules-for-individual-tax-returns-and-extensions
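
The schema digest quoted above can be reproduced with Python's standard
`hashlib` module. This is a minimal sketch; the helper name `sha256_file` is
hypothetical, and the schema path in the usage note is the one passed to
`mef_validate.py` earlier in this report.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Running `sha256_file(Path("tests/audit/mef/_schemas/IRS1040.xsd"))` from the
`Application` directory and comparing the result with the digest above pins
which local schema the 100/100 result refers to. The same helper would apply to
the official IRS schema package once the harness uses it.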

## Disagreements With Claude

| Topic | Claude position observed | GPT audit position |
|---|---|---|
| 100-persona audit | 100/100 pass | 41 clean pass, 59 warn/blocker |
| Oracle disagreements | Redefined bar is classification, not median match | Agree, but classification does not cover packet correctness |
| CA issue | Double exemption credit | Confirmed, plus CA constants appear stale |
| MeF | 100/100 valid against local schema | Confirmed only for local hand-built schema; official IRS schema needed |
| Pub 5078 ATS | ETIN gate blocks true IRS business scenarios | Confirmed |
| Filing packets | Not emphasized in Claude pass claim | Shipping blocker due to wrong mailing addresses and missing forms |

## Recommended Fix Order

1. Replace `irs_mailing_addresses.rs` with the current IRS Form 1040
   where-to-file table and add tests for payment/no-payment by state group.
2. Fix export packet generation so `forms`, `export_json.included_forms`,
   manifest, and PDF text all agree. Start with Schedule C, Schedule SE,
   Form 8889, and Form 8995-A.
3. Fix the stock option income line presentation so Form 1040 line 9 reconciles
   visibly to component lines.
4. Strengthen validation/persona generation so HoH requires qualifying
   dependent facts or an explicit modeled exception.
5. Fix CA exemption-credit application and refresh CA 2025 constants against
   current FTB sources.
6. Update the MeF validation harness to use the official IRS TY2025 schema
   packages.
7. Mark engine-derived Pub. 5078/ATS scenarios as regression tests, not
   IRS-verified ATS tests, until ETIN-gated scenario data is available.

## Bottom Line

The strongest true claim is:

> TaxVault's 100-persona compute pipeline completes without crashing, federal
> core line math is mostly consistent, OBBBA standard deductions are current,
> and all large oracle disagreements now have diagnoses.

The strongest claim I would not make yet is:

> TaxVault aced an end-to-end filing packet audit.

That claim is blocked by current IRS mailing-address mismatches and missing PDF
forms. Those are fixable, but they are real.
