Complete the FATES-CLM nitrogen coupling #3409

Open
slevis-lmwg wants to merge 46 commits into ESCOMP:master from slevis-lmwg:fates-cn

Conversation

@slevis-lmwg
Contributor

@slevis-lmwg slevis-lmwg commented Aug 11, 2025

Description of changes

For now see the issue #3378

Corresponding mods on the FATES side:
NGEET/fates#1472

Nutrient enabled FATES handbook
FATES CLM N coupling

Specific notes

Contributors other than yourself, if any:
@rgknox @adrifoster @wwieder

CTSM Issues Fixed (include github issue #):
#3378

Are answers expected to change (and if so in what way)?
For non-fates tests expect roundoff diffs due to a change in the order of operations in CNNDynamicsMod.F90 as documented below.
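
As a reminder of why a reordering alone can change answers at roundoff level: floating-point addition is not associative. The sketch below is in Python rather than the Fortran of CNNDynamicsMod.F90, and the three values are arbitrary illustrations, not quantities from the model.

```python
# Floating-point addition is not associative, so regrouping the same
# terms can change the last bits of the result.
# The values are arbitrary; they are not taken from CNNDynamicsMod.F90.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # one grouping of the sum
regrouped = a + (b + c)       # same terms, different order of operations

difference = abs(left_to_right - regrouped)
print(left_to_right == regrouped)  # False
print(difference < 1e-15)          # True: the sums differ only at roundoff level
```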

Any User Interface Changes (namelist or namelist defaults changes)?
Yes, see changes to CLMBuildNamelist and namelistdefaults.

Does this create a need to change or add documentation? Did you do so?
Yes, no.

  • Todo: modify the PRT2 test so that prescribed P uptake is set to 10 (i.e., fates_cnp_prescribed_puptake=10). This ensures that there are no P limitations in FATES once FATES becomes coupled to CLM's N cycle (in a future PR).

Initial testing performed
...with the first two commits in this PR:

PASS ERS_D_Ld30.1x1_brazil.I2000Clm60FatesCrujraRs.derecho_intel.clm-FatesColdPRT2
PASS ERS_D_Ld30.1x1_brazil.I2000Clm60FatesCrujraRs.derecho_intel.clm-FatesCold

Later comments point out that these two tests were inadequate at catching problems, and that I switched to two other tests.

PASS ERS_D_Ld30.1x1_brazil.I2000Clm60FatesCrujraRs.derecho_intel.clm-FatesColdPRT2
FAIL ERS_D_Ld30.1x1_brazil.I2000Clm60FatesCrujraRs.derecho_intel.clm-FatesColdPRT2--clm-mimicsFatesCold--clm-nofireemis
The latter needs "nofireemis" to work with FATES, but it then dumps core at line 1180 of SoilBiogeochemDecompCascadeMIMICSMod.F90 while calculating these variables:
nf_soil%decomp_npools_sourcesink_col
nf_soil%fates_litter_flux
@slevis-lmwg self-assigned this Aug 11, 2025
@slevis-lmwg added the enhancement, investigation, science, test: aux_clm, and test: fates labels Aug 11, 2025
@slevis-lmwg linked an issue Aug 11, 2025 that may be closed by this pull request
1 task
@slevis-lmwg
Contributor Author

slevis-lmwg commented Aug 13, 2025

With the latest commit, I repeated the two earlier tests and added another to check whether answers have changed from the baseline:

PASS ERS_D_Ld30.1x1_brazil.I2000Clm60FatesCrujraRs.derecho_intel.clm-FatesColdPRT2
PASS ERS_D_Ld30.1x1_brazil.I2000Clm60FatesCrujraRs.derecho_intel.clm-FatesCold
FAIL ERP_Ld9.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdAllVars -c /glade/campaign/cgd/tss/ctsm_baselines/ctsm5.3.065

The latter fails in case2, after reading the restart file, with an N balance error.

UPDATE

  • See below for suggestions to resolve FAIL
  • Add new test ERP_Ld9.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdPRT2 (sanity check first that it meets requirements set forth in the fourth checkbox here, which I think means pointing to an alternate fates paramfile)
  • Later comment explains why I decided to revert to preexisting ERS tests and skip adding this ERP test.

Comment thread src/soilbiogeochem/SoilBiogeochemCompetitionMod.F90 Outdated
Comment thread src/soilbiogeochem/SoilBiogeochemCompetitionMod.F90
@slevis-lmwg
Contributor Author

slevis-lmwg commented Aug 26, 2025

Enabled fixation and ran the same tests:

PASS ERS_D_Ld30.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-FatesColdPRT2 -c /glade/campaign/cgd/tss/ctsm_baselines/fates-sci.1.84.0_api.40.0.0-ctsm5.3.066
PASS ERS_D_Ld30.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-FatesColdLUH2 -c /glade/campaign/cgd/tss/ctsm_baselines/fates-sci.1.84.0_api.40.0.0-ctsm5.3.066

PRT2 still b4b with the baseline.
LUH2 now DIFF from the baseline.

PASS The same tests with the code change for harvest (same results relative to the baseline).

@slevis-lmwg
Contributor Author

slevis-lmwg commented Sep 20, 2025

Worked on the next checkbox in the issue (#3378), submitted the same tests, and after some troubleshooting:
PASS PRT2 and b4b with the baseline.
PASS LUH2 and DIFF from the baseline as before.

These variables originate in fates, so this renaming requires the same
renaming in fates; I will open the corresponding PR very soon
@slevis-lmwg
Contributor Author

slevis-lmwg commented Sep 23, 2025

OK ./build-namelist_test.pl
OK ./run_sys_tests -s fates -c fates-sci.1.84.0_api.40.0.0-ctsm5.3.066/ --skip-generate
REDO ./run_sys_tests -s aux_clm -c ctsm5.3.066 --skip-generate

Notes:

  • Gnu and nvhpc tests needed a bug-fix that intel didn't catch (next commit).
  • Then the fates test-suite worked (I didn't repeat aux_clm for now)
  • Many tests DIFFer from the baseline.
  • The PRT2 test had the expected NLCOMP change, though no DIFFs from the baseline (even after rebuilding/rerunning):
  BASE: suplnitro = 'ALL'
  COMP: suplnitro = 'NONE'

@slevis-lmwg
Contributor Author

slevis-lmwg commented Sep 23, 2025

Updated the fates paramfile (see next commit) and submitted these two again

./create_test ERS_D_Ld30.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-FatesColdLUH2 -c /glade/campaign/cgd/tss/ctsm_baselines/fates-sci.1.84.0_api.40.0.0-ctsm5.3.066
./create_test ERS_D_Ld30.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-FatesColdPRT2 -c /glade/campaign/cgd/tss/ctsm_baselines/fates-sci.1.84.0_api.40.0.0-ctsm5.3.066

The first (LUH2) is the same as before (DIFF from baseline), since nothing changed for it.
The second (ERS_D_Ld30.f45_f45_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-FatesColdPRT2.C.20250923_141838_qjxqou) now:

  • DIFF from baseline and
  • FAIL COMPARE_base_rest, which I suspect means one or more variables are missing from the restart file

@rosiealice
Contributor

Thanks @slevis-lmwg & @rgknox , that's good to know. We will circle back to this once Shelby is spun up on life in general after Easter (she only just arrived about a week ago...)

@rgknox
Contributor

rgknox commented Mar 30, 2026

I'm investigating the ERS restart test failure and tried two tests. In the first test, I forced the litter flux from FATES to CLM to be zero. This did not have any impact on restarts, suggesting that the bug is either not related to restarting litter fluxes, or is not limited to restarting litter fluxes. In the second, I forced the fine-root profile that FATES sends to the nitrogen allocation scheme to zero, and this did generate b4b restarts. My next test will be to go into the allocation code and set the FATES-side boundary conditions to constants, to see if that works. That will tell us whether the problem is in restarting the CLM-side allocation/decomposition variables or in the FATES-side restarting of boundary conditions. cc @slevis-lmwg

@slevis-lmwg
Contributor Author

slevis-lmwg commented Apr 6, 2026

After the latest updates, I started by submitting SMS.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-crop.C.20260406_124851_rch8g6 and this passed. I got pulled in other directions for the day, but my plan is to

  • confirm the other tests work...

FAIL PRT2_synthN RUN

forrtl: severe (408): fort: (2): Subscript #1 of the array PLANT_NH4_UPTAKE_FLUX has value 2 which is greater than the upper bound of 1
cesm.exe           0000000001DFE98F  set_restart_vecto        2713  FatesRestartInterfaceMod.F90
cesm.exe           000000000099C9D5  restart                  1935  clmfates_interfaceMod.F90
cesm.exe           0000000000939B38  clm_instrest              620  clm_instMod.F90
cesm.exe           0000000000D32E1C  restfile_write            132  restFileMod.F90
cesm.exe           0000000000917E69  clm_drv                  1529  clm_driver.F90
cesm.exe           0000000000856DC4  modeladvance              913  lnd_comp_nuopc.F90

FAIL LUH2 RUN turns out to be expected, as per #3789

negative area           1           1           1  -999.0
ENDRUN: ERROR in EDInitMod.F90 at line 608

PASS PRT2, just no baseline to compare
PASS PRT2_suplnAll
PASS SMS (as already mentioned above the bullet)

  • resolve failures, then start a full aux_clm test

@slevis-lmwg
Contributor Author

slevis-lmwg commented Apr 7, 2026

@rgknox two tests failed (PRT2_synthN and LUH2) and three passed. See error msgs in the previous post.

Looking into the first of these. I think we set allocate(bc_in%plant_nh4_uptake_flux(1,1)) in main/FatesInterfaceMod.F90 because coupled_p_uptake and coupled_n_uptake are false. The failure then occurs in main/FatesRestartInterfaceMod.F90 in this loop:

icomp = 0
do_patch: do while(associated(cpatch))
   do while(associated(ccohort))
      if (hlm_parteh_mode == carbon_nitrogen_phosphorus) then
         icomp=icomp+1
         this%rvars(ir_nh4uptakeflux_co)%r81d(io_idx_co) = bc_in(s)%plant_nh4_uptake_flux(icomp,1)

My first inclination is to move icomp = 0 into the first do loop and try the tests again. So far:

FAIL PRT2_synthN RUN
FAIL LUH2 RUN expected
FAIL PRT2 COMPARE_base_rest
FAIL PRT2_suplnAll COMPARE_base_rest
PASS SMS

so my first intuition was wrong, and I will restore FatesRestartInterfaceMod.F90 and try changing the allocate statement in FatesInterfaceMod.F90.

PASS PRT2_synthN RUN
FAIL LUH2 RUN expected
PASS PRT2 COMPARE_base_rest
PASS PRT2_suplnAll COMPARE_base_rest
PASS SMS

Fixed!
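
The failure mode described above can be sketched outside the model. The following Python sketch is an illustration only: `allocate_uptake_flux`, `write_restart`, and the `always_full_size` switch are hypothetical stand-ins, not FATES routines, and the actual change made to the allocate statement is not shown in this thread. It assumes a length-1 placeholder array when uptake is uncoupled and a restart loop that indexes the array once per cohort.

```python
# Hypothetical stand-ins for the FATES logic; names and sizes are assumptions.

def allocate_uptake_flux(n_cohorts, coupled, always_full_size=False):
    """Return a zeroed per-cohort flux buffer.

    Mimics allocating a length-1 placeholder when uptake is uncoupled,
    unless always_full_size asks for one slot per cohort regardless.
    """
    if coupled or always_full_size:
        return [0.0] * n_cohorts
    return [0.0]  # placeholder, analogous to allocate(...(1,1))


def write_restart(flux, n_cohorts):
    """Copy one flux value per cohort, as the restart loop does with icomp."""
    out = []
    for icomp in range(n_cohorts):
        out.append(flux[icomp])  # IndexError if flux is only a placeholder
    return out


# Placeholder allocation plus a per-cohort loop overruns on the second cohort.
try:
    write_restart(allocate_uptake_flux(2, coupled=False), 2)
    overran = False
except IndexError:
    overran = True

# Sizing the buffer per cohort lets the same loop run unchanged.
restart_values = write_restart(
    allocate_uptake_flux(2, coupled=False, always_full_size=True), 2
)
print(overran, restart_values)
```

In this sketch, resetting the counter for each patch would not help either, because a single patch with two cohorts still reaches the second index.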

@slevis-lmwg
Contributor Author

slevis-lmwg commented Apr 8, 2026

Starting aux_clm on derecho + izumi:
./run_sys_tests -s aux_clm -c ctsm5.4.029 --skip-generate

RUN failures to investigate (none on izumi) in /glade/derecho/scratch/slevis/tests_0408-165737de

ERI_D_Ld9.f45_f45_mg37.I2000Clm60FatesSpCruRsGs.derecho_intel.clm-FatesColdSatPhenCamLndTuningMode
ERP_P128x2_Ld30.f45_f45_mg37.I2000Clm60FatesSpCruRsGs.derecho_intel.clm-FatesColdSatPhen
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-ciso--clm-matrixcnOn_ignore_warnings
SMS.f45_f45_mg37.I2000Clm60FatesSpRsGs.derecho_nvhpc.clm-FatesColdSatPhen
  • I reran ERP_P64 four times, and then a fifth time after a clean build, and it continues to fail, so I'm abandoning it for now, even though an ERP_D test with the same "matrix" error passed after one rerun.
  • ERI, ERP_P128, SMS give ERROR: fates_parteh_mode=carbon_only must have suplnitro set to suplnAll. ERROR in controlMod.F90 at line 504. Erik and I looked at CLMBuildNamelist.pm and resolved this quickly.
  • Erik suggested I add a build-namelist_test.pl test confirming this failure triggers in the build phase.
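
The consistency rule behind that error message can be sketched as follows. This is an illustrative Python sketch, not the Perl in CLMBuildNamelist.pm or the Fortran in controlMod.F90. `check_fates_suplnitro` and `fails` are hypothetical names, and the string values are assumptions: 'carbon_only' comes from the wording of the error message and 'ALL' from the NLCOMP output earlier in this thread.

```python
# Hypothetical check; the real validation lives in CLMBuildNamelist.pm
# and controlMod.F90. String values are assumptions (see the text above).

def check_fates_suplnitro(fates_parteh_mode, suplnitro):
    """Raise early if carbon-only FATES is run without supplemental N."""
    if fates_parteh_mode == "carbon_only" and suplnitro != "ALL":
        raise ValueError(
            "fates_parteh_mode=carbon_only must have suplnitro set to ALL"
        )


def fails(mode, supl):
    """Return True when the combination is rejected by the check."""
    try:
        check_fates_suplnitro(mode, supl)
    except ValueError:
        return True
    return False


print(fails("carbon_only", "NONE"))
print(fails("carbon_only", "ALL"))
```

A build-phase test in the spirit of Erik's suggestion would assert that the first combination is rejected before any run is attempted.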

Also, should we expect the DIFFs from baseline in the following tests (some also on izumi, but I did not list them) due to the diff between sci.1.91.3_api.43.1.0-25-g21c1bcbe and the baseline sci.1.91.1_api.43.1.0? Though I see diffs in non-Fates cases, too. Erik suggests introducing the changes in a b4b step followed by a non-b4b step if possible.

    FAIL ERI_D_Ld9.f10_f10_mg37.I1850Clm45Bgc.derecho_gnu.clm-default BASELINE ctsm5.4.029: DIFF
    FAIL ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm45BgcCrop.derecho_gnu.clm-no_subgrid_fluxes BASELINE ctsm5.4.029: DIFF
    FAIL ERS_D_Ld20.f45_f45_mg37.I2000Clm50FatesRs.derecho_gnu.clm-FatesColdTwoStreamNoCompFixedBioGeo BASELINE ctsm5.4.029: DIFF
    FAIL ERS_D_Ld6.f10_f10_mg37.I1850Clm45BgcCrop.derecho_gnu.clm-clm50CMIP6frc BASELINE ctsm5.4.029: DIFF
    FAIL LGRAIN2_Ly2_P128x1.f10_f10_mg37.I1850Clm45BgcCrop.derecho_gnu.clm-ciso--clm-cropMonthOutput BASELINE ctsm5.4.029: DIFF
    FAIL LREPRSTRUCT_Ly2_P128x1.f10_f10_mg37.I1850Clm45BgcCrop.derecho_gnu.clm-ciso--clm-cropMonthOutput BASELINE ctsm5.4.029: DIFF
    FAIL SMS_D_Ld5.f10_f10_mg37.I2000Clm50FatesRs.derecho_gnu.clm-FatesCold BASELINE ctsm5.4.029: DIFF
    FAIL SMS_Ld5.f10_f10_mg37.I1850Clm45BgcCrop.derecho_gnu.clm-till--clm-remove_residues BASELINE ctsm5.4.029: DIFF
    FAIL SMS_Ld5_PS.f19_g17.I2000Clm50FatesRs.derecho_gnu.clm-FatesCold BASELINE ctsm5.4.029: DIFF
    FAIL ERI_D_Ld20.f10_f10_mg37.I2000Clm50Fates.derecho_intel.clm-FatesCold BASELINE ctsm5.4.029: DIFF
    FAIL ERI_D_Ld20.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdTwoStream BASELINE ctsm5.4.029: DIFF
    FAIL ERI_D_Ld9.f45_f45_mg37.I2000Clm60Fates.derecho_intel.clm-FatesColdCamLndTuningMode BASELINE ctsm5.4.029: DIFF
    FAIL ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-noFUN_flexCN BASELINE ctsm5.4.029: DIFF
    FAIL ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-noFUN_flexCN--clm-matrixcnOn_ignore_warnings BASELINE ctsm5.4.029: DIFF (EXPECTED FAILURE)
    FAIL ERP_Ld9.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdAllVars BASELINE ctsm5.4.029: DIFF
    FAIL ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm45BgcCrop.derecho_intel.clm-crop BASELINE ctsm5.4.029: DIFF
    FAIL ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm45BgcCru.derecho_intel.clm-ciso BASELINE ctsm5.4.029: DIFF
    FAIL ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm45BgcCru.derecho_intel.clm-default BASELINE ctsm5.4.029: DIFF
    FAIL ERP_P64x2_D_Ld5.f10_f10_mg37.IHistClm45BgcCru.derecho_intel.clm-decStart BASELINE ctsm5.4.029: DIFF
    FAIL ERP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly--clm-matrixcnOn_ignore_warnings BASELINE ctsm5.4.029: DIFF (EXPECTED FAILURE)
    FAIL ERS_D_Ld5.f10_f10_mg37.I2000Clm50Fates.derecho_intel.clm-FatesCold BASELINE ctsm5.4.029: DIFF
    FAIL ERS_D_Ld6.f10_f10_mg37.I1850Clm45BgcCrop.derecho_intel.clm-clm50CMIP6frc BASELINE ctsm5.4.029: DIFF
    FAIL ERS_Ld30.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdFixedBiogeo BASELINE ctsm5.4.029: DIFF
    FAIL ERS_Ld30.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdSizeAgeMort BASELINE ctsm5.4.029: DIFF
    FAIL ERS_Ly5_P128x1.f10_f10_mg37.IHistClm45BgcCrop.derecho_intel.clm-cropMonthOutput BASELINE ctsm5.4.029: DIFF
    FAIL ERS_P128x1_Ld765.f10_f10_mg37.I2000Clm60Fates.derecho_intel.clm-FatesColdNoComp BASELINE ctsm5.4.029: DIFF
    FAIL REP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly--clm-matrixcnOn_ignore_warnings BASELINE ctsm5.4.029: DIFF (EXPECTED FAILURE)
    FAIL SMS_D_Ld5.f10_f10_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesCold BASELINE ctsm5.4.029: DIFF
    FAIL SMS_D_Ld65.f10_f10_mg37.I2000Clm45BgcCropQianRs.derecho_intel.clm-FireLi2014Qian BASELINE ctsm5.4.029: DIFF
    FAIL SMS_D_Lm6_P256x1.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesCold BASELINE ctsm5.4.029: DIFF
    FAIL SMS_Ld5.f10_f10_mg37.I1850Clm45BgcCrop.derecho_intel.clm-crop BASELINE ctsm5.4.029: DIFF
    FAIL SMS_Ld5.f10_f10_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesCold BASELINE ctsm5.4.029: DIFF
    FAIL SMS_Ln9.f09_f09_mg17.I1850Clm45Bgc.derecho_intel.clm-clm45cam4LndTuningModeZDustSoilErod BASELINE ctsm5.4.029: DIFF

@slevis-lmwg
Contributor Author

slevis-lmwg commented Apr 13, 2026

Testing on derecho after the latest updates:

  • ./run_sys_tests -s fates -c fates-sci.1.92.0_api.44.0.0-ctsm5.4.031 --skip-generate
  • build-namelist_test.pl

Before I try ./run_sys_tests -s aux_clm -c ctsm5.4.032 --skip-generate, I will look into the diffs from baseline with this test from the above list of failures:
SMS_Ld5.f10_f10_mg37.I1850Clm45BgcCrop.derecho_intel.clm-crop
Largest diffs (grepped for "E-0") out of a total of 53 fields with non-zero diffs suggest roundoff diffs:

 RMS CH4_SURF_AERE_SAT                2.0582E-10            NORMALIZED  3.9956E-03
 RMS CH4_SURF_DIFF_SAT                8.6639E-14            NORMALIZED  1.6418E-04
 RMS FCH4                             4.7957E-15            NORMALIZED  1.3812E-04
 RMS FCH4TOCO2                        4.7907E-12            NORMALIZED  8.6788E-05
 RMS FCH4_DFSAT                       8.9717E-20            NORMALIZED  2.3733E-07
 RMS NEM                              4.7907E-12            NORMALIZED  1.2779E-04
 RMS SOM_C_LEACHED                    1.3173E-25            NORMALIZED  1.5688E-09
 RMS TOTCOLCH4                        6.1704E-10            NORMALIZED  1.9915E-09
 RMS CONC_O2_SAT                      4.7118E-03            NORMALIZED  4.4369E-02
  • The test gives similar or identical diffs when I revert the mods in SoilBiogeochemCompetitionMod.F90
  • The test returns b4b when I revert the mods in CNNDynamicsMod.F90
  • Starting aux_clm to confirm all DIFFs go away, though Fates tests appear to fail now
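
A minimal sketch of screening such summary lines programmatically instead of with grep. It assumes each line has the five-token layout shown above (`RMS`, field name, RMS value, `NORMALIZED`, normalized value). `parse_rms_line` is a hypothetical helper, not part of the CTSM test tools.

```python
# Hypothetical helper for lines shaped like the excerpt above.

def parse_rms_line(line):
    """Split 'RMS <field> <rms> NORMALIZED <normalized>' into a tuple."""
    tokens = line.split()
    if len(tokens) != 5 or tokens[0] != "RMS" or tokens[3] != "NORMALIZED":
        raise ValueError(f"unexpected line layout: {line!r}")
    return tokens[1], float(tokens[2]), float(tokens[4])


# Three lines copied from the excerpt above.
sample = """\
 RMS CH4_SURF_AERE_SAT                2.0582E-10            NORMALIZED  3.9956E-03
 RMS SOM_C_LEACHED                    1.3173E-25            NORMALIZED  1.5688E-09
 RMS CONC_O2_SAT                      4.7118E-03            NORMALIZED  4.4369E-02
"""

records = [parse_rms_line(line) for line in sample.splitlines() if line.strip()]

# Rank fields by normalized RMS so the largest relative diffs come first.
ranked = sorted(records, key=lambda rec: rec[2], reverse=True)
for field, rms, normalized in ranked:
    print(f"{field:20s} rms={rms:.4e} normalized={normalized:.4e}")
```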

Comment thread src/biogeochem/CNNDynamicsMod.F90
Comment thread src/soilbiogeochem/SoilBiogeochemCompetitionMod.F90
Comment thread src/soilbiogeochem/SoilBiogeochemCompetitionMod.F90
@slevis-lmwg
Contributor Author

slevis-lmwg commented Apr 14, 2026

Now repeat ./run_sys_tests -s aux_clm -c ctsm5.4.032 --skip-generate with the latest commit:

  • izumi
  • derecho
    Three tests ran out of walltime that didn't fail before, so I will not spend additional time on them for now:
ERS_Ly5_Mmpi-serial.1x1_smallvilleIA.I1850Clm50BgcCrop.derecho_gnu.clm-ciso_monthly RUN
ERS_Ly5_Mmpi-serial.1x1_smallvilleIA.I1850Clm50BgcCrop.derecho_gnu.clm-ciso_monthly--clm-matrixcnOn RUN
ERS_Ly5_P128x1.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput--clm-matrixcnOn_ignore_warnings RUN

Known issue #3840

@slevis-lmwg
Contributor Author

slevis-lmwg commented Apr 14, 2026

@rgknox testing seems satisfactory to me, so this PR and its fates counterpart are ready for your final review, unless you think of anything else that we need to work on here.

Labels

  • enhancement (new capability or improved behavior of existing capability)
  • investigation (Needs to be verified and more investigation into what's going on.)
  • science (Enhancement to or bug impacting science)
  • test: aux_clm (Pass aux_clm suite before merging)
  • test: fates (Pass fates test suite before merging)

Projects

Status: Stalled

Development

Successfully merging this pull request may close these issues.

Completing the FATES-CLM nitrogen coupling

4 participants