# Data to constrain process model {#sec-neon}
```{r}
#| message: FALSE
library(tidyverse)
library(neonstore)
```
The goal of this chapter is to access and process data from the National Ecological Observatory Network (NEON) to calibrate parameters, estimate parameter uncertainty, assimilate data, and evaluate output from the forest carbon model in @sec-process-model. The data that is prepared for use in the forest carbon model is used in subsequent chapters.
In this document, we will be calculating the carbon in tree wood, tree leaves, and soil for a NEON site (matching the model states in @sec-process-model).
- The carbon in wood is based on the conversion of tree diameter measurements to carbon in stems and coarse woody roots. Since our simple forest model does not have a specific representation of fine roots, we lump fine roots with the wood stock.
- The carbon in tree leaves is represented by data measuring leaf area index (which is related to leaf carbon through a mass-to-area conversion).
- The carbon in soil represents all dead organic matter (both soil and vegetation) and lumps together standing dead trees, dead trees that have fallen (coarse woody debris), and soil organic matter measured using soil pits.
- Fluxes of carbon are represented by net ecosystem exchange (nee), as measured by eddy covariance towers.
Much of the code in this chapter is specific to NEON and the particular carbon pools and fluxes analyzed. While the chapter focuses on a single NEON site, the code and concepts can be applied to developing carbon budgets at other forested NEON sites. Additional carbon stock collections are needed at grassland sites (e.g., the non-woody vegetation sampling needs to be processed).
This chapter was developed in collaboration with John Smith at Montana State University.
NEON data is organized by data product ID in the NEON Data Portal: <https://data.neonscience.org/static/browse.html>
The chapter uses the `neonstore` package developed by Carl Boettiger to access NEON data. The `neon_cloud` function uses the NEON data product ID and the name of a table within the product to read the data from NEON cloud storage. If you are new to a NEON data product, explore it on NEON's Data Portal before using `neon_cloud`; otherwise you will not know which tables you need or how they link together.
## NEON Project
{{< video https://www.youtube.com/embed/39YrzpxVRF8?si=MZaH7miPEYhgiF6e >}}
## NEON Terrestrial sites
{{< video https://www.youtube.com/embed/FXpqf89w6QU?si=_wGoKN0i13yJbd5S >}}
## Download data
First, we define the site ID. The four letter site code denotes individual NEON sites. You can learn more about NEON sites here: <https://www.neonscience.org/field-sites/explore-field-sites>.
The elevation, latitude, and longitude are needed to convert the tree diameter measurements to biomass and are found on the NEON page describing the site.
```{r}
site <- "OSBS"
elevation <- 46
latitude <- 29.689282
longitude <- -81.993431
```
## Wood carbon
In this section, we will be calculating carbon in live and dead trees at a NEON site. The carbon in live trees represents the wood carbon stock in @fig-process-model and the dead trees represent a component of the soil organic matter stock in @fig-process-model. In the end, we will have a *site-level* mean carbon stock in *live trees* and *dead trees* for each year that was sampled from the plots that represent the ecosystem under the flux tower (i.e., the tower plots). We use the tower plots so that they correspond to the same ecosystem as the NEON `nee` data.
We will select the key variables in each table (thus only downloading those variables).
The code below reads the data directly from NEON's cloud storage.
```{r}
#| message: false
## Mapping and tagging table
map_tag_table <- neon_cloud("mappingandtagging",
product = "DP1.10098.001",
site = site) |>
select(individualID, scientificName) |>
collect() |>
group_by(individualID) |>
slice(1) |> #This is needed because some individuals change species IDs
ungroup()
## Individual table
ind_table <- neon_cloud("apparentindividual",
product = "DP1.10098.001",
site = site) |>
select(individualID, eventID, plotID, date, stemDiameter,plantStatus, measurementHeight) |>
distinct() |>
collect()
## Plot table
plot_table <- neon_cloud("perplotperyear",
product = "DP1.10098.001",
site = site) |>
select(plotID,totalSampledAreaTrees,plotType) |>
distinct(plotID, .keep_all = TRUE) |>
collect()
```
The equations that convert diameter at breast height (DBH) to biomass differ by species and location; breast height is defined as 130 cm above the base of the tree. Therefore the scientific name (both genus and species components) is needed. The species names in the mapping and tagging table need to be separated into the genus and species so that we can calculate the biomass using an R function that expects them to be separate.
```{r}
genus_species <- unlist(str_split_fixed(map_tag_table$scientificName, " ", 3))
map_tag_table <- map_tag_table |>
mutate(GENUS = genus_species[,1],
SPECIES = genus_species[,2])
```
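As a minimal sketch of what the split produces, the toy vector below (hypothetical names chosen for illustration, not pulled from the NEON table) is split into exactly three pieces, so the first two columns hold the genus and species and any trailing authority text falls into the third.

```{r}
library(stringr)

# Hypothetical scientific names for illustration only
toy_names <- c("Pinus palustris Mill.", "Quercus laevis Walter", "Quercus sp.")

# Split on spaces into exactly three columns; extra text stays in column 3
toy_parts <- str_split_fixed(toy_names, " ", 3)

toy_genus <- toy_parts[, 1]
toy_species <- toy_parts[, 2]
```

Names identified only to genus (here `"Quercus sp."`) end up with `"sp."` in the species column, which is worth keeping in mind when matching against species-specific equations.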
Now we will join the tables by the key variables to build our dataset for the site.
```{r}
combined_table <- left_join(ind_table, map_tag_table, by = "individualID") |>
arrange(plotID,individualID)
combined_table <- inner_join(combined_table, plot_table, by = "plotID") |>
arrange(individualID)
combined_table_dbh <- combined_table |>
filter(measurementHeight == 130,
!is.na(stemDiameter))
```
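The two join types used above behave differently when a key is missing. The sketch below uses small hypothetical tables (not NEON data) to show that `left_join` keeps every measured individual, while `inner_join` drops individuals whose plot is absent from the plot table.

```{r}
library(dplyr)

# Hypothetical tables for illustration only
toy_ind <- tibble(individualID = c("A", "B", "C"),
                  plotID = c("P1", "P1", "P2"))
toy_tag <- tibble(individualID = c("A", "B"),
                  scientificName = c("Pinus palustris", "Quercus laevis"))
toy_plot <- tibble(plotID = "P1", plotType = "tower")

# left_join keeps every measured individual, filling missing species with NA
toy_left <- left_join(toy_ind, toy_tag, by = "individualID")

# inner_join keeps only individuals whose plot appears in the plot table
toy_inner <- inner_join(toy_left, toy_plot, by = "plotID")
```

Here individual `C` survives the first join with a missing species name but is removed by the second join because plot `P2` has no plot-level record.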
### Calculate carbon in live trees
Tidy up the individual tree data to include only live trees from the tower plots, and convert the stem diameter to a numeric value. The data were already filtered to measurement heights of 130 cm so that only diameter at breast height (DBH) measurements are included; the filter below repeats that condition.
```{r}
combined_table_live_tower <- combined_table_dbh |>
filter(str_detect(plantStatus,"Live"),
plotType == "tower",
measurementHeight == 130) |>
mutate(stemDiameter = as.numeric(stemDiameter))
```
To calculate the biomass of each tree in the table, we will use the `get_biomass` function from the `allodb` package (Gonzalez-Akre et al., <https://doi.org/10.1111/2041-210X.13756>), which converts DBH measurements to tree biomass estimates. This function takes as arguments `dbh`, `genus`, `species`, and `coords`. We have already extracted the genus and species and filtered the data to DBH measurements. Note that `allodb` is not on CRAN but can be installed using `remotes::install_github("ropensci/allodb")`.
In this next section, as well as a future one where we calculate dead tree carbon, we are going to make a simplifying assumption. We will assume that the below-ground biomass of a tree is some fixed proportion of its above-ground biomass. In our analysis, we will assume this value is $0.3$ (`ag_bg_prop`), but it is a parameter that can be changed. We also assume that carbon is $0.5$ of biomass.
The `get_biomass` function is within the `allodb` package and returns the biomass of each tree in units of kg.
```{r}
library(allodb)
ag_bg_prop <- 0.3
tree_live_carbon <- combined_table_live_tower |>
mutate(ag_tree_kg = get_biomass(dbh = combined_table_live_tower$stemDiameter,
genus = combined_table_live_tower$GENUS,
species = combined_table_live_tower$SPECIES,
coords = c(longitude, latitude)),
bg_tree_kg = ag_tree_kg * ag_bg_prop, ## assumption about ag to bg biomass
tree_kgC = (ag_tree_kg + bg_tree_kg) * 0.5) ## convert biomass to carbon
```
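To make the two assumptions concrete, here is a small worked example for a single tree. The above-ground biomass value is hypothetical; the 0.3 and 0.5 factors are the ones stated above.

```{r}
# Hypothetical above-ground biomass for a single tree (kg), for illustration
toy_ag_kg <- 100

# Same assumptions as the chapter: below-ground = 0.3 * above-ground,
# and carbon = 0.5 * total biomass
toy_bg_kg <- toy_ag_kg * 0.3
toy_kgC <- (toy_ag_kg + toy_bg_kg) * 0.5
toy_kgC
```

A tree with 100 kg of above-ground biomass is therefore assigned 30 kg below ground and 65 kg of carbon in total.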
Calculate the plot-level carbon by dividing the carbon in each tree by the sampled area of its plot and summing within each plot and measurement date.
```{r}
measurement_dates <- tree_live_carbon |>
summarise(measure_date = max(date), .by = eventID)
plot_live_carbon <- tree_live_carbon |>
left_join(measurement_dates, by = "eventID") |>
mutate(treeC_kgCm2 = (tree_kgC)/(totalSampledAreaTrees)) |>
summarise(plot_kgCm2 = sum(treeC_kgCm2, na.rm = TRUE), .by = c("plotID", "measure_date"))
```
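The per-area calculation can be checked by hand on a toy table. The values below are hypothetical; the column names mirror those used above.

```{r}
library(dplyr)

# Hypothetical tree-level carbon (kg C) and sampled plot area (m2)
toy_trees <- tibble(
  plotID = c("P1", "P1", "P2"),
  measure_date = as.Date(c("2020-10-01", "2020-10-01", "2020-10-02")),
  tree_kgC = c(80, 120, 50),
  totalSampledAreaTrees = c(400, 400, 200)
)

# Divide each tree by its plot's sampled area, then sum within plot and date
toy_plot_carbon <- toy_trees |>
  mutate(treeC_kgCm2 = tree_kgC / totalSampledAreaTrees) |>
  summarise(plot_kgCm2 = sum(treeC_kgCm2, na.rm = TRUE),
            .by = c(plotID, measure_date))
```

Plot `P1` holds 200 kg C over 400 m2 (0.5 kg C per m2) and plot `P2` holds 50 kg C over 200 m2 (0.25 kg C per m2).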
@fig-plot-live-carbon shows the plot-level carbon in living trees.
```{r}
#| fig-cap: Plot level carbon in living trees for the focal NEON site
#| label: fig-plot-live-carbon
ggplot(plot_live_carbon, aes(x = measure_date, y = plot_kgCm2, color = plotID)) +
geom_point() +
geom_line() +
theme_bw()
```
Only a subset of plots are measured each year and we only want the plots that have annual measurements. This code determines the set of plots that are measured each year (a subset, n = 5), while all the other plots are measured every 5 years.
```{r}
last_plots <- plot_live_carbon |>
filter(measure_date == max(measure_date)) |>
pull(plotID)
site_live_carbon <- plot_live_carbon |>
filter(plotID %in% last_plots) |>
pivot_wider(names_from = plotID, values_from = plot_kgCm2) |>
na.omit() |>
pivot_longer(-measure_date, names_to = "plotID", values_to = "plot_kgCm2") |>
group_by(measure_date) |>
summarize(mean_kgCperm2 = mean(plot_kgCm2, na.rm = TRUE),
sd_kgCperm2 = sd(plot_kgCm2))
```
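The wide-then-long step above is what removes dates on which some plots were not measured. The sketch below, built on a hypothetical three-row table, shows the effect: a date with a missing plot becomes a row containing `NA` in the wide layout and is dropped by `na.omit()`.

```{r}
library(dplyr)
library(tidyr)

# Hypothetical plot-level carbon; plot P2 is missing in 2021
toy_panel <- tibble(
  measure_date = as.Date(c("2020-10-01", "2020-10-01", "2021-10-01")),
  plotID = c("P1", "P2", "P1"),
  plot_kgCm2 = c(4, 6, 5)
)

# Going wide exposes the missing plot as NA, na.omit() drops that date,
# and going long again restores one row per plot and date
toy_site <- toy_panel |>
  pivot_wider(names_from = plotID, values_from = plot_kgCm2) |>
  na.omit() |>
  pivot_longer(-measure_date, names_to = "plotID", values_to = "plot_kgCm2") |>
  summarise(mean_kgCperm2 = mean(plot_kgCm2),
            sd_kgCperm2 = sd(plot_kgCm2),
            .by = measure_date)
```

Only the 2020 date remains, so the site mean is computed from the same set of plots on every retained date.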
@fig-site-live-carbon is the site-level carbon calculated by taking the mean only of the plots that were measured each year.
```{r}
#| fig-cap: Site level carbon in living trees for the focal NEON site
#| label: fig-site-live-carbon
ggplot(site_live_carbon, aes(x = measure_date, y = mean_kgCperm2)) +
geom_point() +
geom_errorbar(aes(ymin=mean_kgCperm2-sd_kgCperm2, ymax=mean_kgCperm2+sd_kgCperm2), width=.2,
position=position_dodge(0.05)) +
theme_bw()
```
### Calculate carbon in dead trees
We will now use the `allodb` package to estimate the carbon in standing dead trees. The steps are the same as above, except that we select the trees with a standing dead status.
```{r}
combined_table_dead_tower <- combined_table_dbh |>
filter(str_detect(plantStatus, "Standing dead"),
plotType == "tower") |>
mutate(stemDiameter = as.numeric(stemDiameter))
```
Calculate the biomass of each tree in the table. This assumes that a standing dead tree holds the same carbon as a live tree of the same diameter, which is a simplification (dead trees lose mass as they decay and lose branches).
```{r}
tree_dead_carbon <- combined_table_dead_tower |>
mutate(ag_tree_kg = get_biomass(
dbh = combined_table_dead_tower$stemDiameter,
genus = combined_table_dead_tower$GENUS,
species = combined_table_dead_tower$SPECIES,
coords = c(longitude, latitude)
),
bg_tree_kg = ag_tree_kg * ag_bg_prop,
tree_kgC = (ag_tree_kg + bg_tree_kg) * 0.5)
```
Calculate the plot level carbon.
```{r}
measurement_dates <- tree_dead_carbon |>
summarise(measure_date = max(date), .by = eventID)
plot_dead_carbon <- tree_dead_carbon |>
left_join(measurement_dates, by = "eventID") |>
mutate(treeC_kgCm2 = (tree_kgC)/(totalSampledAreaTrees)) |>
summarise(plot_kgCm2 = sum(treeC_kgCm2, na.rm = TRUE), .by = c("plotID", "measure_date"))
```
@fig-plot-dead-carbon shows the plot-level carbon in dead trees.
```{r}
#| fig-cap: Plot level carbon in dead trees for the focal NEON site
#| label: fig-plot-dead-carbon
ggplot(plot_dead_carbon, aes(x = measure_date, y = plot_kgCm2, color = plotID)) +
geom_point() +
geom_line() +
theme_bw()
```
Calculate site level carbon in dead trees from the plots measured each year.
```{r}
site_dead_carbon <- plot_dead_carbon |>
filter(plotID %in% last_plots) |>
group_by(measure_date) |>
summarize(mean_kgCperm2 = mean(plot_kgCm2, na.rm = TRUE),
sd_kgCperm2 = sd(plot_kgCm2))
```
@fig-site-dead-carbon is the site-level carbon.
```{r}
#| fig-cap: Site level carbon in dead trees for the focal NEON site
#| label: fig-site-dead-carbon
ggplot(site_dead_carbon, aes(x = measure_date, y = mean_kgCperm2)) +
geom_point() +
geom_line() +
theme_bw()
```
## Calculate carbon in trees on the ground (coarse woody debris)
While the code above calculates the carbon in standing dead trees, it misses the carbon in dead trees that are no longer standing (called coarse woody debris). The coarse woody debris is another component of SOM in our simple forest model.
The data needed to calculate carbon in trees that are lying on the ground are in two NEON data products.
```{r}
#| message: false
cdw_density <- neon_cloud("cdw_densitydisk",
product = "DP1.10014.001",
site = site) |>
collect()
log_table <- neon_cloud("cdw_densitylog",
product = "DP1.10014.001",
site = site,
unify_schemas = TRUE) |>
collect()
cdw_tally <- neon_cloud("cdw_fieldtally",
product = "DP1.10010.001",
site = site) |>
collect()
```
We will go through the same steps to calculate carbon in the coarse woody debris.
```{r}
## Filter by tower plot for log table
log_table_filter <- log_table |>
filter(plotType == "tower",
plotID %in% last_plots)
## Filter by tower plot for cdw table
cdw_tally <- cdw_tally |>
filter(plotType == 'tower',
plotID %in% last_plots)
## Create an empty column to hold the density (g/cm^3) of each log
log_table_filter$gcm3 <- rep(NA, nrow(log_table_filter))
## Set site specific volume factor
site_volume_factor <- 8
for (i in 1:nrow(log_table_filter)){
## Match log table sampleID to cdw density table sample ID
ind <- which(cdw_density$sampleID == log_table_filter$sampleID[i])
## Produce g/cm^3 by multiplying the bulk density of the disk by the site volume factor
log_table_filter$gcm3[i] <- mean(cdw_density$bulkDensDisk[ind]) * site_volume_factor
}
year_measurement <- min(log_table_filter$yearBoutBegan)
## Table of coarse wood
site_cwd_carbon <- log_table_filter |>
summarize(mean_kgCperm2 = .5 * sum(gcm3, na.rm = TRUE) * .1) |>
mutate(year = year_measurement)
```
## Calculate carbon in fine roots
We lump fine root carbon into the wood stem stock in the simple forest model. Here we are going to calculate the carbon stored in fine roots using the root chemistry data product. We will calculate the carbon in both dead and alive roots. Though we are interested mostly in live roots, at the time of writing this, the 2021 NEON data for our site does not have `rootStatus` data available. Thus we will use historical data to compute an estimate of the ratio so that we don't have to throw away perfectly good information.
```{r}
#| message: false
## root chemistry data product
bbc_percore <- neon_cloud("bbc_percore",
product = "DP1.10067.001",
site = site) |>
collect()
rootmass <- neon_cloud("bbc_rootmass",
product = "DP1.10067.001",
site = site) |>
collect()
```
```{r}
rootmass$year = year(rootmass$collectDate)
## set variables for liveDryMass, deadDryMass, unkDryMass, area
rootmass$liveDryMass <- rep(0, nrow(rootmass))
rootmass$deadDryMass <- rep(0, nrow(rootmass))
rootmass$unkDryMass <- rep(0, nrow(rootmass))
rootmass$area <- rep(NA, nrow(rootmass))
for (i in 1:nrow(rootmass)){
## match by sample ID
ind <- which(bbc_percore$sampleID == rootmass$sampleID[i])
## extract core sample area
rootmass$area[i] <- bbc_percore$rootSampleArea[ind]
## categorize mass as live, dead, or unknown
if (is.na(rootmass$rootStatus[i])){
rootmass$unkDryMass[i] <- rootmass$dryMass[i]
} else if (rootmass$rootStatus[i] == 'live'){
rootmass$liveDryMass[i] <- rootmass$dryMass[i]
} else if (rootmass$rootStatus[i] == 'dead'){
rootmass$deadDryMass[i] <- rootmass$dryMass[i]
} else{
rootmass$unkDryMass[i] <- rootmass$dryMass[i]
}
}
##
site_roots <- rootmass |>
## Filter plotID to only our plots of interest
filter(plotID %in% last_plots) |>
## group by year
group_by(year) |>
## sum live, dead, unknown root masses. multiply by
## .5 for conversion to kgC/m^2
summarize(mean_kgCperm2_live = .5*sum(liveDryMass/area, na.rm = TRUE)/1000,
mean_kgCperm2_dead = .5*sum(deadDryMass/area, na.rm = TRUE)/1000,
mean_kgCperm2_unk = .5*sum(unkDryMass/area, na.rm = TRUE)/1000,
year_total = sum(c(mean_kgCperm2_dead, mean_kgCperm2_live, mean_kgCperm2_unk)) / length(unique(plotID)),
med_date = median(collectDate)) |>
rename(mean_kgCperm2 = year_total) |>
select(year, mean_kgCperm2)
```
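The loop above assigns each root sample's dry mass to a live, dead, or unknown column. As a rough sketch of the same bookkeeping on a toy table (hypothetical values, not NEON data), the classification can also be written without a loop using `if_else`. This is an alternative formulation for illustration, not the method used in this chapter.

```{r}
library(dplyr)

# Hypothetical root samples for illustration only
toy_roots <- tibble(
  rootStatus = c("live", "dead", NA, "live"),
  dryMass = c(1.2, 0.4, 0.9, 0.5)
)

# Treat a missing status as unknown, then route the mass to one column
toy_roots <- toy_roots |>
  mutate(status = coalesce(rootStatus, "unknown"),
         liveDryMass = if_else(status == "live", dryMass, 0),
         deadDryMass = if_else(status == "dead", dryMass, 0),
         unkDryMass = if_else(!status %in% c("live", "dead"), dryMass, 0))
```

Each row contributes its dry mass to exactly one of the three columns, so the three column sums add up to the total dry mass.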
## Calculate carbon in soils
The video below provides an introduction to the science of soil carbon and methods for measuring it.
{{< video https://www.youtube.com/embed/khfIC5TpyPQ?si=f1ldTnTVpA8UWbb7 >}}
Soil carbon data is contained in two NEON data products: one that describes the physical characteristics of the soil (depth and density) and another that describes the carbon concentration of the soil. Ultimately multiplying the density by the carbon concentration gives the total carbon.
```{r}
#| message: false
#Download biogeochemistry soil data to get carbon concentration
#data_product1 <- "DP1.00097.001"
#Download physical soil data to get the bulk density
mgc_perbiogeosample <- neon_cloud("mgp_perbiogeosample",
product = "DP1.00096.001",
site = site) |>
collect()
mgp_perbulksample <- neon_cloud("mgp_perbulksample",
product = "DP1.00096.001",
site = site) |>
collect()
```
This code pulls out the relevant columns from the data that were read in above.
```{r}
bulk_density <- mgp_perbulksample |>
filter(bulkDensSampleType == "Regular") |>
select(horizonName,bulkDensExclCoarseFrag)
#gramsPerCubicCentimeter
horizon_carbon <- mgc_perbiogeosample |>
filter(biogeoSampleType == "Regular") |>
select(horizonName,biogeoTopDepth,biogeoBottomDepth,carbonTot)
year <- year(as_date(mgp_perbulksample$collectDate[1]))
```
The code below:
- Joins the bulk density table to the table with the carbon concentration.
- Determines the thickness of each horizon (`biogeoBottomDepth - biogeoTopDepth`) and converts it to the mass of soil in the horizon using the bulk density.
- Multiplies the carbon concentration (`carbonTot`) by the mass of soil (along with a unit conversion) to get the soil carbon in kg C / m2.
```{r}
#Unit notes
#bulkDensExclCoarseFrag = gramsPerCubicCentimeter
#carbonTot = gramsPerKilogram
#Combine and calculate the carbon of each horizon
horizon_combined <- inner_join(horizon_carbon,bulk_density, by = "horizonName") |>
#Convert bulk density (g per cm3) to soil mass per area (g per cm2) by multiplying by layer thickness (cm)
mutate(horizon_soil_g_per_cm2 = (biogeoBottomDepth - biogeoTopDepth) * bulkDensExclCoarseFrag) |>
#carbonTot is in g C per kg soil but soil mass is in g per cm2, so convert kg soil to g soil
mutate(CTot_g_per_g_soil = carbonTot*(1/1000), #Units are g C per g soil
horizon_C_g_percm2 = CTot_g_per_g_soil*horizon_soil_g_per_cm2, #Units are g C per cm2
horizon_C_kg_per_m2 = horizon_C_g_percm2 * 10000 / 1000) |> #Units are kg C per m2
select(-CTot_g_per_g_soil,-horizon_C_g_percm2) |>
arrange(biogeoTopDepth)
```
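The unit conversions are easy to get wrong, so here is the same arithmetic for one hypothetical horizon. The three input values are made up for illustration; the conversion factors are the ones used in the code above.

```{r}
# Hypothetical horizon values for illustration only
toy_thickness_cm <- 20      # biogeoBottomDepth - biogeoTopDepth
toy_bulk_density <- 1.4     # g soil per cm3
toy_carbonTot <- 10         # g C per kg soil

# Soil mass per unit ground area (g soil per cm2)
toy_soil_g_per_cm2 <- toy_thickness_cm * toy_bulk_density

# Carbon per unit ground area: g C per g soil times g soil per cm2
toy_C_g_per_cm2 <- (toy_carbonTot / 1000) * toy_soil_g_per_cm2

# 10,000 cm2 per m2 and 1,000 g per kg give kg C per m2
toy_C_kg_per_m2 <- toy_C_g_per_cm2 * 10000 / 1000
toy_C_kg_per_m2
```

A 20 cm horizon with these values holds 28 g of soil per cm2, 0.28 g C per cm2, and therefore 2.8 kg C per m2.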
The soil carbon can be visualized by depth (@fig-som-horizon).
```{r}
#| fig-cap: Soil carbon by depth for the site from the megapit.
#| label: fig-som-horizon
ggplot(horizon_combined, mapping = aes(-biogeoTopDepth, horizon_C_kg_per_m2)) +
geom_line() +
geom_point() +
labs(y = "Carbon", x = "Depth", title = "Soil carbon by depth") +
coord_flip() +
theme_bw()
```
Total soil carbon is the sum across the depths.
```{r}
site_soil_carbon <- horizon_combined |>
summarize(soilC_gC_m2 = sum(horizon_C_kg_per_m2))
```
## Combine
Next, we will combine our estimates of carbon in each component for visualization and to aggregate into the wood and som stocks below.
```{r}
site_live_carbon <- site_live_carbon |>
mutate(variable = "live_tree") |>
rename(datetime = measure_date) |>
select(datetime, variable, mean_kgCperm2)
site_dead_carbon <- site_dead_carbon |>
mutate(variable = "dead_trees") |>
rename(datetime = measure_date) |>
select(datetime, variable, mean_kgCperm2)
site_cwd_carbon <- site_cwd_carbon |>
mutate(variable = "down_wood") |>
mutate(datetime = as_date(paste(year, "01-01"))) |>
select(datetime, variable, mean_kgCperm2)
site_roots <- site_roots |>
mutate(variable = "fine_roots") |>
mutate(datetime = as_date(paste(year, "01-01"))) |>
select(datetime, variable, mean_kgCperm2)
site_soil_carbon <- site_soil_carbon |>
mutate(variable = "soil_carbon") |>
rename(mean_kgCperm2 = soilC_gC_m2) |>
mutate(datetime = as_date(paste(year, "01-01"))) |>
select(datetime, variable, mean_kgCperm2)
total_carbon_components <- bind_rows(site_live_carbon, site_dead_carbon, site_cwd_carbon, site_roots, site_soil_carbon)
```
The different pools of carbon can be plotted on the same figure to compare their magnitudes (@fig-all-site-carbon).
```{r}
#| warning: false
#| fig-cap: Site-level carbon stocks at the focal NEON site
#| label: fig-all-site-carbon
total_carbon_components |>
ggplot(aes(x = datetime, y = mean_kgCperm2, color = variable)) +
geom_point() +
theme_bw()
```
Combine pools of carbon to match the stocks used in our simple process model.
- wood = live trees (stem and coarse roots) + fine roots
- som = dead trees + down wood + soil carbon
```{r}
total_carbon_simple <- total_carbon_components |>
pivot_wider(names_from = variable, values_from = mean_kgCperm2) |>
mutate(wood = live_tree + mean(fine_roots, na.rm = TRUE),
som = mean(dead_trees, na.rm = TRUE) + mean(down_wood, na.rm = TRUE) + mean(soil_carbon, na.rm = TRUE),
som = ifelse(datetime != min(datetime), NA, som)) |>
select(datetime, wood, som) |>
pivot_longer(-datetime, names_to = "variable", values_to = "observation")
```
## MODIS LAI
The leaf area index (LAI) can be used as a proxy for leaf carbon. The forest model converts leaf carbon into LAI using a leaf mass-to-area parameter. As a result, we can use LAI from the MODIS satellite sensor to constrain and evaluate LAI predictions. The MODIS LAI product (MCD15A2H) is an 8-day composite for a 500 m grid cell.
{{< video https://www.youtube.com/embed/n9t_ANefhjU?si=U4OYgoq83Uu2nu57 >}}
Download the leaf area index for the focal NEON site using the `MODISTools` package.
```{r}
lai <- MODISTools::mt_subset(product = "MCD15A2H",
lat = latitude,
lon = longitude,
band = c("Lai_500m", "FparLai_QC"),
start = as_date(min(total_carbon_simple$datetime)),
end = Sys.Date(),
site_name = site,
progress = FALSE)
lai_cleaned <- lai |>
mutate(scale = ifelse(band == "FparLai_QC", 1, scale),
scale = as.numeric(scale),
value = scale * value,
datetime = lubridate::as_date(calendar_date)) |>
select(band, value, datetime) |>
pivot_wider(names_from = band, values_from = value) |>
filter(FparLai_QC == 0) |>
rename(observation = Lai_500m) |>
mutate(variable = "lai") |>
select(datetime, variable, observation)
```
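The cleaning steps above can be traced on a toy table. The sketch below assumes a long layout like the one processed above, with one row per band and date and a character `scale` column that is missing for the quality-control band; all values are hypothetical.

```{r}
library(dplyr)
library(tidyr)

# Hypothetical long-format table for illustration only
toy_lai <- tibble(
  band = c("Lai_500m", "FparLai_QC", "Lai_500m", "FparLai_QC"),
  value = c(32, 0, 25, 8),
  scale = c("0.1", NA, "0.1", NA),
  calendar_date = c("2020-06-01", "2020-06-01", "2020-06-09", "2020-06-09")
)

# Give the QC band a scale of 1, apply the scale, then put the two bands
# side by side so that LAI can be filtered on its QC flag
toy_lai_clean <- toy_lai |>
  mutate(scale = ifelse(band == "FparLai_QC", 1, scale),
         scale = as.numeric(scale),
         value = scale * value) |>
  select(band, value, calendar_date) |>
  pivot_wider(names_from = band, values_from = value) |>
  filter(FparLai_QC == 0)
```

Only the first date passes the quality filter, and its raw value of 32 becomes an LAI of 3.2 after scaling.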
@fig-modis-lai is the LAI for the focal NEON site.
```{r}
#| warning: false
#| fig-cap: MODIS LAI for the 500m grid cell that includes the flux tower
#| label: fig-modis-lai
lai_cleaned |>
ggplot(aes(x = datetime, y = observation)) +
geom_point() +
geom_smooth(span = 0.12) +
theme_bw()
```
## Flux data
NEE flux data is used to help constrain the net of photosynthesis and respiration in the simple forest model. It is already processed for use in the NEON Ecological Forecasting Challenge. Here we read in that data.
Learn about flux data here:
{{< video https://www.youtube.com/embed/CR4Anc8Mkas?si=45r0mDvamjYGWq1E >}}
```{r}
url <- "https://sdsc.osn.xsede.org/bio230014-bucket01/challenges/targets/project_id=neon4cast/duration=P1D/terrestrial_daily-targets.csv.gz"
flux <- read_csv(url, show_col_types = FALSE) |>
filter(site_id %in% site,
variable == "nee") |>
mutate(datetime = as_date(datetime)) |>
select(datetime, variable, observation)
```
@fig-nee-obs is the daily mean NEE for the focal NEON site.
```{r}
#| warning: false
#| echo: false
#| fig-cap: Daily mean NEE from the flux tower at the focal NEON site
#| label: fig-nee-obs
ggplot(flux, aes(x = datetime, y = observation)) +
geom_point() +
theme_bw()
```
## Combine to create data constraints
The units of the carbon stocks and nee need to be converted to the units of the forest process model. The carbon stocks are converted from kgC/m2 to MgC/ha and nee is converted from gC/m2/day to MgC/ha/day.
```{r}
obs <- total_carbon_simple |>
bind_rows(lai_cleaned, flux) |>
mutate(site_id = site) |>
#convert stocks from kgC/m2 to MgC/ha and nee from gC/m2/day to MgC/ha/day
mutate(observation = ifelse(variable %in% c("wood", "som") , observation * 10, observation),
observation = ifelse(variable %in% c("nee") , observation * 0.01, observation))
```
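The two multipliers follow directly from the unit definitions. The short sketch below derives them; the variable names are introduced here for illustration only.

```{r}
# Unit facts used in the conversion
m2_per_ha <- 10000
kg_per_Mg <- 1000
g_per_Mg <- 1e6

# kg C / m2 -> Mg C / ha
stock_factor <- m2_per_ha / kg_per_Mg

# g C / m2 / day -> Mg C / ha / day
flux_factor <- m2_per_ha / g_per_Mg
```

The results are 10 for the stocks and 0.01 for `nee`, matching the factors applied to `observation`.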
@fig-combined-all shows the combined data, with the variable names converted to the names used in the forest process model.
```{r}
#| warning: false
#| label: fig-combined-all
#| fig-cap: The data available to constrain the forest process model.
obs |>
ggplot(aes(x = datetime, y = observation)) +
geom_point() +
facet_wrap(~variable, scales = "free_y") +
theme_bw()
```
Save the observations to a CSV file.
```{r}
write_csv(obs, "data/site_carbon_data.csv")
```
Now, we have a complete, up-to-date carbon budget file that is stored in a format compatible with our simple forest process model. This will allow us to calibrate parameters, assimilate data, and evaluate forecasts. We will use this file in the subsequent chapters.