fcc_funding.qmd

---
title: "FCC Funding Map"
date: last-modified
format: 
  html:
    code-fold: true
engine: knitr
---
```{r}
#| label: utility functions
source("R/table_with_options.R")
```

## Important links:

Link: [https://fundingmap.fcc.gov/home](https://fundingmap.fcc.gov/home)

Documentation: [https://us-fcc.app.box.com/v/bfm-data-downloads-output](https://us-fcc.app.box.com/v/bfm-data-downloads-output)

The download page has two tabs:

- [Funding data](https://fundingmap.fcc.gov/data-download/funding-data)

- [Unserved/unfunded](https://fundingmap.fcc.gov/data-download/unserved-data)

# Funding data

## Program Data 

It is organized by Agency and can be downloaded by projects (`fundingdata_projectXXXX`) or for all projects in a program  (`fundingdata_programXX`).

For each download it will have a `csv` about the program. For example RDOF is just a two rows csv, header included.

## Project Data

FCC is defining 3 types of projects: 

- Defined by "Area"

- Defined by "List of locations"

- Defined by "Middle Mile" (No project representing this one: 02-05-2024)

All of those types of project will have a Project Attribute Information table (`areaattributes_program` or `locationattributes_program`). The structure of those files are close but not similar for example location project has columns related to locations (`build_req`, `loc_plan`, `loc_sup`). 

The one for RDOF (`areaattributes_program24_J23_12feb2024.csv`) has 474 rows (inclunding headers ie 473 projects). 

Area projects will have an associated `areapolygons_XXX.gpkg`. A quick glance on the one from RDOF show full valid geometries with an expected number of rows (473).

Location projects have, instead, of a `gpkg` a `csv` with `location_id`, `lat`/`long`  and addresses field (but those last those are not filled).

## List of dataset avalaible: 07-03-2024

```{r}
#|  label: very ugly code that do the job
agency_name <- "Federal Communications Commission"
program_name <- c("Bringing Puerto Rico Together",
                "Connect America Fund Phase II",
                "Connect USVI", 
                "Enhanced Alternative Connect America Cost Model",
                "Rural Digital Opportunity Fund")
program_id <-  c("25", "28", "26", "35", "24")
fcc_dat <- data.frame(agency_name = rep(agency_name, length(program_name)),
                     program_name,
                     program_id 
                      )
agency_name <- "NTIA"
program_name <- c(
  "Broadband Infrastructure Program",
  "Tribal Broadband Connectivity Program NOFO 1"
)
program_id <-  c("11", "27")
ntia_dat <- data.frame(agency_name = rep(agency_name, length(program_name)), 
                      program_name,
                      program_id 
                      )
agency_name <- "Rural Utilities Service"
program_name <- c(
  "COMMUNITY CONNECT GRANT PROGRAM",
  "RURAL ECONNECTIVITY PROGRAM",
  "TELEPHONE LOAN PROGRAM"
)
program_id <-  c("10", "6", "12")
rural_dat <- data.frame(agency_name = rep(agency_name, length(program_name)), 
                      program_name,
                      program_id 
                      )
agency_name <- "US Department of Treasury"
program_name <- c(
  "Capital Projects Fund",
  "State and Local Fiscal Recovery Fund"
)
program_id <-  c("18", "19")
usdt_dat <- data.frame(agency_name = rep(agency_name, length(program_name)), 
                      program_name,
                      program_id 
                      )

fcc_all_dat <- rbind(fcc_dat, ntia_dat, rural_dat, usdt_dat )

# ls > path/to/list_file_fcc_feb2024.txt
fcc_files <- readLines("data/list_file_fcc_feb2024.txt")
# remove zip
fcc_files_slim <- fcc_files[!grepl(".zip", fcc_files)]

fcc_files_tidy <- as.data.frame(
                                do.call(rbind, 
                                        strsplit(fcc_files_slim, "_"))
)
# remove programdata, but it is nice to see for every files 
fcc_files_tidy <- fcc_files_tidy[fcc_files_tidy[["V1"]] != "programdata",]

fcc_files_tidy[["program_id"]] <- gsub("program", "", fcc_files_tidy[["V2"]])

fcc_files_tidy[["is_area"]] <- grepl("area", fcc_files_tidy[["V1"]])

first_V4 <- function(x) {unlist(strsplit(x, ".", fixed = TRUE))[1]}

fcc_files_tidy[["file_release"]] <- sapply(fcc_files_tidy[["V4"]], first_V4)

# works for now but will breack if I have the third type of project
fcc_files_tidy[["type_proj"]] <- ifelse(fcc_files_tidy[["is_area"]], "area", "location")

type_proj_temp <- sapply(split(fcc_files_tidy[["type_proj"]], 
             fcc_files_tidy[["program_id"]]), 
             unique)
file_release <- sapply(split(fcc_files_tidy[["file_release"]], 
             fcc_files_tidy[["program_id"]]), 
             unique) 

type_proj <- data.frame(
  program_id = names(type_proj_temp),
  type_proj  = type_proj_temp, 
  # a bit lazy and should be a join 
  file_release = file_release
)

fcc_all_dat <- merge(fcc_all_dat, type_proj,
      by.x = "program_id", by.y = "program_id", 
      all.x = TRUE, all.y = TRUE)

table_with_options(fcc_all_dat)
```

## From FCC program csv

I just stacked them:

```bash
csvstack data/programdata_program* > data/all_program.csv
```

Then remove their `elig_rules_desc` and `program_desc` so it can fit in a table. 

```{r} 
all_prog <- read.csv("data/all_program.csv")
list_of_names_to_keep <- c("agency_name" ,  "program_id",
                          "program_start_date","program_end_date",    "funding_source",   "funding_type",        "funding_obligated",  "funding_disbursed",   "funding_defaulted",   "min_download_spd",    "min_upload_spd",     "low_latency",      "funding_grant",       "program_cost",       "funding_loan",        "assistance_listings", "program_acronym",     "program_url"  
)
table_with_options(all_prog[, list_of_names_to_keep])
```

# Unserved / Unfunded 

The data is available by State and recorded at the location level (`location_id`) caracterized by their census block (`block_geoid`), H3 id (`h3_res8_id`) and a coninations of services described below. 

The service can be "residential" (`r`) or "business" (`b`).

Each services is categorized as `U` for Unserved and unfunded or `C` meaning the location is Covered (ie "availability data and or funding exist at the selected technology/speed combination if that location"). 

- *wired*: Copper, Cable, fiber

- *terrestrial*: Copper, Cable, Fiber, Unlicensed Fixed Wireless, Licensed Fixed Wireless, LBR Wireless

- *wiredlfw*: Copper, Cable, Fiber, Licensed Fixed Wireless, LBR Wireless

If a location is all 'C' it will not be in thoses files. In march 07 we got 36 247 609 locations.


```{r}
#| fig-cap: Count of Unserved/Unfunded by type of services    
sum_unserved_unfunded <- read.csv("data/unfunded_unserved.csv")
temp <- as.data.frame(t(sum_unserved_unfunded))
dat <- cbind(temp,
            do.call(rbind, strsplit(row.names(temp), '_'))
)
names(dat) <- c("Count", "cnt", "technology", "dl", "ul", 'res/biz')
dat <- dat[, c("technology", "res/biz",  "dl", "ul", "Count")]

table_with_options(dat)
```