You might encounter a number of errors when using rATTAINS. Here is a list of potential errors and fixes. Feel free to raise an issue if I missed something.
Network Connectivity
If a call such as the following fails with a connection error, there is likely an issue connecting to the EPA server:
state_summary(organization_id = "TCEQMAIN", reporting_cycle = "2022")
Potential issues/fixes:
- Check your network connection.
- Check attains.epa.gov. If you are able to connect, a warning notice about accessing U.S. Government information systems should show in your web browser.
- Proxies on corporate networks occasionally cause connection issues (see: https://stackoverflow.com/questions/59796178/r-curlhas-internet-false-even-though-there-are-internet-connection). I’ve tried to account for this in the package, but you might still run into occasional issues.
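The checks above can be sketched from within R. This is only a rough diagnostic, assuming the curl package is installed; it is not something rATTAINS runs for you. Note that curl::has_internet() can report FALSE behind some proxies even when a connection exists, which is the situation described in the Stack Overflow link above.

```r
# Rough connectivity diagnostics (assumes the curl package is installed).

# TRUE if curl believes an internet connection is available.
# Behind some proxies this can be a false negative.
has_net <- curl::has_internet()

# With error = FALSE, nslookup() returns NULL instead of raising an error
# when the host name cannot be resolved.
epa_ip <- curl::nslookup("attains.epa.gov", error = FALSE)

message("Internet reachable: ", has_net)
message("attains.epa.gov resolves: ", !is.null(epa_ip))
```

If the first check is FALSE but the site loads in your web browser, a proxy is a likely culprit.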
Server Response
The server might also return an HTTP error status code. The most common are 404 (Not Found) and 429 (Too Many Requests). rATTAINS will generally raise an error with a simple message when this happens:
actions(action_id = "R8-ND-2018-03")
#> Error: Too Many Requests (HTTP 429)
Potential issues/fixes:
- Wait until the server is responsive.
- Make less frequent requests.
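One way to space out requests is to retry with an increasing delay. The helper below is a minimal sketch, not part of rATTAINS: the name retry_with_backoff and its arguments are hypothetical, and it retries on any error rather than only on HTTP 429.

```r
# Hypothetical helper (not part of rATTAINS): call f() and, if it raises an
# error, wait and try again, doubling the delay after each failed attempt.
retry_with_backoff <- function(f, max_tries = 4, base_delay = 1) {
  for (attempt in seq_len(max_tries)) {
    # Capture the error as a condition object instead of stopping.
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt == max_tries) {
      # Out of attempts: re-raise the last error.
      stop(result)
    }
    delay <- base_delay * 2^(attempt - 1)
    message("Attempt ", attempt, " failed: ", conditionMessage(result),
            "; retrying in ", delay, " s")
    Sys.sleep(delay)
  }
}

# Usage (requires network access and rATTAINS):
# retry_with_backoff(function() rATTAINS::actions(action_id = "R8-ND-2018-03"))
```

Because the helper does not inspect the error, a 404 is retried as well; in that case it simply fails again after the last attempt.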
Parsing Errors
The default behavior in rATTAINS is to parse JSON data downloaded
from the API to one or more dataframes. These are returned as a single
dataframe or a list of dataframes, depending on the function. rATTAINS also
tries to flatten the data as much as possible. This design choice
might have been a mistake because it becomes a source of errors
if the data returned by the API changes or is inconsistent.
Version 1.0.0 of the package added the .unnest argument to most
functions. Setting .unnest = FALSE should avoid many of these problems.
Default behavior:
state_summary(organization_id = "TDECWR",
              reporting_cycle = "2016")
#> # A tibble: 71 × 24
#> organization_identifer organization_name organization_type_text
#> <chr> <chr> <chr>
#> 1 TDECWR Tennessee State
#> 2 TDECWR Tennessee State
#> 3 TDECWR Tennessee State
#> 4 TDECWR Tennessee State
#> 5 TDECWR Tennessee State
#> 6 TDECWR Tennessee State
#> 7 TDECWR Tennessee State
#> 8 TDECWR Tennessee State
#> 9 TDECWR Tennessee State
#> 10 TDECWR Tennessee State
#> # ℹ 61 more rows
#> # ℹ 21 more variables: reporting_cycle <chr>, water_type_code <chr>,
#> # units_code <chr>, use_name <chr>, fully_supporting <dbl>,
#> # fully_supporting_count <int>, use_insufficient_information <dbl>,
#> # use_insufficient_information_count <int>, not_assessed <dbl>,
#> # not_assessed_count <int>, not_supporting <dbl>, not_supporting_count <int>,
#> # parameter_group <chr>, parameter_insufficient_information <dbl>, …
Using .unnest = FALSE returns nested columns. The tidyr family of
unnest() functions is an easy way to flatten this data:
df <- state_summary(organization_id = "TDECWR",
                    reporting_cycle = "2016",
                    .unnest = FALSE)
df
#> # A tibble: 1 × 4
#> organization_identifer organization_name organization_type_text
#> <chr> <chr> <chr>
#> 1 TDECWR Tennessee State
#> # ℹ 1 more variable: reporting_cycles <list<tibble[,2]>>
df |>
  tidyr::unnest(reporting_cycles) |>
  tidyr::unnest(water_types) |>
  tidyr::unnest(use_attainments)
#> # A tibble: 22 × 16
#> organization_identifer organization_name organization_type_text
#> <chr> <chr> <chr>
#> 1 TDECWR Tennessee State
#> 2 TDECWR Tennessee State
#> 3 TDECWR Tennessee State
#> 4 TDECWR Tennessee State
#> 5 TDECWR Tennessee State
#> 6 TDECWR Tennessee State
#> 7 TDECWR Tennessee State
#> 8 TDECWR Tennessee State
#> 9 TDECWR Tennessee State
#> 10 TDECWR Tennessee State
#> # ℹ 12 more rows
#> # ℹ 13 more variables: reporting_cycle <chr>, water_type_code <chr>,
#> # units_code <chr>, use_name <chr>, fully_supporting <dbl>,
#> # fully_supporting_count <int>, use_insufficient_information <dbl>,
#> # use_insufficient_information_count <int>, not_assessed <dbl>,
#> # not_assessed_count <int>, not_supporting <dbl>, not_supporting_count <int>,
#> # parameters <list<tibble[,9]>>
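If you want a script to keep running when the default flattening fails, one option is to fall back to .unnest = FALSE automatically. The wrapper below is a sketch under assumptions: with_unnest_fallback is a hypothetical name, not an rATTAINS function, and it assumes the wrapped function accepts a .unnest argument. It catches every error, so a network failure triggers the fallback call too.

```r
# Hypothetical wrapper (not part of rATTAINS): call fn(...) and, if it raises
# an error, call it again with .unnest = FALSE and warn about the fallback.
with_unnest_fallback <- function(fn, ...) {
  args <- list(...)
  tryCatch(
    do.call(fn, args),
    error = function(e) {
      warning("Default parsing failed (", conditionMessage(e),
              "); retrying with .unnest = FALSE", call. = FALSE)
      do.call(fn, c(args, list(.unnest = FALSE)))
    }
  )
}

# Usage (requires network access and rATTAINS):
# df <- with_unnest_fallback(rATTAINS::state_summary,
#                            organization_id = "TDECWR",
#                            reporting_cycle = "2016")
```

The result may be either flattened or nested, so downstream code should check which columns are present before unnesting.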
If .unnest = FALSE doesn’t work either, rATTAINS can also return the raw
JSON data from the API. The tibblify and jsonlite packages
provide tools to convert JSON to nested lists and then to tibbles fairly
easily. First, use the tidy = FALSE argument to return the
unparsed JSON string, then use jsonlite to convert that string to a
nested list, and finally use tibblify to convert the list to a nested dataframe:
raw_data <- state_summary(organization_id = "TDECWR",
                          reporting_cycle = "2016",
                          tidy = FALSE)
list_data <- jsonlite::fromJSON(raw_data,
                                simplifyVector = FALSE,
                                simplifyDataFrame = FALSE,
                                flatten = FALSE)
df <- tibblify::tibblify(list_data$data,
                         unspecified = "drop")
#> The spec contains 1 unspecified field:
#> • reportingCycles->combinedCycles
df$reportingCycles
#> # A tibble: 1 × 2
#> reportingCycle waterTypes
#> <chr> <list<tibble[,3]>>
#> 1 2016 [4 × 3]
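The same JSON-to-tibble path can be tried offline on a small payload. The JSON string below is made up for illustration: its field names loosely mimic the camelCase names shown above, but the structure and values are assumptions, not real API output. The sketch assumes the jsonlite, tibblify and tidyr packages are installed.

```r
# Made-up JSON payload (an assumption, not real ATTAINS output).
toy_json <- '{
  "data": [
    {
      "organizationIdentifier": "EXAMPLE",
      "reportingCycles": [
        {
          "reportingCycle": "2016",
          "waterTypes": [
            {"waterTypeCode": "RIVER", "unitsCode": "Miles"},
            {"waterTypeCode": "LAKE", "unitsCode": "Acres"}
          ]
        }
      ]
    }
  ]
}'

# Keep the JSON as a nested list rather than letting jsonlite simplify it.
toy_list <- jsonlite::fromJSON(toy_json, simplifyVector = FALSE)

# tibblify guesses a spec and builds a nested tibble, one row per organization.
toy_df <- tibblify::tibblify(toy_list$data)

# Flatten one nesting level at a time.
toy_flat <- toy_df |>
  tidyr::unnest(reportingCycles) |>
  tidyr::unnest(waterTypes)

toy_flat
```

Unnesting one level at a time makes it easier to see which level fails if the real API response has an unexpected shape.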