Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

fetchNASIS('pedons'): use simplifyFragmentData() for surface fragments #216

Merged
merged 6 commits into from
Jan 5, 2022

Conversation

brownag
Copy link
Member

@brownag brownag commented Nov 5, 2021

Closes #46

@brownag
Copy link
Member Author

brownag commented Nov 8, 2021

Quick test with 3 new pedons that only have site + minimal horizon data entered so far.

Before:

library(soilDB)
f <- fetchNASIS()
#> Loading required namespace: odbc
#> NOTE: all records are missing rock fragment volume
#> NOTE: all records are missing artifact volume
names(f)[grepl("surface_", names(f))]
#>                site78                site79                site80 
#>     "surface_fgravel"      "surface_gravel"     "surface_cobbles" 
#>                site81                site82                site83 
#>      "surface_stones"    "surface_boulders"    "surface_channers" 
#>                site84                site85                site86 
#>  "surface_flagstones"  "surface_paragravel" "surface_paracobbles"
f$surface_gravel
#> [1]  5  1 15
f$surface_total_frags_pct
#> NULL

After:

EDIT: fixed ordering! Was using siteiid not peiid;

library(soilDB)
f <- fetchNASIS()
#> Loading required namespace: odbc
#> NOTE: all records are missing rock fragment volume
#> NOTE: all records are missing artifact volume
names(f)[grepl("surface_", names(f))]
#>                         site51                         site52 
#>          "surface_fine_gravel"               "surface_gravel" 
#>                         site53                         site54 
#>              "surface_cobbles"               "surface_stones" 
#>                         site55                         site56 
#>             "surface_boulders"             "surface_channers" 
#>                         site57                         site58 
#>           "surface_flagstones"      "surface_parafine_gravel" 
#>                         site59                         site60 
#>           "surface_paragravel"          "surface_paracobbles" 
#>                         site61                         site62 
#>           "surface_parastones"         "surface_paraboulders" 
#>                         site63                         site64 
#>         "surface_parachanners"       "surface_paraflagstones" 
#>                         site65                         site66 
#>          "surface_unspecified" "surface_total_frags_pct_nopf" 
#>                         site67 
#>      "surface_total_frags_pct"
f$surface_gravel
#> [1]  5  1 15
f$surface_total_frags_pct
#> [1] 24 34 20

Here is a summary of things needing resolution before this can be merged:

  1. Breaking change: Switching to simplifyFragmentData would change the column name used for "fine gravel" from surface_fgravel to surface_fine_gravel
  • I am thinking we should possibly duplicate this column for the next release, using both names, so that there is a clean way to transition reports over to the new (somewhat more consistent) naming without them being broken
  1. Site observation table-related changes
  • a. The note that is printed for all siteobs missing surface fragments will be the same as if all horizons are missing frags. Need to pass an argument for handling different messages: NOTE: all records are missing surface fragments or similar.

  • b. Need better handling for the increasingly likely possibility of multiple siteobs per site, which in this scheme will be aggregated together and warnings only issued when cover % is >100, which means most of the time surface frags could be silently "duplicated" if measured on each visit

Warning message:
fragment volume >= 100%
siteiid:
####### 
  1. In long run I think it would be good to combine the simplifyFragmentData/simplifyArtifactData/"simplifySurfaceFragmentData" to use a single function of which there are three aliases/wrappers. This would incorporate 2a as well as some more generic handling of some of the artifact-related summaries (innocuous, persistent etc.)

@brownag brownag mentioned this pull request Nov 8, 2021
@dylanbeaudette
Copy link
Member

Thanks for making these changes. That TODO has been in there for a long time.

I agree with your suggestions on items 1-3. We may need to craft some example data with multiple siteObs... I hadn't thought about that possibility.

A new sieveCoarseFraction is a good idea. The un-exported helper functions .sieve and .rockFragmentSieve do most of the work right now. It wouldn't be hard to move thresholds into arguments for systems outside of the USDA "ruler".

@brownag
Copy link
Member Author

brownag commented Nov 8, 2021

Edited above post. Fixed an issue with using siteiid and unintentional assumed sorting matching that of peiid. Now the peiid column is returned by the surface frags table query and passed to simplifyFragmentData.

@brownag brownag merged commit 60ca7d7 into master Jan 5, 2022
@brownag brownag deleted the fetchnasispedons-sfrags branch January 7, 2022 18:25
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NASIS surface rf summary should use soilDB::simplifyFragmentData()
2 participants