Overview


This document was prepared for the virtual workshop, Georeferencing for Paleo: Refreshing the approach to fossil localities. Our goal here is to explore the georeferenced data that paleontological collections are currently providing to biodiversity data aggregators, namely iDigBio and GBIF. In particular, we want to know…

  1. How prevalent is georeferencing in our data? (Workshop Day 1)
  2. What standard terms are in use? (Workshop Day 1)
  3. How are standard terms being used? (Workshop Day 2)
# Load core libraries; install these packages if you have not already
library(ridigbio)
library(tidyverse)
library(wordcloud)

# Load library for making nice HTML output
library(kableExtra)

What data are we looking at?

Data in this example (unless otherwise noted) were downloaded from iDigBio on 2020-02-04 using the query: basisofrecord = “fossilspecimen”. A data download from iDigBio includes both the raw data, as published by the data provider (e.g. the collection), and a second version of the same data that has been processed by iDigBio. You can learn more about the difference between the raw and processed recordsets contained in an iDigBio data download in this blog post.

# Read into R the raw occurrence data, which should be whatever was published by
# the data provider (e.g. the collection)
raw_idb <- read_csv("4336327f-dae0-4877-9d6d-460cb3a6ef13/occurrence_raw.csv", 
                    na = character(),
                    col_types = cols())

# Read into R the version of occurrence data processed by iDigBio
processed_idb <- read_csv("4336327f-dae0-4877-9d6d-460cb3a6ef13/occurrence.csv", 
                          na = character(),
                          col_types = cols())

# Count how many total records are present in `processed_idb`
records_total <- nrow(processed_idb)

# Count how many records are georeferenced in `processed_idb`
records_georef <- processed_idb %>% 
  filter(`idigbio:geoPoint` != "") %>% 
  nrow()

Our data here comprise 57 provider datasets representing a total of 5,569,112 specimen records.
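As a rough illustration of how the two counts above could be derived, the sketch below uses a small made-up tibble in place of `processed_idb`. It assumes that each provider dataset corresponds to one distinct value in the `idigbio:recordset` column; the names `toy_processed`, `toy_records_total`, and `toy_datasets_total` are hypothetical and the values are invented.

```r
library(dplyr)

# Toy stand-in for `processed_idb` (values are made up for illustration)
toy_processed <- tibble(
  coreid = c("a", "b", "c", "d"),
  `idigbio:recordset` = c("rs-1", "rs-2", "rs-2", "rs-3")
)

# Total number of specimen records
toy_records_total <- nrow(toy_processed)

# Number of provider datasets, ASSUMING one recordset per provider dataset
toy_datasets_total <- n_distinct(toy_processed$`idigbio:recordset`)

# Format the record count with thousands separators for reporting
format(toy_records_total, big.mark = ",")
```

Applied to the real table, the same two calls would be run on `processed_idb` directly.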

Example of what the raw provider data look like

 

coreid aec:associatedTaxa dc:rights dcterms:accessRights dcterms:bibliographicCitation dcterms:language dcterms:license dcterms:modified dcterms:references dcterms:rights dcterms:rightsHolder dcterms:source dcterms:type dwc:Identification dwc:MeasurementOrFact dwc:ResourceRelationship dwc:VerbatimEventDate dwc:acceptedNameUsage dwc:acceptedNameUsageID dwc:accessRights dwc:associatedMedia dwc:associatedOccurrences dwc:associatedOrganisms dwc:associatedReferences dwc:associatedSequences dwc:associatedTaxa dwc:basisOfRecord dwc:bed dwc:behavior dwc:catalogNumber dwc:class dwc:classs dwc:collectionCode dwc:collectionID dwc:continent dwc:coordinatePrecision dwc:coordinateUncertaintyInMeters dwc:country dwc:countryCode dwc:county dwc:dataGeneralizations dwc:datasetID dwc:datasetName dwc:dateIdentified dwc:day dwc:decimalLatitude dwc:decimalLongitude dwc:disposition dwc:dynamicProperties dwc:earliestAgeOrLowestStage dwc:earliestEonOrLowestEonothem dwc:earliestEpochOrLowestSeries dwc:earliestEraOrLowestErathem dwc:earliestPeriodOrLowestSystem dwc:endDayOfYear dwc:establishmentMeans dwc:eventDate dwc:eventID dwc:eventRemarks dwc:eventTime dwc:family dwc:fieldNotes dwc:fieldNumber dwc:footprintSRS dwc:footprintSpatialFit dwc:footprintWKT dwc:formation dwc:genus dwc:geodeticDatum dwc:geologicalContextID dwc:georeferenceProtocol dwc:georeferenceRemarks dwc:georeferenceSources dwc:georeferenceVerificationStatus dwc:georeferencedBy dwc:georeferencedDate dwc:group dwc:habitat dwc:higherClassification dwc:higherGeography dwc:higherGeographyID dwc:highestBiostratigraphicZone dwc:identificationID dwc:identificationQualifier dwc:identificationReferences dwc:identificationRemarks dwc:identificationVerificationStatus dwc:identifiedBy dwc:individualCount dwc:informationWithheld dwc:infraspecificEpithet dwc:institutionCode dwc:institutionID dwc:island dwc:islandGroup dwc:kingdom dwc:language dwc:latestAgeOrHighestStage dwc:latestEonOrHighestEonothem dwc:latestEpochOrHighestSeries 
dwc:latestEraOrHighestErathem dwc:latestPeriodOrHighestSystem dwc:lifeStage dwc:lithostratigraphicTerms dwc:locality dwc:locationAccordingTo dwc:locationID dwc:locationRemarks dwc:lowestBiostratigraphicZone dwc:materialSampleID dwc:maximumDepthInMeters dwc:maximumElevationInMeters dwc:member dwc:minimumDepthInMeters dwc:minimumElevationInMeters dwc:modified dwc:month dwc:municipality dwc:nameAccordingTo dwc:namePublishedIn dwc:namePublishedInID dwc:namePublishedInYear dwc:nomenclaturalCode dwc:nomenclaturalStatus dwc:occurrenceDetails dwc:occurrenceID dwc:occurrenceRemarks dwc:occurrenceStatus dwc:order dwc:organismID dwc:organismName dwc:organismQuantity dwc:organismQuantityType dwc:organismRemarks dwc:originalNameUsage dwc:originalNameUsageID dwc:otherCatalogNumbers dwc:ownerInstitutionCode dwc:parentNameUsage dwc:phylum dwc:pointRadiusSpatialFit dwc:preparations dwc:previousIdentifications dwc:recordNumber dwc:recordedBy dwc:reproductiveCondition dwc:rights dwc:rightsHolder dwc:sampleSizeValue dwc:samplingEffort dwc:samplingProtocol dwc:scientificName dwc:scientificNameAuthorship dwc:scientificNameID dwc:sex dwc:specificEpithet dwc:startDayOfYear dwc:stateProvince dwc:subgenus dwc:taxonID dwc:taxonRank dwc:taxonRemarks dwc:taxonomicStatus dwc:typeStatus dwc:verbatimCoordinateSystem dwc:verbatimCoordinates dwc:verbatimDepth dwc:verbatimElevation dwc:verbatimEventDate dwc:verbatimLatitude dwc:verbatimLocality dwc:verbatimLongitude dwc:verbatimSRS dwc:verbatimTaxonRank dwc:vernacularName dwc:waterBody dwc:year gbif:Identifier gbif:Reference idigbio:recordId symbiota:recordEnteredBy symbiota:verbatimScientificName zan:ChronometricDate
3ee2f19f-046f-4c52-ab31-f9b42ed12a89 NA NA 2011-05-09 00:00:00 NA NA NA NA NA NA NA NA NA NA NA FossilSpecimen NA NA Lzzz/4510 NA Fossil NA Asia NA NA Indonesia NA NA NA NA NA NA NA NA NA NA Cervidae NA NA NA NA Axis NA NA NA NA NA NA NA NA MZLU NA NA NA NA NA Sangiran NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA MZLU:Fossil:Lzzz/4510 Artiodactyla NA NA NA NA NA NA NA NA NA Skeletal part(s) NA NA NA NA NA NA NA Axis sp NA NA NA Java NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
9702a3d1-810a-4f9a-b9e9-7bc04f54f7f4 NA NA Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj Paramys (YPM VP 059011) en http://creativecommons.org/publicdomain/zero/1.0/ 2017-03-28 16:45:37 http://collections.peabody.yale.edu/search/Record/YPM-VP-059011 Yale Peabody Museum of Natural History NA PhysicalObject NA NA NA NA NA NA NA NA NA NA FossilSpecimen NA NA YPM VP 059011 Mammalia NA VP NA North America NA NA USA NA Coordinate data unavailable NA 10 NA NA Eocene Tertiary NA NA 1963-06-10 NA NA NA Ischyromyidae NA 63-188 NA NA NA Willwood Fm Paramys NA NA NA Animalia; Chordata; Vertebrata; Amniota; Mammalia; Theriiformes—–Theria-Placentalia-Epitheria; Preptotheria-Anagalida-Simplicidentata; Rodentia; Sciuromorpha; Ischyromyoidea; Ischyromyidae; Paramyinae North America; USA; Wyoming NA NA NA NA NA 1 YPM NA NA NA Animalia NA NA NA NA NA NA NA NA NA NA 6 NA NA NA NA ICZN NA NA urn:uuid:004bd82e-de14-4917-8d21-ab9dcb39b2fb jaw fragment with tooth, 2 jaw fragments with incisors, 1 incisor, 1 incisor fragment; VP number 59011; lot count 1 Rodentia NA NA NA NA NA NA NA YPM NA Chordata NA Paramys NA Yale 1963 Wyoming (Willwood) Expedition, Yale 1963 Wyoming (Willwood) Expedition NA NA NA NA NA NA Paramys Leidy, 1871 NA NA NA Wyoming NA Genus Fossils, Rocks and Minerals: Fossils - Vertebrates NA NA NA NA NA NA NA NA NA NA NA squirrels; rodents; mammals; vertebrates; chordates; animals NA 1963 NA NA NA NA NA
07da9e61-2e81-4eb4-b7c1-74c1cac96630 NA NA Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj Deinonychus antirrhopus (YPM VP 059012) en http://creativecommons.org/publicdomain/zero/1.0/ 2017-03-22 15:31:23 http://collections.peabody.yale.edu/search/Record/YPM-VP-059012 Yale Peabody Museum of Natural History NA PhysicalObject NA NA NA NA NA NA NA NA NA NA FossilSpecimen NA NA YPM VP 059012 Reptilia NA VP NA NA NA NA NA NA NA NA NA NA NA NA NA Dromaeosauridae NA NA NA NA Deinonychus NA NA NA Animalia; Chordata; Vertebrata; Amniota; Reptilia; Diapsida; Archosauria; Saurischia; Theropoda; Dromaeosauridae NA NA NA NA NA 1 YPM NA NA NA Animalia NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ICZN NA NA urn:uuid:8a42b32c-5934-43c6-8c61-be863403fc55 Composite left manus for Teaching Collection, see notes for list of elements; VP number 59012; lot count 1 Saurischia NA NA NA NA NA NA NA YPM NA Chordata NA cast Deinonychus antirrhopus NA NA NA NA NA NA NA Deinonychus antirrhopus Ostrom, 1969 NA NA antirrhopus NA NA Species Fossils, Rocks and Minerals: Fossils - Vertebrates NA NA NA NA NA NA NA NA NA NA NA raptors; dinosaurs; Reptiles; vertebrates; chordates; animals NA NA NA NA NA NA NA
05698b27-d162-4628-92e9-3153ff67a6ab NA NA Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj Rodentia (YPM VP 059002) en http://creativecommons.org/publicdomain/zero/1.0/ 2017-03-28 16:17:01 http://collections.peabody.yale.edu/search/Record/YPM-VP-059002 Yale Peabody Museum of Natural History NA PhysicalObject NA NA NA NA NA NA NA NA NA NA FossilSpecimen NA NA YPM VP 059002 Mammalia NA VP NA North America NA NA USA NA Big Horn County Coordinate data unavailable NA 18 NA NA Eocene Tertiary NA NA 1963-06-18 NA NA NA NA 370 NA NA NA Willwood Fm NA NA NA Animalia; Chordata; Vertebrata; Amniota; Mammalia; Theriiformes—–Theria-Placentalia-Epitheria; Preptotheria-Anagalida-Simplicidentata; Rodentia North America; USA; Wyoming; Big Horn County NA NA NA NA NA 1 YPM NA NA NA Animalia NA NA NA NA NA NA NA NA NA NA 6 NA NA NA NA ICZN NA NA urn:uuid:e3c74dca-f079-4078-90f0-299b3208cf18 jaw fragment with teeth; VP number 59002; lot count 1 Rodentia NA NA NA NA NA NA NA YPM NA Chordata NA Rodentia NA Yale 1963 Wyoming (Willwood) Expedition, Yale 1963 Wyoming (Willwood) Expedition NA NA NA NA NA NA Rodentia Bowdich, 1821 NA NA NA Wyoming NA Order Fossils, Rocks and Minerals: Fossils - Vertebrates NA NA NA NA NA NA NA NA NA NA NA rodents; mammals; vertebrates; chordates; animals NA 1963 NA NA NA NA NA
8c826bb5-ba30-4357-b119-18b24541a02c NA NA NA http://ucmpdb.berkeley.edu/cgi/ucmp_query2?spec_id=V285838&one=T http://vertnet.org/resources/norms.html NA NA NA NA NA NA NA NA NA NA NA FossilSpecimen NA NA 285838 Reptilia NA V NA North America NA NA United States NA Apache County NA NA NA NA Mesozoic Late Triassic Mesozoic Triassic NA NA NA NA NA Stagonolepididae NA NA NA NA Chinle Acaenosuchus -7308 NA NA NA NA Late Triassic NA NA NA NA Location data available to qualified researchers on request. UCMP NA NA NA Animalia NA Mesozoic Late Triassic Mesozoic Triassic NA Saint Johns 2 NA -7308 NA Late Triassic NA NA NA NA NA NA NA NA NA NA NA ICZN NA NA urn:catalog:UCMP:V:285838 Aetosauria NA NA NA NA NA NA NA NA NA transverse process and osteoderms tip NA NA NA NA NA NA NA Acaenosuchus geoffreyi NA NA geoffreyi NA Arizona NA species NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
f26beca1-32ab-4c87-bc70-57af70aac9c8 NA NA NA http://ucmpdb.berkeley.edu/cgi/ucmp_query2?spec_id=V285929&one=T http://vertnet.org/resources/norms.html NA NA NA NA NA NA NA NA NA NA NA FossilSpecimen NA NA 285929 Amphibia NA V NA North America NA NA United States NA Apache County NA NA NA NA Mesozoic Late Triassic Mesozoic Triassic NA NA NA NA NA Metoposauridae NA NA NA NA Chinle -7308 NA NA NA NA Late Triassic NA NA NA NA Location data available to qualified researchers on request. UCMP NA NA NA Animalia NA Mesozoic Late Triassic Mesozoic Triassic NA Saint Johns 2 NA -7308 NA Late Triassic NA NA NA NA NA NA NA NA NA NA NA ICZN NA NA urn:catalog:UCMP:V:285929 Temnospondyli NA NA NA NA NA NA NA NA NA skull fragment NA Camp, C.L. NA NA NA NA NA NA Metoposauridae NA NA NA Arizona NA family NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

   

Example of what the data look like after being processed by iDigBio

 

coreid idigbio:associatedsequences idigbio:barcodeValue dwc:basisOfRecord dwc:bed gbif:canonicalName dwc:catalogNumber dwc:class dwc:collectionCode dwc:collectionID idigbio:collectionName dwc:recordedBy dwc:vernacularName idigbio:commonnames dwc:continent dwc:coordinateUncertaintyInMeters dwc:country idigbio:isoCountryCode dwc:county idigbio:eventDate idigbio:dateModified idigbio:dataQualityScore dwc:earliestAgeOrLowestStage dwc:earliestEonOrLowestEonothem dwc:earliestEpochOrLowestSeries dwc:earliestEraOrLowestErathem dwc:earliestPeriodOrLowestSystem idigbio:etag dwc:eventDate dwc:family dwc:fieldNumber idigbio:flags dwc:formation dwc:genus dwc:geologicalContextID idigbio:geoPoint dwc:group idigbio:hasImage idigbio:hasMedia dwc:higherClassification dwc:highestBiostratigraphicZone dwc:individualCount dwc:infraspecificEpithet dwc:institutionCode dwc:institutionID idigbio:institutionName dwc:kingdom dwc:latestAgeOrHighestStage dwc:latestEonOrHighestEonothem dwc:latestEpochOrHighestSeries dwc:latestEraOrHighestErathem dwc:latestPeriodOrHighestSystem dwc:lithostratigraphicTerms dwc:locality dwc:lowestBiostratigraphicZone dwc:maximumDepthInMeters dwc:maximumElevationInMeters idigbio:mediarecords dwc:member dwc:minimumDepthInMeters dwc:minimumElevationInMeters dwc:municipality dwc:occurrenceID dwc:order dwc:phylum idigbio:recordIds dwc:recordNumber idigbio:recordset dwc:scientificName dwc:specificEpithet dwc:startDayOfYear dwc:stateProvince dwc:taxonID dwc:taxonomicStatus dwc:taxonRank dwc:typeStatus idigbio:uuid dwc:verbatimEventDate dwc:verbatimLocality idigbio:version dwc:waterBody
3ee2f19f-046f-4c52-ab31-f9b42ed12a89 NA NA fossilspecimen NA axis lzzz/4510 mammalia fossil NA NA asia NA indonesia idn NA 2017-06-28 13:15:02 0.1594203 82681e6ba5c73210b7791c287c81de2850d670e6 cervidae [“dwc_taxonrank_added”, “dwc_phylum_added”, “dwc_scientificnameauthorship_added”, “dwc_taxonomicstatus_added”, “gbif_genericname_added”, “dwc_datasetid_added”, “dwc_parentnameusageid_added”, “dwc_taxonid_added”, “idigbio_isocountrycode_added”, “gbif_canonicalname_added”, “gbif_taxon_corrected”, “dwc_class_added”, “dwc_kingdom_added”] axis FALSE FALSE NA mzlu NA NA animalia sangiran NA NA NA NA mzlu:fossil:lzzz/4510 artiodactyla chordata [“858a7761-82a5-47df-8e8a-dbc8806cf424\mzlu:fossil:lzzz/4510”] NA 858a7761-82a5-47df-8e8a-dbc8806cf424 axis sp NA java 8535967 doubtful genus 3ee2f19f-046f-4c52-ab31-f9b42ed12a89 NA NA NA NA
9702a3d1-810a-4f9a-b9e9-7bc04f54f7f4 NA NA fossilspecimen NA paramys ypm vp 059011 mammalia vp NA NA yale 1963 wyoming (willwood) expedition, yale 1963 wyoming (willwood) expedition squirrels; rodents; mammals; vertebrates; chordates; animals [“squirrels; rodents; mammals; vertebrates; chordates; animals”] north america NA united states usa 1963-06-10 2017-12-06 14:53:16 0.3478261 eocene tertiary ab063f634bbd55f12925d73fbafebaadc9cae97d 1963-06-10 ischyromyidae 63-188 [“dwc_country_replaced”, “idigbio_isocountrycode_added”, “gbif_canonicalname_added”, “dwc_taxonomicstatus_added”, “gbif_genericname_added”, “dwc_datasetid_added”, “gbif_taxon_corrected”, “dwc_parentnameusageid_added”, “dwc_taxonid_added”] willwood fm paramys FALSE FALSE animalia; chordata; vertebrata; amniota; mammalia; theriiformes—–theria-placentalia-epitheria; preptotheria-anagalida-simplicidentata; rodentia; sciuromorpha; ischyromyoidea; ischyromyidae; paramyinae 1 ypm NA NA animalia NA NA NA NA urn:uuid:004bd82e-de14-4917-8d21-ab9dcb39b2fb rodentia chordata [“0220907a-0463-4ae0-8a0b-77f5e80fff40\urn:uuid:004bd82e-de14-4917-8d21-ab9dcb39b2fb”] NA 0220907a-0463-4ae0-8a0b-77f5e80fff40 paramys 161 wyoming 4828164 accepted genus 9702a3d1-810a-4f9a-b9e9-7bc04f54f7f4 NA NA NA NA
07da9e61-2e81-4eb4-b7c1-74c1cac96630 NA NA fossilspecimen NA deinonychus antirrhopus ypm vp 059012 reptilia vp NA NA raptors; dinosaurs; reptiles; vertebrates; chordates; animals [“raptors; dinosaurs; Reptiles; vertebrates; chordates; animals”] NA NA 2017-12-06 14:53:16 0.1884058 27d07c44df90c0d9d64f5645bf540cb2d69bc4f3 dromaeosauridae [“gbif_canonicalname_added”, “dwc_taxonomicstatus_added”, “gbif_genericname_added”, “dwc_datasetid_added”, “gbif_taxon_corrected”, “dwc_parentnameusageid_added”, “dwc_taxonid_added”, “gbif_vernacularname_added”, “dwc_scientificnameauthorship_replaced”] deinonychus FALSE FALSE animalia; chordata; vertebrata; amniota; reptilia; diapsida; archosauria; saurischia; theropoda; dromaeosauridae 1 ypm NA NA animalia NA NA NA NA urn:uuid:8a42b32c-5934-43c6-8c61-be863403fc55 saurischia chordata [“0220907a-0463-4ae0-8a0b-77f5e80fff40\urn:uuid:8a42b32c-5934-43c6-8c61-be863403fc55”] NA 0220907a-0463-4ae0-8a0b-77f5e80fff40 deinonychus antirrhopus antirrhopus NA 4966355 accepted species 07da9e61-2e81-4eb4-b7c1-74c1cac96630 NA NA NA NA
05698b27-d162-4628-92e9-3153ff67a6ab NA NA fossilspecimen NA ypm vp 059002 mammalia vp NA NA yale 1963 wyoming (willwood) expedition, yale 1963 wyoming (willwood) expedition rodents; mammals; vertebrates; chordates; animals [“rodents; mammals; vertebrates; chordates; animals”] north america NA united states usa big horn county 1963-06-18 2017-12-06 14:53:16 0.3768116 eocene tertiary 1edf097413b16fc09bb889804599f2b4cc6a37bd 1963-06-18 370 [“dwc_country_replaced”, “idigbio_isocountrycode_added”] willwood fm FALSE FALSE animalia; chordata; vertebrata; amniota; mammalia; theriiformes—–theria-placentalia-epitheria; preptotheria-anagalida-simplicidentata; rodentia 1 ypm NA NA animalia NA NA NA NA urn:uuid:e3c74dca-f079-4078-90f0-299b3208cf18 rodentia chordata [“0220907a-0463-4ae0-8a0b-77f5e80fff40\urn:uuid:e3c74dca-f079-4078-90f0-299b3208cf18”] NA 0220907a-0463-4ae0-8a0b-77f5e80fff40 rodentia 169 wyoming NA order 05698b27-d162-4628-92e9-3153ff67a6ab NA NA NA NA
8c826bb5-ba30-4357-b119-18b24541a02c NA NA fossilspecimen NA acaenasuchus geoffreyi 285838 reptilia v NA NA north america NA united states usa apache county NA 2017-07-10 22:25:29 0.3913043 mesozoic late triassic mesozoic triassic 5425ef28de328df8942f7bd5bc44c5395afd9985 stagonolepididae [“dwc_phylum_added”, “dwc_scientificnameauthorship_added”, “dwc_taxonomicstatus_added”, “gbif_genericname_added”, “dwc_datasetid_added”, “gbif_taxon_corrected”, “dwc_taxonid_added”, “idigbio_isocountrycode_added”, “gbif_canonicalname_added”, “dwc_parentnameusageid_added”, “dwc_genus_replaced”] chinle acaenasuchus -7308 FALSE FALSE late triassic NA ucmp NA NA animalia mesozoic late triassic mesozoic triassic saint johns 2 late triassic NA NA NA NA urn:catalog:ucmp:v:285838 aetosauria chordata [“5ab348ab-439a-4697-925c-d6abe0c09b92\urn:catalog:ucmp:v:285838”] NA 5ab348ab-439a-4697-925c-d6abe0c09b92 acaenosuchus geoffreyi geoffreyi NA arizona 4967763 accepted species 8c826bb5-ba30-4357-b119-18b24541a02c NA NA NA NA
f26beca1-32ab-4c87-bc70-57af70aac9c8 NA NA fossilspecimen NA 285929 amphibia v NA NA camp, c.l. north america NA united states usa apache county NA 2017-07-10 22:25:29 0.4492754 mesozoic late triassic mesozoic triassic 2ad7cfef0c826acd4d4732229ebb9c998c276c4e metoposauridae [“idigbio_isocountrycode_added”] chinle -7308 FALSE FALSE late triassic NA ucmp NA NA animalia mesozoic late triassic mesozoic triassic saint johns 2 late triassic NA NA NA NA urn:catalog:ucmp:v:285929 temnospondyli [“5ab348ab-439a-4697-925c-d6abe0c09b92\urn:catalog:ucmp:v:285929”] NA 5ab348ab-439a-4697-925c-d6abe0c09b92 metoposauridae NA arizona NA family f26beca1-32ab-4c87-bc70-57af70aac9c8 NA NA NA NA
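One way to see what processing changes is to pair each raw record with its processed counterpart. The sketch below is a minimal example under stated assumptions: it uses toy excerpts built from the `dwc:country` values in the sample rows above, with `coreid` values shortened for readability, and it assumes `coreid` identifies the same record in both tables. The names `toy_raw`, `toy_processed`, and `country_changes` are hypothetical.

```r
library(dplyr)

# Toy excerpts of the two tables, using country values from the sample rows
# above (coreid values are shortened here for readability)
toy_raw <- tibble(
  coreid = c("3ee2f19f", "9702a3d1", "8c826bb5"),
  `dwc:country` = c("Indonesia", "USA", "United States")
)
toy_processed <- tibble(
  coreid = c("3ee2f19f", "9702a3d1", "8c826bb5"),
  `dwc:country` = c("indonesia", "united states", "united states")
)

# Pair each raw record with its processed counterpart and flag values that
# differ by more than letter case
country_changes <- toy_raw %>%
  inner_join(toy_processed, by = "coreid", suffix = c("_raw", "_processed")) %>%
  mutate(changed = tolower(`dwc:country_raw`) != `dwc:country_processed`)

country_changes
```

In this toy example only the record whose raw value is "USA" is flagged, which matches the `dwc_country_replaced` entry in that record's `idigbio:flags` value.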

1. How prevalent is georeferencing in our data?


Of these records, 42.2% are georeferenced. The majority of this georeferencing has been done in the recent past.
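The percentage above is the ratio of georeferenced records to total records. The sketch below shows that arithmetic with made-up counts standing in for `records_georef` and `records_total`; the toy values are chosen only so that the result matches the quoted share and are not the real counts.

```r
# Toy counts standing in for `records_georef` and `records_total`
# (invented values, not the real counts)
toy_records_georef <- 422
toy_records_total <- 1000

# Share of records with coordinates, as a percentage rounded to one decimal
pct_georef <- round(100 * toy_records_georef / toy_records_total, 1)
pct_georef
```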

# Collate data about when records were georeferenced, based on data provided
# in the column `dwc:georeferencedDate`
georef_timeline <- raw_idb %>% 
  select(`dwc:georeferencedDate`) %>% 
  filter(!is.na(`dwc:georeferencedDate`) & `dwc:georeferencedDate` != "") %>%
  mutate(date = lubridate::as_date(`dwc:georeferencedDate`)) %>% 
  mutate(year1 = lubridate::year(date)) %>% 
  mutate(year2 = case_when(is.na(year1) ~ `dwc:georeferencedDate`)) %>% 
  unite(year, c(year1, year2), sep = " ", na.rm = TRUE) %>% 
  mutate(year = str_trim(str_replace(year, "NA", ""))) %>% 
  group_by(year) %>% 
  tally() %>%
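  # Keep only four-digit years, then compare numerically rather than as strings
  filter(str_detect(year, "^[0-9]{4}$")) %>% 
  filter(between(as.integer(year), 2001, 2020))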

# Plot `georef_timeline`
ggplot(georef_timeline, aes(x = year, y = n)) + 
  geom_bar(stat = "identity", fill = "steelblue") +
  ggtitle("Timeline of when paleo records on iDigBio were georeferenced") +
  xlab("Year") +
  ylab("Number of records")
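The year-extraction step in the timeline code handles two cases: full dates that parse cleanly, and values that are already a bare year. A more compact alternative, sketched below, pulls the first four-digit run out of each value with a regular expression. This is not the method used above and it behaves differently on partial dates: a value such as "2019-07" yields 2019 here, whereas the code above drops it. The vectors `toy_dates` and `toy_years` are hypothetical, and the sketch assumes the first four-digit run in a value is the year.

```r
library(stringr)

# Toy values imitating the mix of formats found in `dwc:georeferencedDate`
toy_dates <- c("2015-03-12", "2018", "2019-07", "unknown", "")

# Extract the first run of four digits, ASSUMING it is the year;
# values without such a run become NA
toy_years <- as.integer(str_extract(toy_dates, "[0-9]{4}"))
toy_years
```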


2. What standard terms are in use?


At the recordset level

Data for the figure below were downloaded from GBIF on 2020-04-23 using the query: basisofrecord = “fossil” (doi.org/10.15468/dl.7nnj39). This dataset includes 1,1665,493 specimen records provided by >90 collections.