Difference between revisions of "Digitization"
DeborahPaul (Talk | contribs) (→Data Standards and Mobilization) |
Revision as of 21:27, 9 April 2019
Contents
- 1 Statement of Purpose
- 2 Contributors
- 3 Digitization Resources
- 3.1 Data Aggregation
- 3.2 Data Aggregators
- 3.3 Data Management
- 3.4 Data Mobilization
- 3.5 Data Standards and Mobilization
- 3.6 Data Transcription
- 3.7 Database Software
- 3.8 Georeferencing
- 3.9 iDigBio Digitization Resources Wiki
- 3.10 Imaging and Media
- 3.11 Key References and Further Reading
- 3.12 Webinars
- 3.13 Workflows
- 3.14 Workshops and Symposia
Statement of Purpose
Realizing the import of collections, SPNHC recognizes the need to collaborate to develop, discover, disseminate and update best (better, current, recommended) practices for creating digital collections resources and publishing them for global access. Materials linked here represent the efforts of many collections data mobilization projects worldwide. All in the collections and standards community are encouraged to contribute.
Defining Digitization
In the context of the SPNHC wiki, 'digitize' means converting ALL analog data to digital data according to standard vocabularies such as Darwin Core and Audubon Core. That is, we start with the concept of a specimen that has been accessioned in a collection. We envision that these digital data will eventually include the entirety of the analog data associated with a particular specimen. This may include, but is not limited to:
- Text data from labels and ledgers associated with specimens
- Images of specimens
- DNA and other 'omics
- Field notes, images
- Tomographic imaging data
- Specimen history (including preservation)
- Specimen-associated literature
- Collection-level metadata
Digitization may be carried out by collections managers, technicians, contractors, and others, and the results are included in the institution's collection management system. In many instances these data may be generated off site by investigators.
The process of digitization has been analyzed by Nelson et al. (2012), and five task clusters that comprise the digitization process leading up to data publication have been identified:
- Pre-digitization curation and staging
- Specimen image capture
- Specimen image processing
- Electronic data capture
- Georeferencing locality descriptions
We expect these groupings to change over time as standards of practice for digitization processes and procedures evolve. For example, it makes sense to add data mobilization as a task cluster because, once data have been captured in a local database, they still need to be shared beyond it.
Contributors
Current content contributors: SPNHC members Breda Zimkus, Jessica Cundiff, Genevieve Tocci, Nicole Fisher, and Deborah Paul. We hope that others will add their names to this list as information is added and updated.
The original content of this digitization page was generated at the American Society of Ichthyologists and Herpetologists (ASIH) Annual Joint Meeting in 2016, during an iDigBio-sponsored workshop, by the following members of the workshop's "Digitization" working group: Gil Nelson (Florida State University, Courtesy Faculty), Larry Page (The Florida Museum of Natural History, Ichthyology Curator), Cristina Cox-Fernandes (UMass Amherst Biology, Adjunct Research Associate Professor), Mark Sabaj (ANSP, Ichthyology Collection Manager), Adam Summers (University of Washington, Professor - Friday Harbor Labs), Kevin Love (iDigBio, IT Expert), Ken Thompson (Lock Haven University, Professor; Retired), Randy Singer (Florida Museum of Natural History), and Gregory Watkins-Colwell (Yale Peabody Museum, Herps and Fishes, Collection Manager).
Digitization Resources
Data Aggregation
Data mobilization (getting the data out of your local collection management database) involves contributing data and media to one or more designated aggregators. These data are then integrated with data from other institutions to provide access to more complete datasets. The scope of an aggregation resource may be taxonomic (e.g. SCAN), organization- or institution-based (e.g. C. V. Starr Virtual Herbarium), regional (e.g. SEINet), national (e.g. the Atlas of Living Australia), global (e.g. GBIF), or otherwise. Aggregating data offers collections unique opportunities to enhance collections data, facilitate discovery, and increase re-use. The following resources introduce the aggregator's point of view and what to expect.
- Darwin Core Hour: Aggregators - a Darwin Core View (GBIF and iDigBio)
- Darwin Core Hour: Aggregators - a Darwin Core View (More than Vert)Net
Data Aggregators
Natural history collections commonly contribute to these data aggregators:
Data Management
- The DataONE Data Management Skillbuilding Hub contains data management resources, including teaching materials, webinars, and a database of best practices for improving data sharing and management.
- See this iDigBio Workshop for topics, materials, and presentations relevant to Managing Natural History Collections for Global Discoverability.
- Search all iDigBio for materials tagged data management
Data Mobilization
Consider what needs to be done to get data out of a local collections database and into one or more other online resources. Other categories on this wiki page that relate to this topic include data standards, data management, data aggregation, and workflows. Sharing data is often a cyclic process. Once data are shared, aggregators provide feedback, and collections staff need to evaluate which items to address and how. After updates, the data can be published again with the enhancements.
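The feedback cycle described above can be partly anticipated with local checks before each publication round. The following is a minimal sketch, not any aggregator's actual validation: the function name check_record and the chosen checks are assumptions made for illustration, and only the field names (scientificName, eventDate, decimalLatitude, decimalLongitude) are real Darwin Core terms. Note that Darwin Core allows eventDate values such as a year alone or a date range; this sketch accepts only full calendar dates and would flag those other forms.

```python
from datetime import date


def check_record(record):
    """Return a list of human-readable issues found in one record.

    `record` is a dict keyed by Darwin Core term names. The checks are
    illustrative assumptions, not an aggregator's real rule set.
    """
    issues = []

    if not record.get("scientificName"):
        issues.append("scientificName is missing")

    event_date = record.get("eventDate")
    if event_date:
        try:
            # Only full ISO 8601 calendar dates are handled in this sketch.
            date.fromisoformat(event_date)
        except ValueError:
            issues.append("eventDate is not a full ISO 8601 calendar date")

    # Latitude must lie within +/-90 degrees, longitude within +/-180.
    for term, limit in (("decimalLatitude", 90.0), ("decimalLongitude", 180.0)):
        value = record.get(term)
        if value is None or value == "":
            continue  # an empty coordinate is not flagged here
        try:
            number = float(value)
        except (TypeError, ValueError):
            issues.append(f"{term} is not a number")
            continue
        if abs(number) > limit:
            issues.append(f"{term} is out of range")

    return issues
```

Running a check like this over every record before each export gives staff a short list of items to review, which mirrors the evaluate, update, and republish loop.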
Data Standards and Mobilization
Darwin Core has become a widely used standard for sharing biodiversity data since it was ratified in 2009 by Biodiversity Information Standards (TDWG; historically known as the Taxonomic Databases Working Group). A number of resources exist for its use:
- What is Darwin Core, and why does it matter?
- Darwin Core quick reference guide
- Darwin Core Hour: check this resource for help with Darwin Core.
- Use the Darwin Core Hour Input Form for questions, insights and to offer help.
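As a rough illustration of what using Darwin Core involves in practice, the sketch below renames columns from a local database export to Darwin Core terms and writes them out as CSV. The local column names (cat_no, species, and so on) and both function names are hypothetical; the terms on the right-hand side of the mapping are real Darwin Core terms listed in the quick reference guide. A real export would cover many more terms and would follow the requirements of the receiving aggregator.

```python
import csv
import io

# Hypothetical local column names (left) mapped to Darwin Core terms (right).
LOCAL_TO_DWC = {
    "cat_no": "catalogNumber",
    "species": "scientificName",
    "collector": "recordedBy",
    "date_collected": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
}


def to_darwin_core(local_record):
    """Rename mapped local fields to Darwin Core terms; drop unmapped fields."""
    return {
        dwc_term: local_record[local_name]
        for local_name, dwc_term in LOCAL_TO_DWC.items()
        if local_name in local_record
    }


def write_dwc_csv(local_records):
    """Return CSV text with a Darwin Core header row and one row per record."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(LOCAL_TO_DWC.values()), lineterminator="\n"
    )
    writer.writeheader()
    for record in local_records:
        # Fields missing from a record are written as empty cells.
        writer.writerow(to_darwin_core(record))
    return out.getvalue()
```

Keeping the mapping in one dictionary makes it easy to review against the quick reference guide and to extend as more local fields are mobilized.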
Data Transcription
Transcription is an essential part of the digitization process but can pose a number of challenges:
- iDigBio Transcription Resources
- DataONE (need link)
Database Software
Those curating natural history collections are currently using a number of different platforms to track data:
Georeferencing
A number of resources pertaining to the process of georeferencing, defining a location using map coordinates and assigning the coordinate system of the map frame, are available:
iDigBio Digitization Resources Wiki
- The iDigBio Digitization Resources wiki page provides resources and information regarding digitization, including training workshops being conducted by iDigBio, digitization information and resources, and links to documents, websites, videos, presentations, and other important information related to biological collection digitization.
Imaging and Media
A number of techniques are available for two-dimensional (2D) and three-dimensional (3D) digitization, including X-ray computed tomography (CT):
- Search iDigBio for all available materials regarding imaging
Key References and Further Reading
- Nelson, G., D. Paul, G. Riccardi, and A.R. Mast. 2012. Five task clusters that enable efficient and effective digitization of biological collections. ZooKeys 209:19-45. [1]
- Vollmar, A., J.A. Macklin, and L.S. Ford. 2010. Natural History Specimen Digitization: Challenges and Concerns. Biodiversity Informatics 7:93-112. [2]
- ZooKeys Special Issue: No specimen left behind: mass digitization of natural history collections (2012)
- Search iDigBio for all available digitization materials
Webinars
Various webinars on digitization are available:
- iDigBio Webinars
- Webinars Darwin Core Hours
- See this Digitization Working Group Page for Paleo Webinars
- SCNet Webinars
Workflows
Various general and discipline-specific materials regarding digitization are available via iDigBio:
- Workflows: Perspectives from Peabody Entomology
- Workflows Herbarium Digitization
- Workflows for Digitizing Vertebrate Paleontology
- Workflows at University of North Carolina
- We're Virtually There: Digitizing NTBG's Herbarium Collection (PTBG)
- The Valdosta State Herbarium (VSC) Experience: Mobilizing Small Herbaria for Digitization
Workshops and Symposia
A number of workshops and conference symposia have focused on the subject of digitization: