Carolina Roman Amigo, Paul Joseph, Richard Arias-Hernandez

Context
- Directed Study and Co-op Project, co-supervised by a SLAIS faculty member and a UBC librarian.
- The aim was to explore potential uses of the Open Collections Research API for infoviz, linked data, and digital humanities projects.
- We developed simple web applications that visualize collections metadata linked to external datasets. Available here: https://ubc-library.github.io
- This presentation focuses on the development process of the UBC Institute of Fisheries Field Records web application.

UBC Open Collections
https://open.library.ubc.ca
- Aggregates content from UBC Library's open access digital repositories.
- Has a research API that provides machine-readable access to collections metadata and transcripts.

UBC Institute of Fisheries Field Records
- 11,021 field records spanning more than 100 years.
- Contains latitude and longitude metadata.
- Each record lists the fish species collected at a given location, plus habitat conditions.

Encyclopedia of Life
Aggregates “knowledge of the many life-forms on Earth from books, journals, databases, websites, specimen collections, and in the minds of people everywhere”.

How to enhance this collection and make it more accessible to researchers?
- Link both datasets, enriching OC collection metadata.
- Show the geographical location of records on a map.

Process Overview
Getting metadata --> Cleaning --> Reconciling --> Building interface

Getting the metadata
- Get the data using a PHP script available on the OC API page.
- The script returns one metadata txt file for each record in the collection.
- Merge the files into a single txt file using a Python script.
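
The merge step could be sketched in Python as below. This is not the script from the project repo: the function name `merge_text_files`, the folder name `records`, the output name `merged.txt`, and the `.txt` filter are all assumptions made for illustration.

```python
from pathlib import Path


def merge_text_files(source_dir, output_file):
    """Concatenate every .txt file in source_dir into output_file.

    Returns the number of files merged.
    """
    source = Path(source_dir)
    output = Path(output_file).resolve()
    # Sort for a reproducible order; skip the output file if it sits in source_dir.
    paths = [p for p in sorted(source.glob("*.txt")) if p.resolve() != output]
    with output.open("w", encoding="utf-8") as out:
        for path in paths:
            text = path.read_text(encoding="utf-8")
            out.write(text)
            # End each record with a newline so records do not run together.
            if not text.endswith("\n"):
                out.write("\n")
    return len(paths)


if __name__ == "__main__":
    # Hypothetical paths: point these at wherever the PHP script saved its output.
    if Path("records").is_dir():
        print("Merged", merge_text_files("records", "merged.txt"), "files")
```

The merged file can then be loaded into OpenRefine as a single project.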
- Import the merged file into OpenRefine for the cleaning step.

Cleaning
- The data was cleaned using OpenRefine.
- Latitude and longitude had to be converted to the decimal format CARTO understands: "0 30 S"@en --> -0.3
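
A conversion of this kind might look like the sketch below. It assumes each value holds whole degrees, optional minutes, and a hemisphere letter, wrapped in a quoted literal with a language tag; that reading is an assumption, and the project itself did this step in OpenRefine rather than in Python. Under this reading "0 30 S" comes out as -0.5, not the -0.3 shown above, so the original workflow may have followed a different convention. The function name `to_decimal_degrees` is hypothetical.

```python
import re

# Matches values such as "0 30 S"@en: degrees, optional minutes, hemisphere letter.
_COORD = re.compile(r'^"?\s*(\d+)(?:\s+(\d+(?:\.\d+)?))?\s*([NSEW])\s*"?(?:@\w+)?$')


def to_decimal_degrees(raw):
    """Convert a 'degrees minutes hemisphere' literal to signed decimal degrees.

    Assumes the second number is minutes (1/60 of a degree).
    """
    match = _COORD.match(raw.strip())
    if match is None:
        raise ValueError(f"Unrecognised coordinate: {raw!r}")
    degrees, minutes, hemisphere = match.groups()
    value = int(degrees) + float(minutes or 0) / 60
    # South and West are negative by convention.
    return -value if hemisphere in "SW" else value


if __name__ == "__main__":
    print(to_decimal_degrees('"0 30 S"@en'))
    print(to_decimal_degrees('"49 15 N"@en'))
```

Values that do not match the pattern raise a `ValueError`, which makes it easy to list the records that still need manual cleaning.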
- The fish species metadata needed cleaning and clustering to fix inconsistencies: "Ambassis sp."@en --> Ambassis

Reconciling
- Use the OpenRefine reconciliation service to link our fish species to EOL species entries.
- It suggests the three best matches for each value for you to review.
- It also retrieves the EOL id, which is used to build the URIs for linking.

Interface building
- Export the data in CSV format for use in CARTO Builder.
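
Turning reconciled EOL ids into link URIs and writing a CSV for CARTO could be sketched as follows. The URL pattern `https://eol.org/pages/<id>`, the column names `species`, `eol_id` and `eol_uri`, and the id `12345` are assumptions or placeholders for illustration, not values taken from the project data.

```python
import csv
import io

# Assumed URL pattern for EOL taxon pages; verify against eol.org before relying on it.
EOL_PAGE_URL = "https://eol.org/pages/{eol_id}"


def add_eol_links(rows):
    """Return copies of rows with an 'eol_uri' field built from 'eol_id' (blank if missing)."""
    linked = []
    for row in rows:
        eol_id = str(row.get("eol_id") or "").strip()
        uri = EOL_PAGE_URL.format(eol_id=eol_id) if eol_id else ""
        linked.append({**row, "eol_uri": uri})
    return linked


def to_csv(rows, fieldnames):
    """Serialise rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder rows: the id below is made up, not a real EOL identifier.
    sample = [
        {"species": "Ambassis", "eol_id": "12345"},
        {"species": "Unmatched name", "eol_id": ""},
    ]
    print(to_csv(add_eol_links(sample), ["species", "eol_id", "eol_uri"]))
```

Rows without a reconciled id keep an empty `eol_uri`, so unmatched species stay visible in the export instead of being dropped.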
- Georeference the data in CARTO.
- Aggregate records that share the same location using an SQL query in CARTO.
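
The aggregation itself runs as SQL inside CARTO, and the actual query is not shown here. To illustrate only the idea of collapsing records that share a coordinate pair, the same grouping can be sketched in Python. The field names `latitude`, `longitude` and `species`, and the sample values, are assumptions.

```python
from collections import defaultdict


def aggregate_by_location(records):
    """Group records sharing the same (latitude, longitude) pair.

    Returns one dict per location with the record count and the sorted,
    de-duplicated species seen there, in first-seen location order.
    """
    groups = defaultdict(list)
    for record in records:
        key = (record["latitude"], record["longitude"])
        groups[key].append(record)

    aggregated = []
    for (lat, lon), members in groups.items():
        species = sorted({m["species"] for m in members if m.get("species")})
        aggregated.append(
            {"latitude": lat, "longitude": lon,
             "record_count": len(members), "species": species}
        )
    return aggregated


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"latitude": -0.5, "longitude": 100.0, "species": "Species A"},
        {"latitude": -0.5, "longitude": 100.0, "species": "Species B"},
        {"latitude": 49.25, "longitude": -123.0, "species": "Species A"},
    ]
    for location in aggregate_by_location(sample):
        print(location)
```

Grouping on the exact coordinate pair means one map marker per location, with the list of species available for that marker's pop-up.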
- Customize the interface using the Builder.

GitHub
Repo with step-by-step process and files: https://github.com/ubc-library/open-collections-api-case-fisheries

Thank you!
carolamigo@gmail.com