Discovering and developing primary biodiversity data from social networking sites
Issue Date
2015-05-31
Author
Barve, Vijay V.
Publisher
University of Kansas
Format
98 pages
Type
Dissertation
Degree Level
Ph.D.
Discipline
Geography
Rights
Copyright held by the author.
Abstract
An ever-increasing need exists for fine-scale biodiversity occurrence records for a broad variety of research applications in biodiversity and science more generally. Even though large-scale data aggregators like GBIF serve such data in large quantities, major gaps and biases still exist, both in taxonomic coverage and in spatial coverage. To address these gaps, in this dissertation, I explored social networking sites (SNS) as a rich potential source of additional biodiversity occurrence records. In my first chapter, I explored the idea of discovering, extracting, and organizing massive numbers of biodiversity occurrence records now available on SNSs. I presented a proof-of-concept with Flickr as the SNS and Snowy Owls (Bubo scandiacus) and Monarch Butterflies (Danaus plexippus) as target species. The methods presented in this chapter can easily be used for any other SNS, region, or species group. These approaches are broadly applicable to animal and plant groups that are photographed, and that can be identified from photographs with some degree of confidence (e.g., birds, butterflies, cetaceans, orchids, dragonflies, amphibians, and plants). SNS thus offer a rich new source of biodiversity data. To understand the strengths and weaknesses of biodiversity data, we need effective tools by which to explore and visualize these data. I developed a suite of such tools in an R package called bdvis, which is described in chapter two. The package allows users to explore spatial, temporal, and taxonomic dimensions of biodiversity data sets to highlight gaps and identify strengths. In the third chapter, I explored Flickr further as a source of biodiversity data for the birds of the world, to assess the potential of augmenting the largest portal to biodiversity occurrence data, i.e., the Global Biodiversity Information Facility (GBIF). 
GBIF provides access to ~190 × 10^6 bird records, compared to ~7 × 10^6 that I could discover from Flickr, out of which only ~1.3 × 10^6 were geotagged. However, the Flickr data showed the potential to add to knowledge about birds in terms of geographic, taxonomic, and temporal dimensions, as Flickr data tended to be complementary to the GBIF-derived information. Finally, I developed a case study to investigate the quantity of records existing, and the quality of identifications by users on Flickr. I developed a detailed case study of Indian swallowtail butterflies, and implemented a crowd-sourcing platform to recruit identification expertise and apply it to butterfly photographs from the SNS. Results were encouraging, with 93% correct identities for records of this family of butterflies from across India.