NADDI

Recent Submissions

  • Publication
    Mapping DDI2 to DDI4
    (2019-05-06) Hoyle, Larry; Wackerow, Joachim
    This poster describes the effort to add a DDI-Codebook (DDI-C) import function to the DDI4R R package. The DDI4 Codebook Group did much of the modeling of one section of DDI4 using a spreadsheet mapping DDI-C elements into DDI4 properties. This started with a list of elements used by CESSDA and was refined at the May 2016 Knutholmen Sprint. Unfortunately, these mappings were not always at the leaf node level. An R program also imported DDI-C XML from the European Social Survey and generated a list of unique XPaths of leaf elements used in that set of metadata. These elements, along with corresponding DDI4 leaf paths, were used to update the spreadsheet. This spreadsheet has been further refined to create an actionable table mapping DDI-C leaf values to leaf properties in DDI4. Writing code to import DDI-C required additional information: • mapping from DDI-C sub-paths to DDI4 Identifiable classes (e.g. all the information for one DDI-C “var” maps to one DDI4 IdentifiableVariable), • mapping abstract target classes to specific extensions, • additional semantic property values like “typeOfMethodology”. Importing DDI-C into a lifecycle-level version of DDI like DDI4 also involves identifying repeated metadata, such as value domains (e.g. Likert-style codelists) that are reused across multiple variables. An R function performs this matching using R’s “all.equal” function, excluding differences in agency, id, and version.
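    A minimal sketch of that duplicate-detection step, written here in Python for illustration (the poster's implementation uses R and its "all.equal" function); the dictionary layout of a value-domain record is an assumption:

```python
# Hypothetical sketch (not from the poster): detect reused value domains by
# comparing their content while ignoring identification fields, analogous to
# the all.equal-based matching described above.

def strip_identification(domain):
    """Return a copy of a value-domain record without agency, id, and version."""
    return {k: v for k, v in domain.items() if k not in ("agency", "id", "version")}

def find_reusable_domain(new_domain, registry):
    """Return an already-registered domain whose content matches new_domain, if any."""
    target = strip_identification(new_domain)
    for existing in registry:
        if strip_identification(existing) == target:
            return existing
    return None

# Example: two Likert-style code lists attached to different variables
likert_q1 = {"agency": "int.example", "id": "cl1", "version": "1",
             "codes": {"1": "Agree", "2": "Neutral", "3": "Disagree"}}
likert_q2 = {"agency": "int.example", "id": "cl2", "version": "1",
             "codes": {"1": "Agree", "2": "Neutral", "3": "Disagree"}}

registry = [likert_q1]
print("reuse existing domain:", find_reusable_domain(likert_q2, registry) is not None)  # True
```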
  • Publication
    DDI4 in R – New Possibilities: Operations on Metadata
    (2019-05-06) Hoyle, Larry; Wackerow, Joachim
    We have been working on a representation of the DDI4 model in R, realized as a package. We now have an object-oriented R class for each DDI4 class with associated functions to validate and print objects, manage a registry of DDI4 objects, manage DDI4 URNs, and import and export DDI4 XML. Our original goal was to enhance the ability of researchers to capture and report on metadata at the source, with the ability to embed references to metadata. In this presentation, we’ll discuss the intriguing prospect of computing directly on the metadata. What could be done with these metadata objects to facilitate comparison and harmonization? In what ways could the metadata be visualized? DDI4 has powerful new capabilities in the Collections pattern. For classes realizing a collection, operators could be defined to return their intersection, union, and difference. Inner and outer joins could also be defined. Relationships within the collection could be visualized via network diagrams. These operators might provide efficient tools for harmonization. Operators could also be defined on pairs of objects of the same class. Similarity measures could be computed from corresponding attributes. These could be used to create visualizations of, say, similarities among variables.
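    To make the idea of collection operators and similarity measures concrete, here is an illustrative Python sketch (the actual work is an R package; the Variable class below is a hypothetical stand-in, not the DDI4R API):

```python
# Illustration only: set-style operators over metadata collections and a simple
# attribute-based similarity measure between two objects of the same class.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Variable:
    name: str
    label: str
    concept: str
    codes: frozenset = field(default_factory=frozenset)

def similarity(a, b):
    """Fraction of corresponding attributes on which two variables agree."""
    checks = [a.label == b.label, a.concept == b.concept, a.codes == b.codes]
    return sum(checks) / len(checks)

wave1 = {Variable("age", "Age in years", "age"),
         Variable("sex", "Sex", "sex", frozenset({"1", "2"}))}
wave2 = {Variable("age", "Age in years", "age"),
         Variable("inc", "Household income", "income")}

print(wave1 & wave2)   # intersection: variables shared by both collections
print(wave1 | wave2)   # union
print(wave1 - wave2)   # difference: only in wave 1
```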
  • Publication
    Sample Use Cases for the DataDictionary View in DDI Views (DDI4)
    (2017-04-06) Gillman, Dan; Gregory, Arofan; Hoyle, Larry; Wenzig, Knut
    The DDI Moving Forward project (DDI-4) is the effort to modernize the way DDI is managed. Through the use of UML (Unified Modeling Language), a software-independent representation of DDI is being developed and maintained. Compatibility with the older versions of DDI, DDI 2.x (Codebook) and DDI 3.x (Lifecycle), is a requirement, so XML and RDF bindings to the UML model are being developed. To ensure the new model is effective, various test cases are being applied to verify that the resulting XML is efficient. DDI Views (DDI4) has a DataDictionary View which can be used to describe the physical layout of a variety of data files. This presentation demonstrates the use of this view to describe examples of CSV, fixed column, segmented fixed column, aggregate, and hierarchical files. Example datasets are drawn from the Australian Election Study, 2013 and the U.S. Census 2000 Public Use Microdata Sample. The presentation will include a brief tour of the DDI Views model, descriptions of the classes and attributes useful for describing a physical file, and examples of the XML used to describe each type of file. We will also discuss the description of event data, data where an event is the unit of observation. DDI Views will need some new features to be able to deal with this flexible type of data. We will describe the additions to DDI Views necessary for describing event data. This presentation is based on work done at Schloss Dagstuhl event 16433, October 23 through 28, 2016. http://www.dagstuhl.de/de/programm/kalender/evhp/?semnr=16433
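    As a concrete illustration of what a physical-layout description must carry for a fixed-column file, here is a small Python sketch (the presentation itself uses DDI XML, not code; the variable names and column positions below are invented):

```python
# Hypothetical example: the layout information a DataDictionary-style
# description records for a fixed-column file, and how a consumer applies it.
layout = [
    # (variable name, start column (1-based), width)
    ("CASEID", 1, 6),
    ("AGE",    7, 3),
    ("STATE", 10, 2),
]

def parse_fixed_record(line, layout):
    """Slice one fixed-column record according to the declared layout."""
    return {name: line[start - 1:start - 1 + width].strip()
            for name, start, width in layout}

print(parse_fixed_record("000042037KS", layout))
# {'CASEID': '000042', 'AGE': '037', 'STATE': 'KS'}
```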
  • Publication
    Improving User Access to Metadata for Public and Restricted Use US Federal Statistical Files
    (2013-04-03) Block, William C.; Williams, Jeremy; Vilhuber, Lars; Lagoze, Carl; Brown, Warren; Abowd, John
    The US federal statistical system produces prodigious amounts of public and restricted-use data. The restricted-use data can be difficult to interact with due to poor documentation. The documentation that has been produced across agencies and across the public-use/restricted-use divide does not adhere to a single standard, making the metadata useful but insular. The Data Documentation Initiative (DDI) is an emerging metadata standard that is used internationally to describe data in the social sciences. It has the potential to unify the metadata managed by separate organizations into a comprehensive searchable set. Researchers from the Labor Dynamics Institute, in collaboration with the Cornell Institute for Social and Economic Research (CISER), received funding from the National Science Foundation to improve the documentation of federal statistical system data with the goal of making it more discoverable, accessible, and understandable for scientific research. The scope of this paper is a subset of the overall project, and it reports on the development of the web interface for user searches and the search API. The primary data model used in this application is DDI 2.5 (Codebook), which contains elements and attributes to describe the contents of a data set.
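    As an illustration of the kind of search such an API could expose over DDI-Codebook variable metadata, here is a hedged Python sketch (the sample XML is simplified and omits the DDI namespace; the function is a hypothetical illustration, not the project's API):

```python
# Illustration only: keyword search over variable labels in simplified
# DDI-Codebook XML (var/labl elements).
import xml.etree.ElementTree as ET

SAMPLE_DDI_C = """
<codeBook>
  <dataDscr>
    <var name="EMPSTAT"><labl>Employment status</labl></var>
    <var name="INCWAGE"><labl>Wage and salary income</labl></var>
  </dataDscr>
</codeBook>
"""

def search_variables(ddi_xml, term):
    """Return (name, label) pairs whose label contains the search term."""
    root = ET.fromstring(ddi_xml)
    hits = []
    for var in root.iter("var"):
        labl = var.findtext("labl", default="")
        if term.lower() in labl.lower():
            hits.append((var.get("name"), labl))
    return hits

print(search_variables(SAMPLE_DDI_C, "income"))
# [('INCWAGE', 'Wage and salary income')]
```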
  • Publication
    The Paradata Information Model
    (2013-04-02) Greenfield, Jay; Carpenter, Danielle
    Paradata is data about study processes and the collection of study data. Here we describe the development of a Paradata Information Model (PIM) in support of the National Children's Study (NCS) of the Eunice Kennedy Shriver National Institute of Child Health and Human Development. We propose that paradata can be recorded with accompanying metadata informed by the General Longitudinal Business Process Model (GLBPM) developed by the Data Documentation Initiative (DDI) and the General Statistical Business Process Model (GSBPM). The PIM is being constructed in a joint top-down and bottom-up approach, appropriating broad verbs from DDI, HL7, LS-DAM, and CDISC, while incorporating study-specific processes involved in collecting NCS operational data elements (ODEs). The hope is that collecting paradata in longitudinal studies will ensure that future researchers can integrate disparate data sets collected by a variety of technologies, especially in rapidly evolving fields like genomics. Additionally, by giving PIM elements preconditions and postconditions, we can develop software agents that use paradata metadata as well as other information to assist humans in conducting biomedical research, ultimately facilitating more rapid collection and analysis of information and enabling a broader subset of researchers to discover and extract relevant information from study data sets.
  • Publication
    Administrative Data in the IAB Metadata Management System
    (2013-04-02) Schiller, David; Barkow, Ingo
    The Research Data Centre (FDZ) of the German Federal Employment Agency (BA) at the Institute for Employment Research (IAB) prepares and provides access to research data. Besides survey data, the IAB provides data derived from the administrative processes of the BA. This data is very complex and not easy to understand and use. Good data documentation is crucial for the users. DDI provides a data documentation standard that makes documentation and data sharing easier. The latter is especially important for providers of administrative data because more and more other data types are merged with administrative data. Nevertheless, there are also some drawbacks to using the DDI standard. Data collection for administrative data differs from data collection for survey data, but DDI was established for survey data. At the same time, the description of complex administrative data should be as simple as possible. IAB and TBA21 are currently carrying out a project to build a Metadata Management System for IAB. The presentation will highlight the documentation needs for administrative data and show how they are covered in the Management System. In addition, the need for DDI profiles, comprehensive software tools, and future-proof data documentation for multiple data sources will be discussed.
  • Publication
    Rogatus – a planned open source toolset to cover the whole lifecycle
    (2013-04-02) Barkow, Ingo; Schiller, David
    In recent years, several different tools for DDI Lifecycle have been published. Nevertheless, none of the current tools covers the full lifecycle from beginning to end. This presentation gives a first look at Rogatus, an open source toolset currently in development at DIPF with support from GESIS, TBA21, OPIT, Colectica, Alerk Amin, and IAB. Rogatus consists of different DDI-compliant applications (e.g. Qbee – Questionnaire Builder, Cbee – Case Builder, Tbee – Translation Builder, Mbee – Metadata Builder, and Rogatus Portal). Furthermore, some components are reused in other software products (e.g. the IAB Metadata Management Portal). The presentation will also show how a final version of Rogatus could be combined with other well-known tools such as Colectica or Questasy, using DDI as the standard for data exchange, so that a complete survey process (creating a study from scratch, designing the instruments, performing the data collection, handling the administrative processes, curating the data, disseminating the data, publishing, and finally archiving the data for secondary use) could be handled with individual tools.
  • Publication
    DDI and Relational Databases
    (2013-04-04) Amin, Alerk; Barkow, Ingo
    Although the DDI standard is expressed in XML, many institutions have a requirement or preference to use relational databases (e.g. Access, MySQL, Oracle, Postgres) in their applications. This may be because of integration with existing applications, expertise at the institute, or other reasons. This workshop will discuss how to model DDI in a relational database and the pros and cons of using this approach for application development. Interoperability with other applications, including those based on XML databases, and long-term management of applications (including support for multiple versions of DDI) will also be discussed. Prior knowledge of relational databases and SQL is recommended but not required.
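    A minimal sketch of one way such a mapping could look, using Python with SQLite for illustration (the schema and column names are illustrative choices, not a prescribed DDI mapping from the workshop):

```python
# Illustrative relational model for a small slice of DDI: variables that
# reference shared code lists, mirroring metadata reuse.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE code_list (
    code_list_id INTEGER PRIMARY KEY,
    label        TEXT NOT NULL
);
CREATE TABLE code (
    code_list_id INTEGER REFERENCES code_list(code_list_id),
    value        TEXT NOT NULL,
    label        TEXT NOT NULL,
    PRIMARY KEY (code_list_id, value)
);
CREATE TABLE variable (
    variable_id  INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    label        TEXT,
    code_list_id INTEGER REFERENCES code_list(code_list_id)  -- reuse via FK
);
""")

conn.execute("INSERT INTO code_list VALUES (1, 'Yes/No')")
conn.executemany("INSERT INTO code VALUES (1, ?, ?)", [("1", "Yes"), ("2", "No")])
conn.executemany("INSERT INTO variable VALUES (?, ?, ?, 1)",
                 [(1, "EMPLOYED", "Currently employed"),
                  (2, "MARRIED", "Currently married")])

# Two variables share one code list via the foreign key.
for row in conn.execute("""
    SELECT v.name, c.value, c.label
    FROM variable v JOIN code c ON c.code_list_id = v.code_list_id
    ORDER BY v.name, c.value"""):
    print(row)
```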
  • Publication
    DDI: Metadata to support collection processes, discovery, and comparability
    (2013-04-04) Thomas, Wendy
    DDI-Lifecycle (DDI-L) was designed to support processes and comparability throughout the lifecycle of the data. The result is a standard based on planned reuse of metadata that describes the methodologies used, the processes of capturing, cleaning, and modifying data, and the various storage formats of the data itself. These structures enhance its value to the research process in terms of quality control and by supporting a "metadata-driven research process". Discovery tools can leverage "reusable" metadata, which supports comparability within and between data sets, capturing intended points of comparability within series and collections of data. This workshop will focus on the DDI content that supports collection, discovery, and comparability: questions, data collection instruments, variables, concepts, geography, grouping, and comparison.
  • Publication
    DDI: Capturing metadata throughout the research process for preservation and discovery
    (2013-04-04) Thomas, Wendy
    DDI supports two development lines, DDI-Codebook (DDI-C) and DDI-Lifecycle (DDI-L). This workshop provides an overview of the uses of both DDI-C and DDI-L in capturing metadata during the research process and of how that metadata is used for preservation and discovery purposes. The focus is on the types of metadata covered by DDI, how they are structured, and how they are used across the data lifecycle. Differences between the two development lines will be highlighted, including structural and coverage differences.
  • Publication
    Using SAS to Generate DDI-Codebook XML from Information Managed in Excel Spreadsheets
    (2013-04-02) Wright, Philip A.
    ICPSR uses DDI-C compliant files in two distinct roles to generate variable documentation from information managed in Excel spreadsheets by the data producer. For completed studies, DDI-C compliant files are used to generate codebooks which include unweighted frequencies. For data in production, DDI-C is used to bulk-load questions and variable attributes into a browser-based variable editor. This presentation will describe in moderate detail how SAS is used to generate the major DDI-C XML elements.
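    A hypothetical Python analogue of the step described above, turning spreadsheet rows of variable information into DDI-C <var> elements (the presentation uses SAS; the spreadsheet columns here are invented, and element names follow DDI-Codebook conventions):

```python
# Sketch only: build DDI-C style variable documentation from tabular rows.
import xml.etree.ElementTree as ET

rows = [  # as if read from an Excel sheet maintained by the data producer
    {"name": "V1", "label": "Respondent age", "question": "How old are you?"},
    {"name": "V2", "label": "Voted in 2012",  "question": "Did you vote?"},
]

data_dscr = ET.Element("dataDscr")
for row in rows:
    var = ET.SubElement(data_dscr, "var", name=row["name"])
    ET.SubElement(var, "labl").text = row["label"]
    qstn = ET.SubElement(var, "qstn")
    ET.SubElement(qstn, "qstnLit").text = row["question"]

print(ET.tostring(data_dscr, encoding="unicode"))
```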
  • Publication
    Metadata Portal Project: Using DDI to Enhance Data Access and Dissemination
    (2013-04-03) Vardigan, Mary
    The Inter-university Consortium for Political and Social Research (ICPSR), NORC at the University of Chicago, and the American National Election Studies program in the Center for Political Studies at the University of Michigan’s Institute for Social Research are currently engaged in a new collaborative effort to create a common metadata portal for two of the most important data collections in the U.S. – the American National Election Studies (ANES) and the General Social Survey (GSS). Technical support is provided by Metadata Technology and Integrated Data Management Services. This pilot project, funded by the National Science Foundation, will produce a combined library of machine-actionable DDI metadata for these collections, and demonstrate DDI-based tools for advanced searching, dynamic metadata presentation, and other functions intended to facilitate discovery and analysis of these data. The project will also lay a foundation for developing new metadata-driven workflows for both ANES and GSS. This presentation describes the major phases and deliverables of the project and presents a plan of action, with an emphasis on how the project will benefit the wider community.
  • Publication
    DDI - A Metadata Standard for the Community
    (2013-04-02) Vardigan, Mary; Wackerow, Joachim
    This presentation gives an overview of the primary benefits of DDI, such as rich content, metadata reuse across the life cycle, and machine-actionability in a global network. Examples of successful adoption are described, along with barriers and challenges to using DDI. The presentation concludes with a summary of future directions for the standard.
  • Publication
    DDI Specification: Current Status and Outlook
    (2013-04-03) Thomas, Wendy; Gregory, Arofan
    An update on current activities and plans from DDI Technical Implementation Committee (TIC) members.
  • Publication
    Colectica for Excel: Using DDI Lifecycle with Spreadsheets
    (2013-04-02) Smith, Dan
    Colectica is a suite of modern metadata management software used to document statistical datasets, public opinion and survey research methodologies, and data collection. This demonstration will introduce the new Colectica for Microsoft Excel software, a free tool to document statistical data using open standards. The software implements leading open standards including the Data Documentation Initiative (DDI) Lifecycle version 3 and ISO 11179. Using this software allows organizations both to better educate sponsors and the public on their methodology and to increase the organization’s reputation for performing credible scientific research. The free Colectica for Excel tool allows researchers to document their data directly in Microsoft Excel. Variables, Code Lists, and the datasets themselves can be globally identified and described in a standard format. Data can also be directly imported and documented from SPSS and Stata files. The standardized metadata is stored within the Excel files so it will be available to anyone receiving the documented dataset. Codebooks can also be customized and generated by the tool, and output in PDF, Word, HTML, and XSL-FO formats.
  • Publication
    Applying DDI to a Longitudinal Study of Aging
    (2013-04-02) Radler, Barry; Iverson, Jeremy; Smith, Dan
    Midlife in the United States (MIDUS) is a large, multi-disciplinary longitudinal study of aging conducted by the University of Wisconsin. MIDUS researchers want to provide a comprehensive, canonical source of documentation for the research project. To accomplish this, the team took the diverse set of sources that previously documented the MIDUS study and created a standardized, DDI 3-based set of documentation that better enables researchers to discover and use the MIDUS data. This talk will outline the process used to create the DDI 3 documentation, and will demonstrate the resulting documentation and dissemination tools provided by Colectica. The project is a joint effort between MIDUS and Colectica.
  • Publication
    PANEL: DDI and Metadata from the Researcher's Perspective
    (2013-04-02) Thomas, Wendy; Brown, J. Christopher; Nakao, Ron
    This was a panel discussion on lifecycle metadata issues from the researcher's perspective. A lot of focus has been placed on how to integrate DDI into large data collection processes in the world of official statistics, research centers, and long-term projects. It makes sense in these areas to talk about the payoff for metadata reuse, developing processes and tools to harvest metadata along a production process, and the value of a software-neutral means of capturing and transporting metadata. The question facing academic-based data libraries and archives is how to integrate DDI into smaller, limited-time-frame research projects. What are the payoffs for the individual researcher? What tools can be provided to support researchers? This panel is designed to gather input from attendees to help answer the following questions: How can the use of DDI throughout the research process help researchers during the process? What needs to be there (tools, processes, informational materials, etc.)? What can data libraries/archives/services do to promote and support DDI use? What is needed from others (funding agencies, academic departments, computing services, etc.)? What can DDI do to increase the use of DDI within the academic environment?
  • Publication
    Panel - Generic Longitudinal Business Process Model
    (2013-04-03) Barkow, Ingo; Block, William C.; Greenfield, Jay; Hebing, Marcel; Hoyle, Larry; Thomas, Wendy
    This presentation described a model for the processes involved in a longitudinal study. The model was developed at a symposium-style workshop held at Dagstuhl in September of 2011 (http://www.dagstuhl.de/11382). The Generic Longitudinal Business Process Model (GLBPM) emulates the Generic Statistical Business Process Model (GSBPM) (http://www1.unece.org/stat/platform/download/attachments/8683538/GSBPM+Final.pdf?version=1) which, in turn, was developed with DDI Lifecycle in mind. The GLBPM is intended as a generic model that can serve as the basis for informing discussions across organizations conducting longitudinal data collections, and other data collections repeated across time. The model is not intended to drive implementation directly, but may prove useful for those planning a study. An introductory presentation on the model will be followed by a panel discussion.
  • Publication
    Collaborative Markup of Library and Research Data: Examples from OCUL
    (2013-04-02) Leahey, Amber
    This presentation will focus on collaborative efforts to capture, store, and disseminate social science survey data and researcher data across all of Ontario's University Libraries. Through shared platforms and practices, collaborative markup of data using the Data Documentation Initiative (DDI) standard makes it possible to deliver rich discovery services to users of library and researcher data. An overview of Scholars Portal's data services, including the Ontario Data Documentation, Extraction Service and Infrastructure (ODESI) and Dataverse, will highlight effective collaborative markup strategies for data.
  • Publication
    DDI-Lifecycle and Colectica at the UCLA Social Science Data Archive
    (2013-04-02) Iverson, Jeremy; Stephenson, Elizabeth
    The UCLA Social Science Data Archive’s mission is to provide a foundation for social science research involving original data collection or the reuse of publicly available studies. Archive staff and researchers work as partners throughout all stages of the research process: beginning when a hypothesis or area of study is being developed, during grant and funding activities, while data collection and/or analysis is ongoing, and finally in the long-term preservation of research results. Three years ago SSDA began to search for a better repository solution to manage its data, make it more visible, and support the organization’s disaster plan. SSDA wanted to make it easier for researchers to look for data, to document their data, and to use data online. Since the goal is to document the entire lifecycle of a data product, the DDI-Lifecycle standard plays a key role in the solution. This paper explores how DDI-Lifecycle and Colectica can help a data archive with limited staff and resources deliver a rich data documentation system that integrates with other tools to allow researchers to discover and understand the data relevant to their work. The paper will discuss how SSDA and Colectica staff worked together to implement the solution.