Disability Inclusion in Datasets

Team:

This is a project conducted by the IAM Lab at the University of Maryland, College Park. I work with my advisor, Hernisa Kacorri, and my fellow lab members, Utkarsh Dwivedi, Lining Wang, Kyungjun Lee, and Farnaz Zamiri Zeraati.

Purpose:

Datasets and data sharing play an important role in innovation, benchmarking, and bias mitigation. However, data in accessibility is scarce due to smaller populations, disparate characteristics, a lack of expertise for data annotation, and privacy and ethical concerns. This scarcity raises practical challenges in developing technologies that work for diverse user populations.

The mission of this project is to address the issues of inclusivity in data-driven methods and technologies for people with disabilities and promote responsible practices for stewarding datasets sourced from these groups.

Execution:

As a first step, we deployed a data surfacing repository called IncluSet (Official Site) for the accessibility community to easily discover and link to accessibility datasets. The datasets were located manually by our research team. The repository stores metadata about where the datasets can be found, the populations represented, the data types, and the technology used. None of the datasets is stored on our servers. A download link is included only for datasets that are publicly available. Datasets that are available upon request include a contact email provided by the data creators and, when available, a link to a webpage describing the data.
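To make the metadata description above concrete, here is a minimal sketch of how one such record could be modeled. It is not IncluSet's actual schema: the class name, field names, and availability labels are all assumptions made for illustration, and the example values are placeholders.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Availability(Enum):
    # Hypothetical labels for how a dataset can be obtained.
    PUBLIC = "public"
    UPON_REQUEST = "upon_request"
    NOT_SPECIFIED = "not_specified"


@dataclass
class DatasetRecord:
    # Hypothetical metadata record; only metadata is kept, never the data itself.
    name: str
    populations: List[str]
    data_types: List[str]
    technology: Optional[str]
    availability: Availability
    download_url: Optional[str] = None   # shown only for public datasets
    contact_email: Optional[str] = None  # shown for upon-request datasets
    info_url: Optional[str] = None       # optional page describing the data

    def access_details(self) -> Dict[str, str]:
        """Return only the access fields that apply to this availability."""
        if self.availability is Availability.PUBLIC:
            return {"download_url": self.download_url} if self.download_url else {}
        if self.availability is Availability.UPON_REQUEST:
            details = {}
            if self.contact_email:
                details["contact_email"] = self.contact_email
            if self.info_url:
                details["info_url"] = self.info_url
            return details
        return {}


# Placeholder record, not a real dataset.
example = DatasetRecord(
    name="Example dataset",
    populations=["population_a"],
    data_types=["image"],
    technology=None,
    availability=Availability.UPON_REQUEST,
    download_url="https://example.org/should-not-be-shown",
    contact_email="creator@example.org",
)
print(example.access_details())
```

In this sketch, a download link on a record that is only available upon request is never surfaced, which mirrors the rule that download links appear only for publicly available datasets.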

We have now moved to our next step, which is to deepen our understanding of data sharing practices as well as the perspectives of data contributors from the disability community. We analyzed the datasets in our IncluSet repository by their metadata, including the populations represented, data types, project source/funding, and data sharing methods. We aim to identify patterns in existing data sharing practices, such as whether dataset creators' choices (providing a direct download link, sharing upon request, or providing no information on dataset sharing) vary by target population group.
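The kind of pattern described above can be examined with a simple cross-tabulation of sharing method by population group. The sketch below is an illustration under assumptions rather than the analysis used in our publications: the function name, the sharing-method labels, and the toy records are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def sharing_by_population(
    records: Iterable[Tuple[List[str], str]]
) -> Dict[str, Dict[str, int]]:
    """Count sharing methods per population group.

    Each record is (populations, sharing_method). A dataset that covers
    several populations is counted once under each of them.
    """
    table = defaultdict(Counter)
    for populations, sharing_method in records:
        for population in populations:
            table[population][sharing_method] += 1
    return {population: dict(counts) for population, counts in table.items()}


# Toy records with placeholder labels, not real IncluSet entries.
toy_records = [
    (["population_a"], "direct_download"),
    (["population_a"], "upon_request"),
    (["population_b", "population_c"], "no_information"),
    (["population_b"], "direct_download"),
]

for population, counts in sharing_by_population(toy_records).items():
    print(population, counts)
```

Counting a multi-population dataset under each of its groups is one possible design choice; it keeps the per-group counts easy to read, at the cost of row totals that can exceed the number of datasets.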

Related Publications:

Rie Kamikubo, Kyungjun Lee, and Hernisa Kacorri, “Contributing to Accessibility Datasets: Reflections on Sharing Study Data by Blind People”, In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI ’23).

Rie Kamikubo, Lining Wang, Crystal Marte, Amnah Mahmood, and Hernisa Kacorri, “Data Representativeness in Accessibility Datasets: A Meta-Analysis”, In Proceedings of the 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22).

Rie Kamikubo, “Facilitating Sharing and Re-use of Accessibility Datasets: Benefits and Risks”, ACM SIGACCESS Accessibility and Computing, Issue 132, Article 4, 2022.

Rie Kamikubo, Utkarsh Dwivedi, and Hernisa Kacorri, “Sharing Practices for Datasets Related to Accessibility and Aging”, In Proceedings of the 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’21). *Simmona Simmons Best Student Paper on Diversity Award

Hernisa Kacorri, Utkarsh Dwivedi, and Rie Kamikubo, “Data Sharing in Wellness, Accessibility, and Aging”, In the NeurIPS Workshop on Dataset Curation and Security, 2020.

Funding:

This project is supported by the National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR, ACL, HHS #90REGE0008).
