
Considering Ethical Best Practices in Data for Good Projects

1 May 2019
This column is written for those interested in learning about the world of Data for Good, where statistical analysis is dedicated to good causes that benefit our lives, our communities, and our world. If you would like to know more or have ideas for articles, contact David Corliss.

With a PhD in statistical astrophysics, David Corliss leads a data science team at Fiat Chrysler. He serves on the steering committee for the Conference on Statistical Practice and is the founder of Peace-Work, a volunteer cooperative of statisticians and data scientists providing analytic support for charitable groups and applying statistical methods in issue-driven advocacy.

 

At the ASA’s recent Conference on Statistical Practice, I had the opportunity to speak on a panel discussing ethics. Organized by ASA President-elect Wendy Martinez, the panel was well received, and a group has begun work on future events.

While my presentation recommended Data for Good as one dimension of the work of the ethical statistician, most of the conversation was around data privacy. I was there to talk about when things go right, while my colleagues addressed the pressing need to work out what happens when things go wrong.

These twin concerns—best practices for good and worst practices that do harm—are not entirely separate. As statisticians and data scientists seek opportunities to do good, ethical best practices will become second nature.

This month’s column is an opportunity and invitation to engage in discussions about ethical best practices, both in general and specifically in the context of Data for Good. These thoughts are offered, then, not as answers but as conversation starters—something for all of us to think about and discuss.

Of course, researchers always will want to employ best practices for the ethical use of data. Ethical considerations in Data for Good studies can include the following:

  • Informed consent for the use of privately obtained data
  • Data use consistent with the purposes stated at the time of collection
  • Transfer of data to third parties, including government, law enforcement, and other agencies serving the people affected
  • Data security
  • Data retention
  • Data ownership

While ethical practices are a concern for any statistician, the circumstances and context of Data for Good projects highlight particular concerns. As just one example, banding birds to track them across future studies is one thing; tracking domestic violence or human trafficking victims across multiple databases (including those of advocacy groups, service providers, and law enforcement) is quite another! One important question is data ownership, which can be complex in our projects.

For example, suppose a news media organization interviews the family of a crime victim and publicly disseminates the report. A Data for Good researcher captures this public report as part of an epidemiological study of that particular type of crime. In this instance, claims to the ownership of the data contained in the news report could be made by the victim, the family, the reporter, the media service, the perpetrator, and the general public.

To help start a conversation, the following practices could be recommended as part of a researcher’s data governance, supporting ethical use of personally identifying data in Data for Good projects:

  • Follow all applicable laws and regulations.
  • Encrypt all data with the potential to identify individuals, either alone or in connection with other data, at rest and in motion.
  • Establish a retention policy and record the date the data were captured.
  • Avoid making unnecessary copies of the data.
  • Treat publicly available sources of data that may be used for purposes of identification in the same manner as privately sourced data.
  • Focus on creating security standards and practices that meet the reasonable expectations of the persons whose data are collected. Compliance with applicable laws is necessary, but may not be sufficient to support ethical practice in all cases.
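
As one conversation starter, the retention item in the list above could be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `RetentionPolicy`, `CapturedRecord`, and `is_due_for_deletion`, the one-year limit, and the sample dates are all hypothetical and do not come from any particular project or standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical policy: how long a record may be kept after capture."""
    max_age: timedelta


@dataclass(frozen=True)
class CapturedRecord:
    """Hypothetical metadata for one record; the sensitive data is not shown."""
    record_id: str
    captured_on: date  # the date the data were captured


def is_due_for_deletion(record, policy, today):
    """Return True when the record has been held longer than the policy allows."""
    return today - record.captured_on > policy.max_age


# Assumed one-year retention period, chosen only for illustration.
policy = RetentionPolicy(max_age=timedelta(days=365))
old = CapturedRecord("r-001", date(2018, 1, 15))
recent = CapturedRecord("r-002", date(2019, 3, 1))
today = date(2019, 5, 1)

due = [r.record_id for r in (old, recent) if is_due_for_deletion(r, policy, today)]
print(due)
```

Storing the capture date with every record is what makes a check like this possible; the actual retention period is a governance decision, not something the code can supply.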

Storage and use of personal, highly sensitive data require adherence to all applicable laws. Above and beyond legal requirements, ethical use of data should focus on the security of personally identifying data and on use consistent with the informed, reasonable expectations of data owners. Governance recommendations for ethical use further include informed consent that covers research purposes and access; use consistent with the purposes stated to stakeholders when the data are obtained; strong data security, with encryption in motion and at rest; and adherence to a clearly stated data retention policy.

Another consideration for ethical best practices is that changing technology brings new situations. Ethical questions about the use of data aren’t like math problems that can be solved once and set aside. As times, tools, and applications change, new questions need to be addressed. In this ongoing challenge, project directors and principal investigators may want to review older databases to ensure new uses are consistent with the purposes stated and agreed to at the time of data collection.
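
A review of older databases could begin with something as simple as comparing proposed uses against the purposes recorded at collection. The following is a minimal sketch, assuming purposes were recorded as plain-text labels; the `Dataset` class, the `uses_needing_review` function, and the example labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical description of an older database and its consented purposes."""
    name: str
    consented_purposes: frozenset


def uses_needing_review(dataset, proposed_uses):
    """Return proposed uses not covered by the purposes stated at collection."""
    return sorted(set(proposed_uses) - dataset.consented_purposes)


# Example labels are invented for illustration only.
survey = Dataset(
    "shelter_intake_2015",
    frozenset({"service planning", "annual reporting"}),
)
flagged = uses_needing_review(survey, ["annual reporting", "record linkage"])
print(flagged)
```

A flagged use is not automatically unethical; it simply marks a question for the project director or principal investigator to take up, possibly by seeking renewed consent.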

In modeling ethical best practices, everyone in the Data for Good community (professionals, students, and volunteers) needs to play a central role. As people striving for the well-being of others, D4G researchers must be advocates and role models for others to see and follow. Just as we are helping to shape the public image of statistics, the Data for Good community is called to be at the forefront of developing ethical practices; asking the hard questions; and advocating for data privacy, security, informed consent, and use consistent with permissions given. How do we go about doing this? Let’s start the conversation …

Get Involved
Gartner Research has put together a great Data for Good web page with a list of organizations, case studies, and other resources. This is a page people will definitely want to see.

Also, for students heading home for the summer: is participating in Data for Good part of your plans? D4G offers the opportunity to take what you have been learning in the classroom and apply it to real-world problems. One good way to start is talking with people at organizations with which you are already connected. Maybe it’s an animal shelter, community organization, or sports group. Find a place where you are connected already and ask how your statistical skills can be used to make a difference for good.
