#LeadWithStatistics: A Data Ethics Call to Action

1 May 2020

Wendy Martinez. Photo courtesy of Barbi Barnum, Studio B Photography

#LeadWithStatistics was the challenge and opportunity delivered to us by Lisa LaVange during her JSM 2018 president’s address. In this column, I want to share some thoughts about data ethics and again challenge our community to lead.

For more than 25 years, annual ethics training has been part of my US federal government employment. Most of these classes covered topics such as conflicts of interest, taking outside employment, accepting gifts, and working after government service. There is even a government office dedicated to ethics called (not surprisingly) the Office of Government Ethics.

The type of ethical behavior outlined by government offices is directed by regulations and legal codes. Although this is important, I believe we also need to work with colleagues and stakeholders on data ethics education.

Encyclopedia Britannica offers this definition: “Ethics, also called moral philosophy, the discipline concerned with what is morally good and bad and morally right and wrong. The term is also applied to any system or theory of moral values or principles.”

The set of laws or regulations provided by government offices to their employees does not suffice as ethical guidelines. Ethical guidelines are a set of moral principles that guide our behavior, which in turn depends on our cultural and religious beliefs and demographic characteristics such as age, gender, and education (Moral Machine). We need the same set of principles for data ethics.

My formal data ethics journey began at the 2019 Conference on Statistical Practice held in New Orleans, where I had the honor of chairing a panel session on ethics. This session emerged from an abstract submitted by ASA member Mary Gray of American University. David Corliss—author of the Amstat News Stats4Good column—and Juan Lavista Ferres from Microsoft joined her for a discussion about the risk of algorithms used by data scientists, along with the legal and ethical implications that result.

The CSP 2019 panel made me realize that, although I had substantial experience with professional codes of ethics, my knowledge was incomplete. This panel discussion delved into aspects of ethics I had not considered. Panel members also gave numerous examples of models that have learned biases inherent in the data used to build them. Mary gave a great talk on the COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) tool, which uses a black-box proprietary model to highlight potential areas where crimes will take place, provide information or recommendations to use in sentencing, and predict the risk of recidivism. For more insights into black-box algorithms and the need for interpretable models, I encourage you to watch the ASA Government Statistics and Social Statistics sections’ virtual workshop given by Cynthia Rudin of Duke University.

The ASA Ethical Guidelines for Statistical Practice are essential to our work and should inform our interactions with colleagues and stakeholders. The ASA Committee on Professional Ethics (COPE), whose charge is to maintain and promulgate the set of ASA ethical guidelines, has developed resources to help with educating people about the guidelines.

COPE Chair Michael B. Hawes has this to say about a call for action:

The Committee on Professional Ethics reviews and revises the ASA’s Ethical Guidelines for Statistical Practice every five years so they remain current and relevant for our members and for the broader statistical community. In preparation for the next revision in 2021, the committee has created a discussion board for ASA members to submit suggestions. If you would like to comment, you may do so at the ASA ethics website.

Rochelle E. Tractenberg of Georgetown University has written several papers about higher education and how to incorporate ethics throughout one’s career. One of these papers includes a description and cross-walk of two sets of guidelines: one from the Association for Computing Machinery (ACM) and the other our very own ASA code of ethics. She provided the following content to this column:

National Academy of Sciences (2018) Recommendation 2.5: The data science community should adopt a code of ethics; such a code should be affirmed by members of professional societies, included in professional development programs and curricula, and conveyed through educational programs. The code should be reevaluated often in light of new developments.

Data science arises from two disciplines with long-standing commitments to ethical practice: computing and statistics. Ethical guidelines have been developed over several decades to support ethical professional practice with (as well as the application of) tools, techniques, and methods from both statistics (ASA 2018) and computing (Association for Computing Machinery, ACM, 2018). Both the ASA (representing roughly 18,000 practitioners worldwide) and the ACM (representing roughly 100,000 computing professionals worldwide) assert that their ethical practice guidance should pertain to members and non-members alike who utilize their methods and techniques. Although neither group specifies that their ethical guidance is relevant for data science per se, examination of the concordance in their guidance is a natural first step for describing “ethical data science.” Because the ASA and the ACM represent essential constituent disciplines of data science, Table 1 in a white paper published as part of an Open Science Framework project explores the thematic alignment (i.e., concordance) between their two ethical guidance documents (as of 2018).

A three-year grant proposal under review at NSF seeks (in part) an independent examination of the concordance of the ASA and ACM ethical guidance. The purpose is not to influence the content of either set of ethical practice guidelines. Rather, if each organization acknowledged that its own ethical guidance and the other organization’s are relevant, together, for defining ethical data science, that would accomplish NAS Recommendation 2.5.

We have made important contributions, and it is critical that we continue to ensure the ethical practice of data science. I encourage you to read the Ethical Guidelines for Statistical Practice, provide comments to the committee about potential revisions, and engage your colleagues in discussion about data ethics. We can #LeadWithStatistics!

Data Ethics Principles
During talks on data ethics, ASA President Wendy Martinez asked attendees to rank data ethics principles. The principle deemed most important in every instance was transparency.

What data ethics principle is most important to you? Take a short survey and let us know your thoughts.

This is a sample of existing data ethics resources:
UK Government
Magna Carta for Data
DataEthics
Royal Statistical Society Data Manifesto

Join the conversation!

One Comment »

  • Wendy Martinez said:

    Just wanted to pass along an additional resource on data ethics. This is a seminar given by past ASA President Jessica Utts. The title of the seminar is “Enhancing Data Science Ethics through Statistical Education and Practice.” You can see her slides and hear the seminar by going to

    https://www.stat.uci.edu/seminar-series/

    and scrolling down to April 30.

    I hope others will post relevant data ethics links and resources as comments to this column.