Q&A with Rochelle Tractenberg

2 January 2023

One of the world’s leading experts on the practice of ethics, statistician and Georgetown University professor Rochelle Tractenberg has written two books, Ethical Practice of Statistics and Data Science and Ethical Reasoning for a Data-Centered World. These companion volumes are the first and only books to be based on, and provide guidance for using, the American Statistical Association and Association for Computing Machinery’s ethical guidelines/code of ethics.

We wanted to find out more about the author and these publications, so we asked her the following questions.

Who is the audience for these books?

There are three audiences, really.

  1. Instructors in any quantitative course (Ethical Reasoning for a Data-Centered World) in/for any field, including philosophy, business, and computer science, or instructors of courses in which the students are closer to actually practicing (Ethical Practice of Statistics and Data Science).
  2. Self-directed learners in any field in which data is used (e.g., students taking an intro to stats or intro to data science course).
  3. Practitioners who find themselves using data and/or statistical practices and want to assure themselves or others that they practice ethically. This group could include managers or supervisors of teams that use data, statistics, or data science.

The books feature ethical reasoning as a paradigm, which can be used at any point in a career, with any ethical code/conduct or principles. The books are intended to support, and could also be used to teach, ethical engagement in any discipline in which data, data science, and statistics are used.

Why are these books important?

These books are important for the following three reasons:

  1. Neither the ASA nor the ACM has effectively disseminated its ethical standards of practice, although both assert (within their standards) that the ethical guidance is intended for all who use their disciplinary tools and techniques. The books make these guidance documents accessible using authentic tasks (Ethical Reasoning for a Data-Centered World) and cases (Ethical Practice of Statistics and Data Science) so readers can learn about and practice reasoning with the ethical standards.
  2. Ethical Reasoning for a Data-Centered World is intended for every person who uses statistics and data science (or deals with data), in support of the ASA and ACM objective that everyone who uses statistical practices and computing can do so ethically. Even if they only take one course in statistics or data science, readers of Ethical Reasoning for a Data-Centered World can also learn about their ethical obligations in an authentic way. Just because you’re not a statistics or data science major, or that’s not your job title or objective, does not remove your obligation to be transparent, honor and respect stakeholders and data providers, and act in a stewardly way toward the data and those who make decisions based on the data.
    Ethical Practice of Statistics and Data Science is intended for people who are practicing or are close to becoming independent practitioners. The main difference between the books is that in Ethical Reasoning for a Data-Centered World, the bulk of the book (Section 2) is dedicated to learning how to work with data in an ethical way—what the ethical practice standards say about doing the job ethically. By contrast, the bulk of Ethical Practice of Statistics and Data Science (Section 3) is on actual cases. Having had experience working with others is important background for the 47 case analyses in Ethical Practice of Statistics and Data Science. So, rather than having one book for every reader/user of statistics and data science, there are two—and they are complementary if readers move from learning to work ethically with data (Ethical Reasoning for a Data-Centered World) to learning to work ethically with people and data (Ethical Practice of Statistics and Data Science).
  3. These are actually the first books to feature these guidelines to promote ethical practice of statistics and data science. Not only are they fully current, they are also inclusive of checklists and practice standards. There are many discussion opportunities for students, instructors, and practitioners, which I hope end up being useful.

In cases where people are required to learn about “responsible conduct of research,” it is better to learn how to reason and follow coherent practice standards than it is to learn about abstract concepts like “paternalism” or “autonomy”—without having those clearly linked to your actual statistics and data science practice. Modern applications of statistics and data science “in research” require a great deal more attention to what constitutes ethical statistics and data science than is currently given in typical bioethics-oriented or machine learning/artificial intelligence–intensive “ethics” training.

Do your books provide techniques for solving specific ethical problems?

There are three main techniques presented in both books. These are useful for identifying problems, determining options, and justifying decisions relating to any kind of ethical problem that can arise in the practice of statistics and data science.

  1. Consider where you are in what I call the “statistics and data science pipeline.” If you recognize what tasks you are doing, it makes it simpler to identify stakeholders and applicable aspects of the ethical practice standards. Both books are organized along this pipeline.
  2. Examine the effects of your choices on stakeholders in your decision-making. Ethical Practice of Statistics and Data Science in particular, with its 47 cases, features a stakeholder analysis in every case analysis. Examining all of the stakeholder analyses, or just one, can make it clear that a) harms and benefits are not exchangeable; b) harms to a stakeholder beyond the practitioner can typically be quite serious, whereas benefits that may accrue when ethical guidelines are ignored are typically not impressive or important; and c) harms and benefits of ignoring ethical practice standards do not accrue equally. The public and public trust are typically harmed by unethical practice, and these harms can be serious (e.g., data breaches lead to degraded public trust and potential harms to the public whose data is breached). Benefits, by contrast, are typically minor (e.g., saves time).
  3. Ethical reasoning helps you identify unethical behavior and choose, and justify, how to respond to it. It is recognized that just knowing about ethical guidelines, or even knowing their contents, is not sufficient to make defensible decisions. These books seek to rectify that by presenting ethical reasoning as a learnable, improvable skill set. One thing I hope is made clear in all the cases is that ignoring unethical behavior, or ‘doing nothing’ when you are faced with unethical behavior, is always an option but never an ethical one. Both the ASA and ACM ethical practice standards are clear about this.

Why is it important to learn how to reason ethically, rather than memorize ethical practice standards?

There are three primary reasons for learning to reason ethically rather than memorizing standards:

  1. Knowing how to reason ethically is something you can learn, improve, and apply to any set of rules or guidelines. This is quite different from knowing what the rules are, although knowing the ethical guidelines (or workplace policy) is the first step in the process of ethical reasoning—establishing the knowledge that is prerequisite for figuring out what is going on and what to do about it.
  2. The 2022 ASA Ethical Guidelines for Statistical Practice has 72 elements. Memorizing them would be … challenging. Rather, being familiar with the eight organizing principles and knowing there is a 12-item appendix, plus knowing the process of ethical reasoning, would be a lot simpler and is probably a more achievable/realistic goal for more people. You can apply ethical reasoning in any situation or context, including statistics and data science, so it’s just more efficient to learn how to reason ethically than to memorize the guidelines.
  3. The ASA Ethical Guidelines for Statistical Practice is reviewed periodically and, since 2016, quinquennially (every five years). The 2022 revisions reflect the results from the first of these quinquennial reviews. Since the world is changing, ethical practice standards must also change, if needed, to reflect current/modern practice. Memorizing them each time they’re revised could be confusing. Also, if you’re teaching statistics/statistical practice, the students or mentees you have one year could learn one set of guidance while the next year (or later) they learn a different set.

This is true for formal and informal teaching (e.g., learning on the job). To understand and reflect on the dynamic nature of statistics and data science, it is important to recognize how the ethical practice standards may also change over time. Committing them to memory each time is less efficient than learning how to use them (at any time/in any version) in an argument or to determine and then justify a course of action.

It should be noted that the effort you invest in learning to reason ethically and becoming familiar with what constitutes ethical statistics and data science practice is authentic to work in the domain. While exploring interesting “ethical dilemmas” such as self-driving cars and how artificial intelligence identifies criminal activities is engaging, those are a) specific cases without much generalizability and b) not related to the day-to-day activities of statistics and data science, all of which need to be done ethically.

What role does ethical reasoning play in the development and support of professionalism?

This is a great question. As with the ethical practice standards, people studying biostatistics, statistics, and data science are unlikely to get much instruction about professional identity. Merriam-Webster defines “professionalism” as “the conduct, aims, or qualities that characterize or mark a profession or a professional person,” which are externally observable.

In their 2002 Focus on Health Professional Education article, “Clinical Reasoning and Self-Directed Learning: Key Dimensions in Professional Education and Professional Socialisation,” M. Paterson, J. Higgs, S. Wilcox, and M. Villeneuve define professional identity as “… the sense of being a professional … the use of professional judgment and reasoning … critical self-evaluation and self-directed learning …,” which is internal and not as externally observable.

Research done on physical therapists learning their profession presented in the 2012 Asia-Pacific Journal of Cooperative Education article, “Role of Work-Integrated Learning in Developing Professionalism and Professional Identity,” noted, “Professional identity formation means becoming aware of … what values and interests shape decision-making.” This definition includes an internal (becoming aware) and external (values and interests in decision-making) element. The books definitely aim for the latter, with ethical reasoning plus the ethical practice standards. You don’t have to be in the major or have that job title to have a sense of doing your job that includes statistics and data science in an ethical and professional manner.

I would argue that the ASA Ethical Guidelines for Statistical Practice (and ACM Code of Ethics), given their origins with experienced practitioners and their general support for any kind of practice with statistics (ASA) and computing (ACM), represent the “values and interests” that shape decision-making by ethical practitioners in statistics and data science. Transparency, refusing to generate results simply because one is asked or pressured to do so, and demonstrating respect for the rights and wishes of data contributors, all of which are key features across multiple ASA ethical guideline principles, should characterize the professional identity of anyone who works with data, even if their job title or degree (or both) is something other than “statistician” or “data scientist.”

Additionally, I would say a person who is asked (or told) to memorize the ASA Ethical Guidelines for Statistical Practice is less likely to feel like a member of the profession that engages in ethical statistics and data science, and a person who learns to reason with the ASA ethical guidelines is more likely to feel like a member. They will certainly know how to use those guidelines in a professional setting, and they might be a great person to confer with in case you ever run into an ethical challenge you’re unsure about how to address.

Also, it’s worth reiterating that if you use computing, statistics, or both, you have ethical obligations to do so in accordance with their ethical practice standards, even if your professional identity is “scientist.” If your scientific discipline does not specify that it is unethical to mislead readers with your statistical analysis or report (and few do), the ASA Ethical Guidelines for Statistical Practice actually specify this (C2). If you’re using statistics to get your science or its reporting done, you should follow the ASA ethical guidelines to ensure you’re using statistical practices ethically. When you don’t, you may be contributing to the reproducibility crisis going on in science right now.

How can a client know they are working with an ethically trained statistician or data scientist?

There is probably a lot of confusion about what would qualify someone as being ethically trained in any field. I personally have completed the bioethics training (same content every three years since about 1994), none of which relates in any way to being transparent and stewardly with statistics or data science. There is no way this training could plausibly be considered ethical training for statistics and data science. However, since this type of bioethics-based training is federally mandated for scientists who work with human subjects and research that is federally funded (as I do), it does in at least one sense make me ethically trained.

So, the first way to know you’re working with someone who is actually ethically trained in statistics or data science is to confirm they have done something beyond the mandated US training in responsible conduct of research: specifically, training that has to do in some sense with the ethical use of statistics and data science.

This is not to say, nor to suggest, that statisticians and data scientists who haven’t done extra training beyond what is required at work are not ethical. That is not my point at all. This question is about identifying an “ethically trained statistician or data scientist,” and I’m promising that completing US-based training for responsible conduct of research does not, by itself, meet the criterion, because the training is simply neither designed nor intended to support ethical practice of statistics or data science.

The 2018 National Academies report on the undergraduate data science curriculum suggests “ethics” should be integrated throughout the data science curriculum. However, would a curriculum that requires students to memorize the ASA Ethical Guidelines for Statistical Practice turn out “ethically trained” practitioners? I would argue not. In fact, in their book Ethics and Science: An Introduction, Adam Briggle and Carl Mitcham state, “Ethics is the effort to guide one’s conduct with careful reasoning. One cannot simply claim ‘X is wrong.’ Rather, one needs to claim ‘X is wrong because (fill in the blank).’” So, here is an extreme example of a practitioner recognizing that a request created an ethical problem and using the ethical guidelines to explain exactly why they could not comply with that request:

Client: “Have you ever used the ASA or ACM ethical practice standards to guide your decision-making?”

Statistician or Data Scientist: “Yes, one time, and I said, ‘Sorry, that’s against ASA ethical guidelines A2, A4, B2, C2, E3, E4, and H2, and it also makes you violate G1 and G5.’ Then, I noticed the request violated appendix items 1, 4, 8, and 9.”

That would be evidence they’re an ethical practitioner, whether or not they received formal training!

The ASA Professional Statistician Accreditation (PStat®) requires that an applicant “… affirms intent to uphold the ASA’s Ethical Guidelines for Statistical Practice.” So theoretically, each PStat-accredited statistician would be someone who, if not ‘ethically trained,’ is committed to upholding the ASA Ethical Guidelines for Statistical Practice.

My hope is that these books, whether they’re used to teach courses or they’re used by self-directed learners and practitioners, will lead to a much larger group of people who can state they have used the ethical practice standards to “guide their conduct with careful reasoning.” Once you complete a course in which the guidelines, and reasoning with them, are featured—taught and learned—you would be able to see in the transcript that the person was “trained in ethical statistics (and/or data science).” I think “trained in ethical practice” is a lot clearer and a far easier criterion to meet than “ethically trained.”

What prompted you to write these books?

As an academic, I became interested in 2009 when the National Institutes of Health issued NOT-OD-10-019, “Update on the Requirement for Instruction in the Responsible Conduct of Research.” This update includes the statement, “Responsible conduct of research is defined as the practice of scientific investigation with integrity. It involves the awareness and application of established professional norms and ethical principles in the performance of all activities related to scientific research.”

What struck me about the update were 1) its implication that only people who are being trained to do research are required to get training in ethical (research) practices and 2) its reliance on “established professional norms” without specifying that if your profession is something like clinician, for example, that profession’s norms are unlikely to sufficiently describe ethical statistical practices in cases where statistical practices are not generally part of your profession. [As an aside, the update also suggested I, as a statistician, should be trained to follow the NIH’s ethical recommendations instead of the ASA’s when I do statistical research if it is federally funded and involves humans!]

I was on my institution’s task force for responding to the update. The person sitting next to me at the first task force meeting (an ethicist and microbiologist) and I started exploring the relevance of the ethical reasoning paradigm for meeting—and often exceeding as it turned out—the NIH’s targeted learning.

I was appointed to the ASA Committee on Professional Ethics in 2013 and became the vice chair, chairing the first working group on revising the ASA Ethical Guidelines for Statistical Practice (originally approved in 1995). Recognizing the critical role of statistics and data science in reproducible biomedical research—and how the NIH policy excluded emphasis on ethical statistics and data science content—drove me to work on getting the ethical reasoning paradigm to be more widely shared.

Working on the ASA ethical guidelines and their revisions led me to recognize the committee faces an uphill battle when meeting its charges to a) “sensitize members of ASA to the ethical issues in statistical practice and in other fields in which statistics is used” and b) to “… promulgate … the set of ASA Ethical Guidelines that describes the general view of ethics in statistical practice.” To me, ethical reasoning plus the guidelines helps the committee meet both charges.

I had a sabbatical in 2019, so both books were drafted during that time. Both books include a map so two cases can be shown to reflect each of the 2022 NIH “responsible conduct of research” topics. Using these books, individuals can learn to use statistics and data science ethically [whether or not it is for human subjects/federally funded research], while also fulfilling the NIH mandate for their particular topical training, if that is a requirement.

Is there anything you would like to share that we haven’t asked you about?

One question I kind of expected was along the lines of, “How can anyone be expected to squeeze a new course—much less two—on ethical practice into an already-crowded curriculum?” This is one reason I wanted to make sure the books are accessible to any practitioner, as well as to instructors and self-directed learners. I don’t expect programs or instructors to prioritize ethical practice or reasoning over learning the discipline itself. What I hope will happen, and is already happening, is that programs will see these books as auxiliary texts, with ideas about and opportunities to reflect on ethical dimensions of practice throughout all the courses in a statistics or data science program.

We don’t expect all instructors to know about all the methods their program teaches, but it seems more reasonable to expect people teaching about statistics, whatever the method/course, to be knowledgeable about how to do their specialty method ethically. If students are asked to use the same two books throughout their course of study, they will learn to reason ethically with any material, and they will learn how to do everything the program is established to teach them in an ethical way.

I recognize this could require a massive, coordinated effort. Some programs are contemplating doing this at the same time they are planning to create or revise a statistics and data science program. If anyone does do this, I hope they will share their struggles and successes with the rest of the community.

Something else I note when I’m discussing the books or ethical reasoning for statistics and data science is that the statistics and data science pipeline (plan/design, collect/munge/wrangle data, analysis—run or program to run, interpret, document your work, report and communicate, and work on a team) identifies as many as seven distinct opportunities to disrupt an overall unethical practice or norm. For example, the Cambridge Analytica scandal could have been interrupted by anyone engaging in one or all of these tasks.

Are you familiar with any places a statistician or data scientist can ask for help from a colleague when confronted with an ethical issue?

What a great question! The short answer is the colleague. All the cases (seven in Ethical Reasoning for a Data-Centered World and 47 in Ethical Practice of Statistics and Data Science) discuss how and when to confer with a colleague or peer. One of the important aspects of the books is to help familiarize readers with both the ethical practice standards—which are important for having any kind of discussion about what to do when confronted with an ethical issue—and what options or next steps are feasible.

Ethical reasoning applied to a case (or event) will encourage practitioners to collect the information they have in an organized way. This alone can help them feel more confident that they have identified an ethical issue. If you work somewhere with an ombudsman or you feel comfortable going to your supervisor, having a sort of write-up of the problem will definitely help you initiate the conversation.

Ethical Practice of Statistics and Data Science discusses role playing as a way to get yourself more familiar with the kind of conversations with colleagues you might have. The person you discuss the situation with doesn’t have to be an ethicist, just someone who is willing to read the ethical practice standard (ASA or ACM) or your workplace policies carefully and thoughtfully.

My ultimate hope for these books is that, as a result of reading and working through the examples, readers will be those colleagues who are able to provide a careful and thoughtful conversation.

One of my favorite quotes about instruction in ethical practice comes from Michael Kalichman in the report of a National Academy of Engineering meeting: “… (t)he entire community of scientists and engineers benefits from diverse, ongoing options to engage in conversations about the ethical dimensions of research and (practice).” I hope the books help create those options and motivate practitioners in statistics and data science to engage in these conversations and learn to see themselves as those colleagues to whom others can turn for help identifying and responding to ethical challenges at work.
