Risks and Best Practices for Working with Student Data

There are many programs, departments, and offices across campus that rely on accurate and timely information about students to evaluate their impact, growth, and success. As student data becomes more readily accessible through reporting systems like SIRIS, we want to offer some thoughts about how to present, interpret, and distribute this information responsibly. The list below is not exhaustive, but it can be used as a starting point for conversation or simply as food for thought. The issues below are also not intended as guidance on who should have access to data; rather, they introduce some of the common pitfalls that those who regularly prepare and analyze data may encounter. We encourage you to share your comments and suggestions with us at siris-support-team@lists.stanford.edu.

As a starting point, we have created three fictional scenarios that illustrate some of the ways that data might be misused or misinterpreted. Click the links embedded in each scenario to learn more about that particular risk and how to mitigate it.


Scenario 1
B has recently joined Stanford as the Associate Director of an Interdisciplinary STEM Program for undergraduates. She is excited to improve the student experience, and is interested in understanding more about how students progress through the different courses in the program as well as whether there are differences in outcomes for different groups of students. She uses SIRIS to pull a list of all students who have completed the program in the last year, as well as their grades in all of the program courses and several personal attributes including gender and race/ethnicity. After examining the course grade data, she notices that the average grade of the URM students in the program courses (a total of 4 students out of 25) is equal to that of the non-URM students (over-interpreting significance). B wonders whether this result is representative of other STEM-related IDPs on campus, so she also pulls the data for their students and courses (comparing without communicating). After finding that the other programs do show a statistically significant difference between the performance of URM and non-URM students in their courses, she creates several plots on the second tab of her Excel spreadsheet and emails everything to the Dean of the school (release of student row level data). In hopes of increasing the diversity of the program, B also updates the program website to include a statement that, in contrast to other programs on campus, this program promotes achievement by URM students (stereotype threat).
Scenario 2
G is a new analyst in the Diversity Office of School X. He is trying to prepare a report for the annual budget letter, and wants to show that the Diversity Office has been successful in increasing the number of women and under-represented minorities in the School’s graduate student population since it was formed 5 years ago. His boss, the Assistant Dean for Diversity, has requested this analysis and will be presenting it to the Dean at the next meeting of the School’s advisory council. He pulls demographic data for the graduate students for the last 10 years from SIRIS, and is gratified to see that the percentage of enrolled women and URM students shows a significant increase 4 years ago, one year after the Diversity Office began its recruitment efforts at national conferences. He produces some graphs illustrating the trend, and puts them in the PowerPoint deck for the Assistant Dean. At the meeting, the Assistant Dean shows the data and explains how the Diversity Office’s efforts to attract more diverse applicants have been very effective. Someone else at the meeting points out that the new coterminal Master’s program began 4 years ago, and asks whether that would be more likely to explain the apparent increase (mistaken causality). Additionally, another presentation at the same meeting by the Assistant Dean for Academic Affairs appears to show no increase in diversity in the graduate student population over the same time period. Only after several minutes of confused discussion is it understood that the second analysis did not include coterms in its data (conflicting findings).
Scenario 3
K was recently hired as an analyst in the Office of Undergraduate Affairs. A reporter from Buzzfeed contacts him to ask what percent of freshmen take an engineering class in their first year. K pulls the numbers and sends them off to the reporter (release of data to the media). The following week, the Senior Associate Dean of Arts and Humanities is forwarded the article by one of her staff. The article states that 87% of freshmen take an engineering class and is titled “Interested in humanities? Don’t go to Stanford!”. The article does not mention any comparable statistics for other universities or other divisions at Stanford (inappropriate use of statistics). The Senior Associate Dean is angry and embarrassed. She is scheduled to give a presentation the next week at the Faculty Senate about how Arts and Humanities has actually been successful at boosting overall freshman enrollment during the last five years, and how in the last year 83% of freshmen also took a class in her division. Now, however, everyone on campus is talking about how the humanities are unpopular with students (findings taken out of context).

Release of Data to the Media

Risks: Sharing data or analysis with the media or public without consulting the appropriate channels can potentially result in negative personal and professional consequences such as termination from your position and loss of eligibility for employment at Stanford in the future.

Example: A reporter from a newspaper contacts a school’s dean’s office to ask what the most popular undergraduate class is and what the breakdown of enrollment by ethnicity is for the class. The reporter picks a person at random from the student service office on the dean’s office contact list, and that individual releases the information.

Best Practices: When contacted by a member of the press, always contact University Communications (650-725-8396) before responding. As a nonprofit, educational institution, Stanford is bound by laws and policies that restrict the kinds of information that we are allowed to share. At a minimum, contact your supervisor for guidance.

 

Release of Data to an External Source

Risks: Sharing data or analysis with the media or public without consulting the appropriate channels can potentially result in negative personal and professional consequences such as termination from your position and loss of eligibility for employment at Stanford in the future.

Example: A ranking agency, such as US News, reaches out and asks for student population counts, cost of tuition per student, and graduation rates at one of the schools.  The ranking agency’s normal contact is out of the central office so they reach out to the school for the information.  The student service officer gives the agency the information without checking with their supervisor first to see how the request is normally handled, and if it should be forwarded to a central office.

Best Practices: When contacted by a member of an external agency for data to be used in a survey or report, always contact your supervisor or Institutional Research and Decision Support for guidance. Many times these requests are handled at the university level due to the necessity to coordinate across many different units and to determine if university data should be provided to the external agency.

 

De-anonymization

Risks: Stanford is bound by federal (FERPA) and state law to protect student privacy. Tables, graphs, and other visualizations which result in clusters or cells of fewer than 5 students carry a high risk of allowing individual students to be identified.

Example: A program director is asked to provide a breakdown of time-to-graduation by ethnicity. The total number of under-represented minority students in a particular year is 3. This data point should be omitted and it should be noted on the table or graph that the number is not displayed due to small cell size. Alternatively, the data could be presented as an aggregate of cohorts over the last five years or rolled up in another way to prevent the risk of identifying an individual student.

Best Practices: Always mask results in tables, graphs, and other visualizations where the cell size is less than 5. If your population is small to begin with, aggregating over time or rolling up to larger categories can help to increase the sample size.  Please refer to Stanford’s guidelines found at the Office of Audit, Compliance, Risk and Privacy and FERPA for university policy on this topic.
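
The masking and roll-up steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group names, years, and counts are invented, and the function names mask_small_cells and roll_up are hypothetical helpers, not part of SIRIS or any Stanford tool.

```python
# Hypothetical counts; the group names and numbers are invented for illustration.
MIN_CELL_SIZE = 5  # threshold from the guidance above: mask cells smaller than 5


def mask_small_cells(counts, min_cell_size=MIN_CELL_SIZE):
    """Return a copy of counts with any cell below the threshold replaced by a marker."""
    masked = {}
    for group, n in counts.items():
        masked[group] = n if n >= min_cell_size else f"<{min_cell_size}"
    return masked


def roll_up(counts_by_year):
    """Aggregate per-year counts into one total per group across all years."""
    totals = {}
    for year_counts in counts_by_year.values():
        for group, n in year_counts.items():
            totals[group] = totals.get(group, 0) + n
    return totals


graduates_by_year = {
    2019: {"Group A": 3, "Group B": 22},
    2020: {"Group A": 4, "Group B": 19},
}

# A single year leaves Group A below the threshold, so that cell is masked.
single_year = mask_small_cells(graduates_by_year[2019])
# Rolling up across years may lift the cell above the threshold.
multi_year = mask_small_cells(roll_up(graduates_by_year))
print(single_year)
print(multi_year)
```

If a row or column total is shown next to a single masked cell, the masked value can be recovered by subtraction, so totals may need the same treatment.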

 

Conflicting Findings

Risks: Reports and analyses are often shared across units that may define similar populations in different ways. When data are provided without an explanation of how they were pulled, statistics can appear inconsistent. This can result in confusion or action based on misinformation, misunderstanding of the data, and mistrust of the data source.

Example: A student services officer receives a call from an analyst in another department who is reviewing counts of enrolled students provided at a meeting. The totals provided do not match the official published numbers for their school. The analyst wants to know how the SSO came by the numbers. The SSO states that the query used to obtain the data was built by an employee who has since left, and that they are unsure of the logic behind the numbers.

Best Practices: Always include a cover sheet or other documentation detailing how numbers were calculated, any assumptions that were made, and defining any terms to avoid ambiguity and misinterpretation. If you are forwarding or repurposing data make sure you, as the provider, are comfortable with explaining how the findings were obtained.  Include meaningful labels for your graphs and table columns, and include footnotes on any anomalies that were excluded and what office prepared the data. Many times simple documentation and proper labeling will answer questions beforehand and save the analyst time trying to reconcile results that were not prepared in the same way.  If you are unsure of what a term means or how to report it, refer to the Data Governance glossary for details on common reporting terms and their usage.
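
One way to keep a number and its definition together is to return them as a single record, as in the minimal sketch below. The records, field names ("career", "coterm"), office names, and run date are all hypothetical and do not reflect actual SIRIS fields.

```python
from datetime import date

# Hypothetical enrollment records; the field names are assumptions, not SIRIS fields.
students = [
    {"id": 1, "career": "GR", "coterm": False},
    {"id": 2, "career": "GR", "coterm": True},
    {"id": 3, "career": "GR", "coterm": False},
    {"id": 4, "career": "GR", "coterm": True},
    {"id": 5, "career": "UG", "coterm": False},
]


def count_graduate_students(records, include_coterms, prepared_by, run_date):
    """Count graduate students and return the number with its definition attached."""
    selected = [
        r for r in records
        if r["career"] == "GR" and (include_coterms or not r["coterm"])
    ]
    return {
        "count": len(selected),
        "definition": "graduate students, coterms "
        + ("included" if include_coterms else "excluded"),
        "prepared_by": prepared_by,
        "run_date": run_date.isoformat(),
    }


with_coterms = count_graduate_students(students, True, "Office A", date(2024, 1, 15))
without_coterms = count_graduate_students(students, False, "Office B", date(2024, 1, 15))
print(with_coterms)
print(without_coterms)
```

The two offices report different counts from the same records, but because each result carries its own definition, the discrepancy explains itself.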

 

Re-purposing of Reports

Risks: When taking the time to craft a thoughtful report or visualization, you don’t want to make it easy for others to modify your work or attempt to extract data for other purposes. Results that are taken out of context can lead to confusion or misinformation.

Example: A student services officer creates a graph in Excel showing trends in the department’s course enrollments over time. The long-term trend in course enrollment is flat, with little to no growth. However, there are some years with small blips of growth that, when taken out of context, could be interpreted as an indication of enrollment growth. The SSO sends the graph to the department’s Director of Undergraduate Studies with documentation of their findings along with the protocol for how the data was obtained and analyzed. The director extracts a portion of the graph that implies the enrollment for the course has increased over time and presents these findings.

Best Practices: When providing reports, we recommend distributing your findings as an image file or PDF to minimize attempts to tamper with, modify, or reuse your visualizations in ways that can increase misinformation.

 

Release of Student Row Level Data

Risks:  Data that are extracted could be re-analyzed in an inappropriate manner. It is particularly important not to share individual row level data inadvertently when sharing aggregate results, as this is a potential violation of FERPA and Stanford’s privacy guidelines found at the Office of Audit, Compliance, Risk and Privacy.

Example: A student services officer creates a graph in Excel showing trends in the department’s course enrollments over time. They send this to the department’s Director of Undergraduate Studies. The Excel file also includes a tab with individual students’ names, genders, and grades. The DUS examines this tab, notices her neighbor’s son’s name and grades, and passes the grades on to the student’s parents.

Best Practices: When providing reports, we recommend distributing your findings as an image file or PDF to minimize attempts to tamper with, modify, or reuse your visualizations. Avoid giving out the underlying row level data, so that the data cannot be analyzed in an inappropriate way and individual students’ privacy is not violated. Include clear documentation of the data source and include the date the report was run (see “Conflicting Findings; Best Practice”).

 

Over-interpreting Significance

Risks: A common pitfall of data interpretation is the tendency to see significant differences between populations when in fact the population sizes are too small to support any such claims. Changes in a small population (e.g. a department’s faculty or an IntroSem class list) will manifest as large percentage changes that seem more meaningful than they are. The numbers of students in many Stanford classes and departments are too small to use reliably for many common statistical procedures.

Example: A department reports that the number of Master’s degrees has increased by more than 30% in each of the past two years. The department manager is very excited initially and wants to ask for more funding from the dean, but the dean notices that the actual increase is from 6 to 8 degrees awarded in one year, and from 8 to 11 degrees awarded in the next. The dean is not impressed.

Best Practices: Utmost care should be taken to ensure that data are interpreted and resulting action taken on the basis of meaningful and robust statistics. Use proper techniques to evaluate significance and design visualizations that clearly convey any uncertainty. Ensure that visualizations use appropriate scales to avoid overemphasizing differences and always include clearly visible population/sample sizes.
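
Reporting the underlying counts next to any percentage makes the population size visible. The sketch below applies that idea to the degree counts from the example above; describe_change is a hypothetical helper written for this illustration.

```python
def describe_change(before, after):
    """Report a change as both an absolute difference and a percentage."""
    diff = after - before
    pct = 100.0 * diff / before
    return f"{before} -> {after} ({diff:+d} degrees, {pct:+.1f}%)"


# Degree counts taken from the example above.
year_one = describe_change(6, 8)
year_two = describe_change(8, 11)
print(year_one)
print(year_two)
```

Shown this way, a reader sees at once that each large percentage corresponds to only two or three additional degrees.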

 

Findings Taken Out of Context

Risks: When broader context is missing, it is easy to come to incorrect conclusions about the reasons for a particular finding. Erroneous findings and repeating of incomplete findings can lead to changes in policy and procedures that are not based in fact.

Example: An analyst gives a presentation at her school’s monthly faculty meeting that highlights the decline in enrollment in a particular humanities course. The resulting faculty discussion focuses on the instructor’s ineffectiveness as the cause of the decrease. However, when the findings are explored further, a more plausible explanation emerges from the broader context: the interests of Stanford students have changed over time, and enrollment has declined across the division.

Best Practices: Be as clear as possible and always give detailed context for any results. Be aware of your audience and give explicit warnings about not taking things out of context. Findings should be communicated in a way that includes a discussion of historical trends and policy changes that could have an impact on the findings.

 

Inappropriate Use of Statistics

Risks: Inappropriate application of statistical analysis and significance testing can lead to erroneous results or misleading conclusions.

Example: A department wants to understand the effectiveness of a particular intervention for students who are performing poorly in a course. 10 students in the course are identified for the intervention, and the department analyst performs a statistical test to see if these 10 students experience a bump in test scores on the final exam compared to students who did not receive the intervention, controlling for GPA and other factors. No statistically significant difference is observed, so the department decides to discontinue the intervention. In reality, the intervention may still be effective; however, the small size of the test group meant that a statistical test was not likely to yield meaningful results.

Best Practices: Ensure that you understand whether a statistical test is appropriate (e.g. has sufficient statistical power) and ensure that any statistical models are implemented correctly or vetted by someone experienced in that type of analysis. The numbers of students in many Stanford classes and departments are often too small to use reliably for many common statistical procedures.
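
A rough power calculation shows why a test on 10 students is unlikely to detect even a real effect. The sketch below uses a normal approximation for a two-sided, two-sample comparison of means. The effect size (half a standard deviation), the comparison group of 90 students, and the 0.05 significance level are assumptions chosen for illustration, and the approximation treats the standard deviation as known, so it is not a substitute for a properly vetted analysis.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_treated, n_control, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the true difference in means divided by the (assumed common,
    known) standard deviation.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference in means, in standard-deviation units.
    se = sqrt(1 / n_treated + 1 / n_control)
    shift = effect_size / se
    # Probability that the test statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical group sizes: 10 students receive the intervention, 90 do not.
small_group = approx_power(effect_size=0.5, n_treated=10, n_control=90)
large_group = approx_power(effect_size=0.5, n_treated=100, n_control=100)
print(f"10 treated students:  power ~ {small_group:.2f}")
print(f"100 treated students: power ~ {large_group:.2f}")
```

Under these assumptions, the test on 10 students has roughly a one-in-three chance of detecting the effect, so a non-significant result says little about whether the intervention works.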

 

Mistaken Causality

Risks: When performing a statistical analysis, be very cautious when attributing causal relationships to patterns in data. Just because two data sets appear to trend together or be related in some way does not mean that there is a direct causal link between them. Correlations can be caused by a completely external factor or be purely coincidental.

Example: A student data analyst is asked to pull data on course enrollment for a particular class over time along with who taught the class. There appears to be a correlation between the number of students enrolled in the course and whether it is taught by a faculty member (more students) or lecturer (fewer students). Without looking at other variables that may be influencing this correlation (e.g. more students enroll in Autumn because the course is a prerequisite, the course became a requirement for graduation, or the room is bigger in Spring), the department decides not to renew the lecturer’s contract.

Best Practices: Ensure that data-driven decisions or interventions are grounded in what can be reliably understood from an analysis. Think carefully about other possible explanations for relationships in data and perform further analyses as needed. When summarizing your findings or making policy recommendations, the chief guiding principle is that correlation does not imply causation. When presenting results, be explicit about what cannot be inferred and clear about the fact that causality is not implied.
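
One simple further analysis is to break the comparison down by a possible external factor. In the sketch below, the enrollment figures are entirely invented to illustrate the point: term, not instructor type, drives enrollment, and faculty happen to teach more often in Autumn.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical offerings of one course: (term, instructor type, enrollment).
offerings = [
    ("Autumn", "faculty", 120),
    ("Autumn", "faculty", 115),
    ("Autumn", "lecturer", 118),
    ("Spring", "faculty", 40),
    ("Spring", "lecturer", 42),
    ("Spring", "lecturer", 38),
    ("Spring", "lecturer", 41),
]


def mean_enrollment(rows, key):
    """Average enrollment grouped by the value that key(row) returns."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {group: mean(values) for group, values in groups.items()}


# Pooled across terms, faculty-taught offerings look much larger.
overall = mean_enrollment(offerings, key=lambda r: r[1])
# Within each term, the gap between faculty and lecturers nearly disappears.
by_term = mean_enrollment(offerings, key=lambda r: (r[0], r[1]))
print(overall)
print(by_term)
```

A breakdown like this does not prove what causes the pattern, but it can show that the first explanation is not the only one consistent with the data.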

 

Comparing without Communicating: Just Ask

Risks: Including relevant comparison groups can add strength to an interpretation. Be extremely careful, however, when pulling and analyzing data about parts of campus other than your own (e.g. another department, unit or school) without permission. There is a lot of variation across Stanford with regard to how data elements are defined and counted, and you may inadvertently and unnecessarily create an uncomfortable situation when data from another unit are used incorrectly or without prior discussion with that unit.

Example: An analyst in a particular school’s dean’s office pulls statistics on enrollment by URM and gender for several of their degree programs as well as for comparable programs in other units for a unit to unit comparison. The dean then presents this information at an executive meeting with other deans, and one of the other deans is dismayed that the numbers appear to show their school in a negative light in comparison and don’t reflect their school’s internal reporting practices.

Best Practices: When gathering data for comparative purposes, contact the unit whose data you are interested in to obtain permission to use the data or ask if they have a similar analysis that you may use for your purpose. Be prepared to let the other unit know how the data will be pulled and analyzed, and in what form results will be distributed. Offer to provide them with a copy of your findings for vetting prior to release of the data.

 

Limiting Opportunity

Risks: Analyses should not result in decisions being made to actively or passively exclude certain groups of students or individuals from classes or programs based on a statistical probability of future success or failure (in contrast to having a uniform policy which applies to all students—e.g. class A is a prerequisite for class B). Stanford leadership believes that all students should have access to the same set of academic opportunities and that demographic or group affiliation should not predetermine a student’s path at the university.

Example: A department finds that students in a specific major routinely under-perform in one of their upper-level classes. The student services officer selectively emails students with that major who enroll in the upper-level class to let them know about a different, lower-level class on the same topic.

Best Practices: Think ahead to what actions might result from the knowledge gained from any specific analysis or report and ask questions to determine what the underlying goals are. When in doubt contact IR&DS for further guidance.

 

Stereotype Threat

Risks: Findings that appear to show one group of students underperforming in a class or activity (for example, in terms of grades, graduation rate, time-to-degree, etc.) can introduce or reinforce stereotypes that lead other members of that group to experience anxiety about their potential to confirm the stereotype. This fear of confirming a stereotype about one’s social group is known as stereotype threat. A robust body of research has shown that stereotype threat can lead individuals affected by the stereotype to suffer dips in performance and avoidance of situations that evoke the threat. In addition, information about group disparities in performance can lead faculty and staff to evaluate students belonging to the stereotyped group more negatively than is warranted. Importantly, simply being reminded of the stereotype can lead to these undesirable effects; individuals do not need to consciously subscribe to the stereotype for the threat to occur.

Example: A faculty member asks student services to evaluate whether first-generation students perform at the same level as other students in their introductory class. The results of this inquiry suggest that first-generation students underperform compared to continuing-generation students, and the results are published in the Stanford Daily. First-generation students who read the article are reminded of stereotypes about their group’s academic preparation or ability, which may cause them to avoid the course or suffer a dip in performance when taking it. The findings may also lead faculty members who teach the course to develop an unconscious bias about first-generation students that could lead to differential treatment.

Best Practices: Before providing any findings on group differences, first understand how the data will be used and released, and carefully weigh potential consequences with potential benefits. A good rule of thumb is that any information about the typical performance of sub-groups of students should not be released beyond the senior administrators of the university or school. If there are remaining questions about stereotype threat or its potential consequences, please contact IR&DS for further guidance.