Reflecting on Data, Power, and Pedagogy

By Autumm Caines and Michelle Ciccone 


Coming Together in the Sandbox

At the end of September, we both participated in Data, Power, and Pedagogy, a three-day in-person event hosted in collaboration between the Information Futures Lab (IFL) at Brown University and HestiaLabs. The event was less conference and more workshop, though it was often referred to as a “sandbox.” It was a unique professional development opportunity, designed to bring together a small working group to produce resources for educators and researchers rather than simply to have participants come and receive information, as you might at a conference. There was an application process for participants, as funding was available, and neither of us realized that the other would be attending until the day before the event. We had seen one another in virtual spaces—like the recent 2022 Civics of Technology conference—but we had never directly collaborated or met in person. When we realized we would both be attending this event, we agreed that a reflection post for the CoT blog would be in order, and so here we are.

The event was held on Brown’s campus, and COVID protocols were in place, including pre-screening, masks, and air filters. The attendees ranged from researchers to educators to politicians, and there were only about 15 of us. There was lots of ideation, imagination, envisioning, curating, and resource building. We started by ideating around personal objectives for the week, which ranged from building educational materials and developing curriculum, to defining research projects, to brainstorming how to raise awareness in general around misinformation, surveillance, and public health. This was followed by several rounds of a “speed dating” format in which we all got to know one another. And that was just the first day. The second and third days ramped up as we were encouraged to work in groups or individually on the projects we had identified. On the final day, we all reported out on the various seeds we had planted and collaborated on curating lists of readings and resources. IFL is now busy turning what we all built over the week into something for public consumption, and we are awaiting that publication.

HestiaLabs 

All of this was accented by conversations, workshops, and demos from HestiaLabs, which were woven throughout the three days. Based in Geneva and founded by Paul-Olivier Dehaye, who helped uncover the Cambridge Analytica scandal, HestiaLabs focuses on assisting groups of people with common interests in accessing and questioning the personal data that various companies hold about them. It is pretty common knowledge, especially among readers of the CoT blog, that platforms collect our data and that the primary business model of the companies behind those platforms is to monetize those data. Many of us see this as a trade-off for “free” services, but even services we pay for often collect and monetize our data on top of charging us for the service. HestiaLabs and IFL provided us with instructions to download our data from a variety of platforms (Twitter, Google, Uber, TikTok). These instructions had to be communicated prior to the workshop because of the complexity and time required just to make the requests. Autumm blogged a bit about her experience leading up to the event and reflected on this process.

But getting the data is just the first step. After you request your data from these sites, most of them impose a waiting period of at least a few days, after which you get a raw data file that is pretty hard to make sense of on its own. During the workshop we submitted these unwieldy raw data files to a browser-based data visualization tool built by HestiaLabs called DigiPower. DigiPower pulls out insights (as defined by HestiaLabs) about the data practices of these platforms, and ultimately attempts to help users develop their data literacy through dashboards that visualize their data. Platforms like Google and Twitter do provide their own tools that let you see some of the data they collect, but the DigiPower tools go a bit deeper. For instance, though you can go to Google Timeline and see your location history (if you have it turned on), this only shows you one data point per location. The DigiPower tool showed us that there are actually multiple data points for every location that Google shows you on Timeline. Google does not really know where you have been at any given moment; it guesses from multiple possibilities based on different kinds of information, including GPS coordinates, cellular pings, wifi pings, and even your own search history. (Not a big surprise that search history seemed to carry a lot of weight in terms of determining your presumed location.) And so, for every location that Google shows you on Timeline, which DigiPower showed us was labeled as “winner” by Google, there are actually multiple other “loser” data points, each with a percentage of (un)certainty, which are the guesses that didn’t make the cut.
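
If you want to peek at these “winner” and “loser” guesses yourself, without a dedicated tool, you can open the raw export directly. Below is a minimal sketch in Python, assuming the Semantic Location History files that came in our Google Takeout downloads at the time of the workshop; the file path and field names (timelineObjects, placeVisit, otherCandidateLocations, locationConfidence) are based on that export format and may differ in newer downloads.

    import json

    # One month of Semantic Location History from a Google Takeout export
    # (the path is illustrative; adjust it to wherever your download landed).
    with open("2022_SEPTEMBER.json") as f:
        month = json.load(f)

    for obj in month.get("timelineObjects", []):
        visit = obj.get("placeVisit")
        if not visit:
            continue  # skip activity segments (walking, driving, etc.)
        winner = visit.get("location", {})
        print(f"Winner: {winner.get('name', 'unknown place')} "
              f"({winner.get('locationConfidence', 0):.1f}% confident)")
        # The guesses that didn't make the cut, each with its own confidence.
        for loser in visit.get("otherCandidateLocations", []):
            print(f"  Loser: place {loser.get('placeId')} "
                  f"({loser.get('locationConfidence', 0):.1f}% confident)")

Even this quick pass makes the point concrete: each visit in the export carries a handful of competing guesses, not a single fact about where you were.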

Reflections & Takeaways

This full process—of requesting our data from these platforms, then seeing our data visualized via DigiPower, then thinking about how we could turn this process into a teaching and learning experience in our contexts—surfaced some reflections and ideas that Autumm and Michelle are continuing to grapple with. We share some of our reflections below.

Tensions between accuracy and advertiser intent

It is definitely interesting to learn that there are multiple predictions behind each location that Google shows you in Timeline. And it is interesting to see the different ways that Google gets that data. But it is not exactly surprising in and of itself. After jumping through the multiple hoops and being presented with a mountain of data, even after it is broken down into colorful graphs which are interactive and responsive, you can be left with a sense of “so what?” 

Michelle had this thought when she saw the top ads shown to her by Twitter—for TV shows she has no plans to watch and cars she will never buy. In that moment of discovery, Michelle laughed at what she saw, identified this as incorrect targeting on the part of these platforms, and thought, “These advertising algorithms don’t work very well, do they? What’s to be concerned about?”

That’s a key tension, though, present whenever we engage in data literacy work in our classrooms. These moments of discovering the inaccuracy of the targeting are important: they reveal the fallibility of these automated decision-making algorithms, make us laugh, and give us back the feeling that we are in fact more complex human beings than any algorithmic system thinks we are. You want to put me in a box? Well, nice try, you can’t.

But we can’t stop at the inaccuracy. The accuracy isn’t really the point. In fact, if we focus on the in/accuracy of the targeting, we can fall into the trap of demanding to be seen better and more precisely by platforms. And for what? So that our complexity is more accurately reflected in our targeted ads? I don’t think that’s actually what I want.

The other problem with focusing on what we feel is accurate about our own interests at any given moment is that this trains our attention on something that the advertisers aren’t necessarily focused on. Sure, accurately addressing or speaking to users’ interests is important to advertisers—to a point. But ultimately advertisers are more interested in understanding what might nudge users towards certain behaviors, and that’s perhaps better understood as a game of probable futures more than accurate presents. What is the probability that a user who likes x, y, and z might be moved to click on this ad rather than this other one? Understanding our interests, habits, and personality is certainly helpful here, but advertisers don’t need to be able to peer into our innermost soul to engage in that game. 

In fact, it’s not even really about us as individuals, because the targeting doesn’t happen at the individual level but rather at the group level. DigiPower showed that one of the targets that advertisers can purchase for their ads on Twitter is “follower look-alikes” (there’s not a lot of information on how this is determined besides what Twitter itself says about it). So even if an advertiser understood my interests more precisely, those precise interests would still be clustered into a group of “people like me”—reducing complexity into what is common among a group of people. 

The funny thing is, despite having already reflected on this “trap of accuracy” before, Michelle still fell for it when she saw her inaccurately targeted ads staring at her on the screen. The promise of accuracy and of being seen is just too alluring. 

When engaging in data literacy work in our classrooms, it’s helpful to keep two ideas at play at once: on the one hand, these algorithmic systems are nowhere near as “smart” as these platforms want to lead us to believe they are; and on the other hand, concerns about accuracy can distract us from the bigger picture, that these platforms are built on a logic of prediction that, one nudge at a time, may ultimately infringe upon users’ ability to make up their own minds. If we focus too much on the former point, we risk disconnecting from the conversation too early, as soon as we can laugh at the inaccuracy; if we focus too much on the latter point, we risk interacting with these platforms as if they are already all-knowing, all-seeing, brainwashing machines. Those two ideas make for complicated classroom conversations, though in Michelle’s experience students of all ages are clamoring for this nuanced take, because neither extreme feels reflective of many of our lived experiences of using these platforms.

What does it mean to “care about” data privacy issues?

Another way of understanding Michelle’s “so what?” feeling as she looked at her inaccurately targeted ads is that “so what?” is an expression of ambivalence, and ambivalence is complicated. Ambivalence is created when we hold contradictory ideas that are hard to square with one another, and holding contradictory ideas can be kind of exhausting. Sometimes it’s easier to disconnect and say to ourselves, “If it’s so complicated then why should I care about this topic at all?” 

Educators who engage in data literacy education can certainly see ambivalence in our students' reactions, and we can wonder sometimes whether our students care about data privacy issues or not. But this seemingly simple question—“Do students care?”—can swallow up a lot of complexity. Take, for example, the conversation about targeted ads above. Michelle cares deeply about these issues and yet can still feel this “so what?” response. During classroom conversations about data privacy issues, we can encounter multiple opportunities for feeling ambivalent—for caring and also not caring—in different ways. The question “Do students care?” smushes together all of this complicated grappling, as “caring” becomes a fixed category attempting to capture how each person aggregates all of these different ambivalences when they declare how much they care (or don’t) about data privacy issues.

So it feels important to understand “caring about” not as a yes-or-no thing—you either care or you don’t—but as a multidimensional spectrum. How people demonstrate “caring about” these issues may not look the way we expect it to. What’s more, behaviors that demonstrate “caring about” these issues are at least partly circumscribed by people’s personal and professional situations. The truth is, people can benefit greatly from engagement on platforms, whether it be in connecting to job opportunities or staying connected to friends and family. And so “caring about” these issues doesn’t always mean that students end up deleting certain social media accounts. It could! But you can care and yet still participate, for a variety of legitimate reasons. As educators, it’s important that we don’t put so much pressure on ourselves that the only outcome we recognize is wholesale rejection of the technologies that have, at this point, become infrastructural. 

Conversations throughout the three-day sandbox surfaced the wide range of what “caring about” can look like. During one activity, Michelle’s small group discussed the hoped-for outcome that our students may simply (powerfully!) share what they learned with other people in their lives. We also discussed taking the long view: more informed relationships to platforms may inform choices our students make long after any classroom assessment can capture them. In fact, developing that sense of choice—identifying opportunities where we can choose to not participate—is significant in and of itself.

Getting out of our feelings: from ambivalence to larger implications  

“Caring about” data privacy is an important precursor to finding intrinsic value in an exercise where one is asked to evaluate their own data. But as we have shown, even those who do care about data privacy can come away from such an exercise feeling ambivalent, given the sheer number of data points and the complexity of the analysis. A natural question when considering using such an exercise with students is: do students care at the outset, and if they don’t, will entangling them in all this data help them care about these issues? Perhaps an answer lies in starting from a different question. Autumm’s recent dive into past research on caring made her wonder whether, instead of concerning ourselves with “caring about” data privacy, it might be more important to ask how we can “care for” ourselves and our broader environments in the face of data collection and targeted advertising.

Autumm too experienced some of this ambivalence when she looked at her personal data. Looking at her Twitter advertisers revealed that her top advertiser was BetMGM, an online sportsbook and casino. “Thinking about the learning I was doing in that moment is interesting because of the immediate feelings of ‘so what’ that came up for me. I think I may have physically rolled my eyes when I saw this was my top advertiser. I remember seeing those ads. They were everywhere, and I don’t gamble, so they were just annoying.” Looking at her data using DigiPower confirmed that the only criteria for why she was targeted with these ads were that she was over 21 and that she lived in Michigan. But what does this mean?
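
(Before turning to what it means: readers who want to check their own top advertisers can do this same kind of inspection on the raw Twitter download before it ever touches a visualization tool. The sketch below is a rough Python example; the file name (data/ad-impressions.js) and the nested field names (ad, adsUserData, adImpressions, impressions, advertiserInfo, matchedTargetingCriteria) reflect what the archive looked like when we requested our data and may well have changed since.)

    import json
    from collections import Counter

    # Twitter's archive wraps the JSON in a JavaScript assignment, e.g.
    # "window.YTD.ad_impressions.part0 = [ ... ]", so strip that prefix first.
    with open("data/ad-impressions.js") as f:
        raw = f.read()
    records = json.loads(raw[raw.index("["):])

    advertisers = Counter()
    criteria = {}
    for record in records:
        impressions = record["ad"]["adsUserData"]["adImpressions"]["impressions"]
        for imp in impressions:
            name = imp.get("advertiserInfo", {}).get("advertiserName", "unknown")
            advertisers[name] += 1
            # Collect the targeting criteria each impression matched on
            # (in Autumm's case, little more than age and a Michigan location).
            criteria.setdefault(name, set()).update(
                f"{c.get('targetingType')}: {c.get('targetingValue')}"
                for c in imp.get("matchedTargetingCriteria", [])
            )

    for name, count in advertisers.most_common(5):
        print(name, count, sorted(criteria[name]))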

Thinking about how those ads impact our broader environments, and doing a bit more investigation, paints a more interesting picture. After coming home and reflecting on things a bit, Autumm did a little research and found that she was not the only person annoyed by the ads. They were everywhere in Michigan, and not just on social media but on television, radio, and billboards—many people were talking about their prevalence. All of this only takes on meaning when you realize that Michigan legalized these gambling platforms just a few years ago (with some of the tax revenue from them going to schools, by the way). There are deeper implications here, but it took time for Autumm to realize them, as her lived experience of being annoyed by the ads initially got in the way of asking those deeper questions.

The advertising is no surprise to anyone paying attention to the legislation, but is advertising with such a broad brush really responsible on the part of advertisers? Broadcasting ads to large numbers of people is normal in TV and radio, but social media advertising boasts about its nuanced approach—which makes it even more shocking to see social media platforms selling ad space for a potentially dangerous product so broadly. Gambling can be addictive for some and can ruin lives, and Michigan is already seeing some of the repercussions of this. Evaluating this data from the perspective of “caring for” ourselves and our communities offers a different lens, one that may encourage “caring about” these matters.

Teaching data literacy

Caring about caring is essential to the topic of teaching data literacy. Students will of course participate in class activities just for the grade, but for transformative learning to happen they will likely need to care about what they are doing. Still, the question of “do students care” is only interesting to a point. The point at which the answer to this question ends in apathy is where we stop being educators. If we are educators who care about these issues, then it is our job to help students see why they should care about them too. Helping students to care is at the heart of what we do. Now, not every educator needs to take on this particular issue—it may be outside the disciplinary scope for some. But if a student does not care about data privacy and targeted advertising, then as educators who see the larger implications of these issues on our environments, we owe it to them, to ourselves, and to our neighbors to try to show them. An important way of doing this might be to pair the data analysis with research into larger happenings in politics and culture, as the above example of considering the larger context of gambling ads in Michigan demonstrates.

We also must try to teach students in a way that is safe for them, or that at the very least prepares them for any risks they might be taking in engaging with this learning. There is always a connection between the tools we use to teach these skills and our pedagogy, and it cannot be ignored. DigiPower seems to be pretty privacy aware, and all of what we have described so far happens in the browser with no transfer of data. (Note: The tool does have optional data sharing capabilities which can aggregate a group’s data on Hestia’s servers. This option is encrypted end to end, and only the teacher/group leader can see the data. That feature is not discussed in this post.) But we cannot rely on tools alone to provide a safe environment and experience for students learning about data literacy; we must also consider the experience itself. For instance, Autumm did her data analysis while working in a small group around a table; later that day, someone mentioned that others could have gotten the impression that BetMGM was her top advertiser because she was prone to gambling. “I had not even considered this. I was just remembering those annoying ads, and was actively turning my computer to those I was sharing a table with and comparing results.” Students working together in a group, trying to make meaning out of their own data, could find themselves in a similar situation. Your own lack of imagination about your data does not mean others will lack imagination about what they think it says about you. Putting students in these situations without preparing them for the assumptions they might make about others, and that others might make about them, could lead to embarrassment and misunderstanding.

Some other challenges around teaching students data literacy with their own data that we discussed at the sandbox included:

  • Literacy issues, being intimidated by data analysis

  • Time and exhaustion 

  • Limited access to computers 

  • Less frequent use of apps (lack of data to analyze)

  • Despair arising from learning about these issues

  • People with identities that put them at a greater risk (race, gender, sexual orientation, disability, etc.)

  • People in situations that put them at a greater risk (abortion, abusive partners, undocumented status, etc.)

  • Dependence on platforms for livelihood (abundance of data to analyze but perhaps greater risk)

But this is the beginning of something, not the end, so these challenges are likely to be addressed as we go forward. Following the sandbox and the response it received, Hestia has decided to hold regular seminars around the #DigiPower methodology. You can register here.
