The Environmental Data Maze

A small movement at the top of a mountain sends a packed ball of snow slowly bouncing downhill. As it moves, it gathers momentum and substance. It rolls faster, picks up more material, and grows bigger and bigger until it comes to rest at the bottom of the mountain.

This so-called “snowball effect” often applies to environmental data. A single image is captured, and from that image more people become concerned, more data is collected, and involvement moves from the community to agencies, to litigation, or both. The initial data set (or even data point) is crucial, but we have to give equal weight to navigating the legal environment with this information. Here enters what I’m calling, rather than a data pipeline, the “environmental data maze”: a system of twists and turns that, despite best attempts at navigation, can lead to dead ends.

Beginning in 2015, I worked with a federal advisory council for the U.S. Environmental Protection Agency (EPA), the National Advisory Council on Environmental Policy and Technology (NACEPT). We responded to a charge from EPA to assess the Agency’s approach to citizen science and to recommend actions it could take to integrate citizen science into its work. In the report, we developed a spectrum of use scenarios that pinpoint where these forms of citizen and community science can be useful in reaching particular ends [1].

The spectrum of groups working with environmental data

On the far left of the spectrum are activities such as community engagement, where community monitoring brings people together and gets them excited and activated about an issue, and education, where environmental data and information are used to teach about an environmental issue. In the middle are condition indicators, research, and management: activities that may be the first step in asking for a more multifaceted study, or that let communities collect baseline information, do ongoing research and observation, or use environmental data and information to develop environmental management practices. On the far right are regulatory decisions, standard-setting, and enforcement. When the snowball effect applies to environmental data, almost all (if not all) cases that lead to these far-right outcomes start at a point between the left and middle of the spectrum. Many projects use data and information in support of core activities on the far left side of the spectrum, but I’m interested in how we can identify more cohesive leverage points to help people affected by environmental pollution achieve victories on the far right side.

After several weeks of conversations with stakeholders ranging from nonprofit to UN representatives, we’ve identified a recurrent issue: there is no clear framework for the movement of environmental data and information, which leaves every situation unique. It takes significant navigation of both the available data and information and the legal systems (local and federal) to move across a landscape of environmental problem-solving. In some instances, the way information should flow is unintentionally (or sometimes intentionally) obfuscated by special loopholes in laws. The discretion individual regulators have to interpret these laws along the way complicates matters even further. For instance, while California AB 617 makes significant progress in recognizing the need for community-supplied data, it leaves a large gap in background air pollution.

Our models for how best to structure scientific data and information gathering at a community level are well documented, but the murkiness of data usefulness and impact for the chain of users along the environmental data maze is problematic. Groups are actively working on harmonization (primarily for basic rather than applied science) and on metadata projects for community science and environmental data, but we’re not placing enough focus on where this coordination will lead us.

As we engage in the application of science to answer environmental questions, there’s some core framing that we can apply to think about the end impact:

Environmental data can snowball. In the examples we have, much of the data that leads to policy and regulatory change is kick-started by an initial data set (or even data point). Though this can lead to deeper, stronger, and more impactful involvement along the way, it also means that many cases of environmental data use require unique and resource-intense navigation.
As data snowballs, understanding what that snowball is moving towards becomes critical [2]. Identifying (a) what data is appropriate for the circumstances and goals and (b) when the available data is good enough for the questions at hand are steps toward alleviating the burden of data that is not consumable or impactful. A straightforward approach is to discover what data exists, determine whether it is sufficient and relevant, and then consider what additional information might be needed to answer the environmental question. For instance, if you’re trying to educate your community about potential issues within a watershed, is visual evidence sufficient? Or do you need to take quantitative measurements?
Though we don’t need to collect data that already exists, this information is often hard to find (not discoverable) or presented in a format that is complicated to use. It is one thing to open a dataset; it is another to do so in a way that communicates the complexity (or sometimes simplicity) of the information in an impactful and user-friendly way. These are considerations for all constituents who release data: can others find it, is it clearly presented, and are there contextual elements that can be added so it is framed in a way that accounts for the unique situations in which the data was captured?

Data and information themselves are percentage players. The vast majority of the critical infrastructure for how environmental information flows and is used, specifically to create stronger policies and practices, lies squarely in the realm of translating and navigating the vast laws, loopholes, and relationships that can shift outcomes. Importantly, this means that while people have a heightened ability to collect data and information (through technological innovation) and to share it (through apps, social media, etc.), the ratio of data collected to data used is questionable. To address this, we must examine the resource-intense navigation of environmental data mazes, where much of the impactful use of environmental data for policy, regulation, and enforcement is lagging.

We have a broken social contract between the government and communities that have suffered the burden of pollution. With no single model at hand for clarifying the resource-intense process of navigating environmental data, information, laws, and loopholes, what if we start focusing our efforts on fixing our broken social contract? This takes recognizing that it is not the responsibility of people affected by environmental pollution to navigate the complexity of the laws that govern environmental and health protections. Instead, there should be a transparent and open relationship between communities and their government to advance the environmental public good, which starts with acknowledging, through the application of relevant information, the environmental issues at hand. Our plan is to build a governance framework that outlines a future in which the environmental data maze can actually become an environmental data pipeline. With initiatives like this, we can start knitting together a new social contract that clarifies both environmental protection processes and responsibilities.

[1] The full report can be accessed here.

[2] For examples of this, see Clearing the Path: Citizen science and public decision making in the United States.

May 28, 2020
