We need your help!

This document is being developed collaboratively. We hope to make it the best possible resource by leveraging the collective knowledge of government experts and other stakeholders. You can contribute by:

Adding your comments to the discussion at the bottom of this page

Check out GovEx’s accompanying open data concerns video.

Responding to Open Data Concerns

Introduction

Cities starting down the path of opening data often have concerns about releasing data to the public. This document compiles some of the most common concerns we have heard about opening data. Along with these frequently cited issues, we have included technical responses, philosophical responses, and an example from a city for each concern to help other cities respond to these arguments.

Help us build a more comprehensive resource by letting us know if you have other frequently heard concerns or great examples of responses so we can include them here.

Addressing Concerns

1. Someone will reverse-engineer the data to get personal information

"If we open our data, someone could combine it with other information to identify individuals (referred to as the mosaic effect) and then use that to harm them."
    

Philosophical Response: Personal data should be protected, and we must have strong governance practices to ensure that we release only data that is appropriate to release.
Technical Response: We can aggregate data to ensure that private information is protected. There are a variety of other approaches that can be included in a city’s plan to manage the risk of opening data.
City Example: Many cities aggregate crime data to the nearest intersection or block level when it is released (for example, Seattle uses the Hundred Block Level). This makes critical public safety information available to the public without identifying individuals.
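
The block-level aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not any city's actual method: the function names, the `address` field, and the "1200 block of Main St" output format are all assumptions made for the example.

```python
import re
from collections import Counter


def to_hundred_block(address: str) -> str:
    """Generalize a street address to its hundred block (illustrative format)."""
    match = re.match(r"^\s*(\d+)\s+(.*)$", address)
    if not match:
        # No leading house number (for example, an intersection): keep as-is.
        return address.strip()
    number, street = match.groups()
    # Round the house number down to the nearest hundred.
    block_start = (int(number) // 100) * 100
    return f"{block_start} block of {street.strip()}"


def count_by_block(incidents):
    """Count incidents per hundred block instead of publishing exact addresses."""
    return Counter(to_hundred_block(incident["address"]) for incident in incidents)


# Hypothetical incident records, for illustration only.
incidents = [
    {"address": "1234 Main St"},
    {"address": "1299 Main St"},
    {"address": "45 Oak Ave"},
]
for block, count in count_by_block(incidents).items():
    print(block, count)
```

Rounding to the hundred block is only one option. Blocks with very few addresses may still need additional review before release.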

2. Our data will say we are doing poorly

"We are not doing well and opening data will expose us to criticism from the public."
    

Philosophical Response: Hard-working city leaders and staff should never be satisfied with poor performance. Releasing this information can encourage self-correcting behavior, help the public understand our challenges, and generate stronger partnerships with stakeholders.
Technical Response: We can contextualize information by telling the story around the data. Contextualized data can also help make the case for additional resources or alternative solutions.
City Example: Many cities release datasets alongside explanations and important contextual information. This allows cities to demonstrate the action they are taking to address challenges and improve performance while promoting transparency and opening lines of communication with the public.

3. The data will be misinterpreted by end users

"Media or residents will take our data and miss key elements, telling the wrong story."
    

Philosophical Response: Opening data provides an opportunity to shape the story, since users will draw their own conclusions with or without the released data.
Technical Response: Providing context along with data can help us tell our story and ensure that we are addressing items that might be commonly misinterpreted.
City Example: San Francisco provides excellent contextualized data for one of its biggest and highest-profile challenges: housing. Its Housing Hub provides policies, reports, and resources to help put the data it is releasing in the proper context for users.

4. We don’t have time to prioritize opening data while doing our jobs

"Our staff are already too overworked to add another thing for them to do."
    

Philosophical Response: While it is true that an open data program takes resources, efficiencies are also gained. Most often these are from a reduction in freedom of information requests and improved data sharing across city departments.
Technical Response: We can automate the routine publication of data to significantly reduce staff burdens in the long term.
City Example: Prioritizing an investment in open data can help staff focus on their jobs instead of responding to requests for data. For example, the City of Hartford publishes towing data automatically every hour so that residents can check if their car has been towed, reducing the burden on staff to respond to requests.
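
As a rough sketch of what automated publication can look like, the Python below builds a timestamped CSV snapshot from a list of records. The field names (`plate`, `towed_at`, `lot`) and the `build_snapshot` function are hypothetical and are not taken from Hartford's system. We assume a scheduler such as cron runs the job each hour and copies the result to wherever the portal reads it.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical list of fields approved for publication.
PUBLIC_FIELDS = ["plate", "towed_at", "lot"]


def build_snapshot(records, published_at=None):
    """Return the public CSV text for one scheduled publication run."""
    published_at = published_at or datetime.now(timezone.utc)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=PUBLIC_FIELDS + ["published_at"],
        extrasaction="ignore",  # silently drop fields not approved for release
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        # Stamp every row so users can see how fresh the data is.
        writer.writerow({**record, "published_at": published_at.isoformat()})
    return buffer.getvalue()


# Hypothetical internal record; officer_id is not in PUBLIC_FIELDS.
sample = [{"plate": "ABC123", "towed_at": "2024-01-01T11:30", "lot": "North", "officer_id": "77"}]
print(build_snapshot(sample))
```

Once a job like this is in place, staff time goes into maintaining one script rather than answering the same request repeatedly.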

5. We might get sued if we release protected information

"Releasing data is risky and we might get sued if we reveal something wrong."
    

Philosophical Response: To date, very few cities have been sued for releasing open data which turned out to be faulty or misleading. Having a plan and process to protect sensitive information can help limit a city's exposure to risk.
Technical Response: We can include disclaimers in the Terms of Use for our open data. We can also prepare a response plan to carry out in the event we accidentally release protected or incorrect information.
City Example: Chattanooga’s Open Data Terms of Use include limitations of liability and indemnification clauses in Section VIII.

6. The cost to keep this program going in the future will be too high

"We will spend a lot on software, staff, training, etc. while getting little in return."
    

Philosophical Response: It may be more costly to continue with business as usual. Delivering open data encourages cities to view their digital information as a more strategic asset.
Technical Response: Publishing raw data allows third parties to deliver that information to the public (build apps, visualizations, etc.), often relieving us of that responsibility.
City Example: The Kansas City Area Transit Authority publishes route and schedule data. It also provides a directory of third-party applications which make that information more accessible to the public, reducing the need for it to do so directly.

7. Using an open data portal creates a cybersecurity risk to our internal IT systems

"Loading datasets onto an open data portal will leave our internal IT systems vulnerable to exploitation and puts our data at risk."
    

Philosophical Response: Generally, open data portals, and the data they hold, are completely separate from your internal IT systems. Exceptions are rare, and in those cases GovEx can provide best practices to ensure the greatest possible security of both your technology infrastructure and your data. There are no known examples of cybersecurity attacks where government data has been inappropriately obtained through open data portals.
Technical Response: Data and the systems that house it are typically decoupled through automated ETL (extract-transform-load) or human intervention; there isn’t usually a connection between a customer accessing information on an open data portal and the computers where the data is maintained. During ETL processes, data is flattened, filtered, merged with information from other databases, and otherwise manipulated, which obfuscates or masks the source data system(s). Finally, an effective open data program has processes to review data before it’s published to prevent the accidental release of data not appropriate for public consumption.
City Example: Having a clear process to publish data reduces cybersecurity risks. Chattanooga created a workflow to ensure that data moves through the proper channels before being released. Once data is identified for publishing, it’s reviewed by the city attorney’s office, the office of performance management and open data, the relevant department’s open data coordinators, and the department of information technology. The city has also set clear protocols for decoupling data from the city’s data systems before releasing data to promote security.
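
The decoupling described in the technical response can be illustrated with a small Python sketch of the transform and load steps. It is a simplified example under stated assumptions: the record layout, the `APPROVED_FIELDS` list, and the function names are hypothetical, and it does not describe Chattanooga's actual pipeline.

```python
import json

# Hypothetical allowlist agreed during the pre-publication review.
APPROVED_FIELDS = ("request_id", "category", "status", "neighborhood")


def transform(internal_rows):
    """Flatten and filter internal records into public rows."""
    public_rows = []
    for row in internal_rows:
        # Flatten the nested location, keeping only a coarse neighborhood name.
        location = row.get("location") or {}
        flat = {**row, "neighborhood": location.get("neighborhood", "")}
        # Keep approved fields only; everything else never leaves the source system.
        public_rows.append({field: flat.get(field, "") for field in APPROVED_FIELDS})
    return public_rows


def load(public_rows):
    """Serialize a standalone copy; the portal serves this file, not the source database."""
    return json.dumps(public_rows, indent=2)


# Hypothetical internal record, for illustration only.
internal = [{
    "request_id": 101,
    "category": "Pothole",
    "status": "Open",
    "requester_email": "resident@example.com",
    "location": {"address": "12 Elm St", "neighborhood": "Downtown"},
}]
print(load(transform(internal)))
```

Because the portal receives only the exported copy, a person browsing the portal never queries the internal system directly.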