02.01.2010 data sharing, Open Government, privacy, Processes, security, transparency Comments Off on Data.gov CONOP – Five ideas posted to “Evolving Data.gov with You”

Data.gov CONOP – Five ideas posted to “Evolving Data.gov with You”

Following up on my comments and thoughts about the Open Government Directive and Data.gov effort, i just posted five ideas on the “Evolving Data.gov with You website and thought i would cross-post them on my blog as well…enjoy! r/Chuck

1. Funding – Data.gov cannot be another unfunded federal mandate

Federal agencies are already trying their best to respond to a stream of unfunded mandates. Requiring federal agencies to a) expose their raw data as a service and b) collect, analyze, and respond to public comments requires resources. The requirement to make data accessible to (through) Data.gov should be formally established as a component of one of the Federal strategic planning and performance management frameworks (GPRA, OMB PART, PMA) and each agency should be funded (resourced) to help ensure agency commitment towards the Data.gov effort. Without direct linkage to a planning framework and allocation of dedicated resources, success of Data.gov will vary considerably across the federal government.

2. Strategy – Revise CONOP to address the value to American citizens

As currently written, the CONOP only addresses internal activities (means) and doesn’t identify the outcomes (ends) that would result from successful implementation of Data.gov. In paragraph 1 the CONOP states “Data.gov is a flagship Administration initiative intended to allow the public to easily find, access, understand, and use data that are generated by the Federal government.”, yet there is no discussion about “what data” the “public” wants or needs to know about.

The examples given in the document are anecdotal at best and (in my opinion) do not reflect what the average citizen will want to see–all apologies to Aneesh Chopra and Vivek Kundra, but I do not believe (as they spoke in the December 8th webcast) that citizens really care much about things like average airline delay times, visa application wait times, or who visited the Whitehouse yesterday.

In paragraph 1.3 the CONOP states “An important value proposition of Data.gov is that it allows members of the public to leverage Federal data for robust discovery of information, knowledge and innovation,” yet these terms are not defined–what are they to mean to the average citizen (public)? I would suggest the Data.gov effort begin with a dialogue of the ‘public’ they envision using the data feeds on Data.gov; a few questions I would recommend they ask include:

  1. What issues about federal agency performance is important to them?
  2. What specific questions do they have about those issues?
  3. In what format(s) would they like to see the data?

I would also suggest stratifying the “public” into the different categories of potential users, for example:

  1. General taxpayer public, non-government employee
  2. Government employee seeking data to do their job
  3. Government agency with oversight responsibility
  4. Commercial/non-profit organization providing voluntary oversight
  5. Press, news media, blogs, and mash-ups using data to generate ‘buzz’

3. Key Partnerships – Engage Congress to participate in Data.gov

To some, Data.gov can be viewed as an end-run around the many congressional committees who have official responsibility for oversight of federal agency performance. Aside from general concepts of government transparency, Data.gov could (should) be a very valuable resource to our legislators.

Towards that end, I recommend that Data.gov open a dialogue with Congress to help ensure that Data.gov addresses the data needs of these oversight committees so that Senators and Congressmen alike can make better informed decisions that ultimately affect agency responsibilities, staffing, performance expectations, and funding.

4. Data Quality – Need process for assuring ‘good data’ on Data.gov

On Page 9 of the CONOP, the example of Forbes’ use of Federal data to develop the list of “America’s Safest Cities” brings to light a significant risk associated with providing ‘raw data’ for public consumption. As you are aware, much of the crime data used for that survey is drawn from the Uniformed Crime Reporting effort of the FBI.

As self-reported on the “Crime in the United States” website, “Figures used in this Report are submitted voluntarily by law enforcement agencies throughout the country. Individuals using these tabulations are cautioned against drawing conclusions by making direct comparisons between cities. Comparisons lead to simplistic and/or incomplete analyses that often create misleading perceptions adversely affecting communities and their residents.”

Because Data.gov seeks to make raw data available to a broad set of potential users; How will Data.gov address the issue of data quality within the feeds provided through Data.gov? Currently, federal agency Annual Performance Reports required under the Government Performance and Results Act (GPRA) of 1993 require some assurance of data accuracy of the data reported; will there be a similar process for federal agency data made accessible through Data.gov? If not, what measures will be put in-place to ensure that conclusions drawn from the Data.gov data sources reflect the risks associated with ‘raw’ data? And, how will we know that the data made available through Data.gov is accurate and up-to-date?

5. Measuring success of Data.gov – a suggested (simple) framework

The OMB Open Government Directive published on December 8, 2009 includes what are (in my opinion) some undefined terms and very unrealistic expectations and deadlines for federal agency compliance with the directive. It also did not include any method for assessing progress towards the spirit and intent of the stated objectives.

I would like to offer a simple framework that the Data.gov effort can use to work (collaboratively) with federal agencies to help achieve the objectives laid out in the directive. The framework includes the following five questions:

  1. Are we are clear about the performance questions that we want to answer with data to be made available from each of the contributing federal agencies?
  2. Have we identified the availability of the desired data and have we appropriately addressed security and privacy risks or concerns related to making that data available through Data.gov?
  3. Do we understand the burden (level of effort) required to make each of the desired data streams available through Data.gov and is the funding available (either internally or externally) to make the effort a success?
  4. Do we understand how the various data consumer groups (the ‘public’) will want to see or access the data and does the infrastructure exist to make the data available in the desired format?
  5. Do we (Data.gov and the federal agency involved) have a documented and agreed to strategy that prepares us to digest and respond to public feedback, ideas for innovation, etc., received as a result of making data available through Data.gov?

I would recommend this framework be included in the next version of the Data.gov CONOP so as to provide a way for everyone involved to a) measure progress towards the objectives of the OMB directive and b) provide a tool for facilitating the dialogue with federal agencies and Congress that will be required to make Data.gov a success.