For the past year in particular, and many more before that, I have invested no small part of my life into evangelizing open data. (Indeed we wrote a book about it at CfA, gave far too many talks, and I even took an appointment in local government focused on this effort.) Open data is what I do — or at least what I used to do. Looking back now, I have begun to wonder whether we might have been looking at things in the wrong way, or more to the point, whether I have been looking at things, just, wrong.
To explain, let’s take a short trip back through the brief history of open data in the public sector.
(N.B. Many others were closer to these projects than I was in the earlier days, so I plead humble ignorance for any oversight or overstatement.)
The Sunlight Era
Near the time of this President’s inaugural, a host of organizations were gaining steam around the notion of “open government.” Chief among them had to have been the Sunlight Foundation, a well-run and well-respected transparency and good-government non-profit with the fitting motto that “sunlight is the best disinfectant.” Their charge was to hold governments, and in particular public officials, accountable for their decisions and those decisions’ impacts by holding them up to the light of day. Such a premise is nothing new in American democracy; the fourth estate has championed this ground for centuries through serious (sometimes dangerous) journalism, Freedom of Information Act requests, and a stubborn commitment to the real story. Around the end of this century’s first decade, however, three forces conspired to dramatically upend the press-led model for transparency: first, new media platforms decimated publications’ business models, particularly those of smaller, local outlets, leaving fewer staff hours for deep analysis or research; second, new technologies, ranging from simple modes like internet searching to cleverer ones like web scraping and data science, opened the doors to a new breed of “hacks and hackers,” tech-savvy inquisitors able to leverage their desktops instead of a newsroom to find a story; and third, and most relevantly, governments started to get into the game themselves. What do I mean by that? The notion of “open government” quickly became synonymous with “open data,” and agencies began to freely publish their own internal documents, reports, and datasets online for public consumption. Enter the data portal. Data.gov was likely the most notable entry; many others soon followed, bolstered by executive or legislative mandates.
Groups like Sunlight served as intermediaries, pushing and pulling: sometimes knocking on the marble halls for more data, and at other times welcoming new publications with visualizations, apps, and tools to show their value. Tremendous work was done by Sunlight and others to make this data more readable and useful, including their OpenStates project, Influence Explorer, and more.
Crucially, though, this era was marked technologically by manual open data publication. Yes, that sounds contradictory. But it isn’t. (In fact, it’s not too distant from where we are today; more on that later.) That is to say, the data published tended to come either from difficult exports from legacy systems, where quality and timeliness were questionable, or from transcriptions of what had been paper-based systems. (Indeed, this too happens now more often than one cares to admit.) More broadly, the ecosystem of players both inside and outside of government who understood the definition and value of open data was so limited that much of the focus was on politics (elections, ethics, etc.) and not government (buses, bills, and budgets).
So for good reason, our first house of open data infrastructure was built more out of papier-mâché than concrete or steel.
The Hackathon Era
The upshot? Data did continue to trickle out, and in fact, momentum started to grow at the state and local levels for similar efforts. But this era was also marked by the emergence of “data wranglers” and “manipulators,” those passionate souls willing to trudge through the raw data, clean it up, and re-issue it for “popular” (read: hacker) use.
And so the door opened to a new kind of civic engagement: civic hacking. Cities, in particular, caught on to this trend, working hand-in-hand with local volunteer developers to build new and interesting civic applications. New York City launched its BigApps competition; Chicago began its Hack Nights; and even the Feds, under Todd Park’s tremendous leadership, hosted “datapaloozas” to immerse developers and entrepreneurs in the data and then challenge them to show an ROI from it by building new apps and businesses.
Indeed, I saw first-hand the energy and enthusiasm tech-savvy citizens brought to bear in this effort, not only in the thousands who would apply for the Code for America fellowship, but in the tens of thousands who signed up and participated in local (Brigade) events in their own cities. The fruits of their labors should not go unnoticed. Just this week, the Washington Post published an article on transportation innovation using open data; previously, hackers in Philadelphia not only built a transit app but managed to have their transportation agency adopt it as its official platform. In Chicago, the community has been on fire for years: the OpenGov Hack Night brings out 50–60 hackers weekly to build apps, and that they did, with apps such as LargeLots, which helps residents find low-cost vacant lots; Expunge.io, which helps clear up a juvenile’s criminal record (especially if they didn’t know they had one or whether there was anything they could do about it); and ClearStreets, which helps Chicagoans navigate their city as plows clear the snow in the winter. (As a moment of personal pride, I’ll add that even in Los Angeles in the past year, the Hack for LA community has started monthly events and hackathons, each one selling out, with dozens of apps coming through, ranging from immigration reform tools to anti-bullying support.)
These innovations show promise for the open data movement, as they illustrate tangibly (or as tangibly as it can be on your iPhone) what can happen in the real world from open data.
If I were, however, to push back a bit on this optimism, I’d note that more often than not, the data being used to build these applications does not come from a city-hosted open data portal. No, indeed: often special requests for data are needed, sometimes even hacks or scrapers (e.g., Philly); in other cases, the data comes not from the “data portal” as it were but from an agency’s website (e.g., the immigration app simply rewrote the content from the federal immigration site in a more palatable, multilingual form); and in a few, no civic open data is needed at all, as in the case of a drought hacking tool that uses sensors to measure your lawn and then pulls data from retailers such as Home Depot to recommend water-conserving plants.
This is a long way of saying that the data we need is often not the data we have, or at least not what we have on our data portals. Why? I return to the data systems problem seen in the Sunlight era: departments, particularly those in cash-strapped agencies, are burdened with legacy systems and overtasked IT professionals, who have to make the difficult decision between keeping the wifi or radio systems working (true story) and writing an ETL script for their years-old Oracle database. Add to that the political pressure they come under from elected or appointed officials pushing hard for this open data, even in light of such urgent demands, and these dedicated public servants are, simply put, between a rock and a hard place. Thus, I lay no blame for the still mostly manual process of open data publication; that’s simply the administrative and political environment we live in.
Yet this means that as we ask these passionate volunteers in the community to go to war for us and their communities with their rare skills, we are arming them with pencils and poster boards, not APIs or streaming feeds — let alone serious technology, data, and problem sets to make a difference.
This is a paradigm that must change. We owe it not only to those civic hackers but also to ourselves and the commitments we have made to open government. Some cities have shown a path forward: Chicago, notably, has built a remarkable data management team and ETL framework (open-sourced it, even!) that undoubtedly has led to the unprecedented growth and impact of their civic tech community. I can only hope more cities follow this path, if they have the technical know-how and system access to do so.
That, I must say, is a big if. As I’ve mentioned before, I grew up in a small, rural town of roughly 15,000 people (it’s dropped to 13,000 now). I can’t see Centralia, Illinois, building a data management team and erecting an ETL infrastructure on the same skyline as Chicago’s. Nor do I think they should have to.
The web, as we are seeing, is moving every day towards more open access and interoperability. Most popular consumer platforms have even more popular APIs. Indeed, the civic tech landscape is becoming dotted with new startups providing open access to their data. Excitingly, these companies, too, are taking aim at many of the legacy systems in place, both on paper and on premises. Captricity helps take paper forms and transcribe them, through a bit of machine learning and Mechanical Turk, into clean, machine-readable data. So that’s a start. (Indeed, in our office one metric was the number of pieces of paper removed.) Beyond that, though, this new era, I would say, is marked by an insistence on digital services.
Championed most famously by the UK Government Digital Service, led by Mike Bracken, the notion of digital services is rather plain: transform the analog, paper-based, or legacy systems agencies use to interact with residents and make them open, citizen-centric, and simple. Think updating your passport or purchasing a parking pass. It’s as obvious as it is essential. The UK began its “digital transformation” a few years back and has made steady progress. So much progress, in fact, that the U.S. Federal Government has launched its own set of sister programs, the US Digital Service and 18F. These teams of in-house developers and designers dive in to create digital-first interfaces to government. Sometimes that’s as “easy” as a better-looking website; other times it’s as “hard” as ensuring the exchanges on Healthcare.gov began to work after their early hiccups. Indeed, it was Healthcare.gov that arguably gave life to this American breed of the digital services effort; after such a dramatic failure, a commitment was made: “Not again.” Now the digital interface is considered alongside the policy and the business process, meaning that the tech and data are not simply tacked on at the end but baked into the entire process.
So what does this mean for data? Much of the most relevant data is transactional: internally, that means bills paid, employees hired, etc.; externally, that means crimes reported, traffic, and transit usage.
Once we move our governmental transactions off of paper, out of snail mail, and out from behind the wrong side of the counter, we open up the real opportunities that 21st-century technology affords: open-by-default, interoperable APIs, and device and location independence. All this is powered by data being created and managed through modern software, not legacy systems. Put simply, a digital-first strategy means data-first too.
To give a bit of color, consider a few of the civic companies currently being used or considered by governments for digital services: with ScreenDoor, agencies can take any application process, run it online, and post immediately to common open data portals such as Socrata and CKAN; with SeeClickFix, 311 reports don’t need to be pulled from a legacy CRM, because its open API enables direct export in real time; and with OpenCounter, instead of permitting records coming from multiple departments’ systems (as they do now), the app can offer a direct feed into open data. Looking ahead, as more internet-of-things systems are rolled out through cities, such as TAP cards for public transit or real-time, location-based parking systems, rich, high-volume, and interesting data feeds should be at the ready. And ready not just for civic hackers’ use, but also, and crucially, for agencies themselves to use for smarter, faster analytics and management.
Digital services provide the solid foundation a government’s data program demands.
This is the harmony of digital government and open government. I am increasingly convinced that we will not be able to have one without the other, and indeed, the opportunity for innovation, once we align these sister imperatives, is remarkable.
We Went the Wrong Way (Maybe)
In closing, I’m a bit loath to admit that we may not have listened closely enough to our friends across the pond. If you track the progress of government innovation in the UK, you see a narrative that effectively flips ours on its head. Whilst we started with open data, primarily pushing from the outside, they made early and significant investments in digital services from the inside. GDS launched nearly five years ago and began by going after website after website, one at a time, integrating them into a beautiful GOV.UK; then, with a sturdy, single digital presence, they turned to digital services. And just recently they appointed the head and founder of GDS as the country’s first Chief Data Officer. His task was to weave together the data coming from those digital services, and others like them, into a powerful platform for the government and citizens to use to help make Britain better. (Note: he has now stepped down after a remarkable career in Whitehall.) Put another way, they did services first, data later.
Ours might have been a roundabout journey to a similar place. We benefit, however, from the palpable momentum and serious thought that have emerged from these past years. We have experimented, and we have learned. Ours is a big, diverse, complicated country, and our movement is housed in an ever-growing tent. Every day, more and more people sign up for the USDS; every day, I come across another thoughtful article about how civic tech needs to think about diversity, community, and fairness; and that, I think, is why every day I see another local or state agency pick up the shared mantle of open and digital government.