Alexa — let’s talk about open data standards

February 10, 2017

Most people probably do not realize they rely on publicly available government data every day. Be it checking the weather before you head out in the morning or deciding on which bus or train route to take: you are using data (or information, if you’d prefer) generated by the government and made freely available for public use. You likely did not realize this information even came from the government; you did not pull up your local transit authority’s website, nor is the National Weather Service’s website bookmarked in your browser…

No, you access this critical information through the consumer apps and platforms that don’t feel at all like government systems: the weather app on your smartphone, DarkSky, Google/Apple Maps, Waze, etc. These are apps that go with the grain of your regular life, more as a consumer, less as a citizen (or #GovTech nerd). Indeed, these consumer apps, powered by civic data, typically showcase the greatest utility of open data, because the data comes to you.

Those two examples came easy to me, because they are now so integrated, they are commonplace, expected. That information is where we need it when we need it.

But this is hardly the rule when it comes to government data — calling it even the exception seems generous. Out of the hundreds of thousands of datasets released by tens of thousands of government agencies every year, only a handful make their way into a consumer experience in any meaningful way.

Why? Because data is messy business. And government data even more so. “The government” as a singular monolith is more a caricature than a reality: data is siloed across agencies within just one jurisdiction, let alone across a region, state, or country. (I once heard an apocryphal claim that for every policeman in Canada, the United States has a police department.)

Syncing up data across the multitude of jurisdictions within the country is an uphill battle to say the least, but one worth fighting. Why? Because if we can’t, it becomes next to impossible for government data to make its way into consumer experiences. Scale and success are synonymous for consumer apps and city-by-city customization an antonym. (That’s more a B2B play.)

So how do we square this round peg? Data standards. Or, put another way, common data schemas and metadata. While this sounds wonky, it simply means getting all relevant agencies to publish their information (say, bus routes or restaurant inspection scores) in spreadsheets with consistent headers. Or put even more simply: using an Excel template.
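To make the “consistent headers” idea concrete, here is a minimal sketch in Python. The column names, the two city feeds, and the helper functions are all made up for illustration; no real restaurant inspection standard is implied. It shows why a shared template matters: a consumer app can merge every feed that follows it and has to skip (or hand-customize for) every feed that does not.

```python
import csv
import io

# Hypothetical shared template: every agency publishes exactly these headers.
STANDARD_HEADERS = ["business_name", "inspection_date", "score"]

def conforms(csv_text, expected=STANDARD_HEADERS):
    """Return True if the first row of the CSV matches the shared template."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return header == expected

def merge(feeds):
    """Combine rows from every conforming feed; skip feeds that don't match."""
    rows = []
    for text in feeds:
        if conforms(text):
            rows.extend(csv.DictReader(io.StringIO(text)))
    return rows

# Two made-up feeds: one follows the template, one uses its own headers.
city_a = "business_name,inspection_date,score\nTaco Stand,2017-01-15,92\n"
city_b = "Name,Date Inspected,Grade\nNoodle Bar,01/20/2017,A\n"

merged = merge([city_a, city_b])
print(len(merged), "row(s) merged without any city-specific code")
```

Only the first city’s data makes it through. The second city published the same kind of information, but without the shared headers it is invisible to any app built at scale.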

Easy enough in theory, this is remarkably hard in practice. Consider public transportation information: routes, stops, and schedules for buses, subways, and light rail. Even that simple information would have to rely on at least nine data streams, all hopefully synced up, and all then standardized from region to region. This hardly ever happens.

This is a windmill I’ve tilted at for some time. I can safely point to two examples for local governments where this has effectively happened: transit data and restaurant inspection scores. (The scarcity of examples is why those are a common refrain in this discussion.)

The best summary of the transit data example (probably the oldest and most widely adopted) comes from one of its chief architects, Bibiana McHugh, in Beyond Transparency (an open data anthology published by CfA). Read it, you should, if you’re interested in this area. The CliffsNotes version is straightforward. It’s called the “killer app” stratagem for data standards. Basically a big, popular third party (e.g. Google) builds a technology apt for consumption of public data (e.g. public transit schedules), and because of its potential scale and impact, that company is able to partner with an agency (or a few) to develop a schema and make it into the de facto standard by sheer force of appeal. More and more people were using Google Maps for directions, and so soon enough, more and more agencies got on board with the (then) Google Transit Feed Specification, GTFS, now the General Transit Feed Specification. (Naturally, as the standard spread, it dropped its platform-specific name for, well, a more general one.)
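Part of what made the standard spread is how little it asks of a developer. A GTFS feed is, in practice, a zip archive of plain-text CSV files such as stops.txt and routes.txt. The sketch below, in Python, reads two of those tables using only the standard library. The file and column names follow the GTFS specification; the feed contents and the helper functions are made up for illustration.

```python
import csv
import io
import zipfile

def read_table(feed_bytes, filename):
    """Read one CSV table (e.g. stops.txt) out of a zipped feed."""
    with zipfile.ZipFile(io.BytesIO(feed_bytes)) as archive:
        with archive.open(filename) as raw:
            # utf-8-sig tolerates an optional byte-order mark at the start.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))

def build_sample_feed():
    """Build a tiny in-memory feed with made-up data, for illustration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "stops.txt",
            "stop_id,stop_name,stop_lat,stop_lon\nS1,Main St,34.05,-118.24\n",
        )
        archive.writestr(
            "routes.txt",
            "route_id,route_short_name,route_long_name,route_type\n"
            "R1,10,Downtown Express,3\n",
        )
    return buffer.getvalue()

feed = build_sample_feed()
stops = read_table(feed, "stops.txt")
routes = read_table(feed, "routes.txt")
print(stops[0]["stop_name"], "is served by route", routes[0]["route_short_name"])
```

Because every agency uses the same file and column names, the same few lines work for any city’s feed, which is exactly what lets a consumer app scale.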

Similar gambits have been attempted with some traction (though limited when compared to GTFS) for restaurant inspection scores with Yelp and for building inspection data with Trulia. These are limited less in their specific usage with the “killer app” than in their slow uptake by other developers for other apps, which is what would indicate a healthy ecosystem around the standard. (GTFS, for instance, is used by NextBus, a widely popular, open source transit platform.)

All in all, then, we see a handful or so of examples of data standards taking off, even though arguably there are dozens if not hundreds or thousands of cases where readily accessible government data would be useful for citizens in their everyday lives: from notifications of road closures, garbage pickups, or special events to alerts on crime, fires, and other emergencies. And everything in between.

One might push back and argue that since equivalent data standards haven’t emerged for these datasets, maybe they lack salience. But I would argue that a few key phenomena in today’s technological environment have stayed this invisible hand:

Consumer apps increasingly play in a “winner takes all” marketplace, which limits the number of potential “killer apps” out there.
Given the length of time it took a standard like GTFS to take hold (5+ years), there is limited appetite for such a long game among established and emergent players alike.
Social media streams such as Twitter or Facebook have become a kind of consumer platform integration for data (although this makes any kind of structured usage of the “data” next to impossible).

Long story short, there’s a lack of incentive on either side — government or company — to invest in data standardization.

Until now, I hope.

What affords me this optimism? It’s not a what, but a who: Alexa.

Or maybe Siri. Or Cortana. Or Google Assistant. (Ok, that last one doesn’t really work rhetorically…)

These devices do more than simply articulate answers to web searches; they talk to you. They understand (when they can) what you are looking for and instead of serving up a list of links, they deliver the music, answer, or information you’re looking for. When they can. I consider the advancement similar to that from WebTV to Apple TV. If you recall, the former merely projected your typical web experience — well atypical, it was way worse than a traditional monitor — onto your television; Apple TV created a custom, controlled experience designed for the television: optimized for video (which is what you likely wanted), guarded by design standards, and done in partnership with the content providers to ensure a great experience. Similarly the voice assistants of today do indeed have limited functionality, but limited with a purpose: to optimize the experience of the user. You can only get certain answers from services that have stepped up to program them with human friendly, conversational language.

That’s why I was excited to see a city government develop its own offering for Amazon’s Alexa — as hopefully the start of more to come.

Recently the City of Los Angeles launched an Alexa skill that can answer questions about “City Council, Council Committee, and featured events occurring within the City.”

This may seem trivial. But I contend that this opens the door to significantly more interesting, new ways to engage with your local government. It’s not that this information isn’t available online; it is. It’s that now we are beginning to integrate that information into more casual, friendly and direct experiences. That little black cylinder sits in your living room or office. Increasingly, then, we will become more accustomed to engaging with typical consumer services through it (getting the news or the weather, for instance), and slowly but surely more transactional services will emerge; already you can order a pizza or an Uber. You no longer need to know URLs or even really care where the information is coming from; it’s just there.

Now returning to the public sector, the challenge of getting users to go to or download a government-specific app is largely the rationale for pursuing third party data integrations. It’s just hard to compete with Facebook, Google, Yelp, etc. when it comes to public interest. What these in-home devices, synced up with public data or even services, do is subtly insert government resources into your daily routine in a way that doesn’t feel like government at all. These devices:

Reflect the kinds of “human” interaction we have with government staff
Are inherently localized and designed to respond with relevant, local answers
Can become increasingly personalized, meaning that you don’t have to enter your name or address over and over again

That’s a leapfrog opportunity for governments that are still struggling to even build websites or web services on par with the private sector.

For this to work at scale, however, standards must come into play. If every city has to build its own “skill” and encourage people to download its specific version, then adoption will be just as slow as it has been for government apps. (For reference, 311 apps, which are rather common, have alarmingly low usage statistics because, again, we live in a “winner takes all” environment.) Instead, the smart play is the standards play. If we can work on identifying the core pieces of information citizens want and the essential services they need, and develop lightweight systems for sharing that standard information (GTFS, for instance, just requires a government to upload a zip file of plain-text, spreadsheet-like tables), then we can inch toward a world where governments can better do their jobs in a digital age: by being where you need them, when you need them, in a friendly, human way.

N.B.: Congratulations to the City of Los Angeles’ terrific Information Technology agency, specifically Ted Ross and Joyce Edison, for making the Alexa skill a reality — and in so doing inspiring my thinking on this topic.
