As we enter 2024, it is clear that 20th-century institutions are not able to create, manage, or share the data needed to cooperate on the global challenges we face in the 21st century. While many governments, research institutions, civil society organizations, and some commercial organizations have opened up data in hopes that it will be put to good use, there’s little evidence that this is actually happening. We lack institutions capable of consolidating and harmonizing data at a global scale, and it’s time to fix this.
I’ve written before that opening data is not enough. Through many conversations over the past year, I’ve found a few four-legged animal metaphors to explain why our existing institutions aren’t producing the data we need and how we might do better in the future.
The Unicorn
A Unicorn is defined as a private company valued over $1B through venture funding. For the purposes of this exercise, let’s use a looser definition: a Unicorn is an extremely rare company that provides colossal returns on investment by creating a data monopoly. The Unicorn is an investor’s favorite animal.
Unicorn is the right word for these companies. They’re magical examples of how global capital markets can marshal resources, navigate risks, and leverage the Internet to transform or create entire industries. They’re also extremely rare: Bain claims that fewer than 1% of startups valued at $1B are profitable at scale. Their rarity is why this 10-year-old joke from The Onion is still funny: “Economists Advise Nation’s Poor To Invent The Next Facebook.”
A Unicorn is born when a company consolidates enough data to connect two sides of a market. Amazon connects sellers and buyers. Media platforms like Meta, TikTok, and Google connect consumers to advertisers. Uber connects drivers and people who need rides. Airbnb connects people with beds and people who need beds. All of these businesses used capital investments to create technologies like apps, devices, and even entire operating systems that gather data they can use to generate profits for their investors. While Unicorns show us how valuable data can be, their rarity highlights how expensive and difficult it is to create great data.
These are amazing businesses for investors, and they have created great experiences for consumers, but they can collapse under their own weight. Once a Unicorn has saturated its market, it becomes harder for it to grow at rates that will satisfy its investors. At this point, it will be obligated to raise its fees or diminish the quality of its services in order to continue growing its revenues. We’ve all experienced this through price hikes and increased advertising on the services we use. There’s even a name for this process: enshittification (see also, platform capitalism).
I’m not mad about Unicorns. I’m glad they exist. It’s good that we’ve figured out how to finance audacious ideas that can benefit so many people, and there are many businesses that are well served by equity investing. Many Unicorns generate so much wealth that they can afford to create open source projects or give away services that would have been unimaginable just years ago. Radiant Earth certainly wouldn’t exist without the services and largesse of big tech companies.
That said, there’s a simple reason why we shouldn’t count on Unicorns to solve global problems: it’s not their job. Unicorns exist to generate returns for their investors, and we can anticipate both positive and negative externalities as they do so.
We should be clear-eyed about the inherent limitations of Unicorns as we consider the systems we need to collaborate on global challenges. I have heard countless pitches from sustainability data startups that explicitly seek to be a platform that will connect corporations and governments with the data they need to meet their sustainability goals. We shouldn’t waste our time building these capitalized platforms. If they succeed at dominating their market, they’ll almost certainly enter a profit maximizing phase that will limit access to data or result in poor data management practices. We’ve already seen this happen with some “sustainability intelligence” services.
There are plenty of governments and NGOs that are rightfully skeptical of investor-owned businesses’ ability to manage the data needed to solve global challenges, which brings us to our next animal…
The Show Pony
Show Pony is the term I use to describe most of the data platforms created by public sector institutions. Show Ponies have cool websites. Some of them boast portals or application programming interfaces (APIs). Sometimes they have mobile apps. They’ll often feature an interactive dashboard, sometimes with a beautiful map. They look great.
I call these Show Ponies because they don’t exist as part of a natural ecosystem. They exist to be touted. The Show Pony’s natural habitat is a conference. You rarely see them in the real world. Unlike with the Unicorns, I won’t name examples because I know and love many people who work on Show Ponies for a living.
Show Ponies are typically built with limited grant funding that is allocated on a project basis. Sometimes they’re created merely to be a proof of concept. In other cases, their funders hope that “if you build it, they will come.” But because Show Ponies are usually funded by governments or non-profit organizations, they rarely have a revenue model. So even if it does gain traction and users, a Show Pony’s continued existence depends on continued support from governments or philanthropy rather than from its users. This is a fragile existence, and the Internet is littered with neglected Show Ponies that aren’t being maintained.
The way we fund Show Ponies also hampers global cooperation. Show Ponies produce data that reflect the biases, capabilities, and priorities of their funders. They produce data in different formats, with different quality standards, on different time scales, under different licenses. This variability makes it prohibitively expensive for most organizations to consolidate and harmonize data from Show Ponies to produce intelligence they can use.
I’m not mad about Show Ponies either! They’re evidence of what’s possible and the kind of change that we want to see in the world. But it’s time to stop funding Show Ponies and find another way to build global data infrastructure.
The Gazelle
Now that I’ve described two straw animals, let me tell you about my favorite animal. We should know by now that grants rarely produce durable technology services and that equity investing isn’t the only way to finance technology. It is time to blaze a middle path in which philanthropic capital funds a variety of data service providers that are accountable to paying customers but aren’t owned by anyone. I call them Gazelles.
Here’s why I like the term:
Gazelles are small. It’s time to admit that no single organization will ever be up to the task of consolidating and stewarding all of the world’s data. Instead, we need small entities that can focus on discrete pieces of global data sharing infrastructure – e.g. data standards, universal identifiers, data publishing, data consolidation and stewardship.
Gazelles travel in herds. The global challenges we face are complex and they require close collaboration among many stakeholders. This requires an egoless commitment to cooperation that celebrates innovation no matter where it comes from. And if you’ll forgive the morbid side of this metaphor, herds have redundancy built into them. Some Gazelles might fail and go away, some of them may be replaced, but the herd won’t stop moving forward.
Gazelles are fast. Every day, it gets easier and less expensive to create scalable global digital platforms due to advances in network connectivity, the rise of ubiquitous mobile computing, competition within the commercial cloud sector, and continually expanding open source software communities. I should note that we have a few Unicorns and public sector entities to thank for this progress. If we’re building technology today, we have to move quickly. A Gazelle doesn’t have time to write 10-year strategic plans.
Gazelles are wild. A Gazelle should be governed by a nonprofit or trust that protects it from being owned by anyone, but it should be accountable to paying customers in order to survive. By being financially self-sustaining, Gazelles can avoid the fate of pet projects that end up neglected when their executive sponsor loses interest or is replaced.
Fortunately, Gazelles are also real. Existing organizations that could be classified as Gazelles include:
- 2i2c - a US non-profit that designs, develops, and operates JupyterHubs in the cloud for communities of practice in research & education.
- Clay - a project to produce open source machine learning models that can analyze Earth science data. (Note that Radiant Earth is the fiscal sponsor of Clay.)
- Climate Policy Radar - a UK not-for-profit that organizes, analyzes, and democratizes data on global climate law and policy, accelerating the transition to a low-carbon, resilient and just world.
- Common Crawl - a US non-profit that maintains a free, open repository of web crawl data that can be used by anyone.
- First Street Foundation - a US non-profit that makes climate risk data accessible, easy to understand, and actionable for individuals, governments, and industry.
- Global Legal Entity Identifier Foundation - a Swiss foundation that provides Legal Entity Identifiers as a source of open, standardized, and high-quality data about businesses worldwide.
- Open Supply Hub - a US non-profit that makes supply chain data open, accessible, and trusted for the public benefit.
- PLACE - a US-based non-profit that exists to map the urban world in ultra-high resolution and make these maps open, reliable and accessible. (Note that I am an advisor to PLACE.)
- Source Cooperative - a data publishing utility that allows trusted organizations and individuals to share data products using standard HTTP methods. (Note that Source Cooperative is one of our own initiatives!)
The Year of the Gazelle
I want 2024 to be the year of the Gazelle. I want funders to recognize the urgent need to create radically new 21st-century institutions that enable the global cooperation we need. It has never been cheaper or easier to create global data sharing infrastructure – it’s time to figure out how to create it sustainably and ethically.
We have a lot of work ahead of us to make this happen. Foremost, we need a better definition of the concept. This post is a rough outline of what a new data sharing institution might look like, but it’s still too vague. In the coming months, we will be working on a Gazelle Manifesto with clearer criteria for Gazelles that address things like corporate governance and revenue models. Fortunately, we have a lot of inspiration to draw on, such as The Principles of Open Scholarly Infrastructure, the concept of Focused Research Organizations (FROs), and the Institutional Architecture Lab.
We will also explore these ideas as we participate in a new learning network focused on climate data sharing launched by the Digital Impact Alliance at COP28.
If you’re interested in sponsoring a Gazelle or helping us refine these concepts, please get in touch. You can reach us at firstname.lastname@example.org and I encourage you to reach out directly to any of the organizations listed above.
Thanks to all of the people who have inspired and helped stress test these ideas with me, including Priya Vora, Davide Ceper, Michal Nachmany, Ian Schuler, Robert Cheetham, Bruno Sánchez-Andrade Nuño, Dan Hammer, Natalie Grillon, Chris Holdgraf, Chris Holmes, Jessica Seddon, Geoff Mulgan, Mala Kumar, Jeff Maki, Jack Hardinges, Joe Flasher, Christa Hasenkopf, and Peter Rabley. Very special thanks to Craig Mills for spurring me on and coming up with the initial vision of a new generation of nonprofits: “as agile as antelopes, tech-savvy, and uncompromisingly collaborative.”