What Is Web 2.0
Design Patterns and Business Models for the Next Generation of Software
by Tim O'Reilly
09/30/2005

The bursting of the dot-com bubble in the fall of 2001 marked a turning point for the web. Many people concluded that the web was overhyped, when in fact bubbles and consequent shakeouts appear to be a common feature of all technological revolutions. Shakeouts typically mark the point at which an ascendant technology is ready to take its place at center stage. The pretenders are given the bum's rush, the real success stories show their strength, and there begins to be an understanding of what separates one from the other.

The concept of "Web 2.0" began with a conference brainstorming session between O'Reilly and MediaLive International. Dale Dougherty, web pioneer and O'Reilly VP, noted that far from having "crashed", the web was more important than ever, with exciting new applications and sites popping up with surprising regularity. What's more, the companies that had survived the collapse seemed to have some things in common. Could it be that the dot-com collapse marked some kind of turning point for the web, such that a call to action such as "Web 2.0" might make sense? We agreed that it did, and so the Web 2.0 Conference was born.

In the year and a half since, the term "Web 2.0" has clearly taken hold, with more than 9.5 million citations in Google. But there's still a huge amount of disagreement about just what Web 2.0 means, with some people decrying it as a meaningless marketing buzzword, and others accepting it as the new conventional wisdom.

This article is an attempt to clarify just what we mean by Web 2.0.

In our initial brainstorming, we formulated our sense of Web 2.0 by example:

Web 1.0 --> Web 2.0
DoubleClick --> Google AdSense
Ofoto --> Flickr
Akamai --> BitTorrent
mp3.com --> Napster
Britannica Online --> Wikipedia
personal websites --> blogging
evite --> upcoming.org and EVDB
domain name speculation --> search engine optimization
page views --> cost per click
screen scraping --> web services
publishing --> participation
content management systems --> wikis
directories (taxonomy) --> tagging ("folksonomy")
stickiness --> syndication

The list went on and on. But what was it that made us identify one application or approach as "Web 1.0" and another as "Web 2.0"? (The question is particularly urgent because the Web 2.0 meme has become so widespread that companies are now pasting it on as a marketing buzzword, with no real understanding of just what it means. The question is particularly difficult because many of those buzzword-addicted startups are definitely not Web 2.0, while some of the applications we identified as Web 2.0, like Napster and BitTorrent, are not even properly web applications!) We began trying to tease out the principles that are demonstrated in one way or another by the success stories of web 1.0 and by the most interesting of the new applications.

1. The Web As Platform

Like many important concepts, Web 2.0 doesn't have a hard boundary, but rather, a gravitational core. You can visualize Web 2.0 as a set of principles and practices that tie together a veritable solar system of sites that demonstrate some or all of those principles, at a varying distance from that core.

[Figure 1: Web 2.0 Meme Map]
Figure 1 shows a "meme map" of Web 2.0 that was developed at a brainstorming session during FOO Camp, a conference at O'Reilly Media. It's very much a work in progress, but shows the many ideas that radiate out from the Web 2.0 core.

For example, at the first Web 2.0 conference, in October 2004, John Battelle and I listed a preliminary set of principles in our opening talk. The first of those principles was "The web as platform." Yet that was also a rallying cry of Web 1.0 darling Netscape, which went down in flames after a heated battle with Microsoft. What's more, two of our initial Web 1.0 exemplars, DoubleClick and Akamai, were both pioneers in treating the web as a platform. People don't often think of it as "web services", but in fact, ad serving was the first widely deployed web service, and the first widely deployed "mashup" (to use another term that has gained currency of late). Every banner ad is served as a seamless cooperation between two websites, delivering an integrated page to a reader on yet another computer. Akamai also treats the network as the platform, and at a deeper level of the stack, building a transparent caching and content delivery network that eases bandwidth congestion.

Nonetheless, these pioneers provided useful contrasts because later entrants have taken their solution to the same problem even further, understanding something deeper about the nature of the new platform. Both DoubleClick and Akamai were Web 2.0 pioneers, yet we can also see how it's possible to realize more of the possibilities by embracing additional Web 2.0 design patterns. Let's drill down for a moment into each of these three cases, teasing out some of the essential elements of difference.

Netscape vs. Google

If Netscape was the standard bearer for Web 1.0, Google is most certainly the standard bearer for Web 2.0, if only because their respective IPOs were defining events for each era. So let's start with a comparison of these two companies and their positioning.
Netscape framed "the web as platform" in terms of the old software paradigm: their flagship product was the web browser, a desktop application, and their strategy was to use their dominance in the browser market to establish a market for high-priced server products. Control over standards for displaying content and applications in the browser would, in theory, give Netscape the kind of market power enjoyed by Microsoft in the PC market. Much like the "horseless carriage" framed the automobile as an extension of the familiar, Netscape promoted a "webtop" to replace the desktop, and planned to populate that webtop with information updates and applets pushed to the webtop by information providers who would purchase Netscape servers. In the end, both web browsers and web servers turned out to be commodities, and value moved "up the stack" to services delivered over the web platform.

Google, by contrast, began its life as a native web application, never sold or packaged, but delivered as a service, with customers paying, directly or indirectly, for the use of that service. None of the trappings of the old software industry are present. No scheduled software releases, just continuous improvement. No licensing or sale, just usage. No porting to different platforms so that customers can run the software on their own equipment, just a massively scalable collection of commodity PCs running open source operating systems plus homegrown applications and utilities that no one outside the company ever gets to see.
At bottom, Google requires a competency that Netscape never needed: database management. Google isn't just a collection of software tools, it's a specialized database. Without the data, the tools are useless; without the software, the data is unmanageable. Software licensing and control over APIs--the lever of power in the previous era--is irrelevant because the software never need be distributed but only performed, and also because without the ability to collect and manage the data, the software is of little use. In fact, the value of the software is proportional to the scale and dynamism of the data it helps to manage.

Google's service is not a server--though it is delivered by a massive collection of internet servers--nor a browser--though it is experienced by the user within the browser. Nor does its flagship search service even host the content that it enables users to find. Much like a phone call, which happens not just on the phones at either end of the call, but on the network in between, Google happens in the space between browser and search engine and destination content server, as an enabler or middleman between the user and his or her online experience.

While both Netscape and Google could be described as software companies, it's clear that Netscape belonged to the same software world as Lotus, Microsoft, Oracle, SAP, and other companies that got their start in the 1980s software revolution, while Google's fellows are other internet applications like eBay, Amazon, Napster, and yes, DoubleClick and Akamai.

DoubleClick vs. Overture and AdSense

Like Google, DoubleClick is a true child of the internet era. It harnesses software as a service, has a core competency in data management, and, as noted above, was a pioneer in web services long before web services even had a name. However, DoubleClick was ultimately limited by its business model.
It bought into the '90s notion that the web was about publishing, not participation; that advertisers, not consumers, ought to call the shots; that size mattered, and that the internet was increasingly being dominated by the top websites as measured by MediaMetrix and other web ad scoring companies. As a result, DoubleClick proudly cites on its website "over 2000 successful implementations" of its software. Yahoo! Search Marketing (formerly Overture) and Google AdSense, by contrast, already serve hundreds of thousands of advertisers apiece.

Overture and Google's success came from an understanding of what Chris Anderson refers to as "the long tail," the collective power of the small sites that make up the bulk of the web's content. DoubleClick's offerings require a formal sales contract, limiting their market to the few thousand largest websites. Overture and Google figured out how to enable ad placement on virtually any web page. What's more, they eschewed publisher/ad-agency friendly advertising formats such as banner ads and popups in favor of minimally intrusive, context-sensitive, consumer-friendly text advertising.

The Web 2.0 lesson: leverage customer self-service and algorithmic data management to reach out to the entire web, to the edges and not just the center, to the long tail and not just the head.

Not surprisingly, other Web 2.0 success stories demonstrate this same behavior. eBay enables occasional transactions of only a few dollars between single individuals, acting as an automated intermediary. Napster (though shut down for legal reasons) built its network not by building a centralized song database, but by architecting a system in such a way that every downloader also became a server, and thus grew the network.

Akamai vs. BitTorrent

Like DoubleClick, Akamai is optimized to do business with the head, not the tail, with the center, not the edges. While it serves the benefit of the individuals at the edge of the web by smoothing their access to the high-demand sites at the center, it collects its revenue from those central sites.

BitTorrent, like other pioneers in the P2P movement, takes a radical approach to internet decentralization. Every client is also a server; files are broken up into fragments that can be served from multiple locations, transparently harnessing the network of downloaders to provide both bandwidth and data to other users. The more popular the file, in fact, the faster it can be served, as there are more users providing bandwidth and fragments of the complete file.

BitTorrent thus demonstrates a key Web 2.0 principle: the service automatically gets better the more people use it. While Akamai must add servers to improve service, every BitTorrent consumer brings his own resources to the party. There's an implicit "architecture of participation", a built-in ethic of cooperation, in which the service acts primarily as an intelligent broker, connecting the edges to each other and harnessing the power of the users themselves.

A Platform Beats an Application Every Time

In each of its past confrontations with rivals, Microsoft has successfully played the platform card, trumping even the most dominant applications. Windows allowed Microsoft to displace Lotus 1-2-3 with Excel, WordPerfect with Word, and Netscape Navigator with Internet Explorer.

This time, though, the clash isn't between a platform and an application, but between two platforms, each with a radically different business model: On the one side, a single software provider, whose massive installed base and tightly integrated operating system and APIs give control over the programming paradigm; on the other, a system without an owner, tied together by a set of protocols, open standards and agreements for cooperation.

Windows represents the pinnacle of proprietary control via software APIs. Netscape tried to wrest control from Microsoft using the same techniques that Microsoft itself had used against other rivals, and failed. But Apache, which held to the open standards of the web, has prospered. The battle is no longer unequal, a platform versus a single application, but platform versus platform, with the question being which platform, and more profoundly, which architecture, and which business model, is better suited to the opportunity ahead.

Windows was a brilliant solution to the problems of the early PC era. It leveled the playing field for application developers, solving a host of problems that had previously bedeviled the industry. But a single monolithic approach, controlled by a single vendor, is no longer a solution, it's a problem. Communications-oriented systems, as the internet-as-platform most certainly is, require interoperability. Unless a vendor can control both ends of every interaction, the possibilities of user lock-in via software APIs are limited.

Any Web 2.0 vendor that seeks to lock in its application gains by controlling the platform will, by definition, no longer be playing to the strengths of the platform.

This is not to say that there are not opportunities for lock-in and competitive advantage, but we believe they are not to be found via control over software APIs and protocols. There is a new game afoot. The companies that succeed in the Web 2.0 era will be those that understand the rules of that game, rather than trying to go back to the rules of the PC software era.

2. Harnessing Collective Intelligence

The central principle behind the success of the giants born in the Web 1.0 era who have survived to lead the Web 2.0 era appears to be this, that they have embraced the power of the web to harness collective intelligence:
• Hyperlinking is the foundation of the web. As users add new content, and new sites, it is bound in to the structure of the web by other users discovering the content and linking to it. Much as synapses form in the brain, with associations becoming stronger through repetition or intensity, the web of connections grows organically as an output of the collective activity of all web users.
• Yahoo!, the first great internet success story, was born as a catalog, or directory of links, an aggregation of the best work of thousands, then millions of web users. While Yahoo! has since moved into the business of creating many types of content, its role as a portal to the collective work of the net's users remains the core of its value.
• Google's breakthrough in search, which quickly made it the undisputed search market leader, was PageRank, a method of using the link structure of the web rather than just the characteristics of documents to provide better search results.
• eBay's product is the collective activity of all its users; like the web itself, eBay grows organically in response to user activity, and the company's role is as an enabler of a context in which that user activity can happen. What's more, eBay's competitive advantage comes almost entirely from the critical mass of buyers and sellers, which makes any new entrant offering similar services significantly less attractive.
• Amazon sells the same products as competitors such as Barnesandnoble.com, and they receive the same product descriptions, cover images, and editorial content from their vendors. But Amazon has made a science of user engagement. They have an order of magnitude more user reviews, invitations to participate in varied ways on virtually every page--and even more importantly, they use user activity to produce better search results. While a Barnesandnoble.com search is likely to lead with the company's own products, or sponsored results, Amazon always leads with "most popular", a real-time computation based not only on sales but other factors that Amazon insiders call the "flow" around products. With an order of magnitude more user participation, it's no surprise that Amazon's sales also outpace competitors.

Now, innovative companies that pick up on this insight, and perhaps extend it even further, are making their mark on the web:

• Wikipedia, an online encyclopedia based on the unlikely notion that an entry can be added by any web user, and edited by any other, is a radical experiment in trust, applying Eric Raymond's dictum (originally coined in the context of open source software) that "with enough eyeballs, all bugs are shallow," to content creation. Wikipedia is already in the top 100 websites, and many think it will be in the top ten before long. This is a profound change in the dynamics of content creation!
• Sites like del.icio.us and Flickr, two companies that have received a great deal of attention of late, have pioneered a concept that some people call "folksonomy" (in contrast to taxonomy), a style of collaborative categorization of sites using freely chosen keywords, often referred to as tags. Tagging allows for the kind of multiple, overlapping associations that the brain itself uses, rather than rigid categories. In the canonical example, a Flickr photo of a puppy might be tagged both "puppy" and "cute"--allowing for retrieval along natural axes generated by user activity.
• Collaborative spam filtering products like Cloudmark aggregate the individual decisions of email users about what is and is not spam, outperforming systems that rely on analysis of the messages themselves.
• It is a truism that the greatest internet success stories don't advertise their products. Their adoption is driven by "viral marketing"--that is, recommendations propagating directly from one user to
another. You can almost make the case that if a site or product relies on advertising to get the word out, it isn't Web 2.0.
• Even much of the infrastructure of the web--including the Linux, Apache, MySQL, and Perl, PHP, or Python code involved in most web servers--relies on the peer-production methods of open source, in themselves an instance of collective, net-enabled intelligence. There are more than 100,000 open source software projects listed on SourceForge.net. Anyone can add a project, anyone can download and use the code, and new projects migrate from the edges to the center as a result of users putting them to work, an organic software adoption process relying almost entirely on viral marketing.

The lesson: Network effects from user contributions are the key to market dominance in the Web 2.0 era.

Blogging and the Wisdom of Crowds

One of the most highly touted features of the Web 2.0 era is the rise of blogging. Personal home pages have been around since the early days of the web, and the personal diary and daily opinion column around much longer than that, so just what is the fuss all about?

At its most basic, a blog is just a personal home page in diary format. But as Rich Skrenta notes, the chronological organization of a blog "seems like a trivial difference, but it drives an entirely different delivery, advertising and value chain."

One of the things that has made a difference is a technology called RSS. RSS is the most significant advance in the fundamental architecture of the web since early hackers realized that CGI could be used to create database-backed websites. RSS allows someone to link not just to a page, but to subscribe to it, with notification every time that page changes. Skrenta calls this "the incremental web." Others call it the "live web".

Now, of course, "dynamic websites" (i.e., database-backed sites with dynamically generated content) replaced static web pages well over ten years ago.
What's dynamic about the live web are not just the pages, but the links. A link to a weblog is expected to point to a perennially changing page, with "permalinks" for any individual entry, and notification for each change. An RSS feed is thus a much stronger link than, say, a bookmark or a link to a single page.

RSS also means that the web browser is not the only means of viewing a web page. While some RSS aggregators, such as Bloglines, are web-based, others are desktop clients, and still others allow users of portable devices to subscribe to constantly updated content.

RSS is now being used to push not just notices of new blog entries, but also all kinds of data updates, including stock quotes, weather data, and photo availability. This use is actually a return to one of its roots: RSS was born in 1997 out of the confluence of Dave Winer's "Really Simple Syndication" technology, used to push out blog updates, and Netscape's "Rich Site Summary", which allowed users to create custom Netscape home pages with regularly updated data flows. Netscape lost interest, and the technology was carried forward by blogging pioneer Userland, Winer's company. In the current crop of applications, we see, though, the heritage of both parents.

But RSS is only part of what makes a weblog different from an ordinary web page. Tom Coates remarks on the significance of the permalink:

"It may seem like a trivial piece of functionality now, but it was effectively the device that turned weblogs from an ease-of-publishing phenomenon into a conversational mess of overlapping communities. For the first time it became relatively easy to gesture directly at a highly specific post on someone else's site and talk about it. Discussion emerged. Chat emerged. And - as a result - friendships emerged or became more entrenched. The permalink was the first - and most successful - attempt to build bridges between weblogs."

In many ways, the combination of RSS and permalinks adds many of the features of NNTP, the Network News Transfer Protocol of Usenet, onto HTTP, the web protocol. The "blogosphere" can be thought of as a new, peer-to-peer equivalent to Usenet and bulletin-boards, the conversational watering holes of the early internet. Not only can people subscribe to each other's sites, and easily link to individual comments on a page, but also, via a mechanism known as trackbacks, they can see when anyone else links to their pages, and can respond, either with reciprocal links, or by adding comments.

Interestingly, two-way links were the goal of early hypertext systems like Xanadu. Hypertext purists have celebrated trackbacks as a step towards two-way links. But note that trackbacks are not properly two-way--rather, they are really (potentially) symmetrical one-way links that create the effect of two-way links. The difference may seem subtle, but in practice it is enormous. Social networking systems like Friendster, Orkut, and LinkedIn, which require acknowledgment by the recipient in order to establish a connection, lack the same scalability as the web. As noted by Caterina Fake, co-founder of the Flickr photo sharing service, attention is only coincidentally reciprocal. (Flickr thus allows users to set watch lists--any user can subscribe to any other user's photostream via RSS. The object of attention is notified, but does not have to approve the connection.)

If an essential part of Web 2.0 is harnessing collective intelligence, turning the web into a kind of global brain, the blogosphere is the equivalent of constant mental chatter in the forebrain, the voice we hear in all of our heads. It may not reflect the deep structure of the brain, which is often unconscious, but is instead the equivalent of conscious thought. And as a reflection of conscious thought and attention, the blogosphere has begun to have a powerful effect.

First, because search engines use link structure to help predict useful pages, bloggers, as the most prolific and timely linkers, have a disproportionate role in shaping search engine results. Second, because the blogging community is so highly self-referential, bloggers paying attention to other bloggers magnifies their visibility and power. The "echo chamber" that critics decry is also an amplifier.

If it were merely an amplifier, blogging would be uninteresting. But like Wikipedia, blogging harnesses collective intelligence as a kind of filter. What James Surowiecki calls "the wisdom of crowds" comes into play, and much as PageRank produces better results than analysis of any individual document, the collective attention of the blogosphere selects for value.

While mainstream media may see individual blogs as competitors, what is really unnerving is that the competition is with the blogosphere as a whole. This is not just a competition between sites, but a competition between business models. The world of Web 2.0 is also the world of what Dan Gillmor calls "we, the media," a world in which "the former audience", not a few people in a back room, decides what's important.

The Architecture of Participation

Some systems are designed to encourage participation. In his paper, The Cornucopia of the Commons, Dan Bricklin noted that there are three ways to build a large database. The first, demonstrated by Yahoo!, is to pay people to do it. The second, inspired by lessons from the open source community, is to get volunteers to perform the same task. The Open Directory Project, an open source Yahoo! competitor, is the result. But Napster demonstrated a third way. Because Napster set its defaults to automatically serve any music that was downloaded, every user automatically helped to build the value of the shared database. This same approach has been followed by all other P2P file sharing services.

One of the key lessons of the Web 2.0 era is this: Users add value. But only a small percentage of users will go to the trouble of adding value to your application via explicit means. Therefore, Web 2.0 companies set inclusive defaults for aggregating user data and building value as a side-effect of ordinary use of the application. As noted above, they build systems that get better the more people use them.

Mitch Kapor once noted that "architecture is politics." Participation is intrinsic to Napster, part of its fundamental architecture.

This architectural insight may also be more central to the success of open source software than the more frequently cited appeal to volunteerism. The architecture of the internet, and the World Wide Web, as well as of open source software projects like Linux, Apache, and Perl, is such that users pursuing their own "selfish" interests build collective value as an automatic byproduct. Each of these projects has a small core, well-defined extension mechanisms, and an approach that lets any well-behaved component be added by anyone, growing the outer layers of what Larry Wall, the creator of Perl, refers to as "the onion." In other words, these technologies demonstrate network effects, simply through the way that they have been designed.

These projects can be seen to have a natural architecture of participation. But as Amazon demonstrates, by consistent effort (as well as economic incentives such as the Associates program), it is possible to overlay such an architecture on a system that would not normally seem to possess it.

3. Data is the Next Intel Inside

Every significant internet application to date has been backed by a specialized database: Google's web crawl, Yahoo!'s directory (and web crawl), Amazon's database of products, eBay's database of products and sellers, MapQuest's map databases, Napster's distributed song database. As Hal Varian remarked in a personal conversation last year, "SQL is the new HTML." Database management is a core competency of Web 2.0 companies, so much so that we have sometimes referred to these applications as "infoware" rather than merely software.

This fact leads to a key question: Who owns the data?

In the internet era, one can already see a number of cases where control over the database has led to market control and outsized financial returns. The monopoly on domain name registry initially granted by government fiat to Network Solutions (later purchased by Verisign) was one of the first great moneymakers of the internet. While we've argued that business advantage via controlling software APIs is much more difficult in the age of the internet, control of key data sources is not, especially if those data sources are expensive to create or amenable to increasing returns via network effects.

Look at the copyright notices at the base of every map served by MapQuest, maps.yahoo.com, maps.msn.com, or maps.google.com, and you'll see the line "Maps copyright NavTeq, TeleAtlas," or with the new satellite
imagery services, "Images copyright Digital Globe." These companies made substantial investments in their databases (NavTeq alone reportedly invested $750 million to build their database of street addresses and directions. Digital Globe spent $500 million to launch their own satellite to improve on government-supplied imagery.) NavTeq has gone so far as to imitate Intel's familiar Intel Inside logo: Cars with navigation systems bear the imprint, "NavTeq Onboard." Data is indeed the Intel Inside of these applications, a sole source component in systems whose software infrastructure is largely open source or otherwise commodified.

The now hotly contested web mapping arena demonstrates how a failure to understand the importance of owning an application's core data will eventually undercut its competitive position. MapQuest pioneered the web mapping category in 1995, yet when Yahoo!, and then Microsoft, and most recently Google, decided to enter the market, they were easily able to offer a competing application simply by licensing the same data.

Contrast, however, the position of Amazon.com. Like competitors such as Barnesandnoble.com, its original database came from ISBN registry provider R.R. Bowker. But unlike MapQuest, Amazon relentlessly enhanced the data, adding publisher-supplied data such as cover images, table of contents, index, and sample material. Even more importantly, they harnessed their users to annotate the data, such that after ten years, Amazon, not Bowker, is the primary source for bibliographic data on books, a reference source for scholars and librarians as well as consumers. Amazon also introduced their own proprietary identifier, the ASIN, which corresponds to the ISBN where one is present, and creates an equivalent namespace for products without one. Effectively, Amazon "embraced and extended" their data suppliers.

Imagine if MapQuest had done the same thing, harnessing their users to annotate maps and directions, adding layers of value. It would have been much more difficult for competitors to enter the market just by licensing the base data.

The recent introduction of Google Maps provides a living laboratory for the competition between application vendors and their data suppliers. Google's lightweight programming model has led to the creation of numerous value-added services in the form of mashups that link Google Maps with other internet-accessible data sources. Paul Rademacher's housingmaps.com, which combines Google Maps with Craigslist apartment rental and home purchase data to create an interactive housing search tool, is the pre-eminent example of such a mashup. At present, these mashups are mostly innovative experiments, done by hackers. But entrepreneurial activity follows close behind. And already, one can see that for at least one class of developer, Google has taken the role of data source away from NavTeq and inserted themselves as a favored intermediary. We expect to see battles between data suppliers and application vendors in the next few years, as both realize just how important certain classes of data will become as building blocks for Web 2.0 applications.

The race is on to own certain classes of core data: location, identity, calendaring of public events, product identifiers and namespaces. In many cases, where there is significant cost to create the data, there may be an opportunity for an Intel Inside style play, with a single source for the data. In others, the winner will be the company that first reaches critical mass via user aggregation, and turns that aggregated data into a system service.

For example, in the area of identity, PayPal, Amazon's 1-click, and the millions of users of communications systems, may all be legitimate contenders to build a network-wide identity database. (In this regard, Google's