The Rise of Chrome

Since Chrome’s initial release in 2008 it has grown in market share, and now controls more than half of the desktop browsers market. In contrast with Internet Explorer, the previous dominant browser, the growing dominance of Chrome was not achieved by marketing practices such as bundling the browser with a pre-loaded operating system. The shift to Chrome therefore raises the question of how Chrome achieved this remarkable feat, while other browsers such as Firefox and Opera were left behind. We show that both the performance of Chrome and its conformance with relevant standards are typically better than those of the two main contending browsers, Internet Explorer and Firefox. In addition, based on a survey of the importance of 25 major features, Chrome product managers seem to have made somewhat better decisions in selecting where to put effort. Thus the rise of Chrome is consistent with technical superiority over the competition.


Introduction
The most notable use of the Internet is the World Wide Web (WWW). The web was created by Tim Berners-Lee and his colleagues at CERN (The European Organization for Nuclear Research) in 1989. In order to consume information from the web, one must use a web browser to view web pages. The first web browser (which was in fact named WorldWideWeb) was developed at CERN as part of the WWW project [1]. But the first popular browser, which set the growth of the web in motion towards the wide use we see today, was Mosaic, which was developed by Marc Andreessen and Eric Bina at the National Center for Supercomputing Applications (NCSA) in 1993 [40].
The open nature of the web makes it possible for different browsers to co-exist, possibly providing different features, user interfaces, and operating system support. Over the years different browsers have competed for the user's choice. The competition between browsers has led to several "browser wars": periods of fierce competition between different web browsers that are characterized by technological innovation and aggressive marketing, typically leading to the eventual dominance of one browser and the fall of another. In recent years we have witnessed such a shift (albeit somewhat protracted) from Microsoft's Internet Explorer to Google's Chrome.
The reasons for this shift are most probably a mix of technical reasons and marketing reasons. Our goal is to explore the technical aspects, and see whether they can explain the growing popularity of Chrome. In particular, we wanted to assess the technical quality of Chrome and compare it with the quality of its rivals. To do so we downloaded all the versions of Chrome, Firefox, and Internet Explorer that were released over a period of five years, and compared them using a set of benchmarks which together provide a rather comprehensive coverage of browser functionality and features. As far as we know our work is by far the widest study of its kind.
In a nutshell, we find that Chrome is indeed technically superior to other browsers according to most commonly used benchmarks, and has maintained this superiority throughout its existence. Also, based on a survey of 254 users, the features pioneered by Chrome ahead of its competitors tend to be those that the users consider more important. Thus Chrome's rise to dominance is consistent with technical superiority. However, one cannot rule out the large effect of the Google brand and the marketing effort that was invested as factors that contributed greatly to the realization of Chrome's technical potential.

Browsers History
Not long after the release of the Mosaic web browser in 1993 it became the most common web browser, keeping its position until the end of 1994. The factors contributing to Mosaic's popularity were inline graphics, which showed text and graphics on the same page, and popularizing the point and click method of surfing. Moreover, it was the first browser to be cross-platform, including Windows and Macintosh ports. Amazingly, by the end of 1995 its popularity plummeted to only 5% of the web browser market [28]. This collapse in Mosaic's popularity was concurrent with the rapid rise of Netscape Navigator, which was released in December 1994 and managed in less than two years to reach around 80% market share (different sources cite somewhat different numbers).
Several factors are believed to have caused the fast adoption of Netscape by users. First, it was a natural followup of Mosaic as it was developed by the same people. Second, Netscape introduced many technological innovations such as on-the-fly page rendering, JavaScript, cookies, and Java applets [28]. Third, Netscape introduced new approaches to testing and distribution of web browsers by releasing frequent beta versions to users in order to test them and get feedback [43].
Netscape's popularity peaked in 1996 when it held around 80% market share. But in August 1995 Microsoft released the first version of Internet Explorer based on an NCSA Mosaic license. A year later, in August 1996, with the release of Internet Explorer 3, a browser war was on. By August 1999 Internet Explorer enjoyed 76% market share [41].
During this browser war it seems that Internet Explorer did not have any technological advantage over Netscape, and even might have been inferior. Therefore, other reasons are needed to explain Internet Explorer's success. One reason was that Netscape's cross-platform development wasn't economical: instead of focusing on one dominant platform (Windows) it had approximately 20 platforms, which caused a loss of focus. Meanwhile, Microsoft focused on only one platform. Second, Microsoft bundled Internet Explorer with Windows without a charge, and as Windows dominated the desktop operating systems market Explorer was immediately available to the majority of users without any effort on their part. In an antitrust investigation in the U.S., Microsoft was found guilty of abusing its monopoly in the operating systems market by bundling Internet Explorer with Windows for free. Lewis describes this as follows [33]: "Adding Internet Explorer to Windows 95 and calling it Windows 98 is innovation in Gates' terminology, but it is monopolizing according to DOJ." Settling the antitrust case took several years (October 1997 to November 2002), during which Internet Explorer deposed Netscape as the most popular browser. And once Internet Explorer was entrenched, its market share grew even more due to a positive feedback effect. The standard tags used in HTML (Hyper-Text Markup Language, in which web pages are written) are defined by the W3C (World Wide Web Consortium). However, both Microsoft and Netscape extended the HTML standard with their own special tags, thus creating two competing sets of HTML tags and behaviors. Web developers with limited resources then had to choose one of these tag sets, and as Internet Explorer usage grew they opted to use Internet Explorer's extensions, thereby making it ever more preferable over Netscape [35].

Browsers Usage Statistics
In the six years since its release Chrome has dethroned Internet Explorer, and Firefox's market share has also decreased, as shown in Figure 1. Data for such statistics is obtained as follows. Browser usage can be tracked using a counter embedded in the HTML source code of web sites. The counting is implemented using a request to a counting service, enabling the counting service to also extract the browser information from the request and to use it to tabulate browser usage statistics. The data shown is from one of these services, StatCounter.com [11].
There are two main methods to interpret web browser usage. The first method is to measure how many page loads came from each type of browser in a certain period of time. The second method counts how many unique clients (installations) were active in a certain period of time. Therefore, if a user visits 10 web pages, the first method will count these visits as 10 uses of the browser, while the second method will count them as one user. Since the two methods measure different parameters their results may differ. The first method favors browsers that are used by heavy users, while the second method just counts the number of unique users without taking their activity into account, which may be a drawback if we consider users who use the web extensively to be more important. Moreover, identifying unique users is non-trivial and requires manipulating the raw data. We therefore use the raw counts data, and specifically the data for desktop browsers only, not including mobiles and tablets. (The nick in the graphs at August 2012 represents the beginning of collecting data about tablets separately.) As shown in the graph, Chrome's market share has risen consistently over the years, largely at the expense of Internet Explorer. As of January 2015, Chrome was responsible for 51.7% of the page loads while Internet Explorer was responsible for 21.1%, Firefox for 18.7%, and other browsers for 8.4%.
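The difference between the two counting methods can be illustrated with a minimal sketch. The data layout (a list of `(client_id, browser)` pairs, one per page load) and the function names are assumptions made for illustration; they do not describe how StatCounter or any other service actually stores or processes its data.

```python
from collections import Counter


def _shares(counts):
    """Convert raw counts per browser into fractions that sum to 1."""
    total = sum(counts.values())
    return {browser: n / total for browser, n in counts.items()}


def page_load_share(hits):
    """First method: every page load counts once."""
    return _shares(Counter(browser for _client, browser in hits))


def unique_client_share(hits):
    """Second method: every (client, browser) installation counts once."""
    installations = set(hits)
    return _shares(Counter(browser for _client, browser in installations))


# Hypothetical log: one heavy Chrome user and one light Firefox user.
hits = [("client-a", "Chrome")] * 10 + [("client-b", "Firefox")] * 2

by_loads = page_load_share(hits)        # Chrome dominates (10 of 12 loads)
by_clients = unique_client_share(hits)  # one installation each
```

With this toy log the first method attributes most usage to Chrome, while the second splits it evenly, which is the heavy-user effect described above.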
Note that in fact any site can track the distribution of browser usage among the users who access that site. Such tracking may lead to different results if the visitors to a certain site prefer a certain browser. For example, w3schools.com (a site devoted to tutorials about web development) also publishes data about browser usage. Their results for January 2015 are that 61.9% use Chrome, 23.4% use Firefox, and only 7.8% use Internet Explorer [12]. This probably reflects a biased user population of web developers who tend to work on Linux platforms rather than on Windows. At the opposite extreme, netmarketshare.com claims that only 23% of the market uses Chrome, while fully 58% still use Internet Explorer (these figures are again for January 2015) [13]. There is a danger that the StatCounter data is also biased, but it is thought to provide a good reflection of common usage by the public on popular web sites, because its data is based on counters embedded in many different sites. Further justification is given in the threats to validity section.

Research Questions
Despite possible differences in usage statistics, it is clear that Chrome is now a dominant player in the web browser market. The question is how this dominance was achieved, and in particular whether it is justified from the technical point of view. We divide this into the following specific questions:
1. Is Chrome technically superior to its competitors? Specifically,
(a) Is the performance of Chrome superior to that of its competitors as measured by commonly accepted browser performance benchmarks?
(b) Is the start-up time of Chrome competitive with the start up times of its competitors?
(c) Does Chrome conform to web standards better than its competitors as measured by commonly accepted browser conformance benchmarks?
2. Given that the browser market is not static and web usage continues to evolve,
(a) Did Chrome introduce features earlier than its competitors?
(b) Were the features that Chrome introduced first more important than those introduced by its competitors?
To answer these questions we tested the three major browsers which together account for 91% of the market share. Thus we did not initially test Opera and Safari, whose market share is very low; Safari is also less relevant as it is tightly linked to the Mac OS X platform, so it does not compete with Chrome for most users. The question regarding the release of important features earlier also involved a wide user survey to assess the relative importance of different features. Note that the performance and conformance evaluations are not tied to one point in time, but rather they are evaluated over the whole period when Chrome achieved its rise in market share. As a result we also found interesting information about the consistency (and sometimes inconsistency) of browser benchmarks which was not anticipated in advance.

Technical Performance
In this section we present the methodology and results pertaining to answering research question (1), namely the relative performance of Chrome and the competing browsers.

Experimental Design
Timing is important to web page designers, because it affects the user experience [39]. But the precise definitions of performance metrics are complicated to pin down [34]. As a result quite a few different benchmarks have been designed and implemented over the years. Instead of proposing yet another benchmark, we used several of the more widely accepted and commonly used benchmarks to evaluate the technical performance of the different browsers, selecting a set which covers a wide range of functionalities.
These benchmarks are divided into two categories. The first category is performance, and tests the performance of different aspects of the browsers. This included general browser performance, aspects of JavaScript processing, and in particular support for the HTML5 <canvas> tag. The second category is conformance, and tests the conformance of the different browsers to common standards such as the HTML5 and CSS3 standards. Note, however, that the tests typically check only that elements of the standard are recognized, and not the quality of the implementation. This is because assessing the quality may be subjective and depend on graphical appearance. In addition we implemented our own methodology for measuring startup times, as this was not covered by the available benchmarks. The benchmarks and their response variables are listed in Table 1.
Note that the benchmarks do not include low-level issues such as memory usage. The reason is that these benchmarks are not intended to characterize the interaction between the browser and the underlying hardware platform, but rather the interaction of the browser with the user. We selected these benchmarks because market share depends on users who are more influenced by general performance and features, not by details of hardware utilization.
In order to assess the technical performance of the competing browsers during the period when Chrome gained its market leadership, we measured all the releases of these browsers using all the benchmarks. Specifically, the measurements covered all Chrome versions from 1 to 31, all Firefox versions from 3 to 26, and all Internet Explorer versions from 8 to 11, meaning all the browser versions in a five year span from mid 2008 until the end of 2013 (Figure 2). This is the period from the first release of Chrome until it achieved around 50% market share.

Execution of Measurements
The measurements were conducted on two identical Core i5 computers (Lenovo ThinkCentre M Series with i5-3470 CPUs running at 3.20 GHz) with 4GB RAM each, and Windows 7 Professional SP1 operating systems. One machine ran Windows 7 32 bit and the other ran Windows 7 64 bit. The browser versions used on the 32 bit system were Chrome 1-12, Firefox 3-5, and Internet Explorer 8-9, i.e. all the browsers released up to May 2011. The browser versions used on the 64 bit system were Chrome 13-31, Firefox 6-26, and Internet Explorer 10-11. The versions were divided between the machines since we encountered some compatibility issues with earlier versions. To make sure that the measurements are consistent between 32 bit and 64 bit operating systems and to eliminate operating system bias, we checked a third of the browser versions on both systems, focusing on versions with a six month release gap. Examples for two of the benchmarks are shown in Figure 3. SunSpider 1.0.2 is an example of a relatively big difference between the results on the two platforms (which is still quite small), and PeaceKeeper is an example of a very small difference. In general we did not see any dramatic differences between the platforms. We therefore do not present any more results about such comparisons.
On all performance benchmarks we ran 3 repetitions of each measurement, while for the start-up times measurements we ran 20 repetitions. Error bars are used in the graphs to show the standard error. In all cases, the benchmarks and tests were the only thing that ran on the test machines. The test machines had an Internet connection outside our firewall, so they were not on the local departmental network. This was done for security reasons, as our systems group refused to allow the use of old (and probably vulnerable) browsers within the firewall.
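As a minimal sketch of how the error bars could be derived from repeated runs, the following computes the mean and the standard error of the mean for one browser version. The function name and the sample scores are illustrative assumptions, not measurements from the study.

```python
import math
import statistics


def mean_and_standard_error(samples):
    """Return (mean, standard error of the mean) for repeated measurements.

    The standard error is the sample standard deviation divided by sqrt(n),
    so at least two repetitions are required.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two repetitions")
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)
    return mean, sem


# Three hypothetical repetitions of one benchmark on one browser version.
mean, sem = mean_and_standard_error([10.0, 12.0, 14.0])
```

With only 3 repetitions the standard error is itself a rough estimate, which is one reason the 20 repetitions used for start-up times give tighter error bars.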
Not all the measurements ran properly with all the versions, especially with earlier versions. The problems arose because most of the benchmarks were designed and written later than some of the early browser versions, and used some features or technology that were not yet implemented in those early versions. The details are given in Table 2.

SunSpider
SunSpider is a well known benchmark developed by WebKit, an open source web browser engine project. Its goal is to measure core JavaScript performance and enable the comparison of different browsers or successive versions of the same browser. WebKit designed this benchmark to focus on real problems that developers solve with JavaScript [14]. Therefore the benchmark does not include microbenchmarks to evaluate specific language features, but rather tasks such as generating a tag cloud from JSON input, 3D raytracing, cryptography, and code decompression. Moreover, each of the tests is executed multiple times to ensure statistical validity. However, perhaps due to these repetitions, the behavior of the benchmark may actually not mimic real JavaScript work on production sites [36].
The benchmark measures the time to perform a set of tasks, so lower values are better. In the study we chose to use version 1.0.2, which is the current version and was introduced by WebKit in order to make the tests more reliable [15,16]. However, version 1.0.2 didn't work on old browser versions (Table 2). Therefore, we used version 0.9.1 on old browser versions [17], specifically those that were tested on the 32 bit machine.
Using SunSpider 0.9.1 we find that when Chrome was introduced it scored significantly better than Internet Explorer and Firefox. In the second Firefox version tested (Firefox 3.5) the score was greatly improved but still lagged the parallel Chrome version. Although Internet Explorer 8 was released a couple of months after Chrome 1 it was five times slower. It took more than two years for Firefox and Internet Explorer to catch up with Chrome's parallel version (Figure 4a). In fact, Internet Explorer 9 not only caught up with Chrome but surpassed it. This superior performance has been attributed to its JavaScript optimization for dead code elimination, which some say was specifically done to boost SunSpider performance [18,19].
In the SunSpider 1.0.2 tests Internet Explorer continued to show significantly better results compared to its rivals. Firefox and Chrome showed similar results most of the time (Figure 4b). For some reason Chrome versions 30 and 31 had problems with this benchmark, but these were fixed in Chrome 32.

BrowserMark 2.0
BrowserMark 2.0 is a general browser benchmark developed by Rightware (Basemark), a purveyor of benchmarking and evaluation technology for the embedded systems industry. Originally designed to test mobile and embedded devices, it is nevertheless commonly used to also test desktop browsers. The benchmark tests general browser performance including aspects such as page loading, page resizing, standards conformance, and network speed, as well as WebGL, Canvas, HTML5, and CSS3/3D. The calculated score combines all of these and higher scores are better.
The early versions of Internet Explorer and Firefox did not work with this benchmark (which is understandable given that the benchmark version we used was released only in November 2012). All of the browsers tested showed a distinct improvement trend as new versions were released (Figure 5). Chrome in all of its versions was better than the equivalent rivals and showed a steady improvement over time. Internet Explorer also showed an improvement over time but always came in last among all the browsers tested. Firefox performance was between Chrome and Internet Explorer. Interestingly, it showed inconsistent behavior, with the general improvement in benchmark score mixed with local decreases in score.

CanvasMark 2013
CanvasMark 2013 is a benchmark for performance testing the HTML5 <canvas> tag [20]. This tag is a container for graphics, which are typically drawn using JavaScript. The benchmark is composed of several stress tests, using elements that are commonly used in games such as operations on bitmaps, canvas drawing, alpha blending, polygon fills, shadows, and drawing text. Each test starts with a simple scene and adds elements until the browser is reduced to a rendering rate of 30 frames-per-second (the rate decreases as the scene becomes more complex). The score is a weighted average of the time the browser managed to perform at above 30 frames-per-second. Higher scores are better.
In this benchmark's documentation there was a note for Chrome users using Windows, encouraging them to change a setting in order to get better results due to a bug in the GPU VSync option for the Windows version of Chrome. However, we did not change the setting since we want to test the versions as the average user would.
The results of running the benchmark show that Chrome exhibited inconsistent results over time (Figure 6). A great improvement was achieved from version 4 to 7 (version 4 is the first shown, because the benchmark did not run on versions 1-3). In contrast there was a sharp decline from version 10 to 12. Later, an improvement occurred from version 14 to 17, immediately followed by a sharp decline of 50% of the score in version 18. But in spite of all these inconsistencies it was still better than Firefox and Internet Explorer during this time. Internet Explorer showed an improvement from version 9 to version 10, when it became the best-performing of the three browsers, due to a deterioration in Chrome's scores. Chrome surpassed Internet Explorer again only in the last version tested. Firefox had the lowest scores, and did not show any improvement over time.

PeaceKeeper
PeaceKeeper is a browser benchmark developed by FutureMark, a purveyor of mostly hardware benchmarks for desktop and mobile platforms [21]. (Rightware, the company that developed BrowserMark, was a spinoff from FutureMark.) It includes various tests designed to measure the browser's JavaScript performance, but given that JavaScript is so widely used in dynamic web pages, it can actually be considered to be a general benchmark for browser performance. The tests include various aspects of using JavaScript on modern browsers, such as the <canvas> tag, manipulating large data sets, operations on the DOM tree (the Document Object Model, which describes the structure of a web page), and parsing text. The score reflects processing rate (operations per second or frames per second rendered), so higher is better. In addition the benchmark includes various HTML5 capability checks, such as WebGL graphics, being able to play various video formats, and multithreading support.
Chrome scored noticeably better results compared to its rivals for this benchmark, throughout the period of time that we checked (Figure 7). However, note that PeaceKeeper did not run on early versions of Chrome (Table 2). Also, while there was a general trend of improvement, it was not monotonic. Firefox and Internet Explorer scored similar results, both showing an improvement over time but still lagging behind Chrome.

Start-up Time Measurement Methodology and Results
An important feature of all browsers, which may affect user satisfaction, is their startup times when they are launched. As we did not find a suitable benchmark that evaluates startup times we conducted specialized measurements to test the browser's cold start-up times. A cold start-up time is when the browser starts for the first time since the operating system was booted.
We tested the start-up times as follows. We wrote a script that runs during the operating system start-up. This script launches the browser one minute after the script starts running. The lag is meant to let the operating system finish loading. A time stamp is created just before launching the browser in order to mark the start time. The browser was set to open with a specially crafted page when it came up. The script passed the time stamp to the crafted page via a URL parameter. The crafted page creates a second time stamp indicating the start of the page processing. The difference between the two time stamps was defined as the browser start-up time. The start-up times are then sent to a server for logging. Advantages of this procedure are, first, that it is independent of network conditions, and second, the test is similar to the user's real experience of launching the browser and loading the first page.
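The procedure above can be sketched as follows. This is only an illustration under stated assumptions: the page location, the `t0` parameter name, the browser path argument, and all function names are hypothetical and are not the scripts used in the study. In the real setup the second time stamp is taken by JavaScript in the crafted page; here it is passed in as a plain number.

```python
import subprocess
import time
from urllib.parse import parse_qs, urlencode, urlparse


def build_test_url(page_url, launch_ms):
    """Attach the launch time stamp (in milliseconds) as a URL parameter."""
    return f"{page_url}?{urlencode({'t0': launch_ms})}"


def startup_time_ms(url, page_start_ms):
    """Difference between the page's own time stamp and the launch time stamp."""
    launch_ms = int(parse_qs(urlparse(url).query)["t0"][0])
    return page_start_ms - launch_ms


def launch_after_delay(browser_path, page_url, delay_s=60):
    """Wait for the OS to settle, then time-stamp and launch the browser.

    Not called here; browser_path is a placeholder for a real executable.
    """
    time.sleep(delay_s)
    launch_ms = int(time.time() * 1000)  # time stamp just before launching
    url = build_test_url(page_url, launch_ms)
    subprocess.Popen([browser_path, url])
    return url


# Hypothetical values: launch at t=1000 ms, page starts processing at t=1850 ms.
url = build_test_url("file:///C:/startup/test.html", 1000)
elapsed = startup_time_ms(url, 1850)
```

Both time stamps come from the same machine clock, so no clock synchronization is needed, and using a local page keeps the measurement independent of network conditions.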
The first versions of Chrome were the fastest to load (Figure 8). However, as Chrome's development advanced, its start-up times crawled up. In Chrome version 7 the start-up times improved dramatically, but then continued to crawl up from version 13. In version 29 there was a spike in the start-up time, a 2.5 fold increase compared to the

HTML5 Compliance
HTML (Hyper-Text Markup Language) is the language used to describe web pages, and the current version is HTML5. HTML5 introduced features like the <canvas> tag for use by multimedia applications, and integrated SVG (Scalable Vector Graphics) and MathML (for mathematical formulas). The first working draft of HTML5 was published in 2008, and the standard was finally approved in 2014, so its definition process fully overlaps the period of Chrome's rise.
The HTML5 Compliance benchmark consists of three parts. The main part is checking the conformance of the browser to the HTML5 official specification. The second part is checking specifications related to HTML5 such as WebGL. The third part is checking the specification for experimental features that are an extension to HTML5 [22]. The score is the sum of points awarded for each feature that is supported.
The results for this benchmark show that all the browsers improve over time. Firefox had the best score until version 3.6, and after that Chrome version 4 and up had the best score (Figure 9). Internet Explorer always had the lowest score.

CSS3 Test
CSS (Cascading Style Sheets) is the language used to describe the style of HTML pages. For example, using CSS one can set the style for web page headings and make it different from the default of the browser. The current version is 3, although level-4 modules are being introduced.
CSS3 Test checks how many CSS3 elements in the W3C specification a certain browser recognizes [23]. This means CSS3 Test only checks the recognition itself but does not check the implementation or the quality of the implementation, namely whether the resulting rendition of the web page indeed looks like it should.
Interestingly, Chrome's score did not change in the first three years, though it still managed to have a better score than its rivals. From version 15 Chrome consistently improved until the last version tested, remaining better than its rivals all along (Figure 10). Firefox showed several improvements in a stepwise manner. Internet Explorer had the lowest score in the first version tested (version 9) but improved its score dramatically in versions 10 and 11, achieving essentially the same level as Firefox.

Browserscope Security
Browserscope is a community-driven project which profiles various aspects of web browsers. One of these is the obviously important feature of security. Specifically, Browserscope Security is a collection of tests meant to check "whether the browser supports JavaScript APIs that allow safe interactions between sites, and whether it follows industry best practices for blocking harmful interactions between sites" [24]. For example, one of the tests checks whether the browser has native support for JSON parsing, which is safer than using eval. The score is simply how many tests passed. While this is not strictly a conformance test, as there is no official standard, we include it due to the importance of security features on the Internet.
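The reason a dedicated JSON parser is safer than eval can be shown by analogy in Python; the Browserscope test itself concerns JavaScript's native JSON support, so this is an illustrative sketch rather than the actual test. The function name and the input strings are assumptions.

```python
import json


def parse_untrusted(text):
    """Parse text strictly as JSON data; the input is never executed as code."""
    return json.loads(text)


# Well-formed data is parsed into plain Python objects.
safe = parse_untrusted('{"user": "alice", "visits": 3}')

# Input that is really code is rejected by the parser, whereas passing the
# same string to eval() would execute it.
try:
    parse_untrusted('__import__("os").getcwd()')
    rejected = False
except json.JSONDecodeError:
    rejected = True
```

A parser accepts only the data grammar, so the worst outcome for hostile input is a parse error, while an evaluator runs whatever expression it is handed.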
The results are that all three browsers exhibited a general (although not always monotonic) improvement in their security results over time. The relative ranking according to these tests is very consistent between browser versions (Figure 11). Across practically the whole period Chrome had the highest score, Firefox had the lowest, and Internet Explorer was in between. The only exception is a large dip in score for Chrome versions 2 and 3, where version 2 was the worst of all parallel browser versions. This was surprising because of the overall consistency, and the fact that in the first version released Chrome had the highest score compared to its rivals.

Additional Results with Opera
Our main measurements focused on the three top browsers, which together control more than 90% of the desktop market. But when considering the relative importance of technical issues as opposed to marketing, we felt the need to also consider the smaller browsers. This is especially important in the early years, when Chrome's market share was low, and the question was what enabled Chrome to surge ahead while other browsers were left behind.
We therefore conducted a few additional measurements using Opera. We focused on Opera and not on Safari for two reasons. First, Opera has a reputation for being an innovative and technologically advanced browser. Second, Safari is specifically targeted for Apple platforms, and therefore is not really part of the same desktop market as the other browsers we are studying.
Not all versions of Opera were tested, as many of the benchmarks did not run properly on early versions. The results were that Opera performance was generally inferior to that of Chrome (two examples are shown in Figure 12). In some benchmarks, notably BrowserMark and Browserscope Security, its scores were actually lower than for all other browsers for many years. The sharp improvement in BrowserMark shown in Figure 12 is probably due to the move to using WebKit (and thus the same rendering engine as Chrome) in version 15 [25]; similar improvements were also seen in some other benchmarks in this version. In other benchmarks, such as HTML5 Compliance and CSS3 Test, Opera's results were similar to those of Firefox throughout. The only benchmark in which Opera was the best browser for a considerable period was CanvasMark, but this period only started in 2012 (and performance dropped in version 15).

Feature Selection and Release
Another aspect in which the browsers differ from one another is their feature sets: all obviously have the same basic features allowing users to browse the web and display web pages, but new features are added all the time as web usage continues to evolve. However, not all features have the same importance, so it is advantageous for a browser to have the most meaningful features as early as possible.
In this section we present the methodology and results pertaining to answering research question (2), namely which browsers released features early and which browsers lagged in releasing features. In addition we wanted to evaluate the importance of each feature. We used an online survey to assess the importance of each feature to the end users.

Experimental Design and Methodology
The investigation of the features embodied in each browser and their release times involved the following steps:
1. Listing of major features of modern browsers.
2. Establishing the release date of each feature by each browser.
3. Identification of features that differentiate between the browsers.
4. Conducting an online survey of web users to assess the relative importance of the different features to end users.
5. Performing a statistical analysis of the relative importance (according to the survey) of features that each browser released earlier or later than other browsers.
The following subsections provide details regarding these steps.

Table 3: Features not used in comparisons as they did not reflect differences between browsers.
#   Feature                 Explanation
Pre Chrome 1
1   Bookmark management     Allows the user to organize/delete/add bookmarks
2   Password management     Allows the browser to remember credentials to a certain web site upon the user's request
3   Search engine toolbar   Easy access to a search engine from the browser tool bar
4   Tabbed browsing         The ability to browse multiple web pages in a single browser window
5   Pop-up blocking         Blocks pop-ups that the user didn't explicitly ask for
6   Page zooming            Scale the text of a web page
7   History manager         Manages history of recent web pages that the user browsed
8   Phishing protection     Block or warn when surfing to web pages that masquerade as another (legitimate) website
9   Privacy features        Manages the user preferences regarding passwords, history, and cookie collection
10  Smart bookmarks         Bookmarks that directly give access to functions of web sites, as opposed to filling web forms at the respective web site for accessing these functions
11  Tabbing navigation      The ability to navigate between focusable elements with the tab key
Released at same time
1   Access keys             Allows you to navigate quickly through a web page via the keyboard
2   Adaptive address bar    Suggest webpages as you type an address or search keywords from your history or from a search engine
3   Full page zoom          Scales the whole page, including images and CSSs
4   Hardware acceleration   Allows the GPU to help the browser to speed up certain tasks that the GPU is more capable for
5   Incognito               Allows the user to browse the web with reduced identifiable trace (notably doesn't allow cookies)
6   Reopen closed tabs      Reopen a recently closed tab
7   Full screen             Displays the page in full screen mode

Feature Selection
We identified 43 features which in our opinion represent a modern browser. These features are listed in Tables 3 and 4. Chrome 1.0 was our starting point: 11 features included in this version, which had already been included by the competing browsers as well, were excluded from consideration, as they did not confer any competitive advantage to any browser in the context of our study. For example, this included multiple tab browsing. Seven further features were excluded because they were released at about the same time by all three browsers, so they too did not confer any competitive advantage (see Table 3 and below). Subsequently, the study was conducted based on the 25 remaining features. These features are listed in Table 4. Note that the features are listed in a random order.

Feature Release Margins
As the three browsers are developed by different organizations, the release dates of new versions are of course not coordinated. We therefore faced the challenge of defining what it means for one browser to release a feature ahead of another. We elected to use a conservative metric for this concept. We had already dated the release of each of the 25 selected features in each browser (Table 5). We then developed a metric which states whether a certain browser released a feature ahead of a competitor by "a meaningful margin" and/or whether a certain browser lagged a competitor by "a meaningful margin". A browser was awarded a "win" if it released a feature ahead of all its competitors, and a penalty or "loss" was given if a browser lagged all its competitors or did not release a certain feature at all. Note that each feature can have a maximum of one "winner" and a maximum of one "loser". If a feature had neither a "winner" nor a "loser" it was excluded from the study as no browser had a competitive advantage or disadvantage.
"A meaningful margin" was defined as more than one release cycle, that is, when it took the competitors more than one version to include the feature after it was initially introduced. For example, "personalized new tab" was introduced in Chrome 1. At the time the most recent versions of Internet Explorer and Firefox were 7 and 3, respectively. The feature was subsequently released in Internet Explorer 9 and Firefox 13, meaning that this was a meaningful margin. Had the feature been released in Internet Explorer 8 or Firefox 3.5 it would not have counted as a meaningful margin, despite being later than Chrome 1. Furthermore, Firefox lagged Internet Explorer in the release of the feature by a meaningful margin (Figure 11). So in this case Chrome received a "win" and Firefox received a "loss". All the release versions and their identification as wins or losses are shown in the results in Table 5.
Note that the definition of the release margin is based on releases of new versions, and not on absolute time. This definition gives an advantage to browsers that are released infrequently. For example, any innovations included in Chrome versions 2 to 9 (a span of nearly two years) and included in Internet Explorer 9 would not be considered to have a significant margin, because Microsoft did not release any versions during all this time. Consequently our results may be considered to be conservative.
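The "win"/"loss" rule can be sketched in a few lines of Python. This is only one possible reading of the definition, not the procedure actually used to build Table 5: the browser names `A`, `B`, `C`, the integer release days, and the helper names are all hypothetical, and a lag is counted as the number of versions the follower shipped after the leader's release, up to and including the version that contains the feature.

```python
from bisect import bisect_right

# Hypothetical release timelines: (version, day of release), sorted by day.
TIMELINES = {
    "A": [("1", 0), ("2", 10), ("3", 20)],
    "B": [("7", -5), ("8", 5), ("9", 15)],
    "C": [("3", -2), ("3.5", 4), ("4", 16), ("5", 18), ("6", 22)],
}

def catch_up_releases(timeline, lead_day, version):
    """Releases the follower shipped after lead_day, up to and including the
    one that contains the feature. None means the feature was never released."""
    if version is None:
        return None
    versions = [v for v, _ in timeline]
    days = [d for _, d in timeline]
    first_after = bisect_right(days, lead_day)  # first release strictly after lead_day
    return max(0, versions.index(version) - first_after + 1)

def leads(timelines, feature, leader, follower):
    """True if leader is ahead of follower by more than one release cycle."""
    lead_version = feature[leader]
    if lead_version is None:
        return False
    lead_day = dict(timelines[leader])[lead_version]
    lag = catch_up_releases(timelines[follower], lead_day, feature[follower])
    return lag is None or lag > 1

def classify(timelines, feature):
    """Return (winner, loser); either may be None. feature maps browser -> version."""
    winner = loser = None
    for browser in feature:
        others = [o for o in feature if o != browser]
        if all(leads(timelines, feature, browser, o) for o in others):
            winner = browser
        if all(leads(timelines, feature, o, browser) for o in others):
            loser = browser
    return winner, loser
```

With the toy timelines above, `classify(TIMELINES, {"A": "1", "B": "9", "C": "6"})` mirrors the "personalized new tab" pattern: the first browser wins and the last one loses, whereas a follower that catches up in its very next version produces neither.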

Feature Importance Survey
To assess the relative importance of the 25 different features, we created an online survey that lists and explains these features. Survey participants were asked to evaluate the importance of each feature relative to other listed features on a discrete scale of 1 (least important) through 5 (most important). The features were listed in the same random order as in Table 4.
The intended audience was people who spend many hours a day on the World Wide Web. The survey was published on Reddit (sub-reddit /r/SampleSize) [26] and on CS Facebook groups of the Hebrew University and Tel-Aviv University in Israel. A total of 254 people answered the survey, and the distribution of results is shown in Table 5. The statistical analysis was performed on all of the participants.

Statistical Analysis Procedure
Opinion surveys like the one we conducted are commonly analyzed by calculating the average score received by each entry, and considering these to be the averages of samples of different sizes and unknown variance. Then a test such as Welch's t-test is used to check whether or not these averages are significantly different. However, such an approach suffers from a threat to construct validity, because averaging implicitly assumes that the scale is a proper interval scale, meaning that the difference between 1 and 2 is the same as between 2 and 3, 3 and 4, and 4 and 5. But given that these numbers represent subjective levels of importance, this is not necessarily the case. Moreover, different people may use the scale differently. Therefore both the averaging and the statistical test are compromised.
Another problem with human users is that some of them are hard to please, and always use only the bottom part of the scale, while others are easy to please, and focus on the top part of the scale. To reduce this danger our survey participation instructions included the following: "For every feature please choose how important this particular feature is compared to other features in the survey. Please try to use the full scale from 'least important' to 'most important' for the different features. You can change your marks as often as you wish before submitting." And indeed, checking our data we found that most respondents actually used the full scale from 1 to 5, with an average near 3. These findings imply that we do not need to perform adjustments to the data to compensate for potentially different behaviors [38].
Nevertheless, comparing average scores is still not justifiable. We therefore use an analysis method due to Yakir and Gilula, where brand A is judged to be superior to brand B if the distribution of opinions about A dominates the distribution of opinions about B in the stochastic order sense [42]. Note that in our case the "brands" are not Microsoft, Google, and Mozilla, but rather the sets of features which represent the "wins" or "losses" of each browser. This will be clear in the results of Subsection 5.3.
Mathematically, stochastic order is expressed as $\forall s : F_A(s) \le F_B(s)$, where $F_A$ and $F_B$ are the cumulative distribution functions of the opinions regarding A and B, respectively. Graphically, the plot of $F_A$ is lower and to the right of the plot of $F_B$, and it accumulates more slowly. In simple terms this means that for each level of opinion 1 to 5 the probability that A receives a score of at least this level is higher than the probability that B receives such a score. However, in many cases one distribution does not dominate the other (and their graphs cross each other). It is then necessary to adjust the data by grouping brands and/or score levels together until dominance is achieved [30,32,31,37].
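As an illustration, the dominance condition can be checked directly from the response counts per score level. The following is a minimal sketch; the function names and the example counts are hypothetical and are not taken from the survey data.

```python
from itertools import accumulate

def cdf(counts):
    """Empirical cumulative distribution over score levels 1..J.
    counts[i] is the number of responses with score i + 1."""
    total = sum(counts)
    return [c / total for c in accumulate(counts)]

def dominates(counts_a, counts_b, tol=1e-12):
    """True if A dominates B in the stochastic order sense,
    i.e. F_A(s) <= F_B(s) at every score level s."""
    return all(fa <= fb + tol for fa, fb in zip(cdf(counts_a), cdf(counts_b)))

# Hypothetical response counts for scores 1..5.
skewed_high = [1, 2, 3, 4, 5]
skewed_low = [5, 4, 3, 2, 1]
polarized = [5, 0, 0, 0, 5]
middling = [0, 5, 5, 0, 0]
```

Here `dominates(skewed_high, skewed_low)` holds, while `polarized` and `middling` have crossing distributions, so neither dominates the other and some grouping of brands or score levels would be needed.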
In more detail, the analysis procedure is as follows [42]:
1. Identify subsets of homogeneous brands. Ideally, for all pairs of brands, the distribution of scores of one brand will dominate the distribution of scores of the other brand. This will induce a full order on the brands. But in reality there may be certain subsets of brands that are incomparable, and do not dominate each other. These subsets need to be identified, and the ranking will then be between subsets instead of between individual brands.
The subsets are found by an agglomerative single-link clustering algorithm. Initially all pairs of individual brands are compared, and the chi-squared statistics computed. If the minimum statistic value obtained is below a predefined critical value, the two distributions are considered the same and the brands are combined into a joint subset. In subsequent steps, when subsets are being considered, the maximal statistic among all pairs (where one brand comes from the first subset and the other brand from the second subset) is compared to the critical value.
The suggested critical value is the upper α percentile from the chi-square distribution with J − 1 degrees of freedom, where J is the number of score levels in the distribution (in our case, 5) and α = 0.1 (or another value chosen by the analyst).
A chi-square-based test is then applied to test whether the obtained partitioning is significant, as described in [42]. The result of this step is then one or more subsets of brands, which are heterogeneous relative to each other, but the brands within each subset are homogeneous.
2. Find the widest collapsed scale. Even when brands (or subsets of brands) have heterogeneous distributions of scores, the distributions may not dominate each other in the stochastic order sense. This happens if the distributions cross each other. However, it is always possible to create a dominance relationship by collapsing adjacent scores and thereby reducing the fidelity of the distributions.
The problem is that collapsing can be done in many different ways, and the selected collapsing may affect the resulting dominance order. We therefore need to define which collapsing is better. The suggested approach is to strive for minimal loss of information, in the sense of preserving as many of the original scores as possible. Hence we are looking for the widest collapsed scale that nevertheless leads to dominance.
Technically, the procedure is as follows. Given all the subsets of brands, we consider all possible orders of these subsets. For each such order we find the collapsing that leads to dominance in this order (if such a collapsing exists). The order that is supported by the widest collapsing is then selected. This implies using the collapsing which retains the highest number of distinct scores.
3. Note the stochastic order between the subsets of brands. At this point a well-defined stochastic order is guaranteed to exist. This order is the result of the analysis.
4. Verify statistical significance. Collapsing score levels leads to loss of information relative to the original data. A chi-square-based test is used to demonstrate that the loss is not significant, and therefore the results will still reflect the original data. For details see [42].
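Step 1 of the procedure can be sketched as follows. This is a simplified illustration and not the implementation of [42]: the pairwise statistic is assumed to be the standard chi-squared homogeneity statistic for a 2 x J table of counts, the follow-up significance test of the partitioning is omitted, and the brand names and counts are hypothetical. The default critical value 7.779 is the tabulated upper 10% point of the chi-square distribution with J - 1 = 4 degrees of freedom.

```python
from itertools import combinations

CRITICAL_DF4_ALPHA10 = 7.779  # upper 10% point, chi-square with 4 degrees of freedom

def chi2_stat(a, b):
    """Chi-squared homogeneity statistic for two rows of counts."""
    n_a, n_b = sum(a), sum(b)
    n = n_a + n_b
    stat = 0.0
    for x, y in zip(a, b):
        col = x + y
        if col == 0:
            continue  # an empty score level contributes nothing
        exp_a, exp_b = n_a * col / n, n_b * col / n
        stat += (x - exp_a) ** 2 / exp_a + (y - exp_b) ** 2 / exp_b
    return stat

def cluster_brands(counts, critical=CRITICAL_DF4_ALPHA10):
    """Agglomerative clustering: repeatedly merge the two subsets whose
    largest cross-pair statistic is smallest, while it is below critical."""
    clusters = [[name] for name in counts]

    def linkage(pair):
        left, right = pair
        return max(chi2_stat(counts[x], counts[y]) for x in left for y in right)

    while len(clusters) > 1:
        best = min(combinations(clusters, 2), key=linkage)
        if linkage(best) >= critical:
            break
        clusters = [c for c in clusters if c is not best[0] and c is not best[1]]
        clusters.append(best[0] + best[1])
    return clusters

# Hypothetical score counts (levels 1..5) for three brands.
EXAMPLE = {
    "X": [10, 20, 30, 20, 10],
    "Y": [11, 19, 30, 21, 9],
    "Z": [40, 30, 10, 5, 5],
}
```

In this toy input `X` and `Y` have nearly identical distributions and end up in one subset, while `Z` remains on its own.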
In our case the brands are the features of the browsers. But we don't really care about ranking the individual features. Rather, we want to rank sets of features. For example, we can take the set of features that were Chrome "wins", and compare it to the set of features that were Chrome "losses". If the first set turns out to be more important to users, then this testifies that Chrome project managers chose wisely and invested their resources in prioritizing the more important features first.
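For two or more such feature sets, the search for the widest collapsed scale (step 2) can be illustrated by brute force, since a 5-level scale has only 15 ways of merging adjacent levels into at least two groups. The sketch below is an assumption-laden illustration rather than the algorithm of [42]: the set names and counts are hypothetical, ties between collapsings of equal width are broken arbitrarily, and the significance test of step 4 is not included.

```python
from itertools import accumulate, combinations, permutations

def collapse(counts, cuts):
    """Merge adjacent score levels; cuts are the indices where a new group starts."""
    bounds = [0, *cuts, len(counts)]
    return [sum(counts[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]

def dominates(a, b):
    """F_a(s) <= F_b(s) for all s, compared by cross-multiplying cumulative counts."""
    total_a, total_b = sum(a), sum(b)
    return all(ca * total_b <= cb * total_a
               for ca, cb in zip(accumulate(a), accumulate(b)))

def widest_collapsing(ordered):
    """ordered: count vectors, highest-ranked first. Returns the cuts that keep
    the most groups while each vector dominates the next, or None."""
    levels = len(ordered[0])
    for n_cuts in range(levels - 1, 0, -1):  # try the widest scales first
        for cuts in combinations(range(1, levels), n_cuts):
            rows = [collapse(c, cuts) for c in ordered]
            if all(dominates(x, y) for x, y in zip(rows, rows[1:])):
                return cuts
    return None

def best_order(sets):
    """Try every ranking of the sets; keep the one supported by the widest scale."""
    best = None
    for order in permutations(sets):
        cuts = widest_collapsing([sets[name] for name in order])
        if cuts is not None and (best is None or len(cuts) > len(best[1])):
            best = (order, cuts)
    return best

# Hypothetical counts for scores 1..5; the two distributions cross at level 3.
SETS = {"set_a": [10, 10, 40, 10, 30], "set_b": [20, 20, 10, 40, 10]}
```

For this input, ranking `set_a` first is supported by a four-level scale in which levels 3 and 4 are merged (cuts `(1, 2, 4)`), whereas the opposite ranking only holds on a two-level scale, so the first ranking is selected.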
To perform these calculations we used the Insight for R v0.4 software package, which implements this approach. Given the adjusted (collapsed) data, we also calculate the polarity index. The polarity index is the ratio of users who considered features important (levels 4 and 5) to the rest (levels 1 to 3). A polarity index less than 1 indicates that the balance is skewed towards not important, while a polarity index higher than 1 indicates that user opinion is skewed towards most important. Unlike average scores, the polarity index has a direct quantitative meaning and therefore the indexes of different brands can be compared to each other.
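The polarity index and its quantitative reading can be made concrete with a short sketch. The function names and the example counts are hypothetical; the conversion simply follows from the definition, since a ratio p of "important" to "rest" corresponds to a share p / (1 + p) of all responses.

```python
def polarity_index(counts):
    """Ratio of 'important' responses (scores 4 and 5) to the rest (scores 1 to 3).
    counts[i] is the number of responses with score i + 1 on the 1..5 scale."""
    important = counts[3] + counts[4]
    rest = counts[0] + counts[1] + counts[2]
    return important / rest

def share_important(polarity):
    """Fraction of all responses that rated the features as important."""
    return polarity / (1 + polarity)

# Hypothetical response counts for scores 1..5.
example_counts = [10, 20, 30, 25, 15]
example_polarity = polarity_index(example_counts)  # 40 / 60
```

A polarity index of 0.67 thus corresponds to roughly 40% of the responses being "important", and an index of 0.40 to roughly 29%.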

Early Release of Features
In order to analyze which browser released features earlier than its competitors we identified the "wins" and "losses" of each browser, as indicated in Table 5. Our results show that Chrome received a "win" in 6 features and Firefox in 5 features. In contrast, Internet Explorer did not receive any "wins", and 14 features did not have a "winner". Chrome received a "loss" in 5 features, Firefox in 6 features, and Internet Explorer in 13 features. Here only one feature was not ascribed as a "loss" to any of the browsers (the "web translation" feature).
These results already show that Chrome tended to release new features ahead of the other browsers, with Firefox being a very close second. Internet Explorer lagged far behind both of them, as it did not release any feature ahead of the competition and it was the last to release half of the features in the study.

Importance Comparisons
Mere counting of "wins" and "losses" as done above does not indicate whether the features released early by Chrome were indeed the more important ones. We therefore conducted an analysis of importance by comparing the distributions of importance scores given to the sets of features that were "wins" and "losses". Specifically, we performed an analysis of the "wins" of different browsers, an analysis of their "losses", and a specific analysis of the "wins" versus the "losses" of Chrome.
Wins. The results of comparing the user opinions regarding the feature sets where each browser "won" are shown in Table 6. A stochastic order of the response levels was present without any adjustments, with Chrome ranked first and Firefox second. Since Internet Explorer did not have any "wins" it was ranked last. The Polarity Index values of Chrome and Firefox were 0.67 and 0.40, respectively. While both are smaller than 1, the features in which Chrome received a "win" were still more important to the end user, since the Polarity Index was higher. The direct quantitative meaning is that for Chrome, users considered the "winning" features to be important about 2/5 of the time, whereas for Firefox they considered them to be important only about 2/7 of the time.
Losses. Given limited resources the developers of a browser cannot do everything at once, so the implementation of select features must be delayed. Under such circumstances it is best to delay those features that will not be missed by many users, namely those that are considered less important. Therefore, a lower ranking and a lower polarity index are favorable when comparing feature sets which are "losses". The "loss" score distributions of Firefox and Internet Explorer showed the same trends and could not be distinguished one from the other, so they were clustered together. In order to achieve dominance the ranking algorithm collapsed importance score levels 3 and 4 (Table 7). The result after these adjustments was that Firefox and Internet Explorer were ranked on top and Chrome was ranked lower. This means that the features in which Firefox and Internet Explorer received a "loss" were more important to the end users. However, it should be noted that the differences in the distributions were actually very small, so this difference is most probably meaningless. The Polarity Index could not be calculated in the regular way due to the unification of levels 3 and 4. The results given in the table are therefore the ratio of levels 3 to 5 to levels 1 and 2, making them higher than in other comparisons. They are close to each other, but still Chrome is a bit lower, which is better in this case.
Table 7: Comparison between Chrome and Firefox/Internet Explorer "losses". Note that in this case being ranked lower is better. The polarity index is calculated differently than in other cases because scores 3 and 4 were collapsed.
Chrome Wins and Losses. Finally, we compared the features that Chrome "won" with those that it "lost". In order to achieve a stochastic order the algorithm collapsed levels 1 and 2 together. Interestingly the "losses" won, meaning that they were considered more important (Table 8). The Polarity Index values of the "wins" and the "losses" were 0.67 and 0.96, respectively, meaning the features which Chrome released ahead of its rivals were considered important to the users about 40% of the time, whereas those in which it lagged behind were considered important nearly 50% of the time. Thus the prioritization used in developing Chrome was better than that of its rivals (as shown in the two previous analyses), but it was far from perfect.

Summary of Results
We tested the performance of the three dominant browsers, Chrome, Firefox, and Internet Explorer, and to a lesser degree also the Opera browser, using a wide set of commonly used benchmarks and across a long period of time. The results, presented in Subsection 4.3 through Subsection 4.6 and summarized in Table 9, show that Chrome generally had an advantage over its competitors, both in terms of performance and in terms of conformance with standards.

Table 9: Summary of benchmark results.
Browserscope Security | Chrome is better, Firefox worst
More specifically, Chrome achieved better results throughout in five of the tests: BrowserMark 2.0, PeaceKeeper, HTML5 Compliance, CSS3 Test, and Browserscope Security. Firefox achieved better results only in the start-up times test, and that only towards the end of the study period. Interestingly, the start-up time results may indicate that Chrome suffers from feature creep, which lengthens its start-up times. Internet Explorer achieved better results only in SunSpider, in the second half of the study period. Moreover, Chrome was not worse than both competing browsers in any of the benchmarks, while Firefox and Internet Explorer were each the worst browser in two cases.
In addition we compared the release dates and importance of 25 specific features, as described in Subsection 5.3. Eleven features had a "winner", meaning that they were released by one browser ahead of the others by a meaningful margin. All but one also had a "loser", that is a browser that lagged behind by a meaningful margin. The relatively low fraction of features that had a "winner" (and the fact that 7 features were excluded from the study because they had neither a "winner" nor a "loser") indicates that the development of each browser is not isolated from its rivals. As a result, some features are released at about the same time by two or even all three browsers. On the other hand, some browsers still managed to release a fair number of innovative features: Chrome and Firefox received 6 and 5 "wins", respectively. Internet Explorer on the other hand did not receive any "wins" and had the most "losses", 13. Chrome and Firefox had 5 and 6 "losses", respectively.
Although Chrome and Firefox received similar numbers of "wins" the feature importance survey showed that features in which Chrome "won" were more important to the users than features in which Firefox "won". Likewise, features in which Chrome "lost" were less important to users than the features in which Firefox and Internet Explorer had "lost", but in the case of losses the difference was marginal. Interestingly, Chrome "losses" were actually more important to users than its "wins".
Ideally a browser should release the most important features to users first, and in case it has to lag in the release of certain features they should be of less importance to users. The results indicate that Chrome project managers were somewhat better at releasing important features first than the project managers of competing browsers. This means that they generally made better choices than their rivals. However, they did not manage to focus on only the important features, and when they lagged in feature release, these features were sometimes actually more important to users.

Implications for Software Development
While not the focus of our study, our results can be used to glean some insights into basic questions in large-scale software development. This is based on the fact that the three main browsers were developed in rather different ways. However, this is somewhat speculative, and additional work is needed. One major question is the comparison of open source and proprietary software development. Our results regarding Firefox and Internet Explorer provide some evidence for the potential superiority of large-scale open-source projects. Up to 2009 Firefox was quickly gaining market share at the expense of Internet Explorer, and our benchmark results indicate that it appears to have had superior performance for most of them (this conclusion is restricted, however, by the fact that we did not measure Internet Explorer 6 and 7 and the early versions of Firefox). It also appears to have been more innovative, as reflected by having some "wins" in early introduction of new features, and far fewer "losses" than Internet Explorer. This is an important result, as it demonstrates that a large open-source project can in fact prioritize features better than a competing product developed in-house by a leading software firm. Of course this does not imply that this is always the case, but it provides an important case study as an instance.
However, in later years Chrome came to overshadow Firefox. To the degree that Chrome is an in-house product this implies that large company projects can also be better than open-source ones. The conclusion would then be that the main factor is not the project management style but rather the companies involved, in this case Microsoft as opposed to Google. But such a conclusion is tainted by the fact that Chrome is closely related to the open-source Chromium project. So maybe the most important factor is the various project managers and contributors. This calls for further investigation as noted in the future work section below.
Another sometimes contentious aspect of software development is the use of agile methodologies with a rapid release cycle as opposed to heavier plan-based methodologies with large-scale infrequent releases. Tabulating the browser version release dates indicates that Chrome and Firefox transitioned to rapid development methods, releasing a new version every 4-8 weeks (Figure 2). This meant that there were more releases and each release contained fewer new features, leading to more focus in the work on each new release. At the same time, with rapid releases the development teams could respond more quickly to their competitors' released features which they considered to be important, and also respond quickly to user feedback and requests. Microsoft retained the traditional slow release cycle for Internet Explorer, releasing only 4 versions during the 5 years of the study, compared with 31 released versions of Chrome. This may have contributed to Internet Explorer's downfall.

Threats to Validity
Various threats to validity have been mentioned in previous sections. Here we note them together and expand on them.
The first threat relates to the assessment that Chrome is the dominant browser, as shown in Figure 1. First, as noted, this data comes from StatCounter, and other counting services may reach different conclusions. Second, we focused on desktop systems, and the picture may be different on other platforms such as mobile. To address these concerns we checked other counting services and platforms, and found that most of them indicate that Chrome is of growing importance and often dominant. The most prominent dissenter is netmarketshare.com, which claims that Internet Explorer is still the dominant browser worldwide by a large margin (58% for Explorer vs. 23% for Chrome in January 2015) [13]. The difference is probably due to a much smaller sample (40 thousand web sites as opposed to 3 million for StatCounter) and differences in methodology, including an attempt to count unique users per day and to weight countries by their total traffic. We believe that the StatCounter data is more reliable, and specifically prefer to count activity and not users. The issue of the mobile market is mentioned below in the future work section.
Another threat to the validity of the work reported so far is its focus on purely technical aspects of browsers. We did not check the marketing aspect of the browsers, hence, we cannot separate the technical superiority from the brand name. For example, according to [27] an important aspect of Chrome's rise was "the great promotional efforts produced by Google" in the shape of promotional videos released on the web. Examples of 7 such videos are given, including the "Chrome speed tests" video released in May 2010 that went viral; at this time Chrome was just beginning its rise in market share, and the video may have contributed to its momentum. All 7 videos were released by April 2012, when Chrome had already overtaken Firefox but was still second to Internet Explorer. Additional work from a marketing perspective is needed to alleviate this concern.
A third threat is that we compared Chrome's performance with only two main rivals (Internet Explorer and Firefox) and partially also with a third (Opera). There are many other browsers as well, and maybe Chrome is not better than all of them. The focus on this set of contenders is justified by the fact that together with Chrome they account for well over 90% of the desktop browsers market. However, if any of the smaller browsers is indeed superior to Chrome and the others, this would testify to the importance of branding and marketing relative to technical considerations.
Using existing benchmarks is also a threat to validity, especially since their documentation is sometimes short on details of exactly what they measure and how. However, it should be noted that benchmarking browsers (and other system types for that matter) is not trivial. Therefore we preferred to rely on prominent benchmarks that have established themselves over the years instead of trying to devise our own, and risk threats to validity that result from our inexperience in such benchmarking. That being said, it should be noted that the benchmarks do not test all possible aspects of browser technology. For example, it is possible to conduct a more detailed study of compatibility issues [29] to try and quantify the problems that may occur with each browser.
Finally, a possible threat to validity concerning the introduction of new features is that such features could be introduced in plugins before being integrated into the browser core. This would cause the dates of the releases which first included these features to be misleading. However, we do not consider this to be a serious threat as even the most popular plugins are used by only a small fraction of users.

Future Work
A drawback of the current work is its focus on the desktop market. Obviously examining competing browsers in the mobile market would also be interesting. Using StatCounter.com data, it turns out that Chrome is now also the leading browser on mobile platforms [11]. However, its rise started much later, and accelerated considerably only in 2014, eventually surpassing both the Android and Safari browsers. It would be interesting to repeat our measurements with multiple releases of these browsers, and perhaps also with UC Browser, which is the fourth-ranked browser and also seems to be gaining market share, especially in emerging markets.
Another potentially interesting line of study is to try and compare the relative importance of technical considerations, marketing campaigns and practices, and brand name. It is widely accepted that Internet Explorer gained its market dominance by being bundled with the Windows operating system, and it is reasonable to assume that the strength of UC Browser in emerging markets is related to the strength of the company which developed it, Chinese mobile Internet company UC Mobile. Chrome most probably benefited from the Google brand name and from Google's marketing campaign. But how to separate these effects remains an open question.
In the late 1990s the browser war between Internet Explorer and Mozilla (later Firefox) was portrayed in colors of a race between proprietary software and open source software. Chrome is a unique combination of both. It was initially developed within Google, but then it was largely turned into an open source project. An open question is whether it was really turned over to the open source community, or remains largely under Google control, both in terms of code contributions and in terms of management. Thus an interesting direction for further work is to dissect the sources of advances made in Chrome (or rather Chromium), and to see how many of them can be attributed to developers outside Google.

Conclusions
We tested the technical performance of the three major browsers (Chrome, Firefox, and Internet Explorer) and compared the release times of 25 features. Overall it seems that all three browsers became better over time, as most of the benchmarks that were examined showed a clear improvement trend, and all the browsers evolved and received better results. It is also apparent that the release rate of versions became more frequent over the years (especially for Chrome and Firefox).
In conclusion, the cumulative evidence we have collected indicates that Chrome's rise to dominance is indeed consistent with technical superiority over its rivals and with insightful management of feature selection. However, we still cannot say that it is the result of technical superiority alone, as marketing and the Google brand probably also played an important role. Studying the marketing campaign may well be a worthwhile effort.