Journal of Informetrics

This study estimates the development of hybrid open access (OA), i.e. articles published openly on the web within subscription-access journals. Included in the study are the five largest publishers of scholarly journals: Elsevier, Springer, Wiley-Blackwell, Taylor & Francis, and Sage. Since no central indexing or standardized metadata exists for identifying hybrid OA, an explorative bottom-up methodological approach was developed. The individual search and filtering features of each publisher website and a-priori availability of data were leveraged to the extent possible. The results indicate strong sustained growth in the volume of articles published as hybrid OA from 2007 (666 articles) to 2013 (13 994 articles). The share of hybrid articles was 3.8% of total published articles for the period 2011–2013 for journals with at least one identified hybrid OA article. Journals within the Scopus discipline categorization of Health and Life Sciences, in particular the field of Medicine, were found to be among the most frequent publishers of hybrid OA content. The study surfaces the many methodological challenges involved in obtaining metrics regarding hybrid OA, a growing business for journal publishers as science policy pressures


Introduction
Open Access (OA) as a phenomenon has existed since the earliest days of the internet, although the term itself was formally established around the time when the Budapest Open Access Initiative was signed in 2002 (BOAI, 2002). Suber's (2012:4) definition of OA conveys the essence of most official definitions: "Open access (OA) literature is digital, online, free of charge, and free of most copyright and licensing restrictions." Essentially, OA takes one of two forms: either the journal publisher makes the article available directly (gold OA), or manuscript versions are uploaded to the web by authors (green OA), which can act as substitutes for readers lacking access rights to a subscription journal. OA is also relevant for other forms of scientific reporting, such as books and data sets, but the boundary conditions are different for these and they are outside the focus of this study.
Since the early 1990s full OA journals have been launched in increasing numbers, using a number of alternative business models to secure the finances or resources needed to operate them (Laakso & Björk, 2012). For the year 2014 the number of full OA journals exceeded 9500, collectively publishing more than 482000 articles during the year (Crawford, 2015). Since the majority of scholarly journals and articles are still only available for subscribers an increasing number of institutional and subject repositories have also been created, in which authors can upload and preserve manuscript versions of their articles (Björk, Laakso, Welling & Paetau 2014).
A solution in-between full OA journals and subscription journals is being offered by most of the leading publishers. In so-called hybrid journals, authors can free their individual articles for anybody to read by making an optional payment to the publisher, while the rest of the journal's content remains reserved for subscribers only (Weber 2009). Information about such options is marketed to authors, especially at the stage when a manuscript has been accepted for publication. The common benefit enabled by payment is that the authors usually retain full copyright of the final published article and the article is labeled with a Creative Commons license, which explicitly outlines what readers can do with the article. Choosing the hybrid option is for many authors an easy way of complying with OA mandates set by funders and universities, policies which are increasingly common (Swan, Gargouri, Hunt, & Harnad, 2015). Hybrid OA has also been discussed as a transition mechanism for journals eventually converting to full OA publishing, whereby a journal can gradually shift over to full OA as uptake grows (Prosser, 2003).
Initial discussions and experiments around hybrid OA started as early as 1998-1999 (Walker, 1998), but the concept was tested on a wider scale with Springer's "Open Choice" programme launched in 2004 (Springer, 2004). Springer set the pricing at 3000 USD per article (Velterop, 2007), which has since become more or less a de facto industry standard. Hybrid OA has shown signs of escalating rapidly based on reported publication outlays of universities and research funders (e.g. Björk & Solomon, 2014; Pinfield, Salter, & Bath, 2015). Research funders, in particular in the UK, have signaled a readiness to remunerate the charges to authors and their universities. There has been an ongoing debate about the possible consequences of a potential rapid uptake of the hybrid OA option for the overall publishing and subscription costs of research-intensive universities (Finch, 2012). Recent experiences of funders like the Wellcome Trust (Björk & Solomon, 2014) and the Austrian Science Fund (Reckling & Kenzian, 2015) show that the majority of their earmarked APC funding has gone to paying the charges of hybrid journals rather than the charges of full OA journals, which are comparatively less expensive to publish in.
One of the problems in the on-going debate about hybrid OA has been the lack of exact information on uptake of hybrid OA. It is difficult to distinguish, and in particular make exact counts of hybrid OA articles. Publishers have widely differing ways of tagging hybrid OA articles in their tables of content, and there is so far no uniform universally adopted standard. The few studies conducted so far have had to rely on partial data made available by individual publishers, or partial sampling of a wider population of publications. Their methodologies and results are summarized in the following section.

Earlier studies
Hardly any bibliometric studies have been conducted concerning the prevalence of hybrid OA alone, but different aspects of hybrid OA have been partially covered as part of broader and coarser studies.
In the EU-funded SOAP study report one chapter is devoted to hybrid OA (Dallmeier-Tiessen et al., 2010). The project team looked at the overall hybrid OA offering (number of journals) from 12 leading publishers. The actual number of articles published was determined by asking the publishers to supply figures. In 2009 the number of journals was 1991, representing around 25% of the journal portfolios of the publishers in question, and the number of articles found was 4582.
In a study based on aggregating information made available by publishers and searches in PubMed Central, Björk (2012) identified 4381 journals and estimated the number of published articles to be 12089 for 2011. In a later study, Björk & Solomon (2014) found 8003 hybrid journals in 2013, almost double the amount reported for 2011 in Björk (2012).
Other studies have indirectly incorporated hybrid articles in the overall numbers of OA articles, but usually without trying to distinguish them from free manuscript versions (green OA) or promotionally free articles (Archambault et al., 2014;Gargouri, Lariviere, Gingras, Carr, & Harnad, 2012). Such studies usually start from a sample of scholarly articles from indexing services like Web of Science or Scopus and then try to determine if the full text is available freely. OA articles in full OA journals can be rather easily distinguished due to the indexing of OA journals in the Directory of Open Access Journals (DOAJ), but usually all other hits (whether in delayed OA journals, hybrid journals, promotionally free issues or any variant of green OA) are bundled as one category, unless they are classified by manual inspection.
Mueller-Langer and Watt (2014) conducted a study on the citation effect of hybrid OA among articles published in 15 economics journals offering a hybrid OA option. The authors included 14 journals from Springer and 1 from Oxford University Press, with a total of 1329 articles published from December 2006 to December 2011. Based on manual identification 208 articles were found to be available as hybrid OA. Hybrid OA shares of articles published in the 15 journals ranged from 3.02% to 18.06%, with a total hybrid OA share of 6.5% for all included articles. The authors note that the uptake figures for hybrid OA among the journals included in the sample were influenced by the pilot hybrid OA agreements that e.g. the Dutch Consortium of University Libraries, University of California, Max Planck Society (MPG), University of Goettingen, and the University of Hong Kong had made with publishers. Such agreements commonly enable all affiliated authors from the organizations to get their articles published as hybrid OA without paying individual fees. Appendix A of Mueller-Langer and Watt (2014) contains the identified institutions and agreement time periods, which are likely useful for interpreting data on the growth of hybrid OA. The study concludes that, controlling for institution quality and citations to RePEc OA pre-prints, there was no significant relationship between the hybrid OA status of articles and the citations received by said articles.
The theoretical potential and realized uptake of various OA publishing models was recently studied by Jubb et al. (2015), with a particular focus on research output by UK-affiliated authors but also providing comparative metrics on uptake levels globally. The part of the broad study most relevant to this article was based on disciplinary stratified random sampling (divided as per the four main UK Research Excellence Framework panels) of Scopus-indexed articles. The study utilized manual data collection of observations for 9400 articles with UK-affiliated authors, and 5100 articles from authors globally. The study estimated that 52.3% of all articles globally for 2014 could have been published as hybrid OA, while the actual uptake for 2014 was estimated at 2.4% of such articles. For the articles with UK-affiliated authors the hybrid OA option was estimated to have been available for 67.4% of all articles, and the realized hybrid OA uptake for UK-affiliated authors was estimated at 6.5%. Strong growth in uptake could be observed when comparing to a previous sampling of hybrid OA for articles published in 2012 (78.9% relative increase globally, and 61.7% relative increase for the UK). The study provides strong evidence for the globally accelerating uptake of hybrid OA, and that certain countries like the UK have a higher proportional share of articles output as hybrid OA than seen globally, which is likely influenced by science policy and research funding supporting the option.
Sotudeh, Ghasempour, and Yaghtin (2015) studied the relationship between OA and citations by analyzing OA articles published by Elsevier and Springer during the years 2007-2011. The authors used the publisher websites to collect data about articles published OA in both subscription journals and full OA journals. A total of 15656 hybrid OA articles was reported for 2007-2011, of which over 90% were articles published by Springer and only a small portion by Elsevier. The study provides an exclusively article-level and discipline-level analysis, omitting the measurement level of the journals within which the articles are published. As such no insight is provided on the share of papers being published as hybrid OA within journals, nor were the years separated so as to provide annual figures for OA articles.
The lack of an index for OA articles published in hybrid journals can be assumed to be a major obstacle for why no extensive earlier studies have been conducted. No study so far has been based around both journal and article-level metrics, going from observations of individual articles to aggregate figures for the journal level of analysis. So far studies have been individual snapshots of specific points in time put together by disparate bits and pieces of data to paint partial pictures when it comes both to scope and chronology. There has been no longitudinal study based on a standardized bibliometric data collection methodology to study the uptake of hybrid OA publishing.

Aim & methods
The aim of the study was to produce an article-level measurement of the uptake of hybrid OA over the years 2007-2013, as well as to study the relationship between the uptake level and other factors such as subject field and impact of the journals in question.
Study of full OA journals is commonly facilitated by data retrievable from the Directory of Open Access Journals and study of OA uptake in general can be attempted by conducting a search for instance using Web of Science or Scopus indexed articles as a basis (see for instance Archambault et al., 2014). No such index exists for hybrid journals and searches are made difficult because many subscription journals in addition to hybrid OA articles also include other articles made open for promotional purposes, often on a temporary basis using a moving wall technique. Alternatively, publishers might incorporate a delayed OA policy where all articles become freely available after a set timeframe from publishing.
The study is informed by earlier studies of the number of hybrid journals offered by leading publishers (Björk, 2012; Björk and Solomon, 2014; Dallmeier-Tiessen et al., 2010; Sotudeh et al., 2015). Even though there are likely hundreds of publishers offering a hybrid OA option, the vast majority of individual journals can be assumed to be published by the leading commercial publishers. Hence the scope of the study was restricted to them.
Research question 1: What has the longitudinal uptake for hybrid OA been for the five largest scholarly journal publishers (Elsevier, Springer, Wiley-Blackwell, Taylor & Francis, Sage) during the time period of 2007-2013?
Research question 2: How have hybrid OA articles been distributed across scientific disciplines and journal impact metrics?
Research question 3: What is the relationship between hybrid OA uptake for a journal, its subject field, and journal impact?
Journal impact will be operationalized by comparing the field-normalized, citation-based Source Normalized Impact per Paper (SNIP) value calculated annually by Scopus (Journalmetrics, 2016).

Article data
The foundation of the study is a dataset concerning hybrid OA articles and metadata collected from the open web. Data collection could not be conducted with a fully automated nor uniformly identical methodology for all five publishers since labeling concerning which articles have been published OA in a subscription journal is not standardized across publishers.
One central principle of the study was to leverage publicly available resources to the extent possible to facilitate replicability, a choice which also leads to highlighting the most problematic aspects of hybrid OA identification and measurement. Where publishers had made hybrid OA journal listings available they were used to narrow down the population of studied journals. Article data was collected between February and December 2014. In the following the data retrieval and filtering methods for each publisher are described. All data was collected outside of any institutional network or login which could lead to subscription content being available for download. Even so, there is a lot of non-hybrid OA content that is available for download from subscription journal websites. In order to improve the signal-to-noise ratio in detecting actual hybrid OA articles among other journal content, exclusion criteria were consistently enforced, omitting any entries that fulfilled any of the following criteria:
• Single-page documents
• 2-page documents which contained the word "editorial" somewhere in the full-text
• Documents of at most 3 pages which contained the word "errata", "erratum", "corrigendum", or "corrigenda" somewhere in the full-text
• Articles published prior to 2007 or after 2013
However, these steps were only a preliminary weeding for relevant content. Specific inclusion criteria are outlined in the following sections describing how each publisher sample was constructed.
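The exclusion criteria above can be expressed as a single predicate. The following is a minimal sketch, not the procedure actually used in the study: the `Document` record and its field names are hypothetical, and the case-insensitive substring matching is an assumption, since the text does not say how the words were matched.

```python
from dataclasses import dataclass

# Words that mark a short document as a correction notice (taken from the text above).
ERRATA_WORDS = ("errata", "erratum", "corrigendum", "corrigenda")


@dataclass
class Document:
    """Hypothetical record for one downloaded PDF."""
    pages: int
    year: int
    full_text: str


def is_excluded(doc: Document) -> bool:
    """Return True if the document meets any of the four exclusion criteria."""
    text = doc.full_text.lower()  # assumption: matching ignores case
    if doc.pages == 1:
        return True
    if doc.pages == 2 and "editorial" in text:
        return True
    if doc.pages <= 3 and any(word in text for word in ERRATA_WORDS):
        return True
    if doc.year < 2007 or doc.year > 2013:
        return True
    return False
```

A document that passes this check is only a candidate; the publisher-specific inclusion phrases still have to match before it is counted as hybrid OA.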

Elsevier
As a starting point a list of journals providing the hybrid OA option was accessed at Elsevier's website through the following URL: https://web.archive.org/web/20130516113934/http://cdn.elsevier.com/assets/pdf file/0008/109448/journal list.pdf (April 2014). Upon investigating the URL structure of Elsevier-hosted journals it became evident that journal websites can be accessed through "http://www.sciencedirect.com/science/journal/XXXXXXXX/" where XXXXXXXX represents the journal ISSN. Furthermore, Elsevier's publisher-wide web platform has a convenient link on the page of each journal which displays all OA articles in said journal, including among them all articles assumed to be hybrid OA. This specific page could be accessed for every journal by adding the suffix "open-access/" to the above URL structure.
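The URL pattern described above lends itself to a small helper. This is a sketch under stated assumptions: the function names are hypothetical, the ISSN is assumed to be written into the URL as eight characters without the hyphen, and the pattern reflects the ScienceDirect site as described for 2014, so it may no longer resolve.

```python
SCIENCEDIRECT_BASE = "http://www.sciencedirect.com/science/journal/"


def journal_url(issn: str) -> str:
    """Build the journal landing page URL from an ISSN such as '1751-1577'."""
    compact = issn.replace("-", "").strip().upper()  # assumption: hyphen is dropped
    if len(compact) != 8:
        raise ValueError(f"expected an 8-character ISSN, got {issn!r}")
    return f"{SCIENCEDIRECT_BASE}{compact}/"


def open_access_url(issn: str) -> str:
    """Append the 'open-access/' suffix that lists a journal's OA articles."""
    return journal_url(issn) + "open-access/"
```

For example, `open_access_url("1751-1577")` yields the listing page address for that ISSN, which is used here purely as an illustrative input.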
From the original list of 1527 journals offering hybrid OA all publicly available PDF files were downloaded and put through the data refinement process outlined below. To minimize the inclusion of non-hybrid OA content only articles that included any of the following phrases were included:
• "creative commons"
• "This is an open access article"
• "This is an open-access article"
From random sampling among the freely available articles not explicitly marked as CC or OA it became apparent that they had been labeled as such in updated versions of the PDF files available from the publisher's website (May 2015). The reason for this retrospective license addition remained unclear, but in order to avoid re-downloading the content these articles were included in the sample.

Wiley-Blackwell
Wiley-Blackwell (referred to as only Wiley from here onwards) had published a convenient list of 1399 journals which provide the hybrid OA option at https://authorservices.wiley.com/bauthor/onlineopen order articleaccepted.asp (April 2014). By investigating the HTML source of the page a list of 1399 hybrid OA journals and their website URLs could be compiled. Eight of the journal entries on the original list were on closer inspection found to be no longer published by Wiley, trade magazines, or book series, and were discarded from further investigation. Wiley provides no mechanism for filtering OA articles per journal, so the only option for completing an exhaustive article-level study was to open the web page for each volume between 2007 and 2013 and query PDF links. To minimize the inclusion of non-hybrid OA content only articles that included any of the following phrases were included:
• "This is an open access article"
• "creative commons"
• "onlineopen"
• "© 20** The Authors"

Springer
At the time of the study Springer had not published any exhaustive list of all hybrid OA journals on the web. To avoid scouring through all journals in Springer's portfolio, which would have been very time-consuming considering the size of the publisher, we made an exception and reached out to Springer's Open Access Manager in order to obtain a list of all eligible titles. As a reply we got a list dated May 2014 with 1569 journal titles and associated URLs to journal websites. Based on this list it was possible to extract the "journal ID" for each journal and create individual search queries for Springer's publisher-level search function found at http://link.springer.com/search to find out what content in these journals is not restricted to preview only. An example URL: http://link.springer.com/search?showAll=false&facet-journal-id=40521&sortOrder=newestFirst&facet-content-type=Article&date-facet-mode=between&facet-start-year=2007&facet-end-year=2014. To minimize the inclusion of non-hybrid OA content only articles that included any of the following phrases were included:
• "This article is published with open access at Springerlink.com"
• "creative commons"
• "This is a 'Springer Open Choice' article"
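The per-journal search queries can be generated from the journal ID alone. The sketch below rebuilds the example URL given above using the standard library; the function name is hypothetical, the parameter names are copied from that example, and nothing here is taken from official Springer API documentation.

```python
from urllib.parse import urlencode

SPRINGER_SEARCH = "http://link.springer.com/search"


def springer_search_url(journal_id: int, start_year: int = 2007, end_year: int = 2014) -> str:
    """Build a search URL for one journal, mirroring the example query in the text."""
    # A list of pairs keeps the parameters in the same order as the example URL.
    params = [
        ("showAll", "false"),  # hide content that is restricted to preview only
        ("facet-journal-id", str(journal_id)),
        ("sortOrder", "newestFirst"),
        ("facet-content-type", "Article"),
        ("date-facet-mode", "between"),
        ("facet-start-year", str(start_year)),
        ("facet-end-year", str(end_year)),
    ]
    return f"{SPRINGER_SEARCH}?{urlencode(params)}"
```

Calling `springer_search_url(40521)` reproduces the example query; looping over the 1569 journal IDs would give one query per title.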

Sage
Sage provides a listing of full immediate OA journals and participating hybrid OA journals at http://www.uk.sagepub.com/repository/binaries/pdf/SAGE-Choice-Participating-Title-List.pdf (April 2014). After removing full OA journals and one journal which was no longer published by Sage, 706 journals remained in the population of hybrid OA journals published by Sage. Since Sage also uses a standardized URL scheme for all their journals it was possible to generate hyperlinks for every volume of every journal between 2010 and 2013 based on the participating hybrid OA list. The convention was the following: http://XXX.sagepub.com/content/by/year/YYYY where XXX is the three-letter abbreviation for the individual journal and YYYY the volume of interest. To minimize the inclusion of non-hybrid OA content only articles that included any of the following phrases were included:
• "creative commons"
• "© The Author"
• "The Author(s)"
• "Sage Choice"
Searching for "open access" did not result in meaningful results.
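Generating the per-year links from the URL convention above is a short loop. This is an illustrative sketch only: the function name is hypothetical, "xxx" stands in for a real three-letter journal abbreviation, and the URL scheme is the one described for 2014, so it may have changed since.

```python
def sage_year_urls(abbreviation: str, first_year: int = 2010, last_year: int = 2013) -> list[str]:
    """Return one listing URL per year for a journal's three-letter abbreviation."""
    return [
        f"http://{abbreviation}.sagepub.com/content/by/year/{year}"
        for year in range(first_year, last_year + 1)  # inclusive of last_year
    ]
```

With the default range this gives four URLs per journal, so the 706 journals would produce 2824 pages to inspect.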

Taylor & Francis
For Taylor & Francis (T&F) there was no predefined list of hybrid OA journals to utilize as a starting point, so a more labor-intensive approach was necessary. T&F offers a publisher-wide search facility at http://www.tandfonline.com/action/doSearch? where it is possible to narrow down the article search query to "Journals", "Only content I have full access to", and being published during the time of "2007-2013". To minimize the inclusion of non-hybrid OA content only articles that included the following phrase were included:
• "Open Select"
Searching for "open access" did not result in meaningful results. While the methodology differed from how data was collected for the other publishers, the functionality of T&F's search tool, the manageable journal portfolio (relative to some others), and clear marking of hybrid OA content made it viable to rely on this method. Based on observing the collected material the reliability of the method was comparable to that of the other publishers.
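The publisher-specific inclusion phrases listed in the preceding subsections can be collected into one lookup table and applied to extracted full text. The following is a minimal sketch, not the tooling used in the study: case-insensitive matching is an assumption, as is reading the Wiley wildcard "20**" as "20" followed by any two digits, and the function name is hypothetical.

```python
import re

# Phrases copied from the publisher subsections, written as regular expressions.
INCLUSION_PATTERNS = {
    "Elsevier": [
        r"creative commons",
        r"this is an open access article",
        r"this is an open-access article",
    ],
    "Wiley": [
        r"this is an open access article",
        r"creative commons",
        r"onlineopen",
        r"© 20\d\d the authors",  # assumption: "20**" means any year 2000-2099
    ],
    "Springer": [
        r"this article is published with open access at springerlink\.com",
        r"creative commons",
        r"this is a 'springer open choice' article",
    ],
    "Sage": [
        r"creative commons",
        r"© the author",
        r"the author\(s\)",
        r"sage choice",
    ],
    "Taylor & Francis": [r"open select"],
}

COMPILED = {
    publisher: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for publisher, patterns in INCLUSION_PATTERNS.items()
}


def looks_hybrid_oa(publisher: str, full_text: str) -> bool:
    """Return True if any inclusion phrase for the publisher occurs in the text."""
    return any(pattern.search(full_text) for pattern in COMPILED[publisher])
```

Text extracted from PDF files often contains irregular whitespace and line breaks inside phrases, so in practice the text would likely need normalizing before a check like this is reliable.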

Combining and refining the data
After each article matching the above criteria had been identified and retrieved, a bibliometric database containing the full reference information for each article was constructed, containing each article's journal name, journal ISSN/E-ISSN, article title, publication year, author(s), volume, issue, page number, and DOI. DOI matching, i.e. matching each article with its unique DOI in order to retrieve associated reference information, was handled through the Papers 2 reference management software. Papers 2 has a robust matching functionality which is capable of identifying DOI information embedded in article metadata or within the article itself, and of looking up associated reference data from sources like Crossref, Scopus, Web of Science, and Google Scholar (Papersapp, 2014). Most articles were successfully matched without manual intervention; a small minority required manual lookup. Once a complete reference database of all identified articles had been created, a standard .bib file containing all reference data was exported from Papers 2 to JabRef (Jabref 2014), which enabled a simple conversion of the .bib file to a .csv file of the full bibliometric dataset. This file was then imported into IBM SPSS and Microsoft Excel for analysis. This description has so far covered how the unique data on hybrid OA articles was retrieved. At the stage of analysis we matched that manually-created data to journal metadata freely downloadable from SCImago (2014) and Scopus (2014), containing e.g. journal subject categories, annual article publication counts, and SNIP values.
To further filter out likely noise from the data, the following procedures were performed:
• Only journals included in Scopus were included, to ensure that the focus was on peer-reviewed academic journals and that standardized metadata was available for all journals.
• Journals publishing fewer than 20 articles in total during 2011-2013 and whose OA share was over 1/3 of all content were excluded. Such outliers were assumed to be hybrid OA false positives, where a large share of articles in very small journals were available OA.
• Journals registered in the DOAJ were excluded, i.e. journals that had at some point during the observation period flipped from subscription-access to full OA or had been full OA from the start. This is a weakness that is hard to avoid for a semi-automated retrospective study. Journals which on closer inspection were openly marketed as OA journals but not registered in the DOAJ were also excluded.
• An existing listing of journals previously identified to incorporate delayed OA, stemming from Laakso and Björk (2013), was cross-matched with the sample to exclude journals having a policy of systematically granting access to all content after a set embargo. Again, this was done due to the retrospective nature of the study.
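The journal-level filters above can be combined into one keep-or-drop decision. This is a sketch under stated assumptions: the `Journal` record and its field names are hypothetical, and each flag is assumed to have been filled in beforehand from Scopus, the DOAJ, manual inspection, and the delayed OA listing.

```python
from dataclasses import dataclass


@dataclass
class Journal:
    """Hypothetical journal record with the inputs needed by the filters."""
    in_scopus: bool
    in_doaj: bool
    marketed_as_full_oa: bool  # found by manual inspection
    delayed_oa: bool           # appears in the delayed OA listing
    total_articles_2011_2013: int
    oa_articles_2011_2013: int


def keep_journal(journal: Journal) -> bool:
    """Apply the four noise filters; True means the journal stays in the sample."""
    if not journal.in_scopus:
        return False
    if journal.in_doaj or journal.marketed_as_full_oa or journal.delayed_oa:
        return False
    # Small-journal rule: fewer than 20 articles AND OA share above one third.
    # Integer comparison (3 * oa > total) avoids floating-point edge cases.
    small = journal.total_articles_2011_2013 < 20
    high_share = 3 * journal.oa_articles_2011_2013 > journal.total_articles_2011_2013
    if small and high_share:
        return False
    return True
```

Note that the small-journal rule only fires when both conditions hold, so a large journal with a high OA share is retained by this check.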

Results
This section presents the results of the study, first focusing on the journal level and later on the article level. The number of journals reported throughout are journals that contained at least one assumed hybrid OA article during 2007-2013. Hybrid OA journals without any assumed hybrid OA articles are not included, as such journals could not be identified with the selected approach to data collection. However, data regarding the total number of hybrid OA journals of the same five publishers is reported for the years 2009, 2012, and 2013 in Björk and Solomon (2014), which will be used as a point of reference to gauge actual uptake against potential uptake.
[...] (2007) to 2714 (2013) journals publishing at least one hybrid OA article. While this over ten-fold increase is substantial, the total hybrid OA article count in said journals has increased over twenty-fold in total, from 666 articles in 2007 to 13994 in 2013. There has been almost a doubling in the average number of hybrid OA articles published per journal per year, going from 2.79 in 2007 to 5.16 in 2013. Table 1 also contains compatible data points from Björk and Solomon (2014), which suggest that the percentage uptake of hybrid OA has dropped from 71% to 40%, much due to the rapid expansion of the hybrid OA option among the five publishers.
Fig. 2 reveals that hybrid OA content accounted for 0.1-4% of all citable documents during 2011-2013 for the majority of journals where more than one hybrid OA article could be found. The distribution follows a constant decrease across the categories with a long tail ending with one journal at hybrid OA content between 34.1% and 36%.
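The growth figures quoted above can be checked with simple arithmetic. The sketch below uses only numbers stated in the text; the function names are hypothetical, and the 2007 journal count is back-calculated from the reported average rather than read from the surviving text.

```python
def per_journal_average(articles: int, journals: int) -> float:
    """Average number of hybrid OA articles per journal in a given year."""
    return articles / journals


def fold_increase(start: float, end: float) -> float:
    """How many times larger the end value is than the start value."""
    return end / start


# Figures stated in the text.
ARTICLES_2007, ARTICLES_2013 = 666, 13994
JOURNALS_2013 = 2714
AVERAGE_2007 = 2.79

average_2013 = per_journal_average(ARTICLES_2013, JOURNALS_2013)
article_growth = fold_increase(ARTICLES_2007, ARTICLES_2013)
# Inference only: the journal count implied by 666 articles at 2.79 per journal.
implied_journals_2007 = round(ARTICLES_2007 / AVERAGE_2007)
```

The 2013 average rounds to the reported 5.16, the article count grows by a factor of about 21, and the implied 2007 journal count of roughly 239 is consistent with the stated "over ten-fold" increase to 2714 journals.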
Scopus categorizes indexed journals into one or several of five major categories: Life Sciences, Social Sciences, Health Sciences, Physical Sciences, and General. Many journals are categorized into several of these categories, which the analysis needs to account for, since analyzing the categories individually would lead to a lot of articles being counted more than once. Accounting for all permutations in categorization, Table 2 [...] both as averages for journals by year and for each subject category individually. Longitudinally it is only for year 2007 that the article-volume weighted SNIP value is below that of the simple average of all journals active in hybrid OA publishing in the year. This is an indication that journals above the average SNIP of all journals are publishing more hybrid OA articles and thus increasing the article-weighted number. The SNIP averages for subject categories reveal that most categories have a higher article-weighted average than one calculated without taking into account article volumes, the exceptions being: Physical, Life + Social + Physical, Social + Physical + Health, General, and Life + Health + Physical + Social. Some of the categories have a small population size, which likely causes the inverse trend to other categories. However, the Physical Sciences category is not small (N = 820), and the fact that the other outliers contain partial Physical Sciences classification suggests that the relationship between hybrid OA and SNIP might be different within this discipline, since SNIP is designed to be a field-normalized indicator enabling cross-discipline comparisons. The article-level analysis will shed more light on potential explanations for the finding.
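The difference between the simple and the article-volume weighted SNIP average discussed above can be illustrated with a few lines. This is a sketch with made-up toy values, not data from the study, and the function names are hypothetical.

```python
def simple_mean(snips: list[float]) -> float:
    """Unweighted average SNIP across journals."""
    return sum(snips) / len(snips)


def article_weighted_mean(snips: list[float], article_counts: list[int]) -> float:
    """Average SNIP where each journal is weighted by its hybrid OA article count."""
    if len(snips) != len(article_counts):
        raise ValueError("snips and article_counts must have the same length")
    weighted_sum = sum(s * n for s, n in zip(snips, article_counts))
    return weighted_sum / sum(article_counts)


# Toy data: the journal with the highest SNIP also publishes the most hybrid OA articles.
toy_snips = [0.8, 1.2, 2.0]
toy_counts = [1, 2, 10]
```

In this toy case the weighted mean exceeds the simple mean, which is the pattern the text interprets as higher-SNIP journals publishing more hybrid OA articles.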
Scopus also provides a lower level of classification of journals into 27 categories, again where each journal commonly belongs to multiple sub-categories. Wang and Waltman (2016) recently evaluated the strength of the connection between journal categorizations in Scopus and Web of Science based on inter-disciplinary citations of published papers, with results suggesting that the Scopus classification is lenient in including journals in multiple categories to which the connection through citations within the discipline is low. However, in the absence of better options and in order to be able to make article-level calculations for the distribution of articles across these 27 categories, the article volume of each journal was divided and assigned equally to all subject categories of which it was a member. The results of this analysis can be found in Table 3. Because article volumes are divided and category totals are then summed, rounding of fractions produces a small discrepancy from the absolute numbers (the total number of articles adds up to 14094, while the actual number of observations through data collection was 13994, as presented earlier in Table 1).
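The equal division of a journal's article volume across its categories, and the rounding discrepancy it can cause, can be sketched as follows. The input data is a made-up toy example and the function name is hypothetical; only the splitting rule comes from the text.

```python
from collections import defaultdict


def fractional_category_counts(journals: list[tuple[int, list[str]]]) -> dict[str, float]:
    """Split each journal's article count evenly over its subject categories."""
    totals: dict[str, float] = defaultdict(float)
    for article_count, categories in journals:
        share = article_count / len(categories)
        for category in categories:
            totals[category] += share
    return dict(totals)


# Toy input: (hybrid OA article count, categories the journal belongs to).
toy_journals = [
    (10, ["Medicine", "Nursing"]),
    (3, ["Medicine"]),
    (5, ["Chemistry", "Physics", "Medicine"]),
]

counts = fractional_category_counts(toy_journals)
exact_total = sum(counts.values())                     # matches the 18 input articles
rounded_total = sum(round(v) for v in counts.values()) # can differ after rounding
```

Here the unrounded category totals sum to the 18 input articles, while rounding each category to a whole number gives 19, the same kind of small discrepancy noted above for Table 3.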
Focusing on the journal discipline categories with a sufficient number of observations (>35), Fig. 3 visualizes the relationship between average OA% and journal average SNIP (year 2013), further highlighting some of the already indicated disparities between disciplines. Physical sciences have on average the highest relative SNIP values compared to Life, Health and Social Sciences; at the same time, however, the percentage share of OA articles in journals is the lowest in the Physical sciences. See Table 4 for excluded categories not visualized due to small population size.
By the results presented so far it is clear that the various disciplines differ not only in absolute hybrid OA article output but, perhaps more interestingly, in the percentage of hybrid OA articles when compared to the total output of the associated journals during the timespan of 2011-2013. The total average is 5.2 hybrid OA articles for 2013, and 3.8% of all output between 2011 and 2013 being hybrid OA. The hybrid OA percentage ranges from the lowest of 1.6% within Materials Science (despite an average hybrid OA article count of 4.5, which is close to the overall total) to the highest of 5.2% within the Arts and Humanities sub-category (while the average hybrid OA article count of the category was only 3, which is below the total average). What can be discerned from this overall is that the sizes of journals differ between the disciplines, leading to varying results depending on whether one observes the situation based on article count or as a percentage of article output. As such, generalizations of results across all categories obscure some of the important insight that can be gained. This also supports the choice of using percentages as an additional measure to absolute counts for evaluating and comparing hybrid OA uptake between journals and disciplines. Counting only absolute articles skews the analysis in different ways due to the size differences in journals between subject categories.
Table 3. Article-level analysis (article volume of journals categorized into multiple subject categories split evenly among included categories; small discrepancy from absolute numbers due to rounding of fractions).
Correlation between journal SNIP 2013 value and OA ratio (as percentage of all articles 2011-2013 found OA) was tested with Pearson's correlation, which found a weak but significant correlation between the two variables, r = −0.068 with a 95% confidence interval of [−0.1, −0.035]. In other words, as the scatterplot in Fig. 4 also visually suggests, there is overall not a strong linear relationship between journal impact and the OA ratio of articles in the journals; however, a slight negative relationship can be discerned based on the collected data (which has been collected and constructed with the bias of the journals having >0 hybrid OA articles published in them during 2007-2013). What also factors in when considering journal citation-based indicators and measurements influenced by output volume (such as the OA% here) is the inherent relationship between journals of different sizes and citation-based indicators. Huang (2016) recently presented evidence for the positive correlation between quantity and impact when it comes to scholarly journals, which in this context would have an influence in that journals having a higher SNIP value also tend to be larger in publication volume and thus have a lower relative OA content as measured by percentage. However, the bottom line is that there is not overall a strong linear relationship between SNIP and hybrid OA uptake as percentage of published articles.
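A Pearson correlation with a confidence interval of the kind reported above can be computed with the standard library alone. This is a sketch, not the study's procedure: the analysis was run in SPSS and Excel, the text does not state how the interval was derived, and the Fisher z-transformation used here is one common choice. The sample size in the usage note is a placeholder, not the study's N.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fisher_ci(r: float, n: int, z_crit: float = 1.959964) -> tuple[float, float]:
    """Approximate 95% confidence interval for r via the Fisher z-transformation."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                 # transform r to an approximately normal scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

For instance, `fisher_ci(-0.068, 3500)` gives an interval that lies entirely below zero, which is what "weak but significant" means here; the value 3500 is purely illustrative.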

Discussion
The study provides an experimental methodology for measuring the state of hybrid OA through a bottom-up approach. The insight available previously has been fragmented and usually not longitudinal, and often limited to either the journal level or the article level of analysis. Through this study it was possible to discern the total growth for both journals and articles, and for each subject category individually. The relationship between hybrid OA uptake and journal impact (as operationalized by SNIP) proved to be weak on the total sample of journals. In the future, if hybrid OA metrics improve and become more readily available, conducting the analysis on a broader set of journals that also includes hybrid OA journals without uptake of the hybrid OA option could produce a different conclusion.
This study was considerably more time-consuming than initially expected, largely due to variations in how clearly publishers mark openly available content. Even after identifying and retrieving the OA content it is not always evident whether the content has always been OA, will remain OA in the future, or just happens to be so for the time being due to promotional reasons or technical glitches. Since the data collection of this study was performed there has been slow and steady progress towards common metadata standards for identification of hybrid OA. A summary of the key developments is given by Chumbe, MacLeod, and Kelly (2015), who point out the key difficulties in getting publishers and discovery service providers to adopt common practices to discern between open and closed content. The authors created a promising web service, JournalTOCs, that could prove very valuable for the purpose of monitoring and studying hybrid OA (JournalTOCs, 2016). It is a web service that aggregates article and journal metadata feeds from scholarly journal publishers, including the five that are included in this study. The metadata currently contains information regarding hybrid OA status on the journal level, but comprehensive identification of individual articles as hybrid OA is still lacking, which is something that the metadata addition suggested by Chumbe et al. (2015) could rectify.
The major limitation of this study is that it makes the most of what is available: uncertainties are unavoidably introduced by bottom-up data collection through the web, where thousands of journals are scoured for marginally available openly accessible content. Even when content is found, determining what content is actually meant to remain persistently open is another step that introduces imperfect results. By documenting the experimental methodology as transparently as possible, the intention has been to cater to those researchers who want to improve on any stage of the collection, identification, or analysis.
Future research into hybrid OA publishing could focus on analyzing traits of individual articles, e.g. author affiliations or potential funding sources mentioned in article acknowledgements. A large-scale analysis of citation accrual to hybrid open access articles could also help shed light on the relationship between hybrid open access and received citations, since studies so far have been limited in scope. Mueller-Langer and Watt (2014) focused only on articles within economics, and Sotudeh et al. (2015) covered only Elsevier and Springer, exclusively at the article level, for the years 2007-2011. Another study could focus on journal size, looking at the relationship between journal size and the uptake of hybrid OA. A comparison between hybrid OA uptake and publisher self-archiving policies for journals would also be valuable to better understand the current OA phenomenon, i.e. could authors of hybrid OA articles have uploaded an accepted manuscript of the article just as well? And if so, is there any relationship between strictness of self-archiving policies and hybrid OA uptake?
What has been reported here is only the inception of hybrid OA. In the beginning the alternative was sold as an optional service to authors; since then hybrid OA has become integrated into agreements and policies at different levels. The increasing number of offsetting deals with publishers, OA mandates, restrictions in journal self-archiving policies, and consortia similar to SCOAP3 (Romeu et al., 2016) will fuel, or have likely already fueled, hybrid OA growth to much higher levels. Now that the content is increasingly open it should also be made as discoverable as possible.

Authors' contribution
Conceived and designed the analysis, collected the data, contributed data or analysis tool, performed the analysis, wrote