A web tool for exploring the usage of medicines in hospitals in England

Datasets on the amounts of different medicines used over time and location are a valuable resource, with the power to reveal insights into healthcare trends, cost efficiencies, and geographic disparities. In England, primary care prescription data has been openly accessible for analysis for some time through a web tool, providing significant benefits. Since 2020, the National Health Service in England has also released data on secondary care medicine usage, processed from stock control databases, which provides detailed information on medicine usage within hospitals. This is an important dataset, but until now has been available only in a raw form that requires considerable technical skills to be used for even the analysis of basic trends. I have built a web application that enables anyone to easily analyse trends in this data, which is available at hospitalmedicines.genomium.org.


Introduction
Datasets capturing amounts of medicines used by time and location provide an important resource for potential insights into drug utilisation patterns, healthcare practices, and public health trends. Since 2015, the OpenPrescribing project has provided a comprehensive platform for the analysis of primary care prescription data in England (OpenPrescribing.net; Curtis & Goldacre, 2018). It now receives >130,000 unique visitors per year, and has made substantial impacts on clinical practice (Walker et al., 2019).
Historically, secondary care medicine usage data (i.e. data related to prescriptions in hospitals) had not been made publicly available. The Bennett Institute, which builds OpenPrescribing, called in 2020 for secondary care stock control data (which had for some time been collected by a company called Rx-info) to be made publicly available (Goldacre & MacKenna, 2020). In parallel (as described in the online responses to that article), an arrangement was developed for Rx-info's dataset to be made available through the NHS Business Services Authority (NHSBSA) Open Data Portal as the Secondary Care Medicines Data (SCMD) dataset, which was launched soon after in the same year.
The SCMD has been an important dataset, and has been used, for example, as a denominator to estimate the rate of adverse drug reports (Sandhu et al., 2022), and to investigate asthma prescribing (Rowan & MacKenna, 2020). It has widespread potential applications: my own interest in this area came from my attempts to link temporal and geographic trends (primarily at the level of countries) between a mutagenic drug and the rate of specific mutational events in virus genomes (Sanderson et al., 2023).
However, the SCMD dataset requires significant processing in order to extract useful insights. For example, it is available as a set of CSV files, one per month, which must each be downloaded and then combined in order to generate temporal insights. The combined dataset currently comprises 17 million rows of data. Many analyses require combining the dataset with other information: hospitals are indicated by their ODS code, e.g. R1H, but users are likely to want to translate this to its textual value, e.g. Barts Health NHS Trust. Similarly, any aggregation by drug ingredient (e.g. aspirin), instead of specific formulation (e.g. aspirin 500mg granules sachets sugar free), requires joining to the NHS Dictionary of Medicines and Devices (dm+d).
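The processing burden described above can be illustrated with a minimal Python sketch. It is not the tool's code: the column names (`ODS_CODE`, `PRODUCT`, `QUANTITY`), the sample rows and the one-entry `ODS_NAMES` lookup are assumptions made purely for illustration, and real SCMD files would be read from disk rather than from in-memory strings.

```python
import csv
import io
from collections import defaultdict

# Two tiny in-memory stand-ins for monthly CSV files (invented rows and
# hypothetical column names, not the real SCMD schema).
MONTHLY_FILES = {
    "2023-01": "ODS_CODE,PRODUCT,QUANTITY\n"
               "R1H,Aspirin 75mg tablets,1000\n"
               "RJ1,Aspirin 75mg tablets,400\n",
    "2023-02": "ODS_CODE,PRODUCT,QUANTITY\n"
               "R1H,Aspirin 75mg tablets,1200\n",
}

# Hypothetical ODS lookup; in practice this comes from a separate download.
ODS_NAMES = {"R1H": "Barts Health NHS Trust"}


def combine_months(files):
    """Stack the monthly files into one list of rows, adding the month
    and translating ODS codes to names where a name is known."""
    rows = []
    for month, text in sorted(files.items()):
        for record in csv.DictReader(io.StringIO(text)):
            code = record["ODS_CODE"]
            rows.append({
                "month": month,
                "trust": ODS_NAMES.get(code, code),  # fall back to the code
                "product": record["PRODUCT"],
                "quantity": float(record["QUANTITY"]),
            })
    return rows


def monthly_totals(rows):
    """Sum quantities per (month, product) across all trusts."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["month"], row["product"])] += row["quantity"]
    return dict(totals)


combined = combine_months(MONTHLY_FILES)
totals = monthly_totals(combined)
```

Even this toy version needs a stacking step, a lookup step and an aggregation step before a single trend line can be drawn.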
Here, I describe a web tool for exploration of this dataset. This tool permits users to inspect temporal trends in hospital prescriptions, both nationally and at the level of individual NHS Trusts, and permits aggregation by ingredient. It is available at hospitalprescriptions.genomium.org (Sanderson, 2024).

Operation
The web application has two main modes. The Formulation mode allows users to search for specific "Virtual Medicine Products", which are genericised versions of real products, for example 'Ibuprofen 200mg capsules'. When the user selects a product, a graph of its national trends over time is shown, which can be visualised as a bar plot or as a line plot (with or without smoothing). Users can choose to view either the number of items prescribed (i.e. tablets, vials, etc.) or their 'indicative cost', though the latter metric does not represent the real cost paid and sometimes appears to be affected by data entry errors, so may not be especially useful.
The Ingredient mode allows selection of any ingredient of medicine products (typically a drug, e.g. 'Ibuprofen'); it then aggregates the total usage of this ingredient, generally in grams, across all products. This means that if a hospital switches from prescribing 2,000 50 mg tablets per month to prescribing 1,000 100 mg tablets, the resulting graph reflects that the total usage of the drug is unchanged.
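The arithmetic behind the tablet example can be checked with a short Python sketch. The function name `ingredient_grams` and the `UNIT_TO_GRAMS` table are illustrative assumptions, not names taken from the tool.

```python
# Conversion factors to grams (illustrative subset of possible units).
UNIT_TO_GRAMS = {"g": 1.0, "mg": 0.001}


def ingredient_grams(items, strength, unit):
    """Total mass of ingredient, in grams, for `items` units of a product
    that each contain `strength` of the ingredient in `unit`."""
    return items * strength * UNIT_TO_GRAMS[unit]


# 2,000 tablets of 50 mg versus 1,000 tablets of 100 mg.
before = ingredient_grams(2000, 50, "mg")
after = ingredient_grams(1000, 100, "mg")
```

Both scenarios come to 100 g of ingredient, which is why an ingredient-level graph stays flat across such a switch while the two formulation-level graphs would each show a sharp change.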
In Formulation mode, users can break down usage into different hospitals, either by presenting the national picture coloured by hospital trust, or by filtering to an individual trust. This is also possible in the Ingredient mode, which adds additional features to break down usage by the specific product in which the ingredient is found, and by the route of administration of the product (i.e. oral, intravenous, etc.).

Implementation
I created a PostgreSQL database instance to house SCMD data (the final size of the table is >1.3 GB) and related datasets from the dm+d and about hospitals. I populate it using a script which I make openly available. It performs the following operations:
• downloads each month's data from the SCMD dataset
• downloads the NHS dm+d: this requires a manual login to TRUD, although it should be automatable (OpenPrescribing code)
• downloads hospital trust ODS mappings, giving the names of hospital trusts from their ODS code
• creates a series of database tables using the above data
• adds a table representing common units, and their mappings to standard units (e.g. mg are mapped to 0.001 g)
• adds a series of indexes to the database to speed up queries
The web application is implemented in NextJS. It uses backend API routes that query the PostgreSQL database, and a frontend implemented in React. The most complex database queries are for ingredient-based searches, with database joins used to connect a specific ingredient to all of the specific formulations in which it appears, with its strength in each formulation accounted for and then aggregated by month and by hospital.
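The shape of such an ingredient query can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, and every table name, column name and data row (`scmd_usage`, `vmp_ingredient`, `unit_conversion`, the ingredient 'Exampledrug') is hypothetical rather than taken from the tool's schema.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL instance.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scmd_usage (month TEXT, ods_code TEXT, vmp_id INTEGER, quantity REAL);
CREATE TABLE vmp_ingredient (vmp_id INTEGER, ingredient TEXT, strength REAL, unit TEXT);
CREATE TABLE unit_conversion (unit TEXT, grams_per_unit REAL);
CREATE INDEX idx_usage_vmp ON scmd_usage (vmp_id);

-- Invented rows: one trust switching between two strengths of one drug.
INSERT INTO scmd_usage VALUES
  ('2023-01', 'R1H', 1, 2000),
  ('2023-02', 'R1H', 2, 1000);
INSERT INTO vmp_ingredient VALUES
  (1, 'Exampledrug', 50, 'mg'),
  (2, 'Exampledrug', 100, 'mg');
INSERT INTO unit_conversion VALUES ('mg', 0.001), ('g', 1.0);
""")

# Join usage rows to the products containing the ingredient, convert each
# product's strength to grams, then aggregate by month and by hospital.
QUERY = """
SELECT u.month,
       u.ods_code,
       SUM(u.quantity * vi.strength * uc.grams_per_unit) AS grams
FROM scmd_usage AS u
JOIN vmp_ingredient AS vi ON vi.vmp_id = u.vmp_id
JOIN unit_conversion AS uc ON uc.unit = vi.unit
WHERE vi.ingredient = ?
GROUP BY u.month, u.ods_code
ORDER BY u.month, u.ods_code
"""

rows = conn.execute(QUERY, ("Exampledrug",)).fetchall()
```

Keeping the unit conversion in its own table means that adding a new unit requires a new row rather than a change to the query.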
The frontend is React-based. Graphing is carried out with Observable Plot (ObservableHQ, 2023). Features for exporting data and downloading an SVG from the graph are available. State is captured in the URL to permit specific graphs to be bookmarked.
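The idea of capturing state in the URL can be sketched in a few lines of Python. The parameter names (`mode`, `ingredient`, `plot`) and the base URL are placeholders chosen for the example; the real application is written in JavaScript and its query parameters may differ.

```python
from urllib.parse import parse_qs, urlencode, urlparse

BASE_URL = "https://example.org/"  # placeholder, not the tool's address


def state_to_url(state):
    """Serialise a flat dict of view settings into a bookmarkable URL."""
    return BASE_URL + "?" + urlencode(sorted(state.items()))


def url_to_state(url):
    """Recover the view settings from a URL produced by state_to_url."""
    query = parse_qs(urlparse(url).query)
    return {key: values[0] for key, values in query.items()}


state = {"mode": "ingredient", "ingredient": "Ibuprofen", "plot": "line"}
url = state_to_url(state)
restored = url_to_state(url)
```

Because the whole view is described by the URL, a bookmarked or shared link reproduces the same graph without any server-side session.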

Use cases
In this section I will briefly discuss some trends visible in this dataset, to give some sense of its potential utility. In each case I provide a graph from the web application (though graphs are much better viewed live, with tooltips providing additional data).
For a quick sense-check on the dataset and its processing, I analysed drugs we expect to have temporal variation. For example, the anti-influenza drug oseltamivir (Tamiflu) shows winter spikes, apart from during the SARS-CoV-2 pandemic, in which social distancing largely suppressed influenza transmission (Figure 1).
Similarly, palivizumab, a monoclonal antibody used prophylactically to protect against RSV, but only during its transmission season, shows its expected pattern (Figure 2).
We can also plot the usage of a drug where we expect less variation, such as paracetamol (acetaminophen), with a relatively constant ~10 tons used per month, with a moderate dent made by the COVID pandemic (Figure 3).
In contrast, ibuprofen shows a more marked reduction during the first two major SARS-CoV-2 waves and also usage that never fully returns to pre-pandemic levels (Figure 4). The first two waves of SARS-CoV-2 are highly visible in the dataset. Propofol, used as an anaesthetic for patients receiving mechanical ventilation, sees marked increases during early SARS-CoV-2 waves (Figure 5).
For dexamethasone we see a decrease from baseline during the first COVID-19 wave, and then a marked increase during the second wave, following the discoveries of the RECOVERY trial (The RECOVERY Collaborative Group, 2021) (Figure 6).
While some kinds of healthcare increased during the pandemic, others decreased. A profound reduction in the use of basiliximab, a monoclonal antibody used immediately following organ transplants to prevent rejection, can be seen during initial SARS-CoV-2 waves. The graph in Figure 7 is coloured by NHS Trust.
Some reductions are encouraging. Below we can see reductions in the use of the anaesthetic gas desflurane. Short-term reductions due to SARS-CoV-2 are visible, but the much larger trend of reduction reflects phasing out due to desflurane's potential to contribute to global warming: by some measures it is 3,714 times as potent as carbon dioxide in its contribution (Ryan & Nielsen, 2010) (Figure 8).
The ability to break down data by the route of drug administration can be important. Waves of SARS-CoV-2 saw a reduction in orally administered morphine (yellow), but increases in intravenous administration (red) (Figure 9).
Antibiotics also show temporal trends, with an increase in the amount of piperacillin prescribed (Figure 10).
Amoxicillin shows clear increases, across trusts, during the streptococcus group A outbreak of winter 2022-2023 (UK Health Security Agency, 2023) (Figure 11).
Breaking this down by formulation reveals a particularly striking trend for 250mg doses, which reflect prescriptions to young children (Figure 12).
Many medicines show increases during the time period for which data is available. Following a decline in prescribing of methylphenidate hydrochloride, a drug for the treatment of ADHD, at the start of the pandemic, usage has since increased substantially (Figure 13).
A range of recently developed or licensed drugs show increases reflecting a national roll-out. A whole host of monoclonal antibodies (a text search for "mab" in Ingredients mode is one quick shortcut) show great increases during the period.
Breaking down erenumab (a migraine medication) usage by trust shows an initial period where prescriptions were predominantly at St Thomas' hospital (in yellow), and then a wide rollout following a NICE recommendation (Figure 14).
The novel anti-diabetes drug semaglutide begins use during the covered period, but also shows the effects of recent shortages (Figure 15).

Discussion
The Hospital Medicines Usage Data Explorer provides a useful interface to access important open data released by the NHS on drug usage. I hope it will be useful to many people who are interested in these data.

Limitations
My tool has many limitations. It does not provide normalisation to patient numbers, so is limited in its potential to allow comparisons across hospitals. It does not normalise for standard dosages and does not allow graphs with multiple drugs, limiting the ability to compare drugs. It also does not allow grouping drugs into classes. It does not directly display geographic trends. There are also limitations of the underlying dataset. The 'indicative cost' metric does not reflect the actual amount paid, due to confidential discounts. There also appear to be cases of errors or artifacts in the data on drug quantities, which manifest as an implausible spike in prescribing from a particular trust such that it makes up 90% of all prescribing nationally for a given month. I hope that by making these effects more accessible, these issues may be made more visible and thereby corrected or avoided.
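One way such spikes could be surfaced automatically is sketched below in Python. This check is not part of the tool: the function name `flag_dominant_trusts`, the 0.9 threshold and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict


def flag_dominant_trusts(records, threshold=0.9):
    """Return (month, trust) pairs where a single trust accounts for at
    least `threshold` of the national total for that month.

    `records` is an iterable of (month, trust, quantity) tuples for one
    product or ingredient.
    """
    records = list(records)
    national = defaultdict(float)
    for month, _trust, quantity in records:
        national[month] += quantity

    flagged = []
    for month, trust, quantity in records:
        total = national[month]
        if total > 0 and quantity / total >= threshold:
            flagged.append((month, trust))
    return flagged


# Invented data: trust "AAA" reports an implausible value in February.
records = [
    ("2023-01", "AAA", 100.0), ("2023-01", "BBB", 120.0),
    ("2023-02", "AAA", 5000.0), ("2023-02", "BBB", 110.0),
]
suspicious = flag_dominant_trusts(records)
```

A flag from a check like this is only a prompt for inspection, since a drug that is genuinely used at one specialist centre would trigger it every month.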

Future plans
My plans for extending this tool are limited. I agree with the sentiment (Goldacre, 2023) that core web services such as this should be developed and delivered by teams including professional software engineers and people with specialised domain knowledge. I hope that a similar service will soon be run by such a team, for example at Open Prescribing (Open Prescribing Hospitals). In the meantime I welcome pull requests, but am limited in my resources for extending the application. I plan to regularly update the dataset with future SCMD releases.
I hope that my tool can be useful, both in providing an opportunity to analyse this data until other similar tools exist, and in some of the features it provides, such as Ingredients mode, which I believe does not have a direct analog in OpenPrescribing, apart from in specific measures.

I congratulate the author and I also thank the journal for providing a place to index such work. In my experience it can be difficult to find health journals that will accept descriptions of health software infrastructure.
The paper is very clear, with links to relevant code for inspection. For the avoidance of doubt, I have not fully inspected the code nor evaluated if it is enough to reproduce the tool, as this would require a substantial resource investment on my part.
The use cases in the paper give a good demonstration of the tool. I would encourage readers to actually use the tool itself to see their own examples, as it is very user friendly.
As a personal preference I would make the following observations that the author may consider revising or adding detail on; however, I note that there is limited/no resource for more work on this tool and paper. I'd be happy for it to be indexed in its current state and the following are minor suggestions based on my own interests and penance.
Limitations section: I think the author overstates the limitations of the tool and mixes up perceived limitations of the underlying data with the tool.
Future plans: I think the direct analogue in OpenPrescribing might be https://openprescribing.net/chemical/
ChatGPT: As a very new technology it would be helpful to have more detail on how ChatGPT is used in the workflow. This would help those of us trying to incorporate new technologies to learn more about its use cases. A note in the methods would probably suffice.
Is the rationale for developing the new software tool clearly explained? Yes
Is the description of the software tool technically sound? Yes
Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others? Yes
Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Yes
Competing Interests: I don't have any competing resources as per the journal policy but I want to document that I am part of the OpenPrescribing team mentioned in the paper. I confirm that this potential conflict of interest did not affect my ability to write an objective and unbiased review of the article.
Reviewer Expertise: I have experience in medicines data and health data more broadly.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Kishore Kumar Jagadeesan
University of Bath, Bath, England, UK
Theo Sanderson's article introduces a web tool for analyzing secondary care medicine usage data, which is a great initiative for improving data access in healthcare. However, there are several critical areas where the article needs significant improvement:

Purpose of the Tool:
The article says the tool is designed to make secondary care medicine usage data more accessible. However, it should better explain the specific problems it solves, such as handling complex raw data and providing a user-friendly interface that current tools lack.

Documentation and Setup:
The article includes a link to the source code, which is good. However, it doesn't provide detailed instructions or documentation on how to set up and use the tool. Without these, it's hard for others to replicate or use the tool with different data.

Explanation of Methods:
The article gives examples of how the tool works but doesn't explain the statistical methods used in detail. It's important to understand these methods to trust the results.
Additionally, there should be a discussion on how the tool deals with data biases.

Tool Performance:
The article's conclusions about the tool's performance are based on a few examples, which are not enough. A systematic evaluation of how accurate, user-friendly, and scalable the tool is would provide stronger support for its effectiveness.
In summary, while the tool has great potential, the article needs significant improvements in technical details, documentation, and evaluation to meet professional standards. The author should take a more thorough approach to validate the tool's performance and its impact on healthcare data analysis.

Additional Citation:
For further information on similar tools and methodologies, you might consider looking at the PrAna R package, which is designed to aggregate and visualize England's national prescription data, focusing on primary care prescriptions.PrAna provides a comprehensive workflow for data preparation, conversion, and visualization, and offers detailed documentation for setup and use.More information about PrAna can be found in their publication and on their GitHub page (Jagadeesan et al., 2022).

Figure 3. Paracetamol monthly usage totals across all hospitals, visualised in ingredient mode.

Figure 4. Ibuprofen monthly usage totals across all hospitals, visualised in ingredient mode with "line" styling.

Figure 5. Propofol monthly usage totals across all hospitals, visualised in ingredient mode with smooth line styling.

Figure 6. Dexamethasone monthly usage totals across all hospitals, visualised in ingredient mode with smooth line styling.

Figure 7. Basiliximab monthly usage totals across all hospitals, coloured by hospital trust, visualised in ingredient mode.

Figure 9. Morphine sulfate monthly usage totals across all hospitals, coloured by route of administration of the product, visualised in ingredient mode. Green indicates oral routes, red indicates intravenous routes and teal indicates multiple potential routes.

Figure 10. Piperacillin monthly usage totals across all hospitals, visualised in ingredient mode.

Figure 11. Amoxicillin trihydrate monthly usage totals across all hospitals, visualised in ingredient mode, with each line indicating data from a different hospital.

Figure 12. Amoxicillin 250mg/5ml oral suspension sugar free monthly usage across all hospitals, visualized in formulation mode.

Figure 13. Methylphenidate hydrochloride monthly usage across all hospitals, visualized in ingredient mode.

Figure 14. Erenumab monthly usage across all hospitals, visualized in ingredient mode, broken down by NHS Trust. St Thomas' hospital in yellow.

Figure 15. Semaglutide monthly usage across all hospitals, visualized in ingredient mode.

Reviewer Report 30 July 2024
https://doi.org/10.21956/wellcomeopenres.22982.r85658
© 2024 Hamilton F. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Fergus Hamilton
University of Bristol, Bristol, England, UK
Apologies for my delay with this. This is a useful technical report that describes an important and useful tool. I have shared widely with clinical colleagues who have found it useful. Well done! I have minor comments, only relating to minor clinical details:
○ Piperacillin is always prescribed in the UK as part of the drug combination piperacillin-tazobactam and is a broad-spectrum antibiotic used in severe infection; this comment should ideally be added.
○ Streptococcus Group A is generally called Group A Streptococcus (or S. pyogenes).
Once again, thanks for this.

Is the rationale for developing the new software tool clearly explained? Yes
Is the description of the software tool technically sound? Yes
Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others? Yes
Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Yes
Competing Interests: No competing interests were disclosed.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
https://doi.org/10.21956/wellcomeopenres.22982.r85659
© 2024 Jagadeesan K. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.