Comparison of regulatory approaches for determining application limits for nitrogen fertilizer use in Germany

This study examined the suitability of three different indicators as entry points for agricultural regulation for limiting excess nitrogen (N) fertilizer inputs in Germany: net soil surface balance, gross farm-gate balance, and fertilization planning. Data on about 6000 farms in Germany were grouped into types for comparative analysis. The design of the regulatory approaches and the reliability of constituent parameters were then examined, and proportions of affected farms and mean N reduction requirements were identified. This revealed that: (a) design and purpose of the regulatory approaches differ, but the data requirements are very similar; (b) the parameters involved differ in reliability and integrity; and (c) the limits for maximum N fertilizer input at farm level vary with approach and farm type.


1. Introduction

Background
Crop-targeted and balanced N fertilization is necessary for optimal plant nutrition and, at the same time, for reducing environmental impacts. Loss of reactive N compounds from farms is a major ecological challenge, as these compounds threaten biodiversity, climate, and human health (Sutton and Bleeker 2013). The EU Nitrates Directive (91/676/EEC) aims to reduce nutrient losses from agricultural activities in order to protect groundwater and surface waters from nitrate emissions (European Commission 1991). In Germany, the Directive is implemented through the Fertilizer Application Ordinance (DüV) (DüV 2020). Following the judgment of the European Court of Justice of 21 June 2018 on inadequate implementation of the Nitrates Directive, the DüV was amended in 2020 (Kuhn et al 2020). The amended version abolishes the requirement for a nutrient comparison in the form of a net soil surface balance (SoilB) and tightens the rules on fertilization planning (FertP). For regions where groundwater nitrate exceeds the threshold value of 50 mg NO3 L−1, or exceeds 37.5 mg NO3 L−1 with an increasing trend (e.g. in high-livestock regions in Northwestern and Southern Germany, or low-precipitation regions in East Germany), strict and harmonized measures to reduce water pollution by nitrates must be implemented (Wolters et al 2021). Since 2021, the quantity and quality of measuring stations for the classification of nitrate-sensitive areas have been increased, and a standardized methodology has been prescribed (BMEL 2020). In order to achieve the German target for sustainable N management embedded in the German Sustainable Development Strategy, the Ordinance for Substance Flow Analysis (StoffBilV 2017), a gross farm-gate balance (FarmB) framework, was introduced in 2018 (The Federal Government 2020). This study compared the three regulatory approaches (SoilB, FertP, FarmB) as performance indicators for nutrient management in terms of structure, control, and enforcement, and the effects on N management at farm level.
The current state of the approaches and their potential, similarities, and differences in agri-environmental policy were also compared. Nutrient policy in the EU, and especially in Germany, is undergoing major changes (Klages et al 2020a), most recently through the abolition of SoilB in German regulations (Klages et al 2020b). At member state level, the gross nutrient balance in 2017 was only 62 kg N ha−1 of utilized agricultural area (UAA) in Germany, while it was 187 kg N ha−1 in the Netherlands (Eurostat 2020a). However, the Netherlands has already implemented fertilization planning as a regulatory approach, according to COM requirements, while in Germany two regulatory approaches (FertP and SoilB) were applied concurrently until 2020. Due to the degree of nitrate pollution, lack of improvement in German groundwater bodies, and a dispute about required measures, Germany faced infringement proceedings in 2013 (Salomon et al 2016) and was found to be in breach in 2018 (European Court of Justice 2018). Around 20% of EU-wide infringement cases in 2019 were within the environment policy area (European Commission 2019b), many relating to the Nitrates Directive, e.g. in Germany, Greece, Belgium, and Austria (European Commission 2019a). This illustrates the enormous bureaucratic effort required to ensure implementation of the Nitrates Directive in EU Member States.
Germany is at a turning point: SoilB has been abolished by the national legislature and FertP has been strengthened, as it is considered the preferred approach under the EU Nitrates Directive. However, FarmB is legitimized by the national targets set for sustainability and climate protection and thus by Germany's sovereign legislature, the Bundestag. The Federal Government has set a goal of 70 kg N ha−1 for the FarmB of the agricultural sector to meet sustainability standards in the context of water and air quality, biodiversity, and climate protection (The Federal Government 2020). This requires an N surplus reduction of about 20 kg N ha−1 (DESTATIS 2020). A policy-relevant question addressed in this paper is which performance indicator is best suited for nutrient management.
FertP, SoilB, and FarmB are approaches of agri-environmental policies whose results are indicators for multiple purposes, e.g. monitoring or control (Klages et al 2020a). Fertilization planning and nutrient balancing inherently provide diverging views on the fertilization process: fertilization planning is performed ex ante in order to limit N excesses through timely and needs-based application, while nutrient balancing of N inputs and outputs is performed ex post. Nutrient balancing provides information about (a) production efficiency, e.g. at field (=UAA) level (SoilB) or farm level (FarmB), (b) environmental pressure (OECD 2013), and (c) links between nutrient use in agriculture, nutrient losses to the environment, and sustainable soil nutrient usage (Eurostat 2017).

Soil surface balance (SoilB)
SoilB, a net soil surface balance related to accountable N inflow of the applied fertilizer, was a legally binding approach for German farmers until 2020. For calculation of SoilB, standard values had to be used, e.g. excretion factors for N and phosphates depending on animal category, development stage, and feed composition, and nutrient concentrations in harvested crops. Individual estimates were also accepted, e.g. for nutrient concentrations in roughages in compliance with minimum values. N removal by harvested or grazed roughage crops was estimated considering animal category, development stage, type of husbandry, and animal numbers. Substantial losses (15% for field forage, 25% for grassland) could be deducted from the calculated N removal by roughage. Also, manure- and digestate-specific gaseous N losses were deducted from standard excretion figures for volatilization in housing and storage (15%-45%), during field application (5%-10%), and on pasture (75%), depending on animal species. Since these are maximum factors, lower ones can be applied where better application or aeration technology is used (DüV 2017, Häußermann et al 2020). An additional N input, biological N fixation (BNF), was mostly derived from tables as a function of the leguminous species cultivated on arable land or their proportion in grassland (equation (1)). The target for net SoilB was an N surplus ⩽50 kg N ha−1 as a 3 year mean (DüV 2017). The N surplus in SoilB was taken as an agri-environmental indicator of potential N emissions and a 'pressure indicator' of potential nutrient losses to the environment, i.e. it indicated the potential threat of reactive N compounds to the different environmental media. Nutrient surpluses can lead to eutrophication and surface and groundwater pollution (Leip et al 2015, SRU 2015, Jansson et al 2019), while nutrient deficiencies can decrease soil fertility and increase erosion (Eurostat 2020b). Indicators such as field N use efficiency (NUE) can be derived from SoilB (Löw et al 2020).
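The structure of a net soil surface balance as described above can be illustrated with a minimal sketch. This is not the official DüV calculation: the function names, argument names, and example figures are hypothetical, a single pair of loss factors stands in for the species-specific maximum factors, and the stepwise deduction of housing/storage and application losses is an assumption.

```python
def net_soil_surface_balance(mineral_n, excreted_n, bnf_n, removal_n,
                             housing_storage_loss=0.15, application_loss=0.05):
    """Return the net N surplus in kg N per ha (all N arguments in kg N per ha)."""
    # Assumption: gaseous losses are deducted stepwise from excreted N,
    # first in housing and storage, then during field application.
    manure_n_to_soil = (excreted_n
                        * (1.0 - housing_storage_loss)
                        * (1.0 - application_loss))
    # Net inputs to the soil surface minus N removed with harvested crops.
    return mineral_n + manure_n_to_soil + bnf_n - removal_n


def meets_soilb_target(annual_surpluses, target=50.0):
    """Check the mean of the annual surpluses (e.g. 3 years) against the target."""
    return sum(annual_surpluses) / len(annual_surpluses) <= target


# Hypothetical farm: values chosen for illustration only.
surplus = net_soil_surface_balance(mineral_n=100, excreted_n=80, bnf_n=5, removal_n=130)
print(round(surplus, 1))
print(meets_soilb_target([40, 55, 52]))
```

Because the loss factors are deducted before the balance is drawn, the result is a net value; lowering the factors (e.g. for better application technology) raises the accountable manure N and thus the calculated surplus.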

Farm-gate balance (FarmB)
The aim is to meet the national sustainability target of a maximum surplus of 70 kg N ha−1 by 2030 (The Federal Government 2020). In FarmB, nutrient accounting is based on invoices, delivery notes, and product declarations for nutrients (e.g. mineral fertilizers, feedstuffs) or standard values (e.g. nutrient content of animal products, excretion factors). All products containing N or phosphates that enter the farm from external sources are considered 'inputs' and all products containing N or phosphates that leave the farm are considered 'outputs'. Gaseous losses are not considered, as FarmB is a gross calculation. BNF on arable land is considered an input, whereas atmospheric N deposition is not directly considered (equation (2)). The current gross farm-gate balance threshold is an N surplus ⩽175 kg N ha−1 as a 3 year mean. Alternatively, a farm-individual maximum N surplus can be calculated (StoffBilV 2017), which corresponds to the SoilB threshold (Klages et al 2017). In addition, loss factors are granted for organic fertilizers and roughage production, and these must be added to it. However, the impact is diminished by adding a 10% margin to the permitted maximum farm-individual balance value (StoffBilV 2017). FarmB aims to document nutrient flows on livestock farms in a transparent and comprehensible manner (BMEL 2019). The FarmB value is thus an indicator of the environmental pollution caused by N compounds, and is considered the most integrative and transparent indicator in nutrient management (Oenema et al 2003, Bach and Frede 2005, SRU 2015). Further indicators, e.g. farm NUE, may also be derived from FarmB (Löw et al 2020).
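A gross farm-gate balance of this kind can be sketched as follows. The sketch is illustrative only and is not the StoffBilV procedure: the function names, the dictionary keys, and the example quantities are hypothetical, and the farm-individual threshold with its loss factors and 10% margin is not modelled.

```python
from statistics import mean


def gross_farm_gate_balance(inputs_kg_n, outputs_kg_n, uaa_ha):
    """Return the gross N surplus in kg N per ha of UAA.

    inputs_kg_n / outputs_kg_n map a flow name to kg N per year entering or
    leaving the farm; gaseous losses are deliberately not deducted.
    """
    return (sum(inputs_kg_n.values()) - sum(outputs_kg_n.values())) / uaa_ha


def exceeds_farmb_threshold(annual_balances, threshold=175.0):
    """Compare the mean of the annual balances (e.g. 3 years) with the threshold."""
    return mean(annual_balances) > threshold


# Hypothetical 100 ha farm: flows chosen for illustration only.
inputs = {"mineral_fertilizer": 8000, "purchased_feed": 6000, "bnf": 500}
outputs = {"marketed_crops": 5000, "milk": 2500, "animals_sold": 500}
balance = gross_farm_gate_balance(inputs, outputs, uaa_ha=100)
print(balance)
print(exceeds_farmb_threshold([balance, 180.0, 60.0]))
```

Only flows that cross the farm gate appear in the dictionaries, which is why farm-internal manure and forage do not need to be quantified in this approach.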

Fertilization planning (FertP)
FertP is a mandatory, site-specific tool based on crop-specific nutrient demand values and nutrient availability from soil and previous crops. Depending on the farm-specific (quantitative and qualitative) yield potential for the preceding 5 year period, individual on-farm nutrient demand can differ from the standard value. Actual fertilization demand is reduced by standard values representing the nutrient supply from soil, due to soil type or previous organic fertilization, and by soil analysis results for plant-available N in spring. An overview of the exact methodology can be found in equation (3) and in section 2.1. The resulting fertilization demand for a growing season can be met by organic, organo-mineral, and mineral fertilizers, but must not be exceeded. Thus, FertP establishes a farm-specific maximum total N application. However, this requires knowledge of the nutrient concentration in the applied fertilizers. For manure or digestate, standard values from DüV can be used as an alternative to laboratory test results (DüV 2020). In contrast to SoilB, additional deductions (=minimum effectiveness) are calculated for N from the organic fertilizers applied.
Additions = ∑ (yield difference^a,b, covering^a, difference in raw protein^b).
Deductions = ∑ (yield difference^a,b, N available in the soil^a, N residual from organic fertilizers in previous years^a,b, N residual from soil reserve^a,b, N residual from BNF^b, previous crops^a, difference in raw protein^b).
Under the Nitrates Directive, fertilization planning is the main control approach for limiting N inputs to EU farms. Nutrient balancing is currently only mandatory in Switzerland, Romania, and partly in Germany, but fertilization planning must be recorded in all EU countries (Klages et al 2020a). In the Netherlands, N balances are drawn up at the farm-gate level of dairy farms. This is not mandatory but an agreement between the milk processing industry and producers, requiring digital reporting of N balances ('ANCA tool') as a precondition for market access to the national milk processing industry (Aarts et

Research gap and objectives
In previous studies, Farm Accountancy Data Network (FADN) data have been used to generate farm-gate N balances for certain regions in EU Member States, e.g. in Flanders, Belgium (Nevens et al 2006), and in Hesse and Baden-Württemberg, Germany (Gamer and Zeddies 2006, Bach 2013, StickstoffBW 2015). In these studies, mineral fertilizer quantities were derived from fertilizer costs using N-coefficients, as documentation of detailed quantities only began in 2016/17. The present study is unique in using (a) exact data on the quantities of mineral fertilizers applied and (b) a broad spectrum of German farm types to (c) qualitatively and quantitatively compare and evaluate three important past and future agri-environmental performance indicators embedded in German regulatory law.
The overall aim was to show the systematics and identify similarities and differences in the three performance indicators for farm nutrient management (table 1), qualitatively with regard to robustness and integrity², and quantitatively with regard to the maximum permitted N fertilizer application rate. Accurate nutrient balances and fertilization plans were generated, primarily based on farm accounting records, and used to estimate discrepancies between actual application rates of N fertilizers and the maximum permitted rates for the farm type according to the regulatory performance indicator. Thus, (a) the strictness of the approaches was evaluated and (b) reduction requirements in fertilizer use or scope for action on different farm types were identified, (c) taking into consideration data uncertainty and the reliability of the different parameters of the approaches. The hypotheses tested were:
Hypothesis 1 (H1). Different performance indicators can be used to establish restrictions on fertilizer inputs that lead to comparable results in theory.
Hypothesis 2 (H2). The design of the requirements based on the different indicators leads to differing impacts in practice.
Hypothesis 3 (H3). The underlying data and assumptions used to compute the indicators lead to systematic differences between the three indicator-based regulatory approaches in terms of data uncertainty and reliability.
Hypothesis 4 (H4). The conclusions on hypotheses H1 to H3 vary according to farm type.

Data
We used FADN data covering approximately 10 000 farms in Germany, representing different farm types and regions with comprehensive structural and financial data. Prior to analysis, an accuracy check on all data was made using a plausibility program provided by the Federal Ministry of Food and Agriculture (BMEL). For details, see BMEL (2018a, 2018b).
² Integrity in this context means incorruptibility and accuracy of an approach according to guidance in the Paris Agreement (UNFCCC 2016).
From the data, six farm types were identified using the EU/BMEL farm typology based on financial outputs: (a) arable farms, (b) dairy farms, (c) other cattle and grazing livestock farms, (d) mixed production systems, (e) pig and poultry farms, (f) permanent crop farms. For pig and poultry farms, only agricultural farms with UAA are listed, and not industrial farms. The key data for the farm types were weighted using type-specific extrapolation factors, to ensure consistency with sectoral totals (Hansen et al 2009, Haß et al 2020). These factors, derived from the national farm survey (DESTATIS 2017), were stratified using farm size, financial output, and farm type, to reduce the standard error in the results. SoilB, FarmB, and FertP were calculated to determine the permitted N fertilizer input for the financial year 2018/2019³, which may be applied either as mineral fertilizer or as plant-available N in organic fertilizers. Farms with animals may reach the limit with organic fertilizers alone.
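Weighting farm-level results with extrapolation factors can be sketched as below. The exact estimator used in the study is not stated, so the weighted mean and the population-style weighted standard deviation shown here are assumptions, and the function name and example values are hypothetical.

```python
import math


def weighted_mean_and_sd(values, weights):
    """Return (weighted mean, weighted standard deviation).

    values:  farm-level results, e.g. N surplus in kg N per ha
    weights: extrapolation factors, i.e. how many farms each sample farm represents
    """
    total_weight = sum(weights)
    w_mean = sum(v * w for v, w in zip(values, weights)) / total_weight
    # Assumption: population-style weighted variance (no small-sample correction).
    w_var = sum(w * (v - w_mean) ** 2 for v, w in zip(values, weights)) / total_weight
    return w_mean, math.sqrt(w_var)


# Hypothetical sample of three farms of one farm type.
w_mean, w_sd = weighted_mean_and_sd([60.0, 100.0, 140.0], [10.0, 20.0, 10.0])
print(w_mean, round(w_sd, 2))
```

A sample farm with a large extrapolation factor pulls the group mean towards its own value, which is what makes the sample consistent with sectoral totals.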
Since the yield of the previous 5 years is considered for FertP, only farms with long-term representation in FADN were included (n = 6112). The required parameters are approach-specific and differ in terms of data reliability. FarmB and SoilB use similar data, but the system boundaries (farm-gate or soil surface) differ. FertP takes a different logical approach to the data, but the parameters used are also quite similar to those of FarmB and SoilB. Assessment of parameter- and approach-specific data reliability, focusing on data origin, revealed differences (table 2). For further details on the source of data and the implementation based on German FADN see table A1 (available online at stacks.iop.org/ERL/16/055009/mmedia).

Statistical analysis
For explorative data analysis, the corresponding functions in Microsoft Excel Professional Plus 2010 were used. Mean and standard deviation for the different farm types were calculated using the commercial statistics software SAS 9.4.

Results
The impact of the respective performance indicators (as kg N ha−1) was calculated. Farm type-specific exceedance of the maximum surplus of 50 kg N ha−1 for SoilB, and of the maximum surplus of 175 kg N ha−1 and the farm-individual maximum surplus for FarmB, as legally binding thresholds, was then identified (DüV 2017, StoffBilV 2017). We also compared the amount of N applied above the N fertilizer requirement according to FertP, where the threshold value is the balance between foreseeable N requirements of the crops and the N supply to crops from soil and fertilizers. Land application of N fertilizer must not exceed the calculated N fertilizer requirement, which can be interpreted as a threshold level of zero. Figure 1 shows the results of FarmB and SoilB for representative farms in the FADN network, as boxplots (10th to 90th percentile). For FarmB, pig and poultry farms (119 kg N ha−1; mean surplus) and dairy farms (95 kg N ha−1) had considerably higher N surpluses than the other farm types. For SoilB, only pig and poultry farms (76 kg N ha−1) showed distinctly higher N surpluses. Both approaches revealed large variations in N surplus, even within farm type. However, due to the high gross N surplus in FarmB, the N input-limiting effects of the two approaches differed considerably. Figure 2 shows N applied compared with N permitted according to FertP, where values above the 0-line indicate exceedance of the permitted N input threshold and values below it show that the permitted N input is not fully utilized.
³ The majority of farms (95%) base their accounts on the fiscal year.
The pattern for FertP and SoilB was generally similar (figure 2). For dairy farms (30%, 12 kg N ha−1; share of affected farms of this type, mean N reduction requirement related to total farm area of the farm group), permanent crop farms (30%, 10 kg N ha−1), and arable farms (16%, 6 kg N ha−1), FertP was the most limiting approach. SoilB was most demanding for pig and poultry farms (68%, 35 kg N ha−1), mixed production systems (28%, 11 kg N ha−1), and other cattle and grazing livestock farms (18%, 8 kg N ha−1) (figure 2). On average for the agricultural sector, 27% of the farms were affected, with a mean reduction requirement of 10 kg N ha−1, for FertP, whereas the corresponding values were 23% and 9 kg N ha−1 for SoilB.
The current threshold in FarmB (175 kg N ha−1 surplus) was not demanding for most of the farm types. Only pig and poultry farms showed considerable surpluses greater than 175 kg N ha−1 (21%, 12 kg N ha−1) (figure 3). On average for the agricultural sector, 6% of farms were affected by the FarmB threshold, with a mean reduction requirement of 2 kg N ha−1. However, the farm-individual determination of the maximum N surplus gave reduction requirements comparable to those of FertP and SoilB. On average for the agricultural sector, 30% of the farms were affected by the FarmB farm-individual threshold, with a mean reduction requirement of 15 kg N ha−1. Pig and poultry farms (64%, 41 kg N ha−1) were most affected, followed by dairy farms (34%, 19 kg N ha−1) and other cattle and grazing livestock farms (25%, 12 kg N ha−1). Arable farms (17%, 5 kg N ha−1) and permanent crop farms (8%, 4 kg N ha−1) were least affected. For details of all N reduction requirements, see table A2.
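The two figures reported per farm type (share of affected farms, and mean N reduction requirement related to the total farm area of the group) can be computed along the following lines. This is a simplified sketch: the function name and example farms are hypothetical, and the extrapolation factors used in the study are omitted for brevity.

```python
def reduction_summary(farms, threshold):
    """Summarize the effect of a threshold on a group of farms.

    farms: list of (value_kg_n_per_ha, uaa_ha) tuples, where the value is the
           indicator result (e.g. N surplus) of one farm.
    Returns (share of affected farms, mean reduction requirement in kg N per ha
    related to the total area of the whole group).
    """
    affected = [(value, area) for value, area in farms if value > threshold]
    share_affected = len(affected) / len(farms)
    # Total kg N that affected farms would have to cut to reach the threshold.
    excess_kg_n = sum((value - threshold) * area for value, area in affected)
    total_area = sum(area for _, area in farms)
    return share_affected, excess_kg_n / total_area


# Hypothetical farm group, checked against a surplus threshold of 50 kg N per ha.
group = [(40.0, 100.0), (70.0, 50.0), (90.0, 50.0), (30.0, 200.0)]
share, mean_reduction = reduction_summary(group, threshold=50.0)
print(share, mean_reduction)
```

Relating the excess to the area of the whole group, rather than only to affected farms, explains why mean reduction requirements stay small even when individual farms exceed a threshold by a wide margin.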
The maximum N fertilizer input permitted by FarmB, SoilB, and FertP varied greatly between farm types, and its distribution differed among farm types. Farms with livestock were generally more affected by statutory thresholds. Regarding the approach-specific need for reduction, impacts of the three approaches on the permitted N input showed strong similarities, especially for SoilB and FertP. Unsurprisingly, the generalized FarmB threshold level of 175 kg N ha−1 was not restrictive for most farms, but the farm-individual threshold gave more restrictive results, especially for farms with livestock. Table 3 shows selected parameters used in calculation of the three performance indicators and an assessment of associated data uncertainties and reliability. Uncertainty is subject to the accuracy of determination of nutrient amounts, based on area, volumes of fertilizers and farm products, and specific nutrient contents. By defining the uncertainty margin based on legal requirements on the accuracy of declared nutrient contents (Klages et al 2017), the effects on the respective performance indicator were revealed. The baseline scenario for FarmB, SoilB, and FertP shows aggregated average values for the agricultural sector calculated based on German FADN data.
Table notes: Atmospheric N deposition is reported in an appendix to the FarmB according to StoffBilV (2017), but is not part of the calculated balance. (a) Data reliability score: high, e.g. receipt-based = 1; medium = 0; low, e.g. self-reported by farmers and hard to verify = −1. (b) Gaseous losses due to N emissions from volatilization in animal housing and manure storage, manure application to the land, and total N emissions from animal excretion on pasture. n.a. = not applicable, not an element of the respective indicator.

Discussion
In the following, parameters required for calculating FarmB, SoilB, and FertP and the respective data uncertainty and reliability are compared. First, elements common to three or two of the approaches are presented, followed by elements specific to a single approach. Consequences for the different performance indicators are then discussed.

Elements of all three approaches
Data on UAA, area of cultivated crops, and number of livestock are required for all approaches, in order to calculate related nutrient amounts, check the data for plausibility, or relate the result to cultivated area.

Mineral N fertilizer
This input shows high data certainty, deriving from defined nutrient contents for mineral fertilizer. Receipt-based reporting of fertilizer purchases provides high reliability for this key element of N input. An important condition for verification is the control of enterprises involved in fertilizer sales.

Import and export of manure and digestate
For imported and exported manure and digestate, data uncertainty is relatively high. Data reliability is also limited, because classification of livestock categories and manure sampling are performed by farmers. For FarmB, exports and imports are part of the balance. For SoilB and FertP, exports are deducted from the amount of farm-internal manure and digestate, and imports are added to the remaining internal amount. While exporting farms might be keen to declare high amounts of nutrient exports, importing farms are reluctant to accept more declared nutrients than they actually receive. These opposing interests help to control the consistency of declarations. A precondition is the inclusion of all farms in nutrient accounting, including livestock farms and biogas facilities with no farmland. The latter are only addressed by FarmB (StoffBilV 2017). Organic fertilizers imported from other sectors, such as compost and sewage sludge, play a minor role in total N balances and are not represented in table 3. These fertilizers are regularly analyzed before export to farms.

Biological N fixation
While data on the area of legume crops are quite exact and reliable, yield-dependent rates of BNF vary, so the data are more variable and less reliable. For grassland and mixed green forages, the need to determine both the proportion of leguminous plants such as clover and the yield increases uncertainty and reduces data reliability. The different ways of assessing the amount of BNF in the three approaches lead to additional variation.

Yield of marketed crops
Marketed crops are reported on the basis of receipts, so that volume and commodity type are defined. Therefore, data certainty and reliability are relatively high. If protein content is not reported, standard values for N content must be used. Problems may arise if the commodity type is not sufficiently specific for attributing the correct nutrient content. In FarmB, marketed crops are the most important element of N exports. In SoilB, yields are normally not differentiated into marketed crops and those used as fodder and forage. In FertP, yields are the basis for deriving crop- and yield-specific N demand. Information on yields of marketed crops can help to check yield data in SoilB and FertP for plausibility.

Manure for internal farm use
For this element of SoilB and FertP, the same constraints apply as for manure and digestate imports and exports (see above). On intensive livestock farms, N amounts in manure are high, and thus the uncertainty of SoilB and FertP is also high. The most common method for determining the amount of nutrients is calculation based on standard factors. As there is no receipt-based accounting and no mutual control between farms, as is the case for export and import of manure, declared amounts of nutrients in on-farm animal excretions and digestates may be even less reliable than those for traded manure.

Yields of fodder and forage crops for internal farm use
Fodder and forage are important N exports in SoilB of livestock farms, and an important basis for calculation of N demand in FertP. Crop area is reliably declared, but yields are difficult to quantify and vary widely, especially for forage crops and grassland. As internal flows, amounts are not documented by receipts, and even at farm level exact information is difficult to obtain. For dairy and cattle farms with high amounts of farm-internal production and use of forage, nutrient uptake by forage crops is regularly overestimated. Analysis of SoilB data within the WAgriCo-Project showed that higher proportions of maize and grassland in total farm land lead to high N removals in SoilB, which is not plausible in relation to the livestock herd and its N excretions (The WAgriCo Project 2008). In farm groups with forage production, N removal estimated in SoilB was up to 28 kg N ha−1 above an improved estimate of forage production. Quantitative estimation of forage produced and forage losses clearly results in high uncertainty, making it difficult to assess the actual amount of forage produced and used on-farm. Thus, SoilB can be considered non-robust because of the estimation of yields, especially for roughage (Baumgärtel et al 2007). An evaluation of DüV 2007 in 2012 showed overestimation of forage amounts by on average 40 kg N ha−1 on around 10 000 dairy and cattle farms (DüV 2007, Wendland et al 2012). Consequently, a requirement for verification of nutrient uptake by forage crops through cross-checking with the forage needs of the farm's animal herd was introduced in 2017 (DüV 2017). However, forage losses of 15% for field crops such as green maize and 25% for grassland were allowed, moderating the restrictions resulting from stricter nutrient balancing for dairy and cattle farms. A large proportion of these forage losses occur on the cultivated area and thus do not represent a nutrient export. Furthermore, off-site forage losses in storage and housing are collected and usually returned to the land (Klages et al 2017).

Fodder imported
Purchased fodder and forage is documented through receipts, so volume and commodity type are determined accurately and reliably. However, nutrient content may vary, especially in forage. If protein content is not reported, standard values must be used, which adds uncertainty.

Seeds and plant material
This element is of minor importance for the total balance. Input can either be documented by receipts or estimated based on the area of cultivated crops and standard values.

Livestock and animal products
Import and export of living livestock, animal losses, and export of animal products such as milk and eggs are reliably reported by receipts, from which number or volume and commodity type are known. For livestock, data on specific weight are sometimes lacking, so weight categories must be applied, adding uncertainty. Animal products sold are normally well documented, for milk including regular testing of protein content. In all other cases, standard values for N content in livestock and products are used for calculating the total amounts, which are comparatively certain and reliable.

Crop-and yield-specific N demand values
Setting specific nutrient demand values is crucial, as these values differ at national (Taube 2018) and European level (Nicholson et al 2018, Klages et al 2020a). In Germany, higher demand values than in the previous regulatory framework at regional level now apply (DüV 2020). Experts claim that the demand values used for FertP are too high, which might lead to systematic overfertilization in some cases (Taube 2018). The differences are not always apparent, due to different methodologies, and should be further evaluated.

Yields of marketed crops and fodder and forage crops
Consideration of previous 5 year yield for FertP is another crucial issue. An unwarranted increase in farm yield, which is difficult to monitor, could lead to upward adjustment of the calculated N requirement, creating corruptibility that may undermine the integrity of the approach. Problems of data uncertainty and reliability mentioned above for fodder and forage crops also have to be considered.

N supply from manure application in previous year
From this N amount, 10% is considered in FertP. Data certainty and reliability are as for manure for internal farm use. FertP also considers different kinds of N supply from soil:

Plant-available soil N in spring (Nmin)
The amount of plant-available N at the beginning of the growing season (Nmin value), usually determined for 0-90 cm soil depth (less for some vegetable species), is fully considered in FertP. The magnitude of the Nmin value depends strongly on external factors such as location, weather, and sampling season, as well as on sampling method and transport to the laboratory. The Nmin value is often taken from officially published charts, although farm-specific measurements should be preferred. Due to the high spatial variability found in many studies, sampling is difficult and the results are questionable (Baumgärtel 1993, Stenger et al 1996, Lorenz 2004).

Humus content
N mineralization in soil is considered using a few categories of soil humus content applied by farmers. Testing for soil organic carbon content is not mandatory. Thus, data uncertainty is high and reliability limited.

Previous crops, catch crops
N deriving from previous crops, such as legumes and catch crops, is included in FertP using simple standard values. Calculated values based on crop area are accurate and reliable, but might not depict the real N provision by previous crops, which also depends on yields and soil management. For catch crops, the differentiation between harvested and unharvested areas is difficult to verify, so data reliability is more limited.

Crop residues
Crop residues of vegetables are considered in a similar way to residues of previous arable crops.
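How the FertP elements discussed in this section combine into a maximum N application can be sketched as follows. The sketch only mirrors the general logic (demand value plus additions minus deductions, with applied N compared against a threshold of zero exceedance); the function names, dictionary keys, and all numbers are hypothetical and are not DüV standard values.

```python
def n_fertilizer_requirement(demand_value, additions, deductions):
    """Return the N fertilizer requirement in kg N per ha.

    demand_value: crop-specific N demand value
    additions / deductions: dicts mapping an element name to kg N per ha
    """
    return demand_value + sum(additions.values()) - sum(deductions.values())


def n_exceedance(applied_n, requirement):
    """Positive values mean that more N was applied than FertP permits."""
    return applied_n - requirement


# Hypothetical field: 10% of the previous year's manure N counts as a deduction.
previous_year_manure_n = 120.0
deductions = {
    "n_min_spring": 50.0,
    "previous_year_manure": 0.10 * previous_year_manure_n,
    "previous_crop": 10.0,
}
additions = {"yield_difference": 10.0}
requirement = n_fertilizer_requirement(230.0, additions, deductions)
print(round(requirement, 1), round(n_exceedance(180.0, requirement), 1))
```

Every deduction that is self-reported or taken from charts (e.g. the Nmin value) shifts the permitted application one-to-one, which is why the reliability of these elements matters for the approach as a whole.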

Consequences of data uncertainty and limited reliability for the three performance indicators
In the following, the consequences of data uncertainty and limited reliability are discussed, focusing on the potential for improvement of single parameters in order to maximize the benefit of the indicators for future use.
The limited certainty and reliability of farm-internal flows, primarily manure from farm livestock and fodder and forage produced and used on-farm, strongly reduce the certainty and reliability of SoilB and FertP. Factors for N losses and plant availability in manure (manure N efficiency) are individually adjustable (Klages et al 2020a). The amount of farm-internal fodder and forage is not strictly recorded and is difficult to verify. Farmers themselves often do not have exact measurements of these amounts, especially in the case of forage production. Thus, estimates are used. However, since SoilB, which was part of DüV 2017, has been abolished, standard data for calculating forage intake by ruminant animals and horses are no longer available (DüV 2017, 2020). Consequently, FertP also lacks a legal basis for improved, plausible estimation of forage yields. For FertP, plant-available soil N in spring (Nmin value) as part of the soil- and crop rotation-specific N supply increases uncertainty and reduces reliability. Mandatory sampling at higher frequency and in higher numbers on each parcel could contribute to higher certainty about mineralized N amounts in FertP. However, to increase the reliability score, the samples should be taken by independent experts. Further, the calculation factors for N requirements of crops have been criticized as presumably overestimated (Taube 2018). Additionally, the N fertilization requirements determined in FertP may be exceeded through exemptions (poor plant development, adverse weather conditions), although not by more than 10% of the permitted fertilizer N input (DüV 2020). Other important elements used in SoilB and for calculation of FertP, i.e. mineral N fertilizer input and yields of marketed crops, are quite certain, reliable, and verifiable on the basis of receipts.
Overall, data uncertainty is high, expressed as estimated minimum and maximum deviation in calculated values of total inputs, outputs, and the balance value for SoilB, and in N supply and input, N demand, and the difference between N inputs and fertilization requirements in FertP. For the sectoral average calculation shown in table 3, the estimated maximum deviations cumulate to 31 kg N ha⁻¹ for SoilB and to 40 kg N ha⁻¹ for FertP. The data reliability score, which takes values between −1 and +1, is 0.3 for SoilB and 0.2 for FertP. For performance indicators used in regulations, these results appear unsatisfactory.
FarmB relies mainly on receipt-based flows of mineral N fertilizer input, purchased fodder and forage, yields of marketed crops, and livestock and animal products. Elements such as BNF by leguminous crops and import/export of manure are difficult to determine, thus contributing to uncertainty and limited reliability. However, these are elements of all three approaches. In table 3, the estimated maximum deviations for the sectoral average FarmB cumulate to 11 kg N ha⁻¹ and the data reliability score is 0.8. FarmB achieves these superior results as a performance indicator because farm-internal N flows are not included in the calculations, which avoids the uncertainties and lack of reliability associated with these flows. FarmB therefore shows added value in nutrient management and appears most appropriate as a performance indicator for regulations.
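The structure of a gross farm-gate balance built from the flows named above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the calculation prescribed in StoffBilV: the class and field names are hypothetical, only the flows mentioned in this section are included, and the example quantities are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FarmGateFlows:
    """Annual N flows crossing the farm gate, in kg N (field names are illustrative)."""
    area_ha: float
    mineral_fertilizer: float    # receipt-based input
    purchased_feed: float        # receipt-based input (fodder and forage)
    manure_import: float         # input, harder to determine
    bnf: float                   # biological N fixation, estimated
    crops_sold: float            # receipt-based output
    animal_products_sold: float  # receipt-based output (livestock and animal products)
    manure_export: float         # output, harder to determine


def farm_gate_balance(flows: FarmGateFlows) -> float:
    """Gross farm-gate N surplus in kg N per ha: total inputs minus total outputs."""
    inputs = (flows.mineral_fertilizer + flows.purchased_feed
              + flows.manure_import + flows.bnf)
    outputs = (flows.crops_sold + flows.animal_products_sold
               + flows.manure_export)
    return (inputs - outputs) / flows.area_ha


# Hypothetical 100 ha farm; all quantities are invented for illustration.
example = FarmGateFlows(
    area_ha=100.0,
    mineral_fertilizer=12000.0,
    purchased_feed=6000.0,
    manure_import=0.0,
    bnf=1000.0,
    crops_sold=7000.0,
    animal_products_sold=3000.0,
    manure_export=1000.0,
)
balance = farm_gate_balance(example)
print(f"Farm-gate surplus: {balance:.1f} kg N/ha")
```

Farm-internal flows such as home-grown forage and manure applied on the farm's own fields do not appear in this calculation, which is why their uncertainty does not propagate into the balance value.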
In this context, SRU et al (2013) argued that the regulatory approach to nutrient balancing should be applied at farm-gate level, as SoilB offers great scope for inaccuracies and even manipulation. Becker and Beisecker (2017) noted that gross nutrient balances would be simpler to compile, fairer, and more comprehensible. The unique advantage of FarmB is that it is largely based on farm accounting data, so it can provide objective, standardized results (Wüstholz and Bahrs 2013). This provides high robustness, high transparency, and low manipulability (Scheck and Haakh 2008, SRU et al 2013, Becker and Beisecker 2017). The greater controllability of the information reported could improve enforcement by control authorities (SRU et al 2013). Thus, SRU (2015) strongly recommends a gross approach that makes total on-farm N flows visible to farmers and does not deduct environmentally relevant ammonia losses a priori.
FarmB is recognized as a sound approach by scientific, consulting, and official institutions, but it can be improved to make the indicator values more robust and to establish it as a mainstay of nutrient management and mandatory regulation (see also Klages et al (2017)), through:
• Improved declaration and standardized documentation of nutrient contents in traded fertilizers, feed, and forage, with trade registers for these commodities.
• Uniform documentation of quantities and qualities of traded manure, including small quantities (e.g. in a manure trade register).
• Improved methodology for estimation of BNF.
• A uniform and comprehensible evaluation tool for meaningful mandatory regulation.
• Sanctions for exceeding the maximum balance values.
• Enabling control authorities to also monitor non-agricultural actors in the fertilizer and fodder trade, in order to verify nutrient flows of purchased farm inputs.
In Germany, an extension of a mandatory FarmB to all farms >20 hectares or with >50 livestock units is envisaged by 2023 at the latest (StoffBilV 2017, The Federal Government 2019). As 45% of farms cultivated less than 20 hectares in 2016, representing only 7% of German agricultural area (DESTATIS 2019), the decision to keep small farms outside the scope of FarmB helps to avoid bureaucratic burden. In order to close loopholes in this regulation, small farms importing manure from larger farms are now also obliged to establish a FarmB. Currently, exceeding the determined N fertilizer demand may entail a fine of up to €50 000, while an excessive FarmB surplus may result in an order to undergo consultation within 6 months. However, the reporting of FarmB is not part of Cross Compliance, as it is not based on EU legislation.
This study clearly showed that the current uniform threshold of 175 kg N ha⁻¹ in FarmB in Germany (StoffBilV 2017) is no challenge for most farms (figures 1 and 3), and will therefore not contribute to an increase in NUE. In fact, the introduced FarmB concept has to be seen as a first step. This undemanding threshold was presumably set in order to get farms used to the novel procedure before scaling up, and is therefore politically rather than scientifically legitimized. As an alternative, StoffBilV (2017) offers farmers the option of a farm-individual threshold value. These target values lead to higher adaptation needs on livestock farms. The German Climate Action Program 2030 requires FarmB to be made obligatory for most farms, combined with a step-by-step alignment of the national FarmB with the target value of the sustainability strategy (70 kg N ha⁻¹) in 2030 (The Federal Government 2019). Concepts already exist for gradually reducing the FarmB threshold, for example by means of a staggered reduction from 120 to 90 kg N ha⁻¹ depending on the amount of organic fertilizers produced, as published in a project on behalf of the German Federal Environment Agency (Taube et al 2020). A reduction of FarmB and, thus, of N emissions in general is achievable by reducing inputs or increasing outputs. N inputs may be reduced by N-reduced feeding or by lower fertilizer inputs, which is possible, e.g. through higher manure N efficiencies due to management (e.g. splitting of fertilization) or technical options such as NIRS (Millmier et al 2000, Huang et al 2007), injected application (Webb et al 2013, Mencaroni et al 2021), or precision farming (Chmelíková et al 2021).
Outputs may be increased by higher manure exports or by production growth, although unit-related production levels for dairy, meat, and field crops are already very high in Western Europe and are more likely to face challenges related to climate change (Gauly et al 2013, Mauger et al 2015, Vollmann 2016). Experiences of practical application of FarmB stem from voluntary water protection initiatives based on intensive technical advice using FarmB for benchmarking purposes (Scheck and Haakh 2008, SRU et al 2013, Becker and Beisecker 2017). Scaling up FarmB as an element of mandatory rules at national level poses difficulties. In the Netherlands, FarmB was previously used in MINAS as the basis for regulation (Schröder and Neeteson 2008), but MINAS has been abolished due to difficulties, e.g. for farms trading manure in determining the N content of manure, or due to differences between soil types in the relationship between N surplus and nitrate concentration in groundwater (Oenema et al 1998). FarmB, as part of the 'ANCA tool', is currently a requirement for dairy farms in the Netherlands wishing to participate in the national dairy market (Aarts et al 2015). The focus is on good collaboration between government, research, and the dairy sector. Neighboring countries have difficulty adopting this approach due to greater heterogeneity of the dairy sector and the lack of a central database for (automatically) collecting and storing data (Oenema and Korevaar 2018). Calculation and reporting methods for FarmB are currently being revised, and a new, uniform evaluation method for FarmB will replace the existing complex target system in Germany.
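How strongly the choice of threshold drives the adaptation need can be illustrated with a short Python sketch. The two threshold values are those quoted above (175 kg N ha⁻¹ from StoffBilV 2017 and the 70 kg N ha⁻¹ target of the sustainability strategy). Applying the national target to individual farms is an assumption made here purely for illustration, and the farm surpluses are hypothetical values, not results of this study.

```python
def required_reduction(surplus: float, threshold: float) -> float:
    """Reduction (kg N/ha) needed to bring a FarmB surplus down to the threshold.

    Returns 0.0 for farms that already comply.
    """
    return max(0.0, surplus - threshold)


# Threshold values quoted in the text, in kg N/ha.
THRESHOLDS = {
    "StoffBilV 2017 uniform threshold": 175.0,
    "Sustainability strategy target for 2030": 70.0,
}

# Hypothetical farm-gate surpluses in kg N/ha (invented for illustration).
farms = {"arable": 60.0, "mixed": 110.0, "livestock": 190.0}

reductions = {
    label: {farm: required_reduction(surplus, limit)
            for farm, surplus in farms.items()}
    for label, limit in THRESHOLDS.items()
}

for label, per_farm in reductions.items():
    affected = sum(1 for value in per_farm.values() if value > 0)
    print(f"{label}: {affected} of {len(per_farm)} farms affected, {per_farm}")
```

With these invented surpluses, only the livestock farm is affected under the current threshold, whereas the stricter target would also require reductions on the mixed farm.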

Conclusions
We used three regulatory approaches (SoilB, FarmB, FertP) to calculate performance indicator values based on German FADN data, and compared the values against legally defined thresholds for limiting N fertilizer input. Impacts of requirements based on FertP (ex ante approach) coincided fairly well with those of SoilB, while impacts of requirements based on FarmB were low because this recently introduced approach has a less restrictive first step. A different evaluation system could increase the impacts of FarmB. The results confirmed H1, that different performance indicators can be used to establish restrictions on fertilizer inputs with comparable results. The current design of FarmB supports H2, that in practice the design of requirements leads to differing impacts. Assessment of data uncertainties and reliabilities and their consequences for the three performance indicators showed large differences between SoilB and FertP, which have low certainty and reliability because of their greater reliance on farm-internal data and flows, and FarmB, which has better data quality because of its largely receipt-based accounting. This confirmed H3, that underlying data and assumptions affect data uncertainty and reliability of the three indicator-based regulatory approaches. In addition, the farm type-specific quantitative analysis showed lower variances and consistent FarmB and SoilB values for arable farms, because their nutrient flows (fertilizers purchased, crops sold) are well documented and reliable. In contrast, livestock farms showed indicator values exceeding the thresholds and greater variances than other farm types, which supports H4. In a follow-up study, we aim to identify socio-economic variables and farm characteristics that explain efficient nutrient management.
In the context of the latest DüV amendment in 2020 and the abolition of SoilB, which followed the European Commission's concerns about the respective methodology and the resulting decision of the European Court of Justice (2018), FertP will be of particular importance in Germany as a key approach for nutrient management to meet the requirements of the EU Nitrates Directive. Germany thus follows a European trend, as nutrient balances are only obligatory in Switzerland and Romania (Klages et al 2020a). In its judgment, the European Court of Justice (2018) argues that the German SoilB allowed crop N requirements to be exceeded through the permitted N surplus. The underlying concept in FertP is an implicit threshold value of zero, as fertilizer inputs shall meet, but not exceed, plant needs. For SoilB, a threshold value of 50 kg N ha⁻¹ is defined, as it relates fertilizer inputs to nutrient removals in harvested crops, which according to DüV coefficients are lower than plant needs. Further, more of the total N from organic fertilizers is accounted for in SoilB than in FertP. Thus, although the two approaches appear to define different levels of ambition at first glance, the resulting restrictions are almost the same. In our analysis, we found similar restrictive impacts on N input levels for the two approaches.
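The argument that the two thresholds imply similar input limits can be made concrete with a simplified Python sketch. It assumes that the FertP limit equals the determined fertilization requirement (implicit threshold of zero) and that the SoilB limit equals N removal in harvested crops plus the permitted surplus of 50 kg N ha⁻¹. The function names and the requirement and removal values are hypothetical, and the differing accounting of organic fertilizer N in the two approaches is deliberately ignored.

```python
SOILB_THRESHOLD = 50.0  # permitted SoilB surplus in kg N/ha, as stated in the text


def max_input_fertp(fertilization_requirement: float) -> float:
    """FertP: inputs may meet but not exceed the requirement (threshold of zero)."""
    return fertilization_requirement


def max_input_soilb(n_removal: float) -> float:
    """SoilB: inputs may exceed N removal in harvested crops by the permitted surplus."""
    return n_removal + SOILB_THRESHOLD


# Hypothetical crop: the requirement is higher than the removal in harvested crops.
requirement = 180.0  # kg N/ha, invented for illustration
removal = 135.0      # kg N/ha, invented for illustration

limit_fertp = max_input_fertp(requirement)
limit_soilb = max_input_soilb(removal)
print(f"FertP limit: {limit_fertp:.0f} kg N/ha, SoilB limit: {limit_soilb:.0f} kg N/ha")
```

Because removal is lower than the requirement, the permitted SoilB surplus largely bridges the gap, so the two limits end up close to each other in this example.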
As nutrient balances are recommended as a key indicator of farm environmental performance, the abolition of SoilB cannot be considered clearly beneficial. However, FarmB has been introduced in German law and will be rolled out for most farmland. It requires evidence of inputs and outputs, and thus provides a more reliable basis for evaluating farm nutrient management and for tracing farm nutrient flows. In particular, FarmB allows better handling of the uncertainties about forage quantities on forage farms.
However, a discussion is ongoing in Germany on whether FertP as an obligatory performance indicator is sufficient and what FarmB will provide, apart from an additional bureaucratic burden. We argue that digital and receipt-based systematic documentation of nutrient flows along the value chain within FarmB can considerably improve data acquisition and reliability, and reduce data uncertainties. Cross-checking FertP with FarmB data can help improve data on internal farm flows, making interpretation of FertP more reliable. Through (AI-supported) analysis, SoilB can be generated from FertP and FarmB data, and anomalies, inefficiencies, and their causes can be detected and analyzed in time, improving information for farmers, enabling advisory services to be offered to specific target groups, and allowing control authorities to operate more efficiently. In order to understand what good N indicator performance is related to, additional socio-economic factors should be considered and benchmarking of farm NUE should be performed. Thus, further investigations are required to determine the scope for NUE improvements at farm level, to better understand the impacts of policy measures on nutrient management, and to ensure targeted action by control authorities.

Data availability statement
The data generated and/or analysed during the current study are not publicly available for legal/ethical reasons but are available from the corresponding author on reasonable request.