Letter of Intent for KM3NeT 2.0

The main objectives of the KM3NeT Collaboration are (i) the discovery and subsequent observation of high-energy neutrino sources in the Universe and (ii) the determination of the mass hierarchy of neutrinos. These objectives are strongly motivated by two recent important discoveries, namely: (1) the high-energy astrophysical neutrino signal reported by IceCube and (2) the sizable contribution of electron neutrinos to the third neutrino mass eigenstate as reported by Daya Bay, Reno and others. To meet these objectives, the KM3NeT Collaboration plans to build a new Research Infrastructure consisting of a network of deep-sea neutrino telescopes in the Mediterranean Sea. A phased and distributed implementation is pursued which maximises the access to regional funds, the availability of human resources and the synergistic opportunities for the Earth and sea sciences community. Three suitable deep-sea sites are selected, namely off-shore Toulon (France), Capo Passero (Sicily, Italy) and Pylos (Peloponnese, Greece). The infrastructure will consist of three so-called building blocks. A building block comprises 115 strings, each string comprises 18 optical modules and each optical module comprises 31 photo-multiplier tubes. Each building block thus constitutes a three-dimensional array of photo sensors that can be used to detect the Cherenkov light produced by relativistic particles emerging from neutrino interactions. Two building blocks will be sparsely configured to fully explore the IceCube signal with similar instrumented volume, different methodology, improved resolution and complementary field of view, including the galactic plane. One building block will be densely configured to precisely measure atmospheric neutrino oscillations.
Keywords: neutrino astronomy, neutrino physics, deep sea neutrino telescope, neutrino mass hierarchy

Executive summary
The main objectives of the KM3NeT Collaboration are (i) the discovery and subsequent observation of high-energy neutrino sources in the Universe and (ii) the determination of the mass hierarchy of neutrinos. These objectives are strongly motivated by two recent important discoveries, namely: (1) the high-energy astrophysical neutrino signal reported by IceCube and (2) the sizeable contribution of electron neutrinos to the third neutrino mass eigenstate as reported by Daya Bay, Reno and others. To meet these objectives, the KM3NeT Collaboration plans to build a new Research Infrastructure consisting of a network of deep-sea neutrino telescopes in the Mediterranean Sea. A phased and distributed implementation is pursued which maximises the access to regional funds, the availability of human resources and the synergistic opportunities for the Earth and sea sciences community. Three deep-sea sites are selected for the optical properties of the water, distance to shore and local infrastructure, namely off-shore Toulon (France), Capo Passero (Sicily, Italy) and Pylos (Peloponnese, Greece).
The infrastructure will consist of three so-called building blocks. A building block comprises 115 strings, each string comprises 18 optical modules and each optical module comprises 31 photo-multiplier tubes (PMTs). Each building block thus constitutes a three-dimensional array of photo sensors that can be used to detect the Cherenkov light produced by relativistic particles emerging from neutrino interactions. Two building blocks will be sparsely configured to fully explore the IceCube signal with comparable instrumented volume, different methodology, improved resolution and complementary field of view, including the galactic plane (GP). Collectively, these building blocks are referred to as ARCA: Astroparticle Research with Cosmics in the Abyss. One building block will be densely configured to precisely measure atmospheric neutrino oscillations. This building block is referred to as ORCA: Oscillation Research with Cosmics in the Abyss. ARCA will be realised at the Capo Passero site and ORCA at the Toulon site. Due to KM3NeT's flexible design, the technical implementation of ARCA and ORCA is almost identical. The deep-sea sites are linked to shore with a network of cables for electrical power and high-bandwidth data communication. On site, shore stations are equipped to provide power, computing and a high-bandwidth internet connection to the data repositories. The readout of the detectors is based on the 'all-data-to-shore' concept, pioneered in ANTARES. The overall design allows for a flexible and cost-effective implementation of the Research Infrastructure and its low-cost operation. The costs remaining to realise ARCA and ORCA amount to €95 M. The operational costs are estimated at about €2 M per year, equivalent to less than 2% of the total investment costs.
The whole project is organised in a single Collaboration with a central management and common data analysis and repository centres. A Memorandum of Understanding (MoU) for the first phase (Phase-1), covering the currently available budget of about €31 M, has been signed by the representatives of the corresponding funding agencies. During Phase-1, the technical design has been validated through in situ prototypes; data analysis tools have been developed; assembly sites for the production of optical modules and strings have been set up; and deployment and connection of strings in the deep sea are being optimised for speed and reliability. During the next phase (Phase-2.0), the Collaboration will complete the construction of ARCA and ORCA by 2020. The ultimate goal is to fully develop the KM3NeT Research Infrastructure to comprise a distributed installation at the three foreseen sites (Phase-3) and operate it for ten years or more. The phased implementation of the KM3NeT Research Infrastructure is summarised in table 1. The Collaboration aspires to establish a European Research Infrastructure Consortium (ERIC) hosted in The Netherlands. The first part of this document focuses on the technical design of the infrastructure. As a preview to the science objectives presented later in this document, figure 1 shows the significance as a function of time for the detection of a diffuse, flavour-symmetric neutrino flux corresponding to the result reported by IceCube. Thanks to the purity of the event sample, a high-significance detection of this neutrino flux will be possible for both track-like and cascade-like events within one year of operation. The excellent angular and energy resolutions, combined with the large effective mass, provide for a significant discovery potential to find neutrino sources in the Universe. Figure 2 shows the significance as a function of observation time for the determination of the neutrino mass hierarchy (NMH).
A determination of the NMH with at least 3-sigma significance can be made after three years of operation, i.e. as early as 2023. This precedes results of other experiments and provides timely input for experiments aiming at a measurement of the CP-violation phase with high sensitivity. In addition, ORCA will provide improved measurements of some of the neutrino oscillation parameters.

Detector design and technology
The successful deployment and operation of the ANTARES neutrino telescope [1] has demonstrated the feasibility of performing neutrino studies with large volume detectors in the deep sea. The detection of neutrinos is based on the detection of Cherenkov light produced by relativistic particles emerging from a neutrino interaction. The same technology can be used for studying neutrinos from GeV (for KM3NeT/ORCA) to PeV energies and above (for KM3NeT/ARCA). The goal of the KM3NeT technology is to instrument, at minimal cost and maximal reliability, the largest possible volume of seawater with a three-dimensional spatial grid of ultra-sensitive photo-sensors, while remaining sensitive to neutrino interactions in the target energy range. The KM3NeT design builds upon the ANTARES experience and improves the cost effectiveness of its design by about a factor of four. All components are designed for at least ten years of operation with negligible loss of efficiency. The system should provide nanosecond precision on the arrival time of single photons, while the position and orientation of the photo-sensors must be known to a few centimetres and a few degrees, respectively. The photo-sensors and the readout electronics are hosted within pressure-resistant glass spheres, so-called digital optical modules (DOMs). The DOMs are distributed in space along flexible strings, one end of which is fixed to the sea floor and the other end is held close to vertical by a submerged buoy. The concept of strings is modular by design. The construction and operation of the research infrastructure thus allows for a phased and distributed implementation.
A collection of 115 strings forms a single KM3NeT building block. The modular design allows building blocks with different spacings between lines/DOMs to be constructed, in order to target different neutrino energies. The full KM3NeT telescope comprises seven building blocks distributed on three sites. For Phase-2.0, three building blocks are planned: two KM3NeT/ARCA blocks, with a large spacing to target astrophysical neutrinos at TeV energies and above; and one KM3NeT/ORCA block, to target atmospheric neutrinos in the few-GeV range. Figure 3 indicates the location of the KM3NeT deep sea sites and the location of the various institutes which are currently involved in the PMT testing, the DOM integration, the string integration and the deployment of strings for KM3NeT Phase-1.
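The component counts implied by the building-block definition can be checked with a few lines of arithmetic. The following Python sketch is purely illustrative; the function name `block_counts` is introduced here and is not part of any KM3NeT software.

```python
# Counts implied by the building-block definition given in the text.
STRINGS_PER_BLOCK = 115
DOMS_PER_STRING = 18
PMTS_PER_DOM = 31


def block_counts(n_blocks: int = 1) -> dict:
    """Return the number of strings, DOMs and PMTs for n_blocks building blocks."""
    strings = n_blocks * STRINGS_PER_BLOCK
    doms = strings * DOMS_PER_STRING
    pmts = doms * PMTS_PER_DOM
    return {"strings": strings, "doms": doms, "pmts": pmts}


if __name__ == "__main__":
    # One block, and the three blocks planned for Phase-2.0.
    for label, n in (("one building block", 1), ("Phase-2.0 (three blocks)", 3)):
        print(label, block_counts(n))
```

A single building block thus corresponds to 2070 DOMs and 64 170 PMTs.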

KM3NeT/ARCA: deep sea and onshore infrastructures
The KM3NeT-Italy infrastructure is located at 36° 16′ N 16° 06′ E at a depth of 3500 m, about 100 km offshore from Porto Palo di Capo Passero, Sicily, Italy (figure 4, left). The site is the former NEMO site and is shared with the EMSO facility for Earth and Sea science research.
The ARCA installation comprises two KM3NeT building blocks. Figure 4 (right) illustrates the layout. The power/data are transferred to/from the infrastructure via two main electro-optic cables. In addition to the already operating cable serving the Phase-1 detector, a new cable will be installed. This Phase-2 cable will comprise 48 optical fibres. Close to the underwater installation the cable is split by means of a Branching Unit (BU) into two branches, each one terminated with a cable termination frame (CTF) (figure 5, left). Each CTF is connected to secondary junction boxes, 12 for the ARCA block 1 and 16 for the ARCA block 2. Each secondary junction box allows the connection of up to 7 KM3NeT detection strings. The underwater connection of the strings to the junction boxes is via interlink cables running along the seabed. For the ARCA configuration, the average horizontal spacing between detection strings is about 95 m. On shore, each main electro-optic cable is connected to power feeding equipment located in the shore station at Porto Palo di Capo Passero. Power is transferred at 10 kVDC and is converted to 375 VDC at the CTF for transmission, via the secondary junction boxes, along the interlink cables to the strings. The shore station also hosts the data acquisition electronics and a commodity PC farm used for data filtering.
In December 2008, the first main electro-optic cable was deployed. A CTF and two secondary junction boxes were successfully connected in summer 2015.

KM3NeT/ORCA: deep sea and onshore infrastructures
The KM3NeT-France infrastructure is located at 42° 48′ N 06° 02′ E at a depth of 2450 m, about 40 km offshore from Toulon, France (see figure 6, left). The site is outside of the French territorial waters and about 10 km west of the site of the existing ANTARES telescope. Figure 6 (right) illustrates the layout of the full ORCA array: a single KM3NeT building block of 115 strings. The power/data are transferred to/from the infrastructure via two main electro-optic cables comprising 36/48 optical fibres and a single power conductor (the return is via the sea).
The strings are connected to five junction boxes (figure 7, left), located on the periphery of the array. Each junction box has eight connectors, each of which can power four strings daisy chained in series (figure 7, right). Some daisy chains include calibration units, which incorporate laser beacons and/or hydrophone acoustic emitters. In the baseline design, five connectors on the junction box are dedicated to the neutrino array, one is dedicated to Earth and Sea science sensors and two are spares. The underwater connection of the strings to the junction box is via interlink cables running along the seabed. For the ORCA configuration, the average horizontal spacing between detection strings is about 20 m.
Due to the shorter transmission distance involved in the ORCA configuration, power is transferred in alternating current (AC). The power station, dimensioned for a single building block (92 kVA), is located at the shore end of the main cable near the 'Les Sablettes' beach. Power is transferred at 3500 VAC. The offshore junction boxes use an AC transformer to convert this to 400 VAC for transmission along the interlink cables to the strings. The control room is located at the Institute Michel Pacha, La Seyne-sur-Mer, and hosts the data acquisition electronics and a commodity PC farm used for data filtering.
In December 2014, the first main electro-optic cable was successfully deployed by Orange Marine. Once ANTARES is decommissioned, its main electro-optic cable will be reused for ORCA. The first junction box was connected in spring 2015.

Detection string
The detection strings [2] [...] ropes parallel. Attached to the ropes is the vertical electro-optical cable, a pressure balanced, oil-filled, plastic tube that contains two copper wires for the power transmission (400 VDC) and 18 optical fibres for the data transmission. At each storey two power conductors and a single fibre are branched out via the breakout box. The breakout box also contains a DC/DC converter (400-12 V). The power conductors and optical fibre enter the glass sphere via a penetrator.
Even though the string design minimises drag and itself is buoyant, additional buoyancy is introduced at the top of the string to reduce the horizontal displacement of the top relative to the base for the case of large sea currents.
For deployment and storage, the string is coiled around a large spherical frame, the so-called launcher vehicle, in which the DOMs slot into dedicated cavities (see figure 9). The anchor at the bottom of the string is the interface with the seabed infrastructure. It is external to the launcher vehicle and is sufficiently heavy to keep the string fixed on the seabed. The anchor houses an interlink cable, equipped with wet-mateable connectors, and the base container. The base container incorporates dedicated optical components and an acoustic receiver used for positioning of the detector elements.
A surface vessel (figures 10 (left) and 11 (left)), with dynamic positioning capability, is used at each site to deploy the launcher vehicle at its designated position on the seabed with an accuracy of 1 m. A remotely operated vehicle (figures 10 and 11, right) is used to deploy and connect the interlink cables from the base of a string to the junction box. Once the connection to the string has been verified onshore, an acoustic signal from the boat triggers the unfurling of the string. During this process, the launcher vehicle starts to rise to the surface while slowly rotating and releasing the DOMs. The empty launcher vehicle floats to the surface and is recovered by the surface vessel. The use of compact strings allows for transportation of many units on board and thus multiple deployments during a single cruise. This method reduces costs and also has advantages in terms of risk reduction for ship personnel and material during the deployment. It also improves tolerance to rough sea conditions.
In May 2014, a prototype string comprising three active DOMs was successfully deployed and connected to the KM3NeT-Italy site and operated for more than one year [3]. This test deployment validated many aspects of the deployment scheme. The first ORCA-style string will be connected to the KM3NeT-France infrastructure in spring 2016.

Digital optical module
The DOM [4] (figure 12, left) is a transparent 17 inch diameter glass sphere comprising two separate hemispheres, housing 31 PMTs and their associated readout electronics. The design of the DOM has several advantages over traditional optical modules using single large PMTs, as it houses three to four times the photo-cathode area in a single sphere and has an almost uniform angular coverage. As the photo-cathode is segmented, the identification of more than one photon arriving at the DOM can be done with high efficiency and purity. In addition, the directional information provides improved rejection of optical background. The PMTs are arranged in five rings of six PMTs plus a single PMT at the bottom pointing vertically downwards. The PMTs are spaced at 60° in azimuth and successive rings are staggered by 30°. There are 19 PMTs in the lower hemisphere and 12 PMTs in the upper hemisphere. The PMTs are held in place by a 3D printed support. The photon collection efficiency is increased by 20%-40% by a reflector ring around the face of each PMT. In order to assure optical contact, an optical gel fills the cavity between the support and the glass. The support and the gel are sufficiently flexible to allow for the deformation of the glass sphere under the hydrostatic pressure.
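The ring arrangement described above can be enumerated in a short Python sketch. Several details are assumptions made only for illustration: the rings are numbered from the bottom upwards, the stagger alternates between 0° and 30°, the two uppermost rings are taken to form the upper hemisphere (which reproduces the quoted 12/19 split), and the zenith angles of the rings are left out because they are not given in the text.

```python
from collections import Counter

N_RINGS = 5
PMTS_PER_RING = 6
AZIMUTH_STEP_DEG = 60.0   # spacing within a ring (from the text)
RING_STAGGER_DEG = 30.0   # offset between successive rings (from the text)
UPPER_RINGS = 2           # assumption: 2 rings x 6 PMTs = 12 upper PMTs


def pmt_layout():
    """Return (ring, index, azimuth_deg, hemisphere) for all PMTs.

    Ring 0 denotes the single PMT pointing vertically downwards.
    """
    layout = [(0, 0, 0.0, "lower")]
    for ring in range(1, N_RINGS + 1):
        # Assumed convention: odd rings start at 0 deg, even rings at 30 deg.
        offset = RING_STAGGER_DEG * ((ring - 1) % 2)
        hemisphere = "upper" if ring > N_RINGS - UPPER_RINGS else "lower"
        for k in range(PMTS_PER_RING):
            azimuth = (offset + k * AZIMUTH_STEP_DEG) % 360.0
            layout.append((ring, k, azimuth, hemisphere))
    return layout


if __name__ == "__main__":
    layout = pmt_layout()
    print("total PMTs:", len(layout))
    print("per hemisphere:", dict(Counter(entry[3] for entry in layout)))
```

The enumeration yields 31 PMTs in total, of which 19 are in the lower and 12 in the upper hemisphere, in agreement with the numbers quoted above.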
Each PMT has an individual low-power high-voltage base with integrated amplification and tuneable discrimination. The arrival time and the time-over-threshold (ToT) of each PMT signal are recorded by an individual time-to-digital converter implemented in an FPGA. The threshold is set at the level of 0.3 of the mean single photon pulse height and the high voltage is set to provide an amplification of 3×10⁶. The FPGA is mounted on the central logic board, which transfers the data to shore via an Ethernet network of optical fibres. Each DOM in a string has a dedicated wavelength to be later multiplexed with other DOM wavelengths for transfer via a single optical fibre to the shore. The broadcast of the onshore clock signal, needed for time stamping in each DOM, is embedded in the Gb Ethernet protocol. The White Rabbit protocol has been modified to implement the broadcast of the clock signal. The power consumption of a single DOM is about 7 W.
The specifications for the PMTs are summarised in table 2. Prototype PMTs from Hamamatsu and ETEL have been developed and satisfy the requirements (see figure 13). The PMTs have a photo-cathode diameter of at least 72 mm and a length of less than 122 mm. The reflector effectively increases the diameter to about 85 mm. The PMT has a ten-stage dynode structure with a minimum gain of 10⁶. The front face of the PMT is convex with a radius smaller than the inner radius of the glass sphere. Due to the small size of the PMT, the influence of the Earth's magnetic field is negligible and a mu-metal shield is not required. The optical module also contains three calibration sensors: (1) the LED nano-beacon, which illuminates the optical module(s) vertically above; (2) a compass and tilt-meter for orientation calibration; (3) an acoustic piezo sensor glued to the inner surface of the glass sphere for position calibration.
In May 2013, a prototype DOM was successfully installed on an ANTARES detection line and operated in situ for over a year [5]. Starting in spring 2014, three prototype DOMs were operated for over a year at the KM3NeT-Italy site [3]. In December 2015, a first production string of 18 DOMs was successfully operated at the KM3NeT-Italy site.

Fibre-optic data transmission system
The KM3NeT fibre-optic data transmission system performs the following functions:
• Transfer all the data to shore: the bandwidth per DOM is 1 Gb s⁻¹. The observed singles rate, dominated by ⁴⁰K, is typically 6-8 kHz per PMT [3,5] or 190-250 kHz per DOM, which amounts to 9-12 Mb s⁻¹ per DOM. Additional contributions from bioluminescence can be accommodated up to levels of a factor of 10 compared to ⁴⁰K.
• Provide timing synchronisation: relative time offsets between any pair of DOMs are stable within 1 ns.
• Provide individual control for each DOM: set the HV of a PMT, turn a single PMT on/off, turn the nano-beacon on/off, update software and firmware.
• Provide individual control for each base of a string: turn string power on/off, control optical amplifiers, monitor AC/DC converter.
• Provide slow control for the junction boxes.
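The rates quoted in the first item of the list follow from the singles rate per PMT, the 31 PMTs per DOM and the hit size of 6 B given in the data acquisition section. The Python sketch below reproduces this arithmetic; the function name `dom_rates` is hypothetical and the calculation ignores any protocol overhead.

```python
PMTS_PER_DOM = 31
HIT_SIZE_BYTES = 6          # hit size as given in the data acquisition section
LINK_BANDWIDTH_BPS = 1e9    # 1 Gb/s per DOM


def dom_rates(singles_hz_per_pmt: float):
    """Return (hit rate per DOM in Hz, data rate per DOM in bit/s)."""
    hit_rate_hz = singles_hz_per_pmt * PMTS_PER_DOM
    bit_rate_bps = hit_rate_hz * HIT_SIZE_BYTES * 8
    return hit_rate_hz, bit_rate_bps


if __name__ == "__main__":
    for singles in (6e3, 8e3):
        hit_rate, bit_rate = dom_rates(singles)
        print(f"{singles / 1e3:.0f} kHz per PMT -> "
              f"{hit_rate / 1e3:.0f} kHz per DOM, {bit_rate / 1e6:.1f} Mb/s")
    # A ten-fold increase with respect to the upper 40K rate.
    _, burst_rate = dom_rates(10 * 8e3)
    print(f"fraction of link used at 10x: {burst_rate / LINK_BANDWIDTH_BPS:.2f}")
```

Under these assumptions, even a ten-fold increase in the singles rate uses only about 12% of the 1 Gb s⁻¹ link of a DOM.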
The slow-control system is implemented via a broadcast mechanism (same as that of the clock), in which control information for all strings is sent on a single common wavelength. If it is a message for just a single string or DOM it is ignored by all the others. The communication from offshore exploits a dense wavelength division multiplexing technique. The return signals for the slow control are transmitted on 34 wavelengths via the slow-control fibre(s). The data return path is based on a 50 GHz spacing system with a 72-wavelength uplink. Each DOM of four strings produces a unique wavelength that is combined on one fibre. EDFA optical amplifiers are introduced onshore and at the base of a string to maintain the optical margins above 10 dB.

Data acquisition
The readout [6] of the KM3NeT detector is based on the 'all-data-to-shore' concept in which all analogue signals from the PMTs that pass a preset threshold (typically 0.3 photo-electrons) are digitised and all digital data are sent to shore where they are processed in real time. The physics events are filtered from the background using designated software. To maintain all available information for the offline analyses, each event will contain a snapshot of all the data in the detector during the event. Different filters can be applied to the data simultaneously.
The optical data contain the time of the leading edge and the ToT of every analogue pulse, commonly referred to as a hit. Each hit corresponds to 6 B of data (1 B for PMT address, 4 B for time and 1 B for ToT). The least significant bit of the time information corresponds to 1 ns. The total data rate for a single building block amounts to about 25 Gb s⁻¹. A reduction of the data rate by a factor of about 10⁵ is thus required to store the filtered data on disk. In addition to physics data, summary data containing the singles rates of all PMTs in the detector are stored with a sampling frequency of 10 Hz. This information is used in the simulations and the reconstruction to take into account the actual status and optical background conditions of the detector.
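The 6 B hit layout and the total data rate can be illustrated with a minimal Python sketch. The byte order and field order chosen here (big-endian; address, time, ToT) are assumptions made for the example and are not specified in the text, and the function names are hypothetical. A 4 B time field with a 1 ns least significant bit spans about 4.3 s, so the time must be relative to some reference such as the start of a data frame.

```python
import struct

# Assumed wire layout: big-endian, no padding.
# B = PMT address (1 B), I = time in ns (4 B), B = ToT (1 B).
HIT_FORMAT = ">BIB"
HIT_SIZE = struct.calcsize(HIT_FORMAT)


def pack_hit(pmt: int, time_ns: int, tot: int) -> bytes:
    """Encode one hit into 6 bytes."""
    return struct.pack(HIT_FORMAT, pmt, time_ns, tot)


def unpack_hit(buffer: bytes):
    """Decode 6 bytes into (pmt, time_ns, tot)."""
    return struct.unpack(HIT_FORMAT, buffer)


def block_data_rate_bps(singles_hz_per_pmt: float,
                        strings: int = 115,
                        doms_per_string: int = 18,
                        pmts_per_dom: int = 31) -> float:
    """Raw data rate of one building block in bit/s."""
    n_pmts = strings * doms_per_string * pmts_per_dom
    return n_pmts * singles_hz_per_pmt * HIT_SIZE * 8


if __name__ == "__main__":
    raw = pack_hit(pmt=5, time_ns=123_456_789, tot=26)
    print(len(raw), unpack_hit(raw))
    rate = block_data_rate_bps(8e3)
    print(f"block rate: {rate / 1e9:.1f} Gb/s, "
          f"after 1e5 reduction: {rate / 1e5 / 1e3:.0f} kb/s")
```

With a singles rate of 8 kHz per PMT, the sketch gives about 24.6 Gb s⁻¹ for one building block, consistent with the quoted value of about 25 Gb s⁻¹.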
In parallel to the optical data, the data from the acoustic positioning system are processed; they represent a data volume of about one third of that of the optical data.
2.6.1. Event trigger. For the detection of muons and showers, the time-position correlations that are used to filter the data follow from causality. In the following, the level-zero filter (L0) refers to the threshold for the analogue pulses which is applied off shore. All other filtering is applied on shore. The level-one filter (L1) refers to a coincidence of two (or more) L0 hits from different PMTs in the same optical module within a fixed time window. The scattering of light in deep-sea water is such that the time window can be very small. A typical value is ΔT = 10 ns. The estimated L1 rate per optical module is then about 1000 Hz, of which about 600 Hz is due to genuine coincidences from ⁴⁰K decays. The remaining part arises from random coincidences which can be reduced by a factor of two by making use of the known orientations of the PMTs. This is referred to as the level-two filter (L2). Separate trigger algorithms operate in parallel on these data, each optimised for a different event topology.
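The L1 condition described above can be sketched as follows. This is an illustrative Python implementation, not the KM3NeT trigger software: the `Hit` type, the clustering strategy (all hits within the window of the first hit form one cluster) and the function names are assumptions. The second function gives a leading-order estimate of the random two-fold coincidence rate, using the standard expression 2·ΔT·r² per PMT pair for a coincidence condition |t₁ − t₂| < ΔT.

```python
from collections import defaultdict
from typing import List, NamedTuple


class Hit(NamedTuple):
    dom: int        # optical module identifier (hypothetical numbering)
    pmt: int        # PMT address within the module
    time_ns: float  # leading-edge time in ns


def find_l1(hits: List[Hit], window_ns: float = 10.0) -> List[List[Hit]]:
    """Return clusters of L0 hits that satisfy the L1 condition.

    A cluster holds all hits of one module within window_ns of its
    first hit and must involve at least two different PMTs.
    """
    by_dom = defaultdict(list)
    for hit in hits:
        by_dom[hit.dom].append(hit)

    clusters = []
    for dom_hits in by_dom.values():
        dom_hits.sort(key=lambda h: h.time_ns)
        i = 0
        while i < len(dom_hits):
            j = i + 1
            while (j < len(dom_hits)
                   and dom_hits[j].time_ns - dom_hits[i].time_ns <= window_ns):
                j += 1
            cluster = dom_hits[i:j]
            if len({h.pmt for h in cluster}) >= 2:
                clusters.append(cluster)
                i = j  # consume the whole cluster
            else:
                i += 1
    return clusters


def random_l1_rate_hz(singles_hz: float, n_pmts: int = 31,
                      window_s: float = 10e-9) -> float:
    """Leading-order random two-fold coincidence rate per module."""
    n_pairs = n_pmts * (n_pmts - 1) / 2
    return n_pairs * 2.0 * window_s * singles_hz ** 2


if __name__ == "__main__":
    hits = [Hit(1, 3, 100.0), Hit(1, 7, 104.0), Hit(1, 3, 500.0),
            Hit(2, 1, 101.0), Hit(2, 1, 103.0)]
    for cluster in find_l1(hits):
        print(cluster)
    print(f"random L1 rate at 7 kHz: {random_l1_rate_hz(7e3):.0f} Hz")
```

For a singles rate of 7 kHz per PMT, the estimate is roughly 460 Hz per module, the same order as the random component of about 400 Hz implied by the rates quoted above.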
A general solution to trigger on a muon track event consists of a scan of the sky combined with a directional filter [7]. In the directional filter, the direction of the muon is assumed. For each direction, an intersection of a cylinder with the 3D array of optical modules can be considered. The diameter of this cylinder (i.e. road width) corresponds to the maximal distance traveled by the light. It can safely be set to a few times the absorption length without a significant loss of the signal. The number of PMTs to be considered is then reduced by a factor of 100 or more, depending on the assumed direction. Furthermore, the time window that follows from causality is reduced by a similar factor. This improves the signal-to-noise ratio (S/N) of an L1 hit by a factor of (at least) 10⁴ compared to the general causality relation. With a requirement of five (or more) L1 hits, this filter shows a very small contribution of random coincidences.
The field of view of the directional filter is about 10°. So, a set of 200 directions is sufficient to cover the full sky. By design, this trigger can be applied to any detector configuration. Furthermore, the minimum number of L1 hits to trigger an event can be lowered for a limited number of directions. A set of astrophysical sources can thus be tracked continuously with higher detection efficiency for each source.
For shower events, triggering is simpler, since the maximal 3D-distance between PMTs can be applied without consideration of the direction of the shower.
A maximum distance traveled by the light can be assumed, limiting the maximum distance D between hit PMTs. This reduces the number of PMTs to be considered and the time window that follows from causality. Hence, an improvement of the S/N ratio compared to the general causality relation can be obtained.
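The shower condition can be illustrated with a pairwise test: two hits are kept if the PMTs are closer than the maximum distance D and if their time difference does not exceed the light travel time between them. The sketch below is an illustration only; the group refractive index of 1.38 is an assumed typical value for sea water, the optional tolerance is a hypothetical parameter, and the function name is not taken from the KM3NeT software.

```python
import math
from typing import Tuple

C_VACUUM_M_PER_NS = 0.299792458  # speed of light in vacuum
N_GROUP = 1.38                   # assumed group refractive index of sea water

# A hit is represented here as (x, y, z, t): position in m, time in ns.
HitXYZT = Tuple[float, float, float, float]


def causally_connected(hit_a: HitXYZT, hit_b: HitXYZT, d_max_m: float,
                       tolerance_ns: float = 0.0) -> bool:
    """Return True if two hits can stem from the same light source.

    The PMT distance must not exceed d_max_m and the time difference
    must not exceed the light travel time over that distance.
    """
    distance_m = math.dist(hit_a[:3], hit_b[:3])
    if distance_m > d_max_m:
        return False
    max_dt_ns = distance_m * N_GROUP / C_VACUUM_M_PER_NS + tolerance_ns
    return abs(hit_a[3] - hit_b[3]) <= max_dt_ns


if __name__ == "__main__":
    a = (0.0, 0.0, 0.0, 0.0)
    b = (100.0, 0.0, 0.0, 400.0)
    print(causally_connected(a, b, d_max_m=250.0))
```

For two PMTs 100 m apart, the allowed time difference under these assumptions is about 460 ns; restricting D therefore shortens the time window as well as the list of PMT pairs to be tested.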
Alternative signals with different time-position correlations, such as slow magnetic monopoles, can be searched for in parallel. It is obvious but worth noting that the number of computers and the speed of the algorithms determine the performance of the system and hence the physics output of KM3NeT.
2.6.2. Performance. The performance of the online data filter can be summarised by the effective volume, the event purity and the time needed to process a time slice of raw data. The effective volume is the volume in which a neutrino interaction would trigger the event to be written to disk and the event purity is the fraction of events that contain a neutrino interaction or atmospheric muon bundle. The effective volumes of the ARCA and ORCA detectors are presented in sections 3.2 and 4.2.2, respectively.
To process the data, the concept of time slicing is applied. In this, the data from each optical module are stored in a frame corresponding to a preset time period. All data frames corresponding to the same time period are sent to a single CPU core based on IP level 2 switching. A complete set of data frames is referred to as a time slice. Data corresponding to subsequent time periods are sent to different CPU cores until the number of available CPU cores is exhausted. The first CPU core should then be ready to receive and process the data from the next time period.
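The round-robin distribution of time slices, and the condition that the first core is free again when its turn comes, can be sketched as follows. The slice duration and processing time used in the example are hypothetical values; neither is given in the text.

```python
from typing import List


def dispatch(n_slices: int, n_cores: int) -> List[int]:
    """Round-robin assignment: time slice k is sent to core k mod n_cores."""
    if n_cores < 1:
        raise ValueError("at least one CPU core is required")
    return [k % n_cores for k in range(n_slices)]


def keeps_up(n_cores: int, slice_duration_s: float,
             processing_time_s: float) -> bool:
    """Check whether the farm can process the data in real time.

    Each core receives a new slice every n_cores * slice_duration_s
    seconds and must have finished the previous one by then.
    """
    return processing_time_s <= n_cores * slice_duration_s


if __name__ == "__main__":
    print(dispatch(n_slices=7, n_cores=3))
    # Hypothetical numbers: 0.125 s slices, 1 s of processing per slice.
    print(keeps_up(n_cores=8, slice_duration_s=0.125, processing_time_s=1.0))
    print(keeps_up(n_cores=7, slice_duration_s=0.125, processing_time_s=1.0))
```

In this picture, the minimum number of CPU cores is the processing time per slice divided by the slice duration, rounded up.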
In the following, the performance of the online data filter is presented for one ARCA and one ORCA building block. In this, different triggers are operated in parallel. The typical trigger settings correspond to an L1 time window of ΔT = 10 ns, a maximum space angle between the PMT axes of 90° (L2), and a minimum number of L1 hits of 4 or 5. The detection threshold thus corresponds to 8 or 10 photons. The trigger rate due to random coincidences and the number of CPU cores are shown in figure 14 as a function of the singles rate.
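The L2 condition and the resulting detection threshold can be written down in a few lines. The Python sketch below is illustrative only: the PMT axes are passed as arbitrary 3-vectors, the function names are hypothetical, and each L1 hit is counted as two photons, which is the minimum multiplicity of an L1 coincidence.

```python
import math
from typing import Sequence


def axis_angle_deg(u: Sequence[float], v: Sequence[float]) -> float:
    """Space angle in degrees between two PMT axes."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


def passes_l2(u: Sequence[float], v: Sequence[float],
              max_angle_deg: float = 90.0) -> bool:
    """L2: accept a coincidence only if the PMT axes are close enough."""
    return axis_angle_deg(u, v) <= max_angle_deg


def photon_threshold(min_l1_hits: int, hits_per_l1: int = 2) -> int:
    """Minimum number of detected photons needed to trigger."""
    return min_l1_hits * hits_per_l1


if __name__ == "__main__":
    down = (0.0, 0.0, -1.0)
    tilted = (1.0, 0.0, -1.0)   # 45 degrees away from the downward axis
    up = (0.0, 0.0, 1.0)
    print(passes_l2(down, tilted), passes_l2(down, up))
    print(photon_threshold(4), photon_threshold(5))
```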
The typical singles rate due to radioactive decays in the sea water is about 6-8 kHz per PMT [3,5], including the dark count rate. In addition, there are occasional bursts of bioluminescence. To limit the effect of excursions of the singles rate, short bursts of bioluminescence can be filtered in real-time. The probability of the occurrence of bioluminescent bursts depends on the site and is found to be correlated with the velocity of the sea current [8,9], presumably because bioluminescent organisms are stimulated by turbulence or by impacts on the infrastructure. An enhanced level of bioluminescence has been observed in the ANTARES detector during the spring period of some years [9]. Averaged over the live time of the ANTARES detector, the overall inefficiency due to bioluminescence is about 10%. Due to the slender design of KM3NeT, it is expected that the turbulence and impacts on the infrastructure are significantly less and so is this inefficiency.
Figure 14. Trigger rate due to random coincidences (left) and required number of CPU cores (right) as a function of the singles rate for one building block of ARCA (black circles) and ORCA (red squares).
As can be seen from figure 14, the number of CPU cores needed to process the data in real time is less than 50 up to singles rates of 20 kHz (three times the nominal rate). It should be noted that the number of CPU cores may be larger than one for a modern PC. So, this result provides for a cost-effective implementation of the 'all-data-to-shore' concept. Moreover, the trigger software is the same for the ARCA and ORCA detectors; only the settings of the trigger parameters are adjusted to optimally detect neutrinos with the targeted energies.

Introduction
The main science objective of KM3NeT/ARCA is the detection of high-energy neutrinos of cosmic origin. Since neutrinos propagate directly from their sources to the Earth, even modest numbers of detected neutrinos can be of utmost scientific relevance, by indicating the astrophysical objects in which cosmic rays (CRs) are accelerated, or pointing to places where dark matter particles annihilate or decay. The prospect of such fundamental physics discoveries has led the astroparticle and astrophysics communities to include KM3NeT as a high priority in their respective European road maps (APPEC/ASPERA, AstroNet) and the European Strategy Forum on Research Infrastructures (ESFRI) to include it in its list of priority projects. The KM3NeT Research Infrastructure will also provide user ports for real-time, long-term Earth and Sea science measurements in the deep-sea environment.
One priority goal of KM3NeT/ARCA is indisputably to find neutrinos from the CR accelerators in our Galaxy. In a neutrino telescope the two simplest event topologies that can be identified are: a 'shower' topology that includes the NC interaction of all three neutrino flavours, the CC interaction of n e , and a subset of n t interactions; and a 'track' topology that indicates the presence of muons produced in n m and n t CC interactions (see section 3.2.4 for a detailed explanation).
The preferred search strategy is to identify upward-moving muons, which unambiguously indicates neutrino reactions since only neutrinos can traverse the Earth without being absorbed. A neutrino telescope in the Mediterranean Sea is ideal for this purpose, since most of the potential Galactic sources are in the Southern sky; in contrast, the IceCube detector at the South Pole is much less sensitive to these individual sources, at least in the energy range where the signal is expected (a few TeV to a few tens of 10 TeV-see section 3.3.6). The KM3NeT/ARCA design has been carefully optimised to maximise the sensitivity to these Galactic sources. One of the findings in this process is that the overall sensitivity is not reduced if the neutrino telescope is split into separate building blocks, provided they are large enough, at least 0.5 cubic kilometres each [10]. It has thus been decided to make a distributed infrastructure, thereby maximising the influx of regional funding and human resources. Furthermore, the concept of independent building blocks complies with the technical specifications for the construction and operation of the Research Infrastructure.
Currently, the KM3NeT Collaboration is proceeding with the first construction phase (Phase-1). Until 2017, 31strings equipped with 558 optical modules (see section 2) will be assembled and deployed. Of these, 24strings will be configured for ARCA and deployed at the Italian site. The resulting array will provide the equivalent of 10%-20% of the size of the IceCube detector. The recent experience from a combined analysis of ANTARES and Ice-Cube data [11], increasing the sensitivity to point-like neutrino sources by up to a factor of two with respect to the individual analyses, indicates that Phase-1 will already have a decent discovery potential and provide significant new data.

Cosmic neutrinos.
A new situation has emerged since IceCube has presented evidence for a neutrino signal of cosmic origin. This signal includes upward- and downward-going events with neutrino energies from a few tens of TeV to above 1 PeV. Even though the signal is statistically very significant, its astrophysical implications are not yet clear. This signal is the first high-energy extra-terrestrial neutrino signal ever observed and thus marks a major turning point in the history of neutrino astronomy. Detailed studies have been and are being conducted to estimate the sensitivity of KM3NeT/ARCA to a neutrino flux with the reported properties, to investigate the consequences of a re-optimisation of the detector for such a signal (in particular in terms of geometry parameters and building block size) and to evaluate the prospects of Phase-2.0. Results of these studies are presented in the following.
IceCube's HESE analysis [12] has now observed 54 events with a reconstructed energy above 30 TeV, 39 of which are identified as cascades and 14 as track events [13] (see footnote 55). Most of the observed events originate from the Southern hemisphere, corresponding to down-going neutrinos in the IceCube detector. Due to the different topologies of the events, the angular resolution is roughly 10°-15° for cascades and 1° for muons. The expected background due to atmospheric muons and neutrinos is about 12 and 9 events respectively, resulting in a significance of over 5σ for the observation to be incompatible with the background. This significance has been obtained by applying designated event selection cuts using the outer layers of the detector as veto against incoming charged particles. The best constraints on an (assumed diffuse) astrophysical spectrum come from a maximum likelihood analysis using both HESE and other event samples [14], finding a neutrino flux proportional to E^−2.5, disfavouring at 2.1σ an E^−2 spectrum with a cutoff at a few PeV. The distribution of the neutrino directions combined with the angular resolution does not (yet) allow for the identification of one or more point sources. Deviations from flavour-uniformity are only weakly constrained [15], and tau neutrino events have not yet been identified [16,17].
Footnote 55: One of them has been identified as a coincident air-shower event.
The prime physics case for KM3NeT Phase-2.0 ARCA is to measure and investigate the signal of neutrinos observed by IceCube with different methodology, improved resolution and a complementary field of view.
3.1.2. Assumptions. The basic assumption in the following studies is that the ARCA detector will comprise two KM3NeT building blocks, providing an instrumented volume of about one cubic kilometre, i.e. of similar size as the IceCube detector. All analyses reported in this document are performed for a horizontal distance between strings of 90 m and vertical distance between adjacent optical modules of 36 m. The footprint of one block is shown in figure 15. To estimate the dependence of the sensitivity on the geometrical detector configuration, an alternative layout with 120 m distance between strings but unchanged vertical distances is being investigated; this configuration corresponds to an increase of the instrumented volume to 1.7 km³. In both cases a water depth of 3.5 km and a latitude of 36°16′ N were assumed, corresponding to the Italian KM3NeT site (KM3NeT-It, see section 2.1).
The following sensitivity studies are discussed below:
• Cascade events from a diffuse flux, including high-energy starting muon tracks. This analysis includes all neutrino flavours. Owing to an efficient suppression of the atmospheric muon and neutrino backgrounds (see below), a 4π angular coverage has been achieved.
• Up-going, diffuse flux of muon (anti-)neutrinos. This analysis is usually referred to as the 'conventional' diffuse flux analysis. Traditionally, it does not include the upper hemisphere, with the exception of a small zenith region above the horizon.
• Muon (anti-)neutrinos from a diffuse GP flux. Up-going muon track events are used for an analysis covering an extended region of the Galactic plane (GP) near the Galactic centre in the Southern sky.
• Up-going flux of muon (anti-)neutrinos from point sources. In order to quantify the sensitivity of KM3NeT Phase-2.0 to extragalactic and Galactic point sources of neutrinos, both a generic E^−2 spectrum from point sources and spectra with energy cut-off for specific Galactic sources with non-zero radial extension have been considered.
• Cascade events from point sources. KM3NeT/ARCA's resolution in the cascade channel will allow us to use these events in point-source searches. The sensitivity of such an analysis is evaluated against generic E^−2 point sources.
The background of atmospheric neutrinos assumed in these analyses corresponds to the so-called Honda flux [18] with a prompt component as calculated by Enberg [19]. A correction taking into account the 'knee' of the CR spectrum has been applied to both conventional and prompt atmospheric neutrino fluxes according to the prescription in [20] and references therein. The Honda parameterisation includes an anisotropy caused by the Earth's magnetic field, while the prompt component is assumed to be isotropic in the full solid angle. Moreover, in the sensitivity studies the effect of the uncertainties on the atmospheric neutrino flux has been estimated. An uncertainty of ±25% was assumed for the intensity of the conventional Honda flux. For the prompt component, the uncertainty band estimated in [19] has been used.
Recently, new calculations of the prompt neutrino component have been reported in [21-23]. The calculation of [23] followed that in [22], from which it differs mainly in the use of different input, in particular the parton distribution functions (PDFs). The PDFs in [23] were further constrained by taking into account LHCb measurements at 7 TeV.
In figure 16 the different components of the atmospheric neutrino flux are reported for ν_e and ν_μ; see section 3.2 for details on the background from atmospheric muons.
It should be noted that the results reported in the following are preliminary and some analysis details are not yet fully completed and optimised. Also, the analyses reported above do not reflect the full physics potential of ARCA; the event resolutions shown in section 3.2.4 can be used to characterise ARCA's ability to probe any assumed extraterrestrial neutrino fluxes.

Simulations
Monte Carlo (MC) simulations have been used to simulate the detector response to particles incident on the detector, their interaction with the medium surrounding the detector and subsequent Cherenkov light production, and the detector response in terms of the PMT data sent to shore. The software packages described in this section have mostly been developed in the ANTARES Collaboration and then adapted to KM3NeT. The simulation is based on the nominal detector geometry described in section 3.1.2 and figure 15; see section 2.1 for further details. The two ARCA blocks are treated identically and independently: simulations are performed for a single block, and the effective lifetime (event rate) is multiplied by two. The effects of position and orientation calibration uncertainties are estimated using dedicated simulations, as described in section 3.4.
3.2.1. Event generation. The relevant volume for Cherenkov light production is defined as a cylinder with height and radius of about 3 absorption lengths larger than the instrumented volume (the 'can'), limited by the seabed below. The first step in the simulation chain is the generation of particle fluxes incident on the can (neutrinos from astrophysical sources, and the atmospheric muon and neutrino backgrounds), within which a detailed description of particle behaviour and Cherenkov light production is required.
Astrophysical and atmospheric fluxes of (anti-)neutrinos of all three flavours (ν_e, ν_μ, and ν_τ) are simulated with a code propagating neutrinos through the Earth (density profile from [24]) and generating their interactions in rock and sea water. For reactions outside the can, long-range interaction products (muons and taus) are subsequently propagated to the can. Both neutral-current (NC) and charged-current (CC) reactions are simulated. The deep inelastic scattering (DIS) cross-sections, which are dominant in the energy range relevant to this study, are implemented using the LEPTO code [25]. The CTEQ6D table of parton distribution functions is used, and the resulting behaviour, especially in the small-x region, is validated up to 10 PeV. Quasi-elastic scattering and resonance production are also taken into account, by using RSQ [26] below 300 GeV. Reactions of ν̄_e with electrons in the atmosphere are relevant in the energy regime of resonant W production ('Glashow resonance') around 6.3 PeV and are simulated according to the leading-order electroweak cross sections. The propagation of muons in rock and water is performed with MUSIC [27]. Tau leptons, which have a lifetime of 2.9 × 10^−13 s and thus typically travel only very short distances before decaying, are propagated by assuming them to be minimally ionising particles, and decayed using TAUOLA [28].
Atmospheric muons, produced in CR interactions in the atmosphere, can penetrate to the detector volume if their energy at the sea surface is in the TeV range or above. This is frequently the case, both for single muons, and muon 'bundles' up to several hundred muons from a primary CR event. Atmospheric muons therefore constitute an important, high-rate background that is simulated using MUPAGE [29,170]. Single and multiple atmospheric muon events are generated using a parameterisation of the flux of muon bundles at different depths and zenith angles. In the present analysis, three simulated atmospheric muon event samples are used, with muon bundle energies exceeding 1 TeV, 10 TeV, and 50 TeV, respectively, in order to provide sufficient coverage in the high-energy regime. The corresponding lifetimes of these and the neutrino productions are shown in figure 17.
Figure 17 (caption fragment). … [14] and atmospheric spectra; atmospheric μ events; and cosmic ray (CR) events from CORSIKA, as a function of neutrino/muon-bundle/cosmic-ray energy E. The lifetimes for other neutrino channels (ν_e and ν_τ) are similar to that of ν_μ NC, except for the atmospheric ν_τ events, which have effectively infinite lifetime (since the estimated flux is very small).
The correlated flux of atmospheric neutrinos and muons from the same primary CR interaction is simulated with CORSIKA v7.4001 [30], in order to investigate the 'self-veto' effect [31] for high-energy studies. GHEISHA [32] and QGSJET01 [33] were respectively used to model low- and high-energy hadronic interactions, and the curvature of the Earth was accounted for. Both muons and neutrinos are recorded at sea-level; muons are propagated to and through the can with MUSIC, while one neutrino from each event is forced to interact. The intention was to estimate the effect of accompanying muons on high-energy atmospheric neutrino events (section 3.2.9); thus, only events with at least one muon at can level, and one neutrino above 10 TeV, are kept, which excludes all up-going neutrino events (see footnote 56). The resulting event sample forms only a small fraction of all atmospheric muon bundles, but a significant fraction of all down-going atmospheric neutrino background events above 10 TeV. Therefore, analyses using CORSIKA events down-weight the standard atmospheric neutrino events to avoid double-counting. Additionally, CORSIKA underestimates the expected atmospheric neutrino flux at high energies [18,19], and this is corrected for as per [34].
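The statement that tau leptons travel only short distances can be made quantitative with the relativistic decay length L = βγcτ. The following is a minimal sketch, not part of the simulation chain described here: the lifetime is the value quoted above, while the tau mass (1.777 GeV) and the function name are assumptions introduced for illustration.

```python
import math

C_M_PER_S = 299_792_458.0   # speed of light in vacuum (m/s)
TAU_LIFETIME_S = 2.9e-13    # tau lifetime as quoted in the text
TAU_MASS_GEV = 1.777        # assumed tau mass; not given in the text


def tau_decay_length_m(energy_gev: float) -> float:
    """Mean lab-frame decay length L = beta*gamma*c*tau for total energy E."""
    if energy_gev <= TAU_MASS_GEV:
        raise ValueError("energy must exceed the tau rest mass")
    gamma = energy_gev / TAU_MASS_GEV
    beta_gamma = math.sqrt(gamma * gamma - 1.0)
    return beta_gamma * C_M_PER_S * TAU_LIFETIME_S


for e_tev in (1.0, 100.0, 1000.0):
    length = tau_decay_length_m(e_tev * 1.0e3)
    print(f"E = {e_tev:7.1f} TeV -> mean decay length = {length:.3f} m")
```

Under these assumptions the mean decay length grows linearly with energy at roughly 5 cm per TeV, so it only becomes comparable to the spacing between optical modules at energies of several hundred TeV.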

Detector response.
A quantity often used to characterise the detector response for neutrino telescopes is the neutrino effective area, A_eff, defined here such that the rate, R_trig, of particles being detected at trigger level is equal to the flux of particles through A_eff. Here, A_eff is calculated as a function of neutrino flavour, ℓ, and energy, E_νℓ, relative to the flux Φ incident upon the Earth, i.e.:
$$A_{\mathrm{eff}}^{\ell}(E_{\nu_\ell}) = \frac{R_{\mathrm{trig}}(E_{\nu_\ell})}{\Phi(E_{\nu_\ell})}. \qquad (1)$$
For a point-like source, A_eff is calculated relative to the rate R_trig (s^−1 GeV^−1) and flux Φ (m^−2 s^−1 GeV^−1) from that source, while for a diffuse flux, the solid-angle-integrated values of R_trig and Φ are used. Along with the detector efficiency, A_eff also includes the neutrino cross-section, and the probability for neutrinos to be absorbed in the Earth, resulting in a smaller value of A_eff than the physical cross-sectional area of the instrument.
The generated particles propagated to the can level, or generated inside the can volume, are then tracked in the sea water using tabulated results from full GEANT 3.21 simulations of relativistic muons and electromagnetic cascades to generate the number of Cherenkov photons detected by the KM3NeT PMTs. The light production from hadronic or mixed hadronic/electromagnetic cascades is scaled to that from purely electromagnetic cascades according to the energy and type of constituent particles. The program takes into account the full wavelength dependence of Cherenkov light production, propagation, scattering and absorption; and the response of the PMTs as described in section 2 and modelled in [35], including absorption in the glass and optical gel, the PMT quantum efficiency (QE), the reduced effective area for photons arriving off-axis, and the effect of the reflecting expansion cones [36].
Hits from background photons (mostly due to ⁴⁰K decays in the sea water) in an event are simulated by adding random noise hits with a rate of 5 kHz per PMT. Correlated hits over multiple PMTs on the same optical module from single ⁴⁰K decays are also included, with {2, 3, 4}-fold coincidences at rates of {500, 50, 5} Hz per DOM. The singles and coincidence rates as well as the angular dependence are in reasonable agreement with the results from the prototype detection unit (DU) deployed at the KM3NeT-It site [3]. An example of the simulated time-distribution of photons detected by a KM3NeT PMT from a 1 TeV muon 50 m from the track is given in figure 18.
Footnote 56: Events with only a neutrino or atmospheric muon bundle are already simulated using standard methods.
KM3NeT PMT hits are recorded via the start time and the duration of the signal above a predefined threshold (ToT). This scheme is implemented in the detector simulation, with the simulated response of individual PMTs to photon hits being based on laboratory measurements. The full transit-time distribution is implemented on a per-photon basis, corresponding approximately to a 2 ns Gaussian smearing for the majority of photons. Hit amplitudes are smeared, and the start time and ToT are calculated by accounting for sequences of photo-electrons on PMTs that cannot be resolved in time, saturation effects at around 40 simultaneous photoelectrons, and a maximum ToT readout of 255 ns. After this step, each event contains a complete and unbiased snapshot of all hits recorded during a time window around the event, representing a part of the stream of data sent to shore.
The final stage is to simulate on-shore triggering, as described in section 2.6.1. This process takes filtered L1 hits (photon hits on multiple PMTs within a short time window on the same OM) and generates a trigger if multiple nearby OMs record such events at causally connected times within a spherical (cascade) or cylindrical (track) geometry. Trigger parameters have been tuned so as to minimise false triggers on optical backgrounds, while registering all reconstructible physics events. In the case of ARCA, the real-time trigger rate is dominated by down-going atmospheric muons, and trigger settings were set to keep the corresponding data rate manageable.
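The L1 condition described above (hits on more than one PMT of the same optical module within a short time window) can be illustrated with a minimal sketch. It is not the KM3NeT trigger code: the function name, the hit representation and the clustering rule (all hits within the window of the first hit of a cluster) are assumptions made for illustration, and the default window of 10 ns is the value given in the trigger settings of this section.

```python
from typing import List, Tuple

Hit = Tuple[int, float]  # (PMT index within one optical module, hit time in ns)


def find_l1_hits(hits: List[Hit], window_ns: float = 10.0) -> List[List[Hit]]:
    """Group the hits of one optical module into L1 coincidences.

    A cluster collects all time-ordered hits within `window_ns` of its first
    hit; it is kept only if at least two different PMTs contribute.
    """
    ordered = sorted(hits, key=lambda hit: hit[1])
    clusters: List[List[Hit]] = []
    i = 0
    while i < len(ordered):
        j = i + 1
        while j < len(ordered) and ordered[j][1] - ordered[i][1] <= window_ns:
            j += 1
        cluster = ordered[i:j]
        if len({pmt for pmt, _ in cluster}) >= 2:
            clusters.append(cluster)
            i = j      # hits used in an L1 are not reused
        else:
            i += 1     # try again starting from the next hit
    return clusters


example_hits = [(3, 100.0), (7, 104.0), (12, 250.0),
                (5, 400.0), (5, 403.0), (9, 409.0)]
for cluster in find_l1_hits(example_hits):
    print(cluster)
```

In this toy input the isolated hit at 250 ns is discarded, which is the purpose of the L1 filter: uncorrelated single hits from the optical background rarely coincide on two PMTs of one module within such a short window.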
The trigger settings correspond to a coincidence (L1) time window of ΔT = 10 ns, and a minimum number of 5 L1 hits for both the shower trigger and the muon trigger. Only MC events which pass either trigger condition are available for further analysis, as is the case for the on-shore trigger. The resulting effective areas are given in figure 19. When equation (1) is used to evaluate the number of detectable events from a specific neutrino source in an analysis that maximises the significance (see section 3.3), these effective areas have to be corrected for the fraction of events that survive the cuts of the analysis.
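The use of an effective area to obtain an event count can be sketched numerically: the expected number of events is the livetime multiplied by the energy integral of A_eff(E) Φ(E). The example below is only an illustration under stated assumptions. The effective-area and flux functions are toy power laws with arbitrary normalisations, not the ARCA effective areas of figure 19 or any measured flux, and all function names are hypothetical.

```python
import math
from typing import Callable


def expected_events(a_eff_m2: Callable[[float], float],
                    flux: Callable[[float], float],
                    e_min_gev: float, e_max_gev: float,
                    livetime_s: float, n_steps: int = 2000) -> float:
    """N = T * integral of A_eff(E) * Phi(E) dE, trapezoid rule in ln(E)."""
    log_lo, log_hi = math.log(e_min_gev), math.log(e_max_gev)
    step = (log_hi - log_lo) / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        energy = math.exp(log_lo + k * step)
        weight = 0.5 if k in (0, n_steps) else 1.0
        # dE = E d(ln E), hence the extra factor of energy
        total += weight * a_eff_m2(energy) * flux(energy) * energy * step
    return livetime_s * total


def toy_a_eff_m2(energy_gev: float) -> float:
    """Toy effective area rising linearly with energy (illustrative only)."""
    return 10.0 * energy_gev / 1.0e4


def toy_flux(energy_gev: float) -> float:
    """Toy E^-2 flux in GeV^-1 m^-2 s^-1 (arbitrary normalisation)."""
    return 1.0e-4 * energy_gev ** -2


ONE_YEAR_S = 3.156e7
n_events = expected_events(toy_a_eff_m2, toy_flux, 1.0e4, 1.0e6, ONE_YEAR_S)
print(f"toy expectation: {n_events:.2f} events per year")
```

Integrating in ln(E) keeps the sampling uniform per decade, which suits spectra that span several orders of magnitude in energy. An analysis-level estimate would additionally multiply the integrand by the energy-dependent fraction of events passing the selection cuts.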
The simulation times per event for different stages are shown in figure 20. The simulation time is dominated by event reconstruction and light propagation, which can reach up to a few seconds per event at high energies. The cascade reconstruction time does not reduce quickly at low energies, since it includes in the likelihood fit PMTs which have no detections.
The MC events simulated with the described codes have been compared with the data from a prototype of the string that was deployed at the Italian site and that took data for about one year [3]. The very good agreement between the data and the MC simulation has demonstrated the high reliability of the MC simulation chain.
Figure 19. Effective areas of ARCA (two blocks) at trigger level for ν_μ, ν_e, and ν_τ, as a function of neutrino energy E_ν. The effective area is defined relative to an isotropic neutrino flux incident on the Earth, is averaged over both ν and ν̄, and includes both NC and CC interactions. The peak at 6.3 PeV is due to the Glashow resonance of ν̄_e.

Further improvements.
The simulation chain for ARCA is mature, but not complete, and several additions will be required for future data analysis:
• The simulation of tau (anti-)neutrinos is performed using some simplifications. CC tau interactions within the Earth are treated as absorbing the neutrino, i.e. the tau 'regeneration' effect is not included. Additionally, only two- and three-body tau decay modes (approximately three quarters of all decays) are currently implemented: the branching ratio of ∼17.4% for the decay to a muon is kept constant, while the other modes, which result in almost identical event topologies at high energies, are re-normalised to the remaining 82.6%.
• The MUPAGE package for generation of atmospheric muons does not contain a prompt component originating from charm decays in CR-induced air showers. The flux of atmospheric muons with energies above roughly 10 TeV is therefore underestimated, although likely only by a small amount. A refined simulation has recently been provided in the CORSIKA [30] framework, where the correlations between conventional and prompt muon and neutrino fluxes are adequately included at the event-by-event level. While a production with the new CORSIKA v7.4005 has begun, the high CPU demand has so far prevented this simulation from being fully processed through the MC chain and used for analysis.
• Atmospheric muon events which coincidentally arrive simultaneously with neutrino events have not been simulated, since it is anticipated that resolving multiple components will prove feasible for ARCA. A dedicated simulation of coincident muon events will be needed in order to verify this.

Event reconstruction.
Two broad event classes can be identified in a high-energy neutrino telescope, track-like events and cascade-like events:
• The track-like events are generated by muons that are produced in the matter inside or surrounding the detector through CC interactions of ν_μ (ν̄_μ) and ν_τ (ν̄_τ). CC reactions of ν_τ (ν̄_τ) produce a muon with a branching ratio of 17%, when the emerging τ decays into a μ.
• The cascade-like events are produced in the matter near or inside the detector volume through CC interactions of ν_e (ν̄_e) and ν_τ (ν̄_τ) and in NC interactions of neutrinos of all flavours. CC ν_τ (ν̄_τ) interactions produce cascade events with a branching ratio of 83%.
These two event classes produce very different time-space hit patterns in the detector. The cascade-like events are characterised by a very dense hit pattern close to the neutrino interaction point. A significant fraction of the neutrino energy is released in a hadronic shower (and, in the case of ν_e (ν̄_e) CC interactions, the rest in an electromagnetic cascade), thus allowing for a good estimate of the neutrino energy. A track-like event is characterised by the Cherenkov light from the emerging muon that can travel large distances through Earth rock and sea water. The spatial hit pattern in this case is closely related to the muon direction, thus allowing for a precise measurement of the latter. Typical hit patterns for track-like and cascade-like events are shown in figure 21.
Starting from the ANTARES experience, algorithms that reconstruct direction, energy and interaction vertex of the neutrinos from the muon tracks or the showers have been developed. These have been optimised for 'pure' track events (ν_μ CC events far from the detector, where only a single energetic muon is observed) and for cascade events (ν NC and ν_e CC events, where only a cascade is observed), respectively. Thus their performance on more complicated event topologies is not optimal; prospects for improvements are discussed at the end of this section.
3.2.5. Track reconstruction. Muons with energies above 1 TeV can reach track lengths of the order of kilometres and have a direction that is nearly collinear with that of the parent neutrino. To reconstruct the muon direction, and consequently the neutrino direction, an algorithm is used that maximises the likelihood that the observed space-time PMT hit pattern is consistent with Cherenkov emission from the fitted muon trajectory. An initial hit selection exploits hit coincidences between PMTs in the same optical module or between different optical modules to remove uncorrelated hits from background photons, mostly from ⁴⁰K decays. The reconstruction of the muon trajectory starts with a linear fit, followed by three consecutive fitting steps, each using the results of the previous one as starting point. A pseudo-vertex position is also estimated, which, however, usually is related to the entry point of the muon in the detector rather than to the location of the interaction vertex; this quantity is useful for background rejection. In addition to the track information (direction and pseudo-vertex) an estimator of the fit quality, Λ, and the number of hits associated with the final track fit, N_hit, are determined. The Λ parameter is used in the analysis to reject badly reconstructed events, in particular atmospheric muons mis-reconstructed as up-going. The N_hit parameter is related to the muon energy and is used to reject low-energy events that are mainly due to atmospheric neutrino background. A very good angular resolution of about 0.2° is achieved for neutrinos above 10 TeV, see figure 22 (left).
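The text does not specify the form of the initial linear fit. As an illustration only, the sketch below assumes a simple least-squares fit of hit PMT positions against hit times, r(t) = r0 + v t, which ignores the Cherenkov emission geometry and is meant solely as a starting point for later likelihood steps. The function names and the hit representation are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def linear_prefit(positions: Sequence[Vec3],
                  times_ns: Sequence[float]) -> Tuple[Vec3, Vec3]:
    """Least-squares fit of r(t) = r0 + v*t; returns (r0, v) in m and m/ns."""
    n = len(times_ns)
    if n < 2 or n != len(positions):
        raise ValueError("need at least two hits with matching positions")
    t_mean = sum(times_ns) / n
    var_t = sum((t - t_mean) ** 2 for t in times_ns)
    if var_t == 0.0:
        raise ValueError("hit times must not all be identical")
    point, velocity = [], []
    for axis in range(3):
        x_mean = sum(p[axis] for p in positions) / n
        cov = sum((t - t_mean) * (p[axis] - x_mean)
                  for p, t in zip(positions, times_ns))
        v_axis = cov / var_t
        velocity.append(v_axis)
        point.append(x_mean - v_axis * t_mean)  # position extrapolated to t = 0
    return (point[0], point[1], point[2]), (velocity[0], velocity[1], velocity[2])


def unit_direction(velocity: Vec3) -> Vec3:
    """Normalise the fitted velocity vector to a unit direction."""
    norm = math.sqrt(sum(v * v for v in velocity))
    return (velocity[0] / norm, velocity[1] / norm, velocity[2] / norm)


# Toy hits lying exactly on a vertical, upward-going line at 0.3 m/ns
toy_times = [0.0, 100.0, 200.0, 300.0]
toy_positions = [(0.0, 0.0, 0.3 * t) for t in toy_times]
r0, v = linear_prefit(toy_positions, toy_times)
print("start point:", r0, "direction:", unit_direction(v))
```

Such a fit has a closed-form solution, which makes it a cheap seed; its accuracy is limited because detected photons originate from positions along the track that differ from the PMT positions.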
The amount of light collected by the PMTs when a muon travels inside the detector is correlated with the muon energy. To estimate the muon energy, a method exploiting this dependence by means of an artificial neural network has been developed. The first step is the selection of events with a reconstructed muon track travelling inside the detector for an adequate distance. The second step is the evaluation of several quantities related to the total event ToT and to the number of DOMs hit. These quantities are used to feed the neural network. The energy resolution obtained for well reconstructed (cut on Λ applied) and contained events is 0.27 units in log10(E_μ) (see figure 22 right); without containment requirement, the resolution slightly worsens to 0.28 units. Further details on the track reconstruction code can be found in [37].
This energy reconstruction method must be trained on appropriate samples of MC events and is not yet fully integrated in the reconstruction software for ARCA. A simple energy reconstruction using the N_hit parameter is embedded in the reconstruction software and gives results of almost equivalent quality. This method is used for the sensitivity studies presented in the following.
3.2.6. Cascade reconstruction. At the length scale of typical distances between optical modules, cascades thus produce almost point-like signatures, characterised by vertex position, direction, and energy. CC interactions of ν_μ and ν_τ, if they happen in the detector volume, also produce cascades, but the outgoing μ, τ, or τ decay products produce a more complex signature. Hence, cascade reconstruction is optimised for ν NC and ν_e CC interactions, and then the performance is assessed on the latter class. Three independent algorithms have been applied to reconstruct cascade vertex position, direction and energy. The first has been specifically developed for ARCA, and exhibits the best performance. The second and third have been adapted from ANTARES analyses and have outputs which prove useful in event classification and background discrimination. All three are described here, although only the performance of the first is shown.
The first algorithm (algorithm 1) has been specifically developed to exploit the information provided by the KM3NeT multi-PMT DOM. The hit selection is designed to be simple and to allow for a fast reconstruction. Hits on the same PMT within 350 ns are merged using the time of the first hit, and coincidences of two merged hits within 20 ns on a single DOM are used for the vertex fit. This fit minimises time-residuals assuming a spherically expanding shell of light about an assumed cascade maximum. The offset of this fit from the MC true vertex position in the longitudinal direction (figure 23 left) mostly measures the shower elongation, while the offset in the lateral direction (figure 23 right) measures the accuracy, reaching a precision well below 0.5 m in the high-energy regime.
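The idea of a vertex fit that minimises time residuals for a spherically expanding shell of light can be sketched as follows. This is not the ARCA implementation: the light speed in water (vacuum speed divided by an assumed group refractive index of 1.38), the choice of a sum of squared residuals as the figure of merit, and the selection among a list of candidate vertices in place of a proper minimiser are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
TimedHit = Tuple[Vec3, float]  # (PMT position in m, hit time in ns)

C_WATER_M_PER_NS = 0.299792458 / 1.38  # assumed group velocity in sea water


def shell_fit_quality(vertex: Vec3, hits: Sequence[TimedHit]) -> Tuple[float, float]:
    """Return (best emission time t0, sum of squared time residuals).

    The residual of a hit is t_hit - t0 - |r_pmt - r_vertex| / c_water.
    For a fixed vertex the least-squares t0 is the mean of t_hit - travel time.
    """
    travel = [math.dist(vertex, pos) / C_WATER_M_PER_NS for pos, _ in hits]
    t0 = sum(t - tr for (_, t), tr in zip(hits, travel)) / len(hits)
    ssq = sum((t - t0 - tr) ** 2 for (_, t), tr in zip(hits, travel))
    return t0, ssq


def best_vertex(candidates: Sequence[Vec3], hits: Sequence[TimedHit]) -> Vec3:
    """Pick the candidate vertex with the smallest residual sum."""
    return min(candidates, key=lambda v: shell_fit_quality(v, hits)[1])


# Toy event: light emitted at t0 = 50 ns from a known vertex, no noise
true_vertex: Vec3 = (10.0, -5.0, 20.0)
pmt_positions: List[Vec3] = [(0.0, 0.0, 0.0), (90.0, 0.0, 36.0),
                             (0.0, 90.0, 72.0), (-45.0, 30.0, 108.0)]
toy_hits = [(pos, 50.0 + math.dist(true_vertex, pos) / C_WATER_M_PER_NS)
            for pos in pmt_positions]
candidates = [true_vertex, (15.0, -5.0, 20.0), (10.0, 5.0, 40.0)]
print(best_vertex(candidates, toy_hits))
```

Because the emission time enters linearly, it can be eliminated analytically for each trial vertex, leaving a three-parameter minimisation over the vertex position.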
The direction and energy are reconstructed using maximum-likelihood methods, applied to the merged hits as described above. All PMT hits within −100 ns to +900 ns of the expected Cherenkov light-front from the vertex fit are used. Thus each PMT only has a 0.2% chance of receiving a random hit from the optical background. Rather than fitting the ToT (∼charge) measurement from each PMT, the algorithm simply fits the probability of a PMT recording one or more photons within this time-window, making the procedure highly robust. This probability is estimated from simulations as a function of PMT distance and pointing direction to the shower, angle from the shower axis to the PMT position, and electromagnetic-equivalent cascade energy. The strong geometrical dependence in hit probabilities allows for a very high reconstruction quality: nearby PMTs facing towards the cascade, close to the Cherenkov angle, will tend to have a hit probability of unity, while distant PMTs facing away from the cascade, far from the Cherenkov angle, will tend to have a hit probability of zero.
For contained events above 50 TeV, the 1σ energy and median direction resolutions achieved with this method are roughly 10% and 2° respectively, with no loss of efficiency. The resolutions after the selection cuts described in section 3.3.1 are shown in figure 24. For energies above 60 TeV, corresponding to the approximate low-energy threshold of the cut-and-count diffuse-flux analysis (see section 3.3.1), the 1σ energy resolution is characteristically 5%, while the median directional resolution is 1.5°. This energy resolution is close to the limit imposed by variations in the hadronic cascade component (mostly due to the variable inelasticity), which yields less Cherenkov light (∼90% at 100 TeV) than the electromagnetic component [38].
The second algorithm fits the vertex position from the positions and the arrival times of the PMT hits using an M-estimator procedure and applies selection cuts on the resulting quality parameters. The cascade direction is determined from the average direction of hits with respect to the vertex position, and the energy is estimated from the observed ToT values, taking into account the expected relative intensities at given PMT positions. The third algorithm starts from a simple vertex estimation based on large-amplitude hits, followed by a hit selection using this vertex and causality relations and finally by two sequential, independent log-likelihood fits yielding first the vertex position and then the energy and direction of the event. The algorithms yield similar accuracy and are fully efficient for events passing the cuts. While they are less precise than algorithm 1, they exhibit different responses to non-cascade events, and their output is useful for background suppression. More details on the cascade reconstruction codes presented here can be found in [39].
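A likelihood built only from whether each PMT recorded a hit can be written as a product of Bernoulli terms. The sketch below illustrates that structure under a simplifying assumption: the hit probability is taken from a Poisson expectation, p = 1 − (1 − p_noise) exp(−μ), whereas the text states that the probabilities are estimated from simulations. The expected photon counts μ, the function names and the toy values are hypothetical; the default noise probability of 0.002 is the 0.2% figure quoted above.

```python
import math
from typing import Sequence


def hit_probability(expected_photons: float, noise_probability: float = 0.0) -> float:
    """P(at least one hit) for a Poisson signal plus an independent noise hit."""
    return 1.0 - (1.0 - noise_probability) * math.exp(-expected_photons)


def log_likelihood(expected_photons: Sequence[float],
                   was_hit: Sequence[bool],
                   noise_probability: float = 0.002) -> float:
    """Sum of log p_i over hit PMTs and log(1 - p_i) over PMTs without a hit."""
    eps = 1e-12  # keeps the logarithms finite for p very close to 0 or 1
    total = 0.0
    for mu, hit in zip(expected_photons, was_hit):
        p = min(max(hit_probability(mu, noise_probability), eps), 1.0 - eps)
        total += math.log(p) if hit else math.log(1.0 - p)
    return total


# Toy comparison: a near, well-aligned PMT (mu = 5) and a distant one (mu = 0.01)
expected = [5.0, 0.01]
print(log_likelihood(expected, [True, False]))   # pattern matching the hypothesis
print(log_likelihood(expected, [False, True]))   # pattern contradicting it
```

Including the PMTs without a hit is what gives the method its constraining power, since a missing hit on a PMT with a large expected signal strongly disfavours the trial cascade parameters.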
3.2.7. Prospects for improved reconstruction. The main reconstruction goal of ARCA is to precisely determine the parameters of track-like and cascade-like events, and the methods presented above have been developed with this in mind. New reconstruction algorithms tuned on n m CC and n e CC events are in the testing phase and first results are very promising. Also reconstructions tuned for different event classes that present more complex topologies are in the development phase. In particular: • Improved track and cascade reconstructions.
The track and cascade reconstructions described above are first-generation algorithms developed for ARCA, and there are good prospects for improvements in both. In fact, when the reconstruction algorithms were developed the full PMT response was not yet being implemented in the simulation chain. Reconstructions based on a more-detailed knowledge of the detector are currently in development or in the testing phase.
In particular, the best current cascade reconstruction (algorithm 1) uses very little timing information to fit the cascade energy and direction, and no information from individual PMT signal magnitudes (all ToT values treated equally). A new cascade reconstruction algorithm that exploits this information in detail is under development. First estimates indicate that a cascade resolutions of 1°may be attainable with improved efficiency. Additionally, a new track reconstruction algorithm has recently been developed. From initial values obtained by a rigorous scan of the full solid angle, the likelihood is maximised using a multi-dimensional probability distribution function (PDF) of the arrival time of Cherenkov light from the muon. In figure 25, the angular resolution reached for n m CC events is reported, showing that an angular resolution better than  0.1 is reached for events with energy higher than 100 TeV.
These reconstructions have not yet been processed through the full MC chain described in section 3.2, and hence are not used in the analyses presented here. However, since the atmospheric background for point-source studies (section 3.3.6) reduces with the square of the angular resolution, using these reconstructions is expected to significantly improve the sensitivity of such studies in the near future.
A τ produced in a ν_τ CC interaction will on average travel 4.89 cm TeV⁻¹ before decaying. If the decay is not into a μ (~83% probability), the decay products will create a cascade-like signature offset from the first interaction vertex. At sufficiently high energies (E ≳ 100 TeV), this second cascade will be offset from the first by distances significantly larger than the precision of the cascade reconstruction, creating a 'double bang'. Identifying such double-bang events would be a clear signature of the flavour of the neutrino primaries. A preliminary investigation, conducted for an assumed initial hadronic cascade ('bang') energy, showed that cascade reconstruction algorithm 1 could identify both cascades when separated by 10 m or more, i.e. for ν_τ at ∼250 TeV and above. It is expected that an even closer separation will be resolvable.
• Starting track events.
A ν_μ CC interaction in the detector volume will produce a cascade at the interaction vertex, and an outgoing μ; ν_τ CC events with subsequent τ → μνν decays will produce a similar signature. Such interactions typically do not manifest themselves as either well-reconstructed cascade or track events, due to the presence of the other component. An optimal reconstruction method would separate both components and reconstruct them simultaneously, allowing for improved energy and direction resolution on the neutrino primary, and a better event selection.
• Muon bundles.
Groups of muons from the core of an extended air shower (EAS) exhibit a signature very similar to that of a single high-energy muon in the detector. However, their stochastic energy-loss pattern is much more uniform, and their lateral spread is non-negligible at the characteristic spatial resolution scale of ARCA. Currently, muon bundles are reconstructed using a single-muon hypothesis. Identifying such events can be used to reduce the background for studies searching for an excess of single energetic down-going muons, either from an astrophysical ν_μ flux, or from the prompt decay of charm particles in EAS.
The rate of down-going μ from EAS above ARCA that produce a detectable signature in the detector is expected to be about 50 Hz. With a typical event duration of 5 μs, approximately one in 4000 events will have a coincidental muon present, corresponding to a double coincidence every 80 s. Current simulations only model particles for individual EAS, and current reconstruction methods only return at most a single track or cascade event. Observe that this effect is much less important for ARCA than it is for IceCube: the increased detector depth reduces the rate of coincident down-going muonic background, and the better time-resolution afforded by the low scattering in sea water allows photons from different sources to be separated within a much narrower time window.
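The numerical estimates quoted in the list above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the ARCA software; the function names are hypothetical, and the only inputs are the numbers given in the text (4.89 cm TeV⁻¹ mean τ path, 50 Hz muon rate, 5 μs event duration, and the quadratic dependence of the point-source background on angular resolution).

```python
# Numbers quoted in the text
TAU_PATH_CM_PER_TEV = 4.89   # mean tau path length per TeV of tau energy
MUON_RATE_HZ = 50.0          # detectable down-going muon rate
EVENT_DURATION_S = 5e-6      # typical event duration (5 microseconds)


def tau_path_m(tau_energy_tev):
    """Mean tau path length in metres for a given tau energy in TeV."""
    return TAU_PATH_CM_PER_TEV * tau_energy_tev / 100.0


def tau_energy_for_path_tev(path_m):
    """Tau energy in TeV whose mean path length equals path_m metres."""
    return 100.0 * path_m / TAU_PATH_CM_PER_TEV


def coincidence_probability(rate_hz=MUON_RATE_HZ, duration_s=EVENT_DURATION_S):
    """Chance that an unrelated muon overlaps a given event (valid for rate * duration << 1)."""
    return rate_hz * duration_s


def seconds_between_coincidences(rate_hz=MUON_RATE_HZ, duration_s=EVENT_DURATION_S):
    """Mean time between double coincidences."""
    return 1.0 / (rate_hz * coincidence_probability(rate_hz, duration_s))


def background_scale(new_resolution_deg, old_resolution_deg):
    """Relative point-source background when the angular resolution changes."""
    return (new_resolution_deg / old_resolution_deg) ** 2


if __name__ == "__main__":
    print(f"tau energy for a 10 m mean path: {tau_energy_for_path_tev(10.0):.0f} TeV")
    print(f"one event in {1.0 / coincidence_probability():.0f} has a coincident muon")
    print(f"one double coincidence every {seconds_between_coincidences():.0f} s")
    print(f"background factor for 0.2 -> 0.1 deg: {background_scale(0.1, 0.2):.2f}")
```

A mean path of 10 m corresponds to a τ energy of roughly 204 TeV. Since the τ carries only part of the neutrino energy, this is of the same order as the ∼250 TeV ν_τ threshold quoted above.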
3.2.8. Background suppression. Backgrounds from atmospheric muons, as well as random coincidences of hits from ⁴⁰K decays, are reduced to acceptable levels by applying selection cuts on the event reconstruction quality, the reconstructed zenith angle for track-like events and quantities related to the event energy (such as the number of hits) or event topologies (e.g. using boosted decision tree (BDT) techniques; see section 3.3). For point-source studies, the main method of reducing the background event rate of both muon and neutrino events is via the excellent angular resolution afforded by seawater, since the background rate reduces with the square of the resolution. For studies of a diffuse flux in particular, however, the most problematic remaining source of background is the atmospheric neutrino flux.
3.2.9. Self-veto of down-going atmospheric neutrinos. The interactions of CRs with the atmosphere generate extensive air showers (EASs) in which both neutrinos and muons are produced. Despite the ∼3 km overburden of water, muons with an energy in the TeV range and above can reach the detector, either singly or in multiples (muon 'bundles'), particularly at small zenith angles. These muon bundles can be used to 'veto' any accompanying neutrinos, allowing for a strong reduction of the down-going atmospheric neutrino background. This technique has been proposed in [31], where it is predicted in the context of an IceCube-like detector that atmospheric neutrinos above 10 TeV and with zenith angles less than 60° can be vetoed with almost 100% efficiency. More detailed calculations in [40] suggest a somewhat lower, but still significant, veto probability.
Events simulated by CORSIKA (see section 3.2) have been used to estimate the self-veto probability, and some preliminary results for the high-energy diffuse analysis using the cascade channel (see section 3.3.1) are shown in [34]. In the case of ARCA, accompanying muons make neutrino-induced cascade events less cascade-like, so that while these events are not explicitly vetoed (as would be the case with IceCube), their topology is such that they appear more background-like than signal-like in sensitivity studies targeting down-going cascade events (see section 3.3). An example of such an event is given in figure 26.
An effective 'veto' effect can be demonstrated by using the (less sensitive) 'cut-and-count' analysis method, as shown in figure 27. Shown are the distributions of atmospheric ν_e CC events in the event selection both before (left) and after (right) the self-veto effect has been taken into account. The total effect is a reduction of the down-going atmospheric neutrino events in the selection by a factor of about two, or ∼25% in the all-sky background, with higher-energy events close to the zenith being more efficiently rejected. It is difficult, however, to compare this estimate with those of [31,40] due to the different event samples, rejection methods, and detector depths. It should also be noted that the current estimate suffers from low statistics, and that the analysis was not optimised with the self-veto effect being taken into account. Hence, the final self-veto efficiency is expected to be higher, and improvements of the results for searches of both diffuse and point-like astrophysical neutrino fluxes in the cascade channel are anticipated.

Sensitivity studies
In this section, studies of the sensitivity of ARCA to diffuse fluxes and point-like sources are presented. All the analyses take into account (anti-)neutrinos of all flavours (ν_μ, ν_e, and ν_τ) in equal proportions and their CC and NC interactions, as simulated according to section 3.2. Each analysis proceeds in the following steps:
(1) A preselection of the events to reject most of the atmospheric background, mostly by cuts on parameters that are provided by the reconstruction algorithms or that are related to the total ToT or number of hits.
(2) A multivariate analysis based on the BDT from the ROOT TMVA package [41], applied to the preselected events for a more stringent background rejection. This step is not applied in all analyses. When it is used, the exact input observables vary with each analysis, but always consist of a subset of the reconstructed event directions, energies, and positions from the track and the three cascade reconstruction methods described in section 3.2.4. Additionally, quality parameters related to the fit procedures, such as the log-likelihood of each fit, are included, as are measures of the photon arrival time distribution about the light front (see section 3.2.4 and [39,42] for further details).
(3) A 'cut-and-count' analysis method for a fast evaluation of the discovery potential and a rough estimate of the number of events from background and signal. This method consists of maximising the model discovery potential (MDP) (see e.g. the methods of [42,43]) by placing cuts on simulated observables to obtain clean event samples.
(4) A maximum likelihood method applied to the event sample resulting from step (2) to calculate the discovery potential at different significance levels. All quoted significances arise from this method. However, since only loose cuts are applied for the likelihood maximisation in order to retain the maximum possible information, the resulting event sample is very broad. Therefore, the expected numbers of events passing cuts are quoted using the cut-and-count method above, reflecting the number of high-quality signal candidate events.
For the last step, the likelihood ratio function
LR = \sum_{i=1}^{n} \log \frac{\frac{n_{sig}}{n} P_{sig}(X_i) + \left(1 - \frac{n_{sig}}{n}\right) P_{back}(X_i)}{P_{back}(X_i)}    (2)
is employed, where n_sig is the estimated number of signal events, n is the total number of events (and hence, implicitly, n_back = n − n_sig is the number of background events), and P_sig and P_back are the PDFs for signal and background events, respectively. The LR is maximised by varying n_sig to obtain LR_max. The PDFs are functions of one or more parameters X, such as the BDT output if it is applied, and/or other parameters related to the specific analysis.
Pseudo-experiments (PEs) are performed and LR is maximised for each PE. The distributions of LR_max when simulated signal events are present are compared to the distributions in the background-only case to evaluate the significance of each simulated observation.
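The maximisation and pseudo-experiment procedure described above can be illustrated with a toy sketch. It assumes that the likelihood ratio takes the usual mixture form, LR = Σ_i log[(f P_sig(X_i) + (1 − f) P_back(X_i)) / P_back(X_i)] with f = n_sig/n. The one-dimensional toy PDFs, the simple grid scan over n_sig and all function names are illustrative assumptions and do not represent the collaboration's analysis code.

```python
import math
import random


def likelihood_ratio(xs, n_sig, p_sig, p_back):
    """LR for a trial n_sig: sum over events of log(mixture PDF / background PDF)."""
    frac = n_sig / len(xs)
    return sum(
        math.log((frac * p_sig(x) + (1.0 - frac) * p_back(x)) / p_back(x))
        for x in xs
    )


def maximise_lr(xs, p_sig, p_back, steps=200):
    """Scan n_sig on a grid over [0, n]; return (best n_sig, LR_max)."""
    n = len(xs)
    best_n_sig, best_lr = 0.0, 0.0  # LR is exactly 0 at n_sig = 0
    for k in range(1, steps + 1):
        n_sig = n * k / steps
        lr = likelihood_ratio(xs, n_sig, p_sig, p_back)
        if lr > best_lr:
            best_n_sig, best_lr = n_sig, lr
    return best_n_sig, best_lr


# Toy PDFs on X in [0, 1] (assumptions for illustration only)
def p_back(x):
    return 1.0          # flat background


def p_sig(x):
    return 0.5 + x      # signal rising towards X = 1, normalised on [0, 1]


def pseudo_experiment(rng, n_back, n_sig_true):
    """Draw one pseudo-experiment: background events plus injected signal events."""
    xs = [rng.random() for _ in range(n_back)]
    # Inverse-CDF sampling of p_sig: F(x) = x/2 + x^2/2 = u
    xs += [(-1.0 + math.sqrt(1.0 + 8.0 * rng.random())) / 2.0 for _ in range(n_sig_true)]
    return xs


if __name__ == "__main__":
    rng = random.Random(1)
    for n_inj in (0, 100):
        lr_max = [maximise_lr(pseudo_experiment(rng, 100, n_inj), p_sig, p_back)[1]
                  for _ in range(50)]
        print(n_inj, "injected signal events: mean LR_max =", sum(lr_max) / len(lr_max))
```

Comparing the LR_max distribution with injected signal to the background-only distribution gives the fraction of pseudo-experiments exceeding a chosen significance threshold. A production analysis would use a proper minimiser and far more pseudo-experiments than this sketch.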
Unlike in the high-energy starting event (HESE) analysis of IceCube [12], no explicit veto to remove atmospheric muon contamination is used for KM3NeT/ARCA. Rather, the methods of steps (2) and (4) above assign to each event likelihoods based on the observed event topology, which is well preserved in sea water due to the low light scattering.
This study has been optimised assuming that the IceCube signal originates from an isotropic, flavour-symmetric neutrino flux following a power-law spectrum with a cut-off at a few PeV. The cut-off (or a steeper spectrum) is implied by the observation of events with a deposited energy exceeding 1 PeV and the absence of events at about 6.3 PeV associated with the Glashow resonance (W production in scattering of ν̄_e on electrons). The single-flavour energy spectrum has been parameterised as a power law in E/GeV multiplied by an exponential cut-off, exp(−E / 3 PeV), in units of GeV⁻¹ cm⁻² s⁻¹ sr⁻¹ (equation (3)).
Since the first IceCube discovery [12], several new analyses with updated event samples and different event selection strategies have been published [15,20,44]. In these analyses various compatible parameterisations for the cosmic neutrino flux have been proposed. To check the robustness of our results with respect to the assumed diffuse neutrino flux, we have also calculated the significance of the KM3NeT/ARCA observation for the diffuse flux from [45], which is similar to the results recently reported in [14]: a power law in E/GeV with spectral index 2.46, again multiplied by exp(−E / 3 PeV), in units of GeV⁻¹ cm⁻² s⁻¹ sr⁻¹ (equation (4)). Note that for this steeper spectrum, a cut-off to suppress the Glashow resonance signature is not necessarily required by observations, but is kept here in order to avoid biasing the analysis by maximising the selection of such events. In figure 28 these fluxes are presented together with the atmospheric neutrino fluxes for comparison. In the following, the sensitivity studies for diffuse fluxes are presented for the cascade channel and for the track channel.
3.3.2. Cascade channel. Events simulated as described in section 3.2 have been reconstructed with the three available cascade reconstruction codes discussed in section 3.2.4.
The first selection cut requires the containment of the reconstructed vertex in a cylindrical volume around the detector centre, with radius r < 500 m and height z < 200 m. The effect of this cut is illustrated in figure 29. It rejects most of the atmospheric muons which, coming from above, have the reconstructed vertex in the upper part of the detector. The containment cut reduces the fiducial volume by about 20% with respect to the instrumented volume, although this is compensated for by the included region below the instrumented volume. The summed time-over-threshold of the event, ToT_evt, taken over the N causally connected hits selected by the cascade reconstruction algorithm, is related to the energy deposited in the detector. The cumulative ToT_evt distribution is shown in figure 29 (bottom left panel). A cut ToT_evt > 12 μs is applied and rejects most low-energy atmospheric muons and a large part of the atmospheric neutrino background, which is concentrated at lower energies. The corresponding rejection efficiency is reported in figure 30 (green points). As shown in figure 30 (left panel), the number of reconstructed atmospheric muons is still too large. To further reduce this background, a BDT algorithm was applied to the preselected event sample. As input for the BDT training, several quality parameters from the available shower and track reconstruction algorithms are used. The BDT is then trained to discriminate tracks from showers using simulated datasets of atmospheric muons and ν_e CC events as training samples. The cumulative distribution of the resulting discrimination parameter ρ and the cut applied on ρ are shown in figure 29 (bottom right panel).
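The containment and ToT_evt preselection at the start of this selection can be written as a simple event filter. This is a minimal sketch using the cut values quoted above; the `CascadeCandidate` container, its field names and the choice of coordinate origin (detector centre) are assumptions made for illustration, and the subsequent BDT step is not included.

```python
from dataclasses import dataclass

R_MAX_M = 500.0     # containment radius quoted in the text
Z_MAX_M = 200.0     # upper containment limit on the vertex height quoted in the text
TOT_MIN_US = 12.0   # minimum summed time-over-threshold quoted in the text


@dataclass
class CascadeCandidate:
    r_m: float         # radial distance of the reconstructed vertex from the detector axis (m)
    z_m: float         # height of the reconstructed vertex relative to the detector centre (m)
    tot_evt_us: float  # ToT summed over the selected hits (microseconds)


def passes_containment(c: CascadeCandidate) -> bool:
    """Vertex inside the cylinder; only an upper limit on z is applied, as in the text."""
    return c.r_m < R_MAX_M and c.z_m < Z_MAX_M


def passes_preselection(c: CascadeCandidate) -> bool:
    """Containment cut followed by the ToT_evt cut."""
    return passes_containment(c) and c.tot_evt_us > TOT_MIN_US


if __name__ == "__main__":
    events = [
        CascadeCandidate(r_m=120.0, z_m=-50.0, tot_evt_us=30.0),   # contained, bright
        CascadeCandidate(r_m=120.0, z_m=260.0, tot_evt_us=30.0),   # vertex too high
        CascadeCandidate(r_m=120.0, z_m=-50.0, tot_evt_us=5.0),    # too little light
    ]
    print([passes_preselection(e) for e in events])
```

Applying only an upper limit on z reflects the statement that the region below the instrumented volume is included in the fiducial volume.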
A first estimate of the discovery potential, obtained with the cut-and-count approach, yields final event selection cuts of ρ > 0.5 and E ≳ 50 TeV. The corresponding rejection efficiency is reported in figure 30 (blue line). With these cuts the background due to atmospheric muons is almost completely rejected for the presently available simulation live-time of three years at high energies. Table 3 reports the number of events per 5 years for ARCA at each step of the analysis for the different event samples. To further improve the evaluation of the discovery potential, the maximum likelihood method (step (4) above) has been applied to the preselected events. The PDFs in equation (2) are functions of the reconstructed energy E_rec and the BDT output ρ. The resulting significance is reported in figure 35 as a function of the number of observation years. With the KM3NeT/ARCA detector the assumed signal flux will be detectable at 5σ in the cascade channel in about one year of observation time.
The estimate of the significance depends on the assumed background and in particular on the model assumed for the description of the conventional and the prompt components of the atmospheric neutrino flux (see section 3.1.2). For the cascade channel the maximum variation of the significance, reported as a red band in figure 35, has been obtained assuming the maximum and minimum flux values of the prompt atmospheric neutrino component reported in [19]. Moreover, the significance has also been estimated taking into account the new prompt calculation reported in [23] (see section 3.1.2). In this case, the time to discover the diffuse flux is reduced by about 30%.

Track channel.
Since energetic muons can have very long tracks (a 10 TeV muon has a path length of about 5 to 6 km in water), muon neutrinos with interaction points far from the instrumented volume can be detected, thus making the effective volume much larger than the geometrical detector volume. The main challenge in using the track channel is to distinguish these events from atmospheric muons. Here we follow the traditional approach to reject atmospheric muons by using the Earth as a shield, and select track-like events that come from below the horizon, or a few degrees above it. In this analysis a cut on the reconstructed zenith angle, θ_rec > 80°, is applied. In figure 32 the ratio of the numbers of selected events and triggered events is reported (red lines) for atmospheric muons (left panel) and ν_μ CC events (right panel). The atmospheric muon rate at energies below 10⁶ GeV is reduced by more than one order of magnitude. Most of the remaining atmospheric muons are mis-reconstructed as up-going or are near the horizon.
To remove the mis-reconstructed events, an additional cut on the quality parameter Λ is applied. To reduce the background due to atmospheric neutrinos, a cut on the number of hits associated with the fitted track, N_hit, is applied (see section 3.2.4). N_hit is related to the muon energy loss in the detector and thus to the primary neutrino energy. The cumulative N_hit distribution is presented in figure 33 (right panel). The MC neutrino energy distribution for the event sample after final cuts is shown in figure 34. The cuts select events in the neutrino energy range from about 80 TeV to about 3 PeV. A discovery at 5σ with 50% probability is achieved in about 3.2 years.
As in the cascade analysis, the maximum likelihood method was applied to the preselected event sample (θ_rec > 80° and Λ > −5.8) to further improve the sensitivities. The likelihood ratio (equation (2)) was calculated for signal and background using PDFs that were one-dimensional functions of N_hit. The resulting significance is reported in figure 35 as a function of the observation time. The assumed signal flux can be detected with KM3NeT/ARCA at 5σ in the track channel in about 1.6 years of observation time with 50% probability. For the track channel, the maximum variation of the significance (reported as a grey band in figure 35) has been obtained with the assumed uncertainties in the intensity of the conventional atmospheric neutrino flux (see section 3.1.2).
Table 4. Expected numbers of events for the KM3NeT/ARCA detector (two building blocks) for the different event samples in 5 years of observation time. The cosmic events correspond to the source flux of equation (3).
With the cascade and track channels combined, KM3NeT/ARCA is expected to observe the IceCube flux (equation (3)) in about 6 months with a significance of 5σ with 50% probability.

To investigate the sensitivity of these results to the assumed form of the IceCube diffuse flux, both the cascade and track analyses were repeated for signal fluxes according to equation (4), both with and without the 3 PeV cut-off. In each case, the flux normalisation constant, Φ_5σ, required for a 5σ discovery after 1 year of observation time, was calculated. The results are reported in table 5 in terms of their ratio to the flux normalisation reported by IceCube, Φ_0^IC. Values larger (less) than unity indicate a 5σ discovery time of more (less) than 1 year. The results show that for flux assumptions with a softer spectrum and the same cut-off the main results of our analysis do not change, and in fact a small improvement (≈10%) is expected.
3.3.5. Diffuse neutrino flux from the GP. One of the most promising potential source regions of a diffuse astrophysical neutrino flux is the GP. Neutrinos are expected to be produced in the interactions of the galactic CRs with the interstellar medium and radiation fields, with a potentially significant excess with respect to the expected extragalactic background. The observation of diffuse TeV γ-ray emission from the GP [46,47], which is expected to arise from the same hadronic processes that would produce high-energy neutrinos, strongly supports this hypothesis. Also Fermi-LAT observes, after the subtraction of known point-like emitting sources, a broad diffuse emission from the GP, with a spectrum consistent with a significant hadronic component [48].
Recently, also in connection with the observed IceCube high-energy neutrino events, new phenomenological models for the diffuse galactic neutrino emission have been proposed [49][50][51][52][53]. In particular, in [54] a non-uniform CR transport model with a radially dependent diffusion coefficient has been adopted to explain the high-energy diffuse γ-ray emission along the whole GP, as well as the hardening of CR spectra measured by PAMELA and AMS-02 around 250 GeV and two possible CR cut-offs, at 5 PeV and 50 PeV, compatible with KASCADE and KASCADE-Grande observations. In [51], these authors estimate that the astrophysical flux detected by IceCube in both the HESE [13] and diffuse muon [55] analyses is still dominated by an extragalactic diffuse component, with galactic emission accounting for 15% and 10% of events, respectively. Using this model, a detailed prediction of the neutrino emission from the inner GP, i.e. for |l| < 30° and |b| < 4° (b and l being the Galactic latitude and longitude, respectively), has been obtained (see figure 4 of [54]). This flux is adopted here to estimate the performance of the KM3NeT/ARCA detector in searching for neutrinos from the GP. The selected region is entirely located in the Southern Hemisphere. The estimated one-flavour neutrino flux has been parameterised (equation (5)). An analysis similar to that described for the all-sky diffuse track channel has been performed to estimate the KM3NeT/ARCA sensitivity to this flux. Events were preselected by a cut on the reconstructed zenith angle.
3.3.6. Point-like neutrino sources. Due to its good angular resolution, KM3NeT/ARCA is a very promising instrument for the detection of point-like sources. In particular, its location in the Northern Hemisphere will allow the study of most Galactic sources, as well as extragalactic sources (which are expected to be approximately uniformly distributed over the sky) using up-going muon track events.
In this section the sensitivity of the ARCA detector to point-like sources is discussed. In particular, the following two physics cases are analysed:
• Neutrino emission by the supernova remnant (SNR) RX J1713 and the pulsar wind nebula (PWN) Vela-X, which are at present the Galactic objects exhibiting the most intense high-energy emission [56][57][58]. For these sources, the zenith position, angular extension, and neutrino flux parameterisation are extracted from the measured high-energy γ-ray spectra. In both cases, the expected neutrino spectra are evaluated from the γ spectrum under the hypothesis of a transparent source and 100% hadronic emission. Although PWNe are commonly assumed to be powered by e⁻/e⁺ winds, they will entrain ions from the ambient medium, possibly accelerating them to very high energies.
• Sources without significant angular extension, emitting a benchmark E⁻² neutrino spectrum. These can be viewed as characteristic of extragalactic sites of hadronic acceleration (e.g. AGN) with cut-offs expected at very high energies. While the actual spectra of individual neutrino sources are not expected to follow a simple E⁻² power law, and may exhibit features such as a peak at PeV energies, or harder spectra extending to EeV energies [59], the projected sensitivity to an E⁻² flux gives a good indicator of ARCA's ability to study such extragalactic sources with higher-energy fluxes.
For the detection of neutrinos from point-like sources, the best performance is expected from a search for track-like events. In fact, as discussed in section 3.2.4, with long muon tracks an angular resolution of about 0.2° can be achieved. To remove the unavoidable down-going atmospheric muon background, events are selected that contain tracks reconstructed as up-going.
At the latitude of the Mediterranean Sea, selecting tracks that are reconstructed below or a few degrees above the horizon implies a reduction of the visibility for source declinations above −40°, as shown in figure 37. On the other hand, it is possible to view Northern-sky sources below +50° of declination, giving a total sky coverage of ≈3.5π sr.
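The quoted sky coverage follows from the solid angle of the declination band that is visible at least part of the time. A short sketch of this arithmetic, assuming the band extends from the south celestial pole (−90°) up to +50°, is given below; the function name is hypothetical.

```python
import math


def solid_angle_sr(dec_min_deg, dec_max_deg):
    """Solid angle (sr) of the sky band between two declinations."""
    return 2.0 * math.pi * (
        math.sin(math.radians(dec_max_deg)) - math.sin(math.radians(dec_min_deg))
    )


if __name__ == "__main__":
    coverage = solid_angle_sr(-90.0, 50.0)
    print(f"{coverage / math.pi:.2f} pi sr")   # prints "3.53 pi sr"
```

The result, about 3.5π sr out of the full 4π sr, agrees with the coverage stated above.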
3.3.7. Galactic sources. SNR RX J1713.7-3946 (short: RX J1713) is a young shell-type supernova remnant that has been observed by H.E.S.S. in several campaigns [60,61]. The γ rays are emitted from a relatively large circular region with a radius of about 0.6° and a complex morphology, with an energy spectrum that extends up to 100 TeV. The source, at a declination of −39° 46′, is visible for 80% of the time when selecting tracks with reconstructed zenith angle θ_rec > 78°. For the present analysis, homogeneous emission from a circular region around the measured declination with a radial extension of 0.6° has been assumed. The neutrino flux adopted has been derived from the measured γ-ray spectrum and has been parameterised following [62] (equation (6)).
This energy spectrum is shown in figure 38 (black line).
In the point-source analysis for track-like events, all simulated events (ν_μ, ν_e, ν_τ, μ_atm) have been reconstructed with the track reconstruction code described in section 3.2.4. Since the maximum elevation for a source at the declination of RX J1713 is about 14°, and to maximise the signal-to-background ratio, events were preselected requiring that the reconstructed track has a zenith angle θ_rec > 78° and an angular distance from the centre of the source of α < 10°. The numbers of events at reconstruction level and after the preselection cuts are shown in table 6. Even after the preselection, the numbers of events due to neutrino and muon atmospheric background largely exceed the number of expected signal events from the source. The atmospheric muons can be efficiently removed by imposing a cut on the Λ parameter as shown in figure 39. Finally, a BDT trained to discriminate signal events from neutrino background is applied.
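The maximum elevation of about 14° quoted above can be checked from the source declination and the detector latitude, since a source culminates at an elevation of 90° − |φ − δ|. The sketch below assumes a detector latitude of φ ≈ 36.27° N, which is not stated in this section and is used only for illustration; the function names are hypothetical.

```python
ASSUMED_LATITUDE_DEG = 36.27   # assumption: approximate latitude of the ARCA site


def dms_to_deg(sign, degrees, arcmin):
    """Convert a declination given as sign, degrees and arcminutes to decimal degrees."""
    return sign * (degrees + arcmin / 60.0)


def max_elevation_deg(declination_deg, latitude_deg=ASSUMED_LATITUDE_DEG):
    """Elevation above the horizon at upper culmination (negative if never above it)."""
    return 90.0 - abs(latitude_deg - declination_deg)


if __name__ == "__main__":
    rxj1713 = dms_to_deg(-1, 39, 46)   # declination -39 deg 46 arcmin, from the text
    print(f"RX J1713 maximum elevation: {max_elevation_deg(rxj1713):.1f} deg")
```

Under this latitude assumption the source reaches about 14° above the horizon, i.e. a minimum zenith angle of about 76°, so the θ_rec > 78° cut removes only the short period around culmination.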
The MDP is then maximised by adjusting the cut on the BDT output value. The number of events per 5 years of observation time surviving these cuts is indicated in table 6, together with the number of events expected at each step of this analysis. The significance has been evaluated with an unbinned method [63] by maximising the likelihood ratio of equation (2), with PDFs expressed as functions of the BDT output (figure 41). The result shows that a 3σ significance can be reached in about 4 years of observation time.
The same analysis has been applied to Vela-X, which is one of the nearest and most intense PWNe, and has been extensively studied in TeV γ rays by the H.E.S.S. Collaboration [64,65]. Vela-X is located at a declination of −45° 36′. The neutrino spectrum has been estimated from the differential energy spectrum using the prescription in [66][67][68] for an integration radius of 0.8° around the source centre and has been parameterised accordingly (equation (7)).
This spectrum is shown in figure 38 (red line). The source has been simulated as a homogeneously emitting disk of 0.8° radius. The expected sensitivity of ARCA to Vela-X is shown in figure 42 as a function of the observation time. Owing to the good visibility of the GP, a significance of 3σ can be reached in less than 3 years of observation time. The bands show the variation of the significance due to the uncertainty on the normalisation of the conventional part of the atmospheric neutrino spectrum (see section 3.1.2).
3.3.8. Sources with a spectrum ∝ E⁻². The flux required for a 5σ discovery has also been calculated for a generic point-like source with a spectrum ∝ E⁻². In the preselection sample only events with θ_rec > 80° have been selected. In this analysis, at present, the BDT procedure has not been applied, since the larger difference in the slopes of the atmospheric and source neutrino energy spectra eases discrimination between them. After the preselection, an unbinned method has been applied that maximises the likelihood ratio of equation (2), with PDFs as functions of the two parameters N_hit (related to the energy of the neutrinos) and α, the angular distance from the source centre. The 5σ discovery flux is reported in figure 43 as a function of the declination for 3 years of observation time, corresponding to the exposure for the current IceCube result. The upper limit of ANTARES is also reported for comparison.
ARCA's expected resolution on cascades of about 1.5° (see section 3.2.6) allows us to also use this channel for a point-source search, as recently demonstrated by ANTARES [71]. Since discriminating down-going cascade events from the muonic background is easier than for tracks, cascade searches have a 4π sr coverage, making this detection channel especially important for sources with an otherwise limited visibility. First preliminary results for the cascade channel for generic point-like sources with an E⁻² spectrum are also presented in this section.
The sensitivity of KM3NeT/ARCA to point-like sources has been evaluated using cascade events. In this analysis all simulated events have been reconstructed with both the track and cascade reconstructions. To remove the atmospheric muons, which are the main source of background, a preselection of events was performed, leading to two event samples. The containment cuts mainly select cascade events that have the interaction vertex inside the detector volume and remove track-like events. Remaining track-like events are rejected by the Λ cut in sample A (removing well-reconstructed atmospheric muons) and by the ToT cut, which removes lower-energy tracks with the vertex inside the instrumented volume. In both samples, most of the selected source events are cascade-like events, the track 'contamination' being of order 10%.
Since the cut-and-count method has not been applied in this case (i.e. no cut on the distance between the reconstructed direction and the source centre has been applied, with events reconstructed closer to the source appearing more source-like), the unweighted number of signal and background events passing the above cuts is not meaningful, and is not reported.
The same BDT procedure described in section 3.3.1 for the diffuse cascade analysis, to discriminate tracks from showers, has been applied to the two samples. An optimal cut on the BDT output variable was found to be ρ > 0.5.
The discovery potential has been obtained by performing an unbinned log-likelihood search. The likelihood takes into account the energy and directional information of each event reconstructed with the cascade reconstruction. In order to take into account the two different event samples, a likelihood ratio similar to that of equation (2) has been considered, in which j indicates the data sample and i indicates the event in that sample. S_j^i and B_j^i are the PDFs for the signal and background of the jth sample and are evaluated as functions of the reconstructed cascade energy and of the distance from the source centre. N_j is the total number of events in the jth sample, and n_j^signal is the estimated number of signal events in that sample. The discovery flux at the 5σ level is reported in figure 45 as a function of the declination (red line) for 3 years of observation time, and is compared with the discovery flux obtained for the track analysis. For declinations higher than 50°, where the visibility for up-going tracks is very poor or null, a value competitive with the present IceCube value can be obtained (see figure 43).
The cascade angular resolution of the preselected events for δ = 45° is reported in figure 44 (right panel), and shows that an average angular resolution of about 2° can be reached. This includes all events passing the cuts described above (no cuts on reconstructed angle to the source), showing that a very good angular resolution is obtained.
Similarly to the diffuse analysis, improvements in point-source sensitivity are expected when combining the events from the track and cascade channels, especially for sources located in the Northern sky.
3.3.9. Potential improvements in point-like-source searches. An improvement in the sensitivity of the search for neutrinos from point-like sources is expected once the two new reconstruction algorithms currently being tested (one for track and one for cascade events, see section 3.2.4) are applied to the MC data set. The first tests indicate a higher number of reconstructed events (higher efficiency) in addition to a better angular resolution.
The search for neutrino sources can also be improved by grouping potential sources together in a procedure that is known as 'source stacking'. This is usually applied to sources of the same class. In our case, several potential sources otherwise too weak to be investigated individually are present in the Galactic and extragalactic region. This technique has not yet been applied, but an improvement is expected both for the search of Galactic PeV sources (SNR, PWN, etc) and for extragalactic sources (AGN).
3.3.10. Further physics opportunities. In addition to the central science targets of neutrino astronomy, i.e. investigating high-energy cosmic neutrinos and identifying their astrophysical sources, KM3NeT/ARCA will offer a wide spectrum of further physics opportunities, of which a selection is sketched in the following. Corresponding physics analyses have been pioneered by the IceCube and ANTARES collaborations.
• Gamma-ray bursts (GRBs)
There is strong evidence that long-duration GRBs are produced from relativistic jets formed in the collapse of a massive star [72]. Shocks generated either within the jet, or when the jet collides with surrounding material, are potential CR acceleration sites, with an associated neutrino flux from subsequent interactions and decays [73]. The short duration of GRBs (seconds to minutes) allows a narrow neutrino-search time-window, effectively reducing the background when compared to a standard point-source search. This has allowed ANTARES and IceCube to constrain the properties of GRB jets [74,75]. KM3NeT/ARCA will increase the sensitivity of such searches similarly to that for E⁻² point sources (section 3.3.8).

• Multi-messenger studies
KM3NeT/ARCA will be part of a global alert system able to tag synchronous observations of different experiments, observing e.g. γ rays or gravitational waves, that in themselves are not significant but become so when combined. Another branch of multimessenger studies is the creation of alerts for optical, radio or x-ray telescopes to follow up 'suspicious' neutrino observations, such as a doublet of events from the same celestial direction during a short time period. As per the ANTARES TAToO program [76], KM3NeT/ARCA will monitor more than half the sky, and will be able to generate alerts with high angular precision within seconds. As ultra-high-energy CRs are also expected to retain some directional information, correlation studies with the arrival directions of events detected by e.g. the Pierre Auger Observatory will also be possible.

• Cosmic ray physics
KM3NeT/ARCA will register a huge number of high-energy atmospheric muons that reflect the direction of impact of the primary CR particle with sub-degree precision. This data set will allow us to investigate inhomogeneities of the CR flux and to complement the corresponding sky maps by IceCube and dedicated CR experiments. A further opportunity might be the detailed investigation of muon bundles that could, via their multiplicity and divergence, be related to the chemical composition of CRs.

• Particle physics with atmospheric muons and neutrinos
The high-energy end of the atmospheric muon and neutrino spectra are expected to be dominated by prompt processes, i.e. the production of charm or bottom hadrons in the primary CR reactions in the atmosphere and their subsequent fast decay to leptons. Little is experimentally known about these reactions, and theoretical modelling is difficult since it involves QCD processes at the border line of the non-perturbative regime. Identifying and measuring the muons and neutrinos from these processes would shed light on the underlying reaction mechanisms.

• Tau neutrinos
The capability to identify tau neutrino reactions at energies beyond a few 100 TeV (see section 3.2.4) will not only allow for constraining the flavour composition of high-energy cosmic neutrino fluxes, but might also provide an additional handle to investigate prompt neutrino fluxes (see above), which are the only CR reactions for which a significant production probability for tau neutrinos is expected.
A further interesting phenomenon of tau neutrinos is their regeneration after CC reaction in the Earth through the subsequent tau decay (relevant for energies above a few 10 TeV). The observation of this phenomenon would be interesting in itself, but might in addition signal new particle physics, e.g. in the context of supersymmetry.

• Dark matter
Even though the existence of dark matter is considered proven and its particle nature very likely, there is no direct or indirect evidence for the properties of these particles. Should they have masses in the TeV range or above, neutrinos from self-annihilation reactions could be the first dark matter signal ever detected. ANTARES and IceCube have already proven the ability of neutrino telescopes to significantly constrain dark matter properties, with searches targeting accumulations in the Sun [77,78], the Galactic Centre [79,80] and halo [81], and nearby galaxies [82]. The corresponding investigations with KM3NeT/ARCA data will, as with all indirect searches, be particularly sensitive to dark matter particles with spin-dependent scattering cross sections on nuclei. The study of neutrino fluxes from the Galactic Centre and halo, nearby galaxies and galaxy clusters could also provide constraints on scenarios that invoke the decay of very heavy (∼PeV) dark matter to explain the high-energy neutrino excess observed by IceCube [83][84][85][86].

• Exotics
There is a variety of hypothesised stable or quasi-stable particles that would leave an identifiable, characteristic signature when crossing the detector. Amongst these are magnetic monopoles (for which ANTARES and IceCube have already performed a search [87,88]), strangelets, Q-balls, and nuclearites.

• Violation of Lorentz invariance (LIV)
LIV could lead to oscillation-like interference patterns of atmospheric neutrinos in the energy range of TeV and above. Additionally, LIV would produce a time-delay between neutrinos and photons from distant, time-variable sources (in particular, GRBs), allowing LIV to be tested by multi-messenger studies.

Investigation of systematic effects
The simulation chain described in section 3.2 assumes standard values for the detector geometry, water optical properties, bioluminescent rates, and also a perfectly calibrated detector. In the context of KM3NeT/ARCA sensitivity studies, the term 'systematic effects' is used broadly to cover all potential deviations from the standard simulated dataset, and this section describes a series of dedicated studies aiming to estimate their potential influence on ARCA event reconstruction and sensitivity to astrophysical neutrino fluxes. Each systematic was simulated using a data-set of 10% that of the standard simulation, with the systematic being inserted at the latest possible point in the chain to ensure the least influence of random variation between the sets. For example, changing the water scattering length required re-simulating the hit-time distribution on the PMTs, while reducing the PMT effective area was performed by keeping 90% of the detected photons from the standard simulation. Most systematic effects were simulated as a 10% change, and thus did not reflect the expected size of the resulting effect, but rather were used to estimate the change dX/dSys in some relevant quantity X as a function of the systematic Sys.
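The per-percent change dX/dSys described above can be illustrated with a simple finite difference. The following is a minimal sketch, not the collaboration's analysis code: the function name is hypothetical, the flux normalisation is arbitrary, and the only number taken from the text is the example quoted in the table 8 note (a 10% reduction in absorption length raising the one-year 5σ discovery flux by 3.5%).

```python
def change_per_percent(x_standard, x_varied, sys_change_percent):
    """Finite-difference estimate of dX/dSys.

    Returns the relative change of X (in %) per 1% change of the systematic.
    """
    relative_change_percent = 100.0 * (x_varied - x_standard) / x_standard
    return relative_change_percent / sys_change_percent


# Example from the table 8 note: absorption length reduced by 10%,
# discovery flux increased by 3.5%. The normalisation 1.0 is arbitrary.
flux_standard = 1.0
flux_varied = 1.035
slope = change_per_percent(flux_standard, flux_varied, sys_change_percent=-10.0)
print(f"d(discovery flux)/d(absorption length) ~ {slope:.2f} % per %")
```

A linear response is assumed here; for a real uncertainty of a different size the slope would be rescaled accordingly, which is only valid while the dependence stays close to linear.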
In each case, the effects of the systematics were first analysed using the 'golden channel' approach, i.e. by applying the track reconstruction of section 3.2.4 to ν_μ CC events, and applying the cascade reconstruction algorithm 1 of section 3.2.4 to ν_e CC events, together with the cut-and-count method. Only when a significant effect was found was the systematic applied to the full analysis chains: the point-like track analysis of section 3.3.6, and the diffuse cascade analysis of section 3.3.1.
Similar effects have been considered for KM3NeT/ORCA, particularly in the case of reconstruction (section 4.4.6). However, the effects of systematics on the mass hierarchy sensitivity of KM3NeT/ORCA are treated via the inclusion of nuisance parameters in the calculation described in section 4.6, rather than using fully resimulated data.
3.4.1. Optics: water properties and DOM response. The absorption and scattering of light in seawater have been measured at the KM3NeT-It site to within an accuracy of approximately 10% [89]. The dominant uncertainty is the contribution due to particulates, whereas the scattering and absorption from pure seawater (salt and water) are well determined. To simulate this effect, only the particulate contribution has been varied, so that the scattering and absorption lengths (λ_scat and λ_abs respectively) vary by ±10% at wavelengths near 400 nm. It is expected that in situ measurements using the KM3NeT calibration system [90] will be able to significantly improve on this knowledge.
The major uncertainty in the response of a DOM to incident photons is the total effective area, A_eff, to Cherenkov photons. This is modelled in GEANT simulations with a high degree of accuracy, as described in [35], and has been measured in situ using ⁴⁰K coincidences with a precision of ∼1% [3]. In order to model a significant effect, simulations were produced with A_eff varied by ±10% for all photon wavelengths and incident angles.
The effects of these systematic uncertainties on reconstruction accuracy are summarised in table 7, which shows the change in reconstruction variables for each percent of systematic uncertainty. The most significant effect for the cascade channel is on the energy reconstruction due to a change in λ_abs: in the high-energy regime PMTs cease to be saturated only at large distances, so that the energy reconstruction depends on the response after several absorption lengths. In no case was the direction reconstruction affected, since the Cherenkov peak (which contains most of the directional information) remains unobscured by these effects.
In the case of muon reconstruction, changes in absorption and PMT efficiency have similar effects on energy reconstruction, which are smaller than in the case of cascade reconstruction due to the inherent uncertainties. Systematic effects on direction reconstruction depend on the muon (∼neutrino) energy. The difference in the track direction with respect to the standard value is constant above ≈1 TeV (values quoted in the table), and increases with decreasing energy below 1 TeV (≈0.1-0.2 at 100 GeV) as expected, since here the reconstruction is photon-limited. Unlike the case of cascade reconstruction, an increase in water quality, or a larger effective area of the PMTs, improves the directional reconstruction, by increasing the number of Cherenkov photons directly reaching the PMTs.

Table 7. Estimated effects of systematics on event reconstruction accuracy, evaluated on the event samples from E⁻² point-source searches for tracks (section 3.3.6) and the diffuse flux search (section 3.3.1).

Effect    Tracks    Cascades

Note. For each sample, the worsening in angular resolution Δθ, and percentage change in the mean reconstructed energy ΔE/E, are given for the changes listed in the first column. The magnitude of the effects does not reflect the final expected uncertainty.
The effects given in table 7 describe the best estimates of future systematic effects as a function of future uncertainties in the quantities shown. Another relevant measure is the range of future ARCA performance that is possible given the current uncertainties in these parameters. For this, the systematic effects above were propagated through the simulation chain, allowing reconstructions, cuts, etc, to be re-optimised, i.e. assuming the new value of the changed parameter is known. Effects were analysed in the context of the diffuse flux search using cascades, and the RX J1713 source search using the track channel. Results are given in table 8. Note that for the diffuse analyses the effects are small, since the detection efficiencies for signal and background are affected equally.

3.4.2. Detector calibration and alignment. The suite of calibration and alignment systems described in [90] has a finite accuracy, and deviations from the true DOM positions and orientations might reduce the precision of reconstruction.
Due to the mechanical structure of KM3NeT DUs, the major degrees of freedom for DOM motion are the position in the horizontal plane, and rotation about the vertical axis. The accuracy of acoustic positioning (position in the horizontal plane) is expected to be 20 cm (corresponding to a hit time uncertainty of about 1 ns in water), while the internal compass for each DOM will measure the rotation angle to within 3°.
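The correspondence between a 20 cm position uncertainty and a hit time uncertainty of about 1 ns follows from the speed of light in water. A minimal check, assuming a group refractive index of 1.38 for seawater (a typical value for blue light, not a number given in this document):

```python
C_VACUUM_M_PER_NS = 0.299792458   # speed of light in vacuum, metres per nanosecond
N_GROUP_SEAWATER = 1.38           # assumed group refractive index of seawater


def hit_time_uncertainty_ns(position_uncertainty_m, n_group=N_GROUP_SEAWATER):
    """Time light needs to travel the position uncertainty in water, in ns."""
    return position_uncertainty_m * n_group / C_VACUUM_M_PER_NS


dt_ns = hit_time_uncertainty_ns(0.20)
print(f"20 cm in water corresponds to about {dt_ns:.2f} ns")
```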
To simulate each effect, a false detector was generated with each DOM randomly deviated using Gaussian distributions of width equal to the expected accuracies above, and these were used by reconstruction routines on events generated with the standard simulation chain.
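The construction of such a 'false detector' can be sketched as follows. This is an illustration under stated assumptions, not the KM3NeT software: the DOM tuple layout, the function names, the height of the first DOM and the choice to apply the 20 cm width independently to x and y are all hypothetical; only the Gaussian widths (20 cm, 3°), the 18 DOMs per string and the 36 m ARCA vertical spacing come from the text.

```python
import random


def build_string(x_m, y_m, n_doms=18, dz_m=36.0, z0_m=0.0):
    """Nominal DOMs of one string as (x, y, z, yaw_deg) tuples (toy layout)."""
    return [(x_m, y_m, z0_m + k * dz_m, 0.0) for k in range(n_doms)]


def misalign(doms, sigma_xy_m=0.20, sigma_yaw_deg=3.0, seed=12345):
    """Return a 'false detector': each DOM shifted by Gaussian deviations.

    Horizontal position and rotation about the vertical axis are smeared;
    the vertical coordinate is left untouched.
    """
    rng = random.Random(seed)
    shifted = []
    for x, y, z, yaw in doms:
        shifted.append((
            x + rng.gauss(0.0, sigma_xy_m),
            y + rng.gauss(0.0, sigma_xy_m),
            z,
            yaw + rng.gauss(0.0, sigma_yaw_deg),
        ))
    return shifted


true_doms = build_string(0.0, 0.0)
false_doms = misalign(true_doms)
```

In a study of this kind the events would be generated with `true_doms` and reconstructed with `false_doms`, so that any loss of accuracy can be attributed to the misalignment alone.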
No detectable effects were observed in the accuracy of either the cascade or track reconstruction. This is partially due to the accuracy of the calibrations, partially to the uniform coverage of the DOMs (which makes errors in the pointing direction less relevant), and partially to the robust nature of the reconstruction algorithms themselves. In the case of the orientation angle, the uncertainty was artificially increased up to 9° (three times the expected uncertainty), at which point the degradation in the angular reconstruction accuracy of track-like events was still too small to be measured. Hence, no further investigation was undertaken.

Note. E.g., a reduction in λ_abs of 10% is expected to increase the one-year 5σ discovery flux of the diffuse cascade analysis by 3.5%.
3.4.3. Ageing effects. As KM3NeT ages, some loss of performance due to the degradation or loss of key parts is expected. The effects of PMT ageing are covered by the A eff estimate above. An additional simulation was performed to estimate the effect of both lost DOMs and entire DUs, with the standard simulation re-run once with a random 10% sample of DOMs turned off, and once with DUs randomly removed. The effects on both reconstruction and future sensitivity were estimated assuming that the failed units were known, which will be the case due to continual monitoring. The results are shown in tables 7 and 8. In general, the effects are most important for low-energy muon tracks.
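The masking used for the ageing study can be sketched as below. This is a hypothetical illustration: the identifiers, the seed and the number of removed DUs are assumptions (the text does not state how many DUs were removed); the block size of 115 strings with 18 DOMs each and the 10% DOM loss are taken from the document.

```python
import random

N_STRINGS = 115
DOMS_PER_STRING = 18
ALL_DOMS = [(s, f) for s in range(N_STRINGS) for f in range(DOMS_PER_STRING)]


def switch_off_random_doms(doms, fraction=0.10, seed=1):
    """Return the DOMs that stay alive after a random fraction is turned off."""
    rng = random.Random(seed)
    n_off = round(fraction * len(doms))
    dead = set(rng.sample(doms, n_off))
    return [d for d in doms if d not in dead]


def remove_random_strings(doms, n_removed, seed=1):
    """Return the DOMs that stay alive after whole strings (DUs) are removed."""
    rng = random.Random(seed)
    strings = sorted({s for s, _ in doms})
    dead = set(rng.sample(strings, n_removed))
    return [d for d in doms if d[0] not in dead]


alive_doms = switch_off_random_doms(ALL_DOMS)
# n_removed=12 is an illustrative choice, roughly 10% of the strings.
alive_after_du_loss = remove_random_strings(ALL_DOMS, n_removed=12)
```

Because the list of dead units is returned explicitly by construction, the reconstruction can be told which units failed, matching the assumption in the text that failures are known from continual monitoring.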

Detector geometry studies
The chosen geometry of KM3NeT ARCA building blocks, with approximately 90 m horizontal spacing between DUs, and 36 m vertical spacing between DOMs, was optimised in preliminary studies to target Galactic sources such as RX J1713. While some limits on the final layout are imposed through engineering considerations (in particular, the maximum length of DUs), the horizontal spacing between DUs can be increased or decreased within a relatively broad range. The discovery by IceCube of a diffuse flux extending above 100 TeV [12] now motivates revisiting the question of the optimal horizontal spacing. In particular, a larger spacing would be expected to be preferable when targeting high-energy events.
In order to characterise the effects of a larger spacing, the analyses described above have also been performed by considering a detector block with 120 m spacing between the DUs, giving an approximate 78% increase in detector volume. The results are tentative since the analyses have not been fully re-optimised to the alternative geometry. Nonetheless, the change in performance gives an indication of the utility of increasing the horizontal spacing.
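The quoted 78% volume increase can be checked with a one-line scaling argument, assuming the number of DUs and their height stay fixed so that the instrumented volume scales with the footprint, i.e. with the square of the horizontal spacing (this scaling is an assumption of the sketch, not a statement from the text):

```python
def volume_increase(spacing_old_m, spacing_new_m):
    """Fractional volume change when only the horizontal spacing is rescaled."""
    return (spacing_new_m / spacing_old_m) ** 2 - 1.0


increase = volume_increase(90.0, 120.0)
print(f"Volume increase: {100.0 * increase:.0f}%")
```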
These tentative results are summarised in table 9. They have been obtained with 10% of the data set, using the fast cut-and-count method. An improvement of about 20%-30% in the discovery flux is observed in the search for a diffuse flux for both channels for the detector layout with increased string spacing. Note that in the cascade channel the sensitivity gain is significantly below the increase of the instrumented volume; one of the reasons is a decrease of the signal detection and reconstruction efficiency with increasing string distances. For Galactic sources, which have a lower-energy neutrino spectrum, the change is of the same order but, as expected, in the opposite direction.
It is therefore expected that the final optimum, taking into account all channels and their physics priorities, is in or close to the range explored in this first geometry investigation. Clearly, the choice of optimum configuration depends on the targeted science goals: a larger spacing is better for high-energy diffuse fluxes, and smaller spacings for point-like sources with a low-energy cutoff. Given that ARCA is in a unique position to study Galactic point-like sources of neutrinos, and that the detection of such sources is the most challenging of the sensitivity studies presented here (see figure 42), the 90 m horizontal spacing has been retained for the two ARCA building blocks in KM3NeT Phase-2.0.

Table 9. Sensitivity changes in the different analysis channels when increasing the string distance from 90 to 120 m.

Analysis channel                      Sensitivity change: 90 m → 120 m
Cascades: diffuse (equation (3))      +27%
Muons: diffuse (equation (3))         [...]

Introduction
Important progress has been made in the past two decades on determining the fundamental properties of neutrinos. A variety of experiments using solar, atmospheric, reactor and accelerator neutrinos, spanning energies from a fraction of MeV to tens of GeV, have provided compelling evidence for neutrino oscillations, implying the existence of non-zero neutrino masses (see e.g. [91] and the review by Nakamura and Petcov in [92] for recent insights on the subject).
In the standard 3ν scheme, the mixing of the neutrino flavour eigenstates (ν_e, ν_μ, ν_τ) into the mass eigenstates (ν_1, ν_2, ν_3) is described by the Pontecorvo-Maki-Nakagawa-Sakata (PMNS) matrix U, which is a product of three rotation matrices related to the mixing angles θ_12, θ_13 and θ_23 and to the complex CP phase δ.⁵⁷ [...] Δm²_ij = m_i² − m_j² (i, j = 1, 2, 3). In the 3ν scheme, there are two independent squared-mass differences; one is responsible for [...] for IH.
At present, the values of all mixing angles and squared-mass differences in the 3ν oscillation scheme can be extracted from global fits of available data with a precision better than 15%, the largest remaining uncertainty being currently on sin²θ_23 and its possible octant (i.e. whether θ_23 is smaller or larger than π/4) [93][94][95]. Table 10 [...], as can be seen from figure 46.
⁵⁷ We [...]
From a theoretical point of view, the determination of the NMH is of fundamental importance to constrain the models that seek to explain the origin of mass in the leptonic sector and the differences in the mass spectrum of charged quarks and leptons [99]. More practically, it has also become a primary experimental goal because the NMH can have a strong impact on the potential performance of next-generation experiments with respect to the determination of other unknown parameters such as the CP phase δ (related to the presence of CP-violating processes in the leptonic sector), the absolute value of the neutrino masses, and their Dirac or Majorana nature (as probed in neutrinoless double beta decay experiments, or 0νββ). From the astrophysical point of view, the NMH impacts, e.g., neutrino flavour conversion in supernovae [100,101]. Finally, the NMH also affects the precise determination of the PMNS matrix parameters, as can be seen from table 10, which summarises the current best fit values and their 3σ uncertainties under both hierarchy hypotheses.
⁵⁸ Updates of global fits presented at conferences, yet unpublished, achieve a precision better than 12% on sin²θ_23.
While the combination of 0νββ and direct neutrino mass experiments with cosmological constraints on Σ m_ν might have an indirect sensitivity to the NMH, most of the efforts currently focus on the determination of the NMH via neutrino oscillation experiments (see e.g. section 3.1 of [91] for an overview of the subject). One option uses medium-baseline (∼50 km) reactor experiments such as JUNO and RENO-50, which probe the ν̄_e oscillation probability at low energies (∼MeV) where matter effects are negligible [102]. These experiments may be sensitive to the NMH through the interference effects arising from the combination of the fast oscillations driven by Δm²_31 and Δm²_32. Such a measurement however requires an extreme accuracy both in the energy resolution and in the absolute energy scale calibration.
In the 3ν framework, the ν_μ ↔ ν_e [...] where E_ν is the neutrino energy and L stands for the oscillation baseline. These relations establish the direct link between the transition probabilities and the value of θ_13; they also show that the transitions in vacuum are actually insensitive to the sign of Δm²_31. This sign can however be revealed once matter effects come into play along the neutrino propagation path [144,145] [...] where A is positive for neutrinos and negative for antineutrinos. Both the amplitude and the phase of the oscillations can therefore be affected by matter effects. From equation (15), the resonance condition is met when the effective mixing is maximal, i.e. Δ^m m² is minimal. This happens for the case of the NH (IH) in the neutrino (antineutrino) channel at the energy [...] where ρ is the matter density of the medium. For neutrinos passing through the Earth's mantle (core) the resonance will appear around 7 GeV (3 GeV), which explains why atmospheric neutrinos are an appropriate probe for these effects, in association with the large baselines available.
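The quoted resonance energies can be cross-checked numerically with the textbook resonance condition E_res = Δm²_31 cos 2θ_13 / (2√2 G_F N_e), where N_e = ρ Y_e N_A is the electron number density. This is a sketch under stated assumptions and may not coincide term by term with the equations of this section: the oscillation parameters (Δm²_31 = 2.4 × 10⁻³ eV², sin²2θ_13 = 0.085), the representative densities and the electron fractions Y_e are illustrative values chosen here, not numbers taken from the text.

```python
import math

# Physical constants (standard reference values).
G_FERMI_GEV2 = 1.1663787e-5       # Fermi coupling constant, GeV^-2
HBAR_C_GEV_CM = 1.973269804e-14   # hbar * c, GeV * cm
AVOGADRO = 6.02214076e23          # nucleons per gram of matter (approximately)


def matter_potential_ev(density_g_cm3, electron_fraction):
    """sqrt(2) * G_F * N_e in eV for a given density and electron fraction."""
    n_e_per_cm3 = density_g_cm3 * electron_fraction * AVOGADRO
    potential_gev = math.sqrt(2.0) * G_FERMI_GEV2 * HBAR_C_GEV_CM**3 * n_e_per_cm3
    return potential_gev * 1e9


def resonance_energy_gev(dm31_sq_ev2, sin2_2theta13, density_g_cm3, electron_fraction):
    """Resonance energy Delta m^2 cos(2 theta13) / (2 sqrt(2) G_F N_e) in GeV."""
    cos_2theta13 = math.sqrt(1.0 - sin2_2theta13)
    potential_ev = matter_potential_ev(density_g_cm3, electron_fraction)
    return dm31_sq_ev2 * cos_2theta13 / (2.0 * potential_ev) / 1e9


# Assumed illustrative inputs (not from the text).
DM31_SQ_EV2 = 2.4e-3
SIN2_2THETA13 = 0.085

e_res_mantle = resonance_energy_gev(DM31_SQ_EV2, SIN2_2THETA13, 4.5, 0.50)
e_res_core = resonance_energy_gev(DM31_SQ_EV2, SIN2_2THETA13, 11.5, 0.47)
print(f"mantle: {e_res_mantle:.1f} GeV, core: {e_res_core:.1f} GeV")
```

With these inputs the mantle and core values come out close to the 7 GeV and 3 GeV quoted above; the exact numbers shift with the assumed density and oscillation parameters.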
As an illustration, oscillation curves P(ν_μ → ν_x) (x = e, μ) obtained with the ORCA software tools (using the PREM model [147] of the Earth density layers) are shown in figure 47 for various zenith angles θ (i.e. various baselines) as a function of the neutrino energy, both for neutrinos and antineutrinos. In each case, both NMH hypotheses are represented. The strongest impact of the NMH on the oscillation probabilities is in the resonance region E_ν ∼ (4-8) GeV. In the region |cos θ| ≳ 0.85 and E_ν < 7 GeV, the effect of the resonant enhancement of the oscillations [112, 125-127, 133, 148-156] for the neutrino trajectories crossing the Earth's core can also be seen. Above ∼15 GeV, the ν_μ → ν_e transition probability becomes very small and differences from distinct NMHs tend to disappear as well.⁵⁹ Figure 47 shows that, to first order, the effect for neutrinos in the NH scheme is the same as for antineutrinos in the IH scheme. Nevertheless, and even in the case of non-magnetised detectors (such as ORCA) which do not distinguish ν's and ν̄'s event-by-event, a net asymmetry in the combined (ν + ν̄) event rates between NH and IH for a given flavour can be observed. This mainly comes from the fact that in the GeV energy range relevant for atmospheric neutrinos, the CC cross section is different (by about a factor of 2) for neutrinos and antineutrinos, as can be seen from figure 48.

Figure 48. Total neutrino (left) and antineutrino (right) CC cross sections per nucleon (for an isoscalar target) divided by neutrino energy and plotted as a function of the energy. Also shown are the various contributing processes: quasi-elastic scattering (dashed), resonance production (dot-dashed), and deep inelastic scattering (dotted). Reprinted with permission from [159]. Copyright (2012) by the American Physical Society.
The relative contribution of ν_e and ν_μ in the steeply falling atmospheric neutrino spectrum, as shown in figure 49, also affects the number of events of each flavour that can be expected at the detector level.
Convolving the oscillation probabilities with the atmospheric neutrino fluxes and the neutrino-nucleon cross section, one can construct two-dimensional plots of event rates as a function of the neutrino energy E_ν and cosine of the zenith angle θ. Such an 'oscillogram' is represented in figure 50 for ν_μ + ν̄_μ, for both NMH hypotheses. Integrating over energies above 4 GeV, one typically expects of the order of 4650 ν_μ-induced events and 2850 ν_e-induced events per year in a 1 Mton detector. The phase space region where the differences between NH and IH are most visible clearly depends on the three ingredients mentioned above; but other factors also come into play, related both to intrinsic effects (such as the physics of the neutrino interaction) and to the detector performance (such as energy and angular resolutions), that will blur the oscillogram patterns and partly wash out the asymmetry effect.
An intrinsic uncertainty in the neutrino energy and direction arises from the kinematics of the neutrino interaction. At the relevant energies, the outgoing lepton can no longer be considered as collinear with its parent neutrino, as can be seen from figure 51.⁶⁰ This smearing can conveniently be expressed in terms of the Bjorken inelasticity parameter y = (E_ν − E_l)/E_ν, where l stands for a charged lepton, which represents the fraction of energy transferred to the associated hadronic shower. Since the cross sections for neutrinos and antineutrinos behave differently as a function of y, measuring the inelasticity of the neutrino interaction could provide some statistical separation between the ν and ν̄ channels and therefore enhance the sensitivity to the NMH [135]. This effect could be best exploited in the muon channel, where the lepton track and the hadronic shower can in principle be more easily identified than in the other channels; the difference in the muon angular spread for 10 GeV ν_μ and ν̄_μ is illustrated in figure 52. Preliminary studies performed for ORCA using flavour identification tools are presented in section 4.5 and could be the starting point for a statistical separation between ν_μ's and ν̄_μ's, providing additional enhancement of the sensitivity to NMH in the track and in the shower channel. The kinematic smearing described here is only one among other sources of systematics directly related to the physical processes at play; fluctuations in the development of the particle cascades, and in the production and propagation of the associated Cherenkov light, must also be taken into account.
⁵⁹ This justifies the approximation of a two-flavour ν_μ → ν_τ oscillation scheme adopted by high-energy atmospheric neutrino experiments so far [157,158].
⁶⁰ Note that the angle and energy resolutions can be improved by combining information from the leptonic and hadronic parts of the interaction to better reconstruct the kinematics.
These effects are discussed in more detail in [160].
Uncertainties in the neutrino oscillation parameters can also degrade the sensitivity to the NMH. These uncertainties are taken into account when evaluating the ORCA sensitivity to the NMH (see section 4.6.1). Other sources of systematics such as the uncertainties on the atmospheric spectra, the uncertainties of the Earth matter density profile, or the unknown δ_CP phase are further discussed in section 4.6.8. Detector-related effects, and in particular the energy and angular resolution, are presented along with the description of the event selection and reconstruction performances in section 4.3.
In order to identify, for each flavour, the phase space region where the effects are larger and therefore the discrimination more powerful, asymmetry variables can be defined such as [...] where N_NH and N_IH are the number of expected events at a given angle and energy for NH and IH respectively. A′ essentially reflects the asymmetry of oscillation probabilities and does not depend on the effective mass of the detector, while A is useful to provide an estimation of the significance of the hierarchy measurement by summing over all oscillogram entries, as proposed by [132]. This approach should however be taken with care as it typically overestimates the discrimination power of the experiment. Alternative approaches discussed in [140,161-163], and providing a more rigorous statistical treatment, are followed in section 4.6. An example of asymmetry plots (following the definition of equation (19)) for ν_μ and ν_e obtained with ORCA software tools and a smearing on energy and angle is shown in figure 53. It is clear that the region where the asymmetry is more evident is above 5 GeV. The plots also indicate that comparable levels of asymmetry are reached in both ν_μ and ν_e CC interaction channels.

Figure 53. Asymmetry (as defined by equation (19)) between the number of ν + ν̄ CC interactions expected in case of NH and IH, expressed as a function of the energy and the cosine of the zenith angle. The right (left) plot applies to muon (electron) neutrinos. A smearing of 25% is applied on the energy. On the angle, a smearing σ = [...]

Most first-stage studies have concentrated on the ν_μ channel (and on the detection of the associated muon) to determine the sensitivity to NMH, anticipating the larger statistics in the muon channel and the worse angular resolution of deep-sea (or ice) Cherenkov detectors for shower-like events (as produced by ν_e) [111,131,132,134,136].
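The different behaviour of the two asymmetry variables can be illustrated numerically. The sketch below assumes the forms A′ = (N_IH − N_NH)/N_NH and A = (N_IH − N_NH)/√N_NH; the sign convention, the choice of N_NH in the denominators and the quadrature sum over bins are assumptions made here for illustration, and the bin contents are invented toy numbers.

```python
import math


def asymmetry_relative(n_nh, n_ih):
    """A' = (N_IH - N_NH) / N_NH: independent of exposure (assumed form)."""
    return (n_ih - n_nh) / n_nh


def asymmetry_significance(n_nh, n_ih):
    """A = (N_IH - N_NH) / sqrt(N_NH): grows with exposure (assumed form)."""
    return (n_ih - n_nh) / math.sqrt(n_nh)


def total_significance(bins):
    """Quadrature sum of A over all oscillogram bins, given as (N_NH, N_IH)."""
    return math.sqrt(sum(asymmetry_significance(nh, ih) ** 2 for nh, ih in bins))


# Toy oscillogram entries (hypothetical event counts per bin).
toy_bins = [(100.0, 110.0), (400.0, 380.0), (250.0, 250.0)]
doubled_bins = [(2.0 * nh, 2.0 * ih) for nh, ih in toy_bins]

s_tot = total_significance(toy_bins)
s_tot_doubled = total_significance(doubled_bins)
```

Doubling the exposure leaves A′ unchanged in every bin while the summed A grows by √2, which is the sense in which A′ does not depend on the effective mass of the detector. As noted in the text, such a bin-wise sum tends to overestimate the real discrimination power.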
In the course of the study it has however been pointed out that this approach may have been too conservative and that the shower channel, and in particular the ν_e-induced events, could also provide a significant contribution to the total sensitivity to NMH.⁶¹ To first order, the atmospheric flux of ν_e of energy E_ν which reach the detector after crossing the Earth along a given trajectory, Φ_νe(E_ν, θ), is given by [126,127]: [...] As can be seen from figure 49, the ratio r is close to 2 around 2 GeV and below, which tends to suppress the oscillations. This is referred to as the 'screening effect' in [132]. However, in the energy range of a few GeV the ratio increases, which could on the contrary enhance the asymmetry as stated in [133]. The final asymmetry level in the ν_e channel will also depend on the value of the mixing angle θ_23; it could in particular be further enhanced if θ_23 is found to be in the second octant (i.e. θ_23 > 45°). The status of electron neutrino studies within ORCA is summarised in section 4.4.
⁶¹ Experiments like Super-Kamiokande and the proposed Hyper-Kamiokande have indeed mainly focused on the electron neutrino channel, because of the good resolutions they can achieve for this topology in the few GeV energy range [110,164].

The ORCA simulated detector characteristics rely on reasonable assumptions based on the expertise acquired in the KM3NeT collaboration. To reduce the energy threshold, both the vertical and the horizontal spacing must be reduced with respect to the high-energy KM3NeT design (KM3NeT/ARCA). Vertically this can essentially be done at will whereas horizontally there are limitations due to the deformation of the lines by the sea currents and the unfurling procedure of the strings. For a line with 6 m vertical spacing and 18 modules the maximum deviation at the top of the line is about 10 m (corresponding to a sea current of about 30 cm s⁻¹).
In addition, the accuracy with which a string can be placed on the sea bottom is, from ANTARES experience, a few metres. A 20 m distance is therefore assumed to be feasible.
The collaboration therefore decided to start the simulation study with a detector consisting of 2070 optical modules distributed on 115 DUs placed at a distance of about 20 m from each other (accounting for the positioning uncertainty at deployment), in a circular pattern of radius 106 m (figure 54). The detector is located at the KM3NeT-France site (2450 m depth). The DUs host 18 DOMs with 6 m vertical spacing. In this geometry, the first floor is 50 m distant from the seabed and the detector has a total instrumented volume of about 3.6 × 10⁶ m³ (equivalent to ∼3.7 Mt for sea water). Larger vertical inter-DOM spacings have been investigated as well using a masking technique described in section 4.2.4. The results obtained in terms of detector performances for the NMH discrimination indicate an optimum inter-DOM distance of about 9 m (see section 4.6.1).
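The numbers quoted for this benchmark geometry are mutually consistent, as the following check shows. It treats the instrumented volume as a cylinder whose height is the span between the lowest and highest DOM, and assumes a seawater density of 1.025 t/m³; both are simplifying assumptions of this sketch rather than statements from the text.

```python
import math

N_DUS = 115
DOMS_PER_DU = 18
VERTICAL_SPACING_M = 6.0
FOOTPRINT_RADIUS_M = 106.0
SEAWATER_DENSITY_T_PER_M3 = 1.025   # assumed typical value

n_doms = N_DUS * DOMS_PER_DU
instrumented_height_m = (DOMS_PER_DU - 1) * VERTICAL_SPACING_M
volume_m3 = math.pi * FOOTPRINT_RADIUS_M**2 * instrumented_height_m
mass_mt = volume_m3 * SEAWATER_DENSITY_T_PER_M3 / 1e6

print(f"{n_doms} DOMs, height {instrumented_height_m:.0f} m, "
      f"volume {volume_m3:.2e} m^3, mass {mass_mt:.1f} Mt")
```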

Event generation and characterisation.
This section describes the software packages used for the generation of MC events. Additionally, a selection of event observable distributions is used to characterise their typical fundamental and detector physics phenomenology. The employed software packages generate atmospheric muons and atmospheric neutrinos. Several codes have been developed for the KM3NeT project and older codes, that were developed by the ANTARES collaboration, have been modified to take into account the KM3NeT DOM characteristics. The codes simulate the particle interactions with the medium surrounding the detector, light generation and propagation as well as the detector response. In the simulation chain a volume surrounding the instrumented volume, called 'can', is defined. The can volume is a cylinder with height and radius exceeding the instrumented volume by about 3 absorption lengths for the atmospheric muon background simulation and by 40 m for the neutrino generation. Generated particles are propagated inside the can and Cherenkov light is generated.
Neutrino and antineutrino induced interactions in sea water in the energy range from 1 to 100 GeV have been generated with a software package based on the widely used GENIE [165][166][167] neutrino event generator. Electron and muon neutrino events are weighted to reproduce the conventional atmospheric neutrino flux following the Bartol model [168].
All particles emerging from a neutrino interaction vertex are propagated with the GEANT4 based software package KM3SIM [169] that has been developed by the KM3NeT collaboration. It generates Cherenkov light from primary and secondary particles in showers and simulates hits taking into account the light absorption and scattering in water as well as the DOM and PMT characteristics.
The background due to down-going atmospheric muons is generated with the MUPAGE [29,170] program. MUPAGE provides a parameterised description of the underwater flux of atmospheric muons including also multi-muon events. The parameterised muon flux was obtained starting from full simulations with HEMAS [171] and CR data. These muons are tracked inside the can with the code KM3 which generates and propagates the light produced by the muons and their secondary particles, taking into account the optical properties of the water. For the photon propagation, the code uses tables containing parameterisations obtained from a full GEANT3 simulation. The code simulates the PMT hit probabilities and the response of the PMTs. The PMT photocathode area, QE and angular acceptance, as well as the transmission of light in the optical module glass sphere and in the optical gel are taken into account.
In order to reproduce the randomly distributed background PMT hits due to the Cherenkov light from β-decays of $^{40}$K, single photoelectron hits can be added to the hits induced by charged particles inside a chosen time window. The hits in coincidence due to $^{40}$K between two PMTs inside the same DOM are also taken into account.
First measurements of the optical background rate indicate a single PMT noise rate of 8 kHz and twofold coincidence noise of about 340 Hz, for details see [5]. For the simulation results described below a conservative optical background light estimation has been used. An uncorrelated hit rate of 10 kHz per PMT and time-correlated noise on each DOM (500 Hz twofold, 50 Hz threefold, 5 Hz fourfold and 0.5 Hz fivefold) was added. The simulated time-correlated noise rates due to $^{40}$K decays have been verified with a complete simulation based on GEANT4.
Figure 56 compares the number of hit PMTs (DOMs) due to the Cherenkov light emission from a muon, an electromagnetic and a hadronic shower as a function of their respective energy. An electromagnetic shower will cause roughly 12 hits per GeV while a hadronic shower is, as can be expected due to the Cherenkov thresholds of the comparably massive hadrons involved, much dimmer with seven hits per GeV, i.e. an electromagnetic shower of about 5 GeV energy is almost as bright as a hadronic shower with 10 GeV. The DOM hit multiplicity scales in a similar manner, but somewhat more favourably for hadronic showers due to the on average greater opening angle as compared to electron-positron pair cascades.
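To give a feeling for the optical background rates adopted in the simulation (10 kHz per PMT plus the time-correlated rates per DOM), the sketch below estimates the expected number of noise hits in the full detector for a given time window. The window length, the function name and the convention that an m-fold coincidence contributes m hits are illustrative assumptions, not part of the simulation chain described here.

```python
# Rates quoted in the text for the simulated optical background.
N_DOMS = 2070
PMTS_PER_DOM = 31
SINGLE_RATE_HZ = 10e3
# multiplicity -> time-correlated rate per DOM in Hz
COINCIDENCE_RATES_HZ = {2: 500.0, 3: 50.0, 4: 5.0, 5: 0.5}


def expected_noise_hits(window_s):
    """Return (uncorrelated, correlated) expected hit counts in the window.

    Assumption: an m-fold coincidence is counted as m individual hits.
    """
    uncorrelated = N_DOMS * PMTS_PER_DOM * SINGLE_RATE_HZ * window_s
    hits_per_dom_per_s = sum(m * r for m, r in COINCIDENCE_RATES_HZ.items())
    correlated = N_DOMS * hits_per_dom_per_s * window_s
    return uncorrelated, correlated


# Hypothetical 1 microsecond window, chosen only for illustration.
uncorrelated_hits, correlated_hits = expected_noise_hits(1e-6)
print(f"uncorrelated: {uncorrelated_hits:.1f}, correlated: {correlated_hits:.2f}")
```

In this picture the uncorrelated single hits dominate the raw noise hit count by more than two orders of magnitude, which is why coincidence requirements are effective at suppressing them.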
The inelasticity parameter y of a neutrino interaction on the nucleon critically determines the reaction kinematics as can be seen in figures 57 and 52. At energies below 10 GeV the different strengths of the different interaction channels, quasi-elastic, resonant and deep inelastic, are visible in the y-distributions and result in a higher average inelasticity for neutrinos than for antineutrinos. The scattering angle between the incoming neutrino and the outgoing muon depends strongly on the reaction inelasticity and increases with it. The lower average inelasticity for antineutrinos therefore also leads to lower scattering angles on average. This indicates the discrimination potential of this parameter and the importance of accessing it through event reconstruction.

Muons from hadronic showers.
Employing detailed GEANT3 based simulations, the muon production within the hadronic shower has been studied. A significant contribution of muons with path lengths in excess of the hadronic shower extension, i.e. with energies of at least one or several GeV, could complicate and probably deteriorate the particle flavour identification capabilities (see section 4.5). However, as is shown in [172] and summarised in the following, GeV muons from hadronic showers affect only about 1% of the events.
In figure 58 the Cherenkov photon emission positions along and perpendicular to the hadronic shower direction are shown for simulated shower energies of $E_{\mathrm{had}} \approx 5$ GeV (left) and $E_{\mathrm{had}} \approx 20$ GeV (right). Each Cherenkov photon is weighted with its wavelength dependent detection probability taking into account the PMT QEs and the absorption and scattering in sea water. In total 3400 (4000) $\nu_e$ CC events with $8\,\mathrm{GeV} < E_\nu < 12\,\mathrm{GeV}$ ($30\,\mathrm{GeV} < E_\nu < 50\,\mathrm{GeV}$) are used to extract and superimpose their hadronic showers. For both cases only a few muon tracks can be seen to emit light significantly beyond the hadronic shower extension.
Most muons in the hadronic shower come from pion decays. However, pions with energies in the GeV range will likely interact before they decay, as the hadronic interaction length for pions in water is approximately 1 m. In order to study the muon production from charged pions in greater detail, $10^4$ charged pions with energies of $E_\pi = 2, 5, 10$ GeV have been simulated in sea water. The mean number of muons $\langle N_\mu \rangle$ and the fraction of simulated events with at least one muon, $N_\mu \geq 1$, are summarised in table 11. The energy spectrum and cumulative energy distribution of the most energetic muon are shown in figure 59. For all three pion energies the fraction of events producing a muon with more than 1 GeV (2 GeV) is below 2% (1%).
In the 9 and 15 m configurations, neighbouring DUs use different masking schemes in order to make the masked detector as homogeneous as possible. In this way the instrumented volume stays the same for all detector configurations, but the DOM density changes. In order to compare the effective volume of the different detector configurations assuming the same number of DOMs for each vertical spacing, the effective volumes of the masked detectors are scaled accordingly (factor of 1/1.5/2/2.5). It should also be noted that the surface to volume ratio for the masked detectors is larger than it would be for a full detector with 18 DOMs per DU. Therefore, the presented results overestimate possible surface-related effects.

4.2.5. Triggering.
As described in section 2.6.1, muon and shower events are extracted from the real-time data stream using causality conditions. In the case of ORCA, with a simulated 10 kHz uncorrelated single noise rate per PMT and about 500 Hz time-correlated noise from $^{40}$K decays on each DOM, the estimated L1 rate (coincidences on the same DOM in a short time window) per optical module is about 1.5 kHz.
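The order of magnitude of the quoted L1 rate can be checked with a back-of-the-envelope estimate. The sketch below assumes independent Poisson noise on each PMT and interprets the 10 ns L1 window as the maximum time difference between two hits, so that each PMT pair contributes an accidental rate of 2·ΔT·r²; both the formula choice and the function name are assumptions made here, not the method used in the simulation.

```python
from math import comb

PMTS_PER_DOM = 31
SINGLE_RATE_HZ = 10e3        # uncorrelated rate per PMT (from the text)
L1_WINDOW_S = 10e-9          # L1 coincidence window (from the text)
CORRELATED_RATE_HZ = 500.0   # time-correlated 40K rate per DOM (from the text)


def accidental_twofold_rate_hz(n_pmts, single_rate_hz, window_s):
    """Accidental twofold coincidence rate summed over all PMT pairs.

    Assumption: independent Poisson hits; either hit of a pair may come
    first, hence the factor 2 per pair.
    """
    return comb(n_pmts, 2) * 2.0 * window_s * single_rate_hz ** 2


random_rate_hz = accidental_twofold_rate_hz(
    PMTS_PER_DOM, SINGLE_RATE_HZ, L1_WINDOW_S
)
l1_rate_hz = random_rate_hz + CORRELATED_RATE_HZ
print(f"accidental: {random_rate_hz:.0f} Hz, total L1: {l1_rate_hz:.0f} Hz")
```

With these inputs the estimate comes out at roughly 1.4 kHz per optical module, the same order as the value of about 1.5 kHz quoted above.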
The trigger algorithms described in section 2.6.1 were optimised for ORCA by considering the effective volume and the event purity. The effective volume is the volume in which a neutrino interaction would trigger the event to be written to disk and the event purity is the fraction of triggered events that contain a neutrino interaction or at least one atmospheric muon. The trigger rate from neutrino interactions is $\mathcal{O}(\mathrm{mHz})$ and is negligible compared to the rate from atmospheric muons ($\mathcal{O}(40\,\mathrm{Hz})$). The trigger settings correspond to an L1 time window of $\Delta T = 10$ ns, a maximum angle between the PMT axes of 90° (L2), and a minimum number of L1 hits of three for the shower trigger and four for the muon trigger (see footnote 62). Both triggers run in parallel and one of them or both must fire to flag an event (logical OR). For the different considered vertical spacings the distance parameters (R and D) of the muon and shower triggers have been adjusted such that each of the triggers has a rate of ∼10 Hz from pure noise. Note that this adds up to a rate of ∼20 Hz from pure noise for both triggers. The rate of atmospheric muon events is evaluated at a depth of 2450 m using the simulations described in section 4.2 and amounts to about 36 Hz (6 m) to 55 Hz (15 m) depending on the vertical spacing of the ORCA detector.
62 Muon and shower triggers with larger minimum numbers of L1 hits in conjunction with larger distance parameters R and D have also been studied. However, these triggers show smaller effective volumes than those used in this document.
In order to estimate the trigger rates, dedicated simulations for each vertical spacing have been performed, i.e. the detector masking described in section 4.2.4 has not been applied. Trigger rates from pure noise and atmospheric muons are summarised in table 12 for the various vertical spacings. The trigger event purity is 65%-73%.
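The quoted purity range can be roughly reproduced from the rates given in the text. The sketch below uses the approximate atmospheric muon rates (36 Hz and 55 Hz) and the ∼20 Hz of pure-noise triggers; the exact values of table 12 are not reproduced here, so the inputs and the function name should be read as assumptions.

```python
def trigger_purity(muon_rate_hz, noise_rate_hz, neutrino_rate_hz=0.0):
    """Fraction of triggered events containing a muon or a neutrino."""
    physics_rate_hz = muon_rate_hz + neutrino_rate_hz
    return physics_rate_hz / (physics_rate_hz + noise_rate_hz)


NOISE_RATE_HZ = 20.0  # ~10 Hz per trigger, two triggers (from the text)

purity_6m = trigger_purity(36.0, NOISE_RATE_HZ)   # 6 m vertical spacing
purity_15m = trigger_purity(55.0, NOISE_RATE_HZ)  # 15 m vertical spacing
print(f"purity: {purity_6m:.0%} (6 m) to {purity_15m:.0%} (15 m)")
```

The neutrino contribution at the mHz level does not change the result at this precision, which is why it defaults to zero in the sketch.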
It should be noted that during periods of high bioluminescence [9], the trigger conditions (minimum number of L1 hits and distance parameters R and D) can be tightened in order to reduce the output data rate and match the available data transfer bandwidth.
The effective volume at trigger level for 6 m vertical spacing is shown for different neutrino flavours in figure 60 (left) as a function of neutrino energy. Events are weighted to reproduce the conventional atmospheric neutrino flux following the Bartol model [168] and only up-going neutrinos are considered. The effective volume is smaller for $\nu$ NC and $\nu_\tau$ CC than for $\nu_{e,\mu}$ CC events as the outgoing neutrinos are invisible to the detector. For $\bar{\nu}_{e,\mu}$ CC events the effective volume is larger than for $\nu_{e,\mu}$ CC due to the lower average inelasticity and the resulting higher average light yield (at the considered energies hadronic showers have a smaller average light yield than electromagnetic showers). The effective volume also depends on the neutrino direction, as figure 60 (right) shows for $\nu_e$ CC events. Other neutrino flavours exhibit a similar zenith angle dependency. For vertical up-going events ($\cos\theta_\nu \approx 1$) the effective volume rises more steeply with energy than for horizontal events ($\cos\theta_\nu \approx 0$) as more PMTs are oriented downward than upward in a DOM and the density of DOMs is higher in the vertical than in the horizontal direction.
The effective volumes at trigger level for and events for different vertical spacings are shown in figure 61 as a function of neutrino energy. For 9 m, 12 m and 15 m vertical spacing the simulation of the benchmark detector with a 6 m spacing is masked and the resulting effective volumes are scaled to the same number of DOMs per DU as described in section 4.2.4. Further details on the triggering studies can be found in [172].

Muon neutrino studies
This section presents the strategy adopted to reconstruct muon neutrino CC events with ORCA, and its current performance. All results shown in this section are based on the MC simulations presented in the previous sections.

Muon direction reconstruction.
The track reconstruction algorithm presented here makes it possible to estimate muon (and consequently neutrino) directions using the combined information of the PMT spatial positions and the Cherenkov photon arrival times. The reconstruction code used is based on the strategy developed for the ANTARES telescope and described in reference [173]. This algorithm has been modified to exploit the multi-PMT design, taking into account the directional sensitivity of the KM3NeT optical module.
After an initial hit selection, requiring space-time coincidences between hits, the reconstruction proceeds through four consecutive fitting procedures, each using the result of the previous one as starting point. Each fitting stage improves the result, but the final fit, which provides the most accurate result, works well only if its input track parameters are not too far from the true track parameters. Moreover, the efficiency of the algorithm is improved by scanning the entire sky in steps of 3° starting from the prefit track, thus generating 7200 tracks. A scheme of the overall procedure is shown in figure 62.
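The number of scanned tracks follows from the step size: a 3° grid over the full sky has 120 azimuth steps and 60 zenith steps. The sketch below builds such a grid of unit direction vectors. The regular zenith-azimuth binning, the bin-centre offset in zenith and the function name are assumptions; the sketch does not attempt to reproduce how the grid is anchored to the prefit track.

```python
import math


def scan_directions(step_deg=3.0):
    """Unit vectors on a regular zenith-azimuth grid covering the full sky."""
    n_azimuth = round(360.0 / step_deg)
    n_zenith = round(180.0 / step_deg)
    directions = []
    for i in range(n_zenith):
        # Bin centres in zenith avoid duplicating the two poles (assumption).
        zenith = math.radians((i + 0.5) * step_deg)
        for j in range(n_azimuth):
            azimuth = math.radians(j * step_deg)
            directions.append((
                math.sin(zenith) * math.cos(azimuth),
                math.sin(zenith) * math.sin(azimuth),
                math.cos(zenith),
            ))
    return directions


grid = scan_directions(3.0)
print(len(grid))
```

Each direction in the grid would serve as a starting hypothesis for the subsequent fits, with the best final result retained.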
As described in section 4.2, the optical background induced by $^{40}$K decays has been simulated adding an uncorrelated hit rate of 10 kHz per PMT and a time-correlated hit rate of 500 Hz per DOM (two coincident hits in different PMTs inside the same DOM). To remove the hits from $^{40}$K decays, the requirement of space-time coincidences between hits is used, since hits due to optical background are mostly uncorrelated.
In particular, the hit selection proceeds by first selecting all the local coincidences, i.e. coincidences of hits within the same DOM, in a time window of 10 ns and for which the PMTs involved are less than 90° apart. Among them, a cluster is selected such that any hit in the cluster is causally related to all the remaining ones, according to a causality relation between the time difference $\Delta t$ of the two hits, the distance $d$ between the two PMTs and the group velocity of light in water $c_{\mathrm{water}}$. The cluster of hits obtained is further extended by including the yet unselected hits which fulfil all the following conditions:
• are causally connected to at least 75% of all the hits in the cluster,
• are closer than 50 m to at least 40% of all the hits in the cluster,
• are all causally connected among them.
The latter extension procedure is iterated twice. The resulting performance of the hit selection for $\nu_\mu$ CC events in terms of efficiency and purity is shown in figure 63, where the efficiency is the fraction of signal hits selected among all the signal hits, whereas the purity is the fraction of signal hits among all the selected ones. The resulting set of hits is referred to as Selected hits in the following.
Figure 63. Efficiency and purity of the hit selection adopted by the track reconstruction algorithm as a function of the interacting neutrino energy.
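The cluster extension step can be sketched as follows. The explicit form of the causality test (|Δt| ≤ d/c_water + tolerance), the tolerance of 20 ns, the group velocity of 0.217 m/ns, the greedy treatment of the third condition and all names are assumptions made for illustration; only the 75%, 40% and 50 m thresholds and the two iterations are taken from the text.

```python
import math
from dataclasses import dataclass

C_WATER_M_PER_NS = 0.217   # assumed group velocity of light in water
TOLERANCE_NS = 20.0        # assumed slack in the causality test


@dataclass(frozen=True)
class Hit:
    x: float
    y: float
    z: float
    t: float  # hit time in ns


def distance(a, b):
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


def causally_connected(a, b):
    # Assumed form of the causality relation.
    return abs(a.t - b.t) <= distance(a, b) / C_WATER_M_PER_NS + TOLERANCE_NS


def extend_cluster(cluster, candidates):
    """Add candidates that satisfy the three conditions (greedy sketch)."""
    accepted = []
    for hit in candidates:
        n_causal = sum(causally_connected(hit, c) for c in cluster)
        n_close = sum(distance(hit, c) < 50.0 for c in cluster)
        if (n_causal >= 0.75 * len(cluster)
                and n_close >= 0.40 * len(cluster)
                and all(causally_connected(hit, a) for a in accepted)):
            accepted.append(hit)
    return cluster + accepted


def select_hits(cluster, other_hits, n_iterations=2):
    """Iterate the extension twice, as described in the text."""
    selected = list(cluster)
    for _ in range(n_iterations):
        remaining = [h for h in other_hits if h not in selected]
        selected = extend_cluster(selected, remaining)
    return selected


# Hypothetical example: four cluster hits along z, three candidates.
seed = [Hit(0.0, 0.0, z, z / C_WATER_M_PER_NS) for z in (0.0, 10.0, 20.0, 30.0)]
good = Hit(5.0, 0.0, 15.0, 70.0)     # near in space and time
late = Hit(5.0, 0.0, 15.0, 2000.0)   # fails the causality condition
far = Hit(200.0, 0.0, 15.0, 900.0)   # causal but too far away
selected_hits = select_hits(seed, [good, late, far])
print(len(selected_hits))
```

The greedy acceptance order is one possible reading of the requirement that the added hits be causally connected among themselves; other strategies, such as searching for the largest mutually consistent subset, would also satisfy the description.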
The Selected hits serve as input to the first step of the track reconstruction procedure, referred to as 'linear prefit', which is a linear fit through the positions of the hits. Once a first estimate of the track is obtained, the expected angle of incidence $\theta_i$ of the photon on the PMT can be evaluated for each hit. An 'angular selection' is then applied, discarding all the hits with $\cos\theta_i > 0$.
The quality of the final fit is expressed through a parameter $\Lambda$ (equation (24)) that depends on $N_{\mathrm{hits}}$, the number of hits used in the final fit, and on the maximum value of the likelihood.

4.3.2. Neutrino energy estimate.
The neutrino energy estimation is performed in two steps: first the muon energy is estimated by reconstructing the muon track length and the interaction vertex, then the neutrino energy is estimated depending on the reconstructed muon length and the number of hits used by the track reconstruction algorithm. These two procedures are described in detail in the following sections.
(1) The detected photons are projected back to the track according to the Cherenkov angle. The first track length estimate, $l'_\mu$, is then defined as the distance between the positions of the first and last projected photon on the track. The first projected photon is the first vertex estimate $V'$. If the muon is generated inside or near the instrumented volume, $V'$ is an estimate of the interaction vertex, otherwise it indicates the first photon seen by the detector. For these reasons the vertex estimate will be referred to as the 'pseudo-vertex' estimate in the following.
(2) Some specific features of the hits from the hadronic shower are identified and used to select a set of hits around the first pseudo-vertex estimate.
The percentage of background hits contained in the set of track-hits is below 2%. The set of track-hits contains 60%-70% of the total amount of hits coming from the track. On the other hand, the contamination due to the hits produced by the hadronic shower increases with the inelasticity y. For low values of y, the largest part of the neutrino energy is transferred to the muon and almost all the selected hits are produced by the true muon track. In this case the purity of the track-hits reaches about 98%. When $y \approx 1$ the hadronic shower takes almost all the neutrino energy and most of the detected hits are due to the shower. Consequently, the purity of the selection decreases to about 20%, the track length is overestimated and the estimated vertex position is some metres away from the real interaction vertex. In such a case, the particles produced at the vertex may even travel backwards with respect to the muon direction. To overcome this problem, a study of the distribution in time and space of hits produced at the interaction vertex has been performed, with the goal of identifying specific features in the reconstruction phase which could be used to distinguish hadronic shower hits from hits due to the optical background and to the muon.
The parameters analysed are the distance $d$ from the estimated pseudo-vertex to the hit position and the transverse and longitudinal projections of $d$ with respect to the reconstructed muon track direction, called $k$ and $l$ respectively. Moreover, the time evolution of the shower hits can also be studied. Under the simplistic assumption that all the hits are emitted from the vertex at a time $t_V$, a hit with distance $d$ from the vertex should occur at a time $t_V + d/v$, if $v$ is the speed of light in the medium. A 'time residual' can thus be defined as $\Delta t = t_i - t_V - d/v$, where $t_i$ is the time of the hit. Finally, the conditions applied to select hits from the shower are: $l < 120$ m, $k < 100$ m, $|\Delta t| < 50$ ns, and a fourth condition relating $k$ and $l$. The first two conditions are intended to reject the optical background hits and identify a region where the shower is likely to be. The other two are used to distinguish the shower hits from the hits due to the muon track.
Figure 65. Normalised distributions of the time residuals for muon neutrino charged current events with energy higher than 5 GeV and whose vertex is reconstructed within the instrumented volume for 4 Bjorken y intervals, with respect to the reconstructed muon track on the left (track hypothesis) and with respect to the reconstructed vertex on the right (shower hypothesis). For the blue curve, the peak at time residual = 0 is less sharp due to the lower resolution of the vertex reconstruction at low Bjorken y interactions, caused by the lower amount of light emitted at the interaction vertex.
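The time residual and the geometric quantities $d$, $l$ and $k$ can be computed per hit as in the following sketch. Only the first three cuts quoted in the text are applied; the value used for the speed of light in the medium and all names are assumptions for illustration.

```python
import math

V_LIGHT_M_PER_NS = 0.217  # assumed speed of light in the medium


def shower_hit_variables(hit_pos, hit_time_ns, vertex_pos, vertex_time_ns,
                         track_dir):
    """Return (time residual, l, k) of a hit w.r.t. the pseudo-vertex.

    track_dir must be a unit vector along the reconstructed muon track.
    """
    rel = [h - v for h, v in zip(hit_pos, vertex_pos)]
    d = math.sqrt(sum(c * c for c in rel))
    l = sum(r * u for r, u in zip(rel, track_dir))   # longitudinal part
    k = math.sqrt(max(d * d - l * l, 0.0))           # transverse part
    dt = hit_time_ns - vertex_time_ns - d / V_LIGHT_M_PER_NS
    return dt, l, k


def passes_first_three_cuts(dt, l, k):
    """Cuts quoted in the text: l < 120 m, k < 100 m, |dt| < 50 ns."""
    return l < 120.0 and k < 100.0 and abs(dt) < 50.0


# Hypothetical hit 5 ns later than expected for direct light from the vertex.
vertex = (0.0, 0.0, 0.0)
hit = (3.0, 4.0, 10.0)
t_hit = math.dist(hit, vertex) / V_LIGHT_M_PER_NS + 5.0
dt, l, k = shower_hit_variables(hit, t_hit, vertex, 0.0, (0.0, 0.0, 1.0))
print(round(dt, 3), l, round(k, 3), passes_first_three_cuts(dt, l, k))
```

A hit arriving much later than expected for direct light from the pseudo-vertex fails the time residual cut even if it lies within the geometric region.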
The cuts are chosen to separate shower hits from muon and background hits as far as possible while keeping the few hits that are produced by the shower at low energy. Hits selected in this way are called shower-hits. In this hit set the contamination due to the background hits is around 2-3%. The purity of the shower-hits increases with the inelasticity, reaching about 75% when $y \sim 1$. The set of shower-hits contains about 50% of the total number of hits coming from the shower. To find the vertex position a maximum likelihood fit is applied to the selected shower-hits. A function obtained from the $\Delta t$ distribution for the simulated shower hits is used as PDF and the final estimate of the vertex position is chosen between the first emission point and the result of the fit. Once the vertex has been identified, the track length is scaled according to the distance between the estimated vertex and the last back-projected photon on the track. The muon energy is then estimated from the reconstructed track length.
The relation between $\mathrm{NDoF}_{\mathrm{fit}}$ and the energy of the interacting neutrino, for a certain interval of reconstructed muon track length, is obtained by fitting the median distribution of $E_\nu$ as a function of $\mathrm{NDoF}_{\mathrm{fit}}$. In order to further improve the accuracy, two different estimations are used, depending on whether the reconstructed Bjorken y is higher or lower than 0.5.
Figure 68 shows the performance for events reconstructed as up-going, whose vertex is reconstructed within the instrumented volume, with a quality cut of the reconstruction algorithm of $\Lambda > -5.0$ (see equation (24)).
The top left plot shows the median distance between the true and estimated vertex positions. Figure 68 also shows the resolution on the reconstructed neutrino zenith angle, and the bottom plot shows the fractional energy resolution.
Another parameter needed to evaluate the reconstruction performance as well as to calculate the sensitivity for the measurement of the NMH is the detector effective volume. The effective volume $V_{\mathrm{eff}}$ can be defined as the volume of a 100% efficient detector for observing neutrinos that interact within that volume, for a set of specified quality cuts. In the simulation adopted, described in section 4.2, all the neutrinos interacting within a volume larger than the instrumented volume and surrounding the detector, referred to as the generation volume $V_{\mathrm{gen}}$, are kept for the subsequent steps of the simulation and, eventually, the reconstruction. The effective volume is then obtained by scaling $V_{\mathrm{gen}}$ with the ratio of the reconstructed events $N_{\mathrm{rec}}$ (or selected according to a given criterion) and the generated events $N_{\mathrm{gen}}$: $V_{\mathrm{eff}} = V_{\mathrm{gen}} \, N_{\mathrm{rec}} / N_{\mathrm{gen}}$. Assuming a seawater density of 1.025 g cm$^{-3}$, the effective volume is converted into an effective mass $M_{\mathrm{eff}}$. The $M_{\mathrm{eff}}$ calculated for events with a quality parameter $\Lambda > -5.0$ and whose vertex is reconstructed within the instrumented volume is plotted in figure 69 as a function of the neutrino energy and for various intervals of the direction of the incoming neutrino.
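The scaling from generation volume to effective volume and effective mass is simple enough to write down directly. In the sketch below the event counts and the generation volume are hypothetical numbers chosen for illustration; in practice the counts would be sums of event weights per energy and direction bin.

```python
def effective_volume_m3(v_gen_m3, n_rec, n_gen):
    """V_eff = V_gen * N_rec / N_gen."""
    return v_gen_m3 * n_rec / n_gen


def effective_mass_mt(v_eff_m3, density_g_per_cm3=1.025):
    """Convert an effective volume to megatons (1 g/cm^3 equals 1 t/m^3)."""
    return v_eff_m3 * density_g_per_cm3 / 1e6


# Hypothetical inputs for one energy bin.
v_eff = effective_volume_m3(v_gen_m3=1.0e7, n_rec=2500, n_gen=10000)
m_eff = effective_mass_mt(v_eff)
print(f"V_eff = {v_eff:.2e} m^3, M_eff = {m_eff:.4f} Mt")
```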

Electron neutrino studies
This section describes the methodology and performance of a reconstruction strategy that has been developed for NC and CC shower-like events in ORCA [172]. Electron neutrino events will play a crucial role in the envisaged mass hierarchy measurement; good angular and energy resolutions are therefore mandatory.
In a CC interaction of an electron neutrino with a nucleon, $\nu_e + N \rightarrow e + h$, where $N$ refers to the target nucleon and $h$ to the hadronic system in the final state, the outgoing electron initiates an electromagnetic shower while the hadronic system develops into a hadronic shower with a possibly complex structure of hadronic or electromagnetic subshowers, depending on the decay modes of individual particles in the shower. In the following, the energy $E_{\mathrm{had}}$ and momentum $\vec{p}_{\mathrm{had}}$ of the hadronic shower are defined by the difference of the respective energy and momentum of the neutrino and the electron: $E_{\mathrm{had}} = E_\nu - E_e$ and $\vec{p}_{\mathrm{had}} = \vec{p}_\nu - \vec{p}_e$.
The angle $\phi_{\mathrm{lep,had}}$ between the outgoing lepton and the hadronic shower in these interactions is similar to that presented in figure 66. The angle is minimal for $y = 0.5$ with a mean value of roughly 25°. For $y \to 0$ ($y \to 1$) the angle between the incoming neutrino and the outgoing hadronic shower (lepton) becomes larger, leading to larger $\phi_{\mathrm{lep,had}}$. For increasing neutrino energies the angle $\phi_{\mathrm{lep,had}}$ becomes smaller.
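The definition of the hadronic shower energy and momentum as differences between neutrino and electron quantities can be illustrated with a short calculation. The sketch below treats the neutrino and the electron as massless (|p| = E), which is an approximation made here, and uses the standard definition of the inelasticity, y = (E_ν − E_e)/E_ν; the function name and the example kinematics are hypothetical.

```python
import math


def hadronic_system(e_nu, dir_nu, e_lep, dir_lep):
    """Return (E_had, p_had, y, angle between lepton and p_had in degrees).

    Energies in GeV; directions are unit vectors. Massless approximation.
    """
    p_nu = [e_nu * c for c in dir_nu]
    p_lep = [e_lep * c for c in dir_lep]
    p_had = [a - b for a, b in zip(p_nu, p_lep)]
    e_had = e_nu - e_lep
    y = e_had / e_nu
    norm_lep = math.sqrt(sum(c * c for c in p_lep))
    norm_had = math.sqrt(sum(c * c for c in p_had))
    cos_phi = sum(a * b for a, b in zip(p_lep, p_had)) / (norm_lep * norm_had)
    phi_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos_phi))))
    return e_had, p_had, y, phi_deg


# Hypothetical 10 GeV neutrino along z, 5 GeV electron scattered by 20 degrees.
theta = math.radians(20.0)
e_had, p_had, y, phi_deg = hadronic_system(
    10.0, (0.0, 0.0, 1.0), 5.0, (math.sin(theta), 0.0, math.cos(theta))
)
print(f"E_had = {e_had} GeV, y = {y}, phi_lep,had = {phi_deg:.1f} deg")
```

The sketch does not handle the degenerate case in which the electron carries the full neutrino momentum, where the hadronic momentum vanishes and the angle is undefined.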

Light production in showers.
Some information about the Cherenkov light production of showers can be found in the literature, e.g. in [174] and references therein. Mostly, however, previous studies have focused on energies well above those relevant for ORCA. Therefore, the most important characteristics of showers, as obtained from MC simulation studies, in the relevant energy range for ORCA are briefly summarised in the following.
In general, an electromagnetic shower consists of a cascade of $e^\pm$ emitting photons via bremsstrahlung, which interact with matter and again produce $e^\pm$ pairs via pair production. The evolution of a hadronic shower is similar but the initial particles are hadrons and the developing cascade will show significantly larger fluctuations as it is dominated by particle decays. In water the electromagnetic and nuclear interaction lengths are roughly 36 cm and 83 cm [92], respectively. Therefore, compared to muon tracks, showers appear in first approximation as a point-like burst of light in the detector. The light is emitted by charged particles with energies above their Cherenkov threshold.
Figure 71. Two simulated electron neutrino events with $E_\nu \approx 10$ GeV and $y \approx 0.5$ in the upper and lower row. Each event is rotated in such a way that the electron is in the z-direction. Left: Illustration of the particles produced in the two events. Each arrow represents one particle. The arrow direction and length correspond to the particle momentum in the $p_y$-$p_z$ plane, and the arrow colour indicates the particle type. Middle and right: photon distributions in sea water recorded on shells at 20 and 50 m around the neutrino interaction vertex. Each photon is weighted with the solid angle averaged effective area of a PMT for the photon wavelength. The Cherenkov ring from the electron is centred around (0, 0) with an opening angle of 42°, as the electron moves in the z-direction.
The longitudinal and transverse light emission profile of electromagnetic and hadronic showers can be seen in figure 70. For the energies of interest the brightest point of a shower is offset roughly 1-2 m in the shower direction. The longitudinal extension of the showers increases with $\log(E)$. In spite of the larger interaction length the longitudinal offset for hadronic showers is smaller than for electromagnetic showers with the same shower energy $E_e = E_{\mathrm{had}}$, since they are initiated by several hadrons, each with an energy below $E_{\mathrm{had}}$, and the initial hadrons have different directions reducing the longitudinal extension when projecting onto the shower axis. The transverse extension of the showers is negligible compared to the longitudinal.
Although an electromagnetic shower consists of many $e^\pm$ pairs with rather short path lengths and overlapping Cherenkov cones, the small pair opening angle preserves the Cherenkov angle peak of the effective angular light distribution which results in a single Cherenkov ring in a projection onto a plane perpendicular to the shower axis. Similarly, each hadronic shower particle with energy above the Cherenkov threshold will produce a Cherenkov ring. Therefore, hadronic showers show a huge variety of different signatures due to the various possible combinations of initial hadron types, their momenta and the diversity of their hadronic interactions in the shower evolution.
Two simulated electron neutrino event examples with $E_\nu \approx 10$ GeV and $y \approx 0.5$ each are shown in figure 71. The Cherenkov photon ring from the electron is clearly visible together with fainter rings from hadronic shower particles. Due to the large scattering length in water, the angular profile of the emitted light is well conserved over large distances, which leads to the distinct visible Cherenkov rings.
While electromagnetic showers show only negligible fluctuations in the number of emitted Cherenkov photons and in the angular light distribution, hadronic showers show significant intrinsic fluctuations in the relevant energy range. These intrinsic fluctuations of hadronic showers and the resulting limitations for the energy and angular resolutions have been studied in detail, see [160].
In hadronic showers also muons can be produced via charged pions, which can lead to a wrong flavour classification of the event (see section 4.5). The relevance of muons leaking out of hadronic showers and their energy distribution has been studied in detail, see section 4.2.3.
The averaged angular light distribution for electromagnetic and hadronic showers is shown in figure 72. For both shower types the probability to detect at least one photon within one DOM (DOM-hit probability) is maximal at the Cherenkov angle of 42°, but it is more peaked for electromagnetic than for hadronic showers. At smaller distances the Cherenkov peak becomes washed out due to the extension of the shower in conjunction with the small lever arm for the definition of the angle with respect to the shower direction. Note that the light distribution for a single hadronic shower event will not be as smooth as shown in these plots due to the distinct Cherenkov rings from each hadron.

4.4.1.3. Sensitivity to the reaction inelasticity y.
Electromagnetic and hadronic showers induced by neutrino interactions in the energy range relevant for the NMH measurement show slightly different light emission characteristics in the detector. Due to the large scattering length in water these differences are conserved over sufficiently large distances, so that information from a large detector volume can contribute to the discrimination between the two shower types. In electron neutrino CC events, in which both an electromagnetic and a hadronic shower are present at the same time and partly overlapping, the angular separation $\phi_{e,\mathrm{had}}$ of both showers can help to distinguish between them. This can make an estimation of the reaction inelasticity y in $\nu_e$ CC events feasible. Additionally, it might allow for a partial separation of $\nu_e$ CC and NC events on a statistical basis. However, with an ORCA-like detector (see footnote 63) it seems impossible to distinguish a shower induced by a single electron from a shower induced by a single hadron, since both resulting Cherenkov light cones will be of the same intensity for the same particle energy. Figure 71 (bottom) shows a simulated example event, in which the electron ($E_e = 3.71$ GeV) and a hadronic shower particle induce Cherenkov rings of similar intensity.
The most intense Cherenkov ring in $\nu_e$ CC events is in most cases that of the electron, as can be inferred from the distribution of the inelasticity parameter y in section 4.2 and keeping in mind that the hadronic shower energy $E_{\mathrm{had}}$ is often shared between many hadrons. A measure for the intensity of a Cherenkov ring, $E_x^{\mathrm{cher}}$, induced by a particle $x$ with energy $E_x$ can be defined by:
for all neutrino interaction types (see figure 73). This is even the case for NC events, which are in principle very similar (see footnote 64) to $\nu_e$ CC events with the same neutrino energy as the hadronic energy in the NC events and an inelasticity of y = 1.
63 Detector with a spacing between optical sensors of several metres up to a few tens of metres.

Shower reconstruction algorithm.
A neutrino-induced shower-like event is characterised by 8 free parameters: vertex position $\vec{x}_{\mathrm{vtx}}$ and time $t_{\mathrm{vtx}}$, energy $E$, direction $\hat{e}_s$ and inelasticity $y$. The shower direction is characterised by 2 angles.
The shower reconstruction is performed in two steps. In the first step the vertex is reconstructed based on the recorded time of the PMT signals, commonly called hits, and in the second step the direction, energy and inelasticity are reconstructed based on the number of hits and their distribution in the detector. In both steps a maximum likelihood fit is performed for many different starting shower hypotheses and the solution with the best likelihood is chosen.
This factorisation of the fitting procedure works well due to the homogeneity of water and its large scattering length which allows for a precise vertex reconstruction independent of the shower direction. The vertex reconstruction is performed in two successive maximum likelihood fits. For both fits, the likelihood for the vertex hypothesis is a function of the hit time residuals for a given shower hypothesis.
64 Small differences are due to different characteristics of hadronic showers induced by W or Z bosons.
The first vertex fit (prefit) is designed to be very robust against noise hits and an imprecise initial vertex hypothesis. The initial hit selection is optimised for low-energy shower-like events and is described below together with the choice of the initial vertex hypothesis. In the prefit, a dedicated function $g$ of the hit time residuals is used. Based on the initial hit selection, a total of 15 starting vertex hypotheses are generated for the prefit. The fitted vertex with the best likelihood is chosen as the result of the prefit.
The second vertex fit is more precise but needs a hit selection with higher signal purity and a good starting vertex hypothesis. The result of the prefit is used to generate a total of 10 starting vertex hypotheses (the result of the prefit and 9 vertex hypotheses around it with time shifts of 25 ns and position shifts of 5 m in a random direction). A rather pure signal hit selection is achieved by selecting hits according to criteria that involve the angle ψ between the PMT direction (vector normal to the photocathode plane) and the vector from the vertex to the PMT, i.e. only PMTs which are orientated towards the vertex and can be hit by unscattered photons are taken into account. The fit uses a function $g(t_{\mathrm{res}})$ obtained from simulated events, which depends on the distance $d$. Such distributions are shown for three different distances $d$ in figure 74. With increasing distance the peak of direct hits becomes broader due to scattering and dispersion, and the hit probability decreases due to absorption, leading to a relative increase of the noise level.
The fitted vertex with the best likelihood and within 10 m and 50 ns around the result of the prefit is chosen as final vertex.
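The seeding and final selection of the second vertex fit can be illustrated with a minimal sketch. The function names, the tuple layout (x, y, z, t), the random sign of the time shift and the use of a log-likelihood value (larger is better) are assumptions made for illustration; only the numbers (9 extra hypotheses, 25 ns, 5 m, 10 m, 50 ns) come from the text.

```python
import math
import random

def random_unit_vector(rng):
    """Draw a direction uniformly distributed on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)

def seed_hypotheses(prefit, n_extra=9, dt_ns=25.0, dx_m=5.0, rng=None):
    """Return the prefit vertex plus n_extra shifted copies.

    prefit is (x, y, z, t) in metres and nanoseconds.
    """
    rng = rng or random.Random(0)
    x, y, z, t = prefit
    seeds = [prefit]
    for _ in range(n_extra):
        ux, uy, uz = random_unit_vector(rng)
        sign = rng.choice((-1.0, 1.0))  # assumption: the time shift may have either sign
        seeds.append((x + dx_m * ux, y + dx_m * uy, z + dx_m * uz, t + sign * dt_ns))
    return seeds

def pick_final_vertex(fits, prefit, max_dist_m=10.0, max_dt_ns=50.0):
    """Pick the best fit that stays close to the prefit result.

    fits is a list of ((x, y, z, t), log_likelihood) pairs.
    Returns None if no fit lies inside the window.
    """
    allowed = [
        (vtx, logl) for vtx, logl in fits
        if math.dist(vtx[:3], prefit[:3]) <= max_dist_m
        and abs(vtx[3] - prefit[3]) <= max_dt_ns
    ]
    if not allowed:
        return None
    return max(allowed, key=lambda item: item[1])[0]
```

In this sketch each seed would be passed to the likelihood fit, and `pick_final_vertex` would then be applied to the fitted results.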

Initial hit selection for first vertex fit.
For the initial selection of shower-like hits the following hit patterns are defined:
• L1: coincidence between hit times of two PMTs on the same DOM in a time window $\Delta t \le 10$ ns.
• L2: L1 with an angle between the hit PMTs smaller than 90°; note that these are the same definitions as used in the triggers, see section 4.2.5.
• L3: coincidence between hits on three PMTs on the same DOM in a time window $\Delta t \le 10$ ns.
• V2L2: coincidence between two L2 hits on different DOMs which are closer than 35 m and within a time window $\Delta t \le t_D + 10$ ns, where $t_D$ is the time required by the light to travel the distance D between the two DOMs.
• T0L0: coincidence between two hits on adjacent or next-to-adjacent DOMs on the same string in a time window $\Delta t \le t_D + 10$ ns.
The general strategy is to first find a reference hit that is very likely a signal hit and close to the neutrino interaction vertex. The position/time of this reference hit is then used as an initial vertex hypothesis to select additional hits based on their time residual and further requirements to suppress noise hits.
Firstly, the largest cluster of causally connected L2 hits is selected by requiring $\Delta t \le D / c_{water} + 10$ ns for all L1 hits within the cluster. From these causally connected L2 hits the subset of hits that additionally satisfy the L3 or V2L2 criteria is selected. These L3 or V2L2 hits are ranked according to their hit multiplicity (number of coincidences on the same DOM) as well as the number and multiplicity of causally connected hits in the vicinity of 25 m. The most signal-like hit is chosen as 'reference hit'.
Secondly, all hits around the reference hit are selected that are closer than 100 m, within a time window of $-250\,\mathrm{ns} < t_{res} < 10\,\mathrm{ns}$ and causally connected with most L3 or V2L2 hits. The loose lower time cut allows for distances up to about 50 m between the true neutrino interaction vertex and the reference hit, e.g. because the neutrino interaction is outside the detector volume. The drawback of this relatively large time window is a contamination with noise hits. Therefore, hits are discarded that do not satisfy the L1 criterion, or are either causally connected with an adjacent L3 or V2L2 hit on the same string or fulfil the T0L0 criterion in addition to being causally connected with an L3 or V2L2 hit in the vicinity of 25 m. The hits selected by this procedure are used in the first vertex fit and the position/time of the 15 most signal-like hits are used as initial vertex hypotheses.
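As an illustration of two building blocks of this selection, the following sketch finds L1 coincidences and tests the causality relation between two hits. The hit representation (tuples of DOM id, PMT id and time), the function names and the group refractive index of 1.38 used for the speed of light in water are assumptions, not values taken from the text.

```python
import math

# Assumption: group refractive index of about 1.38 for sea water.
C_WATER_M_PER_NS = 0.299792458 / 1.38

def l1_coincidences(hits, window_ns=10.0):
    """Find pairs of hits on different PMTs of the same DOM within window_ns.

    hits is a list of (dom_id, pmt_id, time_ns) tuples.
    """
    by_dom = {}
    for hit in hits:
        by_dom.setdefault(hit[0], []).append(hit)
    pairs = []
    for dom_hits in by_dom.values():
        dom_hits.sort(key=lambda h: h[2])  # time-ordered hits per DOM
        for i, first in enumerate(dom_hits):
            for second in dom_hits[i + 1:]:
                if second[2] - first[2] > window_ns:
                    break  # later hits are even further away in time
                if second[1] != first[1]:
                    pairs.append((first, second))
    return pairs

def causally_connected(pos_a, t_a, pos_b, t_b, slack_ns=10.0):
    """Check |dt| <= D / c_water + slack for two hits at positions in metres."""
    distance = math.dist(pos_a, pos_b)
    return abs(t_a - t_b) <= distance / C_WATER_M_PER_NS + slack_ns
```

The L2, L3 and V2L2 patterns would add the angular, multiplicity and inter-DOM distance requirements on top of these two checks.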

Reconstruction of energy, direction and inelasticity.
Once the shower vertex is fixed, the remaining parameters which can be fitted are the shower energy E, direction $\hat{e}_s$ and the reaction inelasticity y. In principle all of these parameters can be inferred from the angular light distribution (see figure 72): the shape is sensitive to the inelasticity y, the integral is to first order proportional to the energy (as the light yield is to first order proportional to the shower energy) and the direction in which this angular light emission profile is present gives the shower direction.
In the following, the shower energy E, direction $\hat{e}_s$ and inelasticity y are reconstructed using a maximum likelihood fit based on the probability that the hit pattern is created by a trial shower hypothesis $a = (t_{vtx}, \vec{x}_{vtx}, E, y, \hat{e}_s)$. As discussed in section 4.4.1, the electron is mostly the dominant particle in $\nu_e$ CC events and produces the brightest Cherenkov ring. Therefore, the reconstruction is designed to find the electron direction $\hat{e}_e$ and not the neutrino direction.
The final hit selection, the definition of the likelihood function and the fitting procedure are described in the following.

Final hit selection.
Based on the result of the vertex fit, hits are selected according to the following criteria: […] For simplification 65, all PMT hits on the same DOM are merged and the times of the individual hits are not taken into account, so that the event is quantified by $N_{hits}^{DOM}$ for each DOM. For the fit all DOMs with 10 m < d < 80 m are taken into account, including the DOMs without any selected hit. The probability to detect $N_{hits}^{DOM}$ hits on a given DOM depends on: E, y, the distance d between the vertex and the DOM, the angle θ between the shower direction $\hat{e}_s$ and the vector $\vec{d}$ from the vertex to the DOM, and the DOM orientation. The DOM orientation can be described by a single angle β between $\vec{d}$ and the DOM direction, because the angular acceptance of the entire DOM (sum of all PMT angular acceptances) shows to first order a rotational symmetry due to the multi-PMT structure, see section 2. All of these quantities are illustrated in figure 75.
The likelihood is computed as follows: […] are obtained from MC simulations of events. An example distribution of the expected number of photons $\langle N_\gamma \rangle$ as a function of the angle θ for different inelasticity y intervals is shown in figure 76. As the angle θ is defined with respect to the electron direction, a clear Cherenkov peak of the electron at 42° is visible. With higher inelasticity y this peak becomes fainter due to less energetic electrons, while the number of expected photons in the 'off-peak region' ($\theta \gtrsim 60^\circ$) increases due to the more energetic hadronic showers. Therefore, these PDF tables gain sensitivity to the reaction inelasticity y from the ratio of the peak to the off-peak region. […] The other four seeds are randomly chosen perpendicular to the first starting shower hypotheses with the same energy.
Finally, the result with the best likelihood of all 45 fits is selected. Thus, the final result has a discrete value for the reconstructed inelasticity y.

[…] This coverage cut is introduced to ensure that a reasonable fraction of the expected hit pattern from the reconstructed shower is contained in the instrumented volume. Therefore the coverage cut is in principle a containment cut for the reconstructed vertex depending on the reconstructed shower direction.

Effective volume: the effective volume for up-going $\nu_e$ CC and $\bar{\nu}_e$ CC events is shown in figure 77 as a function of neutrino energy for different neutrino zenith angle ranges. Depending on the zenith angle the plateau reaches 3.8 Mm³ (horizontal), 3.6 Mm³ (vertical up-going) and around 3.7 Mm³ for all up-going $\nu_e$ and $\bar{\nu}_e$. The turn-on is slightly steeper for vertical up-going than for horizontal events as more PMTs are oriented downward than upward in a DOM.

[…] and is dominated by the longitudinal vertex resolution. This precise vertex reconstruction justifies the factorisation of the shower reconstruction into a vertex reconstruction and a shower energy, direction and inelasticity reconstruction.
The fitted mean longitudinal vertex shift (in metres) is shown in figure 79 as a function of $E_\nu$ and Bjorken y. The increasing distance of the reconstructed shower bright point from the interaction vertex with increasing neutrino energy is clearly visible.
Direction resolution: the median neutrino direction resolution (the angle between the reconstructed direction and the neutrino direction) as a function of neutrino energy is shown in figure 80 for different neutrino zenith angle ranges and for $\nu_e$ and $\bar{\nu}_e$ separately. For events weighted with the Bartol flux model the median directional resolution is better than 10° for energies above 8.5 GeV for $\nu_e$ CC and above 5.5 GeV for $\bar{\nu}_e$ CC events. The resolution is slightly better for vertical up-going than for horizontal neutrinos as more PMTs are oriented downward than upward in a DOM.
As the reconstruction is designed to find the electron direction, the resolution is better for $\bar{\nu}_e$ than for $\nu_e$ due to the smaller average inelasticity for $\bar{\nu}_e$, leading on average to a smaller intrinsic scattering angle between the neutrino and the electron. The median intrinsic scattering angle, the median resolution with respect to the electron direction and the neutrino direction as a function of neutrino energy are shown in figure 81. For the relevant energy range the median electron direction resolution is smaller than the intrinsic scattering angle and the median neutrino direction resolution, verifying that the reconstruction actually has the ability to find the electron in electron neutrino CC events. Figure 82 shows the median electron direction resolution as a function of electron energy for different true inelasticity y ranges. The reconstruction of the electron direction is only slightly affected by the additional light from the hadronic shower up to $y \approx 0.5$. For $y \gtrsim 0.6$ the reconstruction can additionally be confused by high-energy particles in the hadronic shower producing a brighter Cherenkov ring than that from the electron. Due to momentum conservation the most energetic particles produced in neutrino interactions tend to have smaller scattering angles with respect to the neutrino direction. Therefore, by sometimes reconstructing the dominant particle from the hadronic shower, the median neutrino direction resolution for $\nu_e$ CC events is slightly better than the intrinsic scattering angle between neutrino and electron for neutrino energies above ~5 GeV, as can be seen from figure 81.
Inelasticity resolution: the resolution on the inelasticity y for a low, medium and high y range is shown in figure 83 (left) for 6 GeV < $E_\nu$ < 12 GeV. The distributions of the […] events. This is a feature of the reconstruction algorithm. Due to the sensitivity to y the $y_{reco}$ distribution is different for $\nu_e$ CC and $\bar{\nu}_e$ CC events, leading to a separation power between both channels. This sensitivity to y can also be used to separate […] events from […] events.

Energy resolution: in figure 84 (left) the reconstructed energy is shown as a function of the neutrino energy for events weighted according to the Bartol flux model. The reconstructed energy is systematically higher than the neutrino energy. Therefore, an energy correction depending on the reconstructed zenith angle $\theta_{reco}$, inelasticity $y_{reco}$ and reconstructed energy $E_{reco}$ is applied. The corrected reconstructed energy $E_{reco}^{corr}$ is given by […] where the three-dimensional correction function $f(\theta_{reco}, y_{reco}, E_{reco})$ has been calculated from MC such that the median reconstructed energy is equal to the neutrino energy assuming a Bartol flux model. The corrected reconstructed energy as a function of the neutrino energy is shown in figure 84 (right).
The difference between reconstructed and neutrino energy in different neutrino energy bins is shown in figure 85 for $\nu_e$ CC and $\bar{\nu}_e$ CC events separately. These distributions are very well described by Gaussians.
The median fractional energy resolution, given as $|E_{reco} - E_\nu| / E_\nu$, is better than 18% for neutrino energies above 5 GeV for up-going $\nu_e$ CC and $\bar{\nu}_e$ CC events and is shown as a function of neutrino energy in figure 86. The relative energy resolution, given as the RMS of the $(E_{reco} - E_\nu)$ distributions (see figure 85) over neutrino energy, is better than 26% (24%) for neutrino energies above 7 GeV for up-going $\nu_e$ ($\bar{\nu}_e$) CC events and is shown as a function of visible energy $E_{vis}$ in figure 87 (left) together with the resolution for the other shower-like neutrino interaction channels. For electron neutrino CC events the visible energy is equal to the neutrino energy. The resolution is better for $\bar{\nu}_e$ CC events than for $\nu_e$ CC events due to the lower average contribution from the hadronic shower, which shows larger fluctuations than electromagnetic showers [160]. Figure 87 (right) shows the mean relative offset between the mean reconstructed energy and the visible energy. At energies corresponding to the effective volume turn-on region the reconstructed energy is overestimated for $\bar{\nu}_e$ CC and $\nu_e$ CC events, as only events that appear more energetic than they actually are pass the event selection criteria. Above ~9 GeV the reconstructed energies are slightly overestimated (underestimated) for $\bar{\nu}_e$ ($\nu_e$) CC due to the smaller light yield of hadronic showers compared to electromagnetic showers.

The effective volume for up-going shower-like neutrino events is shown in figure 88 (left) as a function of neutrino energy. The turn-on is much less steep for NC and $\nu_\tau$ CC events than for electron neutrino CC events, as the outgoing neutrinos are invisible to the detector. For […] events the turn-on is steeper than for […] events as on average the visible energy in […] events is larger than in […] events. In $\nu$ NC events the average inelasticity is higher than in $\bar{\nu}$ NC events, leading to more energetic hadronic showers and a steeper turn-on. The median directional resolution is shown in figure 88 (right) as a function of neutrino energy.
The directional resolution for NC and $\nu_\tau$ CC events is clearly worse than for electron neutrino CC events as the information of the outgoing neutrinos is unavailable. As the angle between the hadronic shower and the neutrino is smaller for $\nu$ NC than for $\bar{\nu}$ NC events due to a higher average inelasticity, the directional resolution is better.
The relative energy resolution, given as RMS over visible energy $E_{vis}$, for up-going shower-like neutrino events is shown as a function of $E_{vis}$ in figure 87 (left). $E_{vis}$ is defined as the difference between the energy of the incoming neutrino and the outgoing neutrino(s) from the primary neutrino interaction (NC events) or τ-decay ($\nu_\tau$ CC events). The resolution is worse for events with higher average contribution from hadronic showers, which show larger fluctuations [160].
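The two energy resolution measures used in this section can be sketched as follows for a list of simulated events in one energy bin. The function names are hypothetical, and the RMS is interpreted here as the spread about the mean (population standard deviation), which is an assumption; the text does not state which convention is used.

```python
import statistics

def median_fractional_resolution(e_reco, e_true):
    """Median of |E_reco - E_true| / E_true over all events."""
    return statistics.median(abs(r - t) / t for r, t in zip(e_reco, e_true))

def relative_rms_resolution(e_reco, e_true, e_reference):
    """Spread of (E_reco - E_true) divided by a reference energy.

    e_reference is the neutrino (or visible) energy representing the bin.
    Assumption: 'RMS' means the standard deviation about the mean.
    """
    differences = [r - t for r, t in zip(e_reco, e_true)]
    return statistics.pstdev(differences) / e_reference
```

For example, four events with a true energy of 10 GeV reconstructed at 9, 11, 12 and 10 GeV give a median fractional resolution of 10% in this sketch.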
Due to the smaller light yield of hadronic showers compared to electromagnetic showers, the ratio $\langle E_{reco} \rangle / E_{vis}$ is different for each neutrino interaction channel and energy dependent. This can be seen in figure 87 (right). The higher the fraction of electromagnetic shower component in the event, the higher is the mean reconstructed energy. This also leads to different turn-on behaviours in the effective volume for both shower types, and consequently to different compositions (in terms of electromagnetic and hadronic shower components) of well reconstructed neutrino events. The latter explains the behaviour below $E_{vis} \sim 10$ GeV. The distribution of the reconstructed inelasticity $y_{reco}$ for events with hadronic shower energies of 6 GeV < $E_{had}$ < 12 GeV is shown in figure 89. As expected, the $y_{reco}$ distribution for NC events looks similar to the distribution for electron neutrino CC events with 0.8 < y < 1 (see figure 83), but different to the other y ranges, leading to a separation power between shower-like events from NC and electron neutrino CC events.

The effective volumes for the masked detectors with different vertical spacings are shown in figure 90 (top left). For all detector configurations a similar plateau value is reached, but the turn-on is less steep for smaller DOM density (larger vertical spacing). Assuming the […] In figure 91 the resolutions for the different vertical spacings are summarised. The resolution on both the neutrino direction and energy deteriorates slightly. The performance for other shower-like neutrino events for different vertical spacings is similar as described previously.
4.4.6. Effect of variation in water/PMT properties and noise level on reconstruction performance.
The reconstruction performances have been studied for a variation in water properties, PMT QE and optical background noise. For this purpose, the absorption and scattering lengths $\lambda_{abs}$ and $\lambda_{scat}$ have been changed by ±10%, while the QE has been changed by −10%; a fuller discussion of these parameters is given in section 3.4. To test the influence of the optical background, the single noise rate is increased from an already conservative 10 kHz to 20 kHz in the whole detector. Bioluminescence does not produce correlated noise apart from random coincidences and can be simulated by increasing single noise rates.

4.4.6.1. Effect of known parameter variations.
It is assumed that the true water, PMT and noise properties are known so that they can be accounted for in the reconstruction. The trigger conditions are unchanged compared to the nominal values 67 and events are selected according to the same criteria as for the nominal values. This study has been performed for the detector with 6 m vertical spacing; similar effects are expected for larger spacings.
The energy and direction resolution for a known variation in water, PMT and noise properties is shown in figures 92 and 93 together with the performance for the nominal values. For all studied variations the direction resolution is unaffected, as the direction resolution is dominated by the intrinsic scattering angle and not by detector effects. The energy resolution deteriorates slightly for a lower number of detected photons, i.e. reduced $\lambda_{abs}$ or QE.

67 For 20 kHz single noise rate the trigger rate from pure noise would be too high, so that the trigger conditions would have to be tightened. However, the purpose of this study is to demonstrate the robustness of the reconstruction with respect to an increased noise rate.
For 20 kHz single noise rates the resolutions are as good as for 10 kHz, confirming the good S/N ratio due to small time windows in the hit selections (see section 4.4.2) allowed by the large scattering length in water.
The effective volumes are shown in figure 94. For all studied variations in water, PMT and noise properties a similar plateau value is reached, but the turn-on is less steep for fewer detected photons, i.e. reduced $\lambda_{abs}$ or QE. For 20 kHz single noise rates the effective volume is only slightly lower compared to a 10 kHz noise rate.
The negligible deterioration in direction and energy resolution in conjunction with the relatively modest loss in effective volume for an increase in single noise rates by a factor of two 68 compared to the nominal assumed rate of 10 kHz demonstrates the robustness of the reconstruction against higher noise rates. Consequently, it is expected that the assumed performance can be achieved for most of the data taking time.
4.4.6.2. Effect of undetected parameter variations.
While the direction and energy resolutions are unaffected, figure 95 depicts the ratio of the mean reconstructed energy for nominal and varied water and PMT properties. Variations of the same properties and magnitude as above have been used for this study, but the underlying assumption is now that the variation relative to the nominal values is not known and not accounted for in the reconstruction. An exemplary ±10% variation in scattering length has a negligible effect on the mean reconstructed energy, while the same variation in the absorption length induces a corresponding shift in reconstructed energy of ±8%. A decrease in QE of 10% results in a corresponding downward shift in energy of 10%.

Flavour identification and muon rejection
The determination of the NMH requires a precise estimate of the neutrino energy and zenith angle and a high-purity event sample. In addition, since neutrino events of all flavours are reconstructed, the discrimination between neutrino flavours is necessary. In this section an event type discrimination algorithm is developed and its performance is outlined. The algorithm is conceived with the distinction between three classes of events in mind. These classes are 'atmospheric muons', 'shower-like' and 'track-like' neutrino events. In particular the atmospheric muon event class is induced by the passage of downward-going muon bundles coming from cosmic-ray air showers which are misreconstructed as upward-going, i.e. neutrino-induced, events. Track-like events are those that are induced by charged current muon neutrino interactions, having the signature of a straight track passing through or nearby the instrumented volume. Finally, shower-like events are those coming from all other neutrino interaction channels and flavours: all NC interactions and the CC interactions of electron and tau neutrinos 69.

4.5.1. Methodology.
In order to optimally exploit the information imprinted in the light emission of the events, several machine learning algorithms, so-called classifiers, have been evaluated. Finally, a classification algorithm known as random decision forest (RDF) [175] has been used in this study.

68 This is even a factor of 2.5 compared to the measured 8 kHz, see section 4.2.
69 Except for those roughly 18% of τ decays producing a muon.
An RDF consists of many decision trees that individually categorise an event into different classes. Each decision tree consists of several nodes. During classification a number of features, i.e. observables contributing discrimination power, are calculated for an event. At each node a decision in favour of a class is taken and the event is pushed to a child node according to the result of the decision. The individual node decisions in a tree are found by a cut on one of the calculated features. The cuts are chosen so that they maximise performance key figures such as the signal class purity. In this way the event is classified as more likely to be a track, a shower or an atmospheric muon. The decision process is repeated until the event reaches a leaf, a node without children, and the classification into one of the classes is finished.
A decision tree is trained on MC event data. A major disadvantage of single trees, however, is the low ability to generalise the trained tree, i.e. the ability to not only reproduce the features in the training sample. Several methods are proposed in the literature to improve the performance of single decision tree methods and we use the RDF approach. For an RDF many of the above described decision trees are trained simultaneously, a total of 101 in our study. Instead of using all features at once, for each tree a predefined fraction of features and events is selected. Finally, the classification is done by a majority decision of the trees as described above. The purity and efficiency of the classification can be set by defining cuts different from a simple 50% majority decision.
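The majority decision and the per-tree subsampling can be illustrated with a small sketch. It takes the per-tree class votes as given rather than training trees; the function names, class labels and the subsampling fraction are hypothetical, and only the number of trees (101) and the 50% default threshold come from the text.

```python
import random
from collections import Counter

N_TREES = 101  # number of trees quoted in the text

def draw_subsample(items, fraction, rng):
    """Pick a random subset holding the given fraction of items (at least one)."""
    k = max(1, round(fraction * len(items)))
    return rng.sample(items, k)

def vote_fraction(tree_votes, target):
    """Fraction of trees that voted for the target class."""
    return Counter(tree_votes)[target] / len(tree_votes)

def classify(tree_votes, target, threshold=0.5):
    """Accept the target class if more than `threshold` of the trees vote for it."""
    return vote_fraction(tree_votes, target) > threshold
```

Raising `threshold` above 0.5 in this sketch trades efficiency for purity: fewer events are accepted as the target class, but those accepted have a larger share of the trees in agreement.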

Event preselection.
Even though the detector will be located under more than 2000 m of sea water, the number of atmospheric muons arriving at the detector and being triggered (see section 4.2.5) is larger than that of atmospheric neutrinos by several orders of magnitude. However, since the atmospheric muon flux is fully shielded by the Earth, restricting the analysis to upward-going events makes it possible to search for neutrinos. Nonetheless, Cherenkov photons from atmospheric muons can produce a hit pattern in the detector such that reconstruction algorithms still reconstruct the event as upward-going. A pre-selection of events is necessary before training the RDF, since it would not be able to handle such a large contamination of atmospheric muons. Both reconstruction strategies described in the previous section can provide a proper rejection of the atmospheric background without significantly reducing the number of good neutrino events.
First, each event is required to be reconstructed as upward-going. This holds both for the track and the shower reconstruction algorithm. Then two different sets of quality criteria are applied, one for the muon and one for the shower reconstruction method. The logical 'OR' of the two chosen criteria is used to define the input sample for the RDF.
Concerning the shower reconstruction algorithm, a preliminary event selection is implicitly done in the reconstruction itself, see section 4.4.3. Through-going atmospheric muons release a large amount of light in the detector and can easily be separated from low energy neutrino showers. This is done by requiring a proper hit selection in the shower reconstruction itself. This is not the case for the muon track reconstruction algorithm, for which both the signal and the background events show the same hit topology. For bright reconstructed shower events it is required that the hit pattern is compatible with a point-like emission. Here, bright events are defined as events with more than 15 causally connected L2 hits (defined as in section 4.4.2) and the compatibility with a point-like emission is evaluated based on the time residuals of these L2 hits with respect to the reconstructed vertex. If the difference between the 80% and 20% quantiles of the time residuals is smaller than 15 ns, the event is considered as a shower event candidate. This requirement results in a preselection of shower event candidates and efficiently focuses the time-consuming part of the shower reconstruction on neutrino-like events. Additionally, the shower reconstruction algorithm provides many different event-by-event quality parameters, which provide further rejection power for atmospheric muons and are used as features in the RDF.

Table 13 (rows recovered at this position; the rank of the first row is not legible).
[…]  Number of hit DOMs within < 10° around the reconstructed shower direction and vertex
7    Median of time residual distribution of hits selected under a shower hypothesis
8    Ratio between number of selected hits for a track hypothesis and shower hypothesis
9    Ordinate intercept of a linear fit to cumulative time residual distribution with respect to a shower hypothesis
10   Bjorken y as reconstructed by the shower reconstruction
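The point-like emission requirement for bright events can be sketched as below. The function name and the choice of the "inclusive" quantile estimator are assumptions; the thresholds (more than 15 L2 hits, 15 ns between the 20% and 80% quantiles) are taken from the text. Events that are not bright are passed through unchanged, since the requirement is stated only for bright events.

```python
import statistics

def is_shower_candidate(l2_time_residuals_ns, bright_above=15, max_spread_ns=15.0):
    """Apply the point-like emission check to the L2 hit time residuals.

    Events with at most `bright_above` causally connected L2 hits are not
    'bright' and are not subject to the requirement.
    """
    if len(l2_time_residuals_ns) <= bright_above:
        return True
    # n=5 yields the 20%, 40%, 60% and 80% cut points.
    cuts = statistics.quantiles(l2_time_residuals_ns, n=5, method="inclusive")
    return cuts[3] - cuts[0] < max_spread_ns
```

A through-going muon spreads its light along the track, so its time residuals with respect to a single vertex are broad and the check fails, while a compact shower gives a narrow distribution.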
As far as the track reconstruction is concerned, the Λ parameter described previously can provide a first rejection of atmospheric muons; however, acting on this parameter alone would also suppress a large part of the neutrino sample at lower energy if high purity is requested. Adding the reconstructed track starting point information allows an improved rejection of wrongly reconstructed atmospheric muon tracks. Figures 96 and 97 show the distribution of the reconstructed track starting point for atmospheric muons and low energy ($E_\nu$ < 20 GeV) atmospheric muon neutrinos. A variable $R_\nu$, the radius of a 'fiducial cylinder', has been defined and, in combination with the Λ quantities, has been tested in order to achieve a preliminary selection cut. The chosen value for $R_\nu$ is equal to the radius of the instrumented volume, i.e. 106 m. The number of wrongly reconstructed atmospheric muons can be reduced by more than three orders of magnitude when applying a preliminary selection cut on $R_\nu$ and the track quality parameter Λ.

4.5.3. Classification input.
As described above, a decision tree relies on cuts on observables, so-called features, that are chosen to discriminate well between the different classes and that are calculated for each event. In the following, the best performing features are ranked according to their discrimination power and very briefly explained.
The ranking is done using the overall classification rate under a majority decision of 50%. For the ranking, the trees of the RDF are trained with each individual feature and the overall RDF performance is evaluated. In the next step the algorithm adds one more feature to the best one and does the training once more. This is done for every possible configuration. The best configurations are chosen to do more iterations in the same way and this process iterates as long as the performance increases. In table 13 the ranking of the best features used in the classification is listed.

4.5.4. Classification performance and results.
In the following the performance of the classification algorithm is evaluated using all events passing the selection criteria shown above.
Definitions. It is desirable to maximise the number of correctly classified events for all channels. The following definition is used to evaluate the performance of the classification algorithm. The fraction of correctly classified events $R_A^{corr}$ of class A is defined as the ratio of correctly classified events with succeeded reconstruction $N_A^{corr,rec}$ with respect to the total number of events in this class $N_A^{all}$:

$$R_A^{corr} = \frac{N_A^{corr,rec}}{N_A^{all}}$$

Figure 98 shows the result of the RDF classification. The fraction of correctly classified events per interaction channel is plotted versus the MC neutrino energy.
Here the majority vote of the RDF was set to 50% as this was the best compromise between all classes. Each colour depicts the result for neutrinos and antineutrinos of one flavour in one interaction channel.
The left plot in figure 98 shows the fraction of events classified as track-like for the different flavours and interaction channels versus the MC neutrino energy. The shown results are obtained for all events used in the last classification step. Classification results for CC tau neutrino interactions are shown without distinction between track-like and shower-like decay topologies of the resulting tau lepton 70. The high energy range shows an expected increase in identification power for long-track muons from muon neutrinos undergoing a CC interaction. As can be seen, antineutrinos can be identified more easily than neutrinos. This is expected due to the different reaction inelasticities for neutrinos and antineutrinos. The fraction of interactions with a resulting shower signature wrongly identified as track-like falls below 20% above 10 GeV. Electron neutrinos undergoing a CC interaction are identified more easily as shower-like than NC reactions as they yield more light.
In the right plot the fraction of events recognised as showers is depicted. Most efficiently recognised are electron neutrino CC interactions. Above a neutrino energy of 15 GeV the fraction of correctly classified events reaches more than 90%. At 6 GeV the fraction reaches 85%. Charged current muon (anti-)neutrino events are falsely classified at a rate of 35% (15%) at 10 GeV.

70 Note that tau neutrino interactions have been excluded from the event set used for the RDF training.
The results for the configuration with a spacing of 9 m (figure 99) show a drop of around 5% in the identification power for track-like events. The general shape of the distribution remains. The fraction of misclassified shower events stays nearly the same. However, shower events need more energy now to result in a clear signature and successful classification as shower-like events. Therefore, the response curve is shifted by 5 GeV to higher energies. Figure 100 shows the particle identification performance for a detector configuration with 12 m vertical spacing. The identification power for the charged current muon neutrinos drops significantly. The fraction of misclassified shower events stays below 20%. Again a shift to higher energies of the response curve for shower-like events is observed.
The contamination of atmospheric muons in the neutrino sample, i.e. downward-going atmospheric muons which are reconstructed as up-going and classified either as neutrino-induced tracks or as showers, is of the order of a few percent. These wrongly identified muons have equal probability of ending up either in the 'showers' or 'tracks' sample. This surviving background is taken into account in the subsequent calculation of the ORCA sensitivity.

4.6. Sensitivity studies for the NMH

4.6.1. Global fit.
This section describes the main mass hierarchy sensitivity calculation based on pseudo-experiments (PEs) and log likelihood ratios. It is divided into three parts. First, the modelling of the physics and detector is detailed. This model is used to calculate the expected event rates for given values of the oscillation parameters and systematics. Then, the statistical method for the mass hierarchy sensitivity calculation is described. Finally, an overview is given of the current results using this method. An independent study based on Asimov sets is described at the end of this section.

4.6.2. Rate calculation.
KM3NeT/ORCA's data will consist of observed event rates as a function of the reconstructed neutrino energy and zenith angle. By comparing these to the expected rates it will be possible to distinguish between the two mass hierarchy cases. The rate computation is separated into two parts. First the expected neutrino interaction rate at the detector site is calculated as a function of the true neutrino energy and zenith angle. Secondly, the response of the detector itself is modelled, leading to the rates of reconstructed events as a function of the reconstructed energy and zenith angle.
As shown in sections 4.3 and 4.4, KM3NeT/ORCA is sensitive to the inelasticity (Bjorken y), potentially adding a third dimension to the rate histograms. At the moment, the inelasticity is not yet included in the sensitivity study. Doing so will likely improve the mass hierarchy significance, due to its power to discriminate between neutrinos and antineutrinos on a statistical basis.
The whole computation chain is summarised in figure 101. Each step is described in detail in the following paragraphs.
• $R_\alpha$ is the interaction rate per unit volume at the detector site of (anti)neutrinos of flavour $\alpha$ as a function of the neutrino energy and direction.
• The initial flavour $\beta$ is summed over $\nu_e$, $\nu_\mu$, $\bar{\nu}_e$ and $\bar{\nu}_\mu$.
• $\Phi_\beta^{\mathrm{atm}}$ is the atmospheric neutrino flux for neutrinos of flavour $\beta$.
• $P_{\beta\alpha}^{\mathrm{osc}}$ is the oscillation probability for a neutrino passing through Earth.
• $\sigma_\alpha$ is the CC neutrino-nucleon cross section for a neutrino of flavour $\alpha$.
A consistent binning is used throughout the calculation. The energy axis is binned linearly in $\log_{10}(E)$ from 2 to 100 GeV in 40 bins. The zenith angle axis is binned linearly in $\cos(\theta)$ from −1 to 0 in 40 bins, where we use the convention that $\cos(\theta) = -1$ corresponds to vertically up-going neutrinos.
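The binning described above can be written down in a few lines. The following is a minimal sketch, not the collaboration's code; the helper names `linear_edges` and `bin_index` are introduced here purely for illustration.

```python
import math

N_BINS = 40  # number of bins on each axis, as quoted in the text


def linear_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced bin edges between lo and hi."""
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins + 1)]


# Energy axis: linear in log10(E / GeV) between 2 GeV and 100 GeV.
log_e_edges = linear_edges(math.log10(2.0), math.log10(100.0), N_BINS)
energy_edges = [10.0 ** x for x in log_e_edges]

# Zenith axis: linear in cos(theta), from -1 (vertically up-going) to 0 (horizontal).
cos_theta_edges = linear_edges(-1.0, 0.0, N_BINS)


def bin_index(value, edges):
    """Index of the bin that contains value, or None if it lies outside the axis."""
    if not edges[0] <= value < edges[-1]:
        return None
    for i in range(len(edges) - 1):
        if edges[i] <= value < edges[i + 1]:
            return i
    return None
```

A 10 GeV neutrino, for example, is looked up with `bin_index(math.log10(10.0), log_e_edges)`, since the energy axis is uniform in the logarithm rather than in the energy itself.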
The atmospheric neutrino fluxes are modelled by the HKKM2014 simulations [176]. The given flux values are tabulated as a function of energy and zenith angle, averaged over the azimuth angle. In order to deal with the rather coarse binning the values are interpolated. To be more precise, a two-dimensional spline interpolation is made of the cumulative tables. The spline's derivatives then yield a bin-integral conserving interpolation of the flux tables. The chosen tables are for the Fréjus site (without mountain) at solar minimum, since the Fréjus site is expected to be most similar to the KM3NeT/ORCA detector site.
The oscillation probabilities depend on the mixing parameters (including the hierarchy) and the Earth density profile. They are calculated by evaluating the neutrino propagation time evolution operator in a constant density medium (see [177]) at small steps along the trajectory. The Earth's density profile is given by the preliminary reference Earth model (PREM) [147]. To speed up calculations the model is approximated by 42 constant-density shells. The electron density (an ingredient for the oscillation probability calculation) is approximated to be half of the nucleon density.
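The idea of stepping an evolution operator through constant-density shells can be illustrated with a deliberately simplified sketch. It assumes a two-flavour system (the actual calculation uses the full three-flavour framework of [177]), the standard conversion factors 1.267 and 1.52 × 10⁻⁴ for the units given in the comments, and hypothetical function names. The electron fraction of 0.5 mirrors the approximation quoted above.

```python
import math

PHASE_COEFF = 1.267     # converts dm2[eV^2] * L[km] / E[GeV] into radians
MATTER_COEFF = 1.52e-4  # A[eV^2] = MATTER_COEFF * Y_e * rho[g/cm^3] * E[GeV]


def layer_operator(dm2, theta, energy, length, density, y_e=0.5):
    """2x2 flavour-basis evolution operator for one constant-density layer.

    Two-flavour simplification for neutrinos (not antineutrinos).
    y_e = 0.5 encodes 'electron density = half the nucleon density'.
    """
    a = MATTER_COEFF * y_e * density * energy   # matter term in eV^2
    hx = dm2 * math.sin(2.0 * theta)            # off-diagonal part of the Hamiltonian
    hz = a - dm2 * math.cos(2.0 * theta)        # diagonal (traceless) part
    norm = math.hypot(hx, hz)
    if norm == 0.0:
        return [[1 + 0j, 0j], [0j, 1 + 0j]]
    phi = PHASE_COEFF * norm * length / energy
    c, s = math.cos(phi), math.sin(phi)
    diag = s * hz / norm
    off = complex(0.0, -s * hx / norm)
    return [[complex(c, -diag), off], [off, complex(c, diag)]]


def matmul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def evolve(dm2, theta, energy, layers):
    """Multiply the layer operators in the order the neutrino crosses them.

    layers is a list of (length_km, density_g_per_cm3) tuples.
    """
    total = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for length, density in layers:
        total = matmul(layer_operator(dm2, theta, energy, length, density), total)
    return total


def transition_probability(dm2, theta, energy, layers):
    """Probability to change flavour after crossing all layers."""
    return abs(evolve(dm2, theta, energy, layers)[1][0]) ** 2
```

With zero density the sketch reduces to the familiar vacuum formula, which gives a convenient sanity check before realistic shell lengths and densities are plugged in.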
We use the CC and NC neutrino-nucleon cross sections from the GENIE MC generator [166,167] for an oxygen nucleus and two protons.
The results of the first half of the simulation chain (i.e. at intermediate result (c)) are eight histograms of neutrino interaction rates per unit volume at the detector as a function of the true energy and zenith angle: six for the CC interactions (three flavours, neutrinos and antineutrinos) and two for NC interactions of neutrinos and antineutrinos. Throughout the simulation, NC events are approximated as equal for all three flavours.

4.6.2.2. Detector-dependent part. The second part of the simulation chain models the detector response to neutrino interactions. Each step is based on the results presented in the previous sections of this document.
The energy- and zenith-angle-dependent effective mass determines how many of the interacting events can be reconstructed. This is step (3) in figure 101. The effective mass is defined as

$$M_{\mathrm{eff}} := \rho_{\mathrm{water}} \, V_{\mathrm{gen.}} \, \frac{N_{\mathrm{sel.}}}{N_{\mathrm{gen.}}},$$

where $N_{\mathrm{gen.}}$ is the total number of generated events in a large generation volume $V_{\mathrm{gen.}}$. Events that are successfully reconstructed by either one of the two reconstruction algorithms are counted in $N_{\mathrm{sel.}}$. The density of sea water $\rho_{\mathrm{water}}$ is assumed to be 1025 kg m$^{-3}$. The effective mass is binned as a function of the true neutrino energy and zenith angle, and is evaluated for each of the eight event classes separately.

At this step an additional histogram is created, representing the expected background from misreconstructed atmospheric muons. As shown in section 4.5, the contamination of such events can be effectively reduced to a few percent by applying cuts. Due to the high suppression efficiency it is increasingly difficult to generate high-statistics samples for this type of background. Therefore, the distribution from a looser cut is used and rescaled to the total number of events found for a stricter cut. This is a conservative estimate, as for looser cuts the event distribution turns out to be mostly centred around our signal area (up-going, around 10 GeV), while the distribution becomes more uniform as stricter cuts are applied.
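The two operations of this step can be sketched as follows. The sketch assumes that the effective mass is the sea-water density times the generation volume times the selected fraction, as the definitions above imply; the function names are hypothetical and the numbers in the usage note are made up.

```python
RHO_WATER = 1025.0  # kg / m^3, the sea-water density quoted in the text


def effective_mass(n_selected, n_generated, v_generated, rho=RHO_WATER):
    """Effective mass in kg for one (energy, zenith) bin of one event class.

    v_generated is the generation volume in m^3.
    """
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return rho * v_generated * n_selected / n_generated


def rescale_to_total(loose_hist, strict_total):
    """Keep the shape of the loose-cut histogram, normalise it to the strict-cut total."""
    loose_total = sum(sum(row) for row in loose_hist)
    if loose_total <= 0:
        raise ValueError("loose-cut histogram is empty")
    scale = strict_total / loose_total
    return [[value * scale for value in row] for row in loose_hist]
```

For instance, 250 selected out of 1000 generated events in a 4 × 10⁶ m³ generation volume would correspond to `effective_mass(250, 1000, 4.0e6)`, i.e. one quarter of the water mass in that volume.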
At the end of this step there are a total of nine histograms. The next step ((4) in figure 101) is particle identification. Each input histogram is the basis for two new ones, representing events identified as 'tracks' and 'showers', respectively. The identification probabilities are based on the RDF study described in section 4.5, and depend on the true neutrino energy only. Neutrino events identified as atmospheric muons are discarded. The probability for atmospheric muon background events to be identified as a track/shower has not been determined due to lack of statistics; a simple 50/50 separation is applied. After this step we have eighteen histograms.
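The splitting of each histogram into a track and a shower histogram might look like the sketch below. It assumes histograms stored as nested lists indexed by energy bin and zenith bin; the function names and the probability values in any usage are illustrative only, not the RDF results.

```python
def split_by_pid(hist, p_track, p_shower):
    """Split hist[i_energy][i_zenith] into 'track' and 'shower' histograms.

    p_track[i_energy] and p_shower[i_energy] depend on the true energy bin only.
    The remainder (1 - p_track - p_shower) stands for events identified as
    atmospheric muons, which are discarded.
    """
    tracks, showers = [], []
    for row, pt, ps in zip(hist, p_track, p_shower):
        if pt < 0.0 or ps < 0.0 or pt + ps > 1.0 + 1e-12:
            raise ValueError("identification probabilities must be in [0, 1]")
        tracks.append([value * pt for value in row])
        showers.append([value * ps for value in row])
    return tracks, showers


def split_muon_background(hist):
    """Apply the simple 50/50 separation used for the atmospheric muon background."""
    half = [0.5] * len(hist)
    return split_by_pid(hist, half, half)
```

Applied to the nine input histograms, this yields the eighteen histograms mentioned above.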
In the final step ((5) in figure 101) the energy and angular resolutions are applied. They are implemented as response matrices filled from simulated data. First the zenith angle is 'smeared out' using a three-dimensional response matrix that provides binned $\cos(\theta_{\mathrm{reco}})$ distributions as a function of $\cos(\theta_{\mathrm{true}})$ and $E_{\mathrm{true}}$. Then a two-dimensional energy response matrix providing $E_{\mathrm{reco}}$ distributions as a function of $E_{\mathrm{true}}$ is used to smear the energy.
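The two smearing operations amount to contracting the rate histogram with the response matrices. The sketch below is one possible realisation under stated assumptions: histograms are nested lists indexed as `hist[i_energy][i_zenith]`, each response row is normalised to at most one, and the response matrices share the binning of the histogram (the text notes that a coarser binning is used in practice). All names are hypothetical.

```python
def smear_zenith(hist, zenith_response):
    """hist[i_e][i_z] -> out[i_e][j_z] using zenith_response[i_e][i_z][j_z].

    zenith_response[i_e][i_z] is the distribution of the reconstructed zenith
    bin for true energy bin i_e and true zenith bin i_z.
    """
    out = []
    for row, response_rows in zip(hist, zenith_response):
        n_reco = len(response_rows[0])
        smeared = [0.0] * n_reco
        for value, probabilities in zip(row, response_rows):
            for j, p in enumerate(probabilities):
                smeared[j] += value * p
        out.append(smeared)
    return out


def smear_energy(hist, energy_response):
    """hist[i_e][i_z] -> out[j_e][i_z] using energy_response[i_e][j_e]."""
    n_reco = len(energy_response[0])
    n_zenith = len(hist[0])
    out = [[0.0] * n_zenith for _ in range(n_reco)]
    for row, probabilities in zip(hist, energy_response):
        for j, p in enumerate(probabilities):
            for k, value in enumerate(row):
                out[j][k] += value * p
    return out
```

If every response row sums to one, the total number of events is conserved by both steps; rows summing to less than one would model events lost in reconstruction.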
The resolutions are evaluated separately for each of the sixteen neutrino event classes. We have, for example:
• $\nu_\mu$ CC interactions identified as tracks;
• $\bar{\nu}_e$ CC events identified as showers;
• NC ν events identified as showers;
• $\nu_e$ CC events misidentified as tracks;
• …
Each is smeared using dedicated response matrices. In particular, neutrinos and antineutrinos are smeared differently. So are correctly and wrongly identified events.
The response matrices use a coarser binning than the rate histograms. Depending on the available MC statistics, the number of bins is reduced from 40 to 20 or 10 to avoid artefacts.
After reconstruction all histograms are combined in two final event histograms representing the track channel and the shower channel, respectively.

4.6.3. Sensitivity calculation. The sensitivity to the mass hierarchy is calculated using likelihood ratio distributions from PEs. The procedure works as follows:
(1) Pick a set of true values for the oscillation parameters and other systematics.
(2) Calculate the expected number of events for a given period of data taking, using the simulation chain described above.
(3) Generate pseudo-data by randomly drawing a detected number of events for each bin based on Poisson statistics. The two histograms thus obtained constitute the PE.
(4) Find the best-fit likelihoods $\mathcal{L}_{\mathrm{NH}}$ and $\mathcal{L}_{\mathrm{IH}}$ for the NH and IH assumption, by maximising the likelihood with respect to the other free parameters in both cases.

The likelihood is the product over all bins $i$ of the Poisson probabilities for the observed event numbers $n_i$,

$$\mathcal{L} = \prod_i P(n_i \mid \mu_i), \qquad P(n \mid \lambda) = \frac{\lambda^{n} e^{-\lambda}}{n!},$$

where $P(n \mid \lambda)$ is the Poisson probability to observe $n$ events when the expectation value is $\lambda$. The expected event numbers $\mu_i$ depend on the parameter values (oscillation parameters and systematics), so that maximising the likelihood corresponds to finding the parameter values that best fit the PE.

The default parameter settings are summarised in table 14. It shows the true parameter values used to generate PEs. Most of these are fixed at some nominal value. The initial values are those used as starting values by the minimiser in the fitting procedure that finds the likelihood maximum. These values are chosen randomly for each PE to avoid systematic biases. Most parameters are fitted, meaning they are left free in the minimiser. The likelihood is multiplied by Gaussian priors for some parameters (see table 14). The mean and width of the Gaussian priors correspond to those of the matching initial value distributions. Two parameters ($\theta_{12}$ and $\delta m^2$) are treated as nuisance parameters. This means that, rather than leaving them free in the fit, a random 'best fit' value from the initial value distribution is assigned to each PE. This emulates the fact that these parameters will be constrained almost exclusively by external measurements.
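Steps (3) and (4) can be illustrated with a stripped-down sketch in which the expectations under both hierarchies are held fixed, so the maximisation over free parameters and the Gaussian priors are left out. The function names, the sign convention of the ratio (NH minus IH) and the use of Knuth's sampling method are choices made for this example, not taken from the analysis.

```python
import math
import random


def draw_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method.

    Adequate for modest means only: exp(-mean) underflows for very large means.
    """
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def generate_pe(expected, rng):
    """Pseudo-experiment: one Poisson draw per bin of the flattened histograms."""
    return [draw_poisson(mu, rng) for mu in expected]


def poisson_log_likelihood(observed, expected):
    """Sum over bins of ln P(n | mu); every mu must be strictly positive."""
    total = 0.0
    for n, mu in zip(observed, expected):
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total


def log_likelihood_ratio(observed, expected_nh, expected_ih):
    """ln L_NH - ln L_IH for fixed expectations (no fit over free parameters)."""
    return (poisson_log_likelihood(observed, expected_nh)
            - poisson_log_likelihood(observed, expected_ih))
```

Repeating `generate_pe` many times with a seeded `random.Random` instance and histogramming the resulting ratios gives a toy version of the distributions from which the significance is derived.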
The first six parameters listed in table 14 are the oscillation parameters.
The last five entries in the table are systematics. The overall flux factor and NC scaling simply scale the total number of (NC) events by an energy- and zenith-independent factor. The skew parameters introduce an additional asymmetry in the ratio of one event type to the other, while conserving the total number of events. They relate to the ratio of neutrinos to antineutrinos and the ratio of μ-flavour events to e-flavour events. Finally, the energy slope α introduces an energy-dependent scaling of the number of events of the form $E^{\alpha}$.

Because $\theta_{23}$ generally has two likelihood maxima, special steps are taken to avoid ending up in the wrong one. The likelihood maximisation is repeated several times, starting from a different $\theta_{23}$ value each time. Only the best-fit result is considered for either hierarchy.
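Two of the systematics listed above, the overall flux factor and the energy slope, are simple enough to sketch directly. The function name and the optional `pivot` energy are assumptions made for this example; with the default pivot of 1 GeV the weight is exactly $E^{\alpha}$ times the flux factor. The skew parameters are omitted because their exact parametrisation is not spelled out in the text.

```python
def apply_flux_and_slope(hist, energy_centres, flux_factor=1.0, alpha=0.0, pivot=1.0):
    """Scale hist[i_energy][i_zenith] by flux_factor * (E / pivot) ** alpha.

    energy_centres[i_energy] is the representative energy of each bin in GeV.
    pivot is a hypothetical reference energy; pivot = 1 gives a pure E ** alpha.
    """
    scaled = []
    for row, energy in zip(hist, energy_centres):
        weight = flux_factor * (energy / pivot) ** alpha
        scaled.append([value * weight for value in row])
    return scaled
```

Setting `alpha=0.0` reduces the operation to the energy- and zenith-independent flux scaling.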
The distributions for each parameter are uncorrelated and based on the current world uncertainties [93,95,178]. The final figure of merit is the median significance, computed by comparing log-likelihood ratio (LLR) distributions for true NH and true IH PEs. An example is shown in figure 102. The further these two distributions are apart, the higher the significance. We quote the significance with

Figure 104. The mass hierarchy sensitivity for the true normal (left) and inverted (right) hierarchy. The horizontal axis indicates the true value of $\theta_{23}$. The vertical axis indicates the 'alternative' value of $\theta_{23}$: the value belonging to the hypothesis that is being rejected. The diagonal dashed lines indicate the position where the alternative $\theta_{23}$ is the same as the true one. Along these lines the mass hierarchy sensitivity according to the original method can be read off. The solid red and blue lines show the most likely alternative value for each true $\theta_{23}$. They are the most likely value when fitting $\theta_{23}$ under the wrong hierarchy assumption. Along these lines the mass hierarchy sensitivity according to the new method can be read off.

Figure 105. Comparison of the mass hierarchy sensitivity calculated using the old method (dashed lines) and the new method (solid lines). In the former, the significance is calculated to reject the other hierarchy at the same $\theta_{23}$, whereas in the latter the alternative hypothesis has a different $\theta_{23}$. The differences are rather small, but there is a noticeable decrease in the second octant IH mass hierarchy sensitivity. This is for the 9 m spacing and three years of operation time, using the default settings.

Initially, the mass hierarchy sensitivity was calculated by comparing LLR distributions generated with identical true oscillation parameter values (other than the hierarchy). However, this approach does not take into account the strong correlation between the measurement of $\theta_{23}$ and the hierarchy.
From simulations it follows that the best-fit value of $\theta_{23}$ depends strongly on the assumed hierarchy. In many cases, the best-fit values for the two hierarchy assumptions are not in the same octant. In the actual measurement we therefore have to distinguish between two cases: NH with some best-fit value $\theta_{23}^{\mathrm{NH}}$ and IH with a different best-fit value $\theta_{23}^{\mathrm{IH}}$. Since the two values can be very far apart, the mean and width of the corresponding LLR distributions can be noticeably different, leading to a different mass hierarchy sensitivity. Note that this effect does not occur for the other parameters, which typically have very similar best-fit values for the two hierarchy assumptions.
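How the mean and width of the two LLR distributions enter the sensitivity can be illustrated with a small sketch. It assumes that the LLR distribution of the hypothesis being rejected is well described by a Gaussian, so that the median significance is the distance between the median of the true-hypothesis distribution and the mean of the alternative one, in units of the alternative width. The function names are hypothetical and this is not necessarily the exact estimator used in the study.

```python
import statistics


def median_significance(llr_true, llr_alt):
    """Median significance (in sigma) to reject the alternative hypothesis.

    llr_true: LLR values of PEs generated under the true hypothesis.
    llr_alt:  LLR values of PEs generated under the alternative hypothesis,
              approximated here by a Gaussian with their mean and sample width.
    """
    median_true = statistics.median(llr_true)
    mean_alt = statistics.fmean(llr_alt)
    sigma_alt = statistics.stdev(llr_alt)
    return abs(median_true - mean_alt) / sigma_alt


def one_sided_p_value(n_sigma):
    """One-sided Gaussian tail probability corresponding to n_sigma."""
    return 1.0 - statistics.NormalDist().cdf(n_sigma)
```

Because the result depends on both the separation and the width, two hierarchy hypotheses with very different best-fit $\theta_{23}$ values can yield a different significance than the same-$\theta_{23}$ comparison.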
To take this effect into account the following procedure was adopted. For each true hypothesis (true hierarchy TH with $\theta_{23,\mathrm{true}}$) the most likely alternative hypothesis (other hierarchy OH with $\theta_{23,\mathrm{alt}}$) is determined from the $\theta_{23}$ best-fit distribution of PEs generated with the true hypothesis and fitted assuming the OH. The median significance to reject the alternative hypothesis is then calculated.

A technical issue arises because the LLR distributions were only simulated for certain given values of $\theta_{23}$, while the $\theta_{23}$ of the alternative hypothesis can take any value. To overcome this we note that the fitted widths and means of the LLR distributions vary rather smoothly as a function of $\theta_{23}$, so that they can be reasonably approximated by interpolating between the already calculated values. This is shown in figure 103. This method enables us to calculate the mass hierarchy sensitivity for any value of the true and alternative $\theta_{23}$. Figure 104 illustrates the values of the alternative $\theta_{23}$ and the effect on the mass hierarchy sensitivity. All the LLR-method mass hierarchy sensitivity results in this document are produced using this method, unless explicitly stated otherwise. Figure 105 shows the effect of the new method on the mass hierarchy sensitivity.

4.6.5. Results. Figure 106 shows the latest mass hierarchy significance plot. The expected significance depends strongly on the true values of $\theta_{23}$ and $\delta_{\mathrm{CP}}$. Without CP violation, the NMH can be measured with more than 3σ in three years at the current world best-fit values of $\theta_{23}$.

4.6.6. Spacing studies. Whereas the LLR method (described above) provides the most accurate description of the planned experiment, its application to certain problems might sometimes be prohibitive due to the large number of PEs to be generated. Therefore a simplified approach is used to answer dedicated questions. The starting point is again the set of two histograms (for tracks and showers) in the reconstructed quantities $(E_{\mathrm{reco}}, \theta_{\mathrm{reco}})$.
In each bin $i$, the expected number of events $\mu_i^{\mathrm{TH}}$ for a given true hierarchy (TH) hypothesis is calculated. A $\chi^2$ minimisation is performed assuming the wrong hierarchy (WH), marginalising over the parameters given in table 14. Contrary to the description in table 14, ( ) It has been verified that the results obtained with this method are generally rather close to those from the full LLR treatment.

The simplified method is used to optimise the vertical distance of the DOMs on the DUs. Whereas the horizontal spacing between DUs is determined by deployment constraints (a distance of 20 m between DUs is considered a minimum), the vertical distance is a free parameter with few constraints from a technical point of view. Simulations have been performed with DOM distances of 6, 9 and 12 m. The detector performance for these different setups has been shown before. Figure 107 shows the expected NMH sensitivity after three years of data taking for both hierarchy hypotheses as a function of the true mixing angle $\theta_{23}$. An optimal distance is found close to 9 m, as for both 6 and 12 m the NMH sensitivity degrades, at least in some regions of the parameter space.
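The simplified method described above can be sketched in a few lines. The exact form of the $\chi^2$ is not reproduced in the text, so the sketch assumes a Pearson-type expression with the true-hierarchy expectation acting as Asimov 'data', and it replaces the minimisation over nuisance parameters by a scan over a list of candidate wrong-hierarchy expectations. Both choices, and the function names, are assumptions made for illustration.

```python
import math


def asimov_chi2(mu_true, mu_test):
    """Pearson-type chi-square between Asimov 'data' (mu_true) and a test expectation."""
    return sum((t - m) ** 2 / t for t, m in zip(mu_true, mu_test) if t > 0)


def asimov_sensitivity(mu_true, wrong_hierarchy_candidates):
    """Square root of the smallest chi-square over the candidate expectations.

    Each candidate stands for the wrong-hierarchy expectation evaluated at one
    point in the space of the marginalised parameters.
    """
    best = min(asimov_chi2(mu_true, candidate)
               for candidate in wrong_hierarchy_candidates)
    return math.sqrt(best)
```

Since no pseudo-experiments have to be generated, an estimate of this kind can be repeated quickly for each detector geometry, which is what makes such an approach practical for spacing studies.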
4.6.7. Measurement of $\Delta M^2$ and $\theta_{23}$. The measurement contours for the oscillation parameters are also derived with the simplified procedure already used for the spacing study. The same set of nuisance parameters is applied. Optionally, an energy scale shift is added as an additional systematic uncertainty. It is implemented as a free scaling of the neutrino energy in all detector-related distributions such as effective mass, particle identification, and angular and energy resolution. All nuisance parameters are fitted unconstrained, i.e. without priors. Both $\Delta M^2$ and $\theta_{23}$ are determined under the assumption that the correct NMH has already been identified. The $1\sigma$ measurement contours obtained after three years of data taking for three test points are shown in figure 108. They are compared to current world best measurements [104,179] as well as to extrapolations of final results from NOvA and T2K [104,180], to be expected around 2020. For T2K, the extrapolation is performed by exploiting the published likelihood shape of the present measurement [104], assuming the planned complete beam exposure of $7.8 \times 10^{21}$ protons on target. A precision of 3% in $\Delta M^2$ is reached after three years, which improves to 2% when the energy scale uncertainty is suppressed. The precision in $\theta_{23}$ varies between 4% and 10%, depending on its true value and the NMH.

Figure 108. Measurement precision in $\Delta M^2$ and $\sin^2\theta_{23}$ after three years of data taking with ORCA with (solid red) and without (dashed red) energy scale uncertainty for three test points, compared to present results from MINOS (black) [179] and T2K (blue) [104] and the predicted performance of NOvA (magenta) [180] and T2K (blue, dashed) [104] in 2020. All contours are at $1\sigma$; left for NH, right for IH.
4.6.8. Systematic uncertainties. A substantial list of possible uncertainties is already taken into account while fitting the NMH by marginalising over the related nuisance parameters, as indicated in table 14. Some of these parameters, such as $\theta_{23}$ and $\Delta M^2$, can be determined together with the NMH with high accuracy, as shown above.
It is crucial to determine reliable priors for the chosen nuisance parameters. The currently used priors are also listed in table 14. However, it has been verified that loosening the prior conditions or even totally suppressing them has only a small impact on the final NMH sensitivity. Therefore, in future studies some of them might be treated as unconstrained fit parameters, i.e. without priors.
Contributions to the uncertainties come from the neutrino flux [176], cross section [159] and from the detector performance. For the latter, a main contribution is expected from the uncertainty in the photon detection efficiency of the PMTs and the related readout electronics. However, as demonstrated in ANTARES and also with the KM3NeT prototype module [5], the measurement of $^{40}$K coincidences between adjacent PMTs of the same DOM allows the photon detection efficiency to be monitored in real time with high precision. The variable nature of optical noise due to bioluminescence is controlled by sampling it for each individual PMT with a frequency of 10 Hz. The results of these measurements are directly injected into the simulation, as is done in ANTARES. This excludes bioluminescence as a source of systematic uncertainty of any measurement. Apart from the optical noise due to bioluminescence, sea water is a very stable and homogeneous medium, as monitored over many years by ANTARES. Current knowledge of its light propagation properties is discussed in section 3.4. The residual uncertainties of quantities such as absorption and scattering length have less effect (due to the closer spacing) for ORCA than for ARCA, and are well covered by the nuisance parameters discussed above, so no separate investigation has been performed.
Additional systematic uncertainties, not yet included in the present study, comprise systematic shifts in the reconstructed energy and zenith angle. These will be considered in the near future. However, it is believed that the energy scale is well constrained through the knowledge of the absolute PMT efficiency and the water parameters. The angular resolution of neutrino telescopes in sea water is excellent, and it remains better than 10° down to the energies relevant for the NMH determination. Systematic angular offsets are at most in the sub-degree region, as shown by the study of the moon shadow in the CR signal in ANTARES. A deterioration of the angular resolution due to the movement of the detector elements in the sea current is excluded by permanently re-calibrating them via an acoustic positioning system. Such a system provides a precision of better than about 10 cm for all detector elements, which makes its influence on the angular resolution of reconstructed neutrino events negligible.
The energy and angular resolutions are a crucial input to the sensitivity calculation. Both are estimated from simulations which are subject to uncertainties on their own. These can be parametrised by applying a scaling (i.e. broadening or narrowing) to these resolution functions, as is planned for the near future.
Finally, an independent study has been performed to assess the effect of variations of the Earth model [147] on the NMH sensitivity. Both the thickness and density of each individual layer have been varied within the tolerance of the model, as well as the sharpness of the layer boundaries. The impact of these variations is found to be negligible for the present study and is therefore not treated as a relevant systematic effect.

Outlook
The previous sections provide details on the performance of the KM3NeT/ORCA detector in establishing the neutrino mass ordering and improving the precision on the oscillation parameters in the atmospheric sector. While the results obtained so far rely on full MC studies and incorporate the leading systematic effects, possible refinements have already been identified and will be scrutinised in the near future. These notably include using the achieved sensitivity to the interaction inelasticity as a statistical tool to discriminate between neutrinos and antineutrinos and to further reject the background from NC interactions. As a detailed and ongoing investigation shows (see [160] for first results), event-by-event fluctuations intrinsic to the development of the hadronic system resulting from a neutrino interaction on a nucleon or nucleus limit the achievable resolutions on direction, energy and inelasticity for a given detector geometry. However, some improvements, in particular for the reconstruction of the reaction inelasticity, can still be expected to arise from the development of new reconstruction strategies. Envisaged lines of work comprise e.g. an attempt to identify the leading particle in the hadronic system to constrain its overall momentum, a combined fit of the hadronic shower and the charged lepton in CC interactions, improved energy estimation techniques, etc.
In addition to the oscillation measurements discussed in the previous section, the size and energy range covered by the KM3NeT/ORCA detector allow for the search of CC interactions of tau neutrinos produced in the oscillation of atmospheric electron and muon neutrinos. While these events can hardly be distinguished on an event-by-event basis, their presence could be revealed by a statistical excess of cascade-like events over the baseline from atmospheric NC interactions and electron neutrino CC interactions. This effect is expected to be seen with high confidence level and statistical power within the first years of operation, but a precise study remains to be carried out. The energy and flavour distributions of observed events in the ORCA detector could in principle also reveal sizeable discrepancies from expectations due to non-standard physics interactions (NSIs) [181,182]. While strong deviations from expectations (e.g. enhanced CP violation effects) might deteriorate the sensitivity to the NMH, a more likely scenario is that KM3NeT/ORCA will be able to invalidate many NSI processes.
The studies presented here indicate that the currently unknown value of the Dirac CP-violating phase in the neutrino sector mildly impacts the sensitivity to the neutrino mass ordering. However, the knowledge of the mass ordering could in turn bring sensitivity to the CP phase in the (0.2-1) GeV regime [141]. This would imply a denser instrumentation than what is currently envisaged for ORCA, but considering the importance of measuring the CP phase, sensitivity studies could be performed for a further step of the ORCA project. In the same spirit, sensitivity studies for both the NMH and the CP phase have been proposed relying on a putative upgraded neutrino beam to be sent to ORCA from Protvino [183,184]. Such a strategy would in particular allow for a confirmation of ORCA-only results on the NMH with high statistical power on a short (< 1 yr) timescale [185]. It would require a new beam-line to be set up but would offer the advantage of relying on an already built detector.
With its low energy threshold, the KM3NeT/ORCA detector offers the possibility to extend the searches started with ANTARES (e.g. [77,79]), and likely to be pursued with KM3NeT/ARCA as well, for extra-terrestrial neutrinos as a signature of the presence of dark matter in the centre of the Earth, the Sun and the central region of the Galaxy, for which the detector is particularly well located. The low energy threshold of KM3NeT/ORCA is particularly well suited to constraining low-mass weakly interacting massive particle dark matter models. All neutrino flavours could be used for such studies, considering the encouraging first performance results in the shower reconstruction channel.
GeV neutrinos are also likely to be emitted by several classes of astrophysical objects like low-energy GRBs [186] or colliding wind binaries [187]. Another promising topic is the ability of KM3NeT/ORCA to detect neutrinos from supernova (SN) explosions. The use of segmented optical modules placed close to one another indeed offers new detection capabilities: requiring coincidences of many phototubes on individual storeys is expected to strongly reduce the optical background, potentially providing high sensitivity to SN up to a few tens of kpc. These results will possibly be incorporated in an update of the present document, together with the prospects for several other physics studies that can be undertaken with KM3NeT/ORCA. These span a wide range of scientific fields, including the Earth and Sea Sciences, which are not addressed here but are part of the scientific scope of deep-sea neutrino observatories. As an example, a detailed study of the neutrino energy and angular distributions could provide tomographic information on the electron density [188][189][190], and thus on the composition, of the different Earth layers traversed. Such an approach is complementary to the standard methods used in geophysics, which do not unambiguously constrain the chemical composition of the Earth, in particular of its innermost layers (mantle and core).

Organisation
KM3NeT federates and unifies the various smaller European efforts in the field of Neutrino Astronomy. The process of convergence was supported by an EU funded Design Study (2008-2009) and Preparatory Phase (2008-2012). The KM3NeT consortium has now formed a collaboration with an elected management. The funding agencies (or funding authorities) involved have installed the Resources Review Board (RRB), which oversees the project. The RRB is advised by an international Scientific and Technical Advisory Committee (STAC). A project organisation has been set up with the objective of implementing the first phase (Phase-1) of the KM3NeT Research Infrastructure. To this end, a MoU, covering the total available budget of about €31M, has been signed by the members of the RRB. The purpose of this MoU is to define the programme of work to be carried out for this phase and the distribution of charges and responsibilities among the Parties and Institutes for the execution of this work. The MoU sets out (i) the organisational, managerial and financial guidelines to be followed by the collaboration, (ii) the external scientific and technical review processes and (iii) the user access policy. At present, the collaboration consists of more than 240 persons from 52 institutes. The first phase has already started and comprises the final prototyping and preproduction, engineering, construction, calibration, transportation, assembly, installation and commissioning of the elements which form the basis of the KM3NeT neutrino detector and the seafloor and shore station infrastructures, as well as the operation of the installed neutrino detectors. The installation is proceeding in two places, off-shore Toulon, France and off-shore Capo Passero, Italy. A third suitable site is available off-shore Pylos, Greece. The construction of the Phase-1 detector has already started with the successful deployment of the first string off-shore Capo Passero and will be completed by 2017.
The Collaboration will offer open access for external users to the KM3NeT Research Infrastructure (Article 15 of the MoU). The KM3NeT Research Infrastructure will also provide user ports for continuous Earth and Sea science measurements in the deep-sea environment. The needs for the Earth and Sea sciences are partly incorporated in the present KM3NeT MoU and other needs will be detailed in designated MoUs between KM3NeT and individual Earth and Sea science groups or more generally with EMSO.
The Phase 1 MoU is a first step towards the intended establishment of a European Research Infrastructure Consortium (ERIC). The collaboration has agreed to host the KM3NeT ERIC in The Netherlands. The neutrino signal recently reported by IceCube has led the KM3NeT Collaboration to propose an intermediate phase (i.e. Phase-2.0). The required actions for the next phase(s) are being taken which include the preparation of requests for additional ERDF funds in France, Italy and Greece as well as requests for national funds. Other support options, e.g. within the framework of Horizon 2020, will also be explored.
Recently, the KM3NeT and ANTARES Collaborations have agreed to organise each general assembly jointly (typically 3-4 times per year). This agreement fosters scientific progress and the exchange of know-how and limits travel times and expenses. Following a sequence of joint meetings between ANTARES (Mediterranean Sea), IceCube (South Pole), Lake Baikal (Russia) and KM3NeT (Mediterranean Sea), a MoU for a Global Neutrino Network (GNN) was signed on 15 October 2013 by the representatives of each project. This step formalises the active collaboration between these projects. Once infrastructures of similar scale are operational on the three continents, the stated aim of the GNN is a worldwide Global Neutrino Observatory.

Data policy
The KM3NeT Collaboration has developed a data policy based on the research, educational and outreach goals of the facility. The first exploitation of the data is granted to the collaboration members, as they build, maintain and operate the facility, and to priority users. Accordingly, each collaboration member has full access rights to all data, software and know-how. Access for non-members is restricted as long as methods and results have not yet been published. The prompt dissemination of scientific results, new methods and implementations is a central goal of the project, as is education. High-level data (event information enriched with quality information) will be published after an embargo time of two years under an open access policy on a web-based service. Exceptional access rights that correspond to these goals can be granted.
The Collaboration has developed measures to ensure the reproducibility and usability of all scientific results over the full lifetime of the project and for an additional 10 years after shutdown. Low-level data (as recorded by the experiment) and high-level data will be stored in parallel at central places. A central software repository, central software builds and operating system images are provided and continuously maintained until the end of the experiment.
The storage and computing needs of the KM3NeT project are highly advanced. The Collaboration has developed a data management plan and a corresponding computing model to answer those needs. The latter is based on the LHC computing models utilising a hierarchical data processing system with different layers (tiers). Data are stored on two main storage centres (CCIN2P3-Lyon, CNRS and CNAF, INFN); those large data centres are fully interfaced with the major European e-Infrastructures, including GRID-facilities (ReCaS, HellasGRID provide resources to KM3NeT). The main node for processing of the neutrino telescope data is the computer centre in Lyon (CCIN2P3-Lyon). A corresponding long-term and sustainable commitment has already been made by CNRS, which is consistent with the needs for long-term preservation of the data. A specialised service group within the Collaboration will process the data from low-level to high-level and will provide data-related services (including documentation and support on data handling) to the Collaboration and partners. WAN (GRID) access tools (e.g. xrootd, iRODS, and gridFTP) provide the access to high-level data for the Collaboration. The analysis of these data will be pursued at the local e-Infrastructures of the involved institutes (both local and national). The chosen data formats allow for the use of common data analysis tools (e.g. the ROOT data analysis framework) and for integration into e-Infrastructure common services.
The central services are mainly funded through CNRS and INFN, which have pledged resources of their main computing centres to the project. Additional storage space and its management are provided by the partner institutes (e.g. INFN has provided 500 TB of disk space for KM3NeT at the ReCaS GRID infrastructure, and the Hellenic Open University has pledged 100 TB of disk space and 300 cores to the project).
In addition to the major storage, networking and computing resources provided by the partner institutions and their computing centres, grid resources have been pledged and will be used by KM3NeT (ReCaS, HellasGRID). These will provide significant resources to be used for specialised tasks (e.g. for special simulation needs). The major resources, however, will be provided by the partners. External services are employed to integrate the KM3NeT e-Infrastructure into the European context of the GRID in the fields of data management, security and access; services will be implemented in collaboration with EGI.
One of the aims of the KM3NeT data management plan is to play an active role in the development and utilisation of e-Infrastructure commons. KM3NeT will therefore contribute to the development of standards and services in the e-Infrastructures both in the specific research field and in general. In the framework of the GNN, KM3NeT will cooperate with the ANTARES, IceCube and GVD collaborations to contribute to the open science concept by providing access to high-level data and data analysis tools, not only in common data analyses but also for use by citizen scientists.
In the framework of the ASTERICS project, KM3NeT will develop an interface to the Virtual Observatory including training tools and training programmes to enhance the scientific impact of the neutrino telescope and encourage the use of its data by a wide scientific community including interested citizen scientists. Data derived from the operation of the experiment (acoustics, environmental monitoring) will be of interest also outside of the field. Designated documentation and courses for external users will therefore be put in place to facilitate the use of the repositories and tools developed and used by the KM3NeT Collaboration.

Cost and time schedule
The investment budget for the construction of the first phase (Phase-1) of the KM3NeT research infrastructure, which is fully funded, amounts to about €31 M. During 2015-2017, 31 strings equipped with 558 optical modules will be assembled and deployed at the French and Italian sites. The overall size of the initial Phase-1 arrays corresponds to about 0.2 building blocks.
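The Phase-1 module count follows directly from the building-block geometry given in the abstract (115 strings per block, 18 optical modules per string, 31 photo-multiplier tubes per module). The short sketch below reproduces that arithmetic; the function and constant names are illustrative only and are not part of any KM3NeT software.

```python
# Detector geometry as quoted in the abstract of this document.
STRINGS_PER_BLOCK = 115   # strings in one building block
OMS_PER_STRING = 18       # optical modules per string
PMTS_PER_OM = 31          # photo-multiplier tubes per optical module


def optical_modules(n_strings: int) -> int:
    """Number of optical modules on n_strings strings."""
    return n_strings * OMS_PER_STRING


def photomultipliers(n_strings: int) -> int:
    """Number of photo-multiplier tubes on n_strings strings."""
    return optical_modules(n_strings) * PMTS_PER_OM


if __name__ == "__main__":
    phase1_strings = 31                      # Phase-1, as stated in the text
    full_strings = 3 * STRINGS_PER_BLOCK     # KM3NeT 2.0: three building blocks

    print("Phase-1 optical modules:", optical_modules(phase1_strings))
    print("Phase-1 photo-multipliers:", photomultipliers(phase1_strings))
    print("KM3NeT 2.0 optical modules:", optical_modules(full_strings))
    print("KM3NeT 2.0 photo-multipliers:", photomultipliers(full_strings))
```

With 31 strings this gives the 558 optical modules quoted for Phase-1.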
The next phase (i.e. KM3NeT 2.0) comprises a complete ARCA and ORCA detector, consisting of 2 and 1 building blocks, respectively. The additional budget for KM3NeT 2.0 is estimated at €95 M. The cost estimates of KM3NeT 2.0 are based on the actual prices of Phase-1 and thus can be considered accurate.
The breakdown of the cost amongst the major items is illustrated in figure 109. The costs are consistent with the estimations stated in the KM3NeT Technical Design Report published in 2011 and represent a factor of four cost reduction compared to that previously achieved for the ANTARES detector. The cost of a single KM3NeT string is about €230 k; an additional €90 k is needed for the interlink cable, the string deployment and the ROV connection.
Once the funds for Phase-2.0 are available, the array could be constructed within three years. Thus, if funds were forthcoming in 2017, the full array could be completed in 2020. Note that physics studies would already be possible as the array is being constructed, thus reducing the overall time needed to obtain a specified precision. The costs for operation and decommissioning of the infrastructure have been evaluated and amount to about €2 M per year and €5 M, respectively. Hence, the total cost for 10 years of operation and decommissioning of the KM3NeT 2.0 infrastructure adds about 25% to the total budget.
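The operation and decommissioning figure can be checked with a few lines of arithmetic. The sketch below uses only the amounts quoted in this section (in millions of euros). The text does not say which budget the "about 25%" is measured against, so the choice of reference budget here is an assumption: the sketch evaluates the ratio both against the additional KM3NeT 2.0 budget alone and against the combined Phase-1 plus KM3NeT 2.0 investment. All names are illustrative.

```python
# Amounts quoted in this section, in millions of euros.
PHASE1_INVESTMENT = 31.0       # Phase-1 construction budget
KM3NET2_ADDITIONAL = 95.0      # additional budget for KM3NeT 2.0
OPERATION_PER_YEAR = 2.0       # yearly operation cost
DECOMMISSIONING = 5.0          # one-off decommissioning cost


def operation_and_decommissioning(years: int) -> float:
    """Total operation cost over `years` plus the decommissioning cost."""
    return OPERATION_PER_YEAR * years + DECOMMISSIONING


def relative_to(budget: float, years: int = 10) -> float:
    """Operation and decommissioning cost as a fraction of `budget`."""
    return operation_and_decommissioning(years) / budget


if __name__ == "__main__":
    total = operation_and_decommissioning(10)
    print(f"10 years of operation plus decommissioning: {total:.0f} M euro")
    # Assumption: reference budget is the additional KM3NeT 2.0 budget only.
    print(f"relative to KM3NeT 2.0 budget: {relative_to(KM3NET2_ADDITIONAL):.1%}")
    # Alternative assumption: reference budget includes Phase-1 as well.
    combined = PHASE1_INVESTMENT + KM3NET2_ADDITIONAL
    print(f"relative to Phase-1 + KM3NeT 2.0: {relative_to(combined):.1%}")
```

Under the first assumption the ratio is close to the quoted value of about 25%; under the second it is somewhat lower.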