Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube
Document Details
2022
R. Abbasi et al. (IceCube collaboration)
Summary
This research paper details the application of graph neural networks to event reconstruction at the IceCube neutrino observatory, focusing on the low-energy region from 1 to 30 GeV. It reports improved classification accuracy and improved reconstruction resolution for several parameters compared to previous maximum likelihood techniques, and demonstrates the method's potential for real-time analysis.
Full Transcript
Journal of Instrumentation, Open Access. Published by IOP Publishing for Sissa Medialab.
Received: September 8, 2022. Accepted: October 17, 2022. Published: November 4, 2022.

Graph Neural Networks for low-energy event classification & reconstruction in IceCube

The IceCube collaboration: R. Abbasi et al. (the full author list and institutional affiliations are given in the published article)

To cite this article: R. Abbasi et al 2022 JINST 17 P11003. https://doi.org/10.1088/1748-0221/17/11/P11003
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

E-mail: [email protected]

Abstract: IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challenge due to the irregular detector geometry, inhomogeneous scattering and absorption of light in the ice and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, it is possible to represent IceCube events as point cloud graphs and use a Graph Neural Network (GNN) as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction and interaction vertex.
Based on simulation, we provide a comparison in the 1 GeV-100 GeV energy range to the current state-of-the-art maximum likelihood techniques used in current IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed background rate, compared to current IceCube methods. Alternatively, the GNN offers a reduction of the background (i.e. false positive) rate by over a factor of 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared to current maximum likelihood techniques in the energy range of 1 GeV-30 GeV. The GNN, when run on a GPU, is capable of processing IceCube events at a rate nearly double the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low-energy neutrinos in online searches for transient events.

Keywords: Analysis and statistical methods; Data analysis; Neutrino detectors; Particle identification methods

ArXiv ePrint: 2209.03042

Contents
1 Introduction
  1.1 The IceCube detector
  1.2 The challenge of IceCube event classification and reconstruction
    1.2.1 Traditional reconstruction methods
    1.2.2 Machine-learning-based reconstruction methods
2 Graph Neural Networks applied to IceCube data
  2.1 Preprocessing of IceCube events
  2.2 Model architecture
  2.3 Training configuration and loss functions
3 Performance on low-energy neutrino events
  3.1 Selected datasets
  3.2 Event classification performance
  3.3 Event reconstruction performance of dynedge
  3.4 Runtime performance
4 Robustness test
  4.1 Perturbation of input variables
  4.2 Variations in ice properties and module acceptance
    4.2.1 Classification
    4.2.2 Reconstruction
5 Summary and conclusions

1 Introduction

1.1 The IceCube detector

The IceCube Neutrino Observatory, located at the geographic South Pole, consists of a cubic kilometer of ice instrumented with 5,160 Digital Optical Modules (DOMs) on 86 strings, placed at depths between 1450 m and 2450 m. The main detector array consists of 78 strings arranged in a roughly hexagonal array, with an average horizontal distance of 125 m between neighboring strings (see figure 1). Each string supports 60 DOMs separated vertically by 17 m. Each DOM contains a 10-inch Photomultiplier Tube (PMT) facing downwards. Around the center string, in the deepest part of the array where the optical transparency of the ice is highest, modules on 8 additional strings have been installed. This volume, named "DeepCore", has an increased spatial density of DOMs and features PMTs with an enhanced quantum efficiency (QE) compared to the main array. The IceCube Observatory is constructed to detect neutrino interactions spanning the energy range of a few GeV to several PeV, with the purpose of exploring both the cosmos and fundamental properties of neutrinos [3, 4].
[Figure 1 omitted in transcript: top view of the 86-string layout (IceCube and DeepCore strings) and side view showing the veto cap, the dust layer, and the DeepCore region (50 HQE DOMs with 7 m vertical spacing), alongside absorption and effective scattering coefficients vs. depth]
Figure 1. Top and side view of the IceCube detector. Ice properties as a function of depth are shown on the left. The in-ice dust layer, a layer of the ice with dust impurities and therefore reduced optical quality, is marked in gray. DeepCore is placed under the dust layer and shown in green. High Quantum Efficiency (HQE) DOMs are placed under the dust layer in the DeepCore region.

Charged particles resulting from neutrino interactions in the ice emit Cherenkov radiation detectable by IceCube. The number of detected photons is many orders of magnitude lower than the number emitted. For the neutrinos detectable by IceCube, the signal in a neutrino event may range from a low of a few to a high of order 100,000 detected photoelectrons. In the energy range this work focuses on, from a few GeV up to around 100 GeV, a typical interaction deposits about 49 GeV of energy and has just 17 detected photoelectrons. This signal, however, is interspersed with noise stemming from, for example, radioactive decays in the DOMs. Event candidates are read out from the detector at a trigger rate of around 2.7 kHz. These recorded events are dominated by triggers due to downgoing atmospheric muons, followed first by random triggers caused by noise pulses and second by atmospheric neutrinos at a rate of less than 10 mHz.

1.2 The challenge of IceCube event classification and reconstruction

Event reconstruction can be framed as a problem of parameter inference. Given a set of detector observations, the reconstruction aims to infer the physics properties of a (neutrino) interaction. A reconstruction algorithm defines and implements this process of taking input data and returning parameter estimators.

Parameters of interest and categorization of events. The parameters of interest estimated in the reconstruction vary by application. The deposited energy, the direction of the neutrino candidate, and the interaction vertex are of central relevance to many IceCube physics analyses. Events are also categorized into multiple morphological classes, of which only two are relevant for this work because the vast majority of the sample is below 100 GeV. These two event classes serve as proxies for the underlying neutrino flavor and interaction type. "Track-like" events are proxies for νμ charged-current (CC) interactions from atmospheric and astrophysical neutrinos. These events contain a muon that can travel a long distance inside the detector while emitting Cherenkov radiation, producing a signature that looks like a track going through the ice. However, track-like signatures can also be produced by atmospheric muons from cosmic rays, and by the 17% of ντ CC interactions that produce τ leptons decaying into muons, but these are not considered true tracks for the purposes of this work. The other class comprises "cascade-like" events, containing everything that is not described by the track-like class.
These events consist of electromagnetic and hadronic particle showers, such as those produced in νe, νμ, ντ neutral-current (NC) interactions and νe, ντ CC interactions. For brevity, we will refer to the classification of track (T) and cascade (C) morphologies as "T/C". Another classification task is the discrimination of neutrino events from atmospheric muon events, which will be referred to as "ν/μ".

Input Data. Data from the IceCube PMTs are digitized with waveform digitizers which record the majority of the PMT signals from neutrino interactions. These waveforms undergo an unfolding process that attempts to extract the estimated photon arrival times and the corresponding charge of individual photoelectrons, each of which is a so-called "pulse". The pulses form the detector response to an interaction and make up a time series. Each element in this sequence corresponds to the time t at which the PMT readout indicates a measured pulse with charge q. Most pulses in the energy regime considered in this work stem from single photons. These result in pulse charges close to one photoelectron, with variation in the charge due to the stochastic nature of the PMT amplification process. The detection of multiple, nearly instantaneous photons results in higher per-pulse charges. The PMTs themselves are located at fixed DOM positions (D_xyz) and have an empirically determined QE. The DOMs deployed in DeepCore have an efficiency that is roughly a factor of 1.35 times the QE of standard DOMs. These six variables of the time sequence are summarized in table 1 and form the input data for the reconstruction.

Table 1. Node features in the graph representation of neutrino events. The units shown here are before the preprocessing described in section 2.1.

  Feature | Description                             | Unit
  D_xyz   | Position of DOMs in IceCube coordinates | m
  t       | Pulse time relative to trigger time     | ns
  q       | Charge of a pulse                       | P.E.
  QE      | Quantum efficiency of PMT               | -

The amount of Cherenkov radiation produced in an event depends primarily on the energy of the interacting particle, and less energetic neutrinos often lead to a reduced amount of Cherenkov light. Consequently, the number of pulses for an event, n_pulses, is highly dependent on the event itself, which makes low-energy neutrinos particularly difficult to identify and reconstruct. In this study, we focus on neutrinos with energy less than 1 TeV. The range of 1-30 GeV is of particular importance to the study of atmospheric neutrino oscillations [9, 10]. For typical events in this energy range, n_pulses can range from below ten to several hundred after noise hit removal. Because some pulses in an event are due to noise, low-energy neutrino events often suffer from a relatively poor signal-to-noise ratio. In addition, at low energies the track events can be so short that they cannot be easily distinguished from cascade-like events. While the irregular geometry of IceCube helps distinguish events that would otherwise leave identical detector responses in a regular geometry, it adds a layer of complexity to the development of reconstruction algorithms. Reconstruction is further complicated by ice properties that vary as a function of the x, y, z position in the IceCube array. The z-dependence of the ice absorption and effective scattering is shown on the left of figure 1.

1.2.1 Traditional reconstruction methods

Maximum likelihood estimation is a standard technique for parameter inference.
The key ingredients are the likelihood itself and a maximization strategy. The exact reconstruction likelihood for IceCube events is, however, intractable and can only be approximated. In IceCube, event reconstructions range in complexity from simple analytical approximations of likelihoods to highly sophisticated detector response functions based on photon ray tracing. A major challenge in modeling the detector response is including the right amount of detail of an event's Cherenkov light emission profile, as well as the inhomogeneous optical properties of the ice, which are visualized on the left side of figure 1. Both affect the expected photon pattern in the PMTs. Analytic approximations, which are fast but inaccurate, are predominantly used as first guesses for both online and offline processing. More sophisticated reconstruction techniques come with a higher computational cost and can only be applied to a subset of the events. For the neutrino energies considered in this work (i.e., low-energy events between 1 GeV and 1000 GeV), such likelihood-based reconstructions were used, for example, in the analyses of refs. [9, 10]. We will use the state-of-the-art IceCube low-energy reconstruction algorithm retro as a benchmark for our work. On average, retro requires around 40 seconds to reconstruct one event on a single CPU core, which is one of the main shortcomings of the method. Another limitation is its use of simplifying approximations, such as the assumed uniformity in the azimuthal response of the IceCube DOMs, as well as the assumption that ice properties change as a function of depth only.

The IceCube Upgrade will augment IceCube's detection capabilities by adding detector strings featuring new DOM types: the "mDOM" and the "D-Egg". The mDOMs carry 24 3-inch PMTs providing almost homogeneous angular coverage, while the D-Eggs carry two 8-inch PMTs, one facing up and one down. Adjusting retro to work with the IceCube Upgrade while keeping memory usage and computing time at a reasonable level is difficult, because assumed symmetries are broken and the number of distinct module types increases.

1.2.2 Machine-learning-based reconstruction methods

An alternative to maximum likelihood estimation is regression with Neural Networks (NNs). Instead of approximating the likelihood and traversing its parameter space with an optimization algorithm, a regression algorithm returns parameter estimates directly. These regression models are trained by minimizing a defined loss function. Since an optimal regression algorithm may be highly nonlinear, artificial NNs can offer a viable solution. A regression algorithm needs to map the ragged input data of shape [n_pulses, 6], where the number of columns refers to the six node features (D_x, D_y, D_z, t, q, QE) described in table 1, onto an output of shape [1, D] estimating event-level truth information, where D is the number of parameters of interest. The spatio-temporal nature of IceCube data, together with the varying sequence lengths, makes it difficult to map event reconstruction in IceCube onto common machine learning techniques. Previously published IceCube machine learning methods embed events into pseudo-images and perform regression using a Convolutional Neural Network (CNN). This class of algorithms is effective at identifying local features in data; CNNs efficiently group similar data points in the compressed "latent space" representation of the input images.
However, embedding IceCube events into images is a lossy operation that aggregates pulse information into per-DOM summary statistics, severely degrading the information available in low-energy events. For this reason, the CNN reconstruction method is mainly used at energies higher than those considered in our work, although an adaptation for our energy range exists. An alternative NN architecture employing a Graph Neural Network (GNN) approach has also been used in IceCube, but it focuses solely on ν/μ classification. We propose a general reconstruction method based on GNNs that can be applied to the entire energy range of IceCube, is compatible with the IceCube Upgrade, and can reconstruct the events studied in this work at speeds fast enough for real-time online processing at the South Pole. We consider a set of reconstruction attributes: ν/μ classification, T/C classification, deposited energy E, zenith and azimuth angles (θ, φ), and the x, y, and z coordinates of the interaction vertex, denoted V_xyz. We will benchmark our reconstruction method on a simulated low-energy IceCube sample used for atmospheric neutrino oscillation studies.

2 Graph Neural Networks applied to IceCube data

Generally, GNNs are NNs that operate on graph representations of data. A graph consists of nodes interconnected by edges. Nodes are associated with data, and the edges specify the relationships among the nodes. By adopting graphs as the input data structure, the idea of convolution generalizes from the application of filters on the rigid structure of grids to abstract mathematical operators that utilize the interconnection of nodes in their computation. Such models can therefore naturally incorporate an irregular geometry directly in the edge specification of the graph, without imposing artificial constraints. For this reason, GNNs are a natural choice of machine learning paradigm for problems with irregular geometry, such as IceCube event reconstruction. We choose to represent each event by a single graph. Each observed pulse is represented by a node in the graph and contains the per-DOM information shown in table 1. Each node in the graph is connected to its 8 nearest neighbors based on the Euclidean distance, and for this reason we consider the interconnectivity of the nodes in the graphs to be spatial.

2.1 Preprocessing of IceCube events

The variables listed in table 1 come in different units and span different ranges; for example, the relative positions of triggered DOMs can take values from tens of meters to more than 1 km, and the times spanned by pulses may range from tens of nanoseconds to several microseconds. While neural networks can in principle process data in any range of real numbers, the complexity of the loss landscape that a model navigates during training is highly dependent on the relative scale of the input data. In addition, distributions not centered around zero can lead to slower convergence. Each input variable x is therefore transformed into x̄ using

  x̄ = (x − P50th(x)) / (P75th(x) − P25th(x)),  (2.1)

where Pith(x) is the i-th percentile of the distribution of input feature x. This transformation brings the input variables into roughly similar orders of magnitude, gives them a median of zero, and makes them unitless.
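As a concrete illustration of the preprocessing of eq. (2.1) and the graph construction described above, the following is a minimal sketch, not the actual GraphNeT implementation. The function names are ours, and in the real pipeline the percentiles would be computed over the training dataset rather than per event.

```python
import numpy as np
from scipy.spatial import cKDTree

def robust_scale(x: np.ndarray) -> np.ndarray:
    """Eq. (2.1): subtract the median and divide by the interquartile range."""
    p25, p50, p75 = np.percentile(x, [25, 50, 75], axis=0)
    return (x - p50) / (p75 - p25)

def build_event_graph(pulses: np.ndarray, k: int = 8):
    """Turn one event into a graph.

    pulses: [n_pulses, 6] array with columns (x, y, z, t, q, QE), one row per
    pulse (table 1). Returns scaled node features and a [2, n_pulses * k] edge
    list connecting every node to its k nearest Euclidean neighbours.
    """
    features = robust_scale(pulses)
    tree = cKDTree(pulses[:, :3])                # neighbours in raw xyz space
    _, idx = tree.query(pulses[:, :3], k=k + 1)  # nearest hit is the node itself
    senders = idx[:, 1:].ravel()                 # drop the self-match
    receivers = np.repeat(np.arange(len(pulses)), k)
    return features, np.stack([senders, receivers])
```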
2.2 Model architecture

Our GNN implementation, henceforth referred to as dynedge, is a general-purpose reconstruction and classification algorithm that is applied directly to the event pulses from the IceCube detector. The algorithm therefore does not rely on the pulses being aggregated into per-DOM summary statistics, as in the image-based methods above. Instead, dynedge uses graph learning to extract features from the pulses, making the mapping of measurements to prediction a fully learnable task. Additionally, dynedge learns the optimal edges between nodes and can be applied to events with any number of pulses. dynedge is implemented using GraphNeT, a software framework for graph neural networks in neutrino telescopes built on PyTorch Geometric.

dynedge uses the convolutional operator EdgeConv, developed to act on point cloud graphs in computer-vision segmentation analyses. For every node n_j with node features x_j, the operator convolves x_j via the local neighborhood of n_j as

  x̃_j = Σ_{i=1}^{N_neighbors} MLP(x_j, x_j − x_i),  (2.2)

where x̃_j denotes the convolved node features of n_j, and the Multilayer Perceptron (MLP) takes as input the unconvolved node features of n_j and the pairwise difference between the unconvolved node features of n_j and its i-th neighbor. Thus, the connectivity of the node in question specifies which neighboring nodes contribute to the convolution operation. A full convolution pass of the input graph consists of repeating equation (2.2) for every node in the graph. Compared to convolutions as understood from CNNs, equation (2.2) corresponds to a convolutional operator where the kernel size is analogous to N_neighbors and the "pixels" on which the kernel acts are defined by the edges instead of the position of the kernel.

The full architecture of dynedge is shown in figure 2. First, the following 5 global statistics are calculated from the input graph: the node homophily ratios of D_xyz and t, and the number of pulses in the graph. The homophily ratio is the fraction of connected node pairs that share the same node feature, and thus a number between 0 and 1. The homophily ratio of D_xyz indicates what fraction of connected pulses originate from the same PMT. For a graph where all nodes are connected to each other, a value of 0.5 indicates that half the pulses in the event are same-PMT pulses. As such, the homophily ratios quantify how many of the pulses originate from the same PMT and how many of the pulses happened at the same time. Second, the input graph is propagated through 4 different EdgeConv blocks such that the output of one flows into the next. As illustrated in the bottom right corner of figure 2, each EdgeConv block starts with a k-nearest-neighbors (k-nn) computation that redefines the edges of the input graph. This step lets dynedge recalculate the 8 nearest neighbors, meaning that the first convolution block connects the nodes in true xyz-space and the subsequent convolution blocks connect the nodes in an increasingly abstract latent space. By letting each subsequent convolutional block interpret the convolved xyz-coordinates from the previous block, dynedge is allowed to connect the nodes arbitrarily in each convolutional step, which effectively lets dynedge learn the optimal neighborhoods of the event for a specific task, given the 8-nearest-neighbors constraint.

[Figure 2 diagram omitted in transcript: input graph [n, 6] plus 5 global statistics, four EdgeConv blocks producing state graphs [n, 256], node aggregation (min, max, mean, sum), and a final MLP producing the [1, n_outputs] prediction]
Figure 2. A diagram of the architecture of dynedge.
The "Input Graph" is a toy illustration of the input, and the subsequent "State Graphs" illustrate how the node positions and connectivity change after convolutions by EdgeConv. An illustration of the inner workings of the EdgeConv block is provided to the right, where h represents an arbitrary number of columns. The number of pulses is represented by n.

Third, the input graph and each state graph are concatenated into a [n, 1030]-dimensional array that is passed through a fully connected MLP block containing two layers. The block maps the [n, 1030]-dimensional array into a [n, 256]-dimensional array. This array is aggregated node-wise into summary statistics in four parallel ways (mean, min, max, and sum), each corresponding to a many-to-one projection of the form f: [n, 256] → [1, 256]. (While more sophisticated pooling methods were tested, none improved upon this choice.) The aggregations are concatenated together to minimize information loss, producing an array with dimension [1, 4 · 256] that is concatenated with the initially calculated 5 global statistics, producing a [1, 1029]-dimensional input array to the subsequent 2-layer MLP block that makes the final prediction by mapping the [1, 1029]-dimensional array to a [1, n_outputs]-dimensional output. This node aggregation scheme removes the need for zero-padding of the input data and allows the model to function on any number of pulses.

The architecture shown in figure 2 is the result of multiple iterations of architecture tests. We find that increasing the number of convolutional layers leads to identical performance at increased training time, whereas decreasing the number of convolutional layers leads to noticeably worse performance. In addition, a wide variety of data-driven pooling operations have been tested, but none improved upon the many-to-one projections shown in figure 2. Hyperparameters such as the number of nearest neighbors used in the k-nearest-neighbors computation have also been tuned. We find that for k larger than 8 the convolutions become too coarse to learn local features, whereas for k smaller than 8 the convolutions become too fine to be globally descriptive. The internal dimensions of the data referenced both in the text and in figure 2 have also been subject to hyperparameter optimization; these dimensions effectively set the number of learnable parameters in the model. We find that a significant reduction in learnable parameters yields an under-parameterized model that cannot learn the complicated relationships between data and task, while a significant increase in learnable parameters increases training time and yields performance similar to the current configuration.
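As a concrete summary of the convolution step of section 2.2, here is a simplified, self-contained PyTorch stand-in for one EdgeConv block. The actual implementation uses the EdgeConv operator of PyTorch Geometric via GraphNeT; the class and argument names below are ours, and the sum aggregation follows eq. (2.2).

```python
import torch
import torch.nn as nn

class EdgeConvBlock(nn.Module):
    """Minimal EdgeConv-style block (eq. 2.2): for each node j, sum
    MLP([x_j, x_j - x_i]) over its k nearest neighbours i, recomputing
    the neighbourhood in the given coordinate space first."""

    def __init__(self, in_dim: int, out_dim: int, k: int = 8):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Linear(2 * in_dim, out_dim),
            nn.LeakyReLU(),
            nn.Linear(out_dim, out_dim),
            nn.LeakyReLU(),
        )

    def forward(self, x: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # k-nn in the given coordinates: true xyz for the first block,
        # learned latent coordinates for the subsequent blocks.
        dist = torch.cdist(coords, coords)                          # [n, n]
        knn = dist.topk(self.k + 1, largest=False).indices[:, 1:]   # drop self
        neighbours = x[knn]                            # [n, k, d]
        center = x.unsqueeze(1).expand_as(neighbours)  # [n, k, d]
        messages = torch.cat([center, center - neighbours], dim=-1)
        return self.mlp(messages).sum(dim=1)           # aggregate over neighbours
```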
2.3 Training configuration and loss functions

A dynedge network is trained for each of the reconstruction variables: deposited energy E, zenith and azimuth angles (θ, φ), the interaction vertex V_xyz, ν/μ classification, and T/C classification, for a total of 6 independently trained models. The models differ in the choice of loss function, the number of outputs, and the training selection. Each model is trained with a batch size of 1024 using the ADAM optimizer; the batch size is the number of events in each subsample of the training set on which the gradients of the loss are calculated. A custom piecewise-linear implementation of a One-Cycle learning rate schedule with a warm-up period is used: the scheduler increases the global learning rate linearly from 10^-5 to 10^-3 during the first 50% of iterations of the first epoch, and thereafter decreases it linearly back to 10^-5 over the remaining iterations of the 30-epoch training budget. To counteract overfitting, early stopping with a patience of 5 epochs is applied.

For all classification tasks, the Binary Cross Entropy loss is used, since we only consider two categories: neutrino or muon events, and tracks or cascades. For each regression task, a specific loss function is chosen. For the regression of deposited energy, a LogCosh function is used,

  Loss(E) = ln(cosh(R_E)),  (2.3)

applied to the residual R_E = log10(E_reco/GeV) − log10(E_true/GeV). The embedding f: E_true → log10(E_true/GeV) is introduced to account for the large range the deposited energy spans, and log10(E_reco/GeV) is the model's prediction of the embedded deposited energy. LogCosh is symmetric around zero and therefore punishes over- and under-estimation equally. Compared to more conventional choices such as the mean squared error (MSE), LogCosh offers a steadier gradient around zero and is approximately linear for large residuals, both of which benefit the training process.

Table 2. Targets of the GNN-based classification and reconstruction algorithm, based on the truth values of the event simulation, including the definitions of the residual distributions for the regression targets.

  Target | Description                                | Residual definition
  ν/μ    | Classification of neutrino vs. muon events | -
  E      | Deposited energy of neutrino interaction   | R_E = log10(E_reco) − log10(E_true)
  θ, φ   | Zenith and azimuth angles of neutrino      | R_angle = angle_reco − angle_true
  r⃗      | Direction vector of neutrino               | R_r⃗ = arccos( (r⃗_reco · r⃗_true) / (|r⃗_reco| |r⃗_true|) )
  V_xyz  | Vertex position of neutrino interaction    | R_Vxyz = |P⃗_reco − P⃗_true|
  T/C    | Classification into tracks and cascades    | -

For angular regression, a von Mises-Fisher Sine-Cosine loss is used, where the true angle ø is embedded into a 2D vector space using f: ø → [sin ø, cos ø]. dynedge is then tasked with predicting this embedded vector together with an uncertainty k. The d = 2 von Mises-Fisher distribution, where d is the dimension of the unit vectors, is given by

  p(x̄ | ū, k) = C2(k) exp(k ū · x̄) = C2(k) exp(k cos Δø),

where x̄ is the predicted embedded vector, ū is the embedding of the true angle, k resembles 1/σ² of a normal distribution, and C2 is a normalization constant written in terms of modified Bessel functions. The d = 2 von Mises-Fisher distribution describes a probability distribution on a 1-sphere embedded in R² and bears resemblance to loss functions such as 1 − cos(Δø), but with the added functionality of uncertainty estimation via k. Note that Δø is chosen to be the angle between the predicted and the true embedding vectors. (A more intuitive approach, such as a d = 3 von Mises-Fisher where Δø is the angular difference between the predicted and true direction vectors in R³, was found to lead to suboptimal results because the azimuthal component of the loss is too dominant at lower energies; substantial improvement in zenith reconstruction is gained by estimating the two angles separately.) The loss function is obtained by taking the negative logarithm of p(x̄ | ū, k):

  Loss(ø) = −ln p(x̄ | ū, k) = −ln C2(k) − k cos(Δø) = −ln(k/4π) + k + ln(1 − e^(−2k)) − k cos(Δø).  (2.4)

After the zenith and azimuth angles are regressed individually, the direction is produced by transforming zenith and azimuth into a direction vector r⃗_reco ∈ R³.

The true interaction vertex is embedded using f: (x, y, z) → (x/|max(x)|, y/|max(y)|, z/|max(z)|), for the same reasons mentioned in section 2.1. dynedge then predicts the embedded interaction vertex vector, and the loss function is the Euclidean distance between the true (P̃_true) and reconstructed (P̃_reco) embedding vectors.
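The two regression losses translate directly into code; below is a minimal PyTorch sketch. The helper names are ours, the vMF normalization is taken as printed in eq. (2.4), and kappa is assumed to be kept positive by the model (e.g., via a softplus activation).

```python
import math
import torch
import torch.nn.functional as F

def logcosh_loss(log10_e_reco: torch.Tensor, log10_e_true: torch.Tensor) -> torch.Tensor:
    """Eq. (2.3): LogCosh on the log10-energy residual, in the numerically
    stable form log(cosh(r)) = r + softplus(-2r) - log(2)."""
    r = log10_e_reco - log10_e_true
    return (r + F.softplus(-2.0 * r) - math.log(2.0)).mean()

def vmf_sine_cosine_loss(pred_vec: torch.Tensor, kappa: torch.Tensor,
                         angle_true: torch.Tensor) -> torch.Tensor:
    """Eq. (2.4): negative log-likelihood of the d = 2 von Mises-Fisher
    distribution. pred_vec is the predicted [sin, cos] embedding and kappa
    the estimated concentration (uncertainty) parameter (assumed positive)."""
    u = torch.stack([torch.sin(angle_true), torch.cos(angle_true)], dim=-1)
    x = F.normalize(pred_vec, dim=-1)             # unit vector on the 1-sphere
    cos_delta = (u * x).sum(dim=-1)               # cos of angle between embeddings
    nll = (-torch.log(kappa / (4.0 * math.pi)) + kappa
           + torch.log1p(-torch.exp(-2.0 * kappa)) - kappa * cos_delta)
    return nll.mean()
```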
3 Performance on low-energy neutrino events

In this section, we quantify the performance of our proposed algorithm on simulated data in the energy range of 1 GeV-1000 GeV, where the majority of selected events are below 100 GeV (see figure 4), and compare it with the state-of-the-art retro algorithm. When possible, we provide comparisons for tracks (νμ CC) and cascades (all other neutrino interactions) separately. At the end of the section, a short examination of the runtime performance is provided.

3.1 Selected datasets

The IceCube simulation used for training and testing the dynedge classifications and reconstructions is borrowed from the collaboration's simulation for neutrino oscillation analyses, and the dataset and selection process are similar to those used in previous oscillation analyses. The interactions were simulated with GENIE, and the simulation of the propagation of secondaries, particle showers, and Cherenkov light is the same as used in those analyses. The event selection aims to provide a clean neutrino sample in the DeepCore region of IceCube by removing pure-noise events and atmospheric muon events and by applying containment cuts. retro is run at a late stage of the event selection (level 7) because of its high computational cost. We have chosen this late stage of the event selection as our dataset because it allows for a direct comparison with retro. The resulting selection totals approximately 8.3 million simulated neutrino and muon events; because the event selection removes virtually all pure-noise events, these have been omitted from this work. Once weighted to a physical spectrum, the data sample contains less than 5% atmospheric muons, and the neutrino sample consists of mostly contained interactions below 100 GeV. Distributions of a few event-level variables of the selected events are shown in figure 4. From the 8.3 million event sample, we construct three task-specific datasets, summarized in table 3.

Table 3. Selected datasets for the reconstruction and classification tasks for dynedge. Testing sets are defined as the data on which no back-propagation has been performed, and therefore include the validation set.

  Task                   | Total | νe NC (%) | νe CC (%) | νμ NC (%) | νμ CC (%) | ντ NC (%) | ντ CC (%) | μ (%)
  ν/μ (Train)            | 215k  | 1.6       | 15.0      | 1.6       | 15.0      | 4.4       | 12.3      | 50.1
  ν/μ (Test)             | 106k  | 1.6       | 15.1      | 1.5       | 15.3      | 4.5       | 12.2      | 49.8
  Reconstruction (Train) | 3.76M | 3.2       | 30.0      | 3.1       | 30.3      | 9.0       | 24.4      | -
  Reconstruction (Test)  | 4.37M | 1.4       | 12.8      | 6.6       | 64.3      | 4.0       | 11.0      | -
  T/C (Train)            | 731k  | 16.7      | 16.7      | 16.7      | 49.9      | -         | -         | -
  T/C (Test)             | 7.4M  | 0.8       | 21.2      | 3.9       | 48.2      | 7.0       | 18.9      | -

Because the event selection aims to produce a pure neutrino sample, the raw count of muon events in the 8.3 million event sample is less than 200k. In addition, there are significantly more cascade events than tracks (νμ CC), making the full dataset class-imbalanced. This work under-samples the full dataset into three task-specific training sets with corresponding test sets to counteract the imbalance between the three classes: tracks, cascades and muons.
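A minimal sketch of such class-balanced undersampling, assuming the events live in a pandas DataFrame with a hypothetical label column distinguishing tracks, cascades, and muons:

```python
import pandas as pd

def undersample_balanced(events: pd.DataFrame, label_col: str = "label",
                         seed: int = 0) -> pd.DataFrame:
    """Draw from every class as many events as the rarest class contains,
    yielding a class-balanced training set."""
    n_min = events[label_col].value_counts().min()
    return (events.groupby(label_col, group_keys=False)
                  .apply(lambda g: g.sample(n=n_min, random_state=seed)))
```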
The T/C training dataset contains equal numbers of νμ CC events (considered "tracks") and νe CC + NC events (considered "cascades"); because cascades far outnumber tracks, balancing the two classes leaves a training sample of only 731k events. The ντ events are omitted from the training sample (but included in the test sample), since 17% of ντ CC events produce a track-like signature but carry a cascade-like label, which could confuse the model during training. The corresponding test set for T/C comprises the remaining neutrino events, mostly cascades. For the ν/μ classification task, the training dataset is chosen such that it contains equal numbers of neutrino and muon events. Because of the scarcity of muon events, this dataset is the smallest; the neutrinos have been sampled such that each flavor is equally represented.

For reconstruction, a dataset with equal amounts of each neutrino flavor has been chosen. On this dataset, energy, zenith, azimuth, and interaction vertex are regressed. To keep the balance between tracks and cascades, one might most naturally choose the dataset from the T/C task, because it is balanced between the two event classes, but this selection yields significantly lower statistics. It was found that the high-statistics choice provided general improvements compared to reconstructions from dynedge trained purely on the balanced T/C set. The classification results are shown in figure 3 and the reconstruction results in figure 5. The normalized distributions of the regression targets of table 2 on the test and training sets are plotted in figure 4 for each event selection in table 3. The T/C and Reconstruction selections have similar target distributions, whereas the ν/μ event selection has additional artifacts due to the presence of muons.

3.2 Event classification performance

The performance of the dynedge classifiers and the currently used Boosted Decision Tree (BDT) methods is characterized using Receiver Operating Characteristic (ROC) curves, from which the Area Under the Curve (AUC) is calculated. The ν/μ classification task is used in event selection, and for this reason a threshold on the classification scores, shown in the bottom left plot of figure 3, is used to decide whether an event is labeled a neutrino or a muon. A typical threshold in classification score is indicated in red in the left panel of figure 3. The intersection points between the red line and the ROC curves represent the corresponding False Positive Rate (FPR) and True Positive Rate (TPR) at this choice of selection. By comparing the TPR of the intersection points, one can deduce that dynedge increases the signal efficiency by 18% relative to the BDT at a FPR of 0.025. Alternatively, the FPR can be reduced by over a factor of 8, from 2.5% to 0.3%, relative to the BDT at a fixed TPR of 0.6 for the ν/μ classification task. From the ROC curve for the T/C classification task, dynedge offers an AUC score about 6% higher. A threshold for the T/C classification task is not shown because the T/C classification scores, shown in the bottom right plot, are typically binned when used in an analysis. Inspecting the T/C classification score distributions in the bottom right of figure 3 shows that dynedge offers a better separation of track and cascade events than the BDT.
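The figure-of-merit comparison quoted above can be reproduced from classifier scores with standard tooling; a minimal sketch using scikit-learn (the variable names are ours):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def tpr_at_fixed_fpr(y_true: np.ndarray, y_score: np.ndarray,
                     target_fpr: float = 0.025):
    """Signal efficiency (TPR) at a fixed background rate (FPR), read off
    the ROC curve, plus the overall AUC. y_true: 1 = neutrino, 0 = muon."""
    fpr, tpr, _ = roc_curve(y_true, y_score)
    return np.interp(target_fpr, fpr, tpr), roc_auc_score(y_true, y_score)
```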
[Figure 3 omitted in transcript; ν/μ ROC: BDT AUC 0.923, dynedge AUC 0.962, with marked points (FPR, TPR) = (0.025, 0.602) for the BDT and (0.025, 0.782), (0.003, 0.602) for dynedge; T/C ROC: BDT AUC 0.671, dynedge AUC 0.713]
Figure 3. Left: ROC curves for dynedge and the current BDT ν/μ classification models, with the distributions of classification scores shown below and the threshold labeled in red. Right: ROC curve for dynedge and the current BDT T/C classification models, with the distributions of prediction scores shown below.

The large difference in classification performance between the BDT and dynedge shown in figure 3 can be explained by the fact that classical BDTs require sequential input data, such as IceCube events, to be aggregated from [n_pulses, n_features] into [1, d]-dimensional arrays as a preprocessing step, where d is the number of event variables on which the BDTs act. This preprocessing step reduces the information available to the BDTs.

3.3 Event reconstruction performance of dynedge

The event reconstruction performance for deposited energy, zenith, direction, and interaction vertex is shown in figure 5. The performance metric is the resolution, which we quantify here as the width W of the residual distribution R. For the deposited energy and zenith reconstructions, W is defined as

  W = (p84th(R) − p16th(R)) / 2,  (3.1)

where p16th and p84th correspond to the 16th and 84th percentiles. For a normal distribution, W corresponds to the standard deviation σ, but it is more robust to outliers. For the direction and interaction vertex reconstructions, which have strictly positive residual distributions, the resolution W is defined as the 50th percentile, p50th(R). Since reconstruction resolutions are highly dependent on the energy of the neutrino, we provide the resolutions binned in energy. Additionally, we separate the residual distributions into true track (νμ CC) and cascade events to examine the performance of the algorithms in detail. For the comparison between dynedge and retro, the relative improvement defined by

  relative improvement = (1 − W_dynedge / W_retro) · 100  (3.2)

is used, where W_dynedge and W_retro correspond to the resolutions of dynedge and retro for a given reconstruction task. The relative improvement provides a percentage comparison of the resolutions. The residual distributions for deposited energy (R_E), angles (R_angle), interaction vertex (R_xyz), and direction (R_dir) are defined in table 2, and some are shown in figure 4. (For the calculation of resolutions, the residual distribution for energy is changed to R_E = (E_reco − E_true)/E_true · 100.)

[Figure 4 omitted in transcript: distributions of predicted and true targets (E, θ, φ, Vx, Vy, Vz) with corresponding residuals]
Figure 4. Area-normalized distributions of predicted and true target variables on the reconstruction test set and their corresponding residual distributions. The residuals shown here follow the definitions in table 2, except for the vertex coordinates, which show the relative difference between the reconstructed and true coordinates.
From figure 4 it can be seen that dynedge tends to over-estimate the deposited energy for very low-energy events. This is partly due to the lower statistics and the relatively poor signal-to-noise ratio below 30 GeV. On average, it is optimal for a machine learning model to estimate a value close to the mean of the true distribution for examples where it has not yet learned a better solution. Such examples can be considered difficult for the method, and by predicting a value close to the mean of the true distribution, the model minimizes the loss. This behavior is evident for dynedge in the reconstructions of zenith and azimuth, where events with a relatively high estimated uncertainty have a preferred solution that minimizes the loss function. For the regression of azimuth, this behavior yields multimodal artifacts due to the cyclic nature of the variable. As seen in figure 1, DeepCore has more strings along the north-south axis than along east-west. The events with high estimated azimuthal uncertainty piled around π and 2π are generally low-energy track events with a detector response similar to a neutrino traveling east-west, although they are in fact traveling north or south. If there is not enough information to pinpoint whether an event is going north or south, the loss can be minimized by picking a value close to the observed locations of the pileups, as they correspond to a "mean" between north and south. The timing information of the event can be used to pick the optimal mode. We emphasize that while the distributions of predictions give the impression that retro is closer to the true distribution than dynedge, the correct assessment of reconstruction quality comes from interpreting the residual distributions. From the bottom panel of figure 4, it is evident that dynedge produces narrower residual distributions and therefore superior resolutions.

As can be seen from figure 5, dynedge improves upon the existing retro reconstruction in all 6 reconstruction variables for both track and cascade events between 1 GeV and 30 GeV. This energy range is of particular importance to neutrino oscillation analyses, where flavor oscillations of atmospheric neutrinos appear below 25 GeV. The typical reconstruction improvement in this range is at the level of 15%-20%. For zenith and direction reconstruction, the improvement is constant with energy for cascade-like events, while for track-like events the improvement ends around 100 GeV. dynedge performance relative to retro generally decreases at higher energies, possibly due to the lower number of available training samples at these energies, as shown in figure 4. As mentioned in section 3.1, the reconstruction event selection is chosen to maximize available statistics with an equal amount of each neutrino flavor, rather than optimizing the balance of track and cascade events. This choice improves the overall performance under the given circumstances, but at the same time leads to an underrepresentation of track-like events, which further lowers the available statistics for this event type at higher energies.
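For reference, the resolution and relative-improvement definitions of eqs. (3.1) and (3.2), as used in figure 5, translate directly into code (a sketch with our own function names):

```python
import numpy as np

def width(residuals: np.ndarray) -> float:
    """Eq. (3.1): half the distance between the 16th and 84th percentiles;
    equals sigma for a normal distribution but is robust to outliers."""
    p16, p84 = np.percentile(residuals, [16, 84])
    return (p84 - p16) / 2.0

def relative_improvement(w_dynedge: float, w_retro: float) -> float:
    """Eq. (3.2): percentage improvement of dynedge over retro; positive
    values mean a better (smaller) dynedge resolution."""
    return (1.0 - w_dynedge / w_retro) * 100.0
```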
[Figure 5 omitted in transcript: resolution vs. energy (log10 GeV) panels for energy (%), zenith (deg.), direction (deg.), and vertex (m), each with a relative-improvement ratio panel]
Figure 5. Event reconstruction performance for dynedge, estimating deposited energy (top left), zenith angle (top right), direction (bottom left), and interaction vertex (bottom right). In all four cases, the performance is compared to retro, and the ratio below each plot shows the relative improvement of dynedge w.r.t. retro, where the light blue and dark blue curves represent the relative improvement for cascades and tracks, respectively. Positive values indicate an improvement in resolution. Lines represent reconstruction resolution, and the bands cover the 1σ resolution uncertainty.

3.4 Runtime performance

The inference speed test measures the wall time required to reconstruct the zenith angle for 1 million neutrino events in batches of 7168 events. (The test was run on a server with Ubuntu 20.04.3 LTS, a 64-core 2.00 GHz AMD EPYC 7662 CPU, a single NVIDIA A100 SXM4 GPU with 40 GB VRAM, 1 TB of RAM, and 5 TB of NVMe disk space.) The test includes the time required to load a batch into memory, apply re-scaling, convert the batch into graphs, move the data to GPU memory, and pass it through dynedge. A budget of 40 parallel workers is allocated to feed the model with batches. The test does not include the overhead of writing the predictions to disk. The test was repeated 5 times on both GPU and CPU. The average speed was found to be 30.6 ± 2.0 kHz on GPU and 0.221 ± 0.004 kHz on a single CPU core. The inference speed was tested for the other variables considered in this work and was found to be within the uncertainties of the above values. Since a separate dynedge model is trained for each variable, the individual reconstructions can be run in parallel, in which case the reconstruction rate of an event equals the inference speed. If the reconstruction of each variable is not run in parallel, the reconstruction rate is approximately the inference speed divided by the number of desired variables. For a full reconstruction of energy, zenith, azimuth, interaction vertex, ν/μ, and T/C classification, the reconstruction rate is approximately 5.1 kHz on the GPU or 37 Hz on a single CPU core. With a classification and reconstruction speed nearly double the median IceCube trigger rate (2.7 kHz), dynedge is in principle capable of reconstructing IceCube events in real time using a single GPU.

4 Robustness test

The results presented in section 3 show that dynedge is suitable for both classification and reconstruction tasks in the low-energy range of IceCube, specifically the neutrino-oscillation-relevant energy range. However, these results are computed on simulation based on a nominal configuration of the detector. In section 4.1, we investigate the robustness of dynedge to perturbations of the input variables D_xyz, t and q. These variables constitute the node features, so this perturbation test probes the robustness of dynedge to systematic variations in the node features themselves. In section 4.2, we investigate the robustness of dynedge to changes in the systematic uncertainties associated with the detector medium and the angular acceptance of the DOMs. Variations in these assumptions lead not to variations in the node features but to variations in the topology of the events; this test therefore probes the robustness of dynedge to variations in the connectivity of the graph representations of the events.

4.1 Perturbation of input variables

There are systematic uncertainties associated with the input variables shown in table 1.
The position of each string is only known to a precision of a few meters horizontally, and the vertical position is known to better than one meter. (Strings are also not perfectly vertical, which results in a non-uniform uncertainty on the horizontal and vertical positions, but this effect is neglected in this test.) In this section, we investigate the variation in resolution and AUC from perturbations of the input variables, x → x + ε, where x is an input variable and ε is a random number drawn from a normal distribution with standard deviation σ_x.

[Figure 6 omitted in transcript: variation (%) in resolution (zenith, energy, direction, V_xyz) and AUC (ν/μ, T/C) for perturbations of DOM time (ns), string z (m), string xy (m), and DOM charge (p.e.)]
Figure 6. The variation in resolution and AUC induced by perturbation of the input variables for dynedge. The standard deviations of the normal distributions from which the perturbations are drawn are shown on the x-axis, and the 4 different perturbation tests are labeled at the bottom. Thin vertical lines centered on each bar are error bars.

For the DOM positions, one ε is drawn for each string, resulting in correlated shifts for all DOMs on one string and uncorrelated shifts between strings. The horizontal and vertical positions of the DOMs are perturbed independently of each other. Time and charge are perturbed pulse-wise, meaning that each pulse is perturbed independently. dynedge is not re-trained, i.e. we use the networks trained on nominal detector assumptions and run them on perturbed input datasets. We perturb time, z-coordinate, xy-coordinates, and charge separately, for a few choices of σ, and calculate the percentage variation with respect to the nominal resolution and AUC score. Our test addresses two questions: first, it gives an indication of the expected change in reconstruction performance due to a systematic shift in the input variables; second, it serves as a gauge of feature importance for the different reconstruction tasks. In figure 6, the variation in resolution and AUC is shown as a function of the perturbation width σ. A negative value indicates a worsening with respect to the nominal performance: for a resolution, −5% means a width W that is 5% larger, and for an AUC score, −5% means a decrease of 5%. As seen in figure 6, zenith is the most sensitive to perturbations of the time, as expected, since time perturbations of the pulses may reverse the inferred direction. Perturbations of the vertical and horizontal positions of the strings have little effect on energy and direction, but impact the interaction vertex resolution.
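A minimal sketch of this perturbation scheme (the sigma values below are placeholders, not the exact values scanned in figure 6, and in the paper each variable is perturbed separately rather than all at once):

```python
import numpy as np

def perturb_event(dom_xyz, string_id, t, q, rng, sigma_xy=1.0, sigma_z=0.5,
                  sigma_t=1.0, sigma_q=0.1):
    """Apply x -> x + eps perturbations: one correlated shift per string for
    the DOM positions, independent pulse-wise shifts for time and charge.
    rng is a numpy Generator, e.g. np.random.default_rng()."""
    xyz = dom_xyz.copy()
    for s in np.unique(string_id):
        mask = string_id == s
        xyz[mask, :2] += rng.normal(0.0, sigma_xy, size=2)  # shared xy shift
        xyz[mask, 2] += rng.normal(0.0, sigma_z)            # shared z shift
    t = t + rng.normal(0.0, sigma_t, size=t.shape)          # per-pulse time
    q = q + rng.normal(0.0, sigma_q, size=q.shape)          # per-pulse charge
    return xyz, t, q
```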
In the tests presented in this section, we quantify the robustness of our reconstruction to changes in the ice properties allowed by calibration data, using a five parameter model covering the most important sources of systematic detector uncertainties in IceCube. We 2022 JINST 17 P11003 0.8 Module Angular Acceptance baseline (p0 = 0.1, p1 = -0.05) 0.7 p0 = -1.0 p0 = -0.5 0.6 p0 = 0.3 Relative Acceptance p1 = -0.15 0.5 p1 = -0.10 0.4 p1 = 0.00 p1 = 0.10 0.3 p1 = 0.15 0.2 0.1 0.0 1.00 0.75 0.50 0.25 0.00 0.25 0.50 0.75 1.00 cos Figure 7. Variations in the DOM angular acceptance used for the robustness tests. The angle 𝜂 is the photon arrival angle at the Module, with cos 𝜂 = 1 being head-on towards the face of the PMT and cos 𝜂 = −1 towards the backside of the PMT where generally th