Artificial Intelligence-Driven Facial Image Analysis for the Early Detection of Rare Diseases: Legal, Ethical, Forensic, and Cybersecurity Considerations

Abstract: This narrative review explores the potential, complexities, and consequences of using artificial intelligence (AI) to screen large government-held facial image databases for the early detection of rare genetic diseases. Such databases, combined with the power of AI, could revolutionize the early diagnosis of rare genetic diseases. AI-powered phenotyping, as exemplified by the Face2Gene app, enables highly accurate genetic assessments from simple photographs. These breakthrough technologies, however, raise significant privacy and ethical concerns about potential government overreach augmented by the power of AI. This paper explores the concept, methods, and legal complexities of AI-based phenotyping within the EU. It highlights the transformative potential of such tools for public health while emphasizing the critical need to balance innovation with the protection of individual privacy and ethical boundaries. This comprehensive overview underscores the urgent need to develop robust safeguards around individual rights while responsibly utilizing AI's potential for improved healthcare outcomes, including in a forensic context. Furthermore, the intersection of AI and sensitive genetic data necessitates proactive cybersecurity measures. Current and future developments must focus on securing AI models against attacks, ensuring data integrity, and safeguarding the privacy of individuals within this technological landscape.


Introduction
The use of artificial intelligence (AI) systems in the medical context raises many concerns, including legal, ethical, forensic, and, of course, cybersecurity concerns, as these systems present unique vulnerabilities that must be addressed. Adversaries can exploit AI models through data poisoning, adversarial attacks, or model theft, potentially leading to incorrect predictions, system malfunctions, and sensitive data breaches. Current developments focus on improving AI resilience to these threats, such as adversarial training to improve robustness and differential privacy techniques to protect training data. Future developments are likely to include the integration of explainable AI (XAI) to increase model transparency and help detect anomalies, which is particularly important in medicine and forensics. In addition, the development of homomorphic encryption can enable the secure processing of sensitive data within AI models without decryption. Cybersecurity and AI will continue to evolve in a co-dependent manner, driving the need for continuous innovation to ensure the safe and responsible implementation of AI.

The emergence of big data, including 2D and 3D facial scans collected by governments, has opened up new opportunities for the early diagnosis of rare genetic diseases using artificial intelligence. AI-assisted phenotyping offers the potential for comprehensive and accurate genetic assessments, potentially revolutionizing healthcare. However, this approach raises significant ethical and legal concerns about citizen privacy and the potential for government misuse. Navigating this complex landscape requires a delicate balance between protecting individual privacy and harnessing the transformative power of AI for public health.

This paper explores the concept, potential methods, and legal limitations of AI-based phenotyping in the European Union (EU). It considers the ethical and legal implications of using AI to identify rare genetic diseases from facial photographs, highlighting the potential benefits while addressing the associated risks. The potential of AI algorithms, such as the Face2Gene app, to systematically screen large governmental datasets of facial traits over time holds immense promise for public health. However, this approach must be weighed against its profound implications for individual privacy and the ethical boundaries of government surveillance. This paper provides a comprehensive overview of AI-powered phenotyping for early genetic disease diagnosis, emphasizing the need for a balanced approach that safeguards individual rights while harnessing the transformative power of AI for public health.
The official definition of rare diseases depends on the region and differs between EU, US, and WHO definitions. Generally, a rare disease is defined as a condition that affects a small percentage of the population. Rare diseases are estimated to affect between 3.5% and 5.9% of the global population. This translates to approximately 300 million people worldwide being affected by rare diseases at any given time. According to the definition used in the European Union ("EU") [1], a rare disease is one that affects fewer than 1 in 2000 people. Depending on the definition, there are from 5000 to over 10,000 rare diseases [1,2]. They can affect any system in the body and are usually chronic and difficult to manage. One in four affected patients report difficulty in obtaining a correct diagnosis [3]. According to the United States (US) definition in the Rare Diseases Act of 2002, a rare disease is one that affects fewer than 200,000 people in the US, which translates to approximately 1 in 1600 people, considering the US population [4,5]. In the EU, which has a population of 447 million, the cumulative prevalence of rare diseases equates to an estimated 26-45 million cases. Rare diseases are an important cause of illness and death [3].
Traditionally, the diagnosis of genetic diseases requires obtaining and analyzing biological samples, particularly chromosomal, deoxyribonucleic acid (DNA), or ribonucleic acid (RNA) material. Unique facial features are recognized symptoms of many genetic diseases, including rare ones [6][7][8][9][10]. With the boom of AI algorithms capable of determining phenotypic characteristics of genetic diseases based on facial photos, new possibilities are emerging. Even rare genetic disorders can now be identified by analyzing images of people's faces with artificial intelligence, which has now been shown to be much more accurate than a physician's evaluation [10][11][12].
The analysis of facial images by artificial intelligence requires both the availability of simple facial images and an online AI agent capable of image processing. This has become possible with the advent of mobile phones and AI applications such as Face2Gene, which can instantly analyze 2D facial images free of charge, as the proliferation of internet-enabled mobile phones has exploded over the last decade. The number of smartphone mobile network subscriptions worldwide reached almost 6.6 billion in 2022 and is forecast to exceed 7.8 billion by 2028. China, India, and the United States are the countries with the highest number of smartphone mobile network subscriptions [13]. The availability of smartphones resulted in the development of a mobile phone app facilitating the early detection of rare genetic diseases: Face2Gene, which uses proprietary deep phenotyping DeepGestalt AI technology (FDNA Inc., Boston, MA, USA; www.face2gene.com). DeepGestalt was possibly the first digital health technology capable of identifying rare genetic disorders in people based on facial features alone. Since 2018, Face2Gene has had a prominent impact on the diagnosis and management of genetic diseases [14,15].
As a result of such development, the privacy of genetic data and its safeguards have become a topic for discussion due to the increased amount of genomic data that are collected, used, and shared [16]. AI-based genetic screening could support early risk stratification, accelerate diagnosis, and reduce mortality and morbidity through preventive care [12]. Safeguarding genetic data is extremely important in a world where just a facial photo can reveal sensitive genetic information [15].
Mandatory national ID schemes collect large amounts of personal data into a centralized database. Following 11 September 2001, biometric identification has been increasingly included in these databases [17]. While a national ID is issued at a certain age, a passport is required even for newborns for international travel. As a result, national ID databases include 2D/3D photos combined with a set of biometric and personal data covering most of the nation's population. Increasing global usage of social networks allows for the scraping of personal data and photos, even by non-state players with sufficient resources and determination [18].
Such databases may eventually be used as data sources for population genetic screening. No such project is publicly known as of April 2024. Therefore, we discuss the viability of large-scale genetic screening using governmental datasets within the EU legal framework while suggesting limitations and privacy safeguards.
The goal of this narrative review is to map and bridge interdisciplinary aspects around the central theme of the risks, potential, and consequences of using AI to screen large government-held facial image databases for the early detection of rare genetic diseases. This represents a multidisciplinary link between AI-driven technologies in the following areas:

•	Medicine, specifically genetics;
•	Technology, with a focus on cutting-edge AI models for facial analysis;
•	Law, with a focus on privacy laws (e.g., GDPR);
•	Ethics, i.e., ethical considerations around using citizen data;
•	Forensics, in identifying missing persons and suspects within a legal framework;
•	Cybersecurity, in addressing security risks of handling highly sensitive facial and genetic data.
The secondary objectives of this paper are as follows:
-	Raise awareness: Highlight the transformative potential of AI in healthcare, particularly for rare diseases where early detection is critical;
-	Spark ethical discussion: Provoke debate about the ethical boundaries of the use of government-held data, emphasizing the individual's right to privacy versus the potential public health benefits;
-	Inform policymakers: Provide insights to help policymakers craft responsible legislation governing the use of AI for genetic screening within existing privacy laws;
-	Highlight cybersecurity risks: Provide information on the specific vulnerabilities associated with AI-assisted genetic analysis and the handling of large amounts of AI data, highlighting the need for robust security measures;
-	Explore forensic applications: Open a discourse on the potential and ethical considerations of using AI facial analysis tools in forensics;
-	Stimulate further research: Identify gaps in current knowledge and technology and challenge researchers to address limitations, biases, and ethical dilemmas in the field.

The Concept
This narrative review aims to explore a complex concept with multidisciplinary implications. The definition of the narrative and methodology in this intellectual work is fundamental to the findings and interpretations. When applied to real life, in an imagined situation offering an unprecedented opportunity to detect a rare disease in you or your child at an early stage, and thus to prevent premature health complications and even death through early therapeutic intervention, the choice seems simple. However, the application of AI to the processing of big government data carries ethical, privacy, and cybersecurity risks with considerable Orwellian implications.
This narrative review constructs a narrative focusing on the transformative potential of AI facial analysis in rare disease detection while highlighting the complex legal, ethical, forensic, and cybersecurity considerations. Its key themes are the following:
1.	The technological advancements enabling early detection;
2.	Conflicting perspectives on privacy in the age of AI;
3.	The potential for AI-powered forensics;
4.	The urgency of cybersecurity safeguards in this domain.

The Outline
The step-by-step approach covered within this narrative review is structured below.

Data Acquisition and Preprocessing
Source databases: the origin of facial image data. This includes the following:
-	Research on synthetic image generation for data augmentation;
-	Integration with genetic sequencing for higher precision.
The use of AI-driven facial image analysis in forensic applications presents several specific ethical dilemmas. Privacy and consent are major concerns, as individuals are often analyzed without their knowledge or consent. Bias in criminal identification can lead to unequal treatment of different demographic groups, resulting in wrongful accusations or convictions, particularly affecting minority communities. Ensuring due process and a fair trial is challenging with opaque AI algorithms, as defendants must be able to challenge the evidence against them. Additionally, there is a risk of misuse of AI technology for unauthorized surveillance or political targeting, raising significant ethical and civil liberties concerns.

Conceptual System Model
To fully understand the transformative potential and implications of AI-driven facial image analysis for rare disease detection, it is essential to consider a comprehensive system model. This model integrates various inputs, processes, and outputs that align with this paper's core themes of legal, ethical, forensic, and cybersecurity considerations. By mapping out the interconnected elements and workflows, we can illustrate how AI technology can be effectively and responsibly implemented in real-world scenarios, ensuring that the benefits of early disease detection are realized while safeguarding individual rights and data integrity. The following section delineates these components in detail, providing a clear framework that supports the interdisciplinary approach advocated throughout this paper.

Inputs:
•	Facial image databases maintained by the government.

Figure 1 provides a visual representation of the key interconnected elements and processes involved in AI-driven facial analysis for the early detection of rare diseases, highlighting the legal, ethical, and security considerations.

Workflow-Based System Model
To operationalize AI-powered facial image analysis for rare disease detection, it is crucial to establish a detailed workflow that outlines the data flow and decision-making processes. This section builds on the conceptual framework previously discussed and translates it into a practical, step-by-step model. By doing so, we can ensure that each stage of the system, from data acquisition to secure storage and output generation, adheres to the legal, ethical, forensic, and cybersecurity standards highlighted in this paper. The following workflow model provides a clear roadmap for implementing AI-driven solutions in a manner that maximizes public health benefits while minimizing risks to individual privacy and data security.

1.	Image acquisition and preprocessing: Images are obtained from secure databases, ensuring that all legal and privacy requirements are met. These images are then standardized in terms of size, resolution, and orientation to prepare them for AI analysis;
2.	Anonymization/pseudonymization: Techniques such as anonymization and pseudonymization are applied to the images to protect individual identities in compliance with ethical and privacy regulations;
3.	AI analysis: The AI model processes the images using advanced algorithms to identify potential phenotypic markers indicative of rare genetic diseases;
4.	Data interpretation and risk assessment: The results generated by the AI model are carefully evaluated, considering potential risks and ethical implications. This step ensures that the analysis is accurate and that any identified risks are appropriately managed;
5.	Decision-making:
	•	Prevalence study branch: Anonymized data are utilized to contribute to population-level statistics, helping to map the prevalence of rare genetic diseases without revealing individual identities;
	•	Individual risk branch: With adequate safeguards and informed consent in place, individuals flagged as potentially at-risk by the AI analysis may be contacted for further assessment or preventive measures;
6.	Secure storage: All data, including images and analysis results, are stored securely, with strict access controls and robust cybersecurity measures to prevent unauthorized access and data breaches;
7.	Output and recommendations: The findings from the AI analysis are translated into actionable insights and recommendations. These outputs can be used for various purposes, including public health interventions, policymaking, and forensic applications, ensuring that the data are utilized effectively while maintaining privacy and security.
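The decision-making stage (step 5) can be sketched in code. This is a minimal illustration of the branching logic described above, not part of any cited system; the type and function names (`ScreeningResult`, `route`) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ScreeningResult:
    code: str      # pseudonymous code only, never a direct identity
    flagged: bool  # whether the AI model flagged a potential phenotype
    region: str    # coarse NUTS 2 region code, for prevalence statistics

def route(result: ScreeningResult, consent_given: bool) -> str:
    """Step 5 (decision-making): route each screening result to a branch.

    Individual follow-up happens only when the model flagged the case
    AND informed consent is on record; everything else contributes
    solely to anonymized population-level statistics.
    """
    if result.flagged and consent_given:
        return "individual_risk"
    return "prevalence_only"
```

The design point is that the individual risk branch is reachable only through an explicit consent flag, so the default path never exposes an identity.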

Figure 2 shows the flow of data and decisions through the system, highlighting the critical stages and safeguards. The proposed model is flexible, allowing for updates and enhancements based on feedback and evolving AI regulations.

Practical Challenges Specific to Forensic Applications
Implementing AI in forensic contexts within the described workflows involves several practical challenges. Data quality and integrity are critical, as forensic investigations often deal with low-quality or partial data. Ensuring AI systems can accurately process such data while maintaining their integrity is essential. Interoperability with existing law enforcement systems requires standardization and compatibility across various platforms and jurisdictions. The legal admissibility of AI-derived evidence in court necessitates demonstrating the reliability and validity of the AI methods used. Providing adequate training and ongoing support to law enforcement personnel for effective use of AI tools is also crucial. Lastly, establishing robust ethical oversight and governance mechanisms to monitor AI systems and address any emerging ethical or legal issues is necessary for responsible implementation in forensic work.

Review Methodology
This narrative review synthesized existing knowledge on the use of AI-powered facial image analysis for rare disease detection, focusing on its legal, medical, ethical, forensic, and cybersecurity implications. The comprehensive search was conducted by experts representing these fields (medicine, law, forensics, epidemiology, cybersecurity, and others) across relevant databases such as PubMed, Web of Science, and Google Scholar, using search terms including "AI facial analysis", "rare diseases", "government databases", "privacy", "ethics", "forensics", and "cybersecurity". Included sources consisted of peer-reviewed research articles, reports from governmental and non-governmental organizations, legal analyses, and news publications. No date limits were applied, to capture the rapidly evolving nature of the field. Findings from the collected literature were thematically synthesized to identify key concepts, opportunities, challenges, and areas for further exploration within this multidisciplinary domain.
We present the possibility of implementing a comprehensive population screening with an option to adopt preventive measures that ensure the privacy of citizens. We see two scenarios for common use.
The first option is represented by a technological solution that removes all identification of individuals from the AI algorithm output but provides epidemiological information on the prevalence of identified specific genetic diseases in specific age and sex groups. This scenario protects individual privacy while providing public health benefits to society.
To study the prevalence of genetic diseases at the population level, individual identification is not necessary. To achieve the objective, the target photo database, without any link to identity data, needs to be screened by facial AI analysis. However, geographical characteristics of prevalence are desired to provide important public health information. Thus, regions with a locally increased prevalence of particular genetic diseases can be identified. Personal data protection is a challenge, given the low prevalence of most genetic diseases. A low level of data granularity will be necessary to maintain and guarantee the anonymity of individual cases. Prevalence at a municipality or county level may not be sufficiently anonymous, as only a single case of a genetic disease may exist within its population. Administrative region-level prevalence, with a population above tens or hundreds of thousands of inhabitants, may provide a sufficient level of anonymity for most genetic diseases. The Nomenclature of territorial units for statistics ("NUTS") used by Eurostat divides the territory of the EU and the UK into 92 regions at NUTS 1 level (major socio-economic regions), 244 regions at NUTS 2 level (basic regions for the application of regional policies), and 1165 regions at NUTS 3 level (small regions for specific diagnoses) [19]. NUTS 2 regions meet the above requirements.
The outcome of this scenario is population-level prevalence data without the possibility of identifying any individual with a particular genetic disease. Mapping the true population prevalence of particular genetic diseases is of high value, as it allows for the assessment of the validity and accuracy of routine screening based on the analysis of biological samples.
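The anonymity constraint described above, suppressing counts that are too small at a given territorial level, can be illustrated with a simple aggregation sketch. The threshold value and function name are illustrative assumptions, not values prescribed by the GDPR or Eurostat.

```python
from collections import Counter

def regional_prevalence(case_regions, populations, threshold=5):
    """Aggregate detected cases into per-region prevalence per 100,000.

    case_regions: iterable of NUTS 2 region codes, one entry per case.
    populations:  dict mapping region code -> population size.
    threshold:    counts in (0, threshold) are suppressed (None), since
                  such small cells risk re-identifying individuals.
    """
    counts = Counter(case_regions)
    prevalence = {}
    for region, population in populations.items():
        n = counts.get(region, 0)
        if 0 < n < threshold:
            prevalence[region] = None  # suppressed for anonymity
        else:
            prevalence[region] = round(n / population * 100_000, 2)
    return prevalence
```

For example, a region with only two detected cases would be reported as suppressed, while a region with ten cases in a population of one million would be reported as 1.0 per 100,000.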
The second option is population screening for genetic disorders in order to adopt preventive measures for those at risk. In this scenario, identification would be necessary at a certain stage of the process.
In this scenario, data pseudonymization would be necessary to ensure the privacy of the screened population. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC ("GDPR"), governs the processing of personal data in the EU. GDPR Article 4 (5) defines pseudonymization as the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person. To pseudonymize the target database, the identity linked to the photo shall be replaced by some sort of code (e.g., alphanumeric). The identity and the assigned code can be kept in a separate database with secure access, which allows the identification of those at risk if preventive action is possible and necessary.
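A minimal sketch of this pseudonymization scheme, assuming a simple in-memory representation: each identity is replaced by a random alphanumeric code, and the code-to-identity mapping is returned as a separate structure that would, in practice, live in a separately secured database with restricted access. The function and field names are illustrative, not taken from any real NID system.

```python
import secrets

def pseudonymize_dataset(records):
    """Split records into a study database and a separate key database.

    records: list of dicts with 'identity' and 'photo_ref' fields.
    Returns (study_db, key_db):
      study_db - photo references plus pseudonymous codes only,
                 in the sense of GDPR Article 4 (5);
      key_db   - code -> identity mapping, to be stored separately
                 under strict access control.
    """
    study_db, key_db = [], {}
    for record in records:
        code = secrets.token_hex(8)   # random 16-character code
        while code in key_db:         # guard against unlikely collisions
            code = secrets.token_hex(8)
        study_db.append({"code": code, "photo_ref": record["photo_ref"]})
        key_db[code] = record["identity"]
    return study_db, key_db
```

The key property is that the study database alone cannot be attributed to a data subject; re-identification requires the separately held key database.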
The database of mandatory national ID schemes ("NID") seems to be the ideal candidate for population-wide testing. However, there are legal limitations and considerations for population screening in these databases. The European Commission has already proposed a new regulation to create a European Health Data Space ("EHDS"), which would allow researchers, innovators, policymakers, and regulators at the EU and member state levels to access relevant electronic health data to promote better diagnosis, treatment, and well-being of natural persons and lead to better and well-informed policies [20].

General Legal Limitations and Considerations
The presented concept, while already technically possible [with available technologies and means such as Face2Gene], cannot be implemented without considering applicable legislation. The Charter of Fundamental Rights of the European Union recognizes the protection of personal data as one of the fundamental freedoms. Charter Article 8 provides that personal data must be processed fairly for specified purposes and based on the consent of the person concerned or some other legitimate basis laid down by law. From the privacy legislation perspective (e.g., GDPR), facial photos are considered personal data as they relate to an identified or identifiable natural person. Considering the ultimate aim of the proposed study, to identify genetic disease, the photo shall be considered data concerning health, as it reveals information about health status. Facial images may also be deemed biometric data. Article 4 (13) of the GDPR defines genetic data as personal data relating to the inherited or acquired genetic characteristics of a natural person which give unique information about the physiology or the health of that natural person and which result, in particular, from an analysis of a biological sample from the natural person in question. However, GDPR Recital 34 notes that genetic data may also originate from the "analysis of another element enabling equivalent information to be obtained". Therefore, within the context of the intended aim, facial photos shall also be considered genetic data.
Due to the sensitive nature of health-related, biometric, and genetic data, their processing is subject to more stringent rules than the processing of personal data in general. We need to point out that the GDPR also has extraterritorial reach if personal data concern EU data subjects.
The importance of genetic data and the human genome has resulted in the adoption of several international documents and treaties, such as United Nations Economic and Social Council resolutions 2001/39 on Genetic Privacy and Non-Discrimination of 26 July 2001 and 2003/232 on Genetic Privacy and Non-Discrimination of 22 July 2003, UNESCO's Universal Declaration on the Human Genome and Human Rights [21], the International Declaration on Human Genetic Data [22], and the Council of Europe Additional Protocol to the Convention on Human Rights and Biomedicine, concerning Genetic Testing for Health Purposes ("APGT") [23].
From the GDPR perspective, processing of personal data shall be lawful and based on an appropriate Article 6 legal basis, including consent, a contract to which the data subject is party, a legal obligation, the protection of vital interests, the performance of a task carried out in the public interest, or the exercise of official authority. Health-related, biometric, and genetic data are deemed a special category of personal data subject to GDPR Article 9, which allows the processing of such data only under certain conditions. Our assessment of the suitability of data processing legal bases is presented in Table 1.
Access to the NID databases is strictly regulated. Such data are as accurate as possible and updated regularly. Only a fraction of the data stored in NID would be required for the prevalence study. The test data set shall follow the GDPR data minimization principle and be limited to necessary information only. The minimal dataset should consist of the following:
(a)	Photos and NUTS 2 location only, for the prevalence study;
(b)	Photos and the assigned pseudonymous code, for studies with an option to identify those at risk and facilitate preventive actions, with a separate database allowing identification when necessary.
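The data minimization requirement for the two dataset variants above can be expressed as a simple projection over a full NID record. The field names here are hypothetical placeholders for whatever the actual NID schema uses.

```python
def minimize(nid_record: dict, purpose: str) -> dict:
    """Project a full NID record to the minimal dataset for a purpose.

    'prevalence'      -> photo reference and NUTS 2 region only;
    'individual_risk' -> photo reference and pseudonymous code only.
    All other NID fields (name, address, birth date, ...) are dropped,
    in line with the GDPR data minimization principle.
    """
    if purpose == "prevalence":
        return {"photo_ref": nid_record["photo_ref"],
                "nuts2": nid_record["nuts2"]}
    if purpose == "individual_risk":
        return {"photo_ref": nid_record["photo_ref"],
                "code": nid_record["code"]}
    raise ValueError(f"unknown study purpose: {purpose!r}")
```

Rejecting any purpose other than the two defined here enforces, at the code level, that no broader extract of the NID record can be produced.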
Creating databases required for both prevalence study and identification of those at risk would require amending the NID legislation.
Table 1 demonstrates that, upon meeting certain conditions, it is legally feasible to perform both a population prevalence screening study and an individual risk study. For the prevalence study, the adoption of appropriate legislation will create a sufficient and suitable legal basis for personal data processing in compliance with the GDPR, in the form of a legal obligation, public interest, exercise of official authority, medical diagnosis, or scientific research. The individual risk study also requires the adoption of appropriate legislation but shall be accompanied by additional steps and measures. The least problematic would be an opt-in study design, where a subject or his/her legal representative consents to participate in the study, retaining the right to withdraw the consent at any time.
Having an appropriate legal basis will, however, not be enough to comply with EU privacy regulations. Such regulations will likely be the most important legal limitation for the intended studies, although others must be considered as well.
The GDPR contains a set of fundamental principles in Article 5. One of these principles is transparency, which requires that appropriate information is provided to the individual to ensure their understanding of who processes personal data and why. With an appropriate information campaign, especially in an opt-in study, this principle could be satisfied. Another principle is "purpose limitation", which requires that secondary uses of data are in line with the purpose for which such data were initially collected. In the case at hand, this may present a possible conflict, as even a population study could be considered incompatible with the initial purpose, following the requirements of the compatibility test under Article 6 (4) GDPR. On the other hand, processing in the public interest or for scientific research benefits from certain exemptions, which may, in turn, form the appropriate legal bases. We consider the "data minimization", "accuracy", and "integrity and confidentiality" principles as not imposing a significant hurdle, although they will have to be considered carefully in detail when designing the study(ies).
Any study (either a prevalence study, an individual risk study, or any other study in the scope of this article) will have to be subject to a detailed assessment of the risks. In particular, a data protection impact assessment will have to be conducted, which will evaluate the necessity and proportionality of the processing against its purpose and the rights of the data subjects, including the measures addressing security and mitigating the identified risks.
The specific evaluation may even present unexpected risks, as the application of the GDPR has, in some cases, reached extreme interpretations that may adversely impact the data processing that is the focus of this article. The judgment of the Court of Justice of the European Union in Case C-184/20 OT v Vyriausioji tarnybinės etikos komisija clarified that sensitive personal data under GDPR Article 9 should be interpreted rather broadly: even data that are liable to indirectly disclose the sexual orientation of a natural person are to be considered sensitive personal data. Following this judgment, it seems that even any initial processing of photographs to derive indirect information relating to genes or genetic diseases should be, from the outset, considered processing of sensitive data.
It remains unclear whether this interpretation would result in any processing of the same information being considered processing of sensitive personal data. Such an extreme interpretation might render the use of social media or even CCTV in public spaces impossible and would likely result in the re-categorization of a wide range of "normal" personal data (such as family photos) into sensitive genetic data. For the processing discussed within this article, it is sufficient to say that the underlying data, even if simple photographs, would be deemed sensitive genetic data from the initial point of collection.

The AI used for both the population prevalence study and the individual risk study is software designed and intended for the diagnosis, prevention, prediction, or prognosis of genetic disease. As such, it may be covered by the definition of a medical device under Article 2 (1) of Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on medical devices (MDR). While such software is used for research purposes only, MDR certification may not be necessary. Usage of the same software in a clinical or public health setting to diagnose rare diseases will require certification under the MDR as a medical device of class I or, eventually, class IIa.
We must note that AI will shortly become subject to EU regulation. The first proposal for an AI Act was published by the European Commission in April 2021 [24]. The Council of the European Union approved a compromise version of the proposed Artificial Intelligence Regulation (AI Act) on 6 December 2022 [25,26]. On 9 December 2023, the European Parliament reached a provisional agreement with the Council on the AI Act. The agreed text was adopted by Parliament on 13 March 2024 and will now have to be formally adopted by the Council to become EU law [27]. The AI Act is expected to come into force at the end of May 2024. In addition to the AI Act, the processing of health-related data may become subject to, and be facilitated by, the proposed EHDS regulation [20].
We have intentionally limited the legal considerations to those listed above, as going into further detail is beyond the scope of this paper [28].

Privacy Safeguards in the Study Design
The issue of personal privacy is critical when accessing National ID (NID) databases for research. While our approach emphasizes creating derived databases for enhanced privacy, it could be argued that preserving privacy means limiting access to the NID database in the first place. Privacy regulations often restrict such access, complicating the generation of synthetic data and comprehensive analysis. In practice, preserving privacy requires strict legal and ethical guidelines, advanced anonymization techniques, and limited, monitored access to data. These measures can make analysis difficult, but they are necessary to balance privacy with research needs.
The intended studies shall not be performed on the NID data itself but on databases specifically created and derived from it. Considering the sensitivity of the study databases, a security- and privacy-by-design approach shall be the leading principle. To minimize security concerns, two separate data sets shall be created: one for the population study and the other for the individual risk study. Both data sets shall be anonymized or pseudonymized to the extent that reaching the aim of the study permits.
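The pseudonymization step described above can be illustrated with a minimal sketch. The keyed-hash approach (HMAC-SHA-256) and the function and key names below are illustrative assumptions, not a specification of any deployed NID system.

```python
import hashlib
import hmac

def pseudonymize(national_id: str, secret_key: bytes) -> str:
    """Map a national ID to a stable pseudonym via a keyed one-way hash.

    The key is held only by the data controller; without it the mapping
    cannot be re-created, unlike a plain hash of the ID, which would be
    vulnerable to dictionary attacks on the small space of ID numbers.
    """
    return hmac.new(secret_key, national_id.encode("utf-8"),
                    hashlib.sha256).hexdigest()

# Illustrative key only; a real deployment would use a securely
# generated and securely stored key.
key = b"controller-held-secret"
p1 = pseudonymize("NID-1234567890", key)
p2 = pseudonymize("NID-1234567890", key)
p3 = pseudonymize("NID-0987654321", key)
assert p1 == p2  # stable: the same subject links consistently within a data set
assert p1 != p3  # distinct subjects remain distinct
```

A keyed hash rather than a plain hash is the essential design choice here: national ID numbers form a small, enumerable space, so an unkeyed hash could be reversed by exhaustive search.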
To minimize the risk of database compromise or leakage, appropriate security measures shall be implemented. Using cloud storage or outsourced IT solutions would significantly increase the risk of security breaches. It is important to note that violations of the integrity and confidentiality of personal data, as one of the fundamental principles of data processing, may result in significant administrative fines.
Furthermore, any data transfer outside of the country of origin will be highly undesirable, and transfer outside of the European Economic Area would also likely be problematic due to the international data transfer requirements set out under the GDPR. Dedicated IT infrastructure operated by EU member states or their dedicated agencies, such as public health authorities or national e-health operators, would likely be preferred over commercially available solutions. The tailored AI solution shall also be operated on such dedicated infrastructure, as sending sensitive personal data for processing via the internet to the AI right holder constitutes an additional risk. The Court of Justice of the European Union (CJEU) addressed such risks, e.g., in the Schrems II ruling (Data Protection Commissioner v. Facebook Ireland and Maximillian Schrems, Case C-311/18).
The legislation allowing the intended studies shall create an environment for fair and transparent processing of the test data sets. To achieve the highest attainable standard of data and cyber security, minimal security standards and required safeguards shall be integral parts of such legislation.

Cybersecurity Aspects and Specific Strategies for Mitigating Risks
The cybersecurity aspect of implementing AI-driven facial image analysis for rare disease detection and forensic applications is critical due to the sensitive nature of the data involved. Protecting facial images and genetic information from unauthorized access and data breaches is paramount. Specific strategies to mitigate these risks include implementing robust encryption methods for data storage and transmission and ensuring that sensitive data are encrypted both at rest and in transit.
One specific strategy is the use of homomorphic encryption, which allows data to be processed and analyzed without the need to decrypt it. This technique ensures that sensitive information remains secure even while computations are performed on it, significantly reducing the risk of exposure.
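As a minimal sketch of this principle, the additively homomorphic Paillier scheme allows encrypted values to be summed without decryption. The toy parameters below are for illustration only; a real deployment would rely on a vetted cryptographic library and key sizes of 2048 bits or more.

```python
import math
import random

# Textbook Paillier cryptosystem with deliberately tiny primes
# (insecure, illustrative only).
p, q = 1000003, 1000033          # toy primes; real keys use much larger primes
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)             # valid because the generator g = n + 1

def encrypt(m: int) -> int:
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    # c = g^m * r^n mod n^2, with g = n + 1 so g^m = 1 + m*n (mod n^2)
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2

def decrypt(c: int) -> int:
    x = pow(c, lam, n2)
    return (x - 1) // n * mu % n

# Homomorphic property: multiplying ciphertexts adds the plaintexts.
c_sum = encrypt(41) * encrypt(1) % n2
assert decrypt(c_sum) == 42      # sum computed without decrypting the inputs
```

In the scenario discussed here, a processor could, for instance, aggregate encrypted per-region case counts while only the key holder learns the total.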
In addition, the use of multi-factor authentication (MFA) for access to databases can add an extra layer of security and reduce the risk of unauthorized access. Regular cybersecurity audits and vulnerability assessments should be conducted to identify and address potential security gaps. Advanced threat detection systems, such as intrusion detection systems (IDS) and intrusion prevention systems (IPS), can help identify and mitigate potential cyber-attacks in real time.
Implementing differential privacy techniques can protect individual data by adding noise to the data set, making it more difficult to extract personal information. Ensuring strict access controls so that only authorized personnel have access to sensitive data is essential. Educating and training staff on cybersecurity best practices can further strengthen an organization's security posture.
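The noise-addition idea can be sketched with the standard Laplace mechanism for a counting query; the cohort, field name, and epsilon value below are illustrative assumptions.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise via the inverse-CDF method."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(records, predicate, epsilon: float) -> float:
    """Release a count under epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace noise with scale
    1/epsilon suffices.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Illustrative cohort: 10,000 records, 200 flagged with a hypothetical trait.
cohort = [{"flagged": i % 50 == 0} for i in range(10_000)]
noisy_prevalence = dp_count(cohort, lambda r: r["flagged"], epsilon=0.5)
```

Smaller epsilon values add more noise and give stronger privacy; the released figure remains useful for prevalence estimates while obscuring any single individual's contribution.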
Integrating these strategies, particularly the use of homomorphic encryption, can significantly mitigate the risks associated with handling sensitive data in AI-driven applications, ensuring the confidentiality, integrity, and availability of the data.

Discussion
Current advances in AI, comparable to historical paradigm shifts, are transforming all relevant fields, including medical education [29], and have a significant impact on various forensic applications, including human remains identification [30]. Omnipresent mobile devices can recognize 3D faces with increasing precision [31]. Various forensic applications can be significantly improved with AI, from morphometric analysis [32] and AI-based behavioral analyses and predictions in the prevention of violent or murderous behavior [33] to age estimation in minors [34].
Genome-wide association studies (GWAS) have investigated the association between normal facial variation and millions of single-nucleotide polymorphisms (SNPs). Over 50 loci associated with facial traits have already been identified [35].
The described AI technologies could enable nation-states or even private companies to perform automated population screening for certain genetic traits or disorders without the need to obtain any biological sample. The most significant barrier preventing this (ab)use of AI is privacy legislation, such as the European Union's General Data Protection Regulation (GDPR) [36,37]. From the GDPR perspective, consent, legal obligation, and public interest may be considered viable options for the selection of an appropriate legal basis. Therefore, the adoption of legislation allowing facial-based AI population screening for certain health/genetic traits is imaginable.
Facial-based population genetic screening using AI might initially be tested as a scientific research project. This would allow it to benefit from the research exception under GDPR Article 89(2), which allows the EU or a member state to derogate from the data subject rights referred to in Articles 15 (right of access), 16 (right to rectification), 18 (right to restriction of processing), and 21 (right to object). Recital 156 provides an even broader scope of derogations, including the information requirement, erasure, the right to be forgotten, and data portability [22].
The available AI applications in genetics open new possibilities. Population screening using facial AI recognition of genetic and health traits would be a cheap and effective tool to identify individuals at risk, with subsequent targeted prevention or health care provision, saving the limited resources of the health care system. Law enforcement and intelligence agencies are, in some cases, exempt from privacy-protecting legislation and may obtain access to, process, and combine almost any type of personal data in the interest of national security. Therefore, the dystopia of Orwell's 1984 could significantly underestimate the advanced surveillance modalities readily available in the 21st century [38][39][40].
Without a doubt, AI applications in genetics can create extreme risks to privacy and anonymity. AI-supported genetic applications can be used to discriminate against minorities based on race, ethnicity, or health traits. The ethical implications of AI-based genetic applications are still only partially understood and should be carefully analyzed to allow for effective legal regulation. Any such study shall be governed by the ethical principles of autonomy, beneficence, non-maleficence, and justice [41].
From a forensic perspective, the potential uses of AI-powered phenotyping for law enforcement cannot be overstated. The technology could revolutionize missing persons cases, helping investigators quickly match unidentified remains or photos of missing persons to facial scans in existing databases. Furthermore, facial recognition could aid in the identification of suspects from surveillance footage, providing valuable leads and potentially even allowing for identification based on genetic markers extracted via AI. However, ethical considerations are paramount in this domain. Questions about consent, accuracy in suspect identification, and the potential for biases that could lead to wrongful convictions must be carefully navigated to ensure the responsible implementation of such powerful technologies.
Population-wide screening for genetic disorders without the necessity of obtaining biological samples is already possible with available technologies such as Face2Gene [14,15]. This conclusion applies to both epidemiological and individual risk studies. The development and usage of such advanced technologies face significant legal limitations, especially in the heavily regulated EU landscape. AI for the proposed studies may, in certain applications, require certification as a medical device and will eventually also be subject to regulation by the AI Act, adopted on 13 March 2024 by the European Parliament. The AI Act provides for AI regulatory sandboxes: controlled environments that facilitate the development, testing, and validation of innovative AI systems for a limited time before their placement on the market or putting into service pursuant to a specific plan (Article 57 of the AI Act). It is noteworthy that Article 59 of the AI Act allows the processing of personal data lawfully collected for other purposes in the sandbox for developing and testing certain innovative AI systems, inter alia also in the area of public safety and public health, including disease prevention, control, and treatment [24,42].
The EU privacy legislation introduces additional significant barriers (or safeguards, depending on the point of view), but with properly drafted specific legislation, such studies could eventually be performed.
AI applied to big data sets derived from NID databases may generate unprecedented epidemiological data on genetic diseases and their development over time. The comparison of data obtained from present sources with data that may be obtained from the proposed studies may reveal discrepancies; in such cases, further analysis may be performed. With NID data on kinship, genetic diseases may be tracked over generations, which could bring valuable information from the public health point of view.
However, an AI algorithm is only as good as its training data set, and there is a particular risk where rare disorders affect only small numbers of people. There are several instances in which an unbalanced data set composition unintentionally created a biased AI [30]. A 2021 study on the performance of two prediction models for death by suicide after mental health visits found that the models accurately predicted suicide risk for visits of White, Hispanic, and Asian patients, but performance was poor for visits of Black and American Indian/Alaskan Native patients and patients without race/ethnicity reported [43,44]. Training data sets containing mostly Caucasian faces remain a concern. A 2017 study of children with an intellectual disability found that while Face2Gene's recognition rate for Down syndrome was 80% among white Belgian children, it was just 37% for black Congolese children [45]. The program's accuracy has improved slightly as more medical professionals upload patient photos to the app. There are now more than 200,000 images in the FDNA database [14].
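A per-group performance audit of the kind this disparity calls for can be sketched as follows. The group labels and counts are hypothetical, chosen only to echo the 80% vs. 37% recognition rates reported in [45].

```python
from collections import defaultdict

def recognition_rate_by_group(records):
    """Compute per-group recognition rates from (group, correct) pairs.

    The gap between the best- and worst-served groups is a simple
    audit signal for training data set imbalance.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for group, correct in records:
        totals[group] += 1
        hits[group] += int(correct)
    return {g: hits[g] / totals[g] for g in totals}

# Hypothetical audit data: 80/100 correct for one group, 37/100 for another.
records = ([("group_A", True)] * 80 + [("group_A", False)] * 20 +
           [("group_B", True)] * 37 + [("group_B", False)] * 63)
rates = recognition_rate_by_group(records)
gap = max(rates.values()) - min(rates.values())
assert rates["group_A"] == 0.80 and rates["group_B"] == 0.37
assert gap > 0.10  # large gap: flag the model for retraining on more diverse data
```

Running such an audit before deployment, and again as new images are added, makes disparate performance across demographic groups visible early.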
The mere possibility of the studies does not necessarily mean such studies should be implemented without further discussion. While we have presented significant, potentially beneficial use cases, taking ethical aspects into consideration will be necessary in the future. One such ethical dilemma is whether a person who is an unwitting carrier of a certain genetic disorder would like to obtain information on his or her carrier status. With properly provided advance information and duly obtained informed consent, such a dilemma seems easily resolvable, but there may be several arguments to the contrary. AI's transformative power extends beyond genetic diseases. A recent publication in Forensic Sciences by Brown (2023) [46] highlighted AI applications in other forensic domains. In that paper, AI is considered a solution to the marijuana controversies, including the forensics of abuse. The paper similarly discusses legal and ethical challenges (e.g., concerns about accuracy, bias impacting prosecution, and the need for rigorous standards). We agree with the need for robust ethical guidelines and regulations governing innovative AI applications across the forensic landscape. The authors discussed these ideas in their preliminary preprint by Kováč et al. (2023), Phenotyping Genetic Diseases Through Artificial Intelligence Use of Large Datasets of Government-stored Facial Photographs: Concept, Legal Issues, and Challenges in the European Union [47].
While the Face2Gene app represents a significant advancement in AI-driven facial image analysis for the early detection of rare diseases, it also has several limitations. Legally, using the app requires navigating complex regulations around patient data privacy and consent, particularly across different jurisdictions. Ethically, there are concerns about potential biases in the AI algorithms that could lead to misdiagnosis or unequal access to accurate diagnosis for different populations. Forensically, the integration of the app into clinical practice must ensure rigorous standards to avoid misuse or over-reliance on AI without adequate human oversight. Finally, from a cybersecurity perspective, protecting the sensitive health data used and generated by the app is paramount to prevent unauthorized access and data breaches that could compromise patient confidentiality and trust. Similar considerations apply to any rapidly evolving AI agents used in other steps of the presented concept, including data anonymization. To enrich the discussion on the ethical and legal issues surrounding AI-driven facial image analysis for rare disease detection, the following two examples can be considered. They illustrate the practical implications and challenges of deploying such analysis in different legal and ethical landscapes:

Application in the United States: Integration with National Health Databases
In the United States, the integration of AI-driven facial recognition tools like Face2Gene with national health databases poses significant legal and ethical challenges. One potential application is using these tools for the early screening of genetic disorders in newborns. While this could revolutionize early diagnosis and treatment, it raises concerns about compliance with the Health Insurance Portability and Accountability Act (HIPAA) and other state-specific privacy laws. These regulations require stringent measures to protect patient data and ensure consent. Additionally, the risk of data breaches and misuse of sensitive genetic information by unauthorized parties necessitates robust cybersecurity protocols. The ethical dilemma of potential biases in AI algorithms, leading to misdiagnoses or unequal treatment across different racial and ethnic groups, also demands careful consideration and mitigation strategies.

Application in the European Union: Cross-Border Data Sharing and GDPR Compliance
Within the European Union, AI-driven facial image analysis for rare disease detection could be applied through collaborative cross-border health initiatives aimed at improving diagnostic accuracy and sharing genetic research data. However, this application faces substantial legal hurdles due to the General Data Protection Regulation (GDPR). GDPR imposes strict requirements on data privacy and cross-border data transfers, necessitating comprehensive data protection impact assessments and ensuring explicit informed consent from individuals. Challenges include harmonizing diverse national regulations within the EU member states and addressing potential conflicts with GDPR's principles of data minimization and purpose limitation. Moreover, ethical concerns about the potential for AI to perpetuate biases and the need for transparent, explainable AI models are crucial to maintaining public trust and achieving equitable healthcare outcomes across different regions.
Future research could create more accurate, fair, and ethically sound AI-driven facial image analysis technologies. It should prioritize the following areas:

Conclusions
AI-based phenotyping has profound potential for enhancing public health initiatives through the early detection of rare genetic diseases. However, this technology also carries significant implications for the forensic domain. Its ability to identify individuals by their facial features presents both potential benefits and ethical challenges for law enforcement.

Figure 1. The Conceptual System Model shows the key interconnected elements and processes involved in AI-powered facial analysis for rare disease detection, emphasizing the legal, ethical, and security aspects.

Figure 2 shows the flow of data and decisions through the system, highlighting the critical stages and safeguards. The proposed model is flexible, allowing for updates and enhancements based on feedback and evolving AI regulations.

Figure 2. The Workflow-Based System Model shows the flow of data and decisions through the system, highlighting the critical stages and safeguards.

• Knowledge repositories on genetic diseases and corresponding facial phenotypes;
• Applicable legislation and regulations (such as GDPR, AI Act, etc.);
• AI algorithms designed for facial image analysis;
• Processes for anonymization and pseudonymization;
• Data storage and access control mechanisms;
• Decision-making frameworks (e.g., identifying cases for further evaluation);
• Communication channels (e.g., dissemination of results for public health interventions).

Outputs:
• Data on the prevalence of rare genetic diseases;
• Identification of individuals at risk;
• Potential findings for forensic applications;
• Policy recommendations for lawmakers;
• Insights for developers in technology and security sectors.

• Diverse data collection: Collect comprehensive datasets from underrepresented populations, including different ethnicities, ages, and genders, to reduce bias in AI models;
• Bias detection and mitigation: Develop and implement fairness-aware algorithms and bias correction techniques. Use transfer learning to improve model generalizability and reduce bias;
• Explainable AI (XAI): Focus on building models that provide transparent and interpretable results to build trust and facilitate bias identification;
• Robust validation frameworks: Establish rigorous validation protocols that assess AI models across diverse demographic groups before deployment;
• Multidisciplinary collaboration: Encourage collaboration between geneticists, data scientists, ethicists, and legal experts to ensure comprehensive evaluation and ethical deployment;
• Standardized protocols: Develop standardized data collection, model training, and validation protocols across jurisdictions to improve reliability and acceptability.