A Conditional Privacy-Preserving Identity-Authentication Scheme for Federated Learning in the Internet of Vehicles

With the rapid development of artificial intelligence and Internet of Things (IoT) technologies, automotive companies are integrating federated learning into connected vehicles to provide users with smarter services. Federated learning enables vehicles to collaboratively train a global model without sharing sensitive local data, thereby mitigating privacy risks. However, the dynamic and open nature of the Internet of Vehicles (IoV) makes it vulnerable to potential attacks, where attackers may intercept or tamper with transmitted local model parameters, compromising their integrity and exposing user privacy. Although existing solutions like differential privacy and encryption can address these issues, they may reduce data usability or increase computational complexity. To tackle these challenges, we propose a conditional privacy-preserving identity-authentication scheme, CPPA-SM2, to provide privacy protection for federated learning. Unlike existing methods, CPPA-SM2 allows vehicles to participate in training anonymously, thereby achieving efficient privacy protection. Performance evaluations and experimental results demonstrate that, compared to state-of-the-art schemes, CPPA-SM2 significantly reduces the overhead of signing, verification and communication while achieving more security features.


Introduction
With the rapid development of intelligent transportation systems and Internet of Things (IoT) technology, the Internet of Vehicles (IoV) has become an essential component of smart cities [1]. IoV enables real-time sharing of traffic information and intelligent coordination of vehicles through communication between vehicles and between vehicles and infrastructure. Additionally, with the advancement of machine learning technology, many automotive companies are leveraging machine learning in the IoV to provide more intelligent and efficient services to users [2]. By collecting a large amount of vehicle data to train models, they offer applications such as autonomous driving and traffic flow prediction [3]. However, traditional centralized model training requires gathering vehicle data to the central server for training. Since this vehicle data often contains a significant amount of personal information, such as driving habits, travel routes, home and work locations, many users are concerned about privacy breaches and are reluctant to send their data to the central server [4]. Moreover, recent data security regulations prohibit automotive companies from collecting user data without authorization. To address these privacy concerns, federated learning (FL) has emerged as a solution [5]. FL is a decentralized machine learning approach where multiple clients (such as smartphones, vehicles or other devices) collaboratively train a shared model under the orchestration of a central server while keeping the data localized [6]. Instead of sending raw data to a central server, each client processes the data locally and only shares the model updates (like gradients or parameters) with the central server. The server then aggregates these updates to form a global model. Currently, FL has been widely applied in various IoV scenarios, such as trajectory prediction, advanced driver-assistance systems and traffic flow prediction and management [7].
Although FL addresses the issue of data silos, researchers have found that without proper protection of the transmitted model parameters, attackers can still infer privacy information about user data [8]. Additionally, during the aggregation of parameters by the central server, there is a risk that the server may attempt to infer original data information from the uploaded model parameters. Moreover, due to the open nature of the IoV, attackers can easily eavesdrop on and manipulate messages transmitted between vehicles, gaining access to the vehicles' real identities and further tracking their behaviors, posing a threat to user privacy [9].
To address the issue of privacy leakage in federated learning, existing solutions are mainly categorized into differential privacy (DP) [10][11][12] and encryption techniques [13][14][15][16][17][18]. DP protects the privacy of original data by adding random noise to model parameters. Wei et al. [10] proposed a differential privacy-based federated learning framework, which achieves different levels of differential privacy protection by adding artificial noise to client parameters before aggregation. Zhao et al. [11] combined DP with federated learning, proposing four localized differential privacy mechanisms to perturb gradients generated by vehicles, thereby preventing privacy leakage. Zhou et al. [12] achieved high-level privacy protection by adding noise and theoretically proved the convergence of their algorithm. Although DP-based solutions have been applied to a broad range of machine learning algorithms, including deep learning, the added random noise can degrade model accuracy and extend the model convergence time. Encryption-based solutions can be divided into homomorphic encryption and secure multiparty computation (SMC). Zhou et al. [13] combined differential privacy, blinding and Paillier homomorphic encryption to resist model attacks and achieve secure aggregation of model parameters. Ma et al. [14] proposed a dual-trapdoor homomorphic encryption scheme, ShieldFL, which can defend against model poisoning attacks and protect privacy. They also introduced a secure cosine similarity method for Byzantine-robust aggregation. Hijazi et al. [15] introduced four different fully homomorphic encryption (FHE)-based methods for FL, which transmit model parameters in encrypted form, thereby strengthening privacy and security protection. A typical example of SMC is secret sharing. Zhang et al. [16] presented a lightweight dual-server secure aggregation protocol based on secret sharing, achieving both privacy protection and Byzantine robustness. This method reduces computational overhead compared to homomorphic encryption but increases the number of communication rounds and communication overhead, thereby hindering the training efficiency of federated learning. Furthermore, encryption-based solutions prevent the cloud server from directly accessing plaintext local model parameters during aggregation. This hinders integration with Byzantine-robust federated learning defense mechanisms [17,18], as existing Byzantine-robust defense mechanisms focus on computing similarities directly on plaintext model parameters. Therefore, it is necessary to research a privacy-preserving federated learning solution suitable for the IoV that can balance efficiency and practicality.
To ensure the authenticity and integrity of communication data in the IoV, many identity-authentication protocols have been proposed [19]. Currently, existing identity-authentication protocols in the IoV can be primarily categorized into three types: public key infrastructure-based (PKI-based) [20], identity-based (ID-based) [21][22][23][24] and certificateless-based [25][26][27][28]. PKI-based identity-authentication protocols bind a vehicle's identity to its public key through digital certificates. Vehicles use their private keys to sign messages, and verifiers use the public keys from the vehicle's digital certificates to verify the signatures. The main drawback of this method is the significant storage and maintenance overhead associated with managing a large number of digital certificates and certificate revocation lists. Identity-based authentication protocols directly use the vehicle's identity information as the public key, thereby avoiding the overhead of certificate management and maintenance. Zhao et al. [22] proposed an identity-based federated learning collaborative authentication protocol for shared data, achieving efficient anonymous authentication and key agreement between vehicles and other entities. Zhang et al. [23] proposed an ID-based conditional privacy-preserving identity-authentication scheme that does not require bilinear pairings or hash-to-point operations, enabling efficient vehicle authentication. Kanchan et al. [24] proposed a federated learning algorithm based on group signatures, enhancing the protection of node identities. Although ID-based identity-authentication schemes can achieve efficient vehicle authentication, they suffer from the key escrow problem, as the Trusted Authority (TA) has full control over the vehicles' private keys and can generate legitimate signatures for any vehicle. To address the key escrow issue, certificateless authentication protocols have been proposed as a promising solution. In these protocols, a vehicle's private key consists of two parts: one part is a secret value selected by the vehicle itself, and the other part is a partial private key generated by TA. Lin et al. [25] proposed a certificateless authentication and key agreement protocol for IoV based on blockchain. This protocol utilizes the decentralized architecture of blockchain to achieve decentralized trusted third-party services, thus mitigating issues such as single-point failure and the risk of trusted third-party disclosure. It aims to achieve efficient authentication between vehicles. Jiang et al. [26] proposed a certificateless anonymous identity-authentication scheme, which aims to anonymize the relationship between terminal identities and data. However, the use of bilinear pairing operations affects authentication efficiency. Ma et al. [27] extended Jiang's work by proposing a certificateless identity-authentication scheme that does not require bilinear pairing operations and supports batch verification. However, this scheme lacks dynamic member-management capabilities, and the pseudonyms generated by vehicles cannot be dynamically updated. Currently, most existing certificateless authentication protocols use bilinear pairing operations or do not support batch verification, leading to low authentication efficiency. Additionally, most certificateless authentication protocols are independently designed and are not integrated with existing international standard cryptographic algorithms, making them inconvenient for practical application and widespread adoption. Therefore, it is necessary to study an efficient authentication protocol to establish a secure communication environment for the IoV.
To address the aforementioned challenges, we propose a conditional privacy-preserving authentication scheme called CPPA-SM2, which provides secure authentication and privacy protection for vehicle communication and federated learning in the IoV. Specifically, the scheme rests on the observation that if vehicles send messages and participate in training anonymously, then even if attackers or the cloud server obtain the plaintext local model parameters and infer some data information, they cannot associate this information with a specific real vehicle identity, thus achieving privacy protection. Our main contributions are as follows:
• When a malicious vehicle is detected in the system, TA can use the system master secret key to trace its real identity and then revoke it from the federated learning system.
• We conducted a security proof and an informal security analysis of the CPPA-SM2 scheme. Additionally, we evaluated its performance through experiments and compared it with other schemes. The experimental results show that CPPA-SM2 can achieve efficient and secure authentication for vehicles while providing privacy protection for federated learning.
The remainder of this paper is organized as follows. Section 2 presents the notation definitions, mathematical background, system model, threat model, security model and design objectives. Section 3 details the implementation of the CPPA-SM2 scheme. Section 4 provides the correctness and security proof of the CPPA-SM2 scheme along with an informal security analysis. Section 5 evaluates the performance of the CPPA-SM2 scheme and compares it with other schemes. Section 6 concludes the paper.

Preliminaries
In this section, we mainly introduce the preliminary knowledge, system model, threat model, security model and design goals. The relevant symbols used in this paper are explained in Table 1.

Chinese Remainder Theorem
The Chinese Remainder Theorem (CRT) [23,28] is a theorem of number theory that allows one to solve systems of simultaneous congruences with different moduli. It asserts that if one knows the remainders of the division of an integer by several pairwise coprime integers, then one can uniquely determine the remainder of the division of that integer by the product of these integers.
Let $sk_1, sk_2, \ldots, sk_n$ be pairwise coprime positive integers and $l_1, l_2, \ldots, l_n$ be any given n positive integers. Then, CRT asserts that the system of simultaneous congruences
$X \equiv l_1 \bmod sk_1,\quad X \equiv l_2 \bmod sk_2,\quad \ldots,\quad X \equiv l_n \bmod sk_n$ (1)
has a unique solution X modulo θ, where $\theta = sk_1 sk_2 \cdots sk_n = \prod_{i=1}^{n} sk_i$, and X can be obtained by the following equation:
$X = \sum_{i=1}^{n} l_i a_i b_i \bmod \theta$,
where $a_i = \theta / sk_i$ and $b_i = a_i^{-1} \bmod sk_i$.
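As a concrete illustration of the reconstruction formula, the following minimal Python sketch solves a small system of congruences. The function name `crt_solve` and the toy moduli are illustrative choices and are not part of the scheme.

```python
from math import prod

def crt_solve(residues, moduli):
    """Return X with X = l_i (mod sk_i) for all i; moduli must be pairwise coprime."""
    theta = prod(moduli)                 # theta = sk_1 * sk_2 * ... * sk_n
    x = 0
    for l_i, sk_i in zip(residues, moduli):
        a_i = theta // sk_i              # a_i = theta / sk_i
        b_i = pow(a_i, -1, sk_i)         # b_i = a_i^{-1} mod sk_i (Python 3.8+)
        x += l_i * a_i * b_i
    return x % theta

# X = 2 (mod 3), X = 3 (mod 5), X = 2 (mod 7)
x = crt_solve([2, 3, 2], [3, 5, 7])
print(x)  # 23
```

The solution is unique only modulo θ = 105 here, so 23, 128, 233 and so on all satisfy the same three congruences.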

Elliptic Curve Cryptosystem
Consider a finite field $F_p$ determined by a prime number p. Let $E(F_p)$ be the set of elliptic curve points over $F_p$ defined by the equation $y^2 = x^3 + ax + b \bmod p$, where $a, b \in F_p$ and $(4a^3 + 27b^2) \bmod p \neq 0$. The elliptic curve $E(F_p)$ supports both point addition and scalar multiplication. $\mathbb{G}$ is an additive cyclic group of order q on $E(F_p)$. The Elliptic Curve Discrete Logarithm Problem (ECDLP) is defined as follows: Given two points $P, Q \in \mathbb{G}$, where $Q = xP$ and $x \in Z^*_q$, computing x from P and Q is believed to be computationally hard. In other words, no known algorithm finds x in polynomial time with non-negligible probability [29,30].
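To make the group operations concrete, the sketch below implements point addition and double-and-add scalar multiplication on a textbook-sized curve, y² = x³ + 2x + 2 over F_17 with base point G = (5, 1) of order 19. These parameters and all helper names are illustrative assumptions chosen for readability; they offer no security, and the exhaustive search at the end succeeds only because the group is tiny.

```python
# Toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative parameters, NOT secure).
P_MOD, A, B = 17, 2, 2
G = (5, 1)       # base point
Q_ORDER = 19     # order of G

def on_curve(pt):
    """True if pt satisfies the curve equation (None is the point at infinity)."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x ** 3 + A * x + B)) % P_MOD == 0

def ec_add(p1, p2):
    """Point addition, including doubling and the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                                   # P + (-P) = infinity
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, pt):
    """Scalar multiplication k*pt by double-and-add."""
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return result

def brute_force_dlog(target):
    """Recover x with target = x*G by trying every scalar (toy groups only)."""
    acc = None
    for x in range(1, Q_ORDER):
        acc = ec_add(acc, G)
        if acc == target:
            return x
    return None

Q = ec_mul(7, G)
print(Q, brute_force_dlog(Q))
```

On a curve of cryptographic size, the loop in `brute_force_dlog` would need on the order of q iterations, which is what the ECDLP hardness assumption rules out in practice.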

SM2 Digital Signature Algorithm
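The SM2 digital signature algorithm [31] is a public key cryptographic algorithm based on elliptic curve cryptography, developed by the Chinese State Cryptography Administration. It is part of the Chinese National Standards (GB/T 32918) [32] and is widely used for secure communications in China. The SM2 digital signature algorithm consists of three main phases: Key Generation, Signature Generation and Signature Verification.
In Signature Verification, as specified in the standard, the verifier B first checks that the received values r and s both lie in [1, q − 1]; if not, B outputs false and exits. Then B computes the message digest e, $t = (r + s) \bmod q$ (outputting false if t = 0), the point $(x'_1, y'_1) = s \cdot G + t \cdot P_A$, where $P_A$ denotes the signer's public key, and $R = (e + x'_1) \bmod q$. If R = r, B outputs true; otherwise, it outputs false.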
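The three SM2 phases can be sketched on a toy curve as follows. This is a minimal illustration of the standard SM2 signing and verification equations under stated assumptions: the curve (y² = x³ + 2x + 2 over F_17, base point of order 19) is far too small to be secure, SHA-256 stands in for the SM3 hash, the digest omits the identity-dependent prefix that SM2 prepends to the message, and all function names are hypothetical.

```python
import hashlib
import secrets

P_MOD, A = 17, 2          # toy curve y^2 = x^3 + 2x + 2 over F_17 (NOT secure)
G, N = (5, 1), 19         # base point and its order

def ec_add(p1, p2):
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, pt):
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return result

def digest(msg: bytes) -> int:
    # Assumption: SHA-256 replaces SM3, and no identity prefix is hashed.
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % N

def keygen():
    d = secrets.randbelow(N - 2) + 1        # d in [1, N-2], so 1 + d is invertible
    return d, ec_mul(d, G)

def sign(d, msg):
    e = digest(msg)
    while True:
        k = secrets.randbelow(N - 1) + 1
        x1, _ = ec_mul(k, G)
        r = (e + x1) % N
        if r == 0 or r + k == N:
            continue                        # retry with a fresh nonce
        s = pow(1 + d, -1, N) * (k - r * d) % N
        if s != 0:
            return r, s

def verify(pub, msg, sig):
    r, s = sig
    if not (1 <= r < N and 1 <= s < N):
        return False
    t = (r + s) % N
    if t == 0:
        return False
    pt = ec_add(ec_mul(s, G), ec_mul(t, pub))   # equals k*G for a valid signature
    return pt is not None and (digest(msg) + pt[0]) % N == r

d, pub = keygen()
sig = sign(d, b"local model parameters")
print(verify(pub, b"local model parameters", sig))  # True
```

Verification works because s·G + t·P_A = (s(1 + d) + r·d)·G = k·G, so the verifier recovers the same x-coordinate the signer used. With a group of only 19 elements, a tampered message is accepted by chance fairly often, which is why the sketch makes no claim about forgery resistance.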

System Model
In the IoV, a federated learning system primarily includes four entities: a trusted authority (TA), cloud server (CS), roadside units (RSUs) and vehicles, as shown in Figure 1.
TA: This is a trusted third party, typically the traffic-management department. It is primarily responsible for system initialization, registration of vehicles and RSUs, generating related keys for them and managing identities. In this paper, when a malicious vehicle uploads false local model parameters or forges identity information, the TA can trace its real identity and revoke it from the system.
Vehicles: These are the data owners and participants in federated learning. They use their locally collected data to train the global model received from CS, and then upload the local model parameters. In this paper, vehicles participate in federated learning using pseudonyms, sign the locally trained model parameters and then send them to the nearby RSU.
RSUs: These verify the authenticity and integrity of the local model parameters uploaded by vehicles. They use the FedAvg algorithm [5] to perform local aggregation on these parameters to obtain local aggregation results, which are then uploaded to the cloud server for global aggregation. Additionally, they broadcast the global model issued by CS to the vehicles within their communication range.
CS: Upon receiving the local aggregation results uploaded by RSUs, CS uses FedAvg to perform global aggregation to obtain the global model for the next round of training. The new global model is then distributed to the vehicles to begin the next training round. Through multiple iterations, the performance of the global model can be improved, enabling the cloud server to utilize the results for practical predictions, judgments and applications.
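The two-level aggregation performed by RSUs and CS can be sketched as a weighted average. The example assumes that each update is weighted by its number of training samples, as in the usual FedAvg formulation, and that an RSU forwards its total sample count together with its local result; the paper does not spell out these details, and the function name `fed_avg` is illustrative.

```python
def fed_avg(updates):
    """Weighted average of parameter vectors.

    updates: list of (params, num_samples), where params is a list of floats.
    Returns (aggregated_params, total_samples) so the result can be fed
    into a further aggregation level.
    """
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    agg = [sum(params[j] * n for params, n in updates) / total for j in range(dim)]
    return agg, total

# Vehicles under RSU 1 and RSU 2 (toy two-parameter models).
rsu1 = fed_avg([([1.0, 2.0], 10), ([3.0, 4.0], 30)])   # local aggregation at RSU 1
rsu2 = fed_avg([([5.0, 6.0], 20)])                      # local aggregation at RSU 2

# Global aggregation at CS over the RSU results.
global_model, total = fed_avg([rsu1, rsu2])
print(rsu1[0], global_model)
```

Because each level carries its sample count forward, aggregating per RSU and then at CS gives the same result as averaging all vehicle updates in one step.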

Threat Model and Security Model
In the threat model, CS and RSUs are considered honest-but-curious. This means they will honestly follow the protocol to verify vehicle identities and the authenticity and integrity of model parameters, and they will aggregate local models to obtain the global model [33]. However, they are curious about the private data owned by the vehicles and may attempt to recover the vehicles' original data and reveal their true identities by analyzing the received model parameters. Therefore, they might pose a threat to vehicle privacy. Vehicles may be malicious and can launch free-riding attacks and data-poisoning attacks by uploading false model parameters. They may also forge identities and signatures to attempt to have fake messages successfully authenticated by RSUs. Additionally, they might try to infer the privacy information of other vehicles. Attackers can fully control the wireless communication channels between vehicles, RSUs, TA and CS. They can intercept messages on the channel, tamper with messages, replay old messages and attempt to impersonate other vehicles to send messages [34].
Based on the aforementioned threats and the certificateless signature security model [27,28,30], our proposed security model is as follows. The hash functions used in this model are assumed to be random oracles.
In the security model, we consider two types of adversaries, $A_I$ and $A_{II}$. $A_I$ can launch public key-replacement attacks but cannot access the system master secret key s. $A_{II}$ can access the system master secret key but cannot perform public key-replacement attacks. Both types of adversaries will engage in two separate games with the challenger C.
Game 1: This security game is executed between $A_I$ and C. C initializes the system using the security parameter λ, generating the system master secret key s and the system public parameters param. C secretly keeps s and sends the public parameters to $A_I$. $A_I$ can perform the following queries.
Definition 1. CPPA-SM2 is existentially unforgeable under adaptive chosen-identity and chosen-message attacks if no polynomial-time adversary $A_I$ can win the above game with non-negligible advantage.
Game 2: This security game is executed between $A_{II}$ and C. C initializes the system using the security parameter λ, generating the system master secret key s and the system public parameters param. C sends them to $A_{II}$.
- Query: $A_{II}$ can perform all the queries from Game 1 except for Public-Key-Replace-queries.
- Forgery: Once $A_{II}$ has completed the desired queries, it outputs a forged tuple.
Definition 2. CPPA-SM2 is existentially unforgeable under adaptive chosen-identity and chosen-message attacks if no polynomial-time adversary $A_{II}$ can win the above game with non-negligible advantage.

Design Goals
Under the security model, CPPA-SM2 primarily has the following design goals:
Anonymity and Privacy-Preserving: CPPA-SM2 should protect the privacy of vehicles participating in federated learning training. No entity other than TA should be able to infer the true identity of the vehicles.
Authenticity and Integrity: CPPA-SM2 should ensure that the local model parameters received by RSUs are from legitimate vehicles and that they have not been tampered with during transmission.
Un-linkability: Attackers cannot link any two messages sent by the same vehicle.
Un-forgeability: Attackers cannot forge a signature of another vehicle on a message such that RSUs would successfully verify it.
Non-repudiation: Once a vehicle uploads local model parameters and they are authenticated, the vehicle cannot deny its contribution to the global model.
Forward Security: When a vehicle joins a group, it cannot access communications that occurred before its joining, meaning it cannot participate in previous federated learning training processes of the group.
Backward Security: When a vehicle leaves the group or is revoked by the TA, it cannot participate in the current model training process or access communications that occur after its departure from the group.
In addition to achieving the aforementioned security goals, CPPA-SM2 should also offer high authentication efficiency and low communication overhead to adapt to the communication environment of IoV. In particular, when a large number of vehicles participate in federated learning training, RSUs should be able to authenticate them in batches.

The Proposed Scheme
In this section, we present a certificateless conditional privacy-preserving identity-authentication protocol based on CRT and the SM2 digital signature algorithm, named CPPA-SM2. CPPA-SM2 aims to provide privacy protection for vehicles participating in federated learning. It consists of five phases: system initialization, registration, message signing, message verification and group member management. First, TA initializes the system and publishes the system's public parameters. Then, vehicles and RSUs register with TA before participating in communications. Through registration, they obtain the public and private keys required for subsequent communications. In the message signing phase, vehicles train a model based on their local datasets and then sign the local model parameters before sending them to RSU. RSU, upon receiving the local model parameters from nearby vehicles, verifies the signatures and aggregates the verified local model parameters to obtain a local aggregation result. RSU then sends this local aggregation result to CS for global aggregation, resulting in the next round of the global model. If a malicious vehicle is detected uploading malicious model parameters or forging signatures, TA can trace its identity and revoke it from the system. The overall workflow of CPPA-SM2 is illustrated in Figure 2 and Protocol 1. The details of the scheme are as follows.
[Protocol 1: overall workflow of CPPA-SM2, covering system initialization, registration, group key generation, message signing and message verification; each step is detailed in the subsections below.]

System Initialization
TA uses a security parameter λ to generate two large prime numbers p and q, where p > q, q ≤ ⌈p/4⌉. Let $E(F_p)$ denote an elliptic curve over the finite field $F_p$ and G denote a base point on the elliptic curve $E(F_p)$ with order q. Let $\mathbb{G}$ be an additive cyclic group generated by G. TA randomly selects $s \in Z^*_q$ as the system master secret key and calculates the system public key $P_{pub} = s \cdot G$. Then, TA chooses five one-way hash functions $H_1, H_2, H_3, H_4, H_5$. TA secretly holds s and publishes the system's public parameters $param = \{p, q, E(F_p), \mathbb{G}, G, Z^*_q, P_{pub}, H_1, H_2, H_3, H_4, H_5\}$.

Registration
In the registration phase, both vehicles and RSUs need to register with TA to obtain the relevant keys for subsequent communications. We assume that TA is fully trusted and that the entire registration phase is conducted over a secure channel, eliminating the risk of privacy leaks and security attacks.

Vehicle Registration
For a vehicle $V_i$ with its real identity $RID_i$, it first randomly selects $x_i \in Z^*_q$ as its secret value and calculates $X_i = x_i \cdot G$ as the first part of its public key. Then, $V_i$ sends $(RID_i, X_i)$ to TA. Upon receiving $(RID_i, X_i)$, TA calculates $y_i$ and $Y_i$, which serve as $V_i$'s partial private key and the second part of its public key, respectively. In addition, TA randomly selects a prime number $sk_i \in Z^*_q$ as a secret key for $V_i$. After completing these computations, TA returns $y_i$, $Y_i$ and $sk_i$ to $V_i$. Upon receiving $y_i$, $Y_i$ and $sk_i$, $V_i$ sets $(x_i, y_i)$ as its full private key and $(X_i, Y_i)$ as its full public key, and uses $sk_i$ for subsequent group communications.

RSU Registration
For a roadside unit $RSU_j$ with its identity $ID_{RSU_j}$, TA generates a pair of public and private keys $(sk_{RSU_j}, pk_{RSU_j})$. Then, TA distributes them to $RSU_j$. Here, we assume that all vehicles know the public keys of TA and RSUs.

Group Key Generate
To ensure that the uploaded local model parameters come from legitimate vehicles and to support efficient group communication, TA constructs a communication group $C_n$ for them based on the secret keys $sk_i$ of n vehicles and CRT. TA first calculates $\theta = \prod_{i=1}^{n} sk_i$, $a_i = \theta / sk_i$, $b_i = a_i^{-1} \bmod sk_i$, $c_i = a_i b_i$ and $u = \sum_{i=1}^{n} c_i$. Then, TA randomly picks a group key $K \in Z^*_q$ and calculates the group public key $\beta = K \cdot u$ and $D_{pub} = K \cdot G$. TA signs β, $D_{pub}$ and the valid period $T_K$ of K using its private key $sk_{TA}$ and broadcasts the information $\{\beta, D_{pub}, SIG_{sk_{TA}}(\beta \| D_{pub} \| T_K)\}$ to vehicles and RSUs in $C_n$. Once receiving the broadcast information, any authorized vehicle in $C_n$ can obtain K by performing a modulus operation $K \equiv \beta \bmod sk_i$ according to CRT.
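The group key distribution can be sketched with ordinary integers. The example assumes that u is the sum of the CRT terms c_i = a_i · b_i (with a_i = θ/sk_i and b_i = a_i^{-1} mod sk_i), which is consistent with the revocation step u′ = u − c_i described later, and that K is smaller than every sk_i so that the modular reduction returns K itself. The elliptic-curve value D_pub and TA's signature on the broadcast are omitted; all names and numbers are illustrative.

```python
from math import prod

def crt_terms(secret_keys):
    """Return {sk_i: c_i} with c_i = a_i * b_i, so c_i = 1 mod sk_i and 0 mod sk_j (j != i)."""
    theta = prod(secret_keys)
    terms = {}
    for sk in secret_keys:
        a = theta // sk
        b = pow(a, -1, sk)
        terms[sk] = a * b
    return terms

def make_broadcast(group_key, secret_keys):
    """TA side: beta = K * u, where u is the sum of all CRT terms."""
    u = sum(crt_terms(secret_keys).values())
    return group_key * u

def recover_key(beta, sk):
    """Vehicle side: K = beta mod sk_i."""
    return beta % sk

sks = [101, 103, 107]      # toy pairwise-coprime secret keys (real ones are large primes)
K = 42                     # toy group key, assumed smaller than every sk_i
beta = make_broadcast(K, sks)
print([recover_key(beta, sk) for sk in sks])  # [42, 42, 42]
```

A single broadcast value β therefore serves every group member, while a party that holds none of the sk_i obtains no direct way to reduce β to K.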

Message Sign
In the t-th round of training, the vehicle $V_i$ trains the global model $W^t_{global}$ using its local dataset $D_i$ to obtain the local model parameters $W^t_i$, i.e., $W^t_i \leftarrow W^t_{global} - \eta \nabla L(W^t_{global}, D_i)$. Before sending the local model parameter $W^t_i$ to the nearby $RSU_j$, the vehicle $V_i$ signs it as follows to ensure the authenticity and integrity of $W^t_i$. $V_i$ randomly selects $c_i \in Z^*_q$ to generate a pseudo identity $PID_i = (PID_{i,1}, PID_{i,2})$, where $len_{PID_{i,2}}$ represents two bytes converted from the bit length of $PID_{i,2}$, a and b are the elements in $F_p$ that define the elliptic curve $E(F_p)$ and $T_i$ represents the current timestamp. Next, $V_i$ randomly selects $k_i \in Z^*_q$ and calculates $K_i$, $e_i$, $r_i$ and $s_i$. For simplicity, we omit the superscript t of $PID_i$, $Z_i$, $\varphi_i$, $sgk_i$, $K_i$, $e_i$, $r_i$ and $s_i$. Finally, $V_i$ obtains the signature $\sigma^t_i = (r_i, s_i)$ of $W^t_i$ and sends the message $\{W^t_i, \sigma^t_i, (X_i, Y_i), PID_i, T_i\}$ to the nearby $RSU_j$.

Single Message Verification
Upon receiving the message $\{W^t_i, \sigma^t_i, (X_i, Y_i), PID_i, T_i\}$ from $V_i$, $RSU_j$ first checks the validity of the timestamp. If $T_a - T_i \le \Delta T$, where $T_a$ represents the arrival time, it continues; otherwise, it discards the message. Then $RSU_j$ calculates the intermediate values needed for verification, including $e_i$ and $x'_1$. Finally, $RSU_j$ checks whether $R = (e_i + x'_1) \bmod q = r_i$ holds to decide on authenticity and validity.

Batch Messages Verification
When receiving a batch of messages $\{W^t_i, \sigma^t_i, (X_i, Y_i), PID_i, T_i\}$, $i = 1, 2, \ldots, n$, from the vehicles $\{V_1, V_2, \ldots, V_n\}$, $RSU_j$ first checks the validity of each timestamp $T_i$. If $T_i$ is valid, it continues; otherwise, it discards the message. To prevent confusion attacks while ensuring non-repudiation, CPPA-SM2 uses a set of small exponents $\{v_1, v_2, \ldots, v_n\}$ for batch verification [23,35], where $v_i \in [1, 2^t]$ and t is a small integer. Next, $RSU_j$ calculates the aggregated values and checks whether the batch verification equation holds. If it holds, all messages are valid; otherwise, some of these messages are invalid. The detection algorithm for invalid message signatures has been proposed in [36]. The details of this algorithm are beyond the scope of this paper.
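The small-exponent technique itself can be illustrated independently of the CPPA-SM2 equations. The sketch below swaps in a plain Schnorr-style signature, whose verification equation s·G = R + e·P is linear and therefore easy to batch; it is not the paper's batch equation. The toy curve, SHA-256 as the hash and all function names are assumptions made for the example.

```python
import hashlib
import secrets

P_MOD, A = 17, 2          # toy curve y^2 = x^3 + 2x + 2 over F_17 (NOT secure)
G, N = (5, 1), 19         # base point and its order

def ec_add(p1, p2):
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, pt):
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return result

def challenge(R, msg: bytes) -> int:
    data = f"{R[0]},{R[1]}|".encode() + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def schnorr_sign(d, msg):
    """Stand-in signature: R = k*G, s = k + e*d mod N."""
    k = secrets.randbelow(N - 1) + 1
    R = ec_mul(k, G)
    return R, (k + challenge(R, msg) * d) % N

def batch_verify(items, t=3):
    """items: list of (public_key, msg, (R, s)).

    Checks (sum v_i*s_i)*G == sum v_i*R_i + sum (v_i*e_i)*P_i
    with fresh random small exponents v_i in [1, 2^t].
    """
    lhs_scalar, rhs = 0, None
    for pub, msg, (R, s) in items:
        v = secrets.randbelow(2 ** t) + 1
        e = challenge(R, msg)
        lhs_scalar = (lhs_scalar + v * s) % N
        rhs = ec_add(rhs, ec_mul(v, R))
        rhs = ec_add(rhs, ec_mul(v * e % N, pub))
    return ec_mul(lhs_scalar, G) == rhs

keys = [3, 5, 11]
msgs = [b"w1", b"w2", b"w3"]
batch = [(ec_mul(d, G), m, schnorr_sign(d, m)) for d, m in zip(keys, msgs)]
print(batch_verify(batch))  # True
```

The random exponents prevent an attacker from submitting several invalid signatures whose errors cancel in the sum: without knowing the v_i in advance, the errors would have to cancel for almost every choice of exponents.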

Trace
When $RSU_j$ detects that a vehicle $V_i$ has uploaded malicious local model parameters or has engaged in identity forgery, it sends the vehicle's pseudonym $PID_i$ to TA. TA then uses the system's master private key s to recover the vehicle's true identity $RID_i = PID_{i,2} \oplus H_2(s \cdot PID_{i,1})$.
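The trace equation can be exercised end to end with a small sketch. The pseudonym construction used here, PID_{i,1} = c_i · G and PID_{i,2} = RID_i ⊕ H_2(c_i · P_pub), is an assumption inferred from the trace equation (since s · PID_{i,1} = c_i · P_pub) and may differ from the scheme's exact definition. The toy curve, SHA-256 standing in for H_2 and all function names are likewise illustrative.

```python
import hashlib
import secrets

P_MOD, A = 17, 2          # toy curve y^2 = x^3 + 2x + 2 over F_17 (NOT secure)
G, N = (5, 1), 19         # base point and its order

def ec_add(p1, p2):
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, pt):
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return result

def h2(point, length):
    """Assumed stand-in for H_2: SHA-256 of the point, truncated (length <= 32)."""
    return hashlib.sha256(f"{point[0]},{point[1]}".encode()).digest()[:length]

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_pseudonym(rid: bytes, p_pub):
    """Vehicle side (assumed construction): fresh c_i gives a fresh pseudonym."""
    c = secrets.randbelow(N - 1) + 1
    pid1 = ec_mul(c, G)
    pid2 = xor_bytes(rid, h2(ec_mul(c, p_pub), len(rid)))
    return pid1, pid2

def trace(s, pid):
    """TA side: RID = PID_2 XOR H_2(s * PID_1)."""
    pid1, pid2 = pid
    return xor_bytes(pid2, h2(ec_mul(s, pid1), len(pid2)))

s = 7                               # toy master secret key held by TA
p_pub = ec_mul(s, G)                # system public key
pid = make_pseudonym(b"VEHICLE-0001", p_pub)
print(trace(s, pid))                # b'VEHICLE-0001'
```

Only the holder of s can compute the mask H_2(s · PID_{i,1}) from the pseudonym, which is what makes the anonymity conditional: TA can trace, other parties cannot.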

Revoke
Upon obtaining the true identity $RID_i$ of the malicious vehicle $V_i$, TA can completely remove it from the federated learning system by revoking its legitimate information from the group. TA first removes the term $c_i$ related to $V_i$ from u by computing $u' = u - c_i$. Then, TA randomly selects a new group key $K' \in Z^*_q$, calculates the new group public keys $\beta' = K' \cdot u'$ and $D'_{pub} = K' \cdot G$, and broadcasts them to the group. After receiving the broadcast, the remaining vehicles in $C_n$ can use their secret key $sk_j$ to compute the updated group key $K' = \beta' \bmod sk_j$. Since $u'$ no longer contains the legitimate information of $V_i$, the revoked vehicle cannot compute the new group key $K'$. When a vehicle leaves the communication group $C_n$, TA can also revoke it in this way.
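The effect of revocation on the group key can be checked numerically. As before, the sketch assumes u is the sum of the CRT terms c_i = a_i · b_i and that the new key K′ is smaller than every remaining sk_j; the broadcast signature and D′_pub are omitted, and the names and numbers are illustrative.

```python
from math import prod

def crt_terms(secret_keys):
    """Return {sk_i: c_i} with c_i = (theta/sk_i) * ((theta/sk_i)^{-1} mod sk_i)."""
    theta = prod(secret_keys)
    return {sk: (theta // sk) * pow(theta // sk, -1, sk) for sk in secret_keys}

def revoke_and_rekey(secret_keys, revoked_sk, new_key):
    """TA side: u' = u - c_i for the revoked member, then beta' = K' * u'."""
    terms = crt_terms(secret_keys)
    u_new = sum(terms.values()) - terms[revoked_sk]
    return new_key * u_new

sks = [101, 103, 107]          # toy secret keys of three vehicles
beta_new = revoke_and_rekey(sks, revoked_sk=103, new_key=57)

print(beta_new % 101, beta_new % 107)   # remaining members recover 57
print(beta_new % 103)                   # revoked member obtains 0
```

TA reuses the existing terms c_j of the remaining members, so revocation costs one subtraction and one multiplication rather than a full recomputation over the group.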

Add
When a vehicle V i applies to join the federated learning system, TA randomly selects a new group key K ′ ∈ Z * q and calculates θ

Correctness and Security Proof and Analysis
In this section, we first provide a proof of correctness for the proposed scheme. Then, under the random oracle model, we prove the security of the scheme. Finally, we conduct an informal security analysis of the scheme.

Correctness Proof
The correctness verification of the single message signature is ensured by Equations ( 4) and (5).
The correctness verification of the batch message signatures is ensured by Equations ( 6) and (7).
Based on the signing and verification process, if the local model parameter $W^t_i$ and signature $\sigma^t_i = (r_i, s_i)$ transmitted by the vehicle $V_i$ have not been tampered with and the signature is generated using the legitimate vehicle's private key, then according to (4)-(7), RSU can correctly compute that the verification equations hold.
The correctness of legitimate vehicles in $C_n$ obtaining the correct group key K is ensured by Equation (8).
When vehicle $V_i$ is revoked from the group $C_n$ by TA, the revoked vehicle will be unable to obtain the correct group key, according to Equation (9).
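The group key statements can be verified directly from the CRT definitions. The derivation below is a reconstruction and not necessarily the form of Equations (8) and (9) in the paper; it assumes $u = \sum_{i=1}^{n} c_i$ with $c_i = a_i b_i$, $a_i = \theta / sk_i$, $b_i = a_i^{-1} \bmod sk_i$, and a group key smaller than every $sk_j$.

```latex
\begin{align*}
% every a_i with i \neq j contains the factor sk_j, and a_j b_j \equiv 1
u \bmod sk_j &= \Big(\sum_{i=1}^{n} a_i b_i\Big) \bmod sk_j = a_j b_j \bmod sk_j = 1, \\
% legitimate member V_j recovers the group key
\beta \bmod sk_j &= (K \cdot u) \bmod sk_j = K \bmod sk_j = K, \\
% after revoking V_i, remaining members (j \neq i) are unaffected
u' \bmod sk_j &= (u - c_i) \bmod sk_j = (1 - 0) \bmod sk_j = 1
  \;\Rightarrow\; \beta' \bmod sk_j = K', \\
% the revoked member V_i only obtains zero
u' \bmod sk_i &= (u - c_i) \bmod sk_i = (1 - 1) \bmod sk_i = 0
  \;\Rightarrow\; \beta' \bmod sk_i = 0 \neq K'.
\end{align*}
```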

Security Proof
The security of CPPA-SM2 relies on the ECDLP. In the random oracle model, if there exist adversaries $A_I$ and $A_{II}$ who can win Games 1 and 2 with non-negligible probabilities, respectively, then there exists a probabilistic polynomial-time simulator that can solve the ECDLP with non-negligible probability.
Theorem 1. CPPA-SM2 is existentially unforgeable under adaptive chosen-identity and chosen-message attacks against $A_I$ under the assumption that the ECDLP is hard.
Proof of Theorem 1. Let C be the solver of the ECDLP. Suppose that $A_I$ can succeed in forging a valid signature by interacting with C. C utilizes $A_I$ to solve the ECDLP. Here, we give an ECDLP instance $\{G, G' = g \cdot G\}$. C executes the simulation to compute g through interacting with $A_I$ as follows.
- Setup: On input $\{G, G'\}$, C sets $P_{pub} = G'$ and returns $\{p, q, E(F_p), G, Z^*_q, P_{pub}, H_1, H_2, H_3, H_4, H_5\}$ to $A_I$. $A_I$ selects $PID_i = (PID_{i,1}, PID_{i,2})$ as a target vehicle. In addition, C maintains five lists ... randomly and adds $\{\varphi_i, PID_{i,1}, T_i\}$ to $L_{H_4}$. Then, C returns $\varphi_i$ to $A_I$.
- $H_5$-queries: Upon receiving the queries from $A_I$ with $\{Z_i, M_i, T_i\}$, C checks whether $\{Z_i, M_i, T_i\}$ exists in $L_{H_5}$. If it does, C returns $e_i$ to $A_I$. Otherwise, C selects $e_i \in Z^*_q$ randomly and adds $\{e_i, Z_i, M_i, T_i\}$ to $L_{H_5}$. Then, C returns $e_i$ to $A_I$.
- Partial-Private-Key-Extract-queries: After receiving the queries from $A_I$ with ...
- Public-Key-Extract-queries: After receiving the queries from $A_I$ with ... Otherwise, C does the Partial-Private-Key-Extract-queries to obtain $y_i$. Then, C selects $x \in Z^*_q$ randomly and computes ...
- Secret-Value-Extract-queries: After receiving the queries from $A_I$ with $PID_i = (PID_{i,1}, PID_{i,2})$, C checks whether $\{PID_{i,1}, PID_{i,2}, x_i, y_i, X_i, Y_i\}$ exists in L. If it does, C returns $x_i$ to $A_I$. Otherwise, C does the Public-Key-Extract-queries to obtain $(x_i, X_i, Y_i)$. After that, C adds $\{PID_{i,1}, PID_{i,2}, x_i, y_i, X_i, Y_i\}$ into L and returns $x_i$ to $A_I$.
- Public-Key-Replace-queries: After receiving the queries from $A_I$ with $PID_{i,1}$ ...
- Sign-queries: After receiving the queries from ...
- Forgery: After all queries have been completed, $A_I$ outputs a forged tuple ... and C checks whether ... $\bmod q = r^*_i$ holds. If it does not hold, C terminates the simulation. Otherwise, C replays the above process by choosing different $H_1$, $H_3$ and $H_4$ based on the forking lemma. $A_I$ will output three other distinct valid signatures. Finally, we can obtain four equations as below.
In the above four equations, $k_i$, $g$, $K$ and $x_i$ represent the discrete logarithms of $K_i$, $P_{pub}$, $D_{pub}$ and $X_i$, respectively, which are not known to $C$. $C$ can obtain the four unknown values by solving the above four linearly independent equations, where $g$ is the solution of the ECDLP instance. □

Theorem 2. CPPA-SM2 is existentially unforgeable under adaptive chosen-identity and chosen-message attacks against $A_{II}$ under the assumption that the ECDLP is hard to solve.
Proof of Theorem 2. Let $C$ be the solver of the ECDLP. Suppose that $A_{II}$ can succeed in forging a valid signature by interacting with $C$. $C$ utilizes $A_{II}$ to solve the ECDLP. Here, we give an ECDLP instance $\{G, G' = g \cdot G\}$. $C$ executes the simulation to compute $g$ through interacting with $A_{II}$ as follows.
- Forgery: After all queries have been completed, $A_{II}$ outputs a forged tuple $\{M_i^*, \ldots\}$, and $C$ checks whether the verification equation $\ldots \bmod q = r_i^*$ holds. If it does not hold, $C$ terminates the simulation. Otherwise, $C$ replays the above process by choosing different $H_3$ and $H_4$ based on the forking lemma. $A_{II}$ will output two other distinct valid signatures $\sigma_i^{*(2)}$ and $\sigma_i^{*(3)}$.
Finally, we can obtain three equations as below.
In the above three equations, $k_i$, $K$ and $x_i$ represent the discrete logarithms of $K_i$, $D_{pub}$ and $X_i$, respectively, which are not known to $C$. $C$ can obtain the three unknown values by solving the above three linearly independent equations, where $x_i$ is the solution of the ECDLP instance.
However, it is difficult to solve the ECDLP in polynomial time. So, under the random oracle model, CPPA-SM2 is existentially unforgeable under adaptive chosen-identity and chosen-message attacks. □

Informal Security Analysis
Anonymity and Privacy-Preserving: In the CPPA-SM2 scheme, vehicles use pseudonyms $PID_i = (PID_{i,1}, PID_{i,2})$ to communicate with other entities. To obtain the vehicle's real identity $RID_i$, the adversary must compute $RID_i = PID_{i,2} \oplus H(s \cdot PID_{i,1})$ without knowing the system's master private key $s$. However, due to the hardness of the Computational Diffie-Hellman (CDH) problem, the adversary is unable to obtain $RID_i$, which protects the vehicle's identity privacy. Additionally, since vehicles participate in federated learning using pseudonyms, and these pseudonyms are updated with each message sent, even if external adversaries or RSUs gain access to the plaintext local model parameters, they cannot link them to specific vehicles. This prevents the inference of private information and thus provides privacy protection during the federated learning process.
Traceability: When a vehicle with malicious behavior is detected, TA can trace its real identity $RID_i = PID_{i,2} \oplus H(s \cdot PID_{i,1})$ from its pseudonym $PID_i = (PID_{i,1}, PID_{i,2})$ using the system's master private key $s$.
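The tracing relation above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the pseudonym-generation rule ($PID_{i,1} = w_i \cdot G$, $PID_{i,2} = RID_i \oplus H(w_i \cdot P_{pub})$ with $P_{pub} = s \cdot G$) is assumed here because it is the natural counterpart of the Trace formula, secp256k1 is used as a stand-in curve, SHA-256 stands in for $H$, and all function names are hypothetical.

```python
import hashlib
import secrets

# secp256k1 domain parameters, used only as a stand-in curve (assumption).
P = 2**256 - 2**32 - 977
A, B = 0, 7
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def ec_add(p1, p2):
    """Add two affine points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def ec_mul(k, pt):
    """Double-and-add scalar multiplication k * pt."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return result


def h_point(pt):
    """Hash a curve point to 32 bytes (SHA-256 as a stand-in for H)."""
    return hashlib.sha256(pt[0].to_bytes(32, "big") + pt[1].to_bytes(32, "big")).digest()


def xor_bytes(a, b):
    return bytes(i ^ j for i, j in zip(a, b))


def make_pseudonym(rid, p_pub):
    """Assumed generation rule: PID1 = w*G, PID2 = RID xor H(w*P_pub)."""
    w = secrets.randbelow(N - 1) + 1  # fresh randomness for every message
    return ec_mul(w, G), xor_bytes(rid, h_point(ec_mul(w, p_pub)))


def trace(pid, s):
    """TA-side tracing: RID = PID2 xor H(s*PID1)."""
    pid1, pid2 = pid
    return xor_bytes(pid2, h_point(ec_mul(s, pid1)))


s = secrets.randbelow(N - 1) + 1          # TA master private key
p_pub = ec_mul(s, G)                      # system public key
rid = b"VEHICLE-0001".ljust(32, b"\0")    # hypothetical 32-byte real identity
pid_a = make_pseudonym(rid, p_pub)
pid_b = make_pseudonym(rid, p_pub)
print(trace(pid_a, s) == rid, trace(pid_b, s) == rid, pid_a != pid_b)
```

Tracing works because $s \cdot (w_i \cdot G) = w_i \cdot (s \cdot G)$, so only the holder of $s$ (or of the one-time $w_i$) can strip the mask, while two pseudonyms of the same vehicle share no visible component.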
Message integrity and authentication: According to Theorem 1 and Theorem 2, as long as the ECDLP is hard to solve, the CPPA-SM2 scheme is existentially unforgeable under adaptive chosen-identity and chosen-message attacks against the attackers $A_I$ and $A_{II}$.
Non-repudiation: Since only the message signer $V_i$ can compute the signature key $sgk_i$, an adversary cannot forge valid signatures for a specific vehicle identity. Additionally, the TA can execute the Trace algorithm to obtain the vehicle's real identity. Therefore, once a vehicle's message passes the signature verification, the vehicle cannot deny having sent it.
Un-linkability: Since the vehicle pseudonym $PID_i$ is generated during the signing process and the random number used in signature generation is never repeated, the pseudonym in every signature is unique. As a result, no adversary can link multiple signatures sent by the same vehicle.
Forward privacy: When a new vehicle joins the group C, the new group key $K'$ is randomly generated by the TA and is independent of the old group key $K$. Therefore, the newly joined vehicle cannot access the group's communications prior to joining.
Backward privacy: When a vehicle is revoked or leaves the group, the TA removes the legitimate information $c_i$ associated with that vehicle from $u$ and computes a new group key $K'$ and group public key $D'_{pub}$. Since the revoked vehicle cannot obtain the updated group key $K'$, it cannot access the communications after leaving the group.
Impersonation attack: If an adversary wants to impersonate vehicle $V_i$ to nearby RSUs or other vehicles $V_j$, it must generate a valid message $\{M_i, \sigma_i, (X_i, Y_i), PID_i, T_i\}$ that passes the verification algorithm. However, according to Theorem 1 and Theorem 2, no polynomial-time adversary can forge a valid message.
Modification attack: According to Theorem 1 and Theorem 2, any modification of the message $\{M_i, \sigma_i, (X_i, Y_i), PID_i, T_i\}$ can be detected by the verification algorithm. Therefore, the proposed CPPA-SM2 scheme can withstand the modification attack.
Replay attack: In the proposed CPPA-SM2 scheme, vehicles use the current timestamp $T_i$ when generating message signatures. Therefore, message verifiers can resist replay attacks by verifying the freshness of the timestamp $T_i$.
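A freshness check of the kind described for the replay attack might look like the following sketch. The acceptance window `max_delay` and the clock-skew tolerance `skew` are illustrative assumptions; this section does not specify concrete values, and the function name is hypothetical.

```python
import time
from typing import Optional


def is_fresh(t_i: float, now: Optional[float] = None,
             max_delay: float = 5.0, skew: float = 0.5) -> bool:
    """Accept a message only if its timestamp T_i lies inside the window.

    max_delay (seconds) and skew (seconds) are assumed parameters,
    not values taken from the scheme.
    """
    if now is None:
        now = time.time()
    age = now - t_i
    # Reject stale messages (possible replays) and timestamps from the future.
    return -skew <= age <= max_delay


print(is_fresh(time.time()))
```

Because $T_i$ is also covered by the signature, an adversary cannot simply refresh the timestamp of a captured message without invalidating $\sigma_i$.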
Collusion attack: Several vehicles might collude to try to compute the new group key $K'$ after they have left the group. However, since their legitimate information $c_i$ has been removed from $u$, these vehicles cannot jointly calculate the new group key $K'$.

Performance Evaluation
In this section, we evaluate the performance of the proposed CPPA-SM2 scheme from three perspectives: security features, computation overhead and communication overhead, and compare it with existing works. For bilinear pairing-based CPPA schemes for IoV, we construct a bilinear pairing $e : G_1 \times G_1 \rightarrow G_T$, where $G_1$ is an additive group generated by a point $G$ with the order $q$ on the supersingular elliptic curve $E : y^2 = x^3 + x \bmod p$ with embedding degree 2, $p$ is a 512-bit prime number and $q$ is a 160-bit prime number. For ECC-based CPPA schemes for IoV, we construct an additive group $G$ generated by a point $G$ with the order $q$ on a non-singular elliptic curve $E : y^2 = x^3 + ax + b \bmod p$, where $p$ and $q$ are two 256-bit prime numbers and $a, b \in Z_p^*$. We measure the execution time of basic cryptographic operations using the MIRACL library in VS 2019 on the Windows 11 operating system with an Intel(R) Core(TM) i7-9750H CPU @ 2.60 GHz, as shown in Table 2.

|H|: output size of the hash function, 32 bytes

Computation Costs
We compared the computational costs of the CPPA-SM2 scheme with other relevant schemes in terms of signature generation, single signature verification, batch verification and member management, as shown in Tables 3 and 4, and Figures 3 and 4, where "-" indicates that the property is not considered in the scheme, MS denotes the message sign and MV denotes the message verification.
Zhao et al. scheme [22] offers relatively low computational overhead, but RSU needs to send a request to TA for each identity verification, and there is a key escrow issue. In Kanchan et al. scheme [24] based on bilinear pairings, a group signature is used instead of an individual signature for message authentication, and the group manager achieves tracing of malicious vehicles. Generating a group signature requires performing $T + 4T_{bp} + 4T_{bpe2} + 6T_e + 9T_m$. Verifying the group signature requires performing $T_h + 2T_m + 4T_{bpe2} + 5T_{bp} + 8T_e$, resulting in a relatively high computational overhead. Jiang et al. scheme [26] similarly uses bilinear pairing operations, requiring $2T_h + 5T_{bpe1}$ computations to generate a signature and $T_i + T_{bpe1} + T_{bpe2} + T_h + T_{DE} + T_{bpm2} + 3T_{bp}$ computations to verify the signature. In Yang et al. scheme [37], generating a signature requires performing $T_\oplus + 2T_h + 3T_{mtp} + 4T_{bpm1} + 6T_{bpe1}$. To verify the signature, $T_\oplus + T_{bpm1} + 2T_h + 3T_{bpe1} + 3T_{bpm2} + 5T_{bp} + 5T_{mtp}$ operations are needed. Due to the involvement of bilinear pairings and hash-to-point mappings, this method incurs the highest computational overhead. In Lin et al. scheme [38], a vehicle calculates $2T_h + 2T_{ea} + 3T_m + 6T_{em}$ to generate the anonymous public keys and a signature. Upon receiving the signature, RSU verifies it by performing $T_h + 3T_{em} + 4T_{ea}$. Additionally, Zhao et al. scheme [22], Kanchan et al. scheme [24], Jiang et al. scheme [26] and Lin et al. scheme [38] all require maintaining a revocation list for revocation purposes, which incurs additional lookup and maintenance overhead.

CPPA-SM2 does not require bilinear pairings or hash-to-point mappings and relies only on basic ECC operations, which reduces computational costs. Specifically, when a vehicle sends a message, it first generates an unlinkable pseudonym $PID_i$ by performing one $T_{em}$, one $T_\oplus$ and one $T_h$. Then, it generates the signature by performing three $T_h$, one $T_{em}$, four $T_m$ and one $T_i$. Therefore, the computation cost for signature generation is $T_\oplus + T_i + 2T_{em} + 4T_m + 4T_h$. To authenticate the message sent by the vehicle, the RSU, upon receiving the message, needs to perform $3T_h + 3T_{ea} + 4T_{em}$. Therefore, the total computation cost for signature generation and signature verification in CPPA-SM2 is $T_\oplus + T_i + 3T_{ea} + 4T_m + 6T_{em} + 7T_h$. When RSU receives messages sent from $n$ vehicles, it performs batch verification of the messages by executing $(2n + 1)T_{ea} + (2n + 2)T_{em} + 3nT_h$.

To test the effectiveness of batch verification, we conducted experimental comparisons between CPPA-SM2 and Xiong et al. scheme [28] and Shen et al. scheme [39]. In batch verification, the RSU verifies the $n$ messages received simultaneously from $n$ vehicles, meaning $n$ represents both the number of signatures received by the RSU at the same time and the number of vehicles. In the experiment, we tested with $n$ set to 20, 40, 60 and 100, respectively. In CPPA-SM2, when RSU simultaneously receives $n$ messages from $n$ vehicles, it needs to compute three $T_h$, two $T_{em}$ and two $T_{ea}$ for each vehicle. Finally, it performs two $T_{em}$ and one $T_{ea}$ to verify multiple messages. Therefore, the total cost of batch verification is $3nT_h + (2n + 2)T_{em} + (2n + 1)T_{ea}$. Xiong et al. scheme [28] performs four $T_h$, two $T_{em}$ and three $T_{ea}$ for each vehicle. Then, it also executes three $T_{em}$ and one $T_{ea}$. Therefore, the total cost of batch verification is $4nT_h + (2n + 3)T_{em} + (3n + 1)T_{ea}$. In Shen et al. scheme [39], RSU invokes one exponent operation, one bilinear pairing and one multiplication to confirm the equation $m = e(\eta, pk_i)\, e(P, P)^{-r_2}$. Its batch verification is based on $\prod_n e(\eta_n, pk_n)\, e(P, P)^{-r_{2,n}} = \prod_n m_n$, which needs $nT_{bp}$, $nT_{bpe2}$ and $(3n - 2)T_{bpm1}$. The results are shown in Table 4 and Figure 4. From the experimental results, it can be seen that the batch-verification performance of our scheme is better than that of these two schemes.

In terms of tracing cost, Kanchan et al. scheme [24], Yang et al. scheme [37], Lin et al. scheme [38] and CPPA-SM2 take 1.3451 ms, 0.1759 ms, 1.6320 ms and 0.3027 ms, respectively. All these approaches can achieve fast identity tracing. In terms of revocation, however, all schemes except CPPA-SM2 utilize revocation lists, leading to additional maintenance and lookup overheads, while CPPA-SM2 only requires a single modular operation to efficiently revoke vehicles. Therefore, overall, compared to other schemes, CPPA-SM2 not only reduces the computational costs of signature generation and verification and supports batch verification, but also achieves efficient tracing and revocation of vehicles while preserving vehicle privacy.
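The operation counts quoted in this subsection for CPPA-SM2 and Xiong et al. scheme [28] can be tabulated with a short script. This is a sketch only: the function names are hypothetical, and the per-operation timings of Table 2 are not reproduced here, so `total_time` expects the caller to supply them.

```python
from collections import Counter


def cppa_sm2_batch(n: int) -> Counter:
    """Batch verification in CPPA-SM2: 3n T_h + (2n+2) T_em + (2n+1) T_ea."""
    return Counter({"T_h": 3 * n, "T_em": 2 * n + 2, "T_ea": 2 * n + 1})


def cppa_sm2_individual(n: int) -> Counter:
    """n single verifications, each costing 3 T_h + 3 T_ea + 4 T_em."""
    return Counter({"T_h": 3 * n, "T_em": 4 * n, "T_ea": 3 * n})


def xiong_batch(n: int) -> Counter:
    """Batch verification in [28]: 4n T_h + (2n+3) T_em + (3n+1) T_ea."""
    return Counter({"T_h": 4 * n, "T_em": 2 * n + 3, "T_ea": 3 * n + 1})


def total_time(ops: Counter, unit_ms: dict) -> float:
    """Weight each operation count by a caller-supplied timing in ms."""
    return sum(count * unit_ms[op] for op, count in ops.items())


for n in (20, 40, 60, 100):
    print(n, dict(cppa_sm2_batch(n)), dict(xiong_batch(n)))
```

For every $n \geq 1$ the CPPA-SM2 batch count is no larger than that of [28] in each operation type, and for $n > 1$ batch verification needs fewer scalar multiplications ($2n + 2$) than $n$ individual verifications ($4n$).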

Communication Costs
We compared the communication costs of CPPA-SM2 with other schemes, mainly including the following: the size of a single signature (SSS), the total number of transmitted messages (NTMs), their sizes (STMs) and the number of interactions (NIs). The results are shown in Table 5 and Figure 5. In Zhao et al. scheme [22], four interactions are required to complete the authentication, which is the highest number of interactions. Its total communication cost is 476 bytes. The communication overhead for the group signature $(D_1, D_2, D_3, c, s_\alpha, s_\beta, s_x, s_{\delta_1}, s_{\delta_2})$ generated in Kanchan et al. scheme [24] is the highest, at 576 bytes. Jiang et al. scheme [26], Yang et al. scheme [37] and CPPA-SM2 all require only one interaction to complete message authentication. In Lin et al. scheme [38], vehicles need to transmit $\{\sigma_n, k_n, U_n, D_n, Z'_n\}$ for message authentication, with a total size of 480 bytes. In CPPA-SM2, the generated signature, denoted as $\sigma_i = (r_i, s_i)$, consists of two elements from $Z_q^*$; hence, its size is merely 64 bytes. To authenticate the signature, three additional messages $\{PID_i, (X_i, Y_i), T_i\}$ of size 228 bytes need to be transmitted, resulting in a total transmission cost of 292 bytes. In Yang et al. scheme [37], a single signature is denoted as $C_i = \{R_i, c_i, s_i\}$, where $R_i$, $c_i$ and $s_i$ belong to $G_1$; thus, the size of $C_i$ is 384 bytes.

In Lin et al. scheme [38], the obtained signature is denoted as $\{c_i, z_{i,1}, z_{i,2}, R_{i,1}, R_{i,2}\}$, with a length of 224 bytes. Additionally, to resist replay attacks, $ts_i$, $APK_a^1$ and $APK_a^2$ are also sent, making the total message length for transmission 356 bytes. From the experimental results, it can be observed that CPPA-SM2 has the smallest signature size and the smallest total cost of transmitting messages. This makes it more suitable for operation in bandwidth-constrained vehicular networking environments.
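The 292-byte total reported for CPPA-SM2 can be cross-checked with simple arithmetic. The element sizes for $Z_q^*$ and the hash output (32 bytes each) come from the evaluation setup; the 64-byte point size (two 32-byte coordinates on a 256-bit curve), the 4-byte timestamp and the way $PID_i$ decomposes into one point plus one hash-sized value are assumptions chosen for this sketch because they reproduce the stated 228 bytes.

```python
SCALAR = 32     # element of Z_q^* (from the evaluation setup)
HASH = 32       # hash output |H| (from the evaluation setup)
POINT = 64      # assumed: uncompressed point on a 256-bit curve
TIMESTAMP = 4   # assumed timestamp size

signature = 2 * SCALAR              # sigma_i = (r_i, s_i)
pseudonym = POINT + HASH            # assumed: PID_i = (PID_{i,1}, PID_{i,2})
public_keys = 2 * POINT             # (X_i, Y_i)
extra = pseudonym + public_keys + TIMESTAMP
total = signature + extra

print(signature, extra, total)
```

Under these assumptions the breakdown matches the figures in the text: a 64-byte signature plus 228 bytes of accompanying data.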

Security Features
We compared the security features (SFs) satisfied by these schemes, including the following: 1: anonymity; 2: traceability; 3: authenticity; 4: integrity; 5: non-repudiation; 6: un-linkability; 7: forward security; 8: backward security; 9: key escrow-free; 10: batch verification; 11: revocability; 12: dynamic member management; and 13: un-forgeability. The results are shown in Table 6, where 1-13 represent these security features in order, with √ indicating that the security feature is met and × indicating that it is not met. From the results, it can be seen that all schemes achieve 1: anonymity, 3: authenticity, 4: integrity and 6: un-linkability. Zhao et al. scheme [22], Kanchan et al. scheme [24], Jiang et al. scheme [26] and CPPA-SM2 use digital signatures to verify the authenticity and integrity of the local model parameters uploaded by vehicles. However, in Zhao et al. scheme [22] and Kanchan et al. scheme [24], since TA possesses all users' private keys, there is a key escrow issue. Jiang et al. scheme [26] satisfies most of the security features; however, it uses a revocation list for identity management, resulting in additional verification and maintenance overhead. Furthermore, it does not support 12: dynamic member management. To achieve 6: un-linkability, Yang et al. scheme [37] and Lin et al. scheme [38] use a set of pseudonyms to hide real identities, whereas CPPA-SM2 achieves 6: un-linkability by randomly generating a pseudonym each time a signature is made. Overall, compared to these schemes, CPPA-SM2 achieves more comprehensive security attributes, supports 10: batch verification and 12: dynamic member management, and has lower computational and communication costs.

CPPA-SM2 enables dynamic member management and supports batch verification. Security proofs and analyses demonstrate that it can ensure the authenticity and integrity of local model parameters, achieving secure vehicle authentication. Experimental results show that, compared to existing advanced schemes, CPPA-SM2 offers high computational efficiency and low communication overhead. Additionally, its integration with standard algorithms gives it the potential for widespread application.
However, the focus of this paper is on identity-authentication schemes and privacy protection in the federated learning process. Malicious clients may still launch data-poisoning attacks by uploading malicious local model parameters, thereby degrading the performance of the global model. Therefore, future research could integrate Byzantine-robust detection schemes to achieve privacy-preserving Byzantine-robust federated learning. Additionally, with the development of quantum computing, the ECDLP may be efficiently solved by quantum algorithms, making ECC-based authentication schemes no longer secure. Future work can explore quantum-resistant identity-authentication schemes, such as those based on lattice cryptography.
outputs false and exits. Finally, B calculates
which are empty initially.
- Query: $C$ responds to $H_i$-queries ($i = 1, 3, 4, 5$), Partial-Private-Key-Extract-queries, Secret-Value-Extract-queries and Sign queries as in Theorem 1. $C$ responds to Public-Key-Extract-queries as follows.
- Public-Key-Extract-queries: After receiving the queries from $A_{II}$ with

Figure 4. Comparison of batch-verification time between the schemes proposed in [28,39] and our scheme.


Table 1. Notations and definitions used.

• Key Generation (params) → $(d_A, P_A)$: Assume the signer of the message is user A. TA chooses the elliptic curve parameters params $= (p, a, b, q, G)$, selects a random integer $d_A \in [1, q - 1]$ as the private key and calculates the public key $P_A = d_A G$ for user A.
• Signature Generation (params, $m$, $d_A$) → $\sigma_A$: Given a message $m$, A computes $Z_A = H(len_{ID_A} \| ID_A \| a \| b \| G \| P_A)$ and $e_A = H(Z_A \| m)$, where $len_{ID_A}$ represents two bytes converted from the bit length of user A's identity $ID_A$, $a$ and $b$ are elements in $F_p$ that define an elliptic curve $E(F_p)$, $G$ denotes the base point in the elliptic curve group and $P_A$ denotes user A's public key. Then, A randomly chooses $k$ and computes $r_A$ and $s_A$, where $d_A$ denotes user A's private key. User A's signature on the message $m$ is $\sigma_A = (r_A, s_A)$.
• Signature Verification (params, $m$, $\sigma_A$, $P_A$) → true or false: Assume the verifier of the signature $\sigma_A$ is user B. Given user A's signature $\sigma_A$

- Hash queries: Upon receiving a query from $A_I$, $C$ returns the corresponding hash values to $A_I$.
- Partial-Private-Key-Extract-queries: Upon receiving a query with a pseudonym $PID_i$, $C$ returns the partial private key $y_i$ of the vehicle to $A_I$.
- Public-Key-Extract-queries: Upon receiving a query with a pseudonym $PID_i$, $C$ returns the public key $(X_i, Y_i)$ of the vehicle to $A_I$.
- Secret-Value-Extract-queries: Upon receiving a query with a pseudonym $PID_i$, $C$ returns the secret value $x_i$ of the vehicle to $A_I$.
- Public-Key-Replace-queries: Upon receiving a query with $(PID_i, (X'_i, Y'_i))$, $C$ replaces the public key with the new public key $(X'_i, Y'_i)$.
- Sign queries: After receiving a query from $A_I$ with $\{PID_{i,1}, PID_{i,2}, M_i, T_i\}$, $C$ responds with a signature $\sigma_i$.
- Forgery: Once $A_I$ has completed the desired queries, it outputs $\{M_i^*, PID_{i,1}^*, PID_{i,2}^*, T_i^*, \sigma_i^*\}$ under the pseudo identity $PID_i^*$. $A_I$ wins the game if the following conditions are met:
  - $\sigma_i^*$ passes verification.
  - The Partial-Private-Key-Extract-queries oracle has not received the request with $PID_i^*$.
  - The Sign queries oracle has not received the request with

, $D_{pub}$, $SIG_{sk_{TA}}(\beta \| D_{pub} \| T_K)$ to vehicles and RSUs in $C_n$.
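As a rough illustration of the SM2 key generation, signing and verification steps outlined above, the following sketch follows the standard SM2 signature equations, $r = (e + x_1) \bmod n$ and $s = (1 + d)^{-1}(k - r d) \bmod n$. It is not the paper's implementation: secp256k1 and SHA-256 are used as stand-ins for the SM2 curve and the SM3 hash, the byte encoding of $Z_A$ is simplified, the function names are hypothetical, and the private key is drawn from $[1, n - 2]$ so that $1 + d$ is always invertible.

```python
import hashlib
import secrets

# Stand-in curve (secp256k1); the real scheme would use the SM2 curve.
P = 2**256 - 2**32 - 977
A, B = 0, 7
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def ec_add(p1, p2):
    """Add two affine points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def ec_mul(k, pt):
    """Double-and-add scalar multiplication k * pt."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return result


def digest_e(user_id, pub, msg):
    """e = H(Z_A || m) with Z_A = H(len_ID || ID || a || b || G || P_A)."""
    b32 = lambda v: v.to_bytes(32, "big")
    len_id = (8 * len(user_id)).to_bytes(2, "big")  # bit length in two bytes
    z = hashlib.sha256(len_id + user_id + b32(A) + b32(B) + b32(G[0]) +
                       b32(G[1]) + b32(pub[0]) + b32(pub[1])).digest()
    return int.from_bytes(hashlib.sha256(z + msg).digest(), "big")


def keygen():
    d = secrets.randbelow(N - 2) + 1  # d in [1, N-2]
    return d, ec_mul(d, G)


def sign(d, pub, user_id, msg):
    e = digest_e(user_id, pub, msg)
    while True:
        k = secrets.randbelow(N - 1) + 1
        x1, _ = ec_mul(k, G)
        r = (e + x1) % N
        if r == 0 or r + k == N:
            continue
        s = pow(1 + d, -1, N) * (k - r * d) % N
        if s != 0:
            return r, s


def verify(pub, user_id, msg, sig):
    r, s = sig
    if not (1 <= r < N and 1 <= s < N):
        return False
    t = (r + s) % N
    if t == 0:
        return False
    pt = ec_add(ec_mul(s, G), ec_mul(t, pub))  # equals k*G for a valid signature
    return pt is not None and (digest_e(user_id, pub, msg) + pt[0]) % N == r


d_a, p_a = keygen()
sig = sign(d_a, p_a, b"user-A", b"local model update")
print(verify(p_a, b"user-A", b"local model update", sig))
```

Verification succeeds because $sG + tP_A = (s(1 + d_A) + r d_A)G = kG$, so the verifier recovers the same $x_1$ the signer used.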
authentication and validity.4: RSU j uses the FedAvg algorithm to locally aggregate the verified local model parameters {W t CS signs the global model with its private key and sends messages {W t+1 global , SIG sk CS (W t+1 global )} to the vehicles within the communication group via RSUs.TA uses the system's master private key s to recover the vehicle's true identity RID and broadcasts the updated information {β ′ , D ′ pub , SIG sk TA (β ′ ||D ′ pub ||T K ′ )} to vehicles and RSUs in C n .Add: 1. TA randomly selects a new group key K [1, n]and n denotes the number of vehicles participating in the training within the RSU j 's range.It then signs this result with its private key and sends messages {W t j , m) , where j ∈ [1, m] and m denotes the number of RSUs.CS signs the global model with its private key and sends messages {W t+1 global , SIG sk TA (W t+1 global )} to the vehicles within the communication group via RSUs.
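The FedAvg aggregation step mentioned above can be sketched as a weighted average of the verified local model parameters. Weighting by each vehicle's local sample count follows the usual FedAvg formulation and is an assumption here, since the weights are not recoverable from this section; the function name and the flat-list model representation are also illustrative.

```python
from typing import List, Sequence


def fedavg(updates: Sequence[Sequence[float]],
           num_samples: Sequence[int]) -> List[float]:
    """Weighted average of flattened parameter vectors.

    updates[i] is the verified parameter vector of vehicle i and
    num_samples[i] its (assumed) local training-set size.
    """
    if not updates or len(updates) != len(num_samples):
        raise ValueError("need one sample count per update")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0])
    if any(len(u) != dim for u in updates):
        raise ValueError("all updates must have the same length")
    return [sum(n * u[j] for u, n in zip(updates, num_samples)) / total
            for j in range(dim)]


print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Only updates whose signatures have passed verification would be passed to such a function, so a forged or modified update never reaches the aggregate.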
After receiving the queries from $A_I$ with $\{X_i, P_{pub}\}$, $C$ checks whether $\{X_i, P_{pub}\}$ exists in $L_{H_1}$. If it does, $C$ returns $h_i$ to $A_I$. Otherwise, $C$ selects $h_i \in Z_q^*$ randomly and adds $\{h_i, X_i, P_{pub}\}$ to $L_{H_1}$. Then, $C$ returns $h_i$ to $A_I$.
- $H_3$-queries: When receiving the queries with {len(PID i,2

Table 2. Execution time of basic cryptographic operations and element size.
q Size of elements on Z * q 32 bytes

Table 4. Comparison of batch-verification costs.

Table 5. Comparison of communication costs for different schemes.