An Efficient 24–30 GHz GaN-on-Si Driver Amplifier Using Synthesized Matching Networks

This paper presents a broadband GaN microwave monolithic integrated circuit driver amplifier (MMIC DA) with compact dimensions of 1.65 mm × 0.78 mm for 5G millimeter-wave communication. The optimal impedance domain satisfying the preset goals was first acquired using the simplified load-pull procedure and small-signal simulations, followed by a weighted average method to determine the reference center matching point from which the optimal intrinsic load can be deduced. By means of de-embedding load-pull contours, modeling based on theoretical analysis, and simulation fitting for parameter identification, the nonlinear output capacitance and a series RLC model circuit approximating the input impedance response of the stabilized transistor were extracted. Under the design principle of fully absorbing the parasitic parameters of the device, explicit formulas and tabulated methods related to the Chebyshev impedance transformer were applied to construct filter-based synthesized matching networks at each stage and finally convert them into an implementable mixed-element form via the single-frequency equivalence technique. Measured on-wafer pulsed results for the proposed two-stage DA across 24–30 GHz demonstrated up to 31.1 dBm of saturated output power (Psat) with less than 1 dB total fluctuation, 19.3 ± 1 dB of small-signal gain, and 39.8% of peak power-added efficiency (PAE) at the mid-frequency.


Introduction
During this period of coexistence with the COVID-19 pandemic, online education and cloud offices are on the rise, greatly supported by the extensive 5G wireless infrastructure. Commercial 5G networks have thus far been deployed chiefly in the popular but congested sub-6 GHz frequency range 1 (FR1), with an accelerated evolution toward millimeter-wave (mmW) bands, which possess an abundant spectrum with much wider bandwidths to address the ever-growing demand for mobile data traffic. In particular, the 24–30 GHz range spanning the overlapping bands n257, n258, and n261 that has been planned, licensed, and launched the most by nations worldwide is the focus [1]. Despite their tremendous potential, due to the short wavelengths, mmW communications have some natural drawbacks, such as significant propagation loss and the susceptibility to blockage. Hence, to remedy the lack of mmW signal coverage, the rated output power (Pout) levels of 5G microcells are often higher. Under such circumstances, a driver amplifier (DA) unit is needed to provide good linearity or efficiency performance as the gain stage prior to the final power amplifier (PA) in an RF transmitter chain [2].
Given that frequency allocations differ by country, a promising low-cost solution from a market perspective is a broadband MMIC DA capable of operating over multiple 3GPP bands in 5G NR FR2, facilitating both the integration and robustness of the front-end system. However, obtaining a sufficient bandwidth, while maintaining decent efficiency in the smallest possible physical footprint poses a challenge to amplifier design because

Circuit Design Considerations
The adopted technology is OMMIC's (OMMIC SAS, Limeil-Brévannes, France) 0.1 μm double-heterojunction AlN/GaN/AlGaN HEMT process known as D01GH, based on a 3-inch high-resistivity silicon substrate thinned down to 100 μm with a dielectric constant εr of 11.7 and a loss tangent tan δ of 0.015. As depicted in Figure 1, it uses in situ passivation to avoid memory effects and regrown non-alloyed ohmic contacts to minimize access resistance. The thin AlN barrier serves to diminish short-channel effects, whereas the AlGaN back barrier improves electron confinement. Mushroom gates of 100 nm length and a short gate-source distance enable the cutoff frequency (fT, i.e., f @ |H21| = 1) to exceed 100 GHz. The typical RF power density is around 3.3 W/mm @ 30 GHz but can go up to 5.7 W/mm for a recommended bias supply VD of 12 V, with a gate-drain breakdown voltage of 36 V. Depletion-mode transistors have been fully modeled, taking into account the electro-thermal nonlinear attribute. Various passive components, including spiral inductors, NiCr thin-film resistors (40 Ω/□), GaN resistors (400 Ω/□), two types of MIM capacitors (50 pF/mm² and 400 pF/mm²), via holes, and microstrip lines with metal thickness options of 1.25 μm or 2.5 μm, are available in the foundry design kit [21].
The process we chose is built on a Si platform rather than on the more common silicon carbide (SiC), which offers superior thermal conductivity along with low loss. On the one hand, the required power levels for 5G mmW applications make the adoption of a Si substrate feasible; on the other hand, SiC remains expensive and subject to export controls, whereas Si-based GaN is relatively cheap, mature enough, and compatible with heterogeneous integration with SiGe/CMOS [22]. It offers a competitive performance and the potential for large economies of scale, opening up the possibility of a realistic volume production of 5G MMICs.

Fundamental Load-Pull Analysis and Determination of the Optimal Impedance Domain
In consideration of the maximum available gain of transistors, estimated transmission link loss, process variation, and design redundancy, a cascaded two-stage common-source architecture was employed for the DA. Transistor peripheries were staged at a ratio of 1:2, with a 46 × 8 µm cell chosen for the latter stage, to meet the specifications of 1 W Psat with over 30% PAE and beyond 18 dB of linear gain. Due to the nonlinear nature of the device and essentially the output parasitic capacitance Cout, which strongly depends on the operating conditions and transistor geometry, the trajectory formed by connecting the optimal load impedance points ZL,opt obtained through load-pull incrementally by frequency travels counterclockwise when mapped on the Smith chart. However, the recorded impedance locus ZL(f) after transforming the standard 50 Ω by the output matching network (OMN) rotates in the reverse direction with increasing frequency compared to the former. Consequently, even if these two profiles happen to be exactly congruent, phase mismatch will exist for all but one frequency. This unfavorable phenomenon of opposite impedance rotation explains why the amplifier is prone to severe mismatch at and near the edge frequencies of the wide target band, leading to a dramatic deterioration in performance, thus troubling the design of the broadband MN. Instead of attempting to track the frequency-dependent ZL,opt over broadband at the expense of a high component overhead, as in the common strategy, we defined an optimal impedance domain that fulfills the preset goals of PAE > 40%, P1dB > 31 dBm, and output-stage gain > 8 dB. Meanwhile, a compromised center impedance ZL,ctr was identified. The DA's broadband output capability is basically assured as long as the in-band ZL(f) rotates in a knotted shape around ZL,ctr and falls inside the prescribed constraint space.
In addition, considering that the transistor's gain roll-off characteristics and the actual bandwidth may narrow or frequency-shift from evaluation, the circuit-level design should prioritize an upper-frequency performance, while introducing some low-frequency mismatch to enhance the broadband effect via stagger-tuning for a balanced large-signal response [17].
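As a rough plausibility check on the device sizing, the following sketch estimates the output power available from the output-stage cell using the typical power density quoted for the process. It assumes that "46 × 8 µm" denotes 8 gate fingers of 46 µm each (the text does not spell out the notation) and that the 3.3 W/mm figure applies directly to this cell; both are assumptions made for illustration only.

```python
import math

def watts_to_dbm(p_w):
    """Convert power in watts to dBm."""
    return 10.0 * math.log10(p_w * 1e3)

# ASSUMPTION: "46 x 8 um" is read as 8 fingers x 46 um unit gate width.
FINGER_WIDTH_UM = 46.0
N_FINGERS_OUTPUT = 8
N_FINGERS_DRIVER = 4          # the 46 x 4 um cell of the first stage

# Typical RF power density quoted for the process at 30 GHz (W/mm).
POWER_DENSITY_W_PER_MM = 3.3

periphery_out_mm = FINGER_WIDTH_UM * N_FINGERS_OUTPUT / 1000.0
periphery_drv_mm = FINGER_WIDTH_UM * N_FINGERS_DRIVER / 1000.0

p_est_w = POWER_DENSITY_W_PER_MM * periphery_out_mm
p_est_dbm = watts_to_dbm(p_est_w)

print(f"Output-stage periphery: {periphery_out_mm:.3f} mm")
print(f"Driver/output periphery ratio: 1:{periphery_out_mm / periphery_drv_mm:.0f}")
print(f"Estimated power: {p_est_w:.2f} W ({p_est_dbm:.1f} dBm)")
```

Under these assumptions the estimate is on the order of the 1 W Psat specification, and the two cell sizes reproduce the stated 1:2 staging ratio.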
In mmW bands, Cout shorts out most of the harmonic content, so sophisticated harmonic control is no longer viable. As the harmonics lie far from the passband and can be readily suppressed by a low-pass OMN, the impacts of harmonic loads were first ignored in the course of successive single-tone load-pull simulations by initializing them and the source harmonic terminations as open circuits (e.g., 1 kΩ). It is worth mentioning that a simplified alternative was used here, unlike the conventional process, where the load-pull needs to be iterated with the source-pull to obtain optimal convergence results. We regarded the HEMT with a parallel RC stabilization network as a new cell, determined the voltage-to-current ratio at the equivalent gate node when the cell is excited to the P1dB state by a continuous-wave (CW) signal to derive the fundamental input impedance Zin,fund, and then took its conjugate as the source impedance ZS,fund. Because the device is non-unilateral and PAE is the more critical metric, ZS,fund must be updated with the variation of ZL,opt picked at the best PAE for each iteration until the two, which interact, cease to change, i.e., each converges to a fixed value. Table 1 compares the conventional and simplified load-pull scripts conducted on the last stage at 24 GHz, 28 GHz, and the extended corner of 32 GHz under the same simulation conditions. There is only a negligible difference in the PAE between them, and the corresponding ZL,opt are identical, so the source-pull step for seeking the optimal source impedance ZS,opt can be omitted to facilitate a more in-depth load analysis with guaranteed correctness. With this simplified method, joint small-signal simulations for generating constant power gain (GP) circles resulting from load mismatch allow the rapid determination of the wanted optimal impedance domain, shaded in Figure 2.
About 2/3 of its boundary is set by the contours of PAE and P1dB at 32 GHz due to intensifying parasitic effects (mainly Cout) that cause the contours to move counterclockwise toward the real axis of the Smith chart and gradually shrink as the frequency increases. In addition, the design margin of 2 GHz extended to higher frequencies is an experience-based trade-off, and the concentric circle distribution of GP outcomes in Figure 2 is drawn only for the inner circle at 30 GHz, which is just externally tangent to the optimal domain where three groups of contours overlap. The best performance marks P1 and P2, wrapped around the PAE and P1dB contours, respectively, do not coincide; they move closer together at higher fundamental frequencies and farther apart at the lower end of the band. To accommodate the uneven dispersion of contour peaks, a two-round weighted average calculation was proposed to identify the suitable ZL,ctr. In the beginning, a weighted interpolation was carried out between P1 and P2 for every designated frequency. Because PAE is the primary concern, a weight of 2/3 was assigned to P1 to bring the interpolation point Pi closer to P1. The first round of applying Equation (1) yields Pi at three frequencies, listed in Table 2. On top of that, recognizing the importance of medium- and high-frequency performance, Pi at 24, 28, and 32 GHz was similarly assigned the weighting factors Wi of 0.2, 0.5, and 0.3, respectively, in accordance with Equation (2) to obtain a ZL,ctr of 11 + j12.9 Ω that trades off efficiency against Pout.
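The two-round weighting can be sketched numerically as follows. This is a minimal sketch that assumes the interpolation is performed directly on complex impedances (the text does not state whether impedances or reflection coefficients are averaged). The P1 and P2 values are hypothetical placeholders, not the entries of Table 2, so the printed result is not expected to equal 11 + j12.9 Ω; only the weights 2/3 and 0.2/0.5/0.3 come from the text.

```python
# Two-round weighted average for the center load impedance (illustrative sketch).

def interpolate(p1, p2, w1=2.0 / 3.0):
    """Round 1: weighted point between the PAE peak p1 and the P1dB peak p2."""
    return w1 * p1 + (1.0 - w1) * p2

def center_impedance(points, weights):
    """Round 2: weighted average of the per-frequency interpolation points."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("frequency weights must sum to 1")
    return sum(w * p for w, p in zip(weights, points))

# HYPOTHETICAL (P1, P2) load impedances in ohms, keyed by frequency in GHz.
# These are placeholders for illustration, not values from the paper.
marks = {
    24: (complex(13.0, 16.0), complex(17.0, 11.0)),
    28: (complex(11.0, 14.0), complex(13.0, 11.0)),
    32: (complex(9.0, 11.0), complex(10.0, 9.5)),
}
freqs_ghz = [24, 28, 32]
weights = [0.2, 0.5, 0.3]   # W_i stated in the text

p_i = [interpolate(*marks[f]) for f in freqs_ghz]
z_l_ctr = center_impedance(p_i, weights)

for f, p in zip(freqs_ghz, p_i):
    print(f"P_i at {f} GHz: {p.real:.2f} + j{p.imag:.2f} ohm")
print(f"Z_L,ctr: {z_l_ctr.real:.2f} + j{z_l_ctr.imag:.2f} ohm")
```

Replacing the placeholder pairs with the load-pull peaks of an actual device gives the corresponding center impedance.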

Figure 2. Simulated load-pull contours at 1 dB gain compression and the GP circle of 8 dB at 30 GHz for the output stage cell when given a nominal bias point of VD = 12 V and VG = -1 V, which yields the maximum device DC and AC transconductance [17]. Annotations: optimal impedance domain (shaded); arrow marking the direction of increasing frequency.

For more precise guidance on the OMN design and to consolidate the comprehensive performance of the DA, a GP circle of 8 dB at 32 GHz has been added in Figure 3, whose intersection with the acquired optimal impedance domain represents the desired matching zone under stricter conditions. It contains ZL,ctr, but the range is cut in half from the previous one, which will probably increase the matching difficulty. As GP is a less prominent metric, the newly planned impedance space was handled as a preferred region rather than a mandatory objective.

[Figure 3 annotations: GP circles of 8 dB at 30 GHz and 32 GHz; ZL,ctr (11 + j12.9 Ω); preferred region; optimal impedance domain.]

In any case, it is necessary to establish the cell's impedance model to analyze broadband matching and as a meaningful reference for deciding the network topology. Because the intrinsic output circuit of a field-effect transistor (FET) can be thought of as a parallel connection of the voltage-controlled current source (VCCS), internal conductance Gds, and Cout, the output impedance of a GaN HEMT equates to the parallel RoutCout model, as illustrated in Figure 4. Cout is mostly made up of Cds and Cgd, with Cgd playing a minor role. The optimal load resistance Ropt can be roughly estimated using the Cripps loadline method [4], written in Equation (3); empirically by Equation (4); or straight from the acquired ZL,ctr, as indicated in Equation (5). These correspond to Ropt values of 40 Ω, 32 Ω, and 26 Ω for the adopted 46 × 8 µm device, with VD = 12 V, knee voltage Vknee = 2 V, maximum current Imax = 0.5 A, and conservative simulation target Pout = 32 dBm. Since the classic loadline theory is based on several ideal assumptions and does not account for non-negligible parasitic effects regardless of operational class, the Ropt inferred from the load-pull results will be more in line with the practical scenario. Cout can be obtained in a similar manner. However, note that unlike Ropt, which is somewhat customized, Cout is an innate parameter of the device, with an exact value in a given condition. Thus, there will be some errors when using the familiar extraction Formula (6), but the computed figure of 0.26 pF at 28 GHz is worth taking as an initial guess for Cout. Next, different Cout values are de-embedded from the load-pull contours derived at plane 'B' (see Figure 4) in 0.01 pF steps within a small interval of 0.24–0.28 pF. When their conjugate mirror contours are observed back to the position symmetrical to the real axis of the Smith chart, the de-embedding process is judged as complete, as contours at the current generator plane ought to be frequency-independent [4,23]. Cout was then identified to be 0.27 pF. In the same way, the Cout of the 46 × 4 µm cell is 0.13 pF, and the Ropt for the driver stage was selected as 75 Ω.
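The three Ropt estimates and the initial Cout guess can be reproduced with a few lines of arithmetic. The sketch below uses commonly quoted textbook forms for the loadline, empirical, and parallel-RC extraction formulas; these are assumed here because Equations (3) to (6) are not reproduced in this text, although they do return the values quoted above (about 40 Ω, 32 Ω, 26 Ω, and roughly 0.26 pF).

```python
import math

# Values stated in the text.
V_D = 12.0                         # drain bias, V
V_KNEE = 2.0                       # knee voltage, V
I_MAX = 0.5                        # maximum drain current, A
P_OUT_W = 10 ** (32.0 / 10.0) / 1000.0   # 32 dBm expressed in watts
Z_L_CTR = complex(11.0, 12.9)      # center load impedance, ohm
F_EXTRACT = 28e9                   # frequency used for the C_out guess, Hz

# ASSUMED form of the Cripps loadline estimate: R = 2 (V_D - V_knee) / I_max.
r_loadline = 2.0 * (V_D - V_KNEE) / I_MAX

# ASSUMED empirical form: R = (V_D - V_knee)^2 / (2 P_out).
r_empirical = (V_D - V_KNEE) ** 2 / (2.0 * P_OUT_W)

# ASSUMED parallel-RC extraction: the load admittance is the conjugate match
# of R_opt || C_out, so Re{Y_L} = 1/R_opt and -Im{Y_L} = omega * C_out.
y_l_ctr = 1.0 / Z_L_CTR
r_loadpull = 1.0 / y_l_ctr.real
c_out_f = -y_l_ctr.imag / (2.0 * math.pi * F_EXTRACT)

print(f"Loadline R_opt:  {r_loadline:.1f} ohm")
print(f"Empirical R_opt: {r_empirical:.1f} ohm")
print(f"Load-pull R_opt: {r_loadpull:.1f} ohm")
print(f"Initial C_out:   {c_out_f * 1e12:.3f} pF")
```

The last value is only the starting point for the 0.01 pF de-embedding sweep described above, which settles on 0.27 pF.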

Harmonic Load-Pull Analysis and Determination of the Phase Avoidance Interval
According to the law of energy conservation, the total source power comprising the DC supply and the incident power Pin equals the Pout at the fundamental and harmonic frequencies plus the power Pdiss dissipated as heat in the transistor, written as Equation (7). Harmonic impedances could be treated as almost purely reactive, i.e., the harmonic power contribution is modest, but they still act on the overlap of output V-I waveforms in the time domain, altering Pdiss. Thus, the second and third harmonic loads were varied one by one along the near periphery of the Smith chart, while keeping ZL,ctr at all fundamentals and the open circuit for other harmonic terminations to investigate the effect of their phase on the PAE of the 46 × 8 μm cell separately [14,24]. As seen from Figure 5a, the resulting PAE drops remarkably once the optimal phase is reached and curve families show deep notches in the 170–270° range, which indicates that the second harmonic termination leads to an increase in unwanted power dissipation when it is transformed into this phase interval. The peak-to-peak PAE reduction is about 8% to 3% from 24 GHz to 30 GHz. In contrast, except for a milder decline in the range of 180–280°, PAE curves in Figure 5b fluctuate little in the rest of the phase interval, which can be accepted as suitable ranges. These findings echo the earlier supposition that higher-order harmonics dampened by Cout lack enough strength to significantly affect the drain voltage waveform and thus are less effective against PAE. For most cases, only the second and third harmonics deserve to be discussed. Manipulating more but minor harmonic objects in the broadband will multiply layout patterns and introduce higher losses than they are worth.
Additionally, after repeating the present simulations at different gain compression points corresponding to Pin, we found that the phase ranges to be avoided converge, just with distinguishable differences in the degree of depression, and the deviation of the phase of the load reflection coefficient at the PAE valley due to Pin variations is less than 10° for each frequency. Figure 6 shows the detailed harmonic load-pull simulation results for PAE at 23 dBm of Pin.
Figure 4. Equivalent parallel RC model for the large-signal output impedance of an HEMT.

Micromachines 2023, 14, x FOR PEER REVIEW

In summary, the OMN design should strike a balance to prevent the harmonic impedances, especially the second one, from falling into the low-efficiency region. Fortunately, the wide tolerance range of 250° allows us to confidently concentrate on developing the fundamental MN, supplementing interventions with harmonic control, if necessary. The preferred phase location is considered between 0 and 150°.
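A candidate OMN can be screened against these phase intervals with a short helper. The sketch below assumes the harmonic load phase is that of the reflection coefficient referenced to 50 Ω and wrapped to 0–360°; the reference impedance and the two sample loads are assumptions for illustration, while the interval limits are those quoted above.

```python
import cmath
import math

# Phase intervals (degrees) quoted in the text.
AVOID_2ND = (170.0, 270.0)    # second-harmonic low-PAE interval
AVOID_3RD = (180.0, 280.0)    # third-harmonic milder-decline interval
PREFERRED = (0.0, 150.0)      # preferred phase location

def gamma_phase_deg(z_load, z_ref=50.0):
    """Phase of the load reflection coefficient, wrapped to [0, 360).

    ASSUMPTION: the phase is referenced to z_ref = 50 ohm.
    """
    gamma = (z_load - z_ref) / (z_load + z_ref)
    return math.degrees(cmath.phase(gamma)) % 360.0

def in_interval(phase_deg, interval):
    """True if the phase lies inside the closed interval."""
    low, high = interval
    return low <= phase_deg <= high

# The union of both avoidance intervals spans 170 to 280 degrees, which
# leaves the 250 degree tolerance range mentioned in the text.
avoid_span = max(AVOID_2ND[1], AVOID_3RD[1]) - min(AVOID_2ND[0], AVOID_3RD[0])
tolerance_deg = 360.0 - avoid_span

# HYPOTHETICAL purely reactive second-harmonic loads (ohm).
for z in (complex(0.0, 50.0), complex(0.0, 0.0)):
    phase = gamma_phase_deg(z)
    status = "avoid" if in_interval(phase, AVOID_2ND) else "acceptable"
    print(f"Z = {z}: phase = {phase:.1f} deg -> {status}")
print(f"Tolerance range: {tolerance_deg:.0f} deg")
```

An inductive reactance of +j50 Ω lands at 90°, inside the preferred range, whereas a short circuit lands at 180°, inside the second-harmonic interval to be avoided.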

Mixed-Element Realization Method and Layout Considerations
Objectively speaking, in a manufactured MMIC, only some operable parameters of components can undergo limited unidirectional adjustment by physical trimming (e.g., laser), which requires sensible forethought and a well-planned setup during the design phase. The difficult-to-change nature implies that to accurately anticipate the real-world behavior of MMICs, especially for compact designs, we need rigorous electromagnetic (EM) simulations, which are time-consuming. When individually designed MNs and cells are cascaded into a complete amplifier following a generic modular implementation process, significant performance offsets tend to occur which are hard to eliminate. Each modification often involves performing thorough EM/circuit co-simulations to understand the corresponding knock-on effects, making it difficult to locate the root causes of problems or sensitive factors.
In view of the above facts, the schematic design alternated with layout replacement, continuously considering various EM influences throughout the development procedure, and the overall circuit was built in a step-by-step fashion to greatly reduce the difficulty and the number of optimization iterations in the final joint-tuning phase, ensuring that the desired results can be obtained efficiently.
For layout convenience and physical feasibility, MNs were constructed in a mixed-element style with microstrip lines and MIM capacitors. Generally, a section of transmission line with electrical length θ and characteristic impedance Z0 can be represented by a symmetric π-shaped network, described in Figure 7a. Equating the Z-matrices of the two networks, Z|Tline = Z|π, gives the series reactance and parallel susceptance of the equivalent π network. For a transmission line segment whose physical length l is much smaller than the wavelength λ (l < λ/8), its equivalent parallel susceptance and series reactance are approximately linear in ω, with slopes set by Z0, l, and the phase velocity vp (equivalently, the propagation constant β = ω/vp). Therefore, the single-frequency equivalence technique illustrated in Figure 7b enables the approximate substitution of a series inductor with a distributed element [25]. Firstly, the obtained lumped prototype is decomposed into several cascaded subsections as needed, and one of the lossless units C1-L1-C2 is designated. Then, a prescribed symmetric π ladder with parallel capacitors Cequ centered on the inductor L1 is abstracted from its interior and replaced by a commensurate transmission line using the relations shown in Equations (12) and (13), with the chosen Z0 and in-band angular frequency ω0, where Lequ = L1, ω = ω0, and C1 and C2 are deemed to be no less than Cequ. The termination capacitors CAn in the resulting mixed-element network are the remainders of C1 and C2 after Cequ has been subtracted.
In cases where the complete C1-L1-C2 combination cannot be divided, Equation (12) suggests that the remaining fringe capacitance Cequ can be reduced by selecting a larger Z0, hence decreasing the perturbation caused by unequal substitution on matching. Further, the narrow transmission line facilitates repeated bends to save the layout area, and as reflected by Equation (13), it also behaves more like an inductor. Thereby, small transmission line widths were chosen here.
The line width was set to 10 μm with the exception of the OMN, where the line width was set to 15 μm. Supply routes and the OMN were deliberately thickened with double metal to diminish ohmic losses, enhancing the current handling capability for the former. In addition, 45° segments with perpendicular access lines were used at the corners to mitigate discontinuities, and a gap of at least three times the line width was maintained between the adjacent matching microstrip lines to lower signal coupling. The 2.5D field simulator (Momentum) built into the Advanced Design System (ADS) software was used to capture the EM effects.
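The single-frequency equivalence can be illustrated with the standard lossless-line result, in which a line of electrical length θ matches a symmetric π network whose series reactance is Z0 sin θ and whose shunt susceptance is tan(θ/2)/Z0. The sketch below uses these exact relations rather than the small-angle forms, and they are a textbook reconstruction, not necessarily the paper's own equations. The inductance, capacitances, Z0 values, and the choice of 27 GHz as ω0/2π are hypothetical.

```python
import math

def line_for_inductor(l_equ, z0, f0):
    """Electrical length and per-side shunt capacitance of the line that
    replaces a series inductor l_equ at the single frequency f0."""
    w0 = 2.0 * math.pi * f0
    x = w0 * l_equ
    if x >= z0:
        raise ValueError("Z0 must exceed the inductor reactance at f0")
    theta = math.asin(x / z0)                    # from X = Z0 sin(theta)
    c_equ = math.tan(theta / 2.0) / (w0 * z0)    # from B = tan(theta/2) / Z0
    return theta, c_equ

def z_line(z0, theta):
    """Z11 and Z12 of a lossless transmission line section."""
    return -1j * z0 / math.tan(theta), -1j * z0 / math.sin(theta)

def z_pi(l_equ, c_equ, f0):
    """Z11 and Z12 of a symmetric C-L-C pi network at f0."""
    w0 = 2.0 * math.pi * f0
    y = 1j * w0 * c_equ
    zs = 1j * w0 * l_equ
    den = y * (2.0 + y * zs)
    return (1.0 + y * zs) / den, 1.0 / den

# HYPOTHETICAL prototype section C1-L1-C2 and design choices.
L1 = 0.2e-9               # H
C1, C2 = 0.10e-12, 0.15e-12   # F
Z0 = 80.0                 # ohm, high-impedance (narrow) line
F0 = 27e9                 # Hz, assumed in-band frequency

theta, c_equ = line_for_inductor(L1, Z0, F0)
c_a1, c_a2 = C1 - c_equ, C2 - c_equ    # remaining termination capacitors

print(f"theta = {math.degrees(theta):.1f} deg, C_equ = {c_equ * 1e15:.1f} fF")
print(f"C_A1 = {c_a1 * 1e15:.1f} fF, C_A2 = {c_a2 * 1e15:.1f} fF")
```

Raising Z0 in this sketch shortens θ and lowers Cequ, which matches the argument for choosing narrow lines; the equivalence holds exactly only at F0 and degrades away from it.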

Synthesized Low-Pass OMN and ISMN
The Bode-Fano criterion clarifies the relationship between the bandwidth ∆ω and the reflection coefficient Γ(ω). For the parallel RC-type load, the achievable lossless MN must comply with the constraint (Equation (15)), where Γm is the in-band reflection coefficient magnitude (corresponding to the minimum return loss), assumed to be constant over a certain frequency bandwidth ∆f. Ropt and Cout are now known to be 26 Ω and 0.27 pF, respectively, and with an ideal S11 of -25 dB, the theoretical limit ∆f can be calculated from the rearranged Equation (16) as 24.75 GHz, which is four times the actual demand. As a result, a large enough target simulation bandwidth BWsim could be chosen for the OMN, while maintaining a low passband ripple, yet this implies additional cost and insertion loss simultaneously. Bearing in mind the principle of miniaturization and design margin, the BWsim was set to be 22 to 32 GHz, a compromise, and the rest of the MN design would follow suit.
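The quoted bandwidth limit can be checked numerically. The sketch below assumes the usual Bode-Fano bound for a parallel RC load, ∆ω · ln(1/Γm) ≤ π/(RC), rearranged for ∆f; this form is assumed here because Equations (15) and (16) are not reproduced in this text, although it does return the 24.75 GHz figure.

```python
import math

# Values stated in the text.
R_OPT = 26.0          # ohm
C_OUT = 0.27e-12      # F
S11_DB = -25.0        # assumed flat in-band reflection level, dB
BAND_HZ = 30e9 - 24e9 # required operating bandwidth, Hz

# Reflection coefficient magnitude corresponding to S11 in dB.
gamma_m = 10.0 ** (S11_DB / 20.0)

# ASSUMED Bode-Fano bound for a parallel RC load:
#   delta_omega * ln(1 / gamma_m) <= pi / (R * C)
# which gives delta_f <= 1 / (2 * R * C * ln(1 / gamma_m)).
delta_f_hz = 1.0 / (2.0 * R_OPT * C_OUT * math.log(1.0 / gamma_m))
ratio = delta_f_hz / BAND_HZ

print(f"Gamma_m = {gamma_m:.4f}")
print(f"Bode-Fano limit: {delta_f_hz / 1e9:.2f} GHz")
print(f"Ratio to the 24-30 GHz demand: {ratio:.1f}")
```

The limit applies to an ideal lossless network with a brick-wall reflection response, so it is an upper bound rather than a realizable target.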
It is possible to have infinite ∆f with zero-value capacitance, so properly handling Cout is the key to realizing a broadband OMN. There are two main ways: (1) tune existing matching elements or add specialized susceptance cancelation circuitry to minimize the effect of Cout [26,27], where passive negative susceptance networks, such as a compensating shunt inductor, are preferable and more prevalent [23], and (2) absorb Cout into matching.
Unfortunately, the Cout of 0.27 pF seems so large that neither compensation nor absorption appears to be a cost-effective or even a feasible option for on-chip wideband matching. Taking a different view, since ZL,ctr is the trade-off matching point for the entire target band and is not far from the real axis of the Smith chart, the OMN design can start with an impedance transformation from 48 Ω to real(ZL,ctr), with imag(ZL,ctr) regarded as the quantity to be compensated. It should be noted that the L-shaped MN consisting of the parasitic shunt capacitor of the 100 µm square output pad and a 1.8 pF DC-block capacitor shifts the external 50 Ω load slightly down to the capacitive half-plane of the Smith chart, which was pulled back to 48 Ω at 27 GHz in advance by a high-impedance microstrip line. The RF input side received the same treatment.
A good OMN is one that is concise and easy to implement, with adequate harmonic suppression outside the band. According to the ITR, FBW, and passband ripple of 4, 0.4, and 0.1, respectively, a fourth-order Chebyshev low-pass filter was adopted as the matching prototype, where the normalized g value of each element was determined from the lookup tables in [9] and then scaled to the 50 Ω system and the 27 GHz center frequency f0 to obtain the preliminary inductances and capacitances. Afterward, the parameters of the acquired real-to-real network were automatically adjusted with the ADS optimizer, using a random search followed by a gradient search, which finally transformed the intermediate impedance of 12 Ω to the desired ZL,ctr [13]. Following the rules described in the preceding subsection to translate the OMN into a mixed-element form, a short-circuit stub TLDB was subsequently inserted as the drain bias branch, whose layout is displayed in Figure 8, along with the magnitude of its equivalent input impedance ZDB. The characterized open-circuit point lies at 63.3 GHz, somewhat higher than the second harmonic of the top fundamental frequency. Owing to the area limitation, the physical length of TLDB cannot be increased aggressively, and its current electrical length of close to λ/8 makes the |ZDB| provided in the target band only 2.4-3.4 times larger than |ZL,ctr|, which is well below the 100 Ω magnitude. Therefore, the OMN must be further optimized to minimize the disturbance to the original frequency response after integrating the indispensable TLDB. It is worth noting that the bias tee was intentionally placed 45 µm away from the transistor's drain terminal, leaving this physical connection as a decoupling spacer to lessen the influence of the bias trace on the cell. Because the cell can only be represented by the established vendor model, such unavoidable EM interactions are not reflected in the simulation process.
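To see why a stub whose open-circuit point sits at 63.3 GHz behaves as roughly a λ/8 line in band, the following sketch estimates its electrical length and normalized reactance. It assumes an ideal, lossless, dispersion-free shorted line, so the electrical length scales linearly with frequency and |Z|/Z0 = tan θ. The real TLDB is an EM-simulated layout, so these numbers only indicate the trend; the function names are illustrative.

```python
import math

F_OPEN_HZ = 63.3e9  # open-circuit (quarter-wave) point of the stub, from the text

def stub_electrical_length_deg(f_hz, f_open_hz=F_OPEN_HZ):
    """Electrical length of the shorted stub, assuming it is 90 degrees at f_open_hz."""
    return 90.0 * f_hz / f_open_hz

def shorted_stub_normalized_reactance(f_hz, f_open_hz=F_OPEN_HZ):
    """|Z|/Z0 of an ideal shorted stub: tan(theta)."""
    return math.tan(math.radians(stub_electrical_length_deg(f_hz, f_open_hz)))

for f_ghz in (24.0, 27.0, 30.0):
    theta = stub_electrical_length_deg(f_ghz * 1e9)
    x_norm = shorted_stub_normalized_reactance(f_ghz * 1e9)
    print(f"{f_ghz:.0f} GHz: theta = {theta:.1f} deg, |Z|/Z0 = {x_norm:.2f}")
```

Under these assumptions the stub is about 38° long at 27 GHz, somewhat short of the 45° of a true λ/8 line, and its reactance stays below Z0 across the band, consistent with the modest |ZDB| reported above.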
Figure 9 shows how the response curve of the OMN rotates with frequency inside the predefined optimal impedance domain and forms a knot signifying broadband matching, the junction of which corresponds to frequencies of 24 GHz and 31.5 GHz. Although the 22-32 GHz trajectory does not enclose ZL,ctr perfectly, it falls into the narrow, preferred matching region as a whole, and the impedance at 32 GHz closely coincides with ZL,ctr, implying that the DA's performance at the high-end fundamental is well secured and will not deteriorate sharply there. Furthermore, the second and third harmonic impedances are scattered relatively widely along the edge of the Smith chart, but neither is located in the aforementioned phase avoidance interval, so no extra harmonic manipulation is required.
The gate bias circuit of the output stage consists of an open and a shorted transmission line in parallel, each with characteristic impedance Z1 (98 Ω for a single-layer metal microstrip that is 10 µm wide) and a length equal to λ/8, as shown in Figure 10. Its input impedance Zin seen at the central node is expressed as Equation (17). Compared with the conventional quarter-wave short-circuit stub, the proposed structure not only serves the same function but also improves the layout flexibility, making full use of the free space on the top and bottom sides of the chip. Moreover, the extra multiplication factor of 0.5 helps to provide a better short-circuit termination with extended bandwidth near the zeros, reducing the even-harmonic distortion from the front stage that would otherwise be amplified and mixed into the final output.
Figure 10. Impedance response of the ideal suggested even harmonic trap circuit designed at f0 compared with that of the shorted λ/4 transmission line.

$$Z_{in} = j\,\frac{Z_1}{2}\tan(2\theta), \qquad \theta = \frac{\pi}{4}\cdot\frac{f}{f_0} \tag{17}$$
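The factor of 0.5 and the even-harmonic short can be checked numerically. The sketch below assumes ideal lossless lines, so the shorted stub presents jZ1·tan θ and the open stub presents -jZ1·cot θ, with θ = π/4 at f0. It compares their parallel combination with the closed form j(Z1/2)·tan(2θ) and with a conventional shorted λ/4 stub of the same Z1. Z1 = 98 Ω and f0 = 27 GHz come from the text; the function names are illustrative.

```python
import math

F0_HZ = 27e9    # center frequency f0 from the text
Z1_OHM = 98.0   # characteristic impedance of the 10 um line from the text

def parallel_stub_impedance(f_hz, f0_hz=F0_HZ, z1=Z1_OHM):
    """Open and shorted lambda/8 stubs (at f0) connected in parallel."""
    theta = (math.pi / 4.0) * f_hz / f0_hz
    z_short = 1j * z1 * math.tan(theta)
    z_open = -1j * z1 / math.tan(theta)
    return z_short * z_open / (z_short + z_open)

def closed_form_impedance(f_hz, f0_hz=F0_HZ, z1=Z1_OHM):
    """Closed form j*(Z1/2)*tan(2*theta) of the same network."""
    theta = (math.pi / 4.0) * f_hz / f0_hz
    return 1j * (z1 / 2.0) * math.tan(2.0 * theta)

def quarter_wave_short_impedance(f_hz, f0_hz=F0_HZ, z1=Z1_OHM):
    """Conventional shorted lambda/4 stub (at f0) for comparison."""
    return 1j * z1 * math.tan((math.pi / 2.0) * f_hz / f0_hz)

f_near_2f0 = 1.9 * F0_HZ
print(abs(parallel_stub_impedance(f_near_2f0)))       # proposed structure
print(abs(quarter_wave_short_impedance(f_near_2f0)))  # conventional stub
```

Near 2f0 the proposed structure presents half the impedance magnitude of the conventional stub, which is the "better short-circuit termination with extended bandwidth" described above.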
After loading the EM model of the well-developed OMN onto the output stage cell, source-pull simulations were conducted with a series 0.73 pF DC-block capacitor in place. The real and imaginary parts of the resulting in-band ZS,opt vary in a small range, from 4.4 to 5.2 Ω and from 5.3 to -4.4 Ω, respectively, so the task of the interstage matching network (ISMN) can be simply specified as a real impedance transformation from 5 to 75 Ω. For broadband operation, it is more economical and advantageous to absorb the transistor's entire intrinsic output parasitic into the matching network, rather than canceling out Cout through an inverse characteristic network or reoptimizing the finished real-to-real MN to compensate for it. This is important for the ISMN, which must cope with a high ITR while remaining simple. Herein, a two-section Chebyshev impedance transformer centered at f0 was again adopted. Figure 11 illustrates a brief implementation flow of the ISMN for the lumped schematic phase, analogous to the OMN.

Input Impedance Model and the Synthesized Band-Pass IMN
The input matching network (IMN) is responsible for providing a complex conjugate match at the driver stage cell's input in order to optimize gain flatness and achieve a good input VSWR. Contrary to the output case, Zin seen from the gate of a GaN HEMT has a series RC nature, as demonstrated below. Referring to Figure 12 [28], the gate impedance Zg can be derived by applying Kirchhoff's laws to the small-signal equivalent circuit of the HEMT input. Since igd is small (igd ≪ igs), it is neglected for the sake of simplicity, which leads to an effective gate capacitance Cg determined by Cgs, gm, and Rs. It is evident that Zg can be regarded as a series connection of an effective gate resistor Rg_eff, formed by the sum of Rg, Ri, and Rs, and a capacitor Cg, and Zin can be expressed in the same series RC form.
Therefore, after the parallel RaCa stabilization network with a quality factor of Q and the connection line TLa are attached to the gate of the 46 × 4 µm cell loaded with all built-up post-stage circuits, the impedance Zins seen at the new input node corresponds to a series RLC arrangement, expressed as Equation (23). It comprises a capacitor Cb and a resistor Rb resulting from the parallel-to-series conversion of the stabilization network, an inductor Lb approximating TLa, and Zin simplified as Rin in series with Cin, as illustrated in Figure 13, which also shows the parameters of the equivalent RLC model fitted from simulations and the agreement between its impedance Zequ and Zins. The fit of the derived model is excellent at the low end of 22-32 GHz but declines to varying degrees as the frequency rises. Figure 14 reveals that this is attributed to irregular changes in the real part of Zins, whose peak and valley occur at 28.3 GHz and 32 GHz, respectively; the resistance in the model circuit was taken as their arithmetic mean value of 7.1 Ω. The imaginary parts of Zins and Zequ are closely matched within the 10 GHz bandwidth, with no major undulating departures from each other, indicating that LC series resonance is the dominant form of the reactance of Zins despite its complex behavior. The simulation results confirm the RLC model suggested by the theoretical analysis and the acceptability of the selected circuit parameters.
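The parallel-to-series conversion that turns the RaCa stabilization network into the series pair Rb and Cb can be sketched as follows. This uses the standard single-frequency equivalence with Q = ωRaCa, giving Rb = Ra/(1 + Q²) and Cb = Ca(1 + Q²)/Q². The component values below are hypothetical placeholders, since the paper's fitted values are given only in Figure 13, and the function name is illustrative.

```python
import math

def parallel_rc_to_series(r_par, c_par, f_hz):
    """Convert a parallel RC network to its series RC equivalent at f_hz."""
    w = 2.0 * math.pi * f_hz
    q = w * r_par * c_par                  # quality factor of the parallel RC
    r_ser = r_par / (1.0 + q * q)
    c_ser = c_par * (1.0 + q * q) / (q * q)
    return q, r_ser, c_ser

# Hypothetical values for illustration only (not the paper's Ra and Ca)
RA_OHM = 50.0
CA_F = 0.5e-12
F_HZ = 27e9

q, rb, cb = parallel_rc_to_series(RA_OHM, CA_F, F_HZ)
w = 2.0 * math.pi * F_HZ
z_parallel = 1.0 / (1.0 / RA_OHM + 1j * w * CA_F)  # direct evaluation
z_series = rb + 1.0 / (1j * w * cb)                # series equivalent
print(f"Q = {q:.2f}, Rb = {rb:.2f} ohm, Cb = {cb * 1e12:.3f} pF")
```

The equivalence is exact only at the chosen frequency, which is one reason a fitted series RLC model can drift from Zins toward the band edges.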
Figure 13. Input impedance characteristics of the stabilized 46 × 4 µm cell and the series RLC equivalent circuit.
The dominant reactive constraint of the transistor input is a vital limiting factor for bandwidth extension, as is Cout. Thus, the optimal realization of the broadband IMN is a band-pass structure that absorbs the extracted series resonant components. As a rule of thumb, higher-order filters broaden the bandwidth and achieve steeper stopband attenuation, but a rise in order requires more elements, takes more time for EM optimization, and does not offer a linear improvement [11]. Another aspect to emphasize is that a band-pass network requires twice as many elements as its same-order low-pass counterpart. On balance, the IMN was built as a second-order band-pass filter using closed-form solutions obtained via mathematical derivations [10,14,15], and the detailed design steps are given in Figure 15. Because the RLC model is not entirely equivalent to Zins, the synthesized network must be fine-tuned against the actual frequency response before being converted into a mixed-element form, with the microstrip line TL2 serving as the gate bias path.
Given that the OMN is pivotal in deciding whether the Pout and PAE objectives can be met, great effort was invested in it early on and good outcomes were achieved. Hence, the joint-tuning phase solely entails adjusting the ISMN and the IMN. To enable a broadband gain response and input match while maintaining favorable large-signal performance, concurrent small- and large-signal graded optimizations were carried out. The term "graded" refers to assigning more weight to optimization goals at the upper frequencies, which are the area of concern, so that a better fit is produced sooner than with equal-weight settings.

Loss Analysis of the MN and Stability of DA
Figure 16 depicts the entire circuit schematic, where the gate and drain feeds for each stage are provided independently and isolated from one another to eliminate interference and thus improve stability. To quantify the insertion loss (IL) associated with each MN, i.e., the difference between the power entering and leaving the MN as described in Equation (24), the DA was injected with a fixed low-power stimulus Pin well below the 1 dB gain compression point.
As Figure 17 illustrates, the six power sweep curves corresponding to the nodes indicated by the red labels in Figure 16 increase linearly for Pin below 10 dBm, and the power difference between nodes in such class-A operation is nearly constant. Therefore, we read all node powers at an arbitrary Pin of 4 dBm and obtained ILs of 0.78, 1.48, and 3.07 dB for the OMN, ISMN, and IMN at 28 GHz, respectively. When the RF output power is progressively saturated, the curve Pc displays a quasi-proportional growth trend that deviates only mildly from linearity, which indicates that the front stage neither works in the deep back-off region of low efficiency nor suffers from insufficient driving power, avoiding unnecessary DC consumption and distortion generation. As a consequence, the bias Q-point and the staging ratio of 1:2 were verified to be appropriate to some extent. Likewise, the overall frequency responses shown in Figure 18 can be obtained. To achieve the highest possible PAE in the context of compactness, the OMN was implemented in a simple structure using high characteristic impedance microstrip lines with a double-layer metal, and its maximum IL does not exceed 0.9 dB over 24-30 GHz. In contrast, the IL trend of the ISMN and the IMN compensates well for the device's negative gain roll-off slope, with the IMN being the main contributor, which explains its large low-frequency loss. The total IL of the three MNs stays within 9.23 dB in the designed band; it decreases rapidly with increasing frequency, levels off, and then grows steadily, with the valley at 28 GHz corresponding to a minimum value of 5.33 dB.
Micromachines 2023, 14, x FOR PEER REVIEW 18 of 26
Figure 16. Schematic diagram of the two-stage DA. Pout is simulated at nodes a-d in Figure 17.
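As a quick arithmetic check of the 28 GHz figures quoted above, the sketch below sums the three insertion losses and converts each one into the fraction of power that survives the network. It assumes only that IL is the dB difference between the power entering and leaving each MN, as stated for Equation (24); the variable names are illustrative.

```python
# Insertion losses at 28 GHz quoted in the text, in dB
il_db = {"OMN": 0.78, "ISMN": 1.48, "IMN": 3.07}

# Losses in dB add along the cascade
total_il_db = sum(il_db.values())

# Fraction of the entering power that leaves each network
power_fraction = {name: 10 ** (-loss / 10.0) for name, loss in il_db.items()}

print(f"Total IL at 28 GHz: {total_il_db:.2f} dB")
for name, frac in power_fraction.items():
    print(f"{name}: {100.0 * frac:.1f}% of the entering power is delivered")
```

The sum reproduces the 5.33 dB minimum total IL, and the IMN alone passes slightly less than half of the power it receives.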
Apart from the inherent dissipation loss of the MN, the mismatch loss is another contributor, and together they shape the gain flatness of the DA. The total efficiency ηtol of a lossy MN is defined as the ratio of the delivered power PL (i.e., Pout_MN) absorbed by the load to the available power Pavs from the source Vg, expressed as Equation (25), where ηloss is the MN's transmission efficiency, which equals 1 in the absence of the MN. The matching efficiency ηmatch can be further derived as Equation (26), which equals 1 in the case of conjugate matching, where ZS = RS + jXS is the source impedance and Zin = Rin + jXin is the input impedance of the MN with a load attached. Table 3 details the simulated ηmatch and ηloss for all MNs in the range of 23-31 GHz, and their product ηtol is plotted in Figure 19. These findings validate the effectiveness of the filter synthesis theory in creating broadband MNs and explicitly reflect the design focus of each MN. The OMN scores well among the various efficiency indicators: its in-band ηtol surpasses 77.2% and fluctuates by no more than 5.4%, ensuring a smooth and low-loss power transfer. As for the ISMN, ηmatch increases with frequency more prominently than ηloss, with the difference in ηmatch between the highest and lowest fundamental frequencies reaching 27.2%. This represents the broadband strategy implemented by the ISMN, which matches better at the high-frequency side and introduces a proper low-frequency mismatch to cope with the driving power demanded by the output stage growing with frequency. In addition, ηloss remains above 59.2%, so the adverse impact of the associated IL on gain and PAE is still tolerable. Thanks to the applied band-pass filter structure, the IMN manifests a superior and nearly uniform ηmatch. Nonetheless, we intentionally sacrifice ηloss, losing more than half of the transmitted signal energy, to attain a broadband gain response with enhanced stability, which seldom compromises large-signal performance.
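A minimal sketch of the two efficiencies follows. It assumes that the matching efficiency takes the standard power-transfer form ηmatch = 4·RS·Rin / ((RS + Rin)² + (XS + Xin)²), which equals 1 under conjugate matching as the text requires, and that ηtol is the product ηmatch·ηloss as stated. The impedance values in the example are hypothetical, and the function names are illustrative.

```python
def matching_efficiency(z_source, z_in):
    """Fraction of the available source power that enters the network."""
    rs, xs = z_source.real, z_source.imag
    rin, xin = z_in.real, z_in.imag
    return 4.0 * rs * rin / ((rs + rin) ** 2 + (xs + xin) ** 2)

def total_efficiency(eta_match, eta_loss):
    """Delivered load power over available source power."""
    return eta_match * eta_loss

# Hypothetical impedances for illustration only
eta_conj = matching_efficiency(5 + 3j, 5 - 3j)         # conjugate match
eta_mismatch = matching_efficiency(50 + 0j, 100 + 0j)  # 2:1 resistive mismatch

print(f"conjugate match: {eta_conj:.3f}")
print(f"2:1 mismatch:    {eta_mismatch:.3f}")
print(f"eta_tol for eta_match = 0.9, eta_loss = 0.6: {total_efficiency(0.9, 0.6):.2f}")
```

Separating the two terms in this way makes it visible whether a network loses power through deliberate mismatch (ηmatch) or through dissipation (ηloss), which is the distinction drawn for the ISMN and IMN above.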
Overall, the three MNs exhibit a gradually decreasing ηtol from the OMN to the IMN, with peaks lying in the upper-frequency portion (28-30 GHz).
Before tape-out, it is mandatory to carefully check the overall stability of the DA. To ensure that the DA is unconditionally stable under any possible operating conditions, the following countermeasures were incorporated into the design: (1) connecting a parallel RC network in front of the two transistors, (2) inserting a resistor in both gate bias paths, (3) shunting the gate and drain supply lines of the driver stage with a series combination of a 3 pF bypass capacitor and a 40 Ω resistor, and (4) exploiting the resistive losses of all MNs. Figure 20 shows that the simulated stability factors µ and K are greater than unity over DC-100 GHz.
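For reference, both stability factors can be computed from two-port S-parameters with the standard Rollett K and Edwards-Sinsky µ definitions, as sketched below. The S-parameter values are hypothetical single-frequency numbers chosen for illustration and are not taken from the paper; in practice the check is repeated at every frequency point of the sweep.

```python
def stability_factors(s11, s12, s21, s22):
    """Return (K, mu) for a two-port described by its S-parameters."""
    s11, s12, s21, s22 = (complex(s) for s in (s11, s12, s21, s22))
    delta = s11 * s22 - s12 * s21
    k = (1.0 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (
        2.0 * abs(s12 * s21)
    )
    mu = (1.0 - abs(s11) ** 2) / (
        abs(s22 - delta * s11.conjugate()) + abs(s12 * s21)
    )
    return k, mu

# Hypothetical S-parameters at one frequency (linear magnitude, zero phase)
k, mu = stability_factors(s11=0.1, s12=0.01, s21=10.0, s22=0.2)
print(f"K = {k:.3f}, mu = {mu:.3f}")
```

A value of µ > 1 alone is sufficient for unconditional stability of the two-port, whereas K > 1 must be accompanied by an auxiliary condition such as |Δ| < 1.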

Probed Measurement Results
To avoid bonding inductive gold wire interconnects to the RF pads, which would introduce uncertainty into the original impedance matching, we did not assemble the bare die into a fixture. Instead, we conducted on-wafer testing of the fabricated MMIC DA on a manually controlled probe station using ground-signal-ground (GSG) microwave probes and dedicated DC probe cards for proper DC decoupling. Figure 21 shows a close-up micrograph, where tapers matching the device drain pin width were added to reduce step junction discontinuities. However, in this setup the thermal contact between the backside of the chip and the heat sink surface is poor and leads to inferior heat dissipation [22], which may provoke thermal degradation or even burnout of the transistor. To prevent overheating, a 12 V pulsed supply with a 5% duty cycle (200 µs pulse width and 4 ms period) was applied. First, as a necessary but quick partial stability inspection, the gate bias voltage was slowly raised from the threshold of -1.5 V to bring the quiescent current draw of the DA to 179 mA, in line with the simulation environment. This was done both without and with a weak sinusoidal excitation, and no oscillations were observed across the full spectrum in either case. Afterward, a series of measurements were carried out at room temperature using the N5183A MXG analog signal generator, N5245A PNA-X network analyzer, and N9021B MXA signal analyzer from Keysight (Santa Rosa, CA, USA).

Probed Measurement Results
Rather than assembling the bare die into a fixture, which would require bonding inductive gold wires to the RF pads and introduce uncertainty into the original impedance matching, we conducted on-wafer testing of the fabricated MMIC DA using ground-signal-ground (GSG) microwave probes and dedicated DC probe cards for proper DC decoupling on a manually controlled probe station. Figure 21 shows a close-up micrograph, where tapers matched to the device drain pin width were added to reduce step-junction discontinuities. However, in this setup the thermal contact between the backside of the chip and the heat sink surface is poor, leading to inferior heat dissipation [22], which may provoke thermal degradation or even burnout of the transistor. To prevent overheating, a 12 V pulsed supply with a 5% duty cycle (200 μs pulse width and 4 ms period) was applied. First, as a necessary but quick partial stability inspection, the gate bias voltage was slowly raised from the threshold of -1.5 V to bring the quiescent current drain of the DA to 179 mA, in line with the simulation environment, both in the absence and in the presence of weak sinusoidal excitation; no oscillations were observed across the full spectrum during either process. Afterward, a series of measurements were carried out at room temperature using the N5183A MXG analog signal generator, N5245A PNA-X network analyzer, and N9021B MXA signal analyzer from Keysight (Santa Rosa, CA, USA).

Small-Signal Characterization
Figure 22 shows that the measured S21 ranges between 18.3 and 20.3 dB within the 24-30 GHz operating bandwidth, which is higher than predicted in the CW mode from 24 GHz onward, with an average in-band excess of about 1.7 dB. The simulated S11 exhibits a distinct band-pass Chebyshev equal-ripple response, whereas S22 is relatively poor. This is expected conceptually: the output stage is power matched via load-pull rather than conjugately matched for small signals, so some S22 performance is sacrificed in exchange for enhanced Pout. The measured in-band S11 and S22 stay below -12 dB and -13.9 dB, respectively, suggesting that the input and output ports are well matched.
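
To put the reported S11 and S22 bounds in more tangible terms, the following minimal Python sketch converts a reflection coefficient in dB into its linear magnitude, the fraction of incident power reflected, the VSWR, and the mismatch loss. It relies only on the standard textbook relations; the function names are illustrative and are not part of the original measurement flow.

```python
import math

def reflection_magnitude(s_db: float) -> float:
    """Linear |Gamma| from a reflection coefficient given in dB (20*log10 scale)."""
    return 10 ** (s_db / 20)

def reflected_power_fraction(s_db: float) -> float:
    """Fraction of incident power that is reflected, |Gamma|^2."""
    return 10 ** (s_db / 10)

def vswr(s_db: float) -> float:
    """Voltage standing wave ratio for the given reflection level."""
    gamma = reflection_magnitude(s_db)
    return (1 + gamma) / (1 - gamma)

def mismatch_loss_db(s_db: float) -> float:
    """Power lost to reflection, in dB (positive number)."""
    return -10 * math.log10(1 - reflected_power_fraction(s_db))

# Worst-case in-band bounds quoted in the text for the measured S11 and S22.
for label, s_db in (("S11", -12.0), ("S22", -13.9)):
    print(
        f"{label} = {s_db:6.1f} dB -> |Gamma| = {reflection_magnitude(s_db):.3f}, "
        f"reflected = {100 * reflected_power_fraction(s_db):.1f}%, "
        f"VSWR = {vswr(s_db):.2f}, "
        f"mismatch loss = {mismatch_loss_db(s_db):.2f} dB"
    )
```

With these relations, a bound of -12 dB corresponds to roughly 6% of the incident power being reflected, and -13.9 dB to roughly 4%.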

Large-Signal Characterization
A close examination of Figures 23 and 24 reveals that the measured P4dB and Psat over the band of interest are 29.7-30.8 dBm and 30.1-31.1 dBm, respectively, following almost the same trend as expected, with an overall decline of only about 1.7 dB and 1.5 dB from their simulated counterparts. The PAE at P4dB exceeds 30.9% across the band, with a peak of 39.8% at f0, and the curve is shifted by about 1 GHz toward higher frequencies compared to the simulation. Figure 25 depicts power sweep curves at low, middle, and high fundamental frequencies, which show that the DA exhibits the anticipated soft compression behavior, generally reaching saturation in the vicinity of P6dB, although more drive power is required in the low-frequency case. This might be due to nonlinear trapping effects, which are more pronounced in GaN-on-Si than in GaN-on-SiC because the former has a higher defect density. In addition, the highest PAE occurs at about 2 dB back-off from the saturation point, and the total current consumption grows to 241-283 mA when the DA is driven into 8 dB gain compression. No instability was detected throughout the tests, so the applied stability measures appear to be sufficient.
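
For readers who want to relate the quoted dBm and PAE figures to absolute quantities, the sketch below converts dBm to watts and evaluates the standard definition PAE = (Pout - Pin) / Pdc. It is a minimal illustration rather than the authors' post-processing: the helper names are hypothetical, and the operating point passed to `pae` at the end (output power, input power, and supply current) is made up for demonstration only and is not a measured data point from this work. It also assumes that the supply current is the value during the pulse.

```python
def dbm_to_watts(p_dbm: float) -> float:
    """Convert a power level from dBm to watts."""
    return 10 ** ((p_dbm - 30.0) / 10.0)

def pae(p_out_dbm: float, p_in_dbm: float, v_supply: float, i_supply: float) -> float:
    """Power-added efficiency as a fraction: (Pout - Pin) / Pdc."""
    p_dc = v_supply * i_supply  # DC power drawn during the pulse, in watts
    return (dbm_to_watts(p_out_dbm) - dbm_to_watts(p_in_dbm)) / p_dc

# Saturated output power range reported in the text, expressed in watts.
for p_dbm in (30.1, 31.1):
    print(f"{p_dbm:.1f} dBm = {dbm_to_watts(p_dbm):.3f} W")

# Hypothetical operating point (NOT from the paper): 30 dBm out, 15 dBm in,
# 12 V supply, 250 mA total drain current.
example = pae(p_out_dbm=30.0, p_in_dbm=15.0, v_supply=12.0, i_supply=0.25)
print(f"Illustrative PAE = {100 * example:.1f}%")
```

The first loop shows that the reported Psat range of 30.1-31.1 dBm corresponds to roughly 1.0-1.3 W, consistent with describing the circuit as a 1-watt-class amplifier.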

The measured and simulated results agree well in terms of trends but differ somewhat in absolute values and show slight frequency shifts, which are mainly attributed to: the accuracy of the transistor model; inevitable and stochastic manufacturing errors, especially in MIM capacitors; channel temperature variations due to self-heating from gate finger dissipation; the calibration status of the entire test system; and the contact quality between the probes and RF pads. Note that the nonlinear scalable device model provided by the foundry was established from measurements of typical test kits, and the models for the specific sizes used here were extrapolated through mathematical fitting from several validated samples. Therefore, they may not achieve the same accuracy as those of the typical cells. Table 4 summarizes the experimental results in comparison to previously reported PAs in similar frequency bands.

Conclusions
This paper demonstrates a fully integrated 1-watt GaN MMIC DA for the 24-30 GHz band. The optimal impedance domain with a moderate range was delineated through simplified load-pull simulations according to reasonably preset goals. The reference center impedance ZL,ctr was calculated by weighted averaging, with an emphasis on high-frequency PAE. Because the proposed target space consists of an area and a single point, rather than discrete compromise matching points at different fundamental frequencies or arcs formed by theoretical impedance sets as in class-J amplifiers, the implementation avoids the tricky impedance-tracking problem and relieves the pressure of broadband design. The low-loss OMN without harmonic control circuitry was then developed by applying Chebyshev filter synthesis theory, paired with CAD post-optimization to ensure that the transformed in-band impedance trajectory stays inside the preferred region, which better resolves the conflict between multiple metrics over a broad bandwidth. Using the same fourth-order low-pass ladder based on a Chebyshev response, the ISMN differs from the OMN in that it exploits the transistor's Cout de-embedded from load-pull contours to complete the compact broadband matching at an ITR as high as 15.
For the series RLC model established by theoretical analysis and simulation, which is equivalent to the stabilized cell's input impedance, the IMN was realized as a band-pass filter using closed-form solutions to absorb the resonant input parasitics, achieving a minimum return loss of 12 dB over the 6 GHz bandwidth. The step-by-step circuit construction strategy, rather than direct modular assembly, allows the remaining MNs to be designed on the basis of frequency responses from a progressively built and refined EM model, which lessens the risk of unmitigated performance deviations or laborious adjustments during the joint-tuning phase and thus increases the design efficiency and success rate. Moreover, this is especially beneficial for amplifiers with larger-scale architectures, such as N-way combining and differential topologies, because the reduction in simulation time is more significant than with conventional approaches. The solutions adopted in this work for tackling broadband problems were outlined in terms of analyzing the losses of each MN, and the experimental results validate the proposed design scheme. The realized two-stage DA has an average linear gain of 19.3 dB with a variation within ±1 dB over the operating bandwidth and a state-of-the-art PAE of up to 39.8%, along with tight integration and a small size of 1.29 mm², contributing to lower fabrication and production costs, all of which demonstrate the suitability of the presented design for large-scale commercial 5G mmW applications.