A Programmable Complex Impedance IC for Scalable and Reconfigurable Meta-Atoms

This article presents the design of a fully-custom integrated circuit (IC) suitable for adjusting the complex impedance of individual meta-atoms to enable programmability of metasurface systems. Implemented in a 0.18 μm mixed-signal CMOS process, the circuit utilizes four integrated complex impedance elements, each one with 2^16 available states. The impedance elements are optimized between 2–6 GHz, thus covering the entire S-band as well as half of the C-band. An asynchronous digital control circuit based on Quasi Delay Insensitive (QDI) circuits has been employed, offering basic communication capabilities and software programmability, but most importantly, the clockless operation allows for extreme scalability, ultra-low static power consumption and low electromagnetic (EM) radiated emissions. An array of meta-atoms utilizing the ICs can form metasurfaces with arbitrary sizes and shapes, on both rigid and flexible substrates, adjusting their surface impedance without interfering with incoming EM waves. Measurement results of the 2.2 mm × 2.2 mm IC demonstrate that it achieves a reconfiguration frequency of 1 MHz for all loading elements, whilst consuming 324 μW of static power.


I. INTRODUCTION
METAMATERIALS are artificial structures that exhibit electromagnetic (EM) properties not found in nature. In normal materials, permittivity (ε) and permeability (μ) are mostly positive. Some metals have negative ε or negative μ at short wavelengths, but there are no materials in nature having both ε and μ negative. It was initially shown in [1] that materials with negative refractive index could be engineered. The first experiments were carried out in [2] and, since then, metamaterials have shown extraordinary EM properties such as cloaking [3], [4], [5], superlensing [6], and improved Radio Frequency (RF) and microwave systems [7], [8], [9].
Despite their rapid growth, all metamaterial and metasurface (MSF) systems to date are tailored to a single or, at most, a few applications, and have a limited tuning range. Some steps have been made towards programmable and reconfigurable metamaterials [27], with systems aiming to tune the metamaterial using PIN diodes [28], [29], varactor diodes [30], MEMS [31], [32] and liquid crystals [33], but these systems require bulky electronics and their programmability is limited. The limitations of metamaterials arise from the enablers themselves, since discrete component-based approaches consist of only passive components that do not have the capability of actively adjusting the EM settings, as an IC would. The next step in metamaterial systems is the real-time programmability of MSFs, whereby the metamaterial or MSF is seen as an array of meta-atoms with discrete values for both amplitude and phase. Some practical examples have been described in [34], [35], [36], [37]. A theoretical approach is described in [38] whereby, using networks of custom-designed chips, the metamaterial can be programmatically adjusted via a computer interface to achieve such high performance. A design approach of a software-defined metasurface is presented in [39]. Even though this is the natural progress of research, there are many restrictions and constraints for the design of such a complicated, multidisciplinary metamaterial. They arise from both the application at high frequencies and the limitations in state-of-the-art manufacturing processes. The restrictions and constraints are explored throughout the article. In this article, the design and implementation of a custom Integrated Circuit (IC) for programmable MSFs is presented. The IC incorporates integrated impedance Loading Elements (LE) including integrated inductors, which adjust the surface impedance of an MSF meta-atom. The chip also includes digital circuits for inter-chip communication with neighbouring chips.
The bias voltages of the LEs are conveyed digitally to the chips, and the internal Digital-to-Analog Converters (DAC) convert them to analog values and set the bias voltage of the LE lumped components. The surface impedance of each meta-atom is controlled locally in a decentralized approach. Moreover, the surface impedance of an MSF is programmatically controlled by controlling both amplitude and phase of an array of meta-atoms that utilize the capabilities of the ICs.

A. General Structure of the Programmable Meta-Atom
The general structure of a meta-atom utilizing the IC is shown in Fig. 1, whereby the IC is part of an example meta-atom with four square metallic patches. The MSF is considered as an array of these meta-atoms placed in a symmetrical pattern. The design of this meta-atom is presented in detail in [40], describing the choices for the meta-atom's size and shape, along with its simulated performance. Here, the IC is placed at the bottom side of the Printed Circuit Board (PCB) and through-vias are used to connect the chip's ports to the metallic patches, which are placed at the top/front side of the PCB. At the bottom/back side, there are signal wires to convey the configuration data between the chips. The design of the PCB is described in detail in [41], showing the asymmetric layer stack-up, while the manufacturing of the PCB is described in [42], detailing the techniques used to manufacture wires with 45 μm width and blind vias with 150 μm diameter on high-frequency substrates. Although the presented system is simple and symmetric, important constraints are imposed on the design of the IC by the application and the operation at GHz frequencies, which are described below. Note that these constraints affect any type of MSF with ICs as part of its structure and not only the particular example presented in this article.
The meta-atom size affects both the size and the cost of the IC. The former limits the circuits that can be added to the chip's footprint, thus intelligent algorithms and network systems might be restricted as the operating frequency increases. As an example, at 60 GHz, the meta-atom size must be 1 mm × 1 mm, since c = λf gives λ ≈ 5 mm and it is recommended to have five meta-atoms per wavelength (λ). In this limited footprint, the meta-atom must accommodate signal routing, vias, board-to-board connections and the IC itself. In addition, the IC must accommodate integrated LEs for tuning the surface impedance of the meta-atom, and control circuits for communication to convey packets between neighbouring ICs. The latter (the cost of the IC) increases with the chip's size. Also, the MSF requires thousands of ICs to cover a large surface. For example, to construct a 1 m × 1 m MSF with meta-atoms of 1 mm × 1 mm, one would require 1000^2 (one million) chips. As can be seen, there is a direct trade-off between the die cost and the complexity of the control circuit, which the designers have to consider during the early stages of the design.
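The sizing argument can be checked numerically. The following Python sketch assumes free-space propagation, five meta-atoms per wavelength and one IC per meta-atom; the function names are illustrative and not part of the presented design.

```python
# Sketch: meta-atom pitch and chip count as a function of frequency.
C0 = 299_792_458.0  # speed of light in vacuum, m/s


def meta_atom_pitch_m(freq_hz: float, atoms_per_wavelength: int = 5) -> float:
    """Free-space wavelength divided by the number of meta-atoms per wavelength."""
    return C0 / freq_hz / atoms_per_wavelength


def chips_for_surface(width_m: float, height_m: float, pitch_m: float) -> int:
    """Number of ICs on a rectangular grid, assuming one IC per meta-atom."""
    return round(width_m / pitch_m) * round(height_m / pitch_m)


pitch = meta_atom_pitch_m(60e9)              # close to 1 mm at 60 GHz
chips = chips_for_surface(1.0, 1.0, 1e-3)    # 1000 x 1000 grid of 1 mm meta-atoms
print(f"pitch = {pitch * 1e3:.3f} mm, chips = {chips}")
```

Sweeping `freq_hz` in such a script makes the trade-off visible: halving the wavelength quarters the available footprint per IC and quadruples the chip count for the same surface.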
Following the IC size, the package affects the performance of the LEs and the cost of the packaged die. Typical die packages involve wire bonding the die to the package, introducing parasitics, especially inductance, between the pads and the package. For most general applications, the added inductance is insignificant to the performance of the circuit but, as the LEs are sensitive RF circuits, any added parasitics will change the impedance range significantly. This issue is minimized with the use of Wafer Level Chip Scale Packaging (WLCSP) [43]. This package technology has die pads with solder spheres added at the wafer level, so the die can be directly bonded to the printed circuit boards that carry the arrays of meta-atoms. This package also keeps the size of the packaged chip the same as the bare die, whilst most packaging technologies increase it. Finally, with WLCSP, the wire-bond procedure is not required, thus at large quantities the cost is lowered. However, the WLCSP technology limits the available I/O pins. For example, the state-of-the-art WLCSP technology at the time of this writing allows a total of twenty-five pins for a 2 mm × 2 mm die. As shown in Section II, it was necessary to use serial communication between the ICs to accommodate this limitation.
Finally, the meta-atom design should not include any other external component besides the IC because any external component that is placed on the meta-atom, would require large quantities and there needs to be enough space to include them on the MSF. By avoiding other components on the PCB, the cost of the MSF manufacturing and population on the PCB is minimized while at the same time increasing MSF reliability.

B. Inter-Chip Communication With Asynchronous Circuits
The vast majority of state-of-the-art MSFs use global tuning of their meta-atoms, imposing limitations on their systems because of the lack of local control over the MSF impedance. To achieve programmability, variable LEs must be included locally, to control the impedance of each meta-atom. By interconnecting chips with communication capabilities in all meta-atoms, each meta-atom can be adjusted accordingly to manipulate the incident EM wave in a reconfigurable way. This implies that, for high frequency EM applications, a large number of chips has to be interconnected, resulting in a challenging and unique configuration with high chances of electrical and mechanical failures. One of the main requirements that arises is to be able to dynamically scale the grid array so that it can be adapted to arbitrary MSF sizes and also allow dynamic board-to-board expansion. This will enable MSF systems that can dynamically cover irregularly shaped surfaces, using a flexible MSF that has numerous chips embedded in the system, to execute various reconfigurable functionalities by receiving commands from dedicated computer software. Finally, all these requirements impact the cost, power, EM noise and speed of the implementations. As evident from tunable MSF systems [44], [45], bulky electronics (e.g., control stations) are required and, in conjunction with the high power consuming discrete elements, the power consumption is in the order of Watts at best for an MSF of a few square centimeters.
Fig. 2. For simple and robust scalability, the system can be broken down into tiles and, at the edges of each tile, the edge connectors (con.) provide the means to connect tiles together. The edge connectors are connected to the supply rails as well, to allow powering the system from many locations. The ICs communicate via handshake, thus the system can scale up without clock signal limitations.
Fig. 2 illustrates a possible configuration of a scalable system, using the presented IC, that can be adopted to eliminate the previously described constraints. The example system is divided in tiles, with each tile having a 4 × 4 array of ICs (as an example for illustration purposes). Notice that each dot represents an IC or a meta-atom. At the edges of each tile there are connectors that allow connections between the tiles. Also, all edge-connectors have power and ground connections to allow distribution of the supply from many locations, if required. The input port to the tile is located at the bottom left corner, whereby a digital device like a field programmable gate array (FPGA) sends the configuration bits to the MSF. This tile configuration can be a simple shift-register with its size analogous to the number of chips. The input port to the shift-register is located at the bottom-left corner where the first chip of the MSF is located.
The output of the shift-register is located at the top-left corner where the last chip of the MSF is located. Each chip has storage cells located in series configuration and the incoming bits from the FPGA move through the chips and exit the output port, where they can either be discarded or can also be retrieved by the same FPGA, for verification purposes. By increasing the number of tiles, the surface size is increased, and consequently the depth of the shift-register increases.
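A behavioural sketch of this daisy-chained shift-register is given below in Python. It is an illustration only: the helper names and the small four-cell chips are hypothetical, and handshaking is ignored so that only the data movement is modelled.

```python
from collections import deque


def make_chain(n_chips: int, cells_per_chip: int) -> deque:
    """All storage cells of all chips in series; index 0 is the MSF input side."""
    return deque([0] * (n_chips * cells_per_chip))


def shift_in(chain: deque, bit: int) -> int:
    """Push one bit at the input and return the bit leaving the output port."""
    chain.appendleft(bit)
    return chain.pop()


def chip_contents(chain: deque, cells_per_chip: int) -> list:
    """Split the chain into per-chip words, first chip first."""
    cells = list(chain)
    return [cells[i:i + cells_per_chip] for i in range(0, len(cells), cells_per_chip)]


# Two hypothetical chips with four cells each (illustrative size only).
chain = make_chain(n_chips=2, cells_per_chip=4)
frame = [1, 1, 1, 1, 0, 0, 0, 0]   # bits sent first travel furthest down the chain
for b in frame:
    shift_in(chain, b)
print(chip_contents(chain, 4))      # [[0, 0, 0, 0], [1, 1, 1, 1]]

# Shifting a second frame pushes the first one out, allowing read-back.
readback = [shift_in(chain, b) for b in frame]
print(readback == frame)            # True
```

In this model the configuration word of the last chip has to be transmitted first, and adding a tile only increases the chain depth; nothing else in the procedure changes.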
The scalability of the presented network architecture is limited by the clock signal and its distribution between the tiles. This problem can be overcome by using asynchronous circuit design with handshake communications between chips and tiles. Synchronous circuits are by far predominant in digital design but, for programmable MSF systems, synchronous design has many shortcomings that make it both undesirable and unnecessary [46]. The clock tree synthesis would have to account for scalability adjustments and satisfy requirements for flexible surfaces, to unlock new potential for MSF applications conforming to walls of various shapes. In addition, the clock skew must be considered to avoid setup and hold time violations. This approach would require a detailed design for every MSF that uses the chips. Furthermore, during switching activity, all chips consume energy regardless of whether they are receiving bits or are idle. Special techniques can be used to stop switching during static conditions, but the clock tree buffers and other circuitry still consume unnecessary energy. Another issue with synchronous design that affects RF applications is the amount of EM noise that is generated during switching activity. One of the most important MSF applications is to fully absorb EM waves, but in a synchronous programmable MSF the chips will emit broadband noise [47], which can undermine an application where the absorption of the incoming EM signal is required. On the other hand, asynchronous circuit design avoids all the issues discussed here, since a synchronized clock line is not present. Using asynchronous circuits with handshake communication, the system becomes 'plug-and-play', since once the wires are connected, the newly added tiles are considered part of the system. The power consumption is also lower overall since, during static conditions, the tiles do not consume dynamic energy through clock events.
Finally, the EM emissions generated by asynchronous circuits are of lower amplitude and they are spread more evenly [48] across the MSF system which makes it suitable for an MSF absorber.
The scalability of the presented topology, using asynchronous circuit design, is far superior to that of a synchronous design, where the clock tree needs to be redesigned as the surface scales. The tiles can be easily connected together in a 'plug-and-play' fashion and the power can be provided by multiple sources from multiple locations in every tile. The power consumption of the chips, as shown by the measurements in Section IV, is a few hundred microwatts, thus even a typical FPGA can power hundreds of chips and at the same time act as the digital device that provides the configuration bits to the chips.
The main contributions of this article are summarized as follows. Firstly, a detailed asynchronous, transistor-level, control circuit design is presented. Secondly, an elaboration on the testing methodology and the design's measured performance is provided. Thirdly, a comparison between the proposed full-custom IC design approach and other tunable metasurface enabling methods is conducted. Finally, a scalability analysis of this type of system is provided; the proposed design approach eliminates the clock-tree and all issues that accompany it on arbitrarily scalable systems. In [49], a family of chips is presented from an architectural perspective, describing at top level the capabilities and applications that could be achieved by incorporating them in MSF systems. Although on-chip complex impedance circuits are well-known in the literature, the requirements and constraints that the proposed chips face and overcome make them suitable for programmable MSF systems.
The rest of the article is organized as follows. Section II presents the architecture of the IC, while Section III presents its design details. The IC measurement results are presented in Section IV and Section V concludes the article.

II. IC SYSTEM ARCHITECTURE
The architecture of the chip is illustrated in Fig. 3. It comprises three main parts: the Control Circuit (CC), the Digital-to-Analog Converters (DAC), and the Loading Elements (LE). The CC is implemented using asynchronous digital circuits to satisfy the constraints added to the architecture by the MSF applications, as described in Section I. As a consequence, the communication between the chips uses handshake protocols [50]. The CC consists of one input channel and one output channel. The communication is serial (bit-by-bit) and unidirectional, because of I/O limitations at high frequency operations. In addition, it comprises sixty-four memory cells connected in series configuration, which can be categorized as an asynchronous shift-register circuit, that stores the bits intended for the eight DACs of the chip. The output signals from the memory cells are directly connected to the DACs. The DAC architecture is an 8-bit, two-stage resistor string. This architecture guarantees a monotonic output, it comprises only passive components, it has relatively good accuracy, its size is small with easy layout matching and it can drive the LEs without an output buffer stage. The LEs consist of a MOSFET varistor and a MOSFET varactor to adjust the real and imaginary parts of the complex impedance respectively. There are four LEs in the chip located at the four corners, each one with 2^16 available states. For this work, each LE will be connected to one metal patch of the meta-atom from Fig. 1. The formed complex impedance is seen at the corresponding corner port/pad. The tuning range of the LEs is controlled by the applied voltage on the varistor and varactor gates. Having four LEs per chip allows for many meta-atom designs to be realized, for example having multiple LEs per meta-atom or having multiple metallic patches with various sizes and shapes. The architecture of the LEs is kept simple to avoid increased parasitics that reduce the available tuning range.
Each element in the design, affects the total impedance that is seen at the output port. Both parallel and series configuration can provide decent tuning range, however, the parallel configuration can reach lower resistance values, which is better for perfect absorption. Furthermore, the circuit was tested on three CMOS technologies in [41] and the selected process technology provides the most suitable compromise between cost and tuning range.
The IC architecture presented in this section is intentionally kept generic, describing only the system blocks. This way, the system can be easily upgraded with more sophisticated circuits and other goals in mind. In this case, the system is kept simple and effective to have a fast and low-power chip with a relatively small budget. Others might be willing to migrate to exotic processes and exploit the largest available tuning range, or sacrifice the consumption and speed of the circuit to adopt intelligent networks with machine learning algorithms, fault tolerance methods and smart routing techniques. However, no matter the targeted goal, the constraints described in Section I must be addressed to have a functional working system. The IC architecture proposed in this article can be adopted by a variety of MSFs as long as the PCB consists of the appropriate wiring to connect the ICs together in a network configuration and the connections between the metal patches and the LEs. The ICs can be easily programmed through software, allowing easy migration to various MSF systems.

A. RF Loading Elements
The LEs are designed to operate in the sub-6 GHz frequency range. The impedance tuning range is chosen for perfect absorption operation at 5 GHz; however, the circuits provide a decent tuning range at lower frequencies as well. Fig. 4 shows the schematic diagram of the circuit. A MOSFET varistor (M1) and a MOSFET varactor (M2) are the main components for creating the complex impedance required at the output port. The two are placed in parallel configuration. A DC block capacitor (C1) is placed in series with M1 to prevent the capacitance bias voltage (V C ) from shorting out through M1. V R is the bias voltage of M1. Lastly, the RF choke inductor (L1) is used to bias M2 without affecting the total impedance. The varactor M2 imposes a limiting factor on the available impedance range because it affects both its real and imaginary parts. As explained in detail in [34], the quality factor of the varactor affects the equivalent parallel resistance of the circuit. In [51], it was shown experimentally that the quality factor can be improved by breaking the varactor into fingers and reducing the length of each finger, which reduces the n-well resistance. Another limiting factor for the circuit is the size of the inductor, since in integrated bulk processes there is not much space for high quality inductors. Its size is chosen to fit between the pads with some clearance for routing and to possess high equivalent parallel inductance and resistance. The DC block capacitor is implemented using multiple MiM (Metal-Insulator-Metal) capacitors in parallel, because they offer better performance and smaller layout compared to their MoM (Metal-Oxide-Metal) alternatives.

B. Asynchronous Shift Register + Digital-to-Analog Converters
The IC operates using a Dual Rail (DR) representation for the data with a '1-of-2' asynchronous communication protocol to implement a quasi-delay-insensitive (QDI) scheme [52]. This circuit methodology makes timing assumptions only on the propagation delay of signals that fan out to multiple gates (isochronic forks), and makes no assumptions on gate delays. The circuit uses three types of asymmetric C-elements, which are shown in Fig. 5. The C-element of Fig. 5(a) outputs a logic '1' when all four inputs (A, B, C, D) are logic '1'. However, it outputs a logic '0' when inputs A and B are '0', with C and D being 'don't cares' for a '1' to '0' transition. The feedback inverter holds the output until one of the two cases described above is presented at the inputs. For any other case, the output is kept at its previous value. Input 'rp' corresponds to the negative triggered reset signal for the C-element. When the 'rp' signal is logic '0', the p-type transistor is turned on and forces the output to become '0'. To avoid a conflict between logic '1' and '0' in case the C-element is outputting a '1' while a reset is forced, signal 'rn' disconnects the ground from the n-type transistor path during reset. The C-element of Fig. 5(b) outputs a logic '1' when input B is '1', ignoring input A. On the other hand, it outputs a logic '0' when both A and B are '0'. For any other case, the output holds its previous state. The third asymmetric C-element used is shown in Fig. 5(c) and it requires all three inputs to be either logic '1' or logic '0' for the output to become '1' or '0' respectively. For any other case, the output holds its previous state. This C-element has reset signals 'rp' and 'rn' as in the case of the circuit in Fig. 5(a).
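The set, clear and hold conditions of the three C-elements can be summarized as a behavioural model. The Python sketch below follows only the truth conditions stated in the text; the function names are hypothetical, the previous output is passed in explicitly as `prev`, and the complementary 'rn' signal is not modelled because it only prevents an electrical conflict during reset.

```python
def c_element_a(a, b, c, d, prev, rp=1):
    """Fig. 5(a): set when A=B=C=D=1, clear when A=B=0, otherwise hold."""
    if rp == 0:                 # active-low reset forces the output to '0'
        return 0
    if a and b and c and d:
        return 1
    if not a and not b:         # C and D are don't-cares for the falling transition
        return 0
    return prev


def c_element_b(a, b, prev):
    """Fig. 5(b): set when B=1 (A ignored), clear when A=B=0, otherwise hold."""
    if b:
        return 1
    if not a:
        return 0
    return prev


def c_element_c(a, b, c, prev, rp=1):
    """Fig. 5(c): set when all inputs are 1, clear when all are 0, otherwise hold."""
    if rp == 0:
        return 0
    if a and b and c:
        return 1
    if not (a or b or c):
        return 0
    return prev


# Hold behaviour: A=1, B=0 keeps whatever the element in Fig. 5(b) held before.
print(c_element_b(1, 0, prev=1), c_element_b(1, 0, prev=0))   # 1 0
```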
The CC is shown in Fig. 6. The input signals ('in.t' and 'in.f') are connected to an OR gate (OR1), generating the request signal when the input signals have valid tokens. A valid token is considered for cases '01', '10' and '11', although case '11' is 'illegal' in the DR communication and by definition not used under any circumstances. In parallel, the input signals go to the input of the first of sixty-four D-flip-flops connected in series configuration, forming a shift-register. All flip-flops are enabled when a new valid token arrives, which pushes all stored bits to the right by one position. The sixty-fourth bit moves to the output of the CC and enters as input to the next CC. The delay block (DELAY1) matches the propagation delay of the flip-flop path, so that signal 'Q64' is guaranteed to reach the C-element (C1) before signal 'in_v' does. C-elements C3 and C4 are used for generating the acknowledge and returning all signals to their starting/zero state. Specifically, C3 generates the acknowledge for the previous stage ('in.a') when a token is received by this stage (OR1 outputs '1') and an acknowledge signal from the successor stage has come back and reached C3 ('out.a' is '1'). Notice that C4 outputs high at the beginning of a cycle. For the signals to return to their reset state and become available for new tokens, 'out.a' and 'en' become '0', making C1 and C2 output '0'. The output token becomes empty (case '00') and eventually makes 'out.a' '0' and 'en' '1'. Signal 'in.a' becomes '0' when the previous stage issues an empty token, which means 'in.t' and 'in.f' became '0'. Signals 'Q1 …Q64' are directly connected to the inputs of the eight DACs for their conversion to an analog bias voltage for the impedance LEs. The DAC design is illustrated in Fig. 7.
The resistor string is divided in two segments whereby one segment, namely 'coarse resistor string', is used for the four most significant bits and the other segment, namely 'fine resistor string', is used for the four least significant bits. Each segment is read individually and at the end the two segments form the 8-bit digital word.
The 'coarse resistor string' is implemented with sixteen identical resistors (CR0-15) with value equal to 5 kΩ each. The voltage is divided in steps of $(V_{ref+} - V_{ref-})/16$, where $V_{ref+}$ and $V_{ref-}$ are the positive and negative voltage references respectively. Each of the sixteen nodes is connected to two analog switches implemented as transmission gates, which serve as a two-to-one multiplexer. The four most significant bits of the 8-bit input word, namely 'Digital 4-7 or D47', are decoded by a 4:16 decoder and close one of the switches (CS0-15) according to the value of the input bits. When the switch closes, the voltage across the corresponding resistor is transferred to the 'fine resistor string' as the V H and V L voltages, which set the high and low voltages of the 'fine resistor string' respectively. The 'fine resistor string' is implemented with sixteen identical resistors (FR0-15) with value equal to 20 kΩ each. As a result, the voltage is further divided to a total of 256 segments. A second 4:16 decoder decodes the four least significant bits of the 8-bit input word, namely 'Digital 0-3 or D03', and one of the switches (FR0-14) closes to form the output analog voltage. Note that in parallel to the 'FR' switches, there are dummy switches which are always ON in order to compensate the impedance of the switches from the 'coarse resistor string'. The output voltage is calculated by the following expression:

$$DAC_{out} = V_{ref-} + (D47)_{10}\,\frac{\Delta V}{16} + (D03)_{10}\,\frac{\Delta V}{256}$$

where $DAC_{out}$ is the analog output voltage in Volts, $(D47)_{10}$ is the decimal value of the four most significant bits, $(D03)_{10}$ is the decimal value of the four least significant bits and $\Delta V = V_{ref+} - V_{ref-}$. As an example, the binary word $(01101100)_2$ with supply of 1.8 V and 0 V has the following analog value:

$$DAC_{out} = 0 + 6 \cdot \frac{1.8}{16} + 12 \cdot \frac{1.8}{256} = 0.675 + 0.0844 \approx 0.759\ \mathrm{V}$$
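The two-stage conversion can also be sketched in Python. The sketch assumes an ideal divider in which each coarse step contributes ΔV/16 and each fine step ΔV/256, ignoring switch resistance and the loading of the coarse string by the fine string; the function name and default references are illustrative.

```python
def dac_out(code: int, v_ref_p: float = 1.8, v_ref_n: float = 0.0) -> float:
    """Ideal output voltage of an 8-bit two-stage resistor-string DAC."""
    if not 0 <= code <= 0xFF:
        raise ValueError("code must be an 8-bit value")
    d47 = code >> 4        # four most significant bits select the coarse tap
    d03 = code & 0x0F      # four least significant bits select the fine tap
    delta_v = v_ref_p - v_ref_n
    return v_ref_n + d47 * delta_v / 16 + d03 * delta_v / 256


v = dac_out(0b01101100)    # D47 = 6, D03 = 12
print(f"{v:.4f} V")        # 0.7594 V
```

Under these assumptions the result equals `code * delta_v / 256`, i.e. one least significant bit corresponds to about 7 mV for a 1.8 V reference span.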

IV. MEASUREMENT RESULTS
The proposed programmable complex impedance IC was designed and fabricated in 0.18 μm mixed-signal CMOS process technology. The design of the complete circuit includes in-house designed input/output buffers, pads and Electrostatic Discharge (ESD) protection circuits. The ESD circuit of the digital I/O consists of a Silicon-Controlled Rectifier (SCR) based circuit [53]  including a series resistor to limit the current and a secondary device for extra protection at low voltages. The RF terminals utilize only the SCR circuit without the series resistor and secondary device, to minimize the added resistance and capacitance to the LEs. Fig. 8 shows the chip microphotograph. The missing pad is intentionally left empty to create a visual asymmetry so that the die orientation is easily recognizable. Since the IC consists of analog, digital and RF circuits, it was necessary to use multiple tools for simulating the individual parts. The DACs and the CC were implemented using full-custom design in Cadence and were simulated at transistor-level with Spectre simulation platform. The RF parts of the IC i.e., the LEs, were designed and simulated using Cadence and verified in Keysight ADS. The total die area is 2.2 mm × 2.2 mm including package since it is at wafer level with WLCSP technology. Both core and I/O circuits operate at a 1.8 V power supply.
The experimental results of the IC are divided in two parts: Section IV-A presents the test-setup and measurement results of the RF LEs, showing the achieved complex impedance range of the IC. Section IV-B to IV-F present the test-setup and measurement results for the CC and finally a comparison with various discrete-based circuits that are used to provide tunability and programmability on the meta-atoms is presented.

A. Complex Impedance Range
The test set-up for measuring the impedance range of the LEs is shown in Fig. 9. The IC was populated on a four-layer high frequency substrate material (RO4350), with the top layer being the metasurface layer with the metallic patches, the bottom layer being the communication layer with the tracks connecting the chips together, and the two intermediate layers being the supply voltage and ground. The DUT (Device-Under-Test) includes 50 Ω coplanar waveguide with ground (CPWG) transmission lines connecting the four LEs to coaxial connectors. In turn, the connectors are connected to a four-port Vector-Network-Analyzer (VNA) to acquire the scattering parameters of the signals. The use of a high frequency substrate is necessary for microwave frequencies and it is a relatively expensive material; however, this is compensated by the chips, since their design choices favor system cost. Specifically, the chips are designed and fabricated in a mature and relatively cost-effective process technology. They utilize the WLCSP packaging technology and they operate without the need of any external components on the meta-atom (e.g., crystal oscillators).
Fig. 9. Test-setup for measuring the complex impedance range of the loading elements.
The measurements were taken in an ESD protected environment at stable temperature (27°C). De-embedding boards were used to compensate and remove the loading of the DUT board using the through-reflect-line method. The impedance range achieved by the measured ICs is seen on the Smith Chart of Fig. 10. Specifically, the Smith Chart shows the perimeter of the impedance values measured at frequencies between 2 GHz and 6 GHz. The area is formed by digitally controlling the output voltage of the digital-to-analog converters that bias the LEs at the V R and V C nodes. The adopted LEs have a resistance range between 25 Ω and 290 Ω, whilst the capacitance range is between 1.7 pF and 4.4 pF, with the optimum values for perfect absorption at 5 GHz being R = 41 Ω and C = 2.7 pF. The tuning range of the LEs directly affects the Smith chart coverage map, which means that any parasitics introduced in the system shift the coverage map. This includes the impedance added from the PCB after population of the chips. With the Smith Chart, "a microwave engineer can develop a good intuition about transmission line and impedance-matching" [54] as well as easily plot the perimeter of the available impedance points of a particular system, which is why it is preferred for the LE tuning range.
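To relate the quoted R and C ranges to points on a Smith chart, the following Python sketch evaluates a simplified LE model. It assumes an ideal resistor in parallel with an ideal capacitor, neglecting the DC block capacitor, the RF choke, ESD devices and board parasitics, and it uses the 50 Ω reference of the CPWG lines; the function names are illustrative.

```python
import math

Z0 = 50.0  # reference impedance of the measurement lines, in ohms


def le_impedance(r_ohm: float, c_farad: float, freq_hz: float) -> complex:
    """Impedance of an ideal parallel RC network: R / (1 + j*omega*R*C)."""
    omega = 2 * math.pi * freq_hz
    return r_ohm / (1 + 1j * omega * r_ohm * c_farad)


def reflection(z: complex, z0: float = Z0) -> complex:
    """Reflection coefficient, i.e. the coordinate plotted on the Smith chart."""
    return (z - z0) / (z + z0)


# Optimum point quoted for perfect absorption at 5 GHz.
z_opt = le_impedance(41.0, 2.7e-12, 5e9)
print(f"Z = {z_opt.real:.2f} {z_opt.imag:+.2f}j ohm, |Gamma| = {abs(reflection(z_opt)):.3f}")

# Corner points of the quoted tuning range, swept over the measured band.
for f in (2e9, 6e9):
    for r in (25.0, 290.0):
        for c in (1.7e-12, 4.4e-12):
            g = reflection(le_impedance(r, c, f))
            print(f"{f / 1e9:.0f} GHz  R={r:5.0f}  C={c * 1e12:.1f} pF  Gamma={g:.3f}")
```

Because the model is passive, every computed reflection coefficient lies inside the unit circle; the measured perimeter in Fig. 10 will differ from this idealization wherever parasitics are significant.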

B. Chip Latency
The latency of the proposed IC is defined as the time delay from placing a token at the input of the IC until the IC is ready to receive the next token. The measurements include the delays from the output buffers, the solder spheres and the tracks on the DUT board. Asynchronous circuits run at their maximum possible speed without being limited by any global synchronization signal, i.e., the clock. Thus, the latency of the IC cannot be accurately determined by sending input tokens from an FPGA, since it samples only at specific time intervals. To do so, the following test has been performed: a ring topology was used by taking the output data signal and inserting it back to the chip as an acknowledge signal. Also, at the input side, the acknowledge signal goes through a logic inverter and is inserted back to the input as new data (Fig. 11(a)). Doing so, the IC continuously sends data to itself at the fastest rate possible. The inversion of the signal is performed by the FPGA, which also provides power to the chip and operates as a logic analyzer with a sampling rate of 400 MSamples/second to read the signal transitions. The delays added to the measurement by the FPGA and the external wires are compensated as follows. The FPGA's output port with the inverted signal is fed back to the input of the inverter, as shown in Fig. 11(b). The wire used for the feedback is the same as the wire used to connect the output signal of the IC to the acknowledge port of the IC. The delay measured at the logic analyzer of the FPGA is 80 ns. This is the time required for the signal to travel through the wires, pass through the inverter function and appear as a transition on the logic analyzer. This delay is subtracted twice from the total delay of the IC's token cycle, since during one cycle the signals pass through the inverter function and the output wire two times, due to the Return-To-Zero (RTZ) line-coding.
The speed of the IC is measured in terms of seconds for one token or bit-cycle and seconds for one packet or packet-cycle, since these are the delays required for the design of MSF systems. Consequently, the bit rate and packet frequency of the chip can be calculated by inverting the bit and packet cycles respectively. The measurements are taken at the input and output of the inverter since they represent the acknowledge and the input data signal of the IC, respectively. The results are shown in Table I.
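The compensation and rate calculations described above amount to simple arithmetic, sketched below in Python. The 80 ns fixture delay and the 400 MSamples/second sampling rate come from the text; the function names and the 200 ns raw cycle in the usage example are hypothetical and are not measured values from Table I.

```python
FIXTURE_DELAY_NS = 80.0   # loopback delay of FPGA inverter plus wire (from the text)
SAMPLE_RATE_HZ = 400e6    # logic-analyzer sampling rate (from the text)


def sample_period_ns(sample_rate_hz=SAMPLE_RATE_HZ):
    """Time resolution of the logic analyzer in nanoseconds."""
    return 1e9 / sample_rate_hz


def compensated_cycle_ns(measured_cycle_ns, fixture_delay_ns=FIXTURE_DELAY_NS):
    """Remove the fixture delay, which is traversed twice per RTZ token cycle."""
    corrected = measured_cycle_ns - 2.0 * fixture_delay_ns
    if corrected <= 0.0:
        raise ValueError("measured cycle is shorter than twice the fixture delay")
    return corrected


def rate_hz(cycle_ns):
    """Bit rate or packet frequency obtained by inverting a cycle time."""
    return 1e9 / cycle_ns


if __name__ == "__main__":
    raw_cycle_ns = 200.0  # HYPOTHETICAL raw reading, for illustration only
    cycle_ns = compensated_cycle_ns(raw_cycle_ns)
    print(f"resolution: {sample_period_ns():.1f} ns")
    print(f"compensated cycle: {cycle_ns:.1f} ns, rate: {rate_hz(cycle_ns) / 1e6:.1f} MHz")
```

Note that the 2.5 ns sampling period of the logic analyzer bounds the accuracy of any cycle time obtained this way.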

C. Static and Dynamic Power Consumption
To measure the power consumption of the DUT, a Source-Measure-Unit (SMU) device was used. The SMU delivers the power to the DUT and simultaneously measures the average current drawn by the DUT. The FPGA development board sends the user-defined configuration data to the IC via a custom-written Graphical-User-Interface (GUI) and software on the PC. During static conditions, the average current measured by the SMU is 180 μA. The static condition is defined as the state in which the IC is operational with a constant value applied to the four LEs and no other signal transitions occur. The static power consumption along with the energy consumption per bit and packet delivered to the chip are summarized in Table II.
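As a consistency check on the static figures, the short Python sketch below relates the measured 180 μA to the 324 μW static power quoted in the abstract and conclusion, and to the per-DAC current of 22 μA reported later in this section. The supply voltage is not stated here; it is inferred from the ratio of those two numbers, and the variable and function names are illustrative.

```python
I_STATIC_A = 180e-6   # average static current measured by the SMU
P_STATIC_W = 324e-6   # static power quoted in the abstract and conclusion
I_DAC_A = 22e-6       # static current of one DAC (reported in the discussion)
N_DACS = 8            # two DACs for each of the four LEs


def implied_supply_v():
    """Supply voltage inferred from P = V * I (an inference, not a stated value)."""
    return P_STATIC_W / I_STATIC_A


def dac_power_w(supply_v):
    """Combined static power of all DACs at the given supply voltage."""
    return N_DACS * I_DAC_A * supply_v


def dac_share():
    """Fraction of the static power drawn by the DACs."""
    return dac_power_w(implied_supply_v()) / P_STATIC_W


if __name__ == "__main__":
    v = implied_supply_v()
    print(f"implied supply: {v:.2f} V")
    print(f"DAC power: {dac_power_w(v) * 1e6:.1f} uW ({dac_share() * 100:.1f}% of static power)")
```

Under this inference the eight DACs account for almost all of the static current, which agrees with the statement that they are the dominant power consumers.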

D. Scalability Considerations
To verify that the manufactured ICs can communicate and exchange data, two DUTs, each one with a chip soldered on, were connected in series. The FPGA sends various bits to the input side of the first DUT until the transmitted sequence is observed at the output side of the second DUT. In this test, the chips are not running at their fastest speed since they are limited by the speed of the FPGA and the corresponding custom-written software for communication with the dual-rail asynchronous protocol. Fig. 12 shows measured results at the output of the second DUT. The sequence presented here is AAAAAAAAAAAAAAAA in hexadecimal, which corresponds to alternating between the binary values '1' and '0'. Signals 'data.true' and 'data.false' correspond to the output signals of the second chip. Signal 'data.ack' corresponds to the output acknowledge signal of the second chip. During this test, the second chip requires approximately 6.5 μs to output a packet of 64 bits. Each bit-cycle requires approximately 80 ns and is calculated from the time that the input token/signal rises until the acknowledge signal falls.
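For readers unfamiliar with dual-rail RTZ signalling, the Python sketch below shows how the 64-bit test word could be represented on the 'data.true' and 'data.false' wires. It uses the common dual-rail convention (one wire raised per bit, both wires low as a spacer); the bit order (most significant bit first) and the function names are assumptions, and the acknowledge handshake is not modelled.

```python
TEST_WORD = 0xAAAAAAAAAAAAAAAA  # 64-bit sequence used in the two-chip test
PACKET_BITS = 64


def dual_rail_rtz(word, width=PACKET_BITS):
    """Return (data_true, data_false) wire states, MSB first (ASSUMED order).

    Each valid token is followed by an all-zero spacer, i.e. return-to-zero.
    """
    states = []
    for i in reversed(range(width)):
        bit = (word >> i) & 1
        states.append((bit, 1 - bit))  # exactly one rail high per valid token
        states.append((0, 0))          # spacer: both rails return to zero
    return states


def decode(states):
    """Recover the word by reading data_true of every valid (non-spacer) token."""
    word = 0
    for data_true, data_false in states:
        if data_true != data_false:
            word = (word << 1) | data_true
    return word


if __name__ == "__main__":
    wire_states = dual_rail_rtz(TEST_WORD)
    print(wire_states[:6])
    print(hex(decode(wire_states)))
```

Because both rails are never high at the same time, a receiver can detect the arrival of each token without a clock, which is what allows the chips to be chained as in this test.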
In addition to measuring the response of two chips in series configuration, calculations for two metasurfaces are presented, showing their potential reconfiguration delay and power consumption. Table III presents the results for a 100 × 100 meta-atom surface array and for a 1000 × 1000 meta-atom surface array. Since these are merely calculations, measured results may deviate slightly.

[...] DAC is 7.03 mV. The worst-case settling time of the DAC is 366 ns. Note that during one packet cycle, the DAC values are unstable since the control circuit exchanges bits faster than the settling time of the DACs. Once these transient signals finish, the chip operates at static conditions and each DAC draws 22 μA of current, which corresponds to 39.6 μW. Thus, the power consumption of all eight DACs in one IC is 316.8 μW, making the DACs the dominant power consumers on the chip.

Table IV compares various characteristics of the fabricated integrated LEs with other tunable elements. The proposed MSF "enabler" is a standalone chip containing both the LEs and the control circuit in a single die. This allows local and decentralized control of each meta-atom's surface impedance due to the inter-chip communication capabilities. Each of the four LEs has 2^16 available states, which is a substantial increase over the discrete-component-based MSF "enablers". The static power consumption of the chip is in the microwatt range, which allows walls of a few square meters to be covered without power limitations. Furthermore, the ICs allow for plug-and-play expansion since they do not need to be connected to a centralized FPGA. Although all systems compared can be expanded, in the case of the IC approach this is done by simply connecting the boards together, which can be done by users who may not be familiar with the technical aspects of the system.
In addition, the communication grid can be used on many types of MSFs because it only requires one layer on the PCB along with through-vias to connect the chips to the metallic patches. These benefits also eliminate the requirement for long cables and bulky electronics for biasing. The IC described in this article, with its improved performance compared to other enabling circuits, enables a new generation of programmable MSFs that can be incorporated in telecommunication systems. A natural progression in the field of programmable MSFs is the development of real-time programmable MSFs that will pave the way for extraordinary applications that seemed impossible until now, e.g., cloaking of moving objects, dynamic holograms and real-time manipulation of incident waves.
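As a rough cross-check of the array-level static power discussed in this section, the Python sketch below scales the per-chip figure of 324 μW to the two array sizes considered. It assumes that every chip drives exactly four meta-atoms and ignores dynamic power, regulators and board losses; the function names are illustrative, and the results are not taken from Table III, which may be based on different assumptions.

```python
import math

P_STATIC_PER_CHIP_W = 324e-6  # static power of one IC (from the text)
META_ATOMS_PER_CHIP = 4       # ASSUMED: every chip drives all four of its LEs


def chips_needed(rows, cols, per_chip=META_ATOMS_PER_CHIP):
    """Number of ICs required for a rows x cols meta-atom array."""
    return math.ceil(rows * cols / per_chip)


def static_power_w(rows, cols):
    """Total static power of the array, counting only the ICs."""
    return chips_needed(rows, cols) * P_STATIC_PER_CHIP_W


if __name__ == "__main__":
    for n in (100, 1000):
        chips = chips_needed(n, n)
        print(f"{n} x {n} meta-atoms: {chips} chips, {static_power_w(n, n):.2f} W static")
```

Under these assumptions the static power is about 0.81 W for the 100 × 100 array and about 81 W for the 1000 × 1000 array, since the total grows linearly with the number of chips.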

V. CONCLUSION
In this article, the design and implementation of a programmable complex impedance IC for MSFs is presented. The implemented programmable IC stores incoming digital bits and sets the analog bias of its integrated varistors and varactors. These in turn form complex impedances at four output ports of the IC that adjust the surface impedance of up to four meta-atoms. The procedure for changing all four LEs of a chip takes less than 1 μs and the complete IC consumes only 324 μW during static conditions. In addition, with its ability to send data to and receive data from other ICs, a system can be formed which can dynamically scale and form communication grids on surfaces. The IC is a standalone device, i.e., it does not rely on external circuits such as crystal oscillators or antennas to operate. This also reduces the cost of designing the overall system, making IC-enabled MSFs promising candidates for high-performance, low-power, scalable and cost-effective programmable systems. In this work, the focus is mainly on the chip design and its features. For elaboration on the EM aspects and capabilities of metasurfaces consisting of these chips, the reader is encouraged to read [34], [35], [36], [37], [38], [39], [40], [41], [42].
The presented work paves the way towards sub-μm scale devices to be used in metamaterial systems. The chips can control the surface impedance of up to four meta-atoms, acting as nanoscale hardware that can be used in large quantities to enable the implementation of software-defined metamaterial systems.