# Logical and Physical Synthesis of an HF-RISCV Core for X-Fab 180 nm Technology

Rodrigo N. Wuerdig, Lúcio P. Franco, Rodrigo F. Scroferneker, Carolina M. Metzler, Ricardo L. A. Reis Universidade Federal do Rio Grande do Sul – Porto Alegre – RS – Brazil {rnwuerdig, lpfranco, rfscroferneker, cmetzler, reis}@inf.ufrgs.br

Abstract—This document describes the process of turning a high-level hardware description into a polygon description (GDSII) of a circuit. The objective is to synthesize the HF-RISCV core, a simple and versatile microcontroller core suitable for a range of applications (e.g., Internet of Things), for XC018, X-FAB's 0.18-micron Modular Logic and Mixed-Signal Technology. The whole process entails logical and physical synthesis, technical aspects, and tests under several operating conditions. The design and development of the HF-RISCV for the X-Fab 180 nm technology node, presented in this document, relies on the Cadence<sup>TM</sup> commercial framework. The microcontroller core was fully synthesized with and without pads, resulting in a circuit with a slack greater than zero under every operating condition defined in this work. The estimated energy consumption of the core with pads was about 45 nJ under nominal conditions.

Index Terms—VLSI, Logical Synthesis, Physical Synthesis, RISC-V

## I. INTRODUCTION

Given the high number of transistors in modern circuits, designing a full-custom chip is a challenge even with today's technology. To produce circuits with a considerably lower time to market (TTM) and a more favourable yield, most of the industry relies on cell-based design, also called semi-custom design (Fig. 1). A cell-based design is built from a collection of cells with different logical functions and driving strengths, called standard cells. Circuits composed of standard cells are more likely to work because each cell is fully validated.



Figure 1: Representation of a cell-based design.

The authors would like to thank CNPq, Capes, and Fapergs Brazilian agencies for financial support to our research.

This document describes the steps taken to turn the hardware description of the HF-RISCV, which is available on GitHub [1], into a polygon description file (GDSII), one of the last steps before the masks for the lithographic process are made.

This work covers only the core synthesis, because of the lack of information about memory generators for this specific technology and the difficulty of using them to generate the RAMs required by the whole System-on-Chip (SoC).

#### A. HF-RISCV

The HF-RISCV is a versatile and straightforward 32-bit microcontroller, the successor of the already ASIC-proven HF-RISC. A version of the original HF-RISC was taped out using the TSMC 180 nm process [2]. The authors reported that the chips worked perfectly at the target frequency specified for the ASIC project (50 MHz).

HF-RISCV uses the RV32I instruction set of the RISC-V architecture instead of the MIPS-like instruction set of HF-RISC. The RV32I instruction set can provide better code density. Furthermore, it provides an extensible ISA, no load or branch delay slots, a better immediate field encoding, and support for relocatable code [3]. The core organization is the same as that of HF-RISC, including the memory map and software compatibility. Improvements over HF-RISC include a shorter critical path for a higher clock frequency and an exception handling mechanism for unimplemented opcodes.

#### B. X-FAB's XC018 180 nm Technology

The XC018 series is X-FAB's 0.18-micron Modular Logic and Mixed-Signal Technology. The platform focuses on System-on-Chip applications, and it can be used with standard-cell, semi-custom, and full-custom design methodologies. Its product applications span the automotive, consumer, industrial, and telecommunication markets. The low-power and high-voltage process options are also well suited to mobile applications as well as display drivers or controllers. In line with industry standards, it uses a single polysilicon layer with up to six metal layers, a 0.18-micron drawn gate length, an N-well process, and dual gate oxide (1.8 V with 3.3 V or 5.0 V) transistors [4].

## II. LOGICAL SYNTHESIS

Logical synthesis is one of the steps taken to develop an IP core. Starting from an HDL file, the main objective of this step is to refine the description, translating it from the RTL (Register Transfer Level) to logic gates (cells). This translation can be done manually or with the aid of software such as Cadence Genus<sup>TM</sup> and Synopsys Design Compiler<sup>TM</sup>. In this project, the tool used was Cadence Genus<sup>TM</sup>. The entire logical synthesis flow is performed according to [5], with minor tweaks to take advantage of newer tool features.



Figure 2: Logical synthesis flow overview.

#### A. Multi-Mode Multi-Corner Flow

Multi-mode multi-corner (MMMC) analysis refers to performing STA (Static Timing Analysis) across multiple operating modes, PVT (Process, Voltage, and Temperature) corners, and parasitic interconnect corners at the same time. Genus supports this feature. To enable MMMC, a file must be written that describes every operating condition (Tab. I) and associates each one with its corresponding liberty file (*.lib*).

Table I: Utilized views for the multi-corner synthesis of the HF-RISCV core.

| Corner       | Temperature (°C)  | Supply Voltage (V) | Process |
|--------------|-------------------|--------------------|---------|
| Worst Case   | 125               | 1.62               | Worst   |
| Nominal Case | 25                | 1.8                | Nominal |
| Best Case    | -40               | 1.98               | Best    |

The views defined in Tab. I are set according to their impact on the electrical characteristics of the circuit. In the worst case, the temperature is set to $125^{\circ}$C to decrease electron mobility in FETs operating in strong inversion, while in the best case the temperature is set to $-40^{\circ}$C to increase electron mobility. The supply voltage of the worst case, 10% below the nominal $V_{DD}$, also degrades the circuit delays, whereas the best case uses a supply 10% above nominal. Process variations are also set in the MMMC views: in the worst process view, transistor attributes (e.g., length, width, oxide thickness) are shifted so as to deteriorate performance, and in the best view they are shifted so as to improve it.
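The three views of Tab. I can be captured in a small data structure. The sketch below is illustrative only: the `Corner` class and the `derate_supply` helper are hypothetical names, not part of any Cadence tool. It simply shows that the worst and best supply voltages are the nominal 1.8 V shifted by 10% in either direction.

```python
from dataclasses import dataclass

NOMINAL_VDD = 1.8  # volts, nominal supply from Tab. I


@dataclass(frozen=True)
class Corner:
    name: str
    temperature_c: int
    vdd: float
    process: str


def derate_supply(nominal: float, percent: float) -> float:
    """Return the nominal supply shifted by the given percentage."""
    return round(nominal * (1 + percent / 100), 2)


# Hypothetical in-memory description of the three PVT corners of Tab. I.
CORNERS = [
    Corner("worst", 125, derate_supply(NOMINAL_VDD, -10), "worst"),
    Corner("nominal", 25, NOMINAL_VDD, "nominal"),
    Corner("best", -40, derate_supply(NOMINAL_VDD, +10), "best"),
]
```

Keeping the corners in one place like this makes it harder for the temperature, voltage, and process settings of a view to drift apart between scripts.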

MMMC synthesis also makes it possible to generate reports for every view during logical and physical synthesis. These multi-corner reports give a rough indication of how the circuit will operate under different conditions and allow separate delay files (.sdf) to be extracted for later HDL simulation. The netlist generated by logical synthesis describes an interconnection of library cells that implements the behaviour described at the RTL.

#### B. Clock Gating

Despite the lack of clock-specific cells in the cell library, a command was included to enable this feature. Clock gating is a technique used in many circuits to reduce dynamic power consumption. It saves power by adding logic to the circuit that blocks the clock from reaching parts of the clock tree. Gating the clock disables portions of the circuitry so that the flip-flops in them do not have to switch states.
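The effect of gating can be illustrated with a toy behavioural model. The sketch below is not the synthesis tool's algorithm; it only counts how many clock events reach a register's clock pin with and without gating, using a hypothetical enable pattern.

```python
def clock_pin_events(enables, gated):
    """Count clock edges that reach a register's clock pin.

    enables: one boolean per cycle, True when the register must load new data.
    gated:   if True, the clock only reaches the register when enable is True.
    """
    if gated:
        return sum(1 for en in enables if en)
    return len(enables)


# Hypothetical activity pattern: the register is updated in 2 of 8 cycles.
enables = [True, False, False, False, True, False, False, False]
ungated = clock_pin_events(enables, gated=False)
with_gating = clock_pin_events(enables, gated=True)
```

In this example the gated register sees 2 clock events instead of 8. The actual power saving depends on the switching activity of the design and on the overhead of the gating logic itself.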

#### C. Constraints

To ensure compatibility with the pads and to increase yield, some special constraints are added to the constraints file. The 50 MHz frequency of the taped-out HF-RISC, which used TSMC's 180 nm technology, was not met in this case, mostly because of the cumbersome register bank. It was synthesized from standard cells rather than generated by a dedicated memory tool, which could produce memories with much higher density and fewer interconnections. The core was therefore synthesized for 40 MHz.
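As a simple worked conversion (not a figure reported by the authors), the two frequency targets correspond to the following clock periods:

$$
T_{clk} = \frac{1}{f_{clk}}, \qquad T_{50\,\mathrm{MHz}} = \frac{1}{50 \times 10^{6}\,\mathrm{Hz}} = 20\,\mathrm{ns}, \qquad T_{40\,\mathrm{MHz}} = \frac{1}{40 \times 10^{6}\,\mathrm{Hz}} = 25\,\mathrm{ns}
$$

Relaxing the target from 50 MHz to 40 MHz therefore gives every register-to-register path an additional 5 ns per cycle.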

Regarding compatibility with the future pad insertion, special constraints were added to target an 8 pF output load. This 8 pF load is more than what is needed to drive each pad in this technology. After the pads are added, the load is fine-tuned by the physical synthesis tool.

One technique applied to increase yield was to limit the fanout of each cell. In this project, the fanout was limited to 8 using a special constraint command. Another technique to guarantee the robustness of the circuit during logical synthesis was to set specific views for each type of analysis: hold analysis uses the best view (to avoid hold violations) and setup analysis uses the worst view (to avoid setup violations).
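The fanout limit can be pictured as a check over the netlist. The sketch below is a minimal illustration, assuming a netlist represented as a plain dictionary; the net names and the `fanout_violations` helper are hypothetical and do not correspond to any tool command.

```python
MAX_FANOUT = 8  # limit used in this project


def fanout_violations(netlist, max_fanout=MAX_FANOUT):
    """Return the drivers whose number of loads exceeds max_fanout.

    netlist maps a driver name to the list of pins it drives.
    """
    return sorted(
        driver for driver, loads in netlist.items() if len(loads) > max_fanout
    )


# Hypothetical nets, for illustration only.
example = {
    "u_alu/n12": [f"load{i}" for i in range(8)],   # exactly at the limit
    "u_regs/we": [f"load{i}" for i in range(12)],  # exceeds the limit
}
violations = fanout_violations(example)
```

A synthesis tool would typically resolve such a violation by buffering or duplicating the driver; this sketch only identifies the offending nets.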

## III. PHYSICAL SYNTHESIS

The main idea of physical synthesis is to turn the netlist generated by logical synthesis (usually a Verilog file) into an arrangement of physically distributed geometries, which are effectively the standard cells. The entire physical synthesis flow is done according to [5], with minor tweaks to take advantage of newer tool features.

#### A. Pads

Pad insertion is one of the most laborious steps in the RTL-to-GDSII flow. Most of the difficulty comes from the limitations on the Verilog netlist, which must instantiate the output netlist of the logical synthesis together with the pad cells from the I/O (input and output) libraries and connect them to each other. The input netlist for Cadence Innovus<sup>TM</sup> does not support some modern constructs (such as the *generate for loop*, which automatically creates homogeneous structures in Verilog). This can be quite laborious, but it is easily overcome by using bash commands to support the required manual work.
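The authors used bash commands for this step. The same idea, unrolling what a *generate for loop* would have produced into flat instances, can be sketched in Python as follows. The cell name `ICP` comes from Table II, but the port names `PAD` and `Y`, the instance naming scheme, and the signal name `data_in` are assumptions made for illustration; the real names must be taken from the I/O library documentation and the core netlist.

```python
def input_pad_instances(signal, width, cell="ICP"):
    """Emit one flat Verilog pad instance per bit of a bus.

    The port names PAD and Y are placeholders, not taken from the library.
    """
    lines = []
    for bit in range(width):
        lines.append(
            f"{cell} pad_{signal}_{bit} "
            f"(.PAD({signal}_pad[{bit}]), .Y({signal}[{bit}]));"
        )
    return lines


# Hypothetical 32-bit input bus.
instances = input_pad_instances("data_in", 32)
netlist_text = "\n".join(instances)
```

The generated text can then be pasted into the top-level netlist that instantiates the synthesized core.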



Figure 3: Physical synthesis flow overview.

There are many different pad cell types, and the documentation should be checked to see which one best fits each task. The 146 pads used in this layout are listed in Table II. This design uses the pad-limited type of pads because of its high number of inputs and outputs.

Table II: Pad cells used from the 3.3 V I/O library.

| Cellname | Used | Function     |
|----------|------|--------------|
| VDDIPADP | 1    | $V_{DD}$ Pad |
| GNDOPADP | 1    | Ground Pad   |
| CORNERP  | 4    | Corner Pad   |
| ICP      | 68   | Input Pad    |
| BD8P     | 72   | Output Pad   |

Note that I/O libraries are available for two different voltages (3.3 V and 5.0 V) in this technology. In this project, the 3.3 V I/O library was used. Only one $V_{DD}$ pad and one ground pad were used, since the total circuit current does not exceed the recommended limit of 25 mA.
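The pad count and the supply pad decision can be checked with a few lines of arithmetic. The sketch below assumes that the 25 mA limit applies per supply pad, which the text does not state explicitly, and the `supply_pad_pairs` helper is a hypothetical name.

```python
import math

# Pad counts from Table II.
PADS = {"VDDIPADP": 1, "GNDOPADP": 1, "CORNERP": 4, "ICP": 68, "BD8P": 72}
LIMIT_MA = 25.0  # recommended current limit quoted in the text


def supply_pad_pairs(total_current_ma, limit_ma=LIMIT_MA):
    """Smallest number of VDD/ground pad pairs keeping each under the limit."""
    return max(1, math.ceil(total_current_ma / limit_ma))


total_pads = sum(PADS.values())
```

With these counts `total_pads` evaluates to 146, matching the text. Any total current up to 25 mA yields a single pair under this assumption, while a hypothetical 60 mA design would need three.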

#### B. Floorplan

With the addition of pads, specifying the floorplan consists of preserving the original IP core size inside the pad ring, so that the die does not need to be filled entirely with fillers and other structures that could complicate power distribution. The addition of pad fillers during physical synthesis ensures that the pad ring is connected along its entire length. Fig. 4 shows both implementations, with pads (Fig. 4a) and without pads (Fig. 4b).

#### C. Powerplan

The HF-RISCV powerplan is quite simple, given the single voltage domain. The powerplan only requires special care in connecting the pad instances to the power nets ($V_{DD}$ and $V_{SS}$).

After every power net is associated with its pins, specific commands are applied to generate power rings, perform special routing, and create stripes. The implementation with pads (Fig. 5a) shows the supply connections between the pad ring and the core ring; Fig. 5b shows the implementation without pads. Considering the mature technology and application, this work does not cover the usage of well taps (special cells that tie the n-well to $V_{DD}$ and the substrate to $V_{SS}$, commonly used in tapless technologies under 65 nm). The 180 nm technology still uses intra-cell connections to avoid latch-up.

Figure 4: Floorplan of the HF-RISCV Core. 4a and 4b are not on the same scale.

Figure 5: Powerplan of the HF-RISCV Core. 5a and 5b are not on the same scale.

#### D. Placement

Cell placement takes the cells specified in the netlist and tries to place them at optimal positions on the floorplan. After the initial placement, two more incremental placements are run to avoid cells under power stripes, which could cause DRC (Design Rule Check) violations. Fig. 6 shows both implementations, with pads (Fig. 6a) and without pads (Fig. 6b).
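The condition that the incremental placements try to remove can be expressed as an interval overlap test. The sketch below is a simplified one-dimensional model, assuming vertical stripes and hypothetical coordinates; a real placer works with full two-dimensional geometry and layer information.

```python
def overlaps_stripe(cell_x, cell_width, stripes):
    """Return True if a cell's horizontal span intersects any stripe.

    stripes is a list of (x_start, x_end) tuples in the same unit as cell_x.
    Spans that merely touch at an edge are not counted as overlapping.
    """
    cell_end = cell_x + cell_width
    return any(cell_x < s_end and s_start < cell_end for s_start, s_end in stripes)


# Hypothetical stripe positions in micrometres.
stripes = [(100.0, 104.0), (200.0, 204.0)]
under_stripe = overlaps_stripe(98.0, 5.0, stripes)
clear_of_stripes = overlaps_stripe(110.0, 5.0, stripes)
```

Cells flagged by such a test would be the ones moved in the incremental placement passes.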

#### E. Clock Tree Synthesis - CTS

Clock tree synthesis (CTS) is a critical step in the physical synthesis flow. It is the process that tries to distribute the clock signal evenly to all sequential elements in a design. An optimized clock tree (CT) can help avoid serious issues (such as excessive power consumption, routing congestion, and a prolonged timing closure phase) further down the flow [6]. Many elements affect the way a clock tree is built, so designers often have to run many experiments to optimize it. Clock gating arrangements, CTS targets, clock library cell types, and even the placement of spare cells have a direct impact on the quality of a clock tree. Fig. 7 shows both implementations, with pads (Fig. 7a) and without pads (Fig. 7b).

Figure 6: After placement view of the HF-RISCV Core. 6a and 6b are not on the same scale.

Figure 7: Post-CTS view of the HF-RISCV Core. 7a and 7b are not on the same scale.

Figure 8: Post-Routing view of the HF-RISCV Core. 8a and 8b are not on the same scale.
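One common way to quantify how evenly the clock is distributed is the global skew, the difference between the latest and earliest clock arrival times at the sequential elements. The sketch below uses hypothetical arrival times and is meant only to make that definition concrete; it does not reproduce any value from this work.

```python
def clock_skew(arrivals_ns):
    """Global skew: largest minus smallest clock arrival time at the sinks."""
    values = list(arrivals_ns.values())
    return max(values) - min(values)


# Hypothetical arrival times (ns) at three flip-flop clock pins.
arrivals = {"ff_a": 1.25, "ff_b": 1.5, "ff_c": 1.0}
skew_ns = clock_skew(arrivals)
```

A CTS run typically tries to keep this value below a target while limiting the number of inserted buffers.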

#### F. Routing

After CTS, the routing process determines the paths of the interconnections. Routing covers the standard cells and pins (the pins on the block boundary or the pads at the chip boundary). In the routing stage, metal and vias are used to create the electrical connections in the layout so as to complete all connections defined by the input netlist. Fig. 8 shows both implementations, with pads (Fig. 8a) and without pads (Fig. 8b). Note that in the implementation with pads (Fig. 8a) the routing also connects the corresponding circuit nets to the pads.

#### G. Metal and Cell Fillers

Fillers can be divided into two main kinds, cell fill and metal fill. In this project, because of the cells available in the library, filler cells were limited to decoupling cells only, which are capacitors between the $V_{DD}$ and $V_{SS}$ lines. Decoupling capacitors (Fig. 9) can act as filters on the power lines, increasing the reliability of the circuit as well as the planarity of the polysilicon layer.

Figure 9: Two schematics of decoupling capacitors (DCAP). (a) Cross-coupled DCAP, which is more common in technologies under 90 nm, where the PMOS/NMOS DCAP (b) cannot handle high-frequency effects and electrostatic discharge (ESD) well [7].



Another concern regarding filler insertion is to guarantee that the metal fillers are connected to $V_{SS}$. Attaching the metal fillers to the ground net allows them to act as EM shielding for the circuit, while also increasing the planarity of the circuit. As a drawback, connecting the metal fillers to the ground net creates a parasitic capacitance between the fillers and the routing lines. Fig. 11 shows both implementations, with pads (Fig. 11a) and without pads (Fig. 11b). Note that in Fig. 11a the metal fillers do not cover the pads, because the metal fillers were not inserted in the top metal layer.

Figure 10: (a) Area and number of cells for each step. LS stands for Logical Synthesis, PS for Physical Synthesis, and PS w/Pads for Physical Synthesis with the addition of pads. (b) Power dissipation values for synthesis with and without pads under every PVT corner. (c) Slack values for each PVT corner, for post-layout synthesis with and without pads.

Figure 11: Post-Fill view of the HF-RISCV Core. 11a and 11b are not on the same scale.

## IV. RESULTS

After some small tweaks (e.g., changing the circuit density, using different clock constraints) to avoid DRC errors, the circuit was successfully synthesized under different PVT corners. Reports were generated for area, power dissipation, and delay under multiple operating conditions. The results give an indication of circuit functionality.

The logical synthesis tool uses information from the liberty file (\*.lib) to estimate area, power, and timing, so its reports are quite inaccurate, as can be seen from the divergence in the area results in Fig. 10a. The tool takes the area of each cell from the liberty file and adds an estimated percentage for the routing area.
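The estimate described above can be modelled, in a very simplified form, as the sum of the cell areas scaled by a routing overhead. The sketch below is an assumption about the general shape of such an estimate, not the tool's actual algorithm; the cell names, areas, counts, and the 25% overhead are hypothetical.

```python
def estimate_area(cell_counts, cell_areas, routing_fraction):
    """Sum per-cell areas and add a fractional routing overhead."""
    cells = sum(count * cell_areas[name] for name, count in cell_counts.items())
    return cells * (1 + routing_fraction)


# Hypothetical cell names, areas (um^2), counts, and overhead.
areas = {"NAND2": 10.0, "DFF": 50.0}
counts = {"NAND2": 100, "DFF": 20}
estimated = estimate_area(counts, areas, 0.25)
```

Because the overhead is only a guess made before any wire exists, the post-layout area can differ noticeably from this kind of pre-layout figure.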

The area does not change from corner to corner because the circuit is synthesized for the worst corner and then analyzed at every other corner to generate results, as is done in industry.

Slack values greater than zero (Fig. 10c) support the likely functionality of the circuit. Power dissipation followed a clear, almost linear trend across the operating conditions.
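For reference, slack is usually defined as follows in static timing analysis (standard textbook conventions, stated here as an aid and not taken from the authors' reports):

$$
\text{slack}_{\text{setup}} = t_{\text{required}} - t_{\text{arrival}}, \qquad \text{slack}_{\text{hold}} = t_{\text{arrival}} - t_{\text{required}}
$$

A non-negative slack in both checks means that the corresponding timing constraint is met in the analyzed view.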

The 27% reduction in power dissipation (Fig. 10b) after the inclusion of pads can be explained by the pessimistic output load targeted during logical synthesis, where a target of 8 pF was fixed for the output load. This load is larger than the capacitance of each individual pad, so during the synthesis with pads some unnecessary buffers are automatically trimmed to fine-tune the circuit.

The circuit is simulated using a Verilog netlist along with the extracted Standard Delay Format (SDF) files of every synthesis step and operating corner to verify circuit functionality. The HDL of the test bench (Fig. 12) is available on GitHub [1]. The test bench is modified to instantiate the post-physical-synthesis Verilog, "HF-RISCV CORE" in Fig. 12, along with the \*.sdf files. The other blocks in Fig. 12 are the original HDL files. The simulations were also run within the Cadence framework, using the Incisive Enterprise<sup>TM</sup> simulator.

Figure 12: Test environment. HF-RISCV CORE is the synthesized IP covered in this work.

Further work may include a Verilog-AMS simulation of the extracted layout. The circuit will also be re-synthesized using Design for Testability (DFT) practices (e.g., scan chains for circuit testability) and spare cells during physical synthesis, opening up ECO (Engineering Change Order) routing possibilities.

## V. CONCLUSION

Performing the logical and physical synthesis of the HF-RISCV core for X-Fab 180 nm was a valuable exercise in learning the digital design flow. This work does not cover the whole process needed to obtain a fully testable and functional circuit, but it has introduced, and encouraged the search for, good practices and correct methods in logical and physical synthesis. The HF-RISCV core was fully synthesized with and without pads, resulting in a circuit with a slack greater than zero under every operating condition defined in this work. The estimated energy consumption of the core with pads was about 45 nJ under nominal conditions.

## REFERENCES

- [1] J. Filho, "HF-RISC SoC," https://github.com/sjohann81/hf-risc.
- [2] S. Johann Filho, M. T. Moreira, L. S. Heck, N. L. V. Calazans, and F. P. Hessel, "A processor for IoT applications: An assessment of design space and trade-offs," *Microprocessors and Microsystems*, vol. 42, pp. 156–164, May 2016.

- [3] S. F. Johann, M. T. Moreira, N. L. V. Calazans, and F. P. Hessel, "The HF-RISC processor: Performance assessment," in 2016 IEEE 7th Latin American Symposium on Circuits & Systems (LASCAS), Feb 2016, pp. 95–98.
- [4] X-Fab, "0.18 micron modular RF enabled CMOS technology datasheet," 2017. [Online]. Available: https://www.xfab.com/fileadmin/X-FAB/Download_Center/Technology/Datasheet/XC018_Datasheet.pdf
- [5] L. Lavagno, L. Scheffer, and G. Martin, EDA for IC Implementation, Circuit Design, and Process Technology (Electronic Design Automation for Integrated Circuits Handbook). Boca Raton, FL, USA: CRC Press, Inc., 2006.
- [6] M. Keating, D. Flynn, R. Aitken, A. Gibbons, and K. Shi, Low Power Methodology Manual: For System-on-Chip Design (Integrated Circuits and Systems). Springer, 2007.
- [7] X. Meng, R. Saleh, and K. Arabi, "Layout of decoupling capacitors in IP blocks for 90-nm CMOS," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 16, no. 11, pp. 1581–1588, Nov 2008.