TETRACOM

Lead:  Jun.-Prof. Dr.-Ing. G. Payá-Vayá
Team:  Dipl.-Ing. S. Nolting, Dipl.-Ing. L. Gerlach
Year:  2016
Duration:  January 2016 - July 2016
Completed:  yes
Further information:  http://es.iet.unipi.it/tetracom/content/

The continuous development of digital signal processing applications, e.g., video-based advanced driver assistance systems, is pushing the limits of existing embedded systems and forcing system developers to spend more time on code optimization. These applications often involve complex mathematical functions such as trigonometric, logarithmic, exponential, or square-root operations. On standard general-purpose embedded processors, these functions can only be computed efficiently with highly optimized, processor-specific arithmetic software libraries. An alternative is to extend the embedded processor architecture with a dedicated hardware accelerator.

Besides look-up-table interpolation, there are several generic algorithms that provide enough flexibility and performance to approximate a wide variety of arithmetic operations. The classical "Coordinate Rotation Digital Computer" (CORDIC) algorithm is a good choice for implementing efficient and flexible elementary arithmetic functions because of its shift-and-add nature. The same CORDIC kernel can be used to evaluate different arithmetic functions in software or hardware, greatly reducing the required resources (instruction memory or silicon area, respectively).
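
To make the shift-and-add nature concrete, the following is a minimal sketch of the textbook radix-2 CORDIC in rotation mode, computing sine and cosine with integer fixed-point arithmetic. It is not taken from LibARITH or the IMS co-processor; the function name `cordic_sin_cos` and the choices of 16 fractional bits and 16 iterations are assumptions made for illustration only.

```python
import math

FRAC_BITS = 16    # assumed fixed-point format: 16 fractional bits
ITERATIONS = 16   # assumed iteration count (roughly one bit of accuracy each)

# Table of atan(2^-i) in fixed point; the only constants the kernel needs.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * (1 << FRAC_BITS))
              for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector; the accumulated gain is
# K = prod(sqrt(1 + 2^-2i)). Starting from x = 1/K compensates for it.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))
INV_GAIN_FX = round((1 << FRAC_BITS) / GAIN)


def cordic_sin_cos(angle):
    """Return (sin, cos) of an angle in radians within [-pi/2, pi/2]."""
    x = INV_GAIN_FX
    y = 0
    z = round(angle * (1 << FRAC_BITS))  # residual angle still to rotate
    for i in range(ITERATIONS):
        # Only shifts, additions, and a table look-up per iteration.
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    scale = float(1 << FRAC_BITS)
    return y / scale, x / scale
```

Calling `cordic_sin_cos(0.5)` yields values close to `math.sin(0.5)` and `math.cos(0.5)`; the accuracy is bounded by the fixed-point word length and the number of iterations. Angles outside [-pi/2, pi/2] would need a quadrant correction first, which is omitted here.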

At the Institute of Microelectronic Systems, a mathematical software library (LibARITH) based on a radix-2/-4 CORDIC algorithm was implemented and optimized for a generic VLIW-SIMD processor. In addition, a generic radix-2/-4 CORDIC HW co-processor was implemented and used extensively in different research projects. The results show that the CORDIC HW co-processor efficiently approximates elementary arithmetic operations (such as cosine, sine, logarithm, exponential, square root, and division) and therefore increases the computation speed of common digital signal processing applications. The LibARITH SW library was also fully optimized for SIMD operations, including the efficient use of predication mechanisms at subword level, and takes the VLIW mechanisms of the generic processor into account.
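
As an illustration of how one kernel can serve several functions, the sketch below runs the same radix-2 iteration in vectoring mode, which drives y towards zero instead of the residual angle. It then returns the vector magnitude (a square root of a sum of squares) and the arctangent. This is generic textbook CORDIC written in floating point for readability, not the LibARITH interface; the name `cordic_vectoring` and the iteration count of 24 are assumptions.

```python
import math

ITERATIONS = 24  # assumed iteration count for this illustration

ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]
# Accumulated CORDIC gain, divided out of the magnitude at the end.
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(ITERATIONS))


def cordic_vectoring(x, y):
    """Return (magnitude, angle) of the vector (x, y), assuming x > 0."""
    z = 0.0  # accumulated rotation angle
    for i in range(ITERATIONS):
        factor = 2.0 ** -i  # stands in for a right shift by i bits
        if y >= 0:
            # Rotate clockwise to push y towards zero.
            x, y, z = x + y * factor, y - x * factor, z + ATAN_TABLE[i]
        else:
            # Rotate counter-clockwise to push y towards zero.
            x, y, z = x - y * factor, y + x * factor, z - ATAN_TABLE[i]
    return x / GAIN, z
```

For the input `(3.0, 4.0)` the result approximates a magnitude of 5 and the angle `math.atan2(4.0, 3.0)`. Logarithm, exponential, and division follow from the hyperbolic and linear variants of the same iteration, which are not shown here.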

Some selected application examples are:

  • In the European DESERVE project ("Development Platform for Safe and Efficient Drive"), a soft-core processor was extended with a programmable radix-2 CORDIC HW co-processor, speeding up the execution of a lane detection algorithm by a factor of 2.2 compared to a SW implementation using LibARITH.
  • In the Cluster of Excellence "Hearing4all", a VLIW-SIMD processor is being explored for digital hearing aid devices in terms of silicon area and power consumption. Early results show that using a radix-2 CORDIC HW co-processor (compared to a SW implementation using LibARITH) halves the operating frequency required for real-time processing while increasing the silicon area by a factor of only 1.2, resulting in an overall core power reduction by a factor of 1.3.


The company videantis provides a very power-efficient programmable VLIW-SIMD multi-core architecture for embedded video and vision applications that is used in SoC designs. For standard vision processing tasks in particular, videantis supports libraries like OpenCV as well as optimized SW modules, e.g., for complex schemes like HOG, LKT, or SfM. Efficient processing of the above-mentioned non-linear functions frequently plays an important role in these applications. CORDIC is a well-known general framework for providing this functionality efficiently and is therefore of great interest to videantis. The main goal of the project is to integrate the IMS CORDIC HW IP and SW library into the videantis SDK and toolchain and to evaluate them there. In this way, the SW library will accelerate the implementation and verification of new computer vision applications, and the IMS CORDIC IP will enhance the processing performance of the videantis IP processor for computationally intensive computer vision applications.