MINT: Dynamic-Precision CNN Inference with MSDF Digit-Serial Arithmetic on FPGA

arXiv Math

CC BY

This outlet displays the full text directly under a public or free license.

Abstract

We present MINT, a dynamic-precision CNN inference accelerator based on left-to-right (LR) arithmetic.

LR arithmetic computes in a most-significant-digit-first (MSDF) manner and exposes useful partial results early, so the computation can be terminated as soon as the desired precision is reached.
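The early-termination property can be illustrated with a minimal sketch. It assumes a radix-2 signed-digit fraction with digits from {-1, 0, 1}; the function names and the example operand are hypothetical and are not taken from the paper. The sketch only models the numerical behaviour of consuming digits from the most significant end, not the hardware datapath.

```python
def msdf_prefixes(digits):
    """Yield (j, value) after the j most significant radix-2 signed digits.

    `digits` holds values from {-1, 0, 1}, most significant first, and
    encodes the fraction sum(d_i * 2**-i for i = 1..n).
    """
    value = 0.0
    for j, d in enumerate(digits, start=1):
        if d not in (-1, 0, 1):
            raise ValueError("signed digits must be -1, 0 or 1")
        value += d * 2.0 ** -j
        yield j, value


def evaluate_until(digits, target_bits):
    """Consume digits MSD-first and stop after `target_bits` of them."""
    value = 0.0
    for j, value in msdf_prefixes(digits):
        if j >= target_bits:
            break  # the remaining digits are never examined
    return value


# Hypothetical 8-digit operand, used only for illustration.
example_digits = [1, 0, -1, 1, 1, 0, -1, 1]
full_value = evaluate_until(example_digits, len(example_digits))
early_value = evaluate_until(example_digits, 4)
```

Because every unseen digit has magnitude at most 1, stopping after j digits leaves an error below 2**-j. That bound is what makes a truncated result usable as a lower-precision answer.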

At its core is an MSDF serial-parallel inner-product unit, which uses a redundant signed-digit representation to compute each convolution window.
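A software sketch of a serial-parallel inner product follows. It assumes that activations are streamed digit-serially (most significant digit first) while weights are available in parallel; the abstract does not say which operand is serial, so this choice, the sign-magnitude digit encoding, and all names are illustrative assumptions rather than the authors' design.

```python
def to_signed_digits(x, n):
    """MSD-first radix-2 signed digits of integer x, assuming |x| < 2**n.

    Uses sign(x) times the bits of |x|, one valid (non-unique)
    signed-digit encoding with digits in {-1, 0, 1}.
    """
    sign = -1 if x < 0 else 1
    mag = abs(x)
    if mag >= 2 ** n:
        raise ValueError("operand does not fit in n digits")
    return [sign * ((mag >> i) & 1) for i in range(n - 1, -1, -1)]


def serial_parallel_dot(weights, activations, n, stop_after=None):
    """Inner product with parallel weights and digit-serial activations.

    Each step consumes one digit from every activation stream and updates
    acc = 2 * acc + sum(w_k * digit_k). Stopping after `stop_after` steps
    returns the partial result rescaled to the full-precision weight.
    """
    streams = [to_signed_digits(a, n) for a in activations]
    steps = n if stop_after is None else stop_after
    acc = 0
    for j in range(steps):
        acc = 2 * acc + sum(w * s[j] for w, s in zip(weights, streams))
    return acc * 2 ** (n - steps)


# Hypothetical 3-element window, for illustration only.
window_weights = [3, -2, 5]
window_activations = [100, -57, 23]
exact = serial_parallel_dot(window_weights, window_activations, n=8)
partial = serial_parallel_dot(window_weights, window_activations, n=8, stop_after=4)
```

In this model, stopping after j of n digits leaves an error of at most sum(|w_k|) * (2**(n - j) - 1), so the number of processed digits acts directly as a precision knob.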

A budget-constrained greedy search profiles all convolution layers from INT2 to INT7 and selects the lowest precision per layer while constraining total accuracy loss to within 2% of the INT8 baseline for VGG-16 and ResNet-18 networks.
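One plausible shape for such a search is sketched below. The abstract does not give the exact procedure, so this is an assumption-laden illustration: it supposes that per-layer accuracy drops have been profiled in isolation, that they add up across layers, and that the search repeatedly lowers the layer whose next step costs the least. The loss table contains made-up numbers, not results from the paper.

```python
def greedy_precision_search(loss_table, budget, max_bits=8, min_bits=2):
    """Pick a bit width per layer under a total accuracy-loss budget.

    loss_table[layer][bits] is the profiled accuracy drop (percentage
    points) when only that layer runs at `bits`; INT8 counts as zero drop.
    ASSUMPTION: drops are additive across layers.
    """
    bits = {layer: max_bits for layer in loss_table}
    total = 0.0
    while True:
        best = None
        for layer, b in bits.items():
            if b <= min_bits:
                continue
            current = loss_table[layer].get(b, 0.0)
            extra = loss_table[layer][b - 1] - current
            fits = total + extra <= budget
            if fits and (best is None or extra < best[0]):
                best = (extra, layer)
        if best is None:          # no single-bit reduction fits the budget
            return bits, total
        extra, layer = best
        bits[layer] -= 1
        total += extra


# Made-up profile for two hypothetical layers (not data from the paper).
toy_losses = {
    "conv1": {7: 0.1, 6: 0.3, 5: 0.9, 4: 2.5, 3: 6.0, 2: 15.0},
    "conv2": {7: 0.0, 6: 0.1, 5: 0.2, 4: 0.6, 3: 1.8, 2: 5.0},
}
chosen_bits, total_loss = greedy_precision_search(toy_losses, budget=2.0)
```

A real flow would re-measure end-to-end accuracy after each change rather than rely on additivity. The additive table keeps the sketch self-contained.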

The design is synthesized on a Xilinx Zynq-7020 at 200 MHz and uses an average of 5.64 bits for VGG-16 and 6.04 bits for ResNet-18, achieving 19.86 GOPS and 29.51 GOPS/W on VGG-16 and 18.86 GOPS and 26.40 GOPS/W on ResNet-18.

For VGG-16 and ResNet-18 respectively, this corresponds to 32.6% and 26.0% higher throughput and 82.10% and 62.90% higher energy efficiency than INT8, with accuracy drops of only 1.81% and 1.96% relative to the INT8 baseline.
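As a quick consistency check, the reported gains can be inverted to estimate the INT8 figures they imply. The sketch assumes the percentages are listed in VGG-16, ResNet-18 order and are relative gains over an INT8 configuration of the same design. The derived baselines are back-of-the-envelope estimates, not numbers stated in the abstract.

```python
# Figures as quoted in the abstract (gains in percent over INT8).
reported = {
    "VGG-16": {"gops": 19.86, "gops_gain": 32.6,
               "gops_per_w": 29.51, "eff_gain": 82.10},
    "ResNet-18": {"gops": 18.86, "gops_gain": 26.0,
                  "gops_per_w": 26.40, "eff_gain": 62.90},
}


def implied_baseline(value, gain_percent):
    """Invert 'value is gain_percent higher than the baseline'."""
    return value / (1.0 + gain_percent / 100.0)


# Map each network to (implied INT8 GOPS, implied INT8 GOPS/W).
baselines = {
    net: (implied_baseline(r["gops"], r["gops_gain"]),
          implied_baseline(r["gops_per_w"], r["eff_gain"]))
    for net, r in reported.items()
}

for net, (gops, gops_per_w) in baselines.items():
    print(f"{net}: implied INT8 ~ {gops:.2f} GOPS, {gops_per_w:.2f} GOPS/W")
```

Under these assumptions both networks imply nearly the same INT8 baseline, roughly 15 GOPS and 16.2 GOPS/W. That is what one would expect if the baseline is the same datapath run at a fixed 8 digits.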

Compared with the representative prior FPGA CNN accelerators considered in this study, MINT delivers the highest energy efficiency among the listed VGG-16 and ResNet-18 designs on the Zynq-7020 platform.
