Virtual Prototype based Analysis of Neural Network Cache Behavior for Tiny Edge Device

Alexander Fratzer; Vladimir Herdt; Christoph Lüth; Rolf Drechsler

In: Forum on Specification & Design Languages (FDL-2022), September 14-16, 2022, Linz, Austria.


The demand for AI, and specifically for machine learning functionality on edge devices (TinyML), is growing. TinyML faces several unique challenges, one of them being the requirement of a low memory footprint for the storage and inference of neural networks. In this paper, we propose and evaluate an approach to make Convolutional Neural Networks (CNNs) with a higher memory footprint executable on edge devices. The idea is to combine flash memory with cached access for the inference of CNNs. To evaluate the effectiveness of the proposed memory architecture, we measure the cache hit rate at the system level using a Virtual Prototype (VP) that we build with a dedicated flash device and an exclusive cache. We use TensorFlow Lite Micro (TFLM) for the network inference and map all model data and runtime buffers into the cache-enhanced flash memory. Multiple experiments with several cache configurations show that a small cache of around 1 KB is able to achieve very high hit rates of over 99%. Additionally, the experimental results show that the memory planning of TFLM supports the use of caching because most memory accesses are adjacent.
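
The link between adjacent memory accesses and a high cache hit rate can be illustrated with a small simulation. The following is a minimal sketch only, not the VP described in the paper: it assumes a direct-mapped cache with 32-byte lines and byte-granular accesses, and the class name `DirectMappedCache`, the trace lengths, and all parameters are hypothetical choices made for illustration. It does not model TFLM, flash timing, or the cache configurations evaluated by the authors, so its numbers are not comparable to the reported results.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that only counts hits and misses (assumed design)."""

    def __init__(self, size_bytes=1024, line_bytes=32):
        if size_bytes % line_bytes != 0:
            raise ValueError("cache size must be a multiple of the line size")
        self.line_bytes = line_bytes
        self.num_lines = size_bytes // line_bytes
        # One stored block number per cache line; None marks an empty line.
        self.blocks = [None] * self.num_lines
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Simulate one byte access at the given address."""
        block = addr // self.line_bytes      # which memory block is touched
        index = block % self.num_lines       # which cache line it maps to
        if self.blocks[index] == block:
            self.hits += 1
        else:
            self.misses += 1
            self.blocks[index] = block       # fill the line on a miss

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def run_trace(addresses, **cache_args):
    """Feed an address trace into a fresh cache and return the hit rate."""
    cache = DirectMappedCache(**cache_args)
    for addr in addresses:
        cache.access(addr)
    return cache.hit_rate()


# Adjacent accesses: walk 64 KiB byte by byte (hypothetical trace).
sequential_rate = run_trace(range(64 * 1024))

# Non-adjacent accesses: jump one full cache line per access (hypothetical trace).
strided_rate = run_trace(range(0, 64 * 1024, 32))

print(f"sequential: {sequential_rate:.4f}, strided: {strided_rate:.4f}")
```

In this toy setting the sequential walk misses only once per cache line, while the strided walk misses on every access, even though both use the same 1 KB cache. This is the general effect behind the observation that adjacent accesses make a small cache effective; the hit rates measured in the paper depend on the actual access patterns of the inference and on the cache configuration.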


German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz