Wednesday, July 31, 2024

Etnaviv NPU update 20: Fast object detection on the NXP i.MX 8M Plus SoC

I'm happy to announce that my first project to support the NPU in NXP's i.MX 8M Plus SoC has reached the feature-complete stage.

CC BY-NC 4.0 Henrik Boye

For the last several weeks I have been working full-time on adding support for this NPU to the existing Etnaviv driver. Most of the existing code that supports the NPU in the Amlogic A311D was reused, but NXP used a much more recent version of the NPU IP, so some of its newer features required new code, which in turn required reverse engineering.

This work has been kindly sponsored by the Open Source consultancy Ideas On Board, for which I am very grateful. I hope this will be useful to those companies that need full mainline support in their products, even if it is just the start.

This company is unique in working on both NPU and camera drivers in Linux mainline, so they have the best experience for products that require long-term support and vision processing.

Since the last update I have fixed the last bugs in the compression of the weights tensor and implemented support for a new hardware-assisted way of executing depthwise convolutions. Some improvements to how the tensor addition operation is lowered to convolutions were needed as well.
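
To illustrate the general idea of lowering an addition to a convolution, here is a minimal Python sketch. It is not taken from the driver: the names `conv1x1` and `add_as_conv` are hypothetical, it works on plain lists of numbers rather than quantized tensors, and the actual lowering in the Mesa branch may well differ. It concatenates the two inputs along the channel axis and applies a 1x1 convolution whose weights sum the matching channels.

```python
# Illustrative only: element-wise addition expressed as a 1x1 convolution.
# This is an assumption-laden sketch, not code from the Etnaviv driver.
# Tensors are nested lists indexed as tensor[row][column][channel].


def conv1x1(inp, weights):
    """Apply a 1x1 convolution; weights is indexed as [out_channel][in_channel]."""
    return [
        [
            [sum(w * pixel[k] for k, w in enumerate(row)) for row in weights]
            for pixel in line
        ]
        for line in inp
    ]


def add_as_conv(a, b):
    """Compute a + b by convolving the channel-wise concatenation of a and b."""
    channels = len(a[0][0])
    # Concatenate the two inputs along the channel axis: C channels become 2C.
    stacked = [
        [pa + pb for pa, pb in zip(line_a, line_b)]
        for line_a, line_b in zip(a, b)
    ]
    # Output channel c reads input channel c (from a) and c + C (from b).
    weights = [
        [1 if k % channels == c else 0 for k in range(2 * channels)]
        for c in range(channels)
    ]
    return conv1x1(stacked, weights)


a = [[[1, 2], [3, 4]]]          # 1 row, 2 columns, 2 channels
b = [[[10, 20], [30, 40]]]
print(add_as_conv(a, b))        # [[[11, 22], [33, 44]]]
```

The point of such a lowering is that hardware built around convolutions can then execute the addition without a dedicated code path; a real implementation would also have to handle the quantization parameters of both inputs, which this sketch leaves out.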

Performance is pretty good already, allowing objects to be detected in video streams at 30 frames per second, which is similar to the performance of the NPU in the Amlogic A311D. Some performance features are left to be implemented, so I think there is still substantial room for improvement.

At present the code is very much in a proof-of-concept state. The next step is cleaning it all up and submitting it for review to Mesa3D. In the meantime, you can find the draft code at https://gitlab.freedesktop.org/tomeu/mesa/-/tree/etnaviv-imx8mp.

A big thanks to Philipp Zabel, who reverse engineered the bitstream format of the weight encoding and added some patches to the kernel that were required for the NPU to work reliably.

7 comments:

Anonymous said...

Amazing stuff! Do you know whether Ideas On Board is working with GNOME / Mobian / Purism devs to enable use of the NPU on the Librem 5 via Pipewire?

Anonymous said...

The Librem 5 uses the NXP i.MX8MQ which does not have the Verisilicon NPU.

Anonymous said...

Ah yes, I got the two confused...
In any case, I see from libcamera v0.3.1 that Kieran Bingham from Ideas On Board is working on the stack from which the Librem 5’s camera should benefit.

Thanks to you all!

Anonymous said...

Thank you for this work!

Are there patches still needed upstream for Linux etnaviv and libdrm and if not what kernel version has what you need to reproduce this?

Could you provide some details on what your dev setup looks like in order to build a custom mesa and anything else to reproduce this?

Tomeu Vizoso said...

> Are there patches still needed upstream for Linux etnaviv and libdrm and if not what kernel version has what you need to reproduce this?

Kernel 6.10 should have everything you need. No changes to libdrm were needed.

> Could you provide some details on what your dev setup looks like in order to build a custom mesa and anything else to reproduce this?

I'm afraid it is still too much of a PoC at the moment for people to try it out without a significant time investment (depending on your level of experience).

But in case you want to try it out, these links may help you:

https://gitlab.freedesktop.org/tomeu/mesa/-/tree/etnaviv-imx8mp
https://docs.mesa3d.org/teflon.html
https://github.com/tomeuv/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/tree/teflon-demo

Good luck!

Anonymous said...

I'm working with a Gateworks Venice GW7401 which has an IMX8MP. I have a 6.10 kernel with etnaviv and npu enabled and an Ubuntu rootfs.

I've been able to build your mesa easily enough so I have a libteflon.so, but for the life of me I can't figure out how to get TensorFlow Lite installed on either Ubuntu 24.04 (noble) or Ubuntu 22.04 (jammy). I have been following various guides and running into various issues. Some questions about that:
- do we still need an older version of python-opencv (3.4.11.41 from one of your guides)
- do we need python-opencv at all?

How are you going about installing TensorFlow Lite and what OS/Distro are you using to do it?

Also, you show a labelled picture - are you generating this with Gstreamer and if so what pipeline/elements?

Tomeu Vizoso said...

> - do we still need an older version of python-opencv (3.4.11.41 from one of your guides)
> - do we need python-opencv at all?

OpenCV isn't strictly needed by the NPU driver. It just happens to be used by the demo that I chose to validate this work at this stage. I don't know of any specific version requirement, though. If you have been trying with Ubuntu packages so far, then maybe give the pip packages a try.

> How are you going about installing TensorFlow Lite and what OS/Distro are you using to do it?

I'm using Debian. Are the instructions in this link not working?

https://docs.mesa3d.org/teflon.html#install-runtime-dependencies

> Also, you show a labelled picture - are you generating this with Gstreamer and if so what pipeline/elements?

That's drawn by OpenCV as driven by the example scripts in the repo below:

https://github.com/tomeuv/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/tree/teflon-demo

I estimate that in two months or so I will be able to get this branch into a more workable state. I will also see how I can make it easier to demo this.