Which Compiler to Use?


Hi Guys.

I've been using the "Xilinx ARM GNU/Linux" (arm-xilinx-linux-gnueabi-) toolchain for all things snickerdoodle.
Recently, though, I realized that this compiler lacks support for the Zynq's ARM VFP3 hardware floating-point extensions, so I switched to the "Xilinx ARM v7 GNU/Linux" (arm-linux-gnueabihf-) toolchain.
The examples in the Snickerdoodle Book specify the non-VFP toolchain for building u-boot and the kernel.
Is there ever any reason not to use the VFP version on snickerdoodle?
Thanks,

-Nick

Hi Nick,

It took a bit of time to track down some real answers for this, as I wasn't sure about the support myself. I haven't found anything showing that the toolchain doesn't support the VFP3 extensions; rather, they are not enabled by default. The default settings (shown by running arm-xilinx-linux-gnueabi-gcc -v) include --with-float=softfp and --with-fpu=neon-fp16, which give single-precision floating-point support and allow the compiler to use hardware FPU instructions. You should be able to switch to VFPv3 by adding CFLAGS='-mfpu=vfpv3' to your make command, for example:

$ make ARCH=arm CROSS_COMPILE=arm-xilinx-linux-gnueabi- CFLAGS='-mfpu=vfpv3' LOADADDR=0x8000 uImage

See page 16 of https://forums.xilinx.com/xlnx/attachments/xlnx/ELINUX/8281/1/neon_fpu_hw_acceleration_v2.pdf for the defaults.
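
As a rough sketch of the check described above, the CPU/FPU defaults can be filtered out of the long "Configured with:" line that gcc -v prints. The helper name fp_defaults is made up for illustration, and the filter assumes the options appear unquoted and space-separated, as they do in the excerpts in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print only the CPU/FPU-related configure options
# found in `gcc -v` output supplied on stdin, one option per line.
fp_defaults() {
    tr ' ' '\n' | grep -E '^--with-(arch|cpu|tune|float|fpu)='
}

# gcc -v writes to stderr, hence the 2>&1. The guard keeps the script
# harmless on a machine where the cross-toolchain is not installed.
if command -v arm-xilinx-linux-gnueabi-gcc >/dev/null 2>&1; then
    arm-xilinx-linux-gnueabi-gcc -v 2>&1 | fp_defaults
fi
```

The same function should work for any of the cross-compilers; only the command name in front of the pipe changes.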

Can you post what you've found about the limitations of the toolchain?

(Replying to Nick Burkitt's post of Wednesday, August 3, 2016 at 12:52:58 PM UTC-7.)

Hi Russel.

Wow. Not as simple a question as I had thought. There are three toolchains involved here: the two Xilinx-sourced cross-toolchains and the Ubuntu-sourced toolchain on snickerdoodle itself. All have some options that make sense, others that don't, and some that are simply pointless in this context.

The Ubuntu toolchain is unavoidable, as it would have been used to build everything in the distro. The good news is that it is configured to use hardware FP. The bad news is that it's the stripped-down, 16-register VFPv3-d16 version. Zynq's ARMv7 has the full 32-register VFPv3 FPU, no? It's also not tuned for the Cortex-A9 processor. I have no idea what practical effect either of those has on real performance, but it's not the preferred configuration. When I have the time, I'll look into Yocto (or just follow this blog).
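
One way to see how an existing binary was actually built, rather than how the compiler was configured, is to read its ARM build attributes with readelf -A. The sketch below is only an illustration: fp_summary is a made-up helper, and it assumes the usual attribute names (Tag_FP_arch, or Tag_VFP_arch in older binutils, and Tag_ABI_VFP_args, which is present only for the hard-float calling convention).

```shell
#!/bin/sh
# Hypothetical helper: summarize the floating-point build attributes
# from `readelf -A` output supplied on stdin.
fp_summary() {
    awk '
        /Tag_(FP|VFP)_arch:/              { arch = $2 }      # e.g. VFPv3-D16
        /Tag_ABI_VFP_args: VFP registers/ { abi = "hard" }   # hard-float ABI
        END {
            printf "fpu=%s abi=%s\n", (arch ? arch : "none"), (abi ? abi : "soft-or-softfp")
        }
    '
}

# Example use on the board (the choice of /bin/ls is just an assumption):
#   readelf -A /bin/ls | fp_summary
```

Note that soft and softfp share the same calling convention, so the attributes cannot tell them apart by ABI alone; the FPU line shows whether hardware FP instructions were permitted.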

I compared the three compilers' -v outputs. arm-linux-gnueabihf (both Xilinx's and Ubuntu's) is from Linaro, while arm-xilinx-linux-gnueabi is from Mentor Graphics' Sourcery CodeBench. The options disagree in many places. I've attached the full outputs and excerpted the CPU/FPU bits below.

The Xilinx/Sourcery version uses

gcc version 4.9.2 (Sourcery CodeBench Lite 2015.05-17)
--with-arch=armv5te
--with-arch=armv7-a
--with-cpu=cortex-a9
--with-float=softfp
--with-fpu=neon-fp16

while the Xilinx/Linaro version uses

gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09)
--enable-multiarch
--with-arch=armv7-a
--with-float=hard
--with-fpu=vfpv3-d16
--with-tune=cortex-a9

and the Ubuntu/Linaro version is different from both of those:

gcc version 4.8.2 (Ubuntu/Linaro 4.8.2-19ubuntu1)
--enable-multiarch
--with-arch=armv7-a
--with-float=hard
--with-fpu=vfpv3-d16

Here's a pretty clear comparison of the various GCC FPU options (and a link to the GCC configuration page).

Ideally, it seems, snickerdoodle code would use
--with-cpu=cortex-a9
--with-float=hard
--with-fpu=vfpv3 if you need double-precision math
or
--with-fpu=neon if you don't.

Hmm. In fact, the SDK builds the snickerdoodle BSP using -mcpu=cortex-a9 -mfpu=vfpv3 -mfloat-abi=hard.
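
The --with-* options listed earlier are configure-time defaults, while the SDK's flags are the per-compile equivalents. A small sketch of that correspondence follows; with_to_mflags is a made-up name, and the mapping assumes the usual GCC convention that each --with-X default stands in for the matching -mX switch (with --with-float corresponding to -mfloat-abi).

```shell
#!/bin/sh
# Hypothetical helper: translate GCC configure-time defaults (--with-*),
# read from stdin one per line, into the matching per-compile switches.
with_to_mflags() {
    sed -n \
        -e 's/^--with-arch=/-march=/p' \
        -e 's/^--with-cpu=/-mcpu=/p' \
        -e 's/^--with-tune=/-mtune=/p' \
        -e 's/^--with-fpu=/-mfpu=/p' \
        -e 's/^--with-float=/-mfloat-abi=/p'
}

# The "ideal" settings from this post, expressed as compiler switches.
printf '%s\n' --with-cpu=cortex-a9 --with-float=hard --with-fpu=vfpv3 | with_to_mflags
```

For the three options shown, this prints the same switches the SDK uses for the BSP.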

Okay, bottom line: it probably doesn't matter which toolchain you use, as long as you add the CPU/FPU compiler switches you want.
Whew.

-Nick

(Replying to Bush's post of Friday, August 5, 2016 at 11:40:37 AM UTC-7.)