Rust Docker image broken on architecture linux/arm/v7

So, this has had me scratching my head (read: nearly throwing my computer through the window) for a day, but I think I can now confidently say there's something broken about the linux/arm/v7 variant of the current Docker image.

In short, cargo doesn't work. I've not run into this before because in general I cross compile my services locally and then just copy the compiled binary into a Docker image to run on the target device, but I have one thing I'm building that depends on some non-Rust libraries that are just easier to build and link on the target architecture. So, I set up a Docker buildx process, which works fine building for linux/arm64 but bursts into flames for linux/arm/v7:

[Sasha:project-maramures-ssbt/pn532-nfc-driver] timwa% ./build-nfc.sh
[+] Building 24.4s (14/23)
 => [internal] load build definition from Dockerfile
 => => transferring dockerfile: 1.31kB
 => [internal] load .dockerignore
 => => transferring context: 2B
 => [internal] load metadata for docker.io/library/ubuntu:bionic
 => [internal] load metadata for docker.io/library/rust:latest
 => [internal] load build context
 => => transferring context: 29.90kB
 => [builder  1/12] FROM docker.io/library/rust@sha256:86bb2e39e20f602b169ade0bc5cff7460a49af2f60b9aefc390db48eb888d20a
 => => resolve docker.io/library/rust@sha256:86bb2e39e20f602b169ade0bc5cff7460a49af2f60b9aefc390db48eb888d20a
 => [stage-1 1/6] FROM docker.io/library/ubuntu:bionic@sha256:ea188fdc5be9b25ca048f1e882b33f1bc763fb976a8a4fea446b38ed0efcbeba
 => => resolve docker.io/library/ubuntu:bionic@sha256:ea188fdc5be9b25ca048f1e882b33f1bc763fb976a8a4fea446b38ed0efcbeba
 => => sha256:b017766c8f695603aa5da2f534135bd1fbe253f667ef6faa77a77c66be9ba9f5 187B / 187B
 => => sha256:0e961f21c7ea83c265a6de2897b8255e9e03cf1d39b74fb208b4ca936c6a53c5 853B / 853B
 => => sha256:6d8bd15f2f6189f24e8f1b5dc573a293c963565ab012ca6a42e51a3023e72e7e 22.29MB / 22.29MB
 => => extracting sha256:6d8bd15f2f6189f24e8f1b5dc573a293c963565ab012ca6a42e51a3023e72e7e
 => => extracting sha256:0e961f21c7ea83c265a6de2897b8255e9e03cf1d39b74fb208b4ca936c6a53c5
 => => extracting sha256:b017766c8f695603aa5da2f534135bd1fbe253f667ef6faa77a77c66be9ba9f5
 => CACHED [builder  2/12] RUN echo "I am running on linux/amd64, building for linux/arm/v7"
 => CACHED [builder  3/12] RUN apt-get update
 => [builder  4/12] RUN apt-get install -y libi2c-dev libusb-dev libnfc-bin libnfc-dev libnfc-examples libnfc-pn53x-examples
 => [stage-1 2/6] RUN apt-get update
 => CANCELED [stage-1 3/6] RUN apt-get install -y libnfc5 libnfc-bin
 => [builder  5/12] WORKDIR /usr/src
 => ERROR [builder  6/12] RUN cargo install cargo-build-dependencies
------
 > [builder  6/12] RUN cargo install cargo-build-dependencies:
#14 0.834     Updating crates.io index
#14 1.590 warning: spurious network error (2 tries remaining): could not read directory '/usr/local/cargo/registry/index/github.com-1285ae84e5963aae/.git//refs': Value too large for defined data type; class=Os (2)
#14 2.087 warning: spurious network error (1 tries remaining): could not read directory '/usr/local/cargo/registry/index/github.com-1285ae84e5963aae/.git//refs': Value too large for defined data type; class=Os (2)
#14 2.628 error: failed to fetch `https://github.com/rust-lang/crates.io-index`
#14 2.628 
#14 2.628 Caused by:
#14 2.629   could not read directory '/usr/local/cargo/registry/index/github.com-1285ae84e5963aae/.git//refs': Value too large for defined data type; class=Os (2)
------
failed to solve: rpc error: code = Unknown desc = executor failed running [/bin/sh -c cargo install cargo-build-dependencies]: exit code: 101

(I'm using Docker's QEMU-based cross-platform emulation to sloooowly build.)

I am going to hazard a wild guess that the problem ("value too large for defined data type") is some kind of datatype sizing problem in the 32-bit build environment (inode maybe?) causing Cargo to crash and burn.

Through a process of elimination, I've determined that versions earlier than 1.30 don't blow up in this way (although there are other problems it seems...)

I'm not sure yet if the problem is isolated to the official docker images, or if it's the Rust toolchain itself that's broken on arm/v7 - I'm going to attempt to set up a build environment of my own to see if I can work that out. Maybe someone here already knows..? Should I give up trying to get Rust to compile on arm/v7? (Not for arm/v7, but on arm/v7)?

Note that linux/arm64 works just fine, so I assume this is a 32bit/64bit issue. Unfortunately my target device is not running 64 bit...
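For anyone who wants to see what the build environment itself reports, here is a minimal sketch to run inside the container. It assumes uname and getconf are on the PATH in the image; report_userland is just a name made up for this example. It only prints what the userland claims about itself and does not diagnose the cargo failure.

```shell
#!/bin/sh
# Print the machine type and the width of a C long as seen from inside
# the current environment. Assumes uname and getconf are installed.
report_userland() {
    echo "machine: $(uname -m)"
    echo "long bits: $(getconf LONG_BIT)"
}

report_userland
```

If the second line says 32 while the host is a 64-bit machine, the build really is running in a 32-bit userland on top of a 64-bit kernel.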

Your problem seems to be spelled out in this line of your output:

RUN echo "I am running on linux/amd64, building for linux/arm/v7"

Looks like it has determined it is running on a 64 bit system rather than 32 bit. Likely because it has detected the processor is a 64 bit device rather than detecting the 32 bit OS it's running on.

I have had this problem on Raspberry Pi before. Sadly I can't remember what it is I was trying to build or how the issue was resolved.

I knew I shouldn't have included that line - ignore it, it's just telling me the host operating system as determined from an environment variable. I think I've found the problem; I'll write it up in another post just as soon as this build actually completes...

But for the impatient - spoiler: the official docker Rust image is broken.

OK, so, a choir of angels sings and I have a working build.

The issue appears to be that the toolchain in the official rust image on dockerhub is broken for linux/arm/v7.

Building my own docker image 'from scratch' (ubuntu:bionic as a base image and then installing the rust toolchain on top) works:

FROM ubuntu:bionic as builder
ARG APPNAME="nfc-svc"
RUN apt-get update
RUN apt-get install -y build-essential
RUN apt-get install -y curl

# Install Rust from scratch
WORKDIR /usr/rust
COPY installrust.sh ./
RUN ./installrust.sh  -y
ENV PATH="/root/.cargo/bin:$PATH"

But, you may be asking, why is he using a local copy of the Rust install script instead of curling it direct from https://sh.rustup.rs?

Answer: because the script from https://sh.rustup.rs doesn't work properly and needs a little local patching. I can't be 100% sure this is the root of the problem in the official image, but I can say for certain that this part of the install script is broken on linux/arm/v7 in Docker/qemu:

get_bitness() {
    need_cmd head
    # Architecture detection without dependencies beyond coreutils.
    # ELF files start out "\x7fELF", and the following byte is
    #   0x01 for 32-bit and
    #   0x02 for 64-bit.
    # The printf builtin on some shells like dash only supports octal
    # escape sequences, so we use those.
    local _current_exe_head
    _current_exe_head=$(head -c 5 /proc/self/exe )
    if [ "$_current_exe_head" = "$(printf '\177ELF\001')" ]; then
        echo 32
    elif [ "$_current_exe_head" = "$(printf '\177ELF\002')" ]; then
        echo 64
    else
        err "unknown platform bitness"
    fi
}

Checking /proc/self/exe doesn't work in this context, which means this function fails (it reaches the "unknown platform bitness" line). Unfortunately, it doesn't crash out on that error - the rest of the script just silently carries on with the bitness unknown. This results in a successful but broken install.
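One plausible reason the error does not stop the script, assuming the function is called through command substitution (I have not traced the real call site): an exit inside $(...) only ends the subshell. The sketch below uses made-up stand-ins (err, detect_bitness), not the actual install script code.

```shell
#!/bin/sh
# Illustration only: err and detect_bitness are stand-ins, not the real
# install script functions.

# Print a message to stderr and exit, like a typical error helper.
err() {
    echo "error: $1" >&2
    exit 1
}

# A detection function that always fails.
detect_bitness() {
    err "unknown platform bitness"
}

# Command substitution runs detect_bitness in a subshell, so the exit
# inside err ends only that subshell. The caller gets empty output and
# a non-zero status, and keeps going unless it checks that status.
status=0
bitness=$(detect_bitness) || status=$?

echo "caller still running; bitness='${bitness}', status=${status}"
```

The last line still prints, with an empty bitness, which matches the "carries on with the bitness unknown" behaviour described above.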

Like I say, I can't confirm that this is what is happening in the official Docker images, but if they are using the regular install script same as the rest of us, and being built using the Docker multiarch build system, there is a good chance that this is the reason the builds appear successful but actually are broken.

If I hack that function to return 32 instead of an error when /proc/self/exe doesn't exist, I can get everything to build. (Working out a method for the script to properly detect bitness (and indeed endianness - the following get_endianness() function also fails) in the Docker environment is a job for another day. Alternatively, the solution may be to work out how to get /proc/self/exe working right in the Docker/qemu env.)
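A rough sketch of the kind of hack described above, not the exact patch. The function name get_bitness_patched and the FALLBACK_BITNESS variable are made up for this example, and the default of 32 is only sensible when you already know the target is 32-bit (as with linux/arm/v7).

```shell
#!/bin/sh
# Hypothetical variant of get_bitness: same ELF header check, but falls
# back to a fixed value instead of erroring when detection fails.
# FALLBACK_BITNESS is an invented knob; it defaults to 32.
get_bitness_patched() {
    # Read the first 5 bytes of the running executable; stay quiet if
    # /proc/self/exe cannot be read.
    _exe_head=$(head -c 5 /proc/self/exe 2>/dev/null)
    if [ "$_exe_head" = "$(printf '\177ELF\001')" ]; then
        echo 32
    elif [ "$_exe_head" = "$(printf '\177ELF\002')" ]; then
        echo 64
    else
        # Detection failed: use the assumed value rather than an error.
        echo "${FALLBACK_BITNESS:-32}"
    fi
}

get_bitness_patched
```

This hides the detection failure rather than fixing it, so it is only a stopgap for a build where the answer is known in advance.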