Solved: PahoDescr(-1, "TCP/TLS connect failure") until I clean/build (ARM64)

Solved:

It turns out the rebuild itself was not the fix. It just added enough delay after boot for something in the stack to finish initializing, after which the connection to the MQTT broker could be established. I added a 120s delay to the cron job that runs the app and it works fine. (I still need to add code to retry the connection on failure.)

Thanks!

OP

Good morning all,
I'm getting results I don't understand with a console app that uses paho-mqtt. When I build it, it runs as expected. Following a reboot, it produces the following error until I rebuild it.

Unable to connect:
        PahoDescr(-1, "TCP/TLS connect failure")

This is on a Raspberry Pi CM4 running Debian (64-bit) with the following Rust version:

hbarta@glencoe:~/Downloads/mqtt_recorder$ rustc --version
rustc 1.71.1 (eb26296b5 2023-08-03)

I've built on Debian on Intel and do not experience the same issue. Apparently something changes when I run cargo. I've looked at ENV variables and they're essentially unchanged. Output of ldd on the resulting executable does change:

hbarta@glencoe:~/Downloads/mqtt_recorder$ diff ldd*
1,6c1,6
<       linux-vdso.so.1 (0x0000ffffb4b2b000)
<       libssl.so.3 => /lib/aarch64-linux-gnu/libssl.so.3 (0x0000ffffb4910000)
<       libcrypto.so.3 => /lib/aarch64-linux-gnu/libcrypto.so.3 (0x0000ffffb44b0000)
<       libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffffb4470000)
<       libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffffb42c0000)
<       /lib/ld-linux-aarch64.so.1 (0x0000ffffb4aee000)
---
>       linux-vdso.so.1 (0x0000ffffaeeee000)
>       libssl.so.3 => /lib/aarch64-linux-gnu/libssl.so.3 (0x0000ffffaecd0000)
>       libcrypto.so.3 => /lib/aarch64-linux-gnu/libcrypto.so.3 (0x0000ffffae870000)
>       libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffffae830000)
>       libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffffae680000)
>       /lib/ld-linux-aarch64.so.1 (0x0000ffffaeeb1000)
hbarta@glencoe:~/Downloads/mqtt_recorder$ 

I'm afraid I don't understand the significance of the change. I've only ever used ldd to identify missing libraries, so I don't know what it means when the addresses change. Perhaps it's not related to the issue at all.

Code for the project (work in progress) can be found at https://github.com/HankB/mqtt_recorder

Any help resolving this is appreciated.

Thanks!

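it's just the dynamic loader placing the libraries at different addresses on each run (address space layout randomization). the set of libraries is identical, so it should not matter for this particular error.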

I'm not familiar with mqtt, so I am just guessing, but looking at your code, the clean_session() option looks suspicious to me. maybe it's a credential caching problem? e.g. maybe some data is stored to certain location which is non-volatile on PC but is volatile on the raspberry pi system?

btw, maybe you can get more detailed error if you enable logging/tracing for the mqtt library.

maybe you can get more detailed error if you enable logging/tracing for the mqtt library.

Seems like a good idea - Thanks!

I could also try a different setting for clean_session(). But AFAIK that just tells the broker that the app is reconnecting and to continue with any traffic interrupted during the previous session. If it doesn't connect I don't think it would make a difference.

Edit: The paho-mqtt library provides some environment variables for tracing and I have enabled those:

export MQTT_C_CLIENT_TRACE=ON
export MQTT_C_CLIENT_TRACE_LEVEL=PROTOCOL

I also discovered some interesting results while exploring these. Running the executable directly from the command line (./target/release/mqtt_recorder) it did not connect. Invoking using cargo run resulted in expected operation. After that, invoking the executable directly did work as expected. (cargo run did not appear to rebuild anything.)

After a reboot, no invocation resulted in a connection until I ran cargo clean and cargo build, after which both the direct invocation and cargo run connected. Following another reboot I:

  • checked the timestamp on the executable
  • ./target/debug/mqtt_recorder - no connection
  • cargo run - no connection
  • cargo build and cargo run - no connection
  • cargo build -r (and confirmed that the timestamp on the debug exe had not changed)
  • cargo run - got connection
  • ./target/debug/mqtt_recorder - got connection

Rebooted.

Ran the debug and release executables and eventually got a connection. Perhaps it is a timing thing on the host. Forcing a rebuild introduces an 80s delay, and perhaps that allows something else to get working. The error message mentions TLS, and perhaps that takes longer to initialize on a Raspberry Pi.

I've added a 120s delay to the cron entry that runs this app and it seems to be working fine following a reboot.
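
If the timing theory holds, an alternative to a fixed delay would be to poll until the broker's TCP port is actually reachable before attempting the MQTT connection. Below is a minimal sketch using only the standard library. The function name wait_for_tcp and its parameters are made up for illustration, and it only checks plain TCP reachability, not TLS, so it may not cover whatever was actually slow to initialize here.

```rust
use std::net::{TcpStream, ToSocketAddrs};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: poll until a TCP connection to `addr` ("host:port")
/// succeeds, or give up once `total` has elapsed. Returns true if reachable.
fn wait_for_tcp(addr: &str, total: Duration, poll: Duration) -> bool {
    let start = Instant::now();
    loop {
        // Name resolution can itself fail early in boot, so redo it each pass.
        let reachable = addr
            .to_socket_addrs()
            .map(|mut addrs| addrs.any(|a| TcpStream::connect_timeout(&a, poll).is_ok()))
            .unwrap_or(false);
        if reachable {
            return true;
        }
        if start.elapsed() >= total {
            return false;
        }
        // Pause before the next probe so we do not spin.
        sleep(poll);
    }
}
```

The app could call something like wait_for_tcp("broker-host:8883", Duration::from_secs(120), Duration::from_secs(2)) at startup (the host and port are placeholders) and only then try to connect, so it starts as soon as the network is ready instead of always waiting the full 120s.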

I think I need to tweak the code to retry on connection failure.
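
A rough sketch of what that retry could look like, written as a generic helper so it does not depend on any particular paho-mqtt call. The name retry_with_backoff, its parameters, and the doubling backoff are my own assumptions, not something from the library.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: call `attempt` until it succeeds or it has been tried
/// `max_attempts` times (always at least once). The wait between tries
/// doubles each time, capped at `max_delay`. Returns the first success or
/// the last error.
fn retry_with_backoff<T, E, F>(
    max_attempts: u32,
    initial_delay: Duration,
    max_delay: Duration,
    mut attempt: F,
) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    let mut delay = initial_delay;
    let mut n = 1;
    loop {
        match attempt(n) {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller.
            Err(e) if n >= max_attempts => return Err(e),
            Err(_) => {
                sleep(delay);
                delay = (delay * 2).min(max_delay);
                n += 1;
            }
        }
    }
}
```

The closure passed as attempt would wrap whatever connect call the app already makes and return its Result, so the "TCP/TLS connect failure" error only becomes fatal after the last try. The attempt number is passed in so each failure can be logged.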