Segmentation fault (opencv + tesseract + tensorflow)

I'm seeing strange behavior in my project (example: GitHub - ilb/rustopencv).

Initial dependencies:

[dependencies]
opencv = "0.66.0"
anyhow = "1.0"
image = "0.24.1"
ndarray = "0.15.4"

This works fine.

But when I add tesseract or tensorflow (add tensorflow by slavb18 · Pull Request #1 · ilb/rustopencv · GitHub)

tensorflow = "~0.18.0"
imageproc = "~0.23.0"
structopt = "~0.2.15"
openssl-sys = "0.9.75"

the project randomly crashes with Segmentation fault (core dumped).
Running it in gdb:

Program received signal SIGSEGV, Segmentation fault.
__memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:513
513     ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.

A separate tesseract-only project works fine: GitHub - ilb/rusttess

OS: KUbuntu 22.04, rustc 1.62.0 (a8314ef7d 2022-06-27)

It's unclear if the fault occurs when you run your program or when building your program. Please also indicate if you're building the release version or the debug version.

I am running the debug version (cargo run).
About 50% of cargo run invocations result in a segmentation fault.

If required, I can make an opencv + tensorflow version without tesseract for testing.

Are you running your program inside your Docker image?

When I run on the host machine, about 50% of runs succeed.
In Docker it never works (only the separate opencv, tensorflow, or tesseract projects work in Docker).

Can you do "cargo run &> log" and paste the contents of log ?

output from GDB:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7d018d2 in tesseract::StringParam::StringParam(char const*, char const*, char const*, bool, tesseract::ParamsVectors*) ()
from /usr/lib/x86_64-linux-gnu/libtesseract.so.5

With tensorflow there is a different error.
I guess that opencv breaks tesseract and tensorflow (maybe through some shared dependencies?).

You are ignoring my question and jumping to where you think the error is.

This makes it difficult for me to try to help you because I can not build a mental model of what is working / what is breaking.

There is nothing interesting in it:
root@f86a40317e3b:/rustopencv# cargo run &> log
Segmentation fault (core dumped)

log contains

    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/rustopencv`

If you run bt in gdb above, can you recover / guess the Rust stackframes?

(gdb) bt
#0 0x00007ffff7d018d2 in tesseract::StringParam::StringParam(char const*, char const*, char const*, bool, tesseract::ParamsVectors*) ()
at /usr/lib/x86_64-linux-gnu/libtesseract.so.5
#1 0x00007ffff47d0ed0 in () at /usr/lib/x86_64-linux-gnu/libtesseract.so.4
#2 0x00007ffff7fe1fe2 in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffec78, env=env@entry=0x7fffffffec88) at dl-init.c:72
#3 0x00007ffff7fe20e9 in call_init (env=0x7fffffffec88, argv=0x7fffffffec78, argc=1, l=<optimized out>) at dl-init.c:30
#4 _dl_init (main_map=0x7ffff7ffe180, argc=1, argv=0x7fffffffec78, env=0x7fffffffec88) at dl-init.c:119
#5 0x00007ffff7fd30ca in _dl_start_user () at /lib64/ld-linux-x86-64.so.2
#6 0x0000000000000001 in ()
#7 0x00007fffffffee79 in ()
#8 0x0000000000000000 in ()
(gdb)

I'm not an expert in this.

This is happening before the Rust main fn is called, right?

I.e. if you put "println!("hello world");" in the rust main fn as first line, nothing shows up?

If that is the case, this looks like some crazy initialization issue between tesseract and whatever else is loaded, since it's happening before any Rust code is even called.

Yes, with "println!("hello world");" as the first line of the Rust main fn, nothing shows up.
The Rust code is not executed.

Given the crash is happening before the Rust code is even executing, the tesseract forums may be able to help you more than the Rust forums; this seems to require knowledge of tesseract internals.

Based on the backtraces and symptoms we've seen so far, it sounds like tesseract has set up some code to be executed before main (see the ctor crate or __attribute__((constructor)) in C or a static variable with a non-trivial constructor in C++) and it's running into a segfault, probably because something it depends on hasn't been initialized yet.
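To make the mechanism concrete, here is a minimal Rust sketch of code that runs before main without using the ctor crate. It assumes Linux (ELF), where function pointers placed in the .init_array section are called during startup, and Rust 1.82 or newer for the #[unsafe(...)] attribute syntax (older toolchains spell it #[link_section = ".init_array"]). The names before_main, RAN_BEFORE_MAIN, and INIT are made up for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Flag that records whether the initializer below has run.
static RAN_BEFORE_MAIN: AtomicBool = AtomicBool::new(false);

// Hypothetical initializer. It only touches an atomic, because most of the
// runtime cannot be relied on this early in process startup.
extern "C" fn before_main() {
    RAN_BEFORE_MAIN.store(true, Ordering::SeqCst);
}

// Assumption: Linux/ELF. Entries in `.init_array` are called before `main`.
// For shared libraries the dynamic loader does this from `_dl_init`, which is
// the frame visible in the backtraces in this thread.
#[used]
#[unsafe(link_section = ".init_array")]
static INIT: extern "C" fn() = before_main;

fn main() {
    // If the initializer had crashed, this line would never be reached.
    println!(
        "before_main already ran: {}",
        RAN_BEFORE_MAIN.load(Ordering::SeqCst)
    );
}
```

A C++ library gets the same effect from any global object with a non-trivial constructor, so a segfault inside such a constructor kills the process before the first line of main can print anything.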

A crash like this can happen when one static variable is constructed using another static variable, maybe something like this:

// greeting.cpp
std::string greeting{"Hello, World"};

// cli-args.cpp
#include "greeting.h"
tesseract::StringParam greetingParam{"greeting", greeting};

One would assume that greeting will be initialized before greetingParam because that's the order that makes sense, but the C++ standard leaves the initialization order of static variables defined in different translation units unspecified. This means we might try to construct greetingParam before greeting has been initialized, and then have a bad time when we read an uninitialized variable.
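The usual way out is "construct on first use": hide each global behind a function so that whoever needs it forces its initialization. In C++ that is a function-local static; the sketch below shows the same idea in Rust with std::sync::OnceLock (assuming Rust 1.70 or newer). GreetingParam and both accessor functions are hypothetical stand-ins for the example above, not tesseract's real types.

```rust
use std::sync::OnceLock;

/// Analogue of the `greeting` global: created the first time it is requested.
fn greeting() -> &'static String {
    static GREETING: OnceLock<String> = OnceLock::new();
    GREETING.get_or_init(|| String::from("Hello, World"))
}

/// Hypothetical stand-in for a parameter object that depends on `greeting`.
struct GreetingParam {
    name: &'static str,
    value: &'static str,
}

/// Analogue of `greetingParam`: calling `greeting()` inside the initializer
/// guarantees the dependency exists before it is read.
fn greeting_param() -> &'static GreetingParam {
    static PARAM: OnceLock<GreetingParam> = OnceLock::new();
    PARAM.get_or_init(|| GreetingParam {
        name: "greeting",
        value: greeting().as_str(),
    })
}

fn main() {
    let param = greeting_param();
    println!("{} = {}", param.name, param.value);
}
```

Here the order is decided by the call graph rather than by the linker, so it no longer matters which file is initialized first.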

Either way, because there is no Rust code on the stack when the segfault is triggered, this is probably a bug in tesseract and your setup just happens to expose it. I'd create a ticket upstream and see if they can reproduce the issue.

The problem is not in Tesseract; there is a similar error with OpenCV + TensorFlow.

sample project: GitHub - ilb/rustmtcnn: Tensorflow Rust Implementation of MTCNN - https://cetra3.github.io/blog/face-detection-with-tensorflow-rust/

When I add the line "use opencv::{self as cv, prelude::*};", the program stops working.
In GDB I see a different error:

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7478a53 in google::protobuf::internal::AddDescriptors(google::protobuf::internal::DescriptorTable const*) () from /usr/lib/libtensorflow_framework.so.2

I would recommend the following:

  1. build the C/C++ dependencies (tesseract, opencv, tensorflow) manually in debug mode

  2. run bt

  3. get a helpful backtrace

  4. look at the lines of C/C++ in the backtrace to see what is going on

This is the backtrace without debug info:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7478a53 in google::protobuf::internal::AddDescriptors(google::protobuf::internal::DescriptorTable const*) () from /usr/lib/libtensorflow_framework.so.2
(gdb) bt
#0  0x00007ffff7478a53 in google::protobuf::internal::AddDescriptors(google::protobuf::internal::DescriptorTable const*) () at /usr/lib/libtensorflow_framework.so.2
#1  0x00007ffff7fe1fe2 in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffec78, env=env@entry=0x7fffffffec88) at dl-init.c:72
#2  0x00007ffff7fe20e9 in call_init (env=0x7fffffffec88, argv=0x7fffffffec78, argc=1, l=<optimized out>) at dl-init.c:30
#3  _dl_init (main_map=0x7ffff7ffe180, argc=1, argv=0x7fffffffec78, env=0x7fffffffec88) at dl-init.c:119
#4  0x00007ffff7fd30ca in _dl_start_user () at /lib64/ld-linux-x86-64.so.2
#5  0x0000000000000001 in  ()
#6  0x00007fffffffee7e in  ()
#7  0x0000000000000000 in  ()

Which project should I build with debug info?

libtensorflow_framework.so.2 is built using cargo, see

It should probably contain debug info.

Maybe google::protobuf should be built with debug info?
But the error is random; it differs between combinations of libraries.

By all evidence so far, this bug is in C/C++ code. This is a Rust forum.

If you are not willing to dig into the C/C++ to fix the bug, it is highly unlikely someone here will on your behalf.

Yeah, this is also a static initialization order fiasco. Filing upstream issues along the lines of "your dynamic library breaks _start when it's linked into Rust" seems appropriate.

Another possibility is to delay loading the dynamic libraries and do it manually: on Unix, calling dlopen() from main should give a more deterministic ordering.
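A rough sketch of that idea in Rust, assuming Linux with glibc and Rust 1.82 or newer (for the unsafe extern block syntax). The Library wrapper and last_error helper are made up for illustration, the dl* functions are declared by hand only to keep the example dependency-free (the libc crate exposes the same functions), and the library names in main are simply the ones from the backtraces above. Note that the opencv, tesseract, and tensorflow crates link their libraries at build time, so this only illustrates the mechanism; it is not a drop-in fix.

```rust
use std::ffi::{c_void, CStr, CString};
use std::os::raw::{c_char, c_int};

// Hand-written declarations of the POSIX dynamic loading functions.
unsafe extern "C" {
    fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
    fn dlerror() -> *mut c_char;
}

// Value of RTLD_NOW on Linux/glibc: resolve all symbols at load time.
const RTLD_NOW: c_int = 2;

/// Hypothetical wrapper around a handle returned by dlopen.
/// The handle is never closed in this sketch.
pub struct Library(*mut c_void);

impl Library {
    /// Loads a library by name. Its static constructors run inside this call,
    /// so the caller decides when, and in which order, they execute.
    pub fn open(name: &str) -> Result<Library, String> {
        let c_name = CString::new(name).map_err(|e| e.to_string())?;
        let handle = unsafe { dlopen(c_name.as_ptr(), RTLD_NOW) };
        if handle.is_null() {
            Err(last_error())
        } else {
            Ok(Library(handle))
        }
    }

    /// Looks up an exported symbol; a null result is treated as an error.
    pub fn symbol(&self, name: &str) -> Result<*mut c_void, String> {
        let c_name = CString::new(name).map_err(|e| e.to_string())?;
        let ptr = unsafe { dlsym(self.0, c_name.as_ptr()) };
        if ptr.is_null() {
            Err(last_error())
        } else {
            Ok(ptr)
        }
    }
}

/// Returns the most recent dlopen/dlsym error message, if any.
fn last_error() -> String {
    let msg = unsafe { dlerror() };
    if msg.is_null() {
        String::from("unknown dynamic loading error")
    } else {
        unsafe { CStr::from_ptr(msg) }.to_string_lossy().into_owned()
    }
}

fn main() {
    println!("main reached, loading libraries one at a time");
    // Names taken from the backtraces; adjust them to what is installed.
    for name in ["libtesseract.so.5", "libtensorflow_framework.so.2"] {
        match Library::open(name) {
            Ok(_) => println!("loaded {name}"),
            Err(e) => eprintln!("could not load {name}: {e}"),
        }
    }
}
```

Because each library is loaded explicitly, a crash inside a constructor can at least be attributed to one specific dlopen call, and the order can be changed to see whether the failure moves.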
