I am writing a parser for Telegram message backups. The parsed messages are then saved to MongoDB.
The issue is that if I don't save the messages to MongoDB, by replacing
collection
    .insert_many(messages, None)
    .wrap_err_with(|| eyre!("failed to insert documents"))?;
with
println!("...");
I don't get any errors.
But if the records are saved, I immediately get errors like
warning: unused config key `source.rsproxy.index` in `/Users/maximin/.cargo/config.toml`
warning: unused config key `source.rsproxy-sparse.index` in `/Users/maximin/.cargo/config.toml`
Finished release [optimized] target(s) in 0.20s
Running `target/release/telegram-chat-history --input /Users/user/Downloads/chats/`
failed to parse HTML: sending on a closed channel
failed to parse HTML: sending on a closed channel
failed to parse HTML: sending on a closed channel
failed to parse HTML: sending on a closed channel
failed to parse HTML: sending on a closed channel
I don't understand why the channel is closed in this case, or why I don't see what caused it to close.
When debugging a problem like this and asking for help, the first step you should take is to create a minimal reproducible example. Your code is neither minimal nor reproducible. Most of the time, you'll find the source of the error yourself while minimizing. In your case, "sending on a closed channel" looks to me like the receiver of the channel has already been dropped/disconnected by the time you try to send the parsed messages from parse_html to save_messages.
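To illustrate the diagnosis above, here is a minimal sketch of how a dropped receiver produces this exact message. It assumes the program uses `std::sync::mpsc` (the original code is not shown, so that is a guess); the function name `send_after_receiver_dropped` and the payload are made up for the example.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: lets the consumer exit first, then tries to send.
/// Returns the text of the send error, if there was one.
fn send_after_receiver_dropped() -> Option<String> {
    let (tx, rx) = mpsc::channel::<String>();

    // The consumer returns right away, as it would after an early `?`.
    // Dropping `rx` is what closes the channel.
    let consumer = thread::spawn(move || {
        drop(rx);
    });
    consumer.join().expect("consumer thread panicked");

    // The producer only finds out when its next send fails.
    tx.send("<html>...</html>".to_string())
        .err()
        .map(|e| e.to_string())
}

fn main() {
    match send_after_receiver_dropped() {
        Some(err) => println!("failed to parse HTML: {err}"),
        None => println!("sent successfully"),
    }
}
```

The send error itself carries no information about why the receiver went away; that reason lives only in whatever the consumer thread returned.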
Thanks for your help. I know the root cause is that the sender or receiver is probably dropped somewhere, but I didn't see the relevant error message during testing because I stopped the program with Ctrl-C. After running it multiple times, I occasionally saw output like
...
...
failed to parse HTML: failed to send messages
failed to parse HTML: failed to send messages
failed to parse HTML: failed to send messages
failed to parse HTML: failed to send messages
Error: consumer thread exited
Caused by:
0: failed to insert documents
1: Kind: An invalid argument was provided: No documents provided to insert_many, labels: {}
Location:
src/main.rs:36:14
Then I understood that the receiver had been dropped because of this error.
It seems that, because of multi-threading, the error logs from the producers delayed the printing of the error logs from the consumer. And sometimes the consumer's error logs were not displayed at all because I interrupted the program.
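One way to keep the consumer's error from being buried, sketched under the assumption of a `std::sync::mpsc` pipeline: have each producer stop at its first failed send instead of logging once per file, and have the main thread join the consumer and report its result. `store` and `run_pipeline` are hypothetical names, and `store` is only a stand-in for the real insert.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for the database insert: rejects empty batches,
// similar to the `insert_many` error shown above.
fn store(batch: &[u32]) -> Result<usize, String> {
    if batch.is_empty() {
        return Err("no documents provided".to_string());
    }
    Ok(batch.len())
}

fn run_pipeline(batches: Vec<Vec<u32>>) -> Result<usize, String> {
    let (tx, rx) = mpsc::channel::<Vec<u32>>();

    let consumer = thread::spawn(move || -> Result<usize, String> {
        let mut stored = 0;
        for batch in rx {
            stored += store(&batch)?; // an early return drops the receiver
        }
        Ok(stored)
    });

    let producer = thread::spawn(move || {
        for batch in batches {
            if tx.send(batch).is_err() {
                // The receiver is gone, so later sends cannot succeed.
                // Log once and stop instead of repeating the message.
                eprintln!("producer: channel closed, stopping");
                break;
            }
        }
        // `tx` is dropped here, which ends the consumer's `for` loop.
    });

    producer.join().map_err(|_| "producer panicked".to_string())?;
    // The consumer's own result is the root cause, so return it.
    consumer.join().map_err(|_| "consumer panicked".to_string())?
}

fn main() {
    match run_pipeline(vec![vec![1, 2], vec![], vec![3]]) {
        Ok(n) => println!("stored {n} messages"),
        Err(e) => eprintln!("Error: consumer thread exited: {e}"),
    }
}
```

With this shape the output contains at most one line per producer plus the consumer's error, so the root cause is visible even if the run is interrupted shortly afterwards.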
Okay, but killing the receiver can be avoided by handling the insertion error properly instead of returning an error from your consumer thread here:
collection
    .insert_many(messages, None)
    .wrap_err_with(|| eyre!("failed to insert documents"))?;
The error message reads to me like you are trying to insert an empty Vec<Message> into the collection. If that's the case, you could look before you leap and only insert when messages is not empty, for example:
if !messages.is_empty() {
    collection
        .insert_many(messages, None)
        .wrap_err_with(|| eyre!("failed to insert documents"))?;
}
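If you would rather keep the consumer alive through insert failures as well, a rough sketch of the receive loop could look like the following. It uses no MongoDB at all: `insert_batch` is a hypothetical stand-in for `collection.insert_many(...)`, and `consume` is an assumed name for the consumer function.

```rust
use std::sync::mpsc::{self, Receiver};

// Hypothetical stand-in for `collection.insert_many(...)`: fails on an
// empty batch, mirroring the driver error quoted earlier.
fn insert_batch(batch: &[String]) -> Result<usize, String> {
    if batch.is_empty() {
        return Err("No documents provided to insert_many".to_string());
    }
    Ok(batch.len())
}

/// Drains the channel and returns (messages stored, batches that failed).
fn consume(rx: Receiver<Vec<String>>) -> (usize, usize) {
    let (mut stored, mut failed) = (0, 0);
    for batch in rx {
        if batch.is_empty() {
            continue; // nothing to insert, and not an error
        }
        match insert_batch(&batch) {
            Ok(n) => stored += n,
            Err(e) => {
                // Log and keep receiving so producers are not cut off.
                eprintln!("failed to insert documents: {e}");
                failed += 1;
            }
        }
    }
    (stored, failed)
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send(vec!["hello".to_string()]).unwrap();
    tx.send(Vec::new()).unwrap();
    tx.send(vec!["a".to_string(), "b".to_string()]).unwrap();
    drop(tx); // close the channel so `consume` returns
    let (stored, failed) = consume(rx);
    println!("stored {stored} messages, {failed} failed batches");
}
```

Whether to continue after a failed insert or to stop the whole pipeline depends on how much you care about losing a batch; the sketch only shows that the choice does not have to be "return and drop the receiver".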
Thanks for the tip and the link, I've fixed the issue. Do you have any advice on logging from multiple threads (besides looking before you leap and not bubbling up the error)? In my original question, the logs from the separate consumer thread were drowned out by the logs from the many producer threads, so they were not printed for a long time, which kept me from finding the root of the problem.