Parallelize a working linear code

Are you using fitsio? I am the author, so if at any point you think your performance issues are related to the crate, let me know. I wouldn't expect them to be, since it's a thin wrapper around the C library, and this sounds like an IO bottleneck to me, as others have suggested.

Yes, I am using fitsio.

I think so.

How can I read only a portion of a data column instead of collecting the whole column into a vector? I am referring to the code I mentioned earlier.

I only need the flux and wavelength values at indices 2464 to 5246. I am trying to reduce the FITS file read time and swap usage.

Since this is a batch-processing job, I would probably not collect into Result<Vec<_>, _> here, because a single error in any of the thousand input files would throw away every other result. :slightly_smiling_face: Recording one row per file, with its result and/or error (for example as CSV output), is something you can add later if you need it.
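To illustrate the point about not collecting into a single Result, here is a minimal sketch that keeps successes and failures separately, so one bad file does not discard the rest. It uses only the standard library. `process_file` and `process_all` are hypothetical names, and the body of `process_file` is a placeholder standing in for the real per-file FITS work, which is not shown in this thread.

```rust
// Hypothetical stand-in for the real per-file work (reading a FITS file and
// computing something from it). Here it only checks the file extension.
fn process_file(path: &str) -> Result<f64, String> {
    if path.ends_with(".fits") {
        Ok(path.len() as f64) // placeholder value, not a real measurement
    } else {
        Err(format!("{}: not a FITS file", path))
    }
}

// Runs `process_file` on every path and keeps successes and failures in
// separate lists instead of stopping at the first error.
fn process_all(paths: &[&str]) -> (Vec<(String, f64)>, Vec<(String, String)>) {
    let mut ok = Vec::new();
    let mut failed = Vec::new();
    for &path in paths {
        match process_file(path) {
            Ok(value) => ok.push((path.to_string(), value)),
            Err(err) => failed.push((path.to_string(), err)),
        }
    }
    (ok, failed)
}

fn main() {
    let paths = ["a.fits", "notes.txt", "b.fits"];
    let (ok, failed) = process_all(&paths);

    // One CSV-style row per file: path, result, error.
    println!("path,result,error");
    for (path, value) in &ok {
        println!("{},{},", path, value);
    }
    for (path, err) in &failed {
        println!("{},,\"{}\"", path, err);
    }
}
```

With this shape, the failed list can be logged or retried on its own while the successful results are still written out.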

You're using read_col, and the docs also list a read_col_range, which sounds like the right thing to use here. I don't know anything about this file format, though, so it's best to give that a try or dig deeper into the docs.

That worked really well. Thanks a lot :smiley:
