Windows app to detect the invoking shell - I would appreciate reviews

I just got some code working that detects the user's shell in Windows. The full code is here: Rust program to identify the shell that invoked this application · GitHub

This is a more or less direct translation of the equivalent code from the shellingham Python library. I wanted something similar in Rust, and I was looking for a non-trivial problem to get some experience trying to write more idiomatic Rust code (I've used very little Rust, being mostly a Python programmer, and it shows :slight_smile:) So I thought I'd see how porting that code would go.

It took me some time, but the code now works. It doesn't seem very clean, though - the explicit state value in my iterator implementation seems clumsy, and my handling of options and errors can best be described as "I found something that worked".

One thing that may look odd is the error handling - I catch and ignore all errors, turning them into "not found". This is deliberate, at least for now, as I want the find_shell function to have a simple interface, where there's no need for error handling beyond "I couldn't find the answer". Maybe longer term I'll return actual errors properly, but error types and threading Result values through the code are something I want to leave until I have a bit more experience - I'm having enough trouble handling Option cleanly!

Any comments would be welcome, particularly on how to make this more idiomatic. I felt like I spent a lot of time looking for ways to do things that seemed like they ought to be simpler (trimming trailing null characters in my implementation of From<PROCESSENTRY32W> feels particularly clumsy). I feel like that's my Python experience showing - getting stuck on "how I'd do this in Python" and not knowing how to look for the natural Rust approach.

Heads up: I don't have a box to compile this on handy, so I can't easily test my code suggestions. They're just based on reading the code.


Let's start with impl Iterator for ProcIter. It looks like the only difference between being in NotStarted versus Running is that you set the state to Running in the NotStarted branch. That is, they're pretty much the same, since overwritting Running with Running in the Running branch would be a no-op.

If we merge those two states, there's only two states left: Done or not. My instinct is that this could be represented by an Option around one of your existing fields, instead of an explicit state. Let's say handle for reasons I'll get to.

struct ProcIter {
    handle: Option<HANDLE>,
    proc_entry: PROCESSENTRY32W,
}

impl Iterator for ProcIter {
    type Item = Process;
    fn next(&mut self) -> Option<Self::Item> {
        // Remove handle from `self` temporarily. If we have no handle,
        // iteration is complete, and the `?` will return `None` here.
        let handle = self.handle.take()?;

        let entry_ptr = &mut self.proc_entry as *mut PROCESSENTRY32W;
        match unsafe { Process32FirstW(handle, entry_ptr) } {
            Ok(()) => { 
                // Replace handle and keep going
                self.handle = Some(handle);
                Some(self.proc_entry.into())
            }
            Err(_) => {
                // Close the handle but don't replace it in self
                unsafe { let _ = CloseHandle(handle); }
                None
            }
        }
    }
}

The above change will break your Drop implementation. IMO it's nice to have a way to close things without dropping them, so let's add that while we're at it.

impl ProcIter {
    /// Closes the handle if it has not been closed already.
    /// `<ProcIter as Iterator>::next` will always return `None` after the
    /// handle has been closed.
    pub fn close(&mut self) -> Result<(), SomeErrorIDidNotLookUp> {
        if let Some(handle) = self.handle.take() {
            unsafe { CloseHandle(handle) }
        } else {
            Ok(())
        }
    }
}

impl Drop for ProcIter {
    fn drop(&mut self) {
        let _ = self.close();
    }
}

This is the other reason it's nice that handle is the Option: if there's multiple places things might get closed, you have to guard against closing twice.

TBH though, you could instead go with the other field as the Option, keep the Drop implementation, and don't bother with the close method.


iter_processes looks like it could be impl Default for ProcIter. It would also need adjusted for the above change.

In find_shell, None => return None is a sign that you should use ? if possible. And you can get rid of the continue with a little reorg.

    for _ in 0..11 {
        let proc = procs.get(&pid)?;
        if proc.1.extension().map_or(true, |x| x != "exe") {
            return None;
        }

        if shells.contains(&proc.1.with_extension("").to_str().unwrap_or("")) {
            return Some((pid, proc.1.to_owned()));
        }

        pid = proc.0;
    }
1 Like

Thanks! I appreciate the comments.

Unfortunately, NotStarted calls Process32FirstW whereas Running calls Process32NextW. But yeah, otherwise these are the same. Maybe I could factor out the "function I'm calling" somehow, but I don't know if it would make things much clearer.

I'm used to Python iterators where I just write a function with yield statements to suspend execution, so I always find inverting the flow of control like this difficult to do cleanly.

Yes, you're probably right about that. Also, I wasn't aware of take() - that's useful to know. And I didn't know that ? would work on options - I thought it was only for Result values.

Yep, that felt very clumsy, I simply wasn't aware that ? was an option there.

Good catch. That was a holdover from the Python code, which does a little more work in the "return a value" case, but I agree the reorganised code you show looks better.

One thing - is the shells.contains(&proc.1.with_extension("").to_str().unwrap_or("")) expression reasonable? It felt over-complicated with all of the to_str and unwrap_or calls. Also, I should probably do a case insensitive match on the values in shells, so really there needs to be a to_lowercase in there as well. That was a general concern I had, that these method chains get pretty long. I'm still finding them quite hard to read - I'm not sure to what extent that's just me being new to Rust, or whether it's a genuine issue.

Thanks for the comments, there's some good stuff to think about there :slight_smile:

Ah sorry, too visually similar for me to notice I guess.

It might still be usable if you call Process32FirstW when creating the ProcIter and then either save or close the handle depending on if it's an error; after that, handle.is_some() means there's a result waiting for you...

    fn next(&mut self) -> Option<Self::Item> {
        // Remove handle from `self` temporarily. If we have no handle,
        // iteration is complete, and the `?` will return `None` here.
        let handle = self.handle.take()?;
        let this = self.proc_entry.into();

        let entry_ptr = &mut self.proc_entry as *mut PROCESSENTRY32W;
        match unsafe { Process32NextW(handle, entry_ptr) } {
            Ok(()) => { 
                // Replace handle and keep going
                self.handle = Some(handle);
            }
           Err(_) => {
                // Close handle and don't replace it
                let _ = unsafe { CloseHandle(handle) };
            }
         }

         Some(this)
    }

...if I didn't miss any (more) state transitions, anyway. It does mean you pay a greater cost before iterating (but you're not really taking advantage of lazy iteration anyway). Matter of preference perhaps.

The plan is for Rust to have something like that eventually, but yeah we don't have it yet.

Eh, it's not the prettiest but not the worst either. In scripting languages it'd probably be a regex and lowercase call or something. That said, I think it could be improved if you used &OsStrs or &Paths or the like for your shells HashSet, as then you shouldn't need to do the str conversion.

On the other hand, if you're going to lowercase the executable name, it will be easier (though not strictly necessary) to convert to a String.

You can use eq_ignore_ascii_case instead of converting, but you'd need something other than a HashSet -- a sorted Vec or array you could binary search against, perhaps. (Or something dumber, this part doesn't need ultra-premature-optimized.)

So I think it's all tradeoffs.

This is actually a very good point. Because I'm used to Python, I immediately thought of using an iterator, because that's what you typically do in Python. But as you say, I'm not actually using any of the benefits of iteration, so I should probably just compute the process map directly, and cut out the extra complexity.

I've updated the code based on your suggestions - the new code looks a lot cleaner now, so thanks very much for your help.

1 Like

Yeah, true, heh.

However! I think it's good you made a custom iterator. I made a bunch early-on in my own Rust journey and find it completely natural (even if boilerplatey, like many other things in Rust). But I've seen a lot of people who find it daunting for some reason, even awhile after getting started on Rust, so it's a valuable skill.

1 Like

You can abuse async fn for this now, but it's a bit clumsy. Generators/coroutines have had some movement recently (Add support for `gen fn` by eholk · Pull Request #118457 · rust-lang/rust · GitHub for example), so you can play with this using unstable features, but it's still pretty raw.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.