For a project that didn't need perfect precision, I made this, which is surprisingly close:
fn rough_isqrt(x: usize) -> usize {
if x == 0 { return x }
let n = std::mem::size_of::<usize>() as u32 * 8;
let scale = n - x.leading_zeros();
let half = scale/2;
let guess = ( (x >> half) + (1 << half) )/2;
guess
}
(The playground ASM has an unnecessary branch due to an LLVM bug, which could be worked around with ctlz_nonzero in std::intrinsics - Rust)