Using Iron and Router to parse Chinese characters in a URL?


#1

So, I have some pretty simple Iron and Router code that is basically copied from their examples. I’m making an application that looks up Chinese characters in a dictionary.

 38     fn query_handler(req: &mut Request) -> IronResult<Response> {
 39         let ref query = req.extensions.get::<Router>()
 40             .unwrap().find("query").unwrap_or("/");
 42         let result = cedict::parse_line(*query).unwrap();
 43         Ok(Response::with((status::Ok, &*result.definitions[0])))
 44     }

When I go to localhost:3000/你, it should look up 你 and provide a definition. Instead, it attempts to look up %E4%BD%A0. I want to look into this, but I’m not sure whether the issue is in Iron, the router, the browser, or Rust itself (which I thought could handle this, so perhaps Rust isn’t the issue), so I don’t know where to start.

Follow-up question: on line 43, why do I need the &* syntax? It feels unidiomatic; is there a better way to do it? I did it to avoid “cannot move out of indexed content”.

Thanks!


#2

Your URL isn’t localhost:3000/你, but localhost:3000/%E4%BD%A0. Non-ASCII characters are not allowed in URLs without being encoded (even if browsers make it appear otherwise).

It looks like the framework doesn’t do URL-decoding automatically, so you will have to decode the path segment yourself.
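As a rough sketch of what that decoding involves, here is a version that uses only the standard library. The `percent_decode` helper is hypothetical and not part of Iron or Router; in a real application an existing URL-handling crate would be the more usual choice. It assumes the decoded bytes are meant to be UTF-8.

```rust
/// Hypothetical helper (not part of Iron or Router): decode `%XX` escapes
/// in a URL path segment and interpret the resulting bytes as UTF-8.
/// Returns `None` for malformed escapes or invalid UTF-8.
fn percent_decode(input: &str) -> Option<String> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            // An escape is '%' followed by exactly two hex digits.
            let hi = (*bytes.get(i + 1)? as char).to_digit(16)?;
            let lo = (*bytes.get(i + 2)? as char).to_digit(16)?;
            out.push((hi * 16 + lo) as u8);
            i += 3;
        } else {
            // Any other byte is copied through unchanged.
            out.push(bytes[i]);
            i += 1;
        }
    }
    // The decoded bytes only form a valid String if they are UTF-8.
    String::from_utf8(out).ok()
}
```

In the handler, the idea would be to run the "query" segment through something like this before passing it to the dictionary lookup, so that %E4%BD%A0 becomes 你 again.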


#3

In Rust, * can call the Deref trait, and & can call AsRef.

String implements Deref with str as its target, so * gives you a str and the & in front turns that into a &str. &result.definitions[0] is more idiomatic, and result.definitions[0].as_str() is fine too. In all cases it changes a move into a borrow.

Alternatively, result.definitions.remove(0) or result.definitions.swap_remove(0) may work if you really want to move the string out of the definitions.
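To make the borrow-versus-move distinction concrete, here is a small sketch. The function names and the `definitions` parameter are made up for illustration; they only stand in for the `result.definitions` field from the question.

```rust
/// Borrow the first definition without moving it out of the Vec.
/// All three forms below produce the same `&str`.
fn first_borrowed(definitions: &[String]) -> &str {
    let explicit: &str = &*definitions[0]; // deref String to str, then borrow
    let coerced: &str = &definitions[0]; // &String coerced to &str by the annotation
    let named: &str = definitions[0].as_str(); // explicit method call
    assert_eq!(explicit, coerced);
    assert_eq!(coerced, named);
    coerced
}

/// Move the first definition out of the Vec, shifting the rest down.
/// Panics if the Vec is empty, just like indexing would.
fn take_first(definitions: &mut Vec<String>) -> String {
    definitions.remove(0)
}
```

Note that the `&definitions[0]` form relies on the target type being known to be `&str`; where the expected type is generic, the explicit `&*` or `.as_str()` may still be needed.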


#4

I’m pretty sure that doesn’t work. AsRef must be called explicitly. & triggers Deref just as *, though.

Compare:

https://play.rust-lang.org/?gist=4834c31c63c84c70d5a1a19b922e4cf8&version=stable

https://play.rust-lang.org/?gist=84717278e85fbb3787425a353d26e12e&version=stable


#5

Yes, you’re right.


#6

Happens :).


#7

Thanks all. I was able to get the unencoded (or is it encoded?) URL using iron urlencoded. Still not 100% sure on the &* thing, I’m basically going from String -> str -> &str? I thought str is a pointer (slice) to a string, so why isn’t it just String -> str?


#8

&str is a fat pointer, str is actual bytes of variable length. So str as such can’t be passed as an argument.

That’s different from String which is a struct of a fixed size containing an equivalent of *mut str.
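One way to see the difference is to compare sizes. This is only a sketch: the `pointer_sizes` function is made up for illustration, and the exact size of String is an implementation detail of the standard library rather than a language guarantee.

```rust
use std::mem::size_of;

/// Returns the sizes, in bytes, of a machine word, a `&str`, and a `String`.
/// `size_of::<str>()` would not compile at all, because `str` is unsized.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<usize>(),  // one machine word
        size_of::<&str>(),   // fat pointer: data address plus length
        size_of::<String>(), // in current std: pointer, capacity, length
    )
}
```

On current compilers a &str comes out as two words and a String as three, while str itself has no size known at compile time, which is why it can only be used behind a pointer.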