How to declare a Vec of Select iterators in scraper


#1

I’m trying to declare a vector of scraper’s Select iterators as a struct field. I’m currently doing this:

pub struct PawnScraper{
	pub html_instance: Vec<Html>,
	pub selectors: Vec<Selector>,
	pub element_iterators: Vec<scraper::html::Select<'a,'b>>,
}

But this gives me an error

E0106: missing lifetime specifiers label: expected 2 lifetime parameters

I’m confused here… I tried this:

pub struct PawnScraper<'a,'b>{
	pub html_instance: Vec<Html>,
	pub selectors: Vec<Selector>,
	pub element_iterators: Vec<scraper::html::Select<'a,'b>>,
}

But this gives me another error for my implementations

impl PawnScraper<'a,'b> {
/*My implementations*/
}

E0261: use of undeclared lifetime name 'a label: undeclared lifetime

I’m new to Rust and I’m stuck here. How can I continue from here? Could any of you experienced developers kindly guide me and correct my stupidity?


#2

You need to declare the lifetime parameters inside the impl portion, just like with generic type parameters:

impl<'a, 'b> PawnScraper<'a, 'b> { ... }
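To make the syntax concrete, here is a minimal sketch using only the standard library. Pairing is a made-up stand-in for a type with two lifetime parameters such as Select<'a, 'b>; it is not part of scraper.

```rust
// Hypothetical stand-in for a type that borrows from two separate sources,
// the way Select<'a, 'b> borrows from an Html and a Selector.
struct Pairing<'a, 'b> {
    document: &'a str,
    pattern: &'b str,
}

// Both lifetimes are declared right after `impl`, then used in the type name.
impl<'a, 'b> Pairing<'a, 'b> {
    fn new(document: &'a str, pattern: &'b str) -> Self {
        Pairing { document, pattern }
    }

    // Counts non-overlapping occurrences of the pattern in the document.
    fn count(&self) -> usize {
        self.document.matches(self.pattern).count()
    }
}

fn main() {
    let doc = String::from("<li>a</li><li>b</li>");
    let pat = String::from("<li>");
    let pairing = Pairing::new(&doc, &pat);
    println!("matches: {}", pairing.count());
}
```

Writing impl Pairing<'a, 'b> without the leading <'a, 'b> would reproduce the E0261 error from the question.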

The bigger issue, however, is that you’re headed towards a self-referential struct, from what I can tell. Self-referential structs are not supported by Rust (some crates can work around this to a degree, but I wouldn’t consider them just yet).

Select is an iterator that borrows a reference to Html and Selector provided to the Html::select() method. PawnScraper stores (and owns) Html and Selector instances. If you are planning on also storing the Select instances produced from these, you end up with self references (Select, owned by PawnScraper, borrowing from other owned fields of PawnScraper).

You will want to reorganize data storage and data flow to avoid this. If you explain what exactly you’d like to do here, we can come up with something workable.


#3

Thank you for taking your precious time to reply here :slight_smile: .
I was trying to write a plugin for the Pawn scripting language through which I can interact with scraper.
I was thinking about creating an iterator to iterate through the selected elements (in Pawn). Using references to iterators stored in a vector, I could achieve this.
Here is the project, in case my explanation wasn’t good enough, as always


#4

A simple approach would be to store only the Html and Selector values in PawnScraper:

pub struct PawnScraper {
    html_instance: Vec<Html>,
    selectors: Vec<Selector>,
}

Then, ask it to produce a Select instance by somehow indicating which html and selector to use. Here’s a naive approach that uses an index into the 2 Vecs:

impl PawnScraper {
    pub fn select(&self, idx: usize) -> Select<'_, '_> {
        self.html_instance[idx].select(&self.selectors[idx])
    }
}

Now the caller owns the returned iterator (Select), but the iterator holds references back into the PawnScraper, which means the PawnScraper instance must outlive the iterator. So PawnScraper in the above example is more like a “repository” of the documents and selectors: likely a long-lived value in your app/plugin that other, shorter-lived functionality makes use of.

The Select iterator, once used, is exhausted so it likely doesn’t make too much sense to store it in a long-lived structure; instead, callers ask for a new iterator when they want to traverse (select) using the repository of html docs and selector expressions.
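As a rough illustration of this repository pattern without pulling in scraper, the sketch below swaps in plain strings. Repository, documents, and separators are hypothetical names, and str::split plays the role of Html::select(). It also takes separate indices for the document and the separator and returns None on a bad index instead of panicking, which is an assumption about what a plugin would want rather than anything scraper requires.

```rust
use std::str::Split;

// Hypothetical stand-in for PawnScraper: strings replace Html and Selector.
struct Repository {
    documents: Vec<String>,
    separators: Vec<String>,
}

impl Repository {
    // Hands out a fresh iterator that borrows from both Vecs.
    // Nothing borrowed is stored back into `self`.
    fn select(&self, doc_idx: usize, sep_idx: usize) -> Option<Split<'_, &str>> {
        let doc = self.documents.get(doc_idx)?;
        let sep = self.separators.get(sep_idx)?;
        Some(doc.split(sep.as_str()))
    }
}

fn main() {
    let repo = Repository {
        documents: vec![String::from("a,b,c")],
        separators: vec![String::from(",")],
    };

    let mut first = repo.select(0, 0).expect("indices in range");
    assert_eq!(first.by_ref().count(), 3); // drains the iterator
    assert_eq!(first.next(), None); // nothing left to yield

    // Asking the repository again starts from the beginning.
    let again: Vec<&str> = repo.select(0, 0).expect("indices in range").collect();
    assert_eq!(again, ["a", "b", "c"]);
}
```

The repository stays immutable while iterators are alive, so several callers can hold their own iterators over the same data at once.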


#5

Thank you so much! You were right about my current approach, and I was struggling to continue. Now I understand that I have to learn the language better. But here the function needs to return a cell value (i32); Pawn can’t access the iterator like that. Maybe I should explicitly call next and the other functions by passing the html idx and selector idx from Pawn to Rust.


#6

I’m not familiar with pawn - what’s the interface between it and the plugins? i.e. how does pawn request something from the plugin, and what can the plugin return exactly?


#7

The interface is samp-sdk


#8

I was hoping you could explain it here (at least just the relevant parts) so that I don’t have to learn pawn but can help you with the Rust side of things :slight_smile:.


#9

OK, here is a minimal example.
This parse_document function (which returns AmxResult<Cell>, where Cell is a 32-bit integer)

	fn parse_document(&mut self, _: &AMX, document: String) -> AmxResult<Cell> {
		let parsed_data = Html::parse_document(&document);
		self.html_instance.push(parsed_data);
		// Return the index of the Html instance that was just pushed.
		Ok(self.html_instance.len() as Cell - 1)
	}

is mapped to Pawn’s native function ParseHtmlDocument using this:

pub fn amx_load(&mut self, amx: &mut AMX) -> Cell { // called when a Pawn script is loaded
	let natives = natives!{
		"ParseHtmlDocument" => parse_document, // Pawn function name => our Rust function
		"ParseSelector" => parse_selector
	};

	match amx.register(&natives) {
		Ok(_) => log!("Natives are successful loaded"),
		Err(err) => log!("Whoops, there is an error {:?}", err),
	}

	AMX_ERR_NONE
}

In Pawn, this native is called as below:

new Html:html = ParseHtmlDocument("\
		<!DOCTYPE html>\
		<meta charset=\"utf-8\">\
		<title>Hello, world!</title>\
		<h1 class=\"foo\">Hello, <i>world!</i></h1>\
	"); // returns the index of the newly pushed Html instance in the vector

Now the html variable in Pawn holds the index of the newly created Html instance in the html_instance vector.

The same goes for the selector.

Now I want to implement the select function too and simulate the iterator’s functions in Pawn.
I’m thinking of taking the same approach as above for the iterator (exposing next, nth, last, etc. and using them in a for loop in Pawn). But I need to reference the iterator created by the select function from Pawn, as above. For Html and Selector I could do that by storing them in a vector.
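One way this could be sketched, assuming it is acceptable to store a small owned cursor instead of the borrowing iterator itself: keep the two indices plus a position in a vector, hand the cursor's index back as a Cell, and rebuild the iterator on every next call. Everything below is a hypothetical stand-in using only the standard library (strings instead of Html and Selector, str::split instead of select, Cell assumed to be i32), so it shows the shape of the idea rather than the samp-sdk or scraper APIs.

```rust
// Hypothetical stand-ins: strings replace Html and Selector, and Cell is
// assumed to be an i32 as described above.
type Cell = i32;

// Owned cursor state: no references, so it can live next to the data.
struct Cursor {
    doc_idx: usize,
    sep_idx: usize,
    position: usize, // number of items already handed out
}

struct Repository {
    documents: Vec<String>,
    separators: Vec<String>,
    cursors: Vec<Cursor>,
}

impl Repository {
    // Registers a cursor and returns its index, or -1 if an index is out of range.
    fn open(&mut self, doc_idx: usize, sep_idx: usize) -> Cell {
        if doc_idx >= self.documents.len() || sep_idx >= self.separators.len() {
            return -1;
        }
        self.cursors.push(Cursor { doc_idx, sep_idx, position: 0 });
        (self.cursors.len() - 1) as Cell
    }

    // Returns the byte length of the next item, or -1 when the cursor is
    // exhausted or the handle is invalid.
    fn next_len(&mut self, handle: Cell) -> Cell {
        if handle < 0 {
            return -1;
        }
        let cursor = match self.cursors.get_mut(handle as usize) {
            Some(cursor) => cursor,
            None => return -1,
        };
        let doc = &self.documents[cursor.doc_idx];
        let sep = &self.separators[cursor.sep_idx];
        // Rebuild the iterator and skip the items that were already returned.
        match doc.split(sep.as_str()).nth(cursor.position) {
            Some(item) => {
                cursor.position += 1;
                item.len() as Cell
            }
            None => -1,
        }
    }
}

fn main() {
    let mut repo = Repository {
        documents: vec![String::from("a,bb,ccc")],
        separators: vec![String::from(",")],
        cursors: Vec::new(),
    };
    let handle = repo.open(0, 0);
    loop {
        let len = repo.next_len(handle);
        if len < 0 {
            break;
        }
        println!("item length: {}", len);
    }
}
```

Rebuilding the iterator on each call costs more as the position grows, so for large result sets it may be preferable to collect owned results once when the cursor is opened and store those instead.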