Crawling

Jul 6, 2011 at 9:58 PM
Edited Jul 6, 2011 at 10:03 PM

Hi. The first thing I need to say is that I think this is a great module for Umbraco. Rendering and capturing the whole HTML is a great idea.

However, I need to ask whether it is possible, or will be in the near future, for the Umbraco Full Text Search module to crawl pages.

I have a site that uses some macros and user controls. These controls render some links, which should be indexed.

Scenario:

www.mypage.com/listOfBooks.aspx

This page has a grid with, let's say, 10 books, and the links point to:

www.mypage.com/Book.aspx?id=X

where X is the ID of a book. This page has a user control that renders the book details depending on the ID.

Now, I think the search will index only 2 pages instead of 12 (www.mypage.com/Book.aspx?id=1, www.mypage.com/Book.aspx?id=2, etc.).
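
To make the scenario concrete, here is a minimal sketch of what a link-following crawler would discover from the listing page. It is written in Python only for illustration and is not part of Full Text Search; the host www.example.com, the sample HTML, and the names LinkCollector and discover_pages are all assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> in a page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links such as "Book.aspx?id=1" against the listing URL.
                self.links.append(urljoin(self.base_url, href))


def discover_pages(base_url, listing_html):
    """Return the listing page plus every distinct page it links to."""
    collector = LinkCollector(base_url)
    collector.feed(listing_html)
    pages = [base_url]
    for link in collector.links:
        if link not in pages:
            pages.append(link)
    return pages


# Stand-in for the rendered grid: ten links to the book details page.
listing_url = "http://www.example.com/listOfBooks.aspx"
listing_html = "".join(
    '<a href="Book.aspx?id=%d">Book %d</a>' % (i, i) for i in range(1, 11)
)

pages = discover_pages(listing_url, listing_html)
print(len(pages))  # the listing page plus the ten Book.aspx?id=X URLs
```

An index built only from content nodes would see just listOfBooks.aspx and Book.aspx, while following the rendered links reaches each Book.aspx?id=X URL separately.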

 

Is this feature available, or will it be?

Thanks.

Mike.

Coordinator
Jul 11, 2011 at 10:13 AM

Hi Mike, thanks for the kind words.

Searching this kind of page is definitely problematic at the moment. It's something that needs to be addressed, but I'm not sure of the best way to approach it.

I'm reluctant to implement a crawler, because it would break the one-to-one relationship between nodes in Umbraco and entries in the index, meaning it might become necessary to re-index large portions of a site any time a single node changes. It would also add significant complexity to the package.

But you're right that there needs to be a way to index this kind of content. When I have some time I'll look at adding this, hopefully in the near future, but I can't promise when.

In the meantime, you could look at overriding the renderer for the node type of Book.aspx from your own code. It might be possible to add the additional nodes you need to the search index from there. See the advanced customisation docs.