Gowator Posted August 25, 2003
Does anyone know of (or recommend) any software for creating a search engine on a web site? The content is HTML and PDF, although I guess I could make HTML copies of the PDF stuff. I want it to crawl the whole site so that I can then add a search box, since the site has about 2000 pages.
paul Posted August 26, 2003
aspseek is a very cool search engine ... acts just like google ;-) http://www.aspseek.org/
In fact we could create our own google if we wanted :-) ... start a spider, and come back in 4 months to our own version of google ... I've wanted to do this for a while, and once let a spider run for 2 months before I ran out of HD space :-(
Gowator (Author) Posted August 26, 2003
Hey, that looks cool. I'm going to have to compile it for Solaris, so it depends on which libraries are available, but I'll try it at home under Linux first. THX
paul Posted August 28, 2003
wanna see it in action??? I've run a spider for 3 days on mandrakeuser/wapdomainz/loudas. Here's the result: http://twins.loudas.com
johnnyv Posted August 29, 2003
paul wrote: "wanna see it in action??? I've run a spider for 3 days on mandrakeuser/wapdomainz/loudas. Here's the result: http://twins.loudas.com"
Those rank numbers look like what you get from MySQL full-text search queries. I assume you have set a result limit?
paul Posted August 29, 2003
Nope, no result limit ... the search daemon gets the results in arrays, i.e. all results from www.mandrakeusers.org, all results from www.loudas.com, etc., then creates the link "More results from http://www.mandrakeusers.org (22811 documents)".
The data is stored in MySQL, and a "words" index is built that looks like this:

paul@trinity paul $ ls /trinity/aspseek/var/aspseek12/
00w  08w  16w  24w  32w  40w  48w  56w  64w  72w  80w  88w  96w        lastmodd
01w  09w  17w  25w  33w  41w  49w  57w  65w  73w  81w  89w  97w        logs.txt
02w  10w  18w  26w  34w  42w  50w  58w  66w  74w  82w  90w  98w        ranksd
03w  11w  19w  27w  35w  43w  51w  59w  67w  75w  83w  91w  99w        subsets
04w  12w  20w  28w  36w  44w  52w  60w  68w  76w  84w  92w  citations  total
05w  13w  21w  29w  37w  45w  53w  61w  69w  77w  85w  93w  delmapr
06w  14w  22w  30w  38w  46w  54w  62w  70w  78w  86w  94w  delta
07w  15w  23w  31w  39w  47w  55w  63w  71w  79w  87w  95w  deltas
paul@trinity paul $
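For anyone curious how that per-site grouping might work, here is a minimal Python sketch. It is not ASPseek's actual code: the function names `group_by_site` and `render`, the `per_site` cutoff, and the (url, rank) input format are all made up for illustration. It only shows the general idea of collecting hits per host and then emitting a "More results from ..." line for hosts with extra hits.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_site(results):
    """Group (url, rank) pairs by host; best-ranked hit first within each host.

    Hypothetical helper, not part of ASPseek.
    """
    groups = defaultdict(list)
    for url, rank in results:
        groups[urlsplit(url).netloc].append((url, rank))
    for hits in groups.values():
        hits.sort(key=lambda hit: hit[1], reverse=True)
    return dict(groups)


def render(groups, per_site=2):
    """Show up to per_site hits per host, then a 'More results' line if any remain."""
    lines = []
    for host, hits in groups.items():
        for url, rank in hits[:per_site]:
            lines.append(f"{rank:.2f}  {url}")
        if len(hits) > per_site:
            lines.append(f"More results from http://{host} ({len(hits)} documents)")
    return lines


if __name__ == "__main__":
    # Made-up sample hits, purely for demonstration.
    sample = [
        ("http://www.mandrakeusers.org/a", 0.5),
        ("http://www.loudas.com/x", 0.9),
        ("http://www.mandrakeusers.org/b", 0.8),
        ("http://www.mandrakeusers.org/c", 0.1),
    ]
    for line in render(group_by_site(sample)):
        print(line)
```

Hosts appear in the order they are first seen, so a real engine would presumably also order the groups by their best rank before rendering.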
Gowator (Author) Posted August 29, 2003
I like it. Now I gotta try ...
aRTee Posted August 29, 2003
Why not put Google on your website? It worked fine for mine, and you can even make the results page look like your site. Check my site for kicks ;)
Gowator (Author) Posted August 29, 2003
It's for an intranet site I'm editing at work, so Google is out... Also, it eventually has to work on IIS (not my idea ???)