Web crawler and search engine in Lisp

Published: Mon, 12 May 2008 10:56:00 GMT; 6 comments

I would like to create a Lisp-based search engine.

Has anyone done any work on this?

All Comments

  • 6 Comments
    • tinku99 wrote:

      > I would like to create a lisp based search engine.

      > Has anyone done any work on this?

      Hi There

      We have done a LOT of work with the web. Your main issues are:

      - Handling a LOT of traffic, most of it unwanted, including people who
        insist on trying to hack your site as if it were an Apache or IIS site
        (we get this a lot).

      - Dealing with a LOT of data and searching it fast. MySQL has to be
        configured in a certain way to work under this kind of pressure when
        you are doing a LOT of searches and inserts. For example, we found it
        totally impossible to write logging information to a database; we
        had to use text files instead (just like IIS and Apache, and
        there is a reason!).
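
The point about logging to plain text files instead of a database can be sketched in a few lines of portable Common Lisp. This is only an illustration under stated assumptions, not the poster's code: the names `log-line` and `read-log-lines`, the line format (universal time followed by the message), and the file name in the usage note are all hypothetical.

```lisp
;; Minimal sketch: append-only text logging, assuming a single writer.
;; LOG-LINE and READ-LOG-LINES are hypothetical helper names.

(defun log-line (path message)
  "Append MESSAGE as one line to the text file at PATH.
The line is prefixed with the current universal time (seconds since 1900)."
  (with-open-file (out path
                       :direction :output
                       :if-exists :append          ; never rewrite old entries
                       :if-does-not-exist :create) ; first call creates the file
    (format out "~D ~A~%" (get-universal-time) message))
  message)

(defun read-log-lines (path)
  "Return the lines of the log file at PATH as a list of strings."
  (with-open-file (in path :direction :input)
    (loop for line = (read-line in nil nil)
          while line
          collect line)))
```

A call such as `(log-line "access.log" "GET /search?q=lisp")` appends one line and touches nothing else, so each request costs a sequential write rather than a database insert. The sketch reopens the file on every call for simplicity; a busy server would more likely keep the stream open.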

      If you use something like AllegroServe with SBCL, you will want to use
      Allegro Common Lisp itself: we found that Allegro Common Lisp worked
      really well and was really stable under pressure. If you are in the UK,
      we can help with this.

      Paul

      http://www.hyperstring.net

      #1; Mon, 12 May 2008 10:58:00 GMT
    • tinku99 wrote:

      > I would like to create a lisp based search engine.

      > Has anyone done any work on this?

      Hi,

      I have created a search engine backend for crawling, re-visiting and
      feeding off of a ping server (currently weblogs.com). It is focused on
      blogs and other live or structured data (e.g. Feeds and Microformats).
      The main work done so far focused on getting it to run fast (currently
      around 100 simultaneous connections) and stable despite the wilderness
      out there on the web.

      One possible business model involves offering derivative data products
      to third parties via an API, so if you want to work on a front-end we
      might be able to collaborate.

      On the other hand if you'd care to ask more specific questions, I'll be
      happy to provide more information.

      Cheers,

      Chris Laux

      http://www.artofcomputing.net/

      #2; Mon, 12 May 2008 10:59:00 GMT
    • Christopher.laux.lisp.todaysummary.com.Web.de wrote:

      > tinku99 wrote:
      > Hi,
      > I have created a search engine backend for crawling, re-visiting and
      > feeding off of a ping server (currently weblogs.com). It is focused on
      > blogs and other live or structured data (e.g. Feeds and Microformats).
      > The main work done so far focused on getting it to run fast (currently
      > around 100 simultaneous connections) and stable despite the wilderness
      > out there on the web.
      >
      > One possible business model involves offering derivative data products
      > to third parties via an API, so if you want to work on a front-end we
      > might be able to collaborate.
      >
      > On the other hand if you'd care to ask more specific questions, I'll be
      > happy to provide more information.
      >
      > Cheers,
      > Chris Laux
      > http://www.artofcomputing.net/

      Sent you an email, Chris.

      If you don't get it, please get in touch. A Lewis server or 5 with Ajax
      would provide you with a really unique proposition.

      Paul

      http://www.hyperstring.net

      #3; Mon, 12 May 2008 11:00:00 GMT
    • Hi, thanks for the informative link.

      The fact that I didn't find it using search shows there is improvement
      to be done on search engines...

      I am specifically looking to create a custom search engine for
      radiology or medicine (I am a radiologist in training). I will have
      some academic funding soon, which I'd like to devote to a Lisp search
      engine.

      My motivation is that although I have a custom search engine using
      Google:

      http://www.google.com/coop/cse?cx=0...6%3Azt_wady8tis

      I would like control over the search index database.

      #5; Mon, 12 May 2008 11:02:00 GMT
    • tinku99 wrote:

      > I would like control over the search index database.

      In what way would you like to control the index?

      I could configure my spider in a similar way to Google Co-op, i.e.
      focusing on specific domains/sites, and provide an index for you or
      others to further work on.
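
As a rough illustration of what "focusing on specific domains/sites" could mean inside a spider, here is a small allow-list check in portable Common Lisp. It is a sketch under assumptions, not the spider discussed in this thread: the function names, the `example.org` domains, and the simplified URL parsing (no userinfo or IPv6 handling) are all hypothetical.

```lisp
;; Minimal sketch of a domain allow-list for a focused crawler.
;; All names here are hypothetical; URL parsing is deliberately simplified.

(defun url-host (url)
  "Return the lowercased host of URL, or NIL if URL lacks a scheme separator."
  (let ((scheme-end (search "://" url)))
    (when scheme-end
      (let* ((start (+ scheme-end 3))
             ;; The host ends at the first port, path, query or fragment marker.
             (end (or (position-if (lambda (c) (find c "/:?#")) url :start start)
                      (length url))))
        (string-downcase (subseq url start end))))))

(defun host-in-domain-p (host domain)
  "True when HOST equals DOMAIN or is a subdomain of DOMAIN."
  (let ((hlen (length host))
        (dlen (length domain)))
    (or (string= host domain)
        (and (> hlen dlen)
             ;; Require a dot before the suffix so notexample.org is rejected.
             (char= (char host (- hlen dlen 1)) #\.)
             (string= domain host :start2 (- hlen dlen))))))

(defun allowed-url-p (url allowed-domains)
  "True when the host of URL falls under one of ALLOWED-DOMAINS."
  (let ((host (url-host url)))
    (and host
         (some (lambda (domain) (host-in-domain-p host domain)) allowed-domains)
         t)))
```

A crawler loop would call something like `(allowed-url-p link '("example.org"))` on every extracted link and enqueue only the ones that pass, which keeps the index restricted to the chosen sites.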

      Chris

      #6; Mon, 12 May 2008 11:03:00 GMT