File limit on perlfect search engine - how do I circumvent?
Posted by Dave on November 8th, 2005

I'm using the Perlfect search engine

http://www.perlfect.com/freescripts/search/

on a web site. Reading the FAQ,

http://www.perlfect.com/freescripts/search/faq.shtml#G4

it says "There's a limit at 65,535 files which can be worked around if
necessary."

Does anyone know how to work around this? We are just copying 63,000
files, but this will be extended, so will easily exceed the 65,535 limit.


Looking at the perl code I see:

foreach $doc_id (keys %tdf) {
    #print "weight = $tdf{$doc_id} * log ($DN / $df)\n";
    $weight = $tdf{$doc_id} * log ($DN / $df);
    $weight = int($weight*100);
    $weight = 65535 if ( $weight > 65535 ); # we're limited to 16 bit
    $weights .= pack("SS", $doc_id, $weight);
}

and according to

http://www.icewalkers.com/Perl/5.8.0/pod/func/pack.html

the "S" template in 'pack' is a 16-bit unsigned short. I suspect the limit
is in some way connected to 'pack', but am not sure, and if someone else
has already got around it, I'd like to know how.
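
A quick way to check the suspicion about 'pack' is to round-trip a few values through the same "SS" template the indexer uses. This is only a sketch with made-up numbers, not code from the Perlfect distribution; the variable names are chosen for illustration.

```perl
use strict;
use warnings;

# "S" is an unsigned 16-bit short, so each (doc_id, weight) record is 4 bytes.
my $record = pack("SS", 65535, 1234);
my ($doc_id, $weight) = unpack("SS", $record);
print length($record), " bytes: doc_id=$doc_id weight=$weight\n";

# A document ID past 65535 does not fit in 16 bits, so it does not
# survive the round trip (in practice only the low 16 bits are kept).
my ($wrapped) = unpack("S", pack("S", 65536));
print "65536 came back as $wrapped\n";
```

If the second line printed is not 65536, the document ID field in the packed index is where the limit comes from.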

I can if necessary use a 64-bit version of perl.

Posted by Safalra on November 8th, 2005

Looking at the documentation you mention, you'd need to change "SS" to
"LL" to get 32 bits, and then change all the other code that assumes 16
bits as appropriate (so 65535 would become 4294967295). There will
probably be several parts of the code that assume 16 bits, so it might
require a little work.
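
As a rough sketch of that change, here is the loop from the original post with "LL" in place of "SS". The sample values for %tdf, $DN and $df are invented so the snippet runs on its own, and $MAX_WEIGHT is a name introduced here for illustration; the real indexer will differ.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the indexer's %tdf, $DN and $df.
my %tdf = (70000 => 3, 12 => 1);   # doc_id => term frequency
my $DN  = 100000;                  # total number of documents
my $df  = 2;                       # number of documents containing the term

my $MAX_WEIGHT = 4294967295;       # 32-bit ceiling in place of 65535
my $weights = '';
foreach my $doc_id (sort { $a <=> $b } keys %tdf) {
    my $weight = int($tdf{$doc_id} * log($DN / $df) * 100);
    $weight = $MAX_WEIGHT if $weight > $MAX_WEIGHT;
    $weights .= pack("LL", $doc_id, $weight);   # 8 bytes per record, not 4
}

# Whatever reads the index back must use the same 32-bit template.
my @pairs = unpack("L*", $weights);
print "@pairs\n";
```

Each record grows from 4 to 8 bytes, so any code that reads the index with "SS" or steps through it 4 bytes at a time would need the matching change, and an index built with the old template would have to be rebuilt.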

--
Safalra (Stephen Morley)
http://www.safalra.com/programming/


Posted by Dave on November 13th, 2005

Thank you, I'll take a look at that. The problem is not as urgent as I
thought, as whilst I now have well over 100,000 files, many are GIFs or
similar that cannot be indexed, so the number actually indexed is lower
than I thought (39,046). Hence there are 26,000 or so to go before it hits
the limit, but I'll start looking now.

