MOSS Forum


MOSS 2007 indexing: any way to speed things up?

  Date: Nov 01    Category: MOSS    Views: 564

My current MOSS 2007 farm is taking 24-26 hours for an incremental crawl
(the total size of the database is pushing 5TB). Besides tossing
beefier hardware at the farm, is there a way to speed this up? Since I
have several web apps on the farm, I was thinking of splitting up the web
apps into two or more content sources. Would this allow a
more parallel approach, instead of the current serial method of having all
the web apps in the same content source? Does anyone have experience with this?



3 Answers Found

Answer #1    Answered On: Nov 01    

You say you have close to a 5TB DB. How many items is
your farm crawling? How many web apps do you currently have? How many WFEs and
app servers do you have?

What I have done to speed up crawling is to split the single content source into
several content sources. This allows better flexibility with crawl schedules and
lets you prioritize the web apps that are more important.

You have to see which web apps have the most content and then plan accordingly.
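
As a rough illustration of "plan accordingly", here is a minimal Python sketch that spreads web apps across a chosen number of content sources so the item counts come out roughly even. It is only a planning aid, not anything MOSS provides: the function name, the web app names, and the item counts are all made up for the example, and you would substitute the real counts from your own farm.

```python
import heapq

def split_into_content_sources(item_counts, n_sources):
    """Greedy split: place each web app, largest first, into the
    content source that currently holds the fewest items."""
    # Heap entries are (total items, source index, web app names).
    # The unique index breaks ties, so the lists are never compared.
    heap = [(0, i, []) for i in range(n_sources)]
    heapq.heapify(heap)
    by_size = sorted(item_counts.items(), key=lambda kv: kv[1], reverse=True)
    for app, count in by_size:
        total, idx, apps = heapq.heappop(heap)
        apps.append(app)
        heapq.heappush(heap, (total + count, idx, apps))
    # Return the groups in source-index order.
    return [(total, apps) for total, idx, apps in sorted(heap, key=lambda e: e[1])]

# Hypothetical item counts per web app (not from the original post).
example_counts = {
    "portal": 900_000,
    "teams": 400_000,
    "hr": 300_000,
    "archive": 250_000,
    "wiki": 50_000,
}

for total, apps in split_into_content_sources(example_counts, 2):
    print(total, apps)
```

Item count is only a proxy for crawl time, since document size and type also matter, so treat the result as a starting point for the split.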

Answer #2    Answered On: Nov 01    

Currently we have 12 web apps. Our farm is one app/Central Admin server
and 2 WFEs.
You touched on what I was looking at as a solution: splitting up the web
apps into separate content sources. My main concern is, how many crawls
can I safely have running at the same time?

Answer #3    Answered On: Nov 01    

12 web apps... ok. That is not bad. How is your SQL setup? How many SQL servers?
Not sure how big your user population is, but here are a couple of things to
keep in mind regarding crawl times.

Crawl times and crawl performance depend on a number of factors. Here
are some of the more important ones:

** Number of indexing/crawl threads
** Size of documents, type of documents (mix), IFilters (single- or multi-threaded)
** Memory/CPU utilization/NIC utilization on the source and destination servers
(this is very important, especially with the 2-1 setup that you have)
** Indexing WSS 3.0 content is more efficient since it uses the change log (this is
very important; you should have the temp DBs on their own LUNs with
ample space to grow so they can handle the changes)
** BDC for structured data uses dedicated crawl time, so this should be factored in

So, in a nutshell, what are some broad estimates?

** If it's 10s to hundreds of MBs – measure it in minutes
** If it's 10s to hundreds of GBs – measure it in hours
** If it's 1-10TB+ – measure it in days to a week
** If it's 10-100TB – measure it in weeks
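
If it helps to see those buckets as a lookup, here is a small Python sketch. The exact cut-off points (1 GB, 1 TB, 10 TB) are my own assumption, since the list above only gives loose ranges, and the function name is invented for the example. These are order-of-magnitude guides, not measurements.

```python
GB = 1024 ** 3
TB = 1024 ** 4

def rough_crawl_scale(size_bytes):
    """Map a corpus size to the broad time unit from the list above.
    The boundaries between buckets are assumed, not taken from any spec."""
    if size_bytes < 1 * GB:
        return "minutes"
    if size_bytes < 1 * TB:
        return "hours"
    if size_bytes < 10 * TB:
        return "days to a week"
    return "weeks"

# The farm in the question is close to 5TB.
print(rough_crawl_scale(5 * TB))
```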

If you split them into separate content sources, you just have to know which
content sources have the most items and try not to run the big content sources
at the same time.
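
One way to sanity-check a crawl schedule before you commit to it is to look for large content sources whose crawl windows overlap. The Python sketch below does that on paper. Everything in it is hypothetical: the source names, item counts, start/end times (hours from the start of the week), and the threshold for what counts as "big" are placeholders to be replaced with your own numbers.

```python
from itertools import combinations

def overlapping_big_crawls(schedule, big_threshold):
    """Return name pairs of big content sources whose windows overlap.
    Each schedule entry is a dict with name, items, start and end,
    where start/end are hours from the start of the week."""
    big = [s for s in schedule if s["items"] >= big_threshold]
    clashes = []
    for a, b in combinations(big, 2):
        # Two half-open windows overlap when each starts before the other ends.
        if a["start"] < b["end"] and b["start"] < a["end"]:
            clashes.append((a["name"], b["name"]))
    return clashes

# Hypothetical plan: A and B are both big and their windows overlap.
plan = [
    {"name": "A", "items": 2_000_000, "start": 0, "end": 30},
    {"name": "B", "items": 1_500_000, "start": 24, "end": 48},
    {"name": "C", "items": 10_000, "start": 0, "end": 2},
    {"name": "D", "items": 1_200_000, "start": 48, "end": 60},
]

print(overlapping_big_crawls(plan, big_threshold=1_000_000))
```

Small content sources are ignored on purpose, on the assumption that they can run alongside a big crawl without much impact.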
