Configure
and run a search crawler on your local portal
site to gather information and create a search collection that enables
your users to search your portal site.
Portal
Search provides a default portal site search collection
that enables your users to search your portal site. Before your users
can search the portal site collection, perform the following tasks.
- Set the crawler user ID. Set a dedicated
crawler
user ID for crawling the portal site content source. Proceed as follows:
- Define the crawler user ID by using the Manage
Users
and Groups portlet.
Note: Defining a dedicated crawler user ID is
recommended. The pre-configured
default portal site search uses the default administrator user ID wpsadmin with
the default password of that user ID for the crawler. If you changed
the default administrator user ID during your portal installation,
the crawler uses that default user ID. If you have changed the password
for the wpsadmin or other administrative
user ID, or if you changed the default administrator user ID to an
ID other than wpsadmin, or if you want
to use a separate user ID, you need to set the crawler user ID.
- Set the preferred language of the
portal site crawler
user ID to match the language of the portal site search collection
that it crawls. (If you do this after you started a crawl on the portal
site search collection, you need to reset the portal site collection.
Refer to Resetting the default search collection.)
- Edit the portal site collection content source and fill
in the crawler user ID and its password. To do this, proceed as follows:
- Click .
- Select Default Search Collection from the
Search Collection list.
- In the Content Source Name list of
the search collection, click
Portal Content Source.
- Click the Edit icon next to the Portal
Content Source name.
- Select the Security tab.
- Click the Edit icon next to
the security
realm that you want to modify.
- Type the crawler user ID and
password into the appropriate fields.
- Click Update.
- Click Save to save your changes.
- Optional: For content
sources of type Web
Site, you can configure the crawler to follow external links from
inside the portal. To do this, change the value of the Levels
of links to follow field on the General Parameters tab.
Set the level to a value higher than 1. In
addition, you can configure filters for those external links from
the Filters tab. The default filter suppresses all links that point
back to portal pages. The default filter is displayed only after you
save the configuration of the content source.
- Start the initial crawl. Start the initial
crawl
on the portal site content source:
- Click .
- In the search collection list, click Default
Search Collection.
- Click
the Start Crawler icon
(right-pointing arrow) next to the Portal content source name.
- Configure regular crawls. If you want regular crawls
on the portal site content source, perform either of the following
tasks:
- Enable the default scheduler. To do this,
proceed as follows:
- Click the View Content Source Schedulers
icon next to the collection
name.
- In the Manage Schedulers page, click Disabled.
This changes the status of the scheduler to Enabled and displays a
confirmation message.
- Set up your own scheduler.
To do this, proceed as follows:
- Click the Edit icon for the content source.
Note: You
can have only one schedule at a time. Therefore, to create your own
schedule, you first have to delete the existing schedule.
- Select the Schedulers tab.
- Configure
your own scheduler as required. For more details about
how to do this, refer to the Manage Search portlet help.
- Click Save to save your changes.
For more detailed information about how to
work with content
sources, refer to
Managing the content sources of a search collection and to the
Manage Search portlet help.
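The note in the first task about when to set the crawler user ID can be summarized as a simple decision rule. The following Python sketch is illustrative only: the class and function names are hypothetical and are not part of any portal API. It encodes the three conditions from that note.

```python
from dataclasses import dataclass


@dataclass
class CrawlerSetup:
    """Hypothetical description of your installation (not a portal API)."""
    admin_password_changed: bool  # password of wpsadmin or other admin ID was changed
    admin_id_changed: bool        # default admin ID was changed to an ID other than wpsadmin
    want_dedicated_id: bool       # you want a separate user ID for the crawler


def must_set_crawler_user_id(setup: CrawlerSetup) -> bool:
    """Return True if any condition from the note applies."""
    return (
        setup.admin_password_changed
        or setup.admin_id_changed
        or setup.want_dedicated_id
    )


# An untouched default installation does not strictly require the step,
# although a dedicated crawler user ID is still recommended.
untouched = CrawlerSetup(False, False, False)
changed_password = CrawlerSetup(True, False, False)
```

If the function returns True for your installation, edit the Security tab of the portal content source as described in the first task.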
Notes:
- The local portal site is exposed through a service that requires
SSL. Therefore, if your portal is configured with a Web server and
you configure the content source root URL through the Web server,
you must configure the Web server for SSL.
- By default, items
in the result lists from portal site searches
provide no summary information. If you want summary information to be added,
configure the portlet with the summary parameter enabled as follows:
PortalCollectionSummarizer=on.
- When you crawl a portal site, be aware of the considerations that are described in Memory required for crawls and in Time required for crawls and imports and availability of documents.
- Set
the preferred language of the crawler user ID to match the
language of the search collection that it crawls.
- The portal
site search collection is created when an administrator
navigates to the Manage Search portlet. However, you must start the
crawl for users to be able to search the portal site. Depending on
your portal configuration and environment and possible customization,
you might need to reset the portal site search collection that was
created. For details about such scenarios and the necessary tasks
to perform, refer to Resetting the default search collection.
- If your users search the portal site search collection on a secured
portal site, refer to the additional information under Enabling search on a secured portal site with the default configuration.
- The portal search crawler indexes static
content pages and all pages that include portlets.
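As a quick sanity check for the SSL note above, you can verify that the root URL you plan to enter for the content source uses the https scheme. This is a minimal sketch. The helper name and the example host are hypothetical, and the check inspects only the URL text. It does not confirm that the Web server itself is configured for SSL.

```python
from urllib.parse import urlsplit


def root_url_uses_ssl(root_url: str) -> bool:
    """Return True if the given root URL uses the https scheme."""
    # urlsplit normalizes the scheme to lowercase.
    return urlsplit(root_url).scheme == "https"


# Hypothetical host names, for illustration only.
secure_url = "https://webserver.example.com/wps/portal"
plain_url = "http://webserver.example.com/wps/portal"
```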
When users search a portal site, they can access portal
pages of two types:
- Public or anonymous portal pages. These
are pages that users can
view without authentication by user ID and password. The crawler can
crawl public pages on the portal site on which it resides, or on a
remote portal.
If you want anonymous users to be able to search
the public pages of your portal site, refer to Enabling anonymous users to search public pages of your portal.
- Secured portal
pages. These are pages that users can view only
after they log in to the
portal with a user ID and password. Refer to Configuring search on a secured portal site.
Note: You can crawl,
index, and search secured portal pages only on your local portal installation.
For security reasons, you cannot crawl secured pages of one portal
site from another portal site.
If you customize
search on your portal site, you might
find useful information under Configuring the default location for search collections and Resetting the default search collection.
If
your portal site is
multilingual and your users use different languages to search your
portal, refer to Crawling a multilingual portal site.