
LucidWorks Enterprise supports crawling a SharePoint Repository running on the following platforms:
* Microsoft Office SharePoint Server 2007
* Microsoft Windows SharePoint Services 3.0
* Microsoft SharePoint 2010
{warning}LucidWorks Enterprise does not support crawling a SharePoint repository running on Microsoft SharePoint Portal Server 2003 or Microsoft Windows SharePoint Services 2.0.{warning}
The SharePoint crawler indexes all content in the repository, including public and personal sites, files, discussion boards, calendars, contacts, and images.
h2. Creating a SharePoint Data Source
To configure a SharePoint repository as a data source, click *SharePoint* and fill in the form.
{note}
Additional web services must be installed on the SharePoint server, as described in [#Installing SharePoint Additional Web Services] below, in order for SharePoint data sources to work effectively.
{note}
!screen-index-sources-sharepoint.1.8.png|border=1!
|| Field || Description ||
| Name | A name for the data source. Data source names may contain any combination of letters, digits, spaces, and other characters, and are case sensitive. |
| SharePoint URL | The fully qualified URL for the SharePoint site. |
| Username | Username with authorization to crawl the SharePoint repository. |
| Password | Password for the username above. |
| Domain | The domain where the user is authenticated. |
| MySite URL | Used for MOSS 2007 only. The MySite base URL is used to determine the complete MySite URL, so {{[http://server.domain/personal/administrator/default.aspx]}} would be entered as {{[http://server.domain]}}. The credentials provided will allow LucidWorks Enterprise to complete the MySite URL and crawl the content. |
| Included URLs | The directories on the server that should be crawled for indexing. If left blank, all paths will be followed, even if they lead away from the original URL entered. To limit crawling to a specific site, repeat the URL in this field with a regular expression that matches all pages from the site. The SharePoint data source uses GNU regular expressions, which may differ from the Java regular expressions used for Web and Filesystem data sources. More information on the syntax can be found in the GNU regular expression [documentation|http://www.gnu.org/software/gawk/manual/html_node/Regexp.html]. |
| Excluded URLs | Directories on the server that should not be crawled and that should be excluded from the index. The same regular expression syntax can be used to specify Excluded URLs as is used for Included URLs. |
| Kerberos KDC Server | The hostname of the Kerberos Key Distribution Center (KDC) server. |
| Use SP Search Visibility | Use SharePoint search visibility options. |
| Site Alias Mappings | Allows mapping of source URL patterns to aliases that are used to rewrite URLs before indexing. |
| Index Immediately | Determines whether indexing starts immediately. Choosing *Once* will start indexing the data source as soon as the _Create_ button is clicked. Choosing *Daily* will start indexing immediately and set a schedule to re-index the data source every day at the same time. Choosing *No* will neither start indexing nor set a schedule; indexing can be started later, and a schedule can be set on the Schedules page. |
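
As an illustration of the _MySite URL_ and _Included URLs_ fields described above, the following Python sketch shows one way to derive the values to enter. It is not part of LucidWorks Enterprise: the function names {{mysite_base_url}} and {{include_pattern_for_site}} and the {{/sites/docs}} path are hypothetical, and Python's {{re}} module is used only as a rough stand-in for the GNU regular expression syntax the SharePoint data source uses. The sketch also assumes a pattern must match the whole URL; check the behavior of your installation before relying on it.

```python
import re
from urllib.parse import urlsplit

# Characters treated as special in common regular expression dialects.
_SPECIAL = set(".^$*+?()[]{}|\\")


def mysite_base_url(full_url: str) -> str:
    """Reduce a full MySite page URL to its base (scheme and host only)."""
    parts = urlsplit(full_url)
    return f"{parts.scheme}://{parts.netloc}"


def include_pattern_for_site(site_url: str) -> str:
    """Build a pattern matching a site URL and every page beneath it.

    Special characters such as '.' are backslash-escaped so they match
    literally; the trailing group allows any path below the site.
    """
    trimmed = site_url.rstrip("/")
    escaped = "".join("\\" + ch if ch in _SPECIAL else ch for ch in trimmed)
    return escaped + "(/.*)?"


if __name__ == "__main__":
    # Example URL taken from the MySite URL field description.
    print(mysite_base_url(
        "http://server.domain/personal/administrator/default.aspx"))

    # Hypothetical site path, used for illustration only.
    pattern = include_pattern_for_site("http://server.domain/sites/docs/")
    print(pattern)
    print(bool(re.fullmatch(
        pattern, "http://server.domain/sites/docs/page.aspx")))
```

The first value is what would go in the _MySite URL_ field; the generated pattern is a candidate for _Included URLs_ when crawling should stay within a single site. Escaping the dots keeps the pattern from matching unrelated host names.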
h2. Working with Secure Content from SharePoint
By default, all content crawled from a SharePoint repository is available to all users. However, LucidWorks Enterprise can be configured with Search Filters to restrict which documents appear in query results.