SharePoint 2010 Search

Crawler: Crawling is the process of gathering data from content sources and storing it in databases for use by the query server.
Indexer: Indexing is the process of turning data gathered by the crawler into logical structured data that is usable by a search engine. This process is the second key component to any search engine. The indexer is responsible for making sense of crawled data. The indexer also collects custom metadata, manages access control to containers, and trims the results for the user when interfacing with the search engine.
Query Processor: The query processor is the third major component of the search architecture. The query processor is the portion of the search architecture that users directly interface with. It is what accepts queries entered into the search box, translates them into programmatic logic, delivers requests to the index engine, and returns results.
Databases: The fourth and final components of the search infrastructure are databases. Almost all data in SharePoint is stored in SQL database instances. With regard to search, databases are used to store a wide range of information, such as crawled and indexed content, properties, user permissions, analysis reports, favorite documents, and administrative settings. When the crawler accesses a content source and brings data into SharePoint, it places that content into one or more databases. In addition, all of the data that administrators and users add to SharePoint through active actions, such as adding metadata, or passive actions, such as logs created from portal usage, is also stored in SQL databases.
There are three primary SQL databases necessary to run the search service in SPS 2010.
  • Crawl databases manage crawl operations and store the crawl history.
  • Property databases store the properties (metadata) for crawled data.
  • The Search Admin database stores the search configuration data and access control list (ACL) for crawled content.
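
The division of labor between crawler, indexer, and query processor can be sketched in a few lines of Python. This is a toy model for illustration only, not SharePoint code: the content source is a plain dictionary, the "index" is a simple term-to-URL map, and all names and URLs are hypothetical.

```python
from collections import defaultdict


def crawl(content_source):
    """Crawler: gather (url, text) pairs from a content source (here, a dict)."""
    return list(content_source.items())


def build_index(crawled):
    """Indexer: turn crawled text into structured data (term -> set of URLs)."""
    index = defaultdict(set)
    for url, text in crawled:
        for term in text.lower().split():
            index[term].add(url)
    return index


def run_query(index, term):
    """Query processor: look up one term and return the matching URLs, sorted."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical content source used only for this sketch.
source = {
    "http://intranet/a.docx": "SharePoint search overview",
    "http://intranet/b.docx": "Nuclear energy report",
}
index = build_index(crawl(source))
print(run_query(index, "search"))
```

A real deployment stores the crawled data and properties in the SQL databases listed above rather than in memory.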

Crawling User Profiles
Enough cannot be said about the power of connecting people for business. For most organizations, their people and those people’s expertise are their biggest assets. Finding people and expertise in a company can be a challenging task at the best of times, and experience and skills can go largely unexploited because people with the right knowledge cannot be found—or worse, their colleagues don’t even know they exist.
SharePoint’s People Search is a powerful feature to expose people in an organization and their expertise, making them findable and accessible. The people search mechanism, although a simple enough concept, requires the identification of people in the organization, their expertise, and their contact information. In order to expose this information and find the relevant people, SharePoint must first be able to crawl the information about these people.

People data in SharePoint comes from indexing user profiles. User profiles are held in SharePoint and hold the information about all the users of SharePoint as well as other potential SharePoint users that may have profile data imported from Active Directory or some other directory server service.

User profile data can be entered manually, either by the administrator or by the users themselves in their personal site (MySite). Additionally, other data sources can be used to populate user profile data.
Usually the starting point for an organization is to synchronize the existing information they have in their organization’s directory with SharePoint and then allow connected users to enrich that information on their MySite pages. This will allow for rich metadata and social search functionality in People Search.

However, this is not strictly necessary, and data from a directory server is not required to have a rich people search experience as long as users are aware of the MySite feature and have the time and interest to keep it up to date.
User profile data is managed by the User Profile service application in the Service Applications section of Central Administration. For the purpose of this book, we will only go into crawling user profiles and synchronizing them with Directory Servers, but it is important to note that a great deal of rich user information can be managed from this service application. Additionally, the User Profile service application makes it possible to share user data across multiple sites and farms. This can allow for a rich and effective people search and expose expertise in areas of the organization not previously accessible to many employees.
The protocol used to crawl data collected from the User Profile service is called SPS3. It appears in the default content source for SharePoint sites as sps3://servername. If user profiles are not being crawled, check whether this address is set in the default content source.
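
The check described above amounts to looking for an address with the sps3 scheme among the start addresses of the content source. The following Python sketch illustrates the idea only; it does not read a real content source, and the list of addresses is a hypothetical stand-in for what an administrator would see in Central Administration.

```python
from urllib.parse import urlparse


def has_profile_address(start_addresses):
    """Return True if any start address uses the sps3 scheme."""
    return any(urlparse(addr).scheme.lower() == "sps3" for addr in start_addresses)


# Hypothetical start addresses copied by hand from a content source.
addresses = ["http://servername", "sps3://servername"]
print(has_profile_address(addresses))
```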
Synchronizing User Profiles
To synchronize user profiles, navigate to the “Manage service applications” page in Central Administration. Then choose the User Profile Service Application. Check whether the service application is started in the right-hand column.
Searching from MySites
Each MySite has its own search box, similar to the one on every page of a SharePoint site. However, this MySite search box can have a different target search center. Designating the target search center for MySites is done in the User Profile service application under MySite Settings.

Using Crawl Rules
SharePoint 2010’s crawler communicates with the defined content sources in a very standardized manner. It indexes the content using the account it has been configured with and collects information from all the links that are specified. If subfolders are set to be indexed, it will navigate to those folders, collect the links, and gather the content. It is not always desirable or possible, however, to have SharePoint crawl the content sources in the same way with the same accounts. Therefore, SharePoint 2010 has a powerful feature to specify rules for given paths that may be encountered during crawling.
These rules can include or exclude specific content as well as pass special user credentials to those specific items in order to gather them correctly.
Crawl rules are applied in the Search service application on the Crawl Rules page, which is under the Crawler section of the left-hand navigation. Adding a new crawl rule is as easy as navigating to the Crawl Rules page and selecting New Crawl Rule. Because regular expressions and wildcard rules can be applied, a testing feature is made available on the Crawl Rules page. This feature allows a particular address to be entered and tested to see whether an existing rule affects the crawling of that address.
Since many rules can be applied and the effect of rules is not always obvious, this testing feature is very useful. If a page is not being crawled, administrators are encouraged to check for conflicting rules.
Add a crawl rule:
To add a crawl rule, navigate to the Search service application and choose Crawl Rules in the left-hand navigation under Crawler. On the Crawl Rules page, select New Crawl Rule. On the Add Crawl Rule page, paths can be added to be explicitly excluded or included. Wildcards or regular expressions can be used to create complicated inclusion or exclusion rules. This gives a powerful way to find undesirable or desirable content and make sure it is or isn’t crawled.
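
To reason about how several wildcard rules interact before entering them, the matching can be modeled offline. The Python sketch below is an illustration only and not the SharePoint rule engine: the rule list and URLs are hypothetical, and it assumes that rules are checked in order and that the first matching rule decides the outcome.

```python
from fnmatch import fnmatchcase

# Hypothetical rules as (path pattern, include?) pairs, highest priority first.
RULES = [
    ("http://intranet/archive/*", False),  # exclude the archive
    ("http://intranet/*", True),           # include everything else on the host
]


def match_rule(url, rules=RULES):
    """Return the first (pattern, include) rule matching the URL, or None."""
    for pattern, include in rules:
        # Compare case-insensitively; "*" matches any run of characters.
        if fnmatchcase(url.lower(), pattern.lower()):
            return pattern, include
    return None


print(match_rule("http://intranet/archive/2009/old.docx"))
print(match_rule("http://intranet/sites/hr/policy.docx"))
```

Swapping the two rules in this model makes the archive exclusion unreachable, which is the kind of conflict the testing feature on the Crawl Rules page helps to uncover.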

Server Name Mappings
Sometimes, it is desirable to crawl one source and have the link refer to another source. For example, I have a dedicated crawl server called SPCrawlWFE, which is a mirror of my web servers that are providing content to my users. I want to crawl the SPCrawlWFE site but have the users click through to the other server (SPProdWFE). By using server name mappings, one site can be crawled and the server names in the links on the result page are changed to another server.
To add a server name mapping, navigate to the Server Name Mappings page under the Crawler section of the Search service application. Click New Mapping. On the Add New Mapping page, add the name of the site that was crawled and the name of the site users should click through to.
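
The effect of a server name mapping on a result link can be pictured as a host substitution. The following Python sketch only models that effect, using the SPCrawlWFE and SPProdWFE names from the example above; the function name and the sample URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def apply_mapping(url, mappings):
    """Replace the host of a crawled URL with the host users should click through to."""
    parts = urlsplit(url)
    target = mappings.get(parts.netloc.lower())
    if target is None:
        return url  # no mapping for this server, leave the link unchanged
    return urlunsplit(parts._replace(netloc=target))


# Crawled server (lowercase key) -> server shown in result links.
mappings = {"spcrawlwfe": "SPProdWFE"}
print(apply_mapping("http://SPCrawlWFE/sites/hr/policy.docx", mappings))
```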

Crawling Metadata
Metadata is information that is associated with a document or file that is not necessarily an explicit part of the visible document. Often, metadata is held in hidden tags on a document or with files or records associated with that document. SharePoint 2010 has a powerful mechanism to assign a large number of properties to lists and documents, which is configurable by the administrator and updatable by authors and collaborators. The management of metadata will be discussed later in this book where it is considered relevant for improving or expanding search. SharePoint 2010 has a rich, new Managed Metadata service application that adds a great deal of configurability and relevancy to search.

Defining Scopes
Search scopes are a slightly confusing concept in SharePoint, as Microsoft has adopted the term “scope” to refer to a structured categorization of documents based on filtering documents by their shared properties. It is best to think of search scopes as groups of documents that have shared properties. When setting up the crawler, it is possible to create scopes based on managed properties. This will allow pre-determined filter sets to be applied on new search tabs and in search box drop-downs (must be enabled in the Site Collection Search settings). Care should be taken to create scopes that match business needs and possible future sectioning of the content. Any managed property can be made available for scopes, but properties must be explicitly defined as available for scopes on the Edit Managed Property page.

To create a new scope, navigate to the Scopes menu item under Queries and Results on the Search Service Application page. On the Scopes page, the existing scopes can be seen. In SharePoint 2010, there should be two default scopes, People and All Sites.
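
Thinking of a scope as "the group of documents that share certain property values" can be made concrete with a small model. The Python sketch below is an illustration only, not how SharePoint stores or compiles scopes: the documents, the property names, and the rule format are hypothetical, and only simple "property equals value" rules are modeled.

```python
def in_scope(doc, rules):
    """A document is in the scope when every property rule matches."""
    return all(doc.get(prop) == value for prop, value in rules.items())


# Hypothetical crawled items with managed property values.
docs = [
    {"Title": "Q1 report", "ContentType": "Document", "Department": "Finance"},
    {"Title": "Josh Noble", "ContentType": "Person", "Department": "IT"},
]

# Hypothetical scope: all items whose Department property is "Finance".
finance_scope = {"Department": "Finance"}
print([d["Title"] for d in docs if in_scope(d, finance_scope)])
```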
Federated Sources
Another great feature supported in SharePoint 2010 is the ability to add federated sources. Federated sources are those sources that are not directly crawled by SharePoint’s crawlers but can still be searched by querying and accessing the indexes of external systems. This is done by querying the search mechanism of that external source, retrieving the result set, and then formatting and displaying it within the SharePoint search interface.
Federated sources can be either SharePoint sites or sites that conform to the OpenSearch 1.0 or 1.1 standards. These standards define how search queries should be passed and how the data is structured and returned.
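
In OpenSearch, the way a query is passed is described by a URL template in which the {searchTerms} placeholder is replaced by the user's query. The Python sketch below shows that substitution; the template URL is a made-up example, and real templates may contain further placeholders that this sketch does not handle.

```python
from urllib.parse import quote


def build_query_url(template, search_terms):
    """Fill the {searchTerms} placeholder with the URL-encoded query."""
    return template.replace("{searchTerms}", quote(search_terms, safe=""))


# Hypothetical template for an external source that returns RSS.
template = "http://example.com/search?q={searchTerms}&format=rss"
print(build_query_url(template, "share point"))
```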
New federated sources can be defined or an existing template can be downloaded and imported. To create a new federated source, click New Location and fill out the fields with the appropriate settings. Every source requires a name that will also be used in the search center.
Creating a New Federated Source
When creating a new federated location, a name and a display name should be defined. The display name will be shown on the federated search Web Part. A description of the source is also required.
The Enterprise Search Center
The Enterprise Search Center provides a tab-based interface in which users can jump between several search pages. The two tabs, All Sites and People, are provided by default, but additional tabs can be added. These tabs can be fully customized to cater to the needs of a particular organization. By selecting the different tabs above the query field, users can direct their searches to different scopes.
Deploying the Enterprise Search Center
If People search is required or desired, it is wise to deploy the Enterprise Search Center, which has a template with all the elements of People search already set. Although the Enterprise Search Center template is visible on standard SharePoint deployments, it requires the SharePoint Server Publishing Infrastructure feature.

To deploy the Enterprise Search Center, follow these steps.
  • Navigate to the top level of the site collection where you want the search center.
  • Log in as a site collection administrator.
  • Choose Site Actions, Create Site.
  • Name the search center in the Title field.
  • Give a path for the search center under Web Site Address.
  • Choose the Enterprise tab under the Template section.
  • Choose the Enterprise Search Center template.

The Basic Search Center
Unlike the Enterprise Search Center, the Basic Search Center provides users only with the ability to execute basic and advanced searches against one universal access search experience. While the Basic Search Center can be customized to provide a specialized experience for users, every user of the search center will get results from the same content sources in an identical layout. In addition, as noted earlier, the Basic Search Center does not provide a pre-deployed tab for People search. The main benefit of this search center is simplicity, as it allows for quick deployment of search functionality on any SharePoint site collection.
Deploying the Basic Search Center
The Basic Search Center gives search and result functionality with advanced search and preferences but without a built-in, pre-defined People Search tab. See Figure 4-2.
To deploy the Basic Search Center, follow these steps and refer to Figure 4-6.
  • Navigate to the top level of the site collection where you want the search center.
  • Log in as a site collection administrator.
  • Choose Site Actions, Create Site.
  • Name the search center in the Title field.
  • Give a path for the search center under Web Site Address.
  • Choose the Enterprise tab under the Template section.
  • Choose the Basic Search Center template.
Redirecting the Search Box to the Search Center
  • Navigate to the top level of the site collections.
  • Choose Site Actions, Site Settings.
  • Under Site Collection Administration, choose Search Settings.
  • In the Site Collection Search Result page field, define the path to your new search results page. If you named your Basic Search Center “Search”, this path will be /Search/results.aspx. If you created an Enterprise Search Center and named it “Search”, the path will be /Search/Pages/results.aspx.
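
The only difference between the two paths is the Pages library segment used by the Enterprise Search Center. A small Python helper makes the rule explicit; the function name is hypothetical and the helper simply restates the two paths given in the steps above.

```python
def results_page_path(center_name, enterprise):
    """Return the results page path for a search center with the given site name."""
    if enterprise:
        # Enterprise Search Center pages live in the Pages library.
        return f"/{center_name}/Pages/results.aspx"
    return f"/{center_name}/results.aspx"


print(results_page_path("Search", enterprise=False))
print(results_page_path("Search", enterprise=True))
```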

Best Bets - Best bets are result suggestions pushed to users based on their search queries. Unlike the search suggestion functionality, which suggests a query, the Best Bets feature suggests a result. Best Bets suggestions are triggered by specific keywords entered into the query and are presented as the first result(s) in a search result set. This result is slightly offset and marked with a star to stand out from the rest of the result set. For users, clicking a Best Bets suggestion works like clicking any other search result of the same content type. The usefulness of the Best Bets feature for users is that, if well managed, it drives the most relevant result to the start of a result set.
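
The behavior described above, where a keyword in the query pushes a hand-picked result to the top, can be modeled in a few lines. The Python sketch below is an illustration under simple assumptions and not the Best Bets Web Part: the keyword table and URLs are hypothetical, and keywords are matched as single words in the query.

```python
# Hypothetical keyword -> best bet URLs table.
BEST_BETS = {
    "vacation": ["http://intranet/hr/vacation-policy.aspx"],
}


def results_with_best_bets(query, organic_results, best_bets=BEST_BETS):
    """Place best bets for any keyword in the query ahead of the normal results."""
    promoted = []
    for term in query.lower().split():
        for url in best_bets.get(term, []):
            if url not in promoted:
                promoted.append(url)
    # Avoid listing a promoted result a second time further down.
    rest = [url for url in organic_results if url not in promoted]
    return promoted + rest


organic = ["http://intranet/blog/my-vacation.aspx",
           "http://intranet/hr/vacation-policy.aspx"]
print(results_with_best_bets("vacation request", organic))
```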
Federated Results
Search federation in SharePoint allows search results to be pulled from other search engines into a SharePoint search center. If set up, users can use one search center to pull information back from content outside of SharePoint. For example, a user who is logged into the North American SharePoint farm but also wants results from the European farm can get them if federated search is set up. In the same search interface, the user could also return search results from an Internet search engine, such as Bing.com.

This concept was first highlighted in Chapter 1 and is essential for SharePoint environments that need to pull content from sources such as external silos, other SharePoint farms, blogs, and Internet search engines. Chapter 1 discussed how SharePoint can accept a query, pass it to another external search engine for processing, and return results from that external search engine into one aggregated interface.

Operators in SharePoint Search:
  • When searching for “Noble AND SharePoint”, all results returned must contain both the term “Noble” and the term “SharePoint”.
  • When searching for “Energy OR Nuclear”, results will include items containing “Energy”, “Nuclear”, or both “Energy” and “Nuclear”.
  • When searching for WORDS(Kids, Children), results will include items containing “Kids” as well as “Children” and consider them to be the same term for ranking.
  • When searching for “Energy NOT Nuclear”, all results returned must contain the term “Energy”, but never should contain “Nuclear”.
  • When searching for “Energy -Nuclear”, all results returned must contain the term “Energy”, but exclude results containing “Nuclear”.
  • The query author:"Josh Noble" will return only documents authored by “Josh Noble”. The query filetype:pdf will return only PDF-type files.
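
The Boolean operators above can be tried out on sample text with a toy evaluator. The Python sketch below is a simplified model and not the SharePoint query parser: it handles only flat queries made of single words, evaluates strictly left to right without precedence or parentheses, treats a bare word as an implicit AND (an assumption of this sketch), and ignores WORDS() and property restrictions.

```python
def matches(doc_text, query):
    """Evaluate a flat AND / OR / NOT / -term query against one document."""
    words = set(doc_text.lower().split())
    parts = query.split()
    result = parts[0].lower() in words
    i = 1
    while i < len(parts):
        token = parts[i]
        if token.startswith("-") and len(token) > 1:
            # "-term" excludes documents containing the term.
            result = result and token[1:].lower() not in words
            i += 1
        elif token in ("AND", "OR", "NOT") and i + 1 < len(parts):
            present = parts[i + 1].lower() in words
            if token == "AND":
                result = result and present
            elif token == "OR":
                result = result or present
            else:  # NOT
                result = result and not present
            i += 2
        else:
            # Bare term: treated as an implicit AND in this sketch.
            result = result and token.lower() in words
            i += 1
    return result


print(matches("Nuclear energy plant", "Energy NOT Nuclear"))
print(matches("Solar energy plant", "Energy -Nuclear"))
```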


Query Suggestions
Query suggestions, or search suggestions as they are commonly called, are shown as a drop-down on the search box as the user types in search terms. Search suggestions are shown as “search-as-you-type,” if any suggestions exist that match the text in the search box.
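
The "search-as-you-type" behavior boils down to matching the text typed so far against a list of known queries. The Python sketch below illustrates simple prefix matching only; the list of earlier queries is hypothetical, and how SharePoint actually collects and ranks its suggestions is not modeled here.

```python
def suggest(prefix, known_queries, limit=5):
    """Return up to `limit` known queries that start with the typed prefix."""
    prefix = prefix.lower()
    if not prefix:
        return []  # nothing typed yet, so no drop-down
    hits = {q for q in known_queries if q.lower().startswith(prefix)}
    return sorted(hits, key=str.lower)[:limit]


# Hypothetical list of queries that are eligible as suggestions.
known = ["SharePoint search", "shared calendar", "energy report"]
print(suggest("sha", known))
```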

Choosing a Federated Location
To use any new federated location, choose it from the Location drop-down. Alternatively, the pre-defined Internet Search Results location can be selected. This searches http://bing.com using the OpenSearch standard and RSS. To complete the configuration, follow these steps.

  • After selecting a location, click Apply, and then OK.
  • Click Stop Editing on the ribbon, and test the new settings with a query.
Search Keywords
Keywords are used to configure a relationship between a keyword, synonyms, and a manually defined result set called best bets. Many companies have content where some terms are especially important or have a special meaning. As a result, a particular query may not return the wanted search result, or the wanted result may appear too far down in the search result set.

Search keywords are a very useful feature in these types of scenarios. They can either be used to define synonyms for particular keywords or to add/emphasize specific search results on the search results page using best bets and the Best Bets Web Part.
Even though this feature is also available in SharePoint 2007, it is especially useful in SharePoint 2010 due to the search statistics available. Now an administrator can find often-searched-for query terms and augment the search results of these with specially selected best bets.

Managing Search Keywords
The search keywords page is used to create the mapping between keywords and synonyms. Additionally, best bets can be configured for each keyword. The search keywords page is accessed from Site Actions ➤ Site Settings ➤ Site Collection Administration ➤ Search Keywords.
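
The mapping between a keyword and its synonyms can be pictured as a lookup that resolves whatever the user typed to the keyword an administrator defined. The Python sketch below is an illustration only; the keyword table is hypothetical and whole-query matching is an assumption of this sketch.

```python
# Hypothetical keyword definitions: keyword -> list of synonyms.
KEYWORDS = {
    "vacation": ["holiday", "pto"],
    "expenses": ["reimbursement"],
}


def find_keyword(query, keywords=KEYWORDS):
    """Return the keyword that the query equals or is a synonym of, else None."""
    q = query.strip().lower()
    for keyword, synonyms in keywords.items():
        if q == keyword or q in synonyms:
            return keyword
    return None


print(find_keyword("Holiday"))
```

Once a query has been resolved to a keyword, the best bets configured for that keyword can be shown regardless of which synonym was typed.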
Adding Search Scope Selector to the Search Box

To enable scope selection directly from the search box, do the following:
  1. Go to Site Actions ➤ Site Settings ➤ Site Collection Administration ➤ Search Settings. This will open the page for managing the search boxes on this site collection.

v4.master
This is the default team site master page and the one that is suggested as a template when creating a new or custom branded master page. This master page provides the new ribbon bar as well as other UI changes. Also, the Site Actions button has moved in SP 2010. It now appears in the upper left corner.
default.master
If a site is upgraded from SP 2007, it uses this master page by default. The Site Actions button is located on the upper right side, and the UI is mainly the same as in SP 2007. This master page does not include the ribbon bar. The sites using this master page can be changed to use the new version 4 master page, named v4.master, or any custom branded master page based on it.

minimal.master
This master page is close to being the simplest possible. It is used only by the search centers and Office Web Applications. One of the things most people quickly notice when using sites based on this master page is the lack of navigation. It is arguably a significant lack of functionality, and although the purpose is to provide more screen real estate for search results as well as making the search center appear cleaner, it is something that should be changed in most corporate sites. It does make more sense for the Office Web Applications, as they have their own UI. In any case, this is how the minimal.master is in SP 2010 out of the box.
simple.master
This master page is used by the generic SharePoint 2010 pages such as login and error pages. It is not possible to use another master page for these pages. The only option to customize these pages is to create a replacement page and override the existing one by saving it in the _layouts directory on the server.
The following pages use simple.master:
  • Login.aspx
  • SignOut.aspx
  • Error.aspx
  • ReqAcc.aspx
  • Confirmation.aspx
  • WebDeleted.aspx
  • AccessDenied.aspx

Other Master Pages
SharePoint 2010 includes many other master pages that typically do not need to be modified when creating a custom branded layout. It is suggested to leave them unchanged unless there is a special reason to change them. These master pages are:
  • application.master
  • applicationv4.master
  • dialog.master
  • layouts.master
  • layoutsv3.master
  • pickerdialog.master
  • rtedialog.master
  • simple.master
  • simplev4.master
  • mwsdefault.master
  • mwsdefaultv4.master
  • admin.master
  • popup.master

Creating a FullTextSqlQuery-Based Search
Sometimes the programmer wants more control over the query. This can be achieved using the FullTextSqlQuery class. This class allows the programmer to control the query using SQL syntax, which is often a more familiar language and makes it easier to understand why a query behaves a particular way. The FROM target object in the SQL has to be set to the SCOPE() function, as shown here. If further refinement is required, the scope can be set in a WHERE clause. If the scope is not set, no scope is applied. In the following query, the first names of all persons in the People scope are returned.
SELECT FirstName FROM SCOPE() WHERE "scope" = 'People'
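
When such queries are assembled in code, it helps to build the string in one place so that the SCOPE() target and the optional scope filter stay consistent. The Python sketch below only composes the query text shown above; it does not call the FullTextSqlQuery class, the helper name is hypothetical, and doubling single quotes to escape the scope name is an assumption based on common SQL practice.

```python
def build_scope_query(columns, scope=None):
    """Compose a SELECT ... FROM SCOPE() statement, optionally limited to one scope."""
    sql = f"SELECT {', '.join(columns)} FROM SCOPE()"
    if scope is not None:
        # Assumption: single quotes inside the scope name are escaped by doubling.
        escaped = scope.replace("'", "''")
        sql += f" WHERE \"scope\" = '{escaped}'"
    return sql


print(build_scope_query(["FirstName"], "People"))
print(build_scope_query(["Title", "Path"]))
```

The resulting string would then be handed to the query object in server-side code.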
