PortiBlog

Cloud hybrid search considerations

24 March 2016

Summary: This blog post describes some limitations that you need to consider before implementing cloud hybrid search in your organization.

For a full overview of how Cloud Hybrid Search works, read my blog post: Everything you need to know about Cloud Hybrid Search

1. Provider-hosted apps
If you are currently using any provider-hosted apps in your SharePoint farm, do not run the onboarding script provided by Microsoft.
One of the steps in the onboarding script changes the SPAuthenticationRealm of your SharePoint farm.
This breaks the SPTrustedSecurityTokenIssuer that is responsible for the server-to-server trust between the provider-hosted app and SharePoint.

Microsoft is currently investigating this issue. I will update this blog post if anything changes.

2. Search customizations
The Cloud Search Service Application has an architecture similar to that of the native SharePoint Search Service Application.
However, customization is limited because the search experience comes from Office 365.

Below is a table that shows the current limitations in search customizations when using Cloud Hybrid Search.
[Table: Hybrid Cloud Search customizations]

3. Index item limits and pricing
Vlad Catrinescu has written about this on his blog.
For every 1TB of pooled storage in SharePoint Online, you are allowed to index one million items from your on-premises SharePoint farm.

You can check your current number of searchable items in the Search Service Application in Central Administration.
[Screenshot: Cloud hybrid search considerations]

In this example, we would need 13TB of pooled storage in SharePoint Online. If that is more than you have available, you might have to reconfigure your content sources so that fewer items are crawled.
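To make the arithmetic explicit, here is a minimal sketch of the calculation. It assumes the ratio quoted above (one million index items per 1TB) and that a partly used terabyte is rounded up; the rounding rule, the function name, and the item count are my own illustration and not figures from Microsoft or from the screenshot.

```python
ITEMS_PER_TB = 1_000_000  # ratio quoted in this post: one million index items per 1TB


def required_pooled_storage_tb(searchable_items: int) -> int:
    """Return the whole terabytes of pooled storage needed for an item count."""
    if searchable_items < 0:
        raise ValueError("searchable_items must be non-negative")
    # Integer ceiling division (assumption: a partial terabyte is rounded up)
    return -(-searchable_items // ITEMS_PER_TB)


# Hypothetical item count, used only to show the calculation
print(required_pooled_storage_tb(12_400_000))
```

Run it with the number of searchable items reported by your own Search Service Application to get a rough idea of the storage you would need.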

4. Security and compliance
As your index is stored in Office 365, what does this mean for compliance?
This is the answer from Microsoft:

The content that is passed from on-premises to the azure cloud search connector (SCS) consists of crawled properties, keywords, ACLs, tenant info and some other metadata about the item. This is encrypted on-premises using a key supplied by the SCS and transmitted to the endpoint in Azure. Once there it is stored in an encrypted blob store and queued for processing. We retain the encrypted package in the blob store for use should we need to issue a content recrawl. The encrypted object is not the document though, it is just a parsed and filtered version that makes sense to the search engine.

In my opinion, it would be wise to consult your legal department before setting up Cloud Hybrid Search, to make sure it is okay to store this content in the cloud.
You can always modify the content sources to exclude highly classified documents.

5. Licensing and Office 365 accounts
Every user who wants to use the new Cloud Hybrid Search functionality needs an active Office 365 license and an account synchronized from your on-premises Active Directory.
[Image: Cloud hybrid search identity]
As items are indexed in Office 365, the access control entities are looked up in the cloud directory service.
[Image: Hybrid search federated account]
User SIDs are mapped to PUIDs, and group SIDs are mapped to Object IDs.

6. Documentation
As cloud hybrid search is still in preview, documentation is limited.
For example, if your server uses a proxy for Internet access, make sure to specify that proxy in your machine.config.
I have created a more detailed blog post around this issue here: http://www.sharepointrelated.com/2015/12/11/cloud-hybrid-search-proxy-settings/
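The linked post has the exact settings. As a rough sketch only, the check below assumes the proxy is declared with the standard .NET `<system.net>` / `<defaultProxy>` / `<proxy proxyaddress="...">` elements and reads the address back from a config file's contents. The function name, the sample XML, and the proxy address are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def find_proxy_address(config_xml: str) -> Optional[str]:
    """Return the proxyaddress set under a defaultProxy element, or None."""
    root = ET.fromstring(config_xml)
    # Look at every <defaultProxy> element and read its direct <proxy> child
    for default_proxy in root.iter("defaultProxy"):
        proxy = default_proxy.find("proxy")
        if proxy is not None and proxy.get("proxyaddress"):
            return proxy.get("proxyaddress")
    return None


# Hypothetical fragment; a real machine.config contains many more sections
SAMPLE_CONFIG = """<configuration>
  <system.net>
    <defaultProxy enabled="true">
      <proxy proxyaddress="http://proxy.example.local:8080" bypassonlocal="true" />
    </defaultProxy>
  </system.net>
</configuration>"""

print(find_proxy_address(SAMPLE_CONFIG))
```

If the function returns None for the contents of your own machine.config, no proxy address is declared there in this form.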
