SharePoint and Content Deduplication

Let me start off by saying that in this short posting I don’t intend to provide or discuss any solutions that specifically address de-duplication of content in SharePoint. Rather, I want to explore the idea that duplication of content is often mistaken for a problem it’s not, and that while de-duplication may seem like the most logical solution, it may really cause more problems than it solves.

I believe that when the topic of duplicate content comes up, it generally revolves around content discovery, primarily the impact that duplicate content has on search results. While duplicate content is quickly pegged as the culprit, it’s really more of a discoverability issue around authoritative content. Removing duplicate content may not really be the solution; a better approach may be to surface authoritative content more effectively by reconfiguring the search results page(s), fine-tuning your search configuration, and doing a better job of leveraging refiners and scopes.

As a user who sometimes copies documents and presentations from other areas into my own collaboration or personal sites, I wouldn’t be in favor of a solution that automatically removes my copies. I believe that in trying to remove duplicate content you’ll quickly find many such users, and you may ultimately find yourself trying to change too much about how your users get work done.

Of course, discoverability may not be the issue you are trying to solve with de-duplication, in which case it may ultimately have to do with storage. If so, it may point to a bigger issue around information architecture, lifecycle management, and content expiration. But I’ll leave those topics for another time in the interest of keeping this posting short.



