Partially indexed items in eDiscovery

eDiscovery searches automatically include partially indexed items in the estimated search results when the setting Include partially indexed items is selected. Partially indexed items are Exchange mailbox items and documents on SharePoint and OneDrive sites that weren't completely indexed for search. In Exchange, a partially indexed item typically contains a file (of a file type that can't be indexed) attached to an email message.

The following are other reasons why items can't be indexed for search and are returned as partially indexed items:

  • The file type is unrecognized or unsupported for indexing.
  • Messages have an attached file that can't be opened; this is the most common cause of partially indexed email items.
  • The file type is supported for indexing but an indexing error occurred for a specific file.
  • Too many files are attached to an email message.
  • A file attached to an email message is too large.
  • A file is encrypted with non-Microsoft technologies.
  • A file is password-protected.

Note

Most organizations have less than 1% of content by volume and less than 12% by size that is partially indexed. The reason for the difference between volume and size is that larger files have a higher probability of containing content that can't be completely indexed.

For legal investigations, your organization might be required to review partially indexed items. You can specify whether to include partially indexed items when you export search results or when you add search results to a review set.

Tip

If you're not an E5 customer, use the 90-day Microsoft Purview solutions trial to explore how additional Purview capabilities can help your organization manage data security and compliance needs. Start now at the Microsoft Purview trials hub. Learn details about signing up and trial terms.

Certain types of files, such as Bitmap (.bmp) or MP3 (.mp3) files, don't contain content that can be indexed. The search indexing servers in Exchange and SharePoint don't perform full-text indexing on these types of files, so they're considered unsupported file types. There are also file types for which full-text indexing is disabled, either by default or by an administrator. Unsupported and disabled file types are labeled as unindexed items in searches.

For a list of supported and disabled file formats, see the following articles:

Messages and documents with partially indexed file types returned in search results

Not every email message with a partially indexed file attachment or every partially indexed SharePoint document is automatically returned as a partially indexed item. That's because other message or document properties, such as the Subject property in email messages and the Title or Author properties for documents, are indexed and available to be searched. For example, a keyword search for "financial" returns items with a partially indexed file attachment as regular search results if that keyword appears in the subject of an email message or in the file name or title of a document. However, if the keyword appears only in the body of the file, the message or document is returned as a partially indexed item.

Similarly, messages with partially indexed file attachments and documents of a partially indexed file type are included in search results when other message or document properties, which are indexed and searchable, match the search criteria. Message properties that are indexed for search include sent and received dates, sender and recipient, the file name of an attachment, and text in the message body. Document properties indexed for search include created and modified dates. So even though a message attachment might be a partially indexed item, the message is included in the regular search results if the value of other message or document properties matches the search criteria.
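The behavior described above can be sketched as a small decision function. This is an illustrative model only, not an eDiscovery API: the `Item` class, its fields, and the `classify` function are hypothetical names introduced here, and the substring match is a simplified stand-in for the real query engine.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    # Hypothetical model: indexed, searchable properties such as the
    # subject, title, or attachment file name.
    indexed_properties: dict = field(default_factory=dict)
    # True when some content (for example, an attachment) couldn't be indexed.
    has_unindexed_content: bool = False


def classify(item: Item, keyword: str) -> str:
    """Return how the item would surface for a keyword search (simplified)."""
    needle = keyword.lower()
    # A match on any indexed property puts the item in the regular results.
    if any(needle in str(value).lower() for value in item.indexed_properties.values()):
        return "search result"
    # No indexed match: content that couldn't be indexed can't be ruled out,
    # so the item surfaces as a partially indexed item.
    if item.has_unindexed_content:
        return "partially indexed item"
    return "not returned"
```

For example, under this model a message with the subject "Financial report" and an unindexable attachment is classified as a regular search result for the keyword "financial", while the same message with the subject "Q3 notes" is classified as a partially indexed item.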

For a list of email and document properties that you can search for by using eDiscovery tools in the Microsoft Purview portal, see Keyword queries and search conditions for eDiscovery.

Note

If a mailbox item is moved from a folder that is indexed to a folder that isn't indexed, a flag is set to unindex the item and the item is removed from the index and isn't searchable. Later, if that same item is moved back to a folder that is indexed, the flag isn't reset. That means the item remains unindexed, and not searchable.
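The one-way nature of this flag can be illustrated with a minimal sketch. The `MailboxItem` class and its members are hypothetical names used only to model the behavior described in this note; they don't correspond to any Exchange API.

```python
class MailboxItem:
    """Hypothetical model of the unindex flag described in the note."""

    def __init__(self) -> None:
        self.unindex_flag = False

    def move_to(self, folder_is_indexed: bool) -> None:
        # Moving to a folder that isn't indexed sets the flag.
        # Moving back to an indexed folder does not reset it.
        if not folder_is_indexed:
            self.unindex_flag = True

    @property
    def searchable(self) -> bool:
        return not self.unindex_flag
```

In this model, an item that is moved to a folder that isn't indexed and then moved back reports `searchable` as `False`, which mirrors the behavior described above.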

Partially indexed items included in search results

Your organization might be required to identify and perform additional analysis on partially indexed items to determine what they are, what they contain, and whether they're relevant to a specific investigation. If Include partially indexed items is selected, the partially indexed items in the content locations that are searched are automatically included with the estimated search results. Depending on the specific setting selected, you can choose whether to include partially indexed items in locations with indexed search hits, in locations without indexed search hits, or both. You can also include these partially indexed items when you export search results or add items to review sets.

Keep the following in mind about partially indexed items:

  • When you run an eDiscovery search, the total number and size of partially indexed Exchange items (returned by the search query) are displayed in the search statistics view, and labeled as partially indexed items. Statistics about partially indexed items displayed don't include partially indexed items in SharePoint sites or OneDrive accounts.

  • If the search that you're exporting results from was a search of specific content locations or all content locations in your organization, only the unindexed items from content locations that contain items that match the search criteria are exported. In other words, if no search results are found in a mailbox or site, then any unindexed items in that mailbox or site aren't exported. The reason for this is that exporting partially indexed items from many locations in the organization might increase the likelihood of export errors and increase the time it takes to export and download the search results.

    To export partially indexed items from all content locations for a search, configure the search to return all items (by removing any keywords from the search query) and then export only partially indexed items when you export the search results (by selecting partially indexed items under Select items to include in your export options).

  • If you choose to include all mailbox items in the search results, or if a search query doesn't specify any keywords or only specifies a date range, partially indexed items might not be copied to the PST file that contains the partially indexed items. This is because all items, including any partially indexed items, are automatically included in the regular search results.

  • Partially indexed items aren't available to be previewed. You have to export the search results to view partially indexed items returned by the search.

    Additionally, when you export search results and include partially indexed items in the export, partially indexed SharePoint items are exported to a folder named Uncrawlable. Partially indexed Exchange items are exported differently depending on whether they matched the search query and on the configuration of the export settings.

  • The following table shows the export behavior of indexed and partially indexed items and whether or not each is included for the different export configuration settings.

    | Export configuration | Indexed items that match search query | Partially indexed items that match search query | Partially indexed items that don't match search query |
    |---|---|---|---|
    | Export only indexed items | Exported | Exported (included with the indexed items exported) | Not exported |
    | Export only partially indexed items | Not exported | Exported (as partially indexed items) | Exported (as partially indexed items) |
    | Export indexed and partially indexed items | Exported | Exported (included with the indexed items exported) | Exported (as partially indexed items) |
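The export behavior in the preceding table can be expressed as a lookup, which may help when reasoning about which items to expect in an export. This is a sketch of the table only: `ExportConfig`, `EXPORT_TABLE`, and `export_outcome` are hypothetical names, and the code doesn't call any eDiscovery API.

```python
from enum import Enum
from typing import Optional


class ExportConfig(Enum):
    INDEXED_ONLY = "Export only indexed items"
    PARTIALLY_INDEXED_ONLY = "Export only partially indexed items"
    BOTH = "Export indexed and partially indexed items"


# Keys: (export configuration, item is partially indexed, item matches the query).
# Values: how the item is exported, or None when it isn't exported.
EXPORT_TABLE = {
    (ExportConfig.INDEXED_ONLY, False, True): "indexed",
    (ExportConfig.INDEXED_ONLY, True, True): "indexed",  # included with indexed items
    (ExportConfig.INDEXED_ONLY, True, False): None,
    (ExportConfig.PARTIALLY_INDEXED_ONLY, False, True): None,
    (ExportConfig.PARTIALLY_INDEXED_ONLY, True, True): "partially indexed",
    (ExportConfig.PARTIALLY_INDEXED_ONLY, True, False): "partially indexed",
    (ExportConfig.BOTH, False, True): "indexed",
    (ExportConfig.BOTH, True, True): "indexed",  # included with indexed items
    (ExportConfig.BOTH, True, False): "partially indexed",
}


def export_outcome(config: ExportConfig, partially_indexed: bool,
                   matches_query: bool) -> Optional[str]:
    """Return how an item is exported under the table, or None if it isn't."""
    # Fully indexed items that don't match the query aren't in the table
    # and are never exported, so they fall through to None.
    return EXPORT_TABLE.get((config, partially_indexed, matches_query))
```

For example, `export_outcome(ExportConfig.BOTH, True, False)` returns `"partially indexed"`, which corresponds to the last cell of the table.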

Date ranges and excluding partially indexed items

In eDiscovery searches, you can't use a date range to exclude partially indexed items from results in a search query. Partially indexed items that fall outside of a date range are included as partially indexed items in the search statistics and when you export partially indexed items. In eDiscovery with premium feature support, partially indexed items can be added to a review set and then filtered in the review set before export. Alternatively, use the advanced indexing capability (a premium eDiscovery feature) to reindex these partially indexed items so that they can be compared against the specified date range, which helps avoid exporting a large volume of data.

To learn more about indexing limits for email messages, see Limits in eDiscovery.

More information about partially indexed items

  • Because message and document properties and their metadata are indexed, a keyword search might return results if that keyword appears in the indexed metadata. However, that same keyword search might not return the same item if the keyword only appears in the content of an item with an unsupported file type. In this case, the item would be returned as a partially indexed item.
  • If a partially indexed item is included in the search results because it matched the search query criteria, it isn't included with partially indexed items when you export search results.
  • Although a file type is supported for indexing and is indexed, there can be indexing or search errors that cause a file to be returned as a partially indexed item. For example, searching a large Excel file might be partially successful (because the first 4 MB are indexed), but then fail because the file size limit is exceeded. In this case, it's possible that the same file is returned both with the search results and as a partially indexed item.
  • Files that are encrypted with Microsoft encryption technologies and are attached to an email message that matches the criteria of a search can be previewed and are decrypted when exported. At this time, files that are encrypted with Microsoft encryption technologies (and stored in SharePoint or OneDrive) are partially indexed.
  • Email messages encrypted with S/MIME are partially indexed. This includes encrypted messages with or without file attachments.
  • Email messages protected using Azure Rights Management are indexed and are included in the search results if they match the search query. Rights-protected email messages are decrypted and can be previewed and exported. This functionality requires that you're assigned the RMS Decrypt role, which is assigned by default to the eDiscovery Manager role group.
  • We don't recommend using query-based holds to address encrypted or partially indexed items. Query-based holds with queries that include conditions beyond dates, participants, or item types (such as keywords or paths) might not apply to these items, so there's a risk that the hold won't be applied as intended. To ensure coverage, we recommend limiting query-based hold conditions to date ranges, participants, and item types only, or applying a location-based hold. For more information, see Create an eDiscovery hold.