Answer
GFI Archiver normally stores only a single copy of an email, even if the same email is submitted for archiving again after it has already been archived. The methods used for Single Instance Storage (SIS) were improved in version 2012 compared with older versions.
The list below outlines a regular scenario in which SIS prevents a duplicate from being created:
- SenderZ sends an email to RecipientA and RecipientB
- Microsoft Exchange creates a copy of this email in the journal mailbox
- GFI Archiver downloads the email from the journal mailbox
- GFI Archiver stores the email once into an Archive Store and assigns ownership to SenderZ, RecipientA and RecipientB (at this point, a hash value which was generated based on certain data of the email is also stored in the Archive Store - this is the SIS hash value)
- Let's assume the administrator uses the Import Export Tool and imports the very same email from the Exchange mailbox of RecipientA into GFI Archiver
- GFI Archiver (more precisely, the GFI Archiver Store service) will calculate the SIS hash for the copy of the mail which is currently being processed
- GFI Archiver will query the Archive Store to see if an email with the same SIS hash already exists
- GFI Archiver will find the SIS hash value, and therefore the email, in the Archive Store; it will not store the email a second time, which prevents a duplicate from being created
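The steps above can be sketched in a few lines of Python. This is only an illustration of the hash-then-lookup idea: the function `sis_hash`, the class `ArchiveStore`, the choice of SHA-256, and the fields fed into the hash are all assumptions made for this example and do not describe how GFI Archiver is actually implemented.

```python
import hashlib


def sis_hash(sender, recipients, subject, body):
    # Hypothetical hash inputs: the article only says the hash is based on
    # "certain data of the email", so these fields are an assumption.
    canonical = "\n".join([sender, ",".join(sorted(recipients)), subject, body])
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ArchiveStore:
    """Toy stand-in for an Archive Store, keyed by SIS hash."""

    def __init__(self):
        self.emails = {}  # SIS hash -> stored email record

    def archive(self, sender, recipients, subject, body):
        key = sis_hash(sender, recipients, subject, body)
        if key in self.emails:
            # Same SIS hash already present: do not store a second copy.
            return False
        self.emails[key] = {"owners": {sender, *recipients}, "subject": subject}
        return True


store = ArchiveStore()
# First pass: the email arrives via the journal mailbox.
first = store.archive("SenderZ", ["RecipientA", "RecipientB"], "Hello", "Body text")
# Second pass: the same email is imported again from a mailbox.
second = store.archive("SenderZ", ["RecipientA", "RecipientB"], "Hello", "Body text")
```

In this sketch the second call returns `False` and the store still holds one record, which mirrors the outcome of the scenario.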
However, there are scenarios in which a duplicate can still be created:
a) If an email that was sent to a distribution list (DL) is processed a second time and the members of the DL have changed since the email was originally archived
Scenario outline:
- DistributionListX contains 2 members: RecipientA and RecipientB
- SenderZ sends an email to DistributionListX
- Microsoft Exchange creates a copy of this email in the journal mailbox
- GFI Archiver downloads the email from the journal mailbox
- GFI Archiver stores the email once into an Archive Store and assigns ownership to SenderZ, RecipientA and RecipientB (at this point, a hash value which was generated based on certain data of the email is also stored in the Archive Store - this is the SIS hash value)
- The membership of DistributionListX changes: RecipientC is added
- Let's assume the administrator uses the Import Export Tool and imports the very same email from the Exchange mailbox of RecipientA into GFI Archiver
- GFI Archiver (more precisely, the GFI Archiver Store service) will calculate the SIS hash for the copy of the mail which is currently being processed
- GFI Archiver will query the Archive Store to see if an email with the same SIS hash already exists
- GFI Archiver will not find the same SIS hash value, and therefore will not find the email, because the recipients (based on the DistributionListX members in this case) are part of the SIS hash calculation
- GFI Archiver stores the email as a duplicate into an Archive Store and assigns ownership to SenderZ, RecipientA, RecipientB and RecipientC
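To illustrate why a changed distribution list leads to a different SIS hash, the following sketch hashes the same message twice with different expanded recipient lists. The function `sis_hash_with_recipients`, the use of SHA-256, and the exact hash inputs are assumptions for the example; the article only states that the recipients are part of the calculation.

```python
import hashlib


def sis_hash_with_recipients(sender, expanded_recipients, subject, body):
    # Hypothetical hash: includes the recipients as they are expanded
    # from the distribution list at processing time.
    canonical = "\n".join(
        [sender, ",".join(sorted(expanded_recipients)), subject, body]
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Membership of DistributionListX when the email was journaled.
members_at_journal = ["RecipientA", "RecipientB"]
# Membership when the same email is imported later.
members_at_import = ["RecipientA", "RecipientB", "RecipientC"]

original_hash = sis_hash_with_recipients(
    "SenderZ", members_at_journal, "Hello", "Body text"
)
reimport_hash = sis_hash_with_recipients(
    "SenderZ", members_at_import, "Hello", "Body text"
)

# The lookup by SIS hash would miss, so the email is stored again.
hashes_match = original_hash == reimport_hash
```

Because the recipient list differs, `hashes_match` is `False` even though sender, subject, and body are unchanged.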
This scenario is very similar to a) as it causes the SIS hash value to differ when comparing the SIS hash of the original email with that of the edited one.
c) If an email is processed a second time, but the Archive Store which contains the original email is configured with the option "Read-only access" or "Do not archive more emails in this Archive Store"
Scenario outline:
- SenderZ sends an email to RecipientA and RecipientB
- Microsoft Exchange creates a copy of this email in the journal mailbox
- GFI Archiver downloads the email from the journal mailbox
- GFI Archiver stores the email once into the Archive Store [2014 Jan] and assigns ownership to SenderZ, RecipientA and RecipientB (at this point, a hash value which was generated based on certain data of the email is also stored in the Archive Store - this is the SIS hash value)
- Later in the year, the administrator enables the option "Do not archive more emails in this Archive Store" for the Archive Store [2014 Jan]
- Let's assume the administrator uses the Import Export Tool and imports the very same email from a PST file choosing ImportUserC as the owner
- GFI Archiver (more precisely, the GFI Archiver Store service) will calculate the SIS hash for the copy of the mail which is currently being processed
- GFI Archiver will query the Archive Store to see if an email with the same SIS hash already exists
- GFI Archiver will find the SIS hash value, and therefore the email, in the Archive Store [2014 Jan], but it cannot assign the additional owner ImportUserC because the Archive Store [2014 Jan] has the option "Do not archive more emails in this Archive Store" enabled
- GFI Archiver will create a new Archive Store [2014 Jan 2], store the email in it, and assign ownership to SenderZ, RecipientA, RecipientB and ImportUserC (effectively creating a duplicate across multiple Archive Stores)
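Scenario c) can be modeled roughly as follows. The `Store` class, the `archive_email` function, and the `read_only` flag are hypothetical names chosen for this sketch, and the placeholder hash string stands in for a real SIS hash; none of this reflects the product's internal API.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    name: str
    read_only: bool = False  # models "Do not archive more emails in this Archive Store"
    emails: dict = field(default_factory=dict)  # SIS hash -> set of owners


def archive_email(stores, email_hash, owners, new_store_name):
    owners = set(owners)
    for store in stores:
        if email_hash in store.emails and not store.read_only:
            # Existing copy in a writable store: only add the new owners.
            store.emails[email_hash] |= owners
            return store
    # Either no copy exists, or the only copy sits in a closed store that
    # cannot be updated. A new store receives a (possibly duplicate) copy.
    new_store = Store(name=new_store_name)
    new_store.emails[email_hash] = owners
    stores.append(new_store)
    return new_store


closed = Store(
    name="2014 Jan",
    read_only=True,
    emails={"sis-hash-1": {"SenderZ", "RecipientA", "RecipientB"}},
)
stores = [closed]

# Import from a PST file with ImportUserC chosen as the owner.
target = archive_email(
    stores,
    "sis-hash-1",
    ["SenderZ", "RecipientA", "RecipientB", "ImportUserC"],
    "2014 Jan 2",
)
copies = sum(1 for s in stores if "sis-hash-1" in s.emails)
```

Here `target.name` is `"2014 Jan 2"` and `copies` is 2, so the same email now exists in two Archive Stores, while the closed store keeps its original three owners.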