Versions / Builds Affected
20131111
Status
Resolved
Problem Summary
MArc.Search.exe crashes due to a high number of failed messages
TT / JIRA ID
1857
How to Identify
IMPORTANT:
CRASHES CAN OCCUR FOR DIFFERENT REASONS - MAKE SURE TO FOLLOW THESE STEPS IN FULL TO PROPERLY IDENTIFY THIS KNOWN ISSUE
DO NOT JUST ESCALATE / ATTACH THIS KNOWN ISSUE FOR "ANY" CRASH YOU SEE
ALL 3 POINTS MUST BE VERIFIED IN ORDER TO CONFIRM THIS KNOWN ISSUE
IT IS DIFFICULT TO VERIFY IT COMPLETELY WITH TS FILES ALONE
NORMALLY A REMOTE SESSION IS NEEDED TO VERIFY IT PROPERLY
1. The Application event log lists the crash as follows:
Type: Error
Source: Application Error
Event ID: 1000
Description:
Faulting application name: MArc.Search.exe, version: 20130.1112.92.14, time stamp: 0x5282129b
Faulting module name: ntdll.dll, version: 6.1.7601.18247, time stamp: 0x521eaf24
Exception code: 0xc00000fd
Fault offset: 0x0000000000026062
Faulting process id: 0xa94
Faulting application start time: 0x01cf298e9a05fd8c
Faulting application path: C:\Program Files (x86)\GFI\MailArchiver\Search\bin\MArc.Search.exe
Faulting module path: C:\Windows\SYSTEM32\ntdll.dll
2. Search\DebugLogs\Index.log ends with the following message: "Process failed messages. Total: 17166" (where 17166, the failed message count in this example, is a rather high number)
...
2014-02-14,14:11:28,119,1,"#00000A94","#0000000A","info ","Index","Validate: [2008-1201]"
2014-02-14,14:11:28,150,1,"#00000A94","#0000000A","info ","Index","Validate: [2012-0203]"
2014-02-14,14:11:28,181,1,"#00000A94","#0000000A","info ","Index","Validate: [2011-1201]"
2014-02-14,14:11:28,306,1,"#00000A94","#0000000A","info ","Index","Process failed messages. Total: 17166"
Notes:
Make sure that the timestamps of the last line in Index.log and of the crash entry in the event log match (there should be no more than 1-2 minutes' difference between them).
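The check in the note above can be sketched in Python. This is only an illustration: the function names are hypothetical, the field layout (date, time, ..., process id in the fifth field, message in the ninth) is inferred from the sample log lines shown here, and the crash time has to be read from the event log entry by hand.

```python
import csv
import re
from datetime import datetime, timedelta

def parse_failed_messages_line(line):
    """Parse a 'Process failed messages' line from Index.log.

    Returns (timestamp, process_id, failed_total), or None if the line
    is some other kind of log entry. Field positions are an assumption
    based on the sample lines in this article.
    """
    fields = next(csv.reader([line]))
    if len(fields) < 9:
        return None
    match = re.search(r"Process failed messages\. Total: (\d+)", fields[8])
    if not match:
        return None
    timestamp = datetime.strptime(fields[0] + " " + fields[1], "%Y-%m-%d %H:%M:%S")
    process_id = int(fields[4].lstrip("#"), 16)  # "#00000A94" -> 0xA94
    return timestamp, process_id, int(match.group(1))

def matches_crash(log_time, crash_time, tolerance=timedelta(minutes=2)):
    """True if the log line and the crash event are within the tolerance."""
    return abs(crash_time - log_time) <= tolerance
```

In the sample data the process id in the log line (#00000A94) is the same value as the faulting process id in the event (0xa94), which can serve as an additional cross-check.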
3. Check for failedMessages.xml files which contain a high number of entries
Search all index folders for files named "failedMessages.xml"
- E.g. run the following command against the root folder which holds the index folders: dir /o /s e:\indexes > index-dir-listing.txt
- Open the output file index-dir-listing.txt in Notepad++
+ Search for all lines which contain the filename: failedMessages.xml
- If a file is (much) larger than 157 bytes, open that failedMessages.xml file in Notepad and count its rows
If the number of rows is high (say, more than 1000), you have found a "problematic index"
Notes:
- There can be more than one problematic index per installation. Make sure to identify all of them!
- If the indexes are located across different disks or folders, make sure to check each of them! The example above simply assumes that they are all stored under a single root index folder.
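As an alternative to the manual directory listing, step 3 can be sketched as a small Python script. The function name and constants are hypothetical. The 157-byte baseline and the 1000-row mark are taken from the steps above, and counting lines assumes the file holds roughly one entry per row, as when counting rows in Notepad.

```python
import os

BASELINE_SIZE = 157    # bytes; files at or below this size are skipped
ROW_THRESHOLD = 1000   # rough "high" mark used in this article

def find_problematic_indexes(index_roots, row_threshold=ROW_THRESHOLD):
    """Return (index_folder, row_count) for each failedMessages.xml
    with more rows than row_threshold, across all given root folders."""
    problematic = []
    for root in index_roots:
        for folder, _dirs, files in os.walk(root):
            for name in files:
                if name.lower() != "failedmessages.xml":
                    continue
                path = os.path.join(folder, name)
                if os.path.getsize(path) <= BASELINE_SIZE:
                    continue
                # Count rows in binary mode to avoid encoding issues.
                with open(path, "rb") as handle:
                    rows = sum(1 for _ in handle)
                if rows > row_threshold:
                    problematic.append((folder, rows))
    return problematic
```

Pass every root that holds index folders, for example find_problematic_indexes([r"e:\indexes"]), and add further roots if the indexes are spread across several disks.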
SIDE EFFECT
As the Search service is crashing, the web interface does not load
Workaround / Fix Details
Fixed in MARC2014 build 20140616
DETAILED INFORMATION ABOUT THE CHANGES IMPLEMENTED - THIS IS MEANT FOR TECHNICAL SUPPORT AGENTS AND NORMALLY NOT NEEDED TO BE COMMUNICATED TO THE CUSTOMER:
The patch fixes the problem causing the crash. In addition, it marks the archive store as "invalid" once the failed messages have been processed at least once (that is, after the service has been restarted at least once following the failure) if either of these conditions holds: the number of failed messages is greater than 10% of the archived messages (this check applies only if the index is not smaller than 5000), or the number of failed messages is greater than "MaximumFailedMessages". MaximumFailedMessages is configurable in product.config and defaults to 10000. As a formula:
((failedMessages > 0.1 * indexSize) AND (indexSize > 5000)) OR (failedMessages > MaximumFailedMessages)
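For agents who want to check a concrete case, the formula can be written out as a small Python function. This is a sketch of the condition as stated above, not the product's actual code, and the function and parameter names are hypothetical.

```python
def store_marked_invalid(failed_messages, index_size, maximum_failed_messages=10000):
    """Evaluate the 'invalid' condition from the formula above."""
    # More than 10% failed, but only counted for indexes above 5000.
    exceeds_ratio = failed_messages > 0.1 * index_size and index_size > 5000
    # Absolute cap (MaximumFailedMessages, default 10000).
    exceeds_cap = failed_messages > maximum_failed_messages
    return exceeds_ratio or exceeds_cap
```

With the default cap, the 17166 failed messages from the example log exceed MaximumFailedMessages regardless of the index size, so such a store would be marked invalid.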
-----
PATCH FOR 20131111:
http://ftp.gfisoftware.com/patches/MARC2013/20131111/MARC2013_R2_PATCH_20140422_1857.zip
-----
WORKAROUND:
Goal:
Get MARC into a state in which the website loads again. The problematic index(es) will not be available for searching.
High level tasks:
Rename the "problematic index folders" (the ones with the high number of failed messages) and pause indexing for the affected archive stores
Steps (these require that the "problematic indexes" have been identified using the method outlined under How to Identify, step 3):
1. Rename the problematic index folder(s) (e.g. "e:\indexes\2013 Nov" to "e:\indexes\2013 Nov.ContainsHighAmountOfFailedMessages")
2. Start the Search service (it should stay up and running for the time being)
3. Open the web page and navigate to Configuration > Archive Stores
4. Pause indexing for ALL archive stores that had a "problematic index"
Required Actions
Upgrade to MARC2014 build 20140616 or newer