Recovery of files lost at NIKHEF
We received a list (atlas_md5sums - 56,2 MB) of lost files that were damaged by malfunctioning hardware on matrix.sara.nl. The list also included the calculated md5 checksum of each file.
The original file was checked for unexpected input using the check_weird_lines script.
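A minimal sketch of such an input check follows. It is not the check_weird_lines script itself; the line format ("<md5> <file name>", separated by whitespace) and the function name split_weird_lines are assumptions made only for illustration.

```python
import re

# Assumed line format (not confirmed by the source): "<md5> <file name>"
MD5_RE = re.compile(r"^[0-9a-f]{32}$")


def split_weird_lines(lines):
    """Separate well-formed lines from unexpected ones.

    Returns (good, weird): good is a list of (file name, md5) tuples,
    weird is a list of the raw lines that did not match the assumed format.
    """
    good, weird = [], []
    for raw in lines:
        parts = raw.split()
        if len(parts) == 2 and MD5_RE.match(parts[0].lower()):
            good.append((parts[1], parts[0].lower()))
        else:
            weird.append(raw.rstrip("\n"))
    return good, weird
```

Lines that fail the check are kept rather than dropped, so they can be counted and inspected later.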
The file with the correct input data was then passed to a slightly modified version of David Cameron's checkforfiles script. The output of this script needs to be stored (e.g. by running it with the nohup command). This gave two files: the file "results", which contains the files that weren't found, and the actual output, which lists the files that were found in some LFC catalogue.
These results were parsed with two scripts: parse_output, which splits the list of found files into separate files depending on the LFC in which they were found, and parse_results, which strips the "results" file from the checkforfiles script so that it contains only the LFNs of the files that weren't found (just to keep better control over the file counts).
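The grouping step done by parse_output can be sketched roughly as follows. The real output format of checkforfiles is not shown here, so the sketch assumes one "<lfn> <lfc host>" pair per line; the function name group_by_lfc is hypothetical.

```python
from collections import defaultdict


def group_by_lfc(lines):
    """Group found LFNs by the catalogue they were found in.

    Assumes each line looks like "<lfn> <lfc host>" (hypothetical format);
    lines that do not have exactly two fields are skipped.
    """
    groups = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        lfn, lfc = parts
        groups[lfc].append(lfn)
    return dict(groups)
```

Each key of the returned dictionary would then be written to its own file, one list per LFC.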
Then I processed each file created by parse_output with the LFN_from_filename_with_md5 script, which merges a given list of files with the original list of files to obtain the md5 sum for each file in the list (this script requires the toNativeLFN script).
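The merge amounts to a join between the per-LFC list and the original list. The sketch below assumes the join key is the file name at the end of the LFN and that the original list has already been loaded into a dictionary; both are assumptions, and attach_md5 is a hypothetical name, not the LFN_from_filename_with_md5 script.

```python
import posixpath


def attach_md5(lfns, original):
    """Look up the md5 of each LFN in the original list.

    original: dict mapping file name -> md5 (assumed join key: the base
    name of the LFN). Returns (merged, unmatched), where merged is a list
    of (lfn, md5) and unmatched lists the LFNs with no md5 in the original.
    """
    merged, unmatched = [], []
    for lfn in lfns:
        name = posixpath.basename(lfn)
        md5 = original.get(name)
        if md5 is None:
            unmatched.append(lfn)
        else:
            merged.append((lfn, md5))
    return merged, unmatched
```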
Then I had to run my md5check script for every LFC separately, in my case 9 times. As input it takes a list of files (one file name per line, along with its md5) and the LFC catalogue that is to be searched.
As output it gives four lists: files whose md5 in the given LFC is the same as in the input, files whose md5 is different, files whose md5 is missing from the LFC entirely, and files that couldn't be found in the LFC (which signals that something went wrong).
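The four-way classification can be sketched as below. This is not the md5check script: the catalogue query is replaced by a plain dictionary (LFN to stored md5, with None or an empty string when no checksum is stored), and the names classify, same, different, missing_md5 and not_found are chosen for illustration only.

```python
def classify(entries, catalogue):
    """Sort (lfn, md5) entries into the four output lists.

    catalogue: dict lfn -> stored md5 (None or "" if no checksum is stored).
    It stands in for a real LFC lookup, which is not reproduced here.
    """
    result = {"same": [], "different": [], "missing_md5": [], "not_found": []}
    for lfn, md5 in entries:
        if lfn not in catalogue:
            result["not_found"].append(lfn)
            continue
        stored = catalogue[lfn]
        if not stored:
            result["missing_md5"].append(lfn)
        elif stored.lower() == md5.lower():
            result["same"].append(lfn)
        else:
            result["different"].append(lfn)
    return result
```

Checksums are compared case-insensitively, since hexadecimal md5 strings may be stored in either case.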
Then I merged the results from all the LFCs (using the cat command) and ran the parse_md5check_results script, which checks for duplicates and for files that have different md5 sums stored in different catalogues.
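A rough sketch of the duplicate and collision check on the merged results is given below. The record layout (lfn, lfc, stored md5) and the function name find_duplicates_and_collisions are assumptions; the actual format read by parse_md5check_results may differ.

```python
from collections import defaultdict


def find_duplicates_and_collisions(records):
    """Inspect merged per-LFC results.

    records: iterable of (lfn, lfc, stored_md5) tuples (assumed layout).
    Returns (duplicates, collisions): LFNs that appear more than once, and
    LFNs for which different catalogues store different md5 sums.
    """
    counts = defaultdict(int)
    md5s_by_lfn = defaultdict(set)
    for lfn, _lfc, md5 in records:
        counts[lfn] += 1
        if md5:
            md5s_by_lfn[lfn].add(md5.lower())
    duplicates = sorted(lfn for lfn, n in counts.items() if n > 1)
    collisions = sorted(lfn for lfn, s in md5s_by_lfn.items() if len(s) > 1)
    return duplicates, collisions
```

Every collision is also a duplicate, but not the other way round: a file registered in two catalogues with the same md5 is only a duplicate.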
These are the results:
- correct files list (22,2 MB) - list of 99434 files which were unharmed
- corrupted files list (9,3 MB) - list of 34188 files which were damaged at NIKHEF
- list of files missing their md5 (1,3 MB) - list of 10959 files whose md5 couldn't be retrieved from any of the LFCs
- bad input list (3,1 MB) - list of 32309 files which were provided to us without md5 and probably weren't even found on the NIKHEF disks. I'll try to find them elsewhere, just to be sure.
- not found list (2 KB) - 22 files which weren't found in any LFC or LRC even though they were supplied to us with their md5 sum
- LFC collision list (3 KB) - list of 19 files which had different md5 sums in two different LFCs. In each case one of the sums matched the one we received with the original file list, so the file appeared to be OK, while the other sum was different. This bug(?) will be reported.