[NAS-2349] List of all files when adding new file to bitarchive is inefficient Created: 23/Jun/14  Updated: 19/Feb/16  Resolved: 19/Feb/16

Status: Resolved
Project: NetarchiveSuite
Component/s: Archive
Affects Version/s: 4.4
Fix Version/s: 5.1

Type: Bug Priority: Major
Reporter: Mikis Seth Sørensen (Inactive) Assignee: Unassigned
Resolution: Fixed  
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates NAS-2467 Inefficient coding in BitarchiveAdmin Resolved
Related
related to NAS-2467 Inefficient coding in BitarchiveAdmin Resolved
External reference:

NARK-506


 Description   

See NARK-506 for details.



 Comments   
Comment by Nicholas Clarke (Inactive) [ 24/Jun/14 ]

http://codingjunkie.net/java-7-copy-move/
Another good reason to upgrade to 7 or 8.

Comment by Thorbjørn Ravn Andersen (Inactive) [ 24/Jun/14 ]

It should only be creating dirs and moving files, and I believe Jens Henrik has enough Unix-fu to be able to script that process instead of doing it by hand.

Comment by Tue Hejlskov Larsen [ 24/Jun/14 ]

And it is not a quick fix for Jens Henrik! He needs to split up >100 drives with 500 TB into new subfolders.

Comment by Thorbjørn Ravn Andersen (Inactive) [ 24/Jun/14 ]

I disagree that the "split up in smaller subdirectories" is a quick and dirty fix. It is a very common technique for working around the fact that most filesystems implement directory operations linearly, meaning that things get slower as the number of files in a directory grows.
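
A minimal sketch of the subdirectory technique, assuming files are spread over 256 buckets chosen from a hash of the file name. The class name `ShardedLayout`, the bucket count, and the choice of `String.hashCode()` are illustrative assumptions, not anything taken from NetarchiveSuite.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: spreads files over 256 subdirectories by file-name hash. */
final class ShardedLayout {
    private ShardedLayout() {}

    /** Returns a two-hex-digit bucket name ("00" to "ff") derived from the file name. */
    static String bucketFor(String fileName) {
        int bucket = fileName.hashCode() & 0xff; // low 8 bits, always in 0..255
        return String.format("%02x", bucket);
    }

    /** Moves a file from a flat directory into its bucket directly below that directory. */
    static Path moveIntoBucket(Path file) throws IOException {
        Path absolute = file.toAbsolutePath();
        Path bucketDir = absolute.getParent()
                .resolve(bucketFor(absolute.getFileName().toString()));
        Files.createDirectories(bucketDir); // no-op if the bucket already exists
        return Files.move(absolute, bucketDir.resolve(absolute.getFileName()));
    }
}
```

A lookup has to compute the same bucket from the file name before opening the file, so the bucket function must stay stable once files have been moved.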

Comment by Tue Hejlskov Larsen [ 24/Jun/14 ]

The test system uses the same storage backend according to Jens Henrik.
So we only need to reproduce the problem.

Comment by Mikis Seth Sørensen (Inactive) [ 24/Jun/14 ]

A way forward could be:

  1. Get the test system to use the same storage backend as the prod system.
  2. Reproduce the problem with lots and lots of empty files.
  3. Attempt a fix; here the Watcher service sounds like a pretty solution IMO. Alternatively, a scheduled consistency check.
    This is the longer-term fix.
    In the short term, splitting the files into smaller subdirectories sounds like a quick and dirty fix.
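
Step 2 above could be approached with something like the following sketch, which fills a directory with empty files and times a full listing. The class name `ListingBenchmark`, the file naming, and the use of `DirectoryStream` are assumptions made for illustration; the real code path in BitarchiveAdmin may list files differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for reproducing slow listings with many empty files. */
final class ListingBenchmark {
    private ListingBenchmark() {}

    /** Creates the directory if needed and fills it with {@code count} empty files. */
    static void createEmptyFiles(Path dir, int count) {
        try {
            Files.createDirectories(dir);
            for (int i = 0; i < count; i++) {
                Files.createFile(dir.resolve("file-" + i + ".arc"));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Walks the whole directory once and returns the number of entries seen. */
    static int countEntries(Path dir) {
        int entries = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path ignored : stream) {
                entries++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    /** Returns the wall-clock time of one full listing, in nanoseconds. */
    static long timeListingNanos(Path dir) {
        long start = System.nanoTime();
        countEntries(dir);
        return System.nanoTime() - start;
    }
}
```

Running `timeListingNanos` on the test system's storage backend for increasing file counts would show whether listing time grows with directory size there.
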
Comment by Nicholas Clarke (Inactive) [ 24/Jun/14 ]

Well it could be an experiment to see if the watcher service works nicely over NFS with so many files. If NFS has native watching it should not be a problem.

Comment by Thorbjørn Ravn Andersen (Inactive) [ 24/Jun/14 ]

What would be the most appropriate way to solve this?

Comment by Mikis Seth Sørensen (Inactive) [ 24/Jun/14 ]

Aaah ok, so this functionality exists to ensure consistency between the actual disk data and the cached filelists. We have run into the same problem in the Bitrepository reference pillar, where Jonas had implemented similar functionality, which caused the pillar to slow down to a crawl. This has now been changed so that a scheduled service checks consistency on a regular basis instead of being triggered by external operations. This means that instead of using more and more resources as the load rises, the check can run when the system is idle.
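
As a rough sketch of the scheduled approach (not the actual Bitrepository pillar code), a check can be run on a fixed delay with the JDK's `ScheduledExecutorService`. The class name `ScheduledConsistencyCheck` and the idea of passing the check in as a `Runnable` are assumptions for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper that runs a consistency check on a schedule, not per operation. */
final class ScheduledConsistencyCheck {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable check;

    ScheduledConsistencyCheck(Runnable check) {
        this.check = check;
    }

    /** Starts the check; each run begins {@code delay} after the previous one finished. */
    void start(long initialDelay, long delay, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                check.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so report and carry on.
                System.err.println("Consistency check failed: " + e);
            }
        }, initialDelay, delay, unit);
    }

    /** Stops scheduling further runs and interrupts a run in progress. */
    void stop() {
        scheduler.shutdownNow();
    }
}
```

Using a fixed delay rather than a fixed rate means a slow check never overlaps with the next one, so the cost stays bounded however many files are uploaded in between.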

An even better strategy would of course be to trigger only a small update based on the actual relevant events, e.g. the files on a disk that have changed; that is, use the JDK 1.7+ Watcher service as Nicholas proposes.
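
A minimal sketch of that event-driven update, using the `java.nio.file.WatchService` API available since Java 7. The class name `WatchedFileList` and its methods are hypothetical, and whether events are delivered reliably on the network-mounted storage discussed in this issue is not something the sketch settles.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cached file list that is updated per event instead of by rescanning. */
final class WatchedFileList {
    private final Set<String> fileNames = ConcurrentHashMap.newKeySet();

    /** Applies a single create or delete event to the cached list. */
    void apply(WatchEvent.Kind<?> kind, String fileName) {
        if (kind == StandardWatchEventKinds.ENTRY_CREATE) {
            fileNames.add(fileName);
        } else if (kind == StandardWatchEventKinds.ENTRY_DELETE) {
            fileNames.remove(fileName);
        }
    }

    boolean contains(String fileName) {
        return fileNames.contains(fileName);
    }

    int size() {
        return fileNames.size();
    }

    /** Blocks and keeps the list in sync until the directory becomes inaccessible. */
    void watch(Path dir) throws IOException, InterruptedException {
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            dir.register(watcher,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_DELETE);
            while (true) {
                WatchKey key = watcher.take(); // waits for the next batch of events
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                        continue; // events were lost; a real implementation would rescan here
                    }
                    apply(event.kind(), event.context().toString());
                }
                if (!key.reset()) {
                    break; // the watched directory is no longer valid
                }
            }
        }
    }
}
```

The `watch` method would typically run on its own thread, with a full scan at startup to seed the list before events are applied.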

Comment by Nicholas Clarke (Inactive) [ 24/Jun/14 ]

Yes, Watcher service is nice.

Comment by Thorbjørn Ravn Andersen (Inactive) [ 24/Jun/14 ]

So this would be a very good reason to move to a Java 8 runtime?

Comment by Nicholas Clarke (Inactive) [ 24/Jun/14 ]

Well, Linux and many files in the same folder are a bad combo even without network-mounted drives.
Indexes should be built when the app is started.
Søren's point is that a rescan is nice for capturing files that are added manually.
If this is the case, the JDK 1.7+ Watcher service is ideal for create, update and delete events on files and folders.
Maybe the sub-folders can be created like a dynamic b+tree?

Comment by Thorbjørn Ravn Andersen (Inactive) [ 24/Jun/14 ]

Sounds like the culprit. Wonder why the Network disk cannot keep up?

Comment by Søren Vejrup Carlsen (Inactive) [ 24/Jun/14 ]

The code related to this is in the class dk.netarkivet.archive.bitarchive.BitarchiveAdmin.
The interesting methods are updateFileList and verifyFilelistUpToDate(), which calls updateFileList.

Furthermore, it turns out that when adding a new file to an archiveDirectory, the list of files for this archiveDirectory is regenerated each time instead of just being modified.
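
To illustrate the difference between regenerating and modifying the list, here is a sketch that is not the actual BitarchiveAdmin code. The class `CachedFileLists` and its method names are hypothetical; only the contrast between a full rescan and a single-entry update is the point.

```java
import java.io.File;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical per-directory cache of file names. */
final class CachedFileLists {
    private final Map<File, Set<String>> filesByDir = new HashMap<>();

    /** Full rescan: lists the whole directory, so cost grows with the number of files. */
    synchronized void rescan(File dir) {
        Set<String> names = new HashSet<>();
        String[] listed = dir.list(); // null if dir is not a readable directory
        if (listed != null) {
            Collections.addAll(names, listed);
        }
        filesByDir.put(dir, names);
    }

    /** Incremental update: records one new file without touching the disk. */
    synchronized void recordAdded(File dir, String fileName) {
        filesByDir.computeIfAbsent(dir, d -> new HashSet<>()).add(fileName);
    }

    synchronized boolean isKnown(File dir, String fileName) {
        Set<String> names = filesByDir.get(dir);
        return names != null && names.contains(fileName);
    }

    synchronized int count(File dir) {
        Set<String> names = filesByDir.get(dir);
        return names == null ? 0 : names.size();
    }
}
```

With this shape, an upload would call `recordAdded` once the file is in place, and `rescan` would be reserved for startup or for a periodic consistency check.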

Generated at Fri Mar 29 06:34:46 CET 2024 using Jira 9.4.15#940015-sha1:bdaa9cbecfb6791ea579749728cab771f0dfe90b.