[NAS-2550] Don't try to show large warc-files in the browser but save them to a file Created: 13/Sep/16 Updated: 03/Nov/16 Resolved: 19/Oct/16 |
|
Status: | Resolved |
Project: | NetarchiveSuite |
Component/s: | GUI, Viewerproxy |
Affects Version/s: | 5.1 |
Fix Version/s: | None |
Type: | Improvement | Priority: | Minor |
Reporter: | Søren Vejrup Carlsen (Inactive) | Assignee: | Søren Vejrup Carlsen (Inactive) |
Resolution: | Fixed | ||
Labels: | None | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
External reference: | https://sbprojects.statsbiblioteket.dk/jira/browse/NARK-1138 |
Sprint: | NAS 5.2 |
Verification: | I didn't check sensitivity to the filesize parameter, but certainly warc.gz files are saved to disk now release test cases. |
Description |
The links presented by QA-getfiles.jsp (e.g. http://kb-prod-adm-001.kb.dk:8080/QA/QA-getfiles.jsp?jobid=29234&harvestprefix=29234-76), such as:
<p>The links below will only work if your browser is set up to use the viewerproxy as web proxy.</p>
<a href="http://netarchivesuite.viewerproxy.invalid/getFile?arcFile=29234-76-20080606102834-00000-sb-prod-har-002.statsbiblioteket.dk.arc">29234-76-20080606102834-00000-sb-prod-har-002.statsbiblioteket.dk.arc</a><br>
<a href="http://netarchivesuite.viewerproxy.invalid/getFile?arcFile=29234-76-20080606102834-00001-sb-prod-har-002.statsbiblioteket.dk.arc">29234-76-20080606102834-00001-sb-prod-har-002.statsbiblioteket.dk.arc</a><br>
<a href="http://netarchivesuite.viewerproxy.invalid/getFile?arcFile=29234-76-20080606103124-00002-sb-prod-har-002.statsbiblioteket.dk.arc">29234-76-20080606103124-00002-sb-prod-har-002.statsbiblioteket.dk.arc</a><br>
ask the viewerproxy to fetch a single (w)arc file and show it in the browser. If the file is big, this will crash the browser. The solution: if the file is big, don't try to show it in the browser; save it to a file instead. |
Comments |
Comment by Søren Vejrup Carlsen (Inactive) [ 06/Oct/16 ] |
Yes, I did. |
Comment by Colin Rosenthal [ 15/Sep/16 ] |
Did you commit this to master already? For me it already saves files to disk (I'm testing with warc.gz), but the files always get the name "getFile". |
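A saved file named "getFile" is what browsers typically produce when the response carries no filename of its own: they fall back to the last path segment of the URL. A minimal sketch of building a Content-Disposition header value that carries the requested (w)arc file name is shown below. The class and method names are hypothetical and this is not the viewerproxy's actual code; it only illustrates the standard HTTP header.

```java
/** Hypothetical helper, not part of NetarchiveSuite. */
final class DownloadHeaders {
    private DownloadHeaders() {}

    /**
     * Builds a Content-Disposition value that asks the browser to save the
     * response under the given file name instead of deriving one from the URL.
     */
    static String contentDisposition(String filename) {
        // Replace characters that would break the quoted-string form
        // (double quote, backslash, CR, LF) with an underscore.
        String safe = filename.replaceAll("[\"\\\\\\r\\n]", "_");
        return "attachment; filename=\"" + safe + "\"";
    }
}
```

The returned string would be set as the value of the Content-Disposition response header before the file body is written.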
Comment by Søren Vejrup Carlsen (Inactive) [ 14/Sep/16 ] |
It does that now |
Comment by Søren Vejrup Carlsen (Inactive) [ 13/Sep/16 ] |
So the parameter check should also check for an empty argument |
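A minimal sketch of such a check, covering both the missing parameter and the empty value reported in the two comments below. The class name, method name, and the use of a plain parameter map are assumptions for illustration; the real check lives in GetDataResolver and may use NetarchiveSuite's own exception types.

```java
import java.util.Map;

/** Hypothetical helper, not the actual GetDataResolver code. */
final class ArcFileParameter {
    private ArcFileParameter() {}

    /**
     * Returns the value of the 'arcFile' parameter, rejecting both
     * ".../getFile?" (parameter missing) and ".../getFile?arcFile="
     * (parameter present but empty) before any repository lookup is made.
     */
    static String requireArcFile(Map<String, String[]> params) {
        String[] values = params.get("arcFile");
        if (values == null || values.length == 0) {
            throw new IllegalArgumentException("Missing parameter 'arcFile'");
        }
        String name = values[0] == null ? "" : values[0].trim();
        if (name.isEmpty()) {
            throw new IllegalArgumentException("Parameter 'arcFile' must not be empty");
        }
        return name;
    }
}
```

Validating up front lets the proxy answer with a clear client-side error message instead of surfacing an internal server error with a stack trace.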
Comment by Søren Vejrup Carlsen (Inactive) [ 13/Sep/16 ] |
If we use the link
http://netarchivesuite.viewerproxy.invalid/getFile?arcFile=
or
http://netarchivesuite.viewerproxy.invalid/getFile?arcFile
we get this error:
Internal server error for: http://netarchivesuite.viewerproxy.invalid/getFile?arcFile=
dk.netarkivet.common.exceptions.ArgumentNotValid: The value of the variable 'arcfilename' must not be an empty string.
    at dk.netarkivet.common.exceptions.ArgumentNotValid.checkNotNullOrEmpty(ArgumentNotValid.java:63)
    at dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient.getFile(JMSArcRepositoryClient.java:203)
    at dk.netarkivet.viewerproxy.GetDataResolver.doGetFile(GetDataResolver.java:218)
    at dk.netarkivet.viewerproxy.GetDataResolver.executeCommand(GetDataResolver.java:115)
    at dk.netarkivet.viewerproxy.CommandResolver.lookup(CommandResolver.java:72)
    at dk.netarkivet.viewerproxy.CommandResolver.lookup(CommandResolver.java:74)
    at dk.netarkivet.viewerproxy.WebProxy.handle(WebProxy.java:144)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) |
Comment by Søren Vejrup Carlsen (Inactive) [ 13/Sep/16 ] |
If the arcFile parameter is missing, we get this error:
Internal server error for: http://netarchivesuite.viewerproxy.invalid/getFile?
dk.netarkivet.common.exceptions.IOFailure: Missing parameter 'arcFile'
    at dk.netarkivet.viewerproxy.GetDataResolver.getParameter(GetDataResolver.java:246)
    at dk.netarkivet.viewerproxy.GetDataResolver.doGetFile(GetDataResolver.java:211)
    at dk.netarkivet.viewerproxy.GetDataResolver.executeCommand(GetDataResolver.java:115)
    at dk.netarkivet.viewerproxy.CommandResolver.lookup(CommandResolver.java:72)
    at dk.netarkivet.viewerproxy.CommandResolver.lookup(CommandResolver.java:74)
    at dk.netarkivet.viewerproxy.WebProxy.handle(WebProxy.java:144)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) |
Comment by Søren Vejrup Carlsen (Inactive) [ 13/Sep/16 ] |
I have now reduced the default to 10 MB |
Comment by Søren Vejrup Carlsen (Inactive) [ 13/Sep/16 ] |
The new feature, as well as the feature used by GetRecord, uses the maxSizeInBrowser setting, whose default is 100000000 (100 MB). I suggest reducing it to 10 MB.
<viewerproxy>
    <baseDir>viewerproxy</baseDir>
    <tryLookupUriAsFtp>false</tryLookupUriAsFtp>
    <maxSizeInBrowser>100000000</maxSizeInBrowser>
</viewerproxy>
|
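The size decision described above can be sketched as a simple threshold comparison against maxSizeInBrowser. The class and method names are hypothetical, the 10 MB value is written as 10000000 bytes by analogy with the existing 100000000 default, and treating a file exactly at the limit as displayable is an assumption; the actual viewerproxy code may differ on all three points.

```java
/** Hypothetical policy helper, not the actual viewerproxy implementation. */
final class BrowserSizePolicy {
    private BrowserSizePolicy() {}

    /** Proposed default of 10 MB, expressed in decimal bytes (assumption). */
    static final long PROPOSED_MAX_SIZE_IN_BROWSER = 10_000_000L;

    /**
     * Returns true if a file of the given size may be shown in the browser,
     * false if it should be offered as a download and saved to disk instead.
     * The boundary is inclusive here, which is an assumption.
     */
    static boolean showInBrowser(long fileSizeBytes, long maxSizeInBrowser) {
        return fileSizeBytes <= maxSizeInBrowser;
    }
}
```

With the old default of 100000000, a 50 MB (w)arc file would still be rendered in the browser; with the proposed default it would be saved to disk.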