All Activity

NAS-2733 - lacked some imports in the jsp page

Note that the new "old" filtering method gives you a "Domain X is not registered!" error if domain X is not registered in the Domains table of the harvest database.

Merge branch 'master' into version3-release to get checksum checker and fixes.

Merge branch 'DPA-140' to get checksum resynthesizer validation check.

NAS-2733 - Forgot to update the complete_settings.xml file. Added the default value for tempPath to HarvesterSettings class

In the latest commit, I have re-enabled the old filtering method using lookup in the database, and made this the default method.
There is now a setting for the filtering method used by the Harveststatus-running.jsp page:
settings.harvester.webinterface.runningjobsFilteringMethod

If this value is set to "cachedLogs", the filtering is done by searching in the cached crawl logs.
If this value is set to "database", the filtering is done by searching in the database.
The latter is the default value in the settings.
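
To illustrate how the two setting values might be mapped to a filtering strategy, here is a minimal sketch. The enum RunningJobsFilteringMethod and its fromSetting method are hypothetical and not taken from the NetarchiveSuite code; only the settings key and the values "cachedLogs" and "database" come from the comment above. Falling back to the database method for a missing value is an assumption based on the stated default.

```java
/** Hypothetical enum; only the two setting values come from the comment above. */
enum RunningJobsFilteringMethod {
    CACHED_LOGS("cachedLogs"),
    DATABASE("database");

    /** Settings key as quoted in the comment above. */
    static final String SETTING_KEY =
            "settings.harvester.webinterface.runningjobsFilteringMethod";

    private final String settingValue;

    RunningJobsFilteringMethod(String settingValue) {
        this.settingValue = settingValue;
    }

    /**
     * Maps a raw setting value to a filtering method.
     * Assumption: a missing or blank value falls back to DATABASE, the stated default.
     */
    static RunningJobsFilteringMethod fromSetting(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return DATABASE;
        }
        for (RunningJobsFilteringMethod method : values()) {
            if (method.settingValue.equals(raw.trim())) {
                return method;
            }
        }
        // Reject unknown values loudly instead of silently picking a method.
        throw new IllegalArgumentException(
                "Unknown value '" + raw + "' for " + SETTING_KEY);
    }
}
```

The JSP page could then switch on the returned value to choose between the cached-log search and the database lookup.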

Fixed in the latest commit

Fixed in the latest commit

Fixed in the latest commit to branch NAS-2733

Made fix for NAS-2735. Work on NAS-2733 - Enabled filtering in running jobs using previously used FindRunningJobQuery class

DPA-124: Move VeraPDF invocation to a separate process.

We have found during the project that VeraPDF has had leaks causing
the JVM to run out of resources and need a restart. Most of these
have been fixed, but to make the autonomous components invoking
VeraPDF more robust, the functionality must be moved to a separate
JVM and invoked as a simple web service.

This branch implements that.
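
The general shape of such a setup can be sketched as follows. This is not the code in the branch: the class ValidationServiceSketch, the /validate path, and the placeholder reply are all hypothetical, and the stand-in handler does not call VeraPDF at all. It only shows a component posting a document over HTTP to a service that would normally live in its own JVM, so that a leak there cannot exhaust the caller's resources.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a validator behind a small HTTP service. */
final class ValidationServiceSketch {

    /** Starts a stand-in service; a real one would run in its own JVM and invoke VeraPDF. */
    static HttpServer startService(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/validate", exchange -> {
            byte[] pdf = exchange.getRequestBody().readAllBytes();
            // Placeholder result: only reports the size of the received document.
            byte[] reply = ("received " + pdf.length + " bytes")
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, reply.length);
            exchange.getResponseBody().write(reply);
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Client side: the calling component posts the document and reads the reply. */
    static String validate(URI endpoint, byte[] pdf)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
        HttpRequest request = HttpRequest.newBuilder(endpoint)
                .POST(HttpRequest.BodyPublishers.ofByteArray(pdf))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException(
                    "Validation service returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

With this split, restarting a leaking validator means restarting only the service process, while the calling component keeps running and can retry.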

H3 monitor work in progress. Added some threshold settings. Isolated I18N. Show total and search sizes in crawllog resource. Do not call h3Job.update() in job resource.

In 5.2.2, the search for jobs harvesting a domain used
./harvester/harvester-core/src/main/java/dk/netarkivet/harvester/webinterface/FindRunningJobQuery.java
and this code in Harveststatus-running.jsp:

    FindRunningJobQuery findJobQuery = new FindRunningJobQuery(request);
    Long[] jobIdsForDomain = findJobQuery.getRunningJobIds();

and further down:

<% if (jobIdsForDomain.length > 0) { %>
<br/>
<table class="selection_table_small">
<tr>
    <th><fmt:message key="running.jobs.finder.table.jobId"/></th>
</tr>
<% for (long jobId : jobIdsForDomain) {
    String jobDetailsLink = "Harveststatus-jobdetails.jsp?"
       + Constants.JOB_PARAM + "=" + jobId;
%>
<tr><td><a href="<%=jobDetailsLink%>"><%=jobId%></a></td></tr>
<% } %>
</table>
<% } else {

    //after using the search button "searchDone" !=null
    String searchDone = request.getParameter("searchDone");
    if (searchDone != null) { %>
    	 <fmt:message key="table.job.no.jobs"/>

<% } %>
<% } %>
<% } %>
<%
 HTMLUtils.generateFooter(out);
%>


Currently, the constructor FindRunningJobQuery(ServletRequest req) is no longer called in Harveststatus-running.jsp.

Also, when using the new search form, jobs not found in the cached logs are not shown at all.
In the old version, the jobs matching the search were shown in a table of their own.

However, we could reintroduce the old method without too much work.
But what do we want here?

Fixed trivial issue NAS-2741. It wrote out useless warnings to the log

But in that case it is only useful if caching is enabled for all the H3 hosts.

I was thinking of a use case where a domain-owner/host complains that a harvest is too aggressive, but we don't know which job is doing the harvesting - for example of inline content. So we want to search in the crawl logs.

After Sara's comments last Friday, I would still like to know what the initial requirements for this feature were.
According to BNF, they would have preferred a filter that searched in the harvest database instead of in the cached crawl logs.
As it is now, it is very confusing.

I think it should be possible to disable this feature.

There was no logging at all for the NASEnvironment class, and I didn't want to waste any time figuring out how logging should work in NICL's framework.
So I've added my own framework.
If you think the method name writeLog is confusing, you're welcome to find another name for it.

Merge branch 'version1-release' into version2-release to get hotfixes

version2-release: fixes + web app "manual control".

I'm getting confused because the word "logging" is being used both for diagnostic logging, and for crawl-logs. So is this intended to be a fix for the underlying bug, or just added diagnostic logging to find out what is going on?

Improved final report and JMX stats.

Merge these two loglines

Ad-hoc code review by ABR.

We should create the directory represented by tempPath if it doesn't exist, instead of falling back to /tmp.
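
A minimal sketch of what that could look like, using the standard java.nio.file API. The class TempPathHelper and the method ensureTempDir are hypothetical names for illustration; how tempPath is actually read from HarvesterSettings is not shown here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper; not part of the reviewed code. */
final class TempPathHelper {

    /**
     * Returns the directory for tempPath, creating it (and any missing
     * parents) first. If the directory already exists this is a no-op;
     * if the path exists but is not a directory, an IOException is thrown.
     */
    static Path ensureTempDir(String tempPath) throws IOException {
        Path dir = Paths.get(tempPath);
        return Files.createDirectories(dir);
    }
}
```

Failing with an exception when the directory cannot be created makes a misconfigured tempPath visible, whereas a silent fallback to /tmp hides it.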

Checksum Regenerator autonomous component functionally complete. No error handling. CAS disabled.

Work on NAS-2733 - now searching for the domain in the cached logs. Only does lookup, if the crawlog...

This value is always set to "", even if we just searched for jobs harvesting a specific domain.
We should replace "" with the value of searchedDomainName if it is not null.

Created https://sbforge.org/jira/browse/NAS-2735 for this
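
The suggested change amounts to a null-safe default. A minimal sketch, assuming searchedDomainName is a String that is null when no search was made; the class SearchFieldValue and the method initialValue are hypothetical names:

```java
/** Hypothetical helper illustrating the suggested fallback. */
final class SearchFieldValue {

    /** Returns the searched domain name, or "" when no search was made. */
    static String initialValue(String searchedDomainName) {
        return searchedDomainName != null ? searchedDomainName : "";
    }
}
```

In the JSP, the returned value would still need HTML escaping before being written into the form field.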