Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503258348860.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:20:45 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:20:45 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:20:45 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 21:45:48 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503258108782' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503258228830' - a harvest with this name already exists in database The list of loaded data settings is empty. 
Is this OK?aug 23, 2017 5:20:50 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:50 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:50 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:50 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:50 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:50 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:50 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:50 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. 
aug 23, 2017 5:20:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:20:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503258348860.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503258949142.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503258949142.txt .. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 21:55:49 CEST 2017 Found warcs: /opt/webdanica/ARKIV/714-1053-20170820195223163-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/715-1054-20170820195423317-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/714-1053-20170820195223163-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 714-1053-20170820195223163-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501652/714-1053-20170820195223163-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/715-1054-20170820195423317-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 715-1054-20170820195423317-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501652/715-1054-20170820195423317-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503258949142.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501652 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501652 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501652/714-1053-20170820195223163-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/714-1053-20170820195223163-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501652/714-1053-20170820195223163-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz ERROR: 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501652/714-1053-20170820195223163-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/714-1053-20170820195223163-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501652/714-1053-20170820195223163-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz: failed with exitcode 2 ERROR: criteria-workflow failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503258949142.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503259549438.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503259549438.txt .. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 22:05:49 CEST 2017 Found warcs: /opt/webdanica/ARKIV/716-1055-20170820200223506-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/717-1056-20170820200423049-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/716-1055-20170820200223506-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 716-1055-20170820200223506-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501677/716-1055-20170820200223506-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/717-1056-20170820200423049-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 717-1056-20170820200423049-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501677/717-1056-20170820200423049-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file 
/opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503259549438.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501677 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501677 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501677/716-1055-20170820200223506-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/716-1055-20170820200223506-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501677/716-1055-20170820200223506-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501677/717-1056-20170820200423049-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/717-1056-20170820200423049-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501677/717-1056-20170820200423049-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503259549438.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:21:58 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. 
aug 23, 2017 5:21:58 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:21:58 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 22:05:49 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503259309353' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503259429403' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:22:01 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:02 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:02 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:02 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:03 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. 
aug 23, 2017 5:22:03 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 
4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503259549438.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503260149748.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503260149748.txt .. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 22:15:49 CEST 2017 Found warcs: /opt/webdanica/ARKIV/719-1058-20170820201423443-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/718-1057-20170820201223879-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/719-1058-20170820201423443-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 719-1058-20170820201423443-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501724/719-1058-20170820201423443-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/718-1057-20170820201223879-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 718-1057-20170820201223879-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501724/718-1057-20170820201223879-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503260149748.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501724 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501724 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file 
/opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501724/718-1057-20170820201223879-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/718-1057-20170820201223879-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501724/718-1057-20170820201223879-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501724/719-1058-20170820201423443-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/719-1058-20170820201423443-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501724/719-1058-20170820201423443-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503260149748.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:22:47 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. 
aug 23, 2017 5:22:47 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:22:47 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 22:15:49 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503259909678' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503260029719' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:22:50 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:50 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:50 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. 
aug 23, 2017 5:22:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:51 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:52 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:52 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:52 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:52 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. 
aug 23, 2017 5:22:52 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:22:52 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503260149748.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503260750085.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503260750085.txt .. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 22:25:50 CEST 2017 Found warcs: /opt/webdanica/ARKIV/720-1059-20170820202223367-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/721-1060-20170820202423031-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/720-1059-20170820202223367-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 720-1059-20170820202223367-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501773/720-1059-20170820202223367-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/721-1060-20170820202423031-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 721-1060-20170820202423031-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501773/721-1060-20170820202423031-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503260750085.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501773 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501773 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501773/720-1059-20170820202223367-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/720-1059-20170820202223367-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501773/720-1059-20170820202223367-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz ERROR: 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501773/720-1059-20170820202223367-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/720-1059-20170820202223367-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501773/720-1059-20170820202223367-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz: failed with exitcode 2 ERROR: criteria-workflow failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503260750085.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503261350368.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503261350368.txt .. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 22:35:50 CEST 2017 Found warcs: /opt/webdanica/ARKIV/723-1062-20170820203423339-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/722-1061-20170820203223635-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/723-1062-20170820203423339-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 723-1062-20170820203423339-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501796/723-1062-20170820203423339-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/722-1061-20170820203223635-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 722-1061-20170820203223635-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501796/722-1061-20170820203223635-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file 
/opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503261350368.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501796 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501796 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501796/722-1061-20170820203223635-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/722-1061-20170820203223635-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501796/722-1061-20170820203223635-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz ERROR: criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501796/722-1061-20170820203223635-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/722-1061-20170820203223635-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501796/722-1061-20170820203223635-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz: failed with exitcode 2 ERROR: criteria-workflow failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503261350368.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503261980662.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503261980662.txt .. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 22:46:20 CEST 2017 Found warcs: /opt/webdanica/ARKIV/724-1063-20170820204223598-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/725-1064-20170820204424051-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/724-1063-20170820204223598-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 724-1063-20170820204223598-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501818/724-1063-20170820204223598-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/725-1064-20170820204424051-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 725-1064-20170820204424051-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501818/725-1064-20170820204424051-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503261980662.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501818 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501818 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501818/724-1063-20170820204223598-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/724-1063-20170820204223598-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501818/724-1063-20170820204223598-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501818/725-1064-20170820204424051-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/725-1064-20170820204424051-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501818/725-1064-20170820204424051-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503261980662.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:24:20 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:24:20 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:24:20 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 22:46:20 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503261710595' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503261830633' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:24:23 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:23 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:23 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:23 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:24 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:24 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:24 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:24 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:24 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:24 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. 
aug 23, 2017 5:24:24 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:24 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:24 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:24:24 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503261980662.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503262580989.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503262580989.txt .. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 22:56:20 CEST 2017 Found warcs: /opt/webdanica/ARKIV/727-1066-20170820205423015-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/726-1065-20170820205153504-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/727-1066-20170820205423015-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 727-1066-20170820205423015-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501865/727-1066-20170820205423015-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/726-1065-20170820205153504-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 726-1065-20170820205153504-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501865/726-1065-20170820205153504-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503262580989.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501865 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501865 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501865/726-1065-20170820205153504-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/726-1065-20170820205153504-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501865/726-1065-20170820205153504-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501865/727-1066-20170820205423015-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/727-1066-20170820205423015-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501865/727-1066-20170820205423015-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503262580989.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:25:10 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:25:10 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:25:10 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 22:56:20 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503262280916' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503262430955' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:25:13 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:25:13 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:25:13 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:25:13 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:25:13 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:25:14 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503262580989.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503263181267.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503263181267.txt .. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 23:06:21 CEST 2017 Found warcs: /opt/webdanica/ARKIV/728-1067-20170820210153467-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/729-1068-20170820210424104-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/728-1067-20170820210153467-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 728-1067-20170820210153467-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501914/728-1067-20170820210153467-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/729-1068-20170820210424104-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 729-1068-20170820210424104-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501914/729-1068-20170820210424104-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503263181267.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501914 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501914 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501914/728-1067-20170820210153467-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/728-1067-20170820210153467-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501914/728-1067-20170820210153467-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz ERROR: 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501914/728-1067-20170820210153467-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/728-1067-20170820210153467-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501914/728-1067-20170820210153467-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz: failed with exitcode 2 ERROR: criteria-workflow failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503263181267.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503263751622.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503263751622.txt .. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 23:15:51 CEST 2017 Found warcs: /opt/webdanica/ARKIV/731-1070-20170820211422905-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/730-1069-20170820211153402-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/731-1070-20170820211422905-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 731-1070-20170820211422905-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501944/731-1070-20170820211422905-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/730-1069-20170820211153402-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 730-1069-20170820211153402-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501944/730-1069-20170820211153402-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file 
/opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503263751622.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501944 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501944 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501944/730-1069-20170820211153402-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/730-1069-20170820211153402-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501944/730-1069-20170820211153402-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501944/731-1070-20170820211422905-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/731-1070-20170820211422905-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501944/731-1070-20170820211422905-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503263751622.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:26:02 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. 
aug 23, 2017 5:26:02 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:26:02 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 23:15:51 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503263481547' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503263631598' - a harvest with this name already exists in database Printing out report for seed: http://000000book.com/users/antibiotics-for-ear-infection-in-adults-sog Printing out report for seed: http://brothersinc.net/scroll.waypoints aug 23, 2017 5:26:05 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager closeAllConnections INFO: Closing down all 1 connections aug 23, 2017 5:26:05 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager closeAllConnections INFO: Clearing the connectionmap aug 23, 2017 5:26:05 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager deregister WARNING: jdbc-driver 'sun.jdbc.odbc.JdbcOdbcDriver' not registered by this app, so we don't touch it Ingest of /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503263751622.txt was successful Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503263751622.txt Processing harvestlog: 
/opt/workflows/harvestlogs/harvestLog-20-08-2017-1503264291911.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503264291911.txt .. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 23:24:51 CEST 2017 Found warcs: /opt/webdanica/ARKIV/732-1071-20170820212123849-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/733-1072-20170820212323048-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/732-1071-20170820212123849-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 732-1071-20170820212123849-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501965/732-1071-20170820212123849-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/733-1072-20170820212323048-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 733-1072-20170820212323048-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501965/733-1072-20170820212323048-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503264291911.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501965 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501965 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501965/732-1071-20170820212123849-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/732-1071-20170820212123849-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with 
destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501965/732-1071-20170820212123849-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501965/733-1072-20170820212323048-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/733-1072-20170820212323048-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501965/733-1072-20170820212323048-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503264291911.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:26:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:26:21 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:26:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 23:24:51 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503264051848' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503264171883' - a harvest with this name already exists in database Printing out report for seed: http://0000666.com/home.php?mod=space&uid=162376 Printing out report for seed: http://brothersinc.net/show.bs.collapse aug 23, 2017 5:26:24 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager closeAllConnections INFO: Closing down all 1 connections aug 23, 2017 5:26:24 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager closeAllConnections INFO: Clearing the connectionmap aug 23, 2017 5:26:24 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager deregister WARNING: jdbc-driver 'sun.jdbc.odbc.JdbcOdbcDriver' not registered by this app, so we don't touch it Ingest of /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503264291911.txt was successful Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503264291911.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503264922213.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503264922213.txt .. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 23:35:22 CEST 2017 Found warcs: /opt/webdanica/ARKIV/734-1073-20170820213123726-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/735-1074-20170820213324211-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/734-1073-20170820213123726-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 734-1073-20170820213123726-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501985/734-1073-20170820213123726-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/735-1074-20170820213324211-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 735-1074-20170820213324211-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501985/735-1074-20170820213324211-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503264922213.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501985 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501985 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501985/734-1073-20170820213123726-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/734-1073-20170820213123726-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501985/734-1073-20170820213123726-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503501985/735-1074-20170820213324211-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/735-1074-20170820213324211-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503501985/735-1074-20170820213324211-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503264922213.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:27:07 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:27:07 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:27:07 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 23:35:22 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503264652140' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503264772177' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:27:10 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:10 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:11 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:11 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:11 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:11 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:11 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:11 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:11 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:11 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. 
aug 23, 2017 5:27:11 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:11 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503264922213.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503265522536.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503265522536.txt .. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 23:45:22 CEST 2017 Found warcs: /opt/webdanica/ARKIV/737-1076-20170820214353005-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/736-1075-20170820214153672-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/737-1076-20170820214353005-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 737-1076-20170820214353005-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502032/737-1076-20170820214353005-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/736-1075-20170820214153672-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 736-1075-20170820214153672-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502032/736-1075-20170820214153672-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503265522536.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502032 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502032 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502032/736-1075-20170820214153672-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/736-1075-20170820214153672-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502032/736-1075-20170820214153672-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502032/737-1076-20170820214353005-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/737-1076-20170820214353005-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502032/737-1076-20170820214353005-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503265522536.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:27:49 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:27:49 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:27:49 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 23:45:22 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503265282464' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503265402505' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:27:52 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:53 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:53 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:53 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:53 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:53 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:54 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:54 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:54 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:27:54 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. 
dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 
4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503265522536.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503266122876.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503266122876.txt .. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 23:55:22 CEST 2017 Found warcs: /opt/webdanica/ARKIV/739-1078-20170820215353492-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/738-1077-20170820215153580-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/739-1078-20170820215353492-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 739-1078-20170820215353492-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502075/739-1078-20170820215353492-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/738-1077-20170820215153580-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 738-1077-20170820215153580-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502075/738-1077-20170820215153580-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503266122876.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502075 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502075 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file 
/opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502075/738-1077-20170820215153580-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/738-1077-20170820215153580-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502075/738-1077-20170820215153580-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502075/739-1078-20170820215353492-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/739-1078-20170820215353492-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502075/739-1078-20170820215353492-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-20-08-2017-1503266122876.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:28:37 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. 
aug 23, 2017 5:28:37 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:28:37 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Sun Aug 20 23:55:22 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503265882805' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503266002849' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:28:41 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:28:41 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:28:41 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 
4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-20-08-2017-1503266122876.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503266723184.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503266723184.txt .. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 00:05:23 CEST 2017 Found warcs: /opt/webdanica/ARKIV/741-1080-20170820220353413-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/740-1079-20170820220153889-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/741-1080-20170820220353413-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 741-1080-20170820220353413-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502122/741-1080-20170820220353413-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/740-1079-20170820220153889-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 740-1079-20170820220153889-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502122/740-1079-20170820220153889-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503266723184.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502122 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502122 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file 
/opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502122/740-1079-20170820220153889-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/740-1079-20170820220153889-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502122/740-1079-20170820220153889-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502122/741-1080-20170820220353413-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/741-1080-20170820220353413-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502122/741-1080-20170820220353413-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503266723184.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:29:29 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. 
aug 23, 2017 5:29:29 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:29:29 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 00:05:23 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503266483108' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503266603146' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:29:32 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503266723184.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503267323513.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503267323513.txt .. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 00:15:23 CEST 2017 Found warcs: /opt/webdanica/ARKIV/743-1082-20170820221353057-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/742-1081-20170820221153524-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/743-1082-20170820221353057-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 743-1082-20170820221353057-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502172/743-1082-20170820221353057-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/742-1081-20170820221153524-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 742-1081-20170820221153524-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502172/742-1081-20170820221153524-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503267323513.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502172 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502172 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502172/742-1081-20170820221153524-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/742-1081-20170820221153524-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502172/742-1081-20170820221153524-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502172/743-1082-20170820221353057-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/743-1082-20170820221353057-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502172/743-1082-20170820221353057-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503267323513.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:30:16 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:30:16 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:30:16 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 00:15:23 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503267083436' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503267203480' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:30:20 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:30:20 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:30:20 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:30:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:30:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:30:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:30:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:30:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 
4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503267323513.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503267923818.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503267923818.txt .. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 00:25:23 CEST 2017 Found warcs: /opt/webdanica/ARKIV/744-1083-20170820222153599-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/745-1084-20170820222353118-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/744-1083-20170820222153599-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 744-1083-20170820222153599-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502222/744-1083-20170820222153599-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/745-1084-20170820222353118-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 745-1084-20170820222353118-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502222/745-1084-20170820222353118-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503267923818.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502222 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502222 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file 
/opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502222/744-1083-20170820222153599-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/744-1083-20170820222153599-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502222/744-1083-20170820222153599-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz ERROR: criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502222/744-1083-20170820222153599-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/744-1083-20170820222153599-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502222/744-1083-20170820222153599-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz: failed with exitcode 2 ERROR: criteria-workflow failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503267923818.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503268524149.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503268524149.txt .. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 00:35:24 CEST 2017 Found warcs: /opt/webdanica/ARKIV/747-1086-20170820223353403-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/746-1085-20170820223153512-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/747-1086-20170820223353403-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 747-1086-20170820223353403-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502245/747-1086-20170820223353403-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/746-1085-20170820223153512-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 746-1085-20170820223153512-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502245/746-1085-20170820223153512-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503268524149.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502245 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502245 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502245/746-1085-20170820223153512-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/746-1085-20170820223153512-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502245/746-1085-20170820223153512-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502245/747-1086-20170820223353403-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/747-1086-20170820223353403-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502245/747-1086-20170820223353403-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503268524149.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:31:30 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:31:30 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:31:30 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 00:35:24 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503268284080' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503268404119' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:31:34 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:34 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:34 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:35 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:35 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:35 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:35 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:35 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:35 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:35 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. 
aug 23, 2017 5:31:35 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:35 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:35 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:31:35 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null
    at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
    at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39)
    at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618)
    at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137)
    at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106)
    at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324)
    at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199)
    at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67)
    ... 4 more
ERROR: criteria ingest failed
Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503268524149.txt
Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503269124520.txt
Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503269124520.txt ..
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 00:45:24 CEST 2017 Found warcs: /opt/webdanica/ARKIV/749-1088-20170820224353617-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/748-1087-20170820224153414-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/749-1088-20170820224353617-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 749-1088-20170820224353617-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502296/749-1088-20170820224353617-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/748-1087-20170820224153414-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 748-1087-20170820224153414-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502296/748-1087-20170820224153414-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503269124520.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502296 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502296 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502296/748-1087-20170820224153414-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/748-1087-20170820224153414-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502296/748-1087-20170820224153414-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502296/749-1088-20170820224353617-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/749-1088-20170820224353617-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502296/749-1088-20170820224353617-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503269124520.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:32:18 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:32:19 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:32:19 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 00:45:24 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503268884434' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503269004487' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:32:23 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:32:23 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:32:23 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:32:23 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null
    at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
    at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39)
    at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618)
    at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137)
    at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106)
    at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324)
    at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199)
    at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67)
    ... 4 more
ERROR: criteria ingest failed
Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503269124520.txt
Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503269724826.txt
Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503269724826.txt ..
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 00:55:24 CEST 2017 Found warcs: /opt/webdanica/ARKIV/750-1089-20170820225153591-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/751-1090-20170820225353323-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/750-1089-20170820225153591-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 750-1089-20170820225153591-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502344/750-1089-20170820225153591-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/751-1090-20170820225353323-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 751-1090-20170820225353323-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502344/751-1090-20170820225353323-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503269724826.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502344 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502344 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502344/750-1089-20170820225153591-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/750-1089-20170820225153591-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502344/750-1089-20170820225153591-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502344/751-1090-20170820225353323-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/751-1090-20170820225353323-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502344/751-1090-20170820225353323-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503269724826.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:33:09 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:33:09 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:33:09 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 00:55:24 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503269484750' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503269604800' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:33:13 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:33:14 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:33:14 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null
    at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
    at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39)
    at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618)
    at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137)
    at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106)
    at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324)
    at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199)
    at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67)
    ... 4 more
ERROR: criteria ingest failed
Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503269724826.txt
Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503270325128.txt
Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503270325128.txt ..
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 01:05:25 CEST 2017 Found warcs: /opt/webdanica/ARKIV/752-1091-20170820230153934-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/753-1092-20170820230353000-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/752-1091-20170820230153934-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 752-1091-20170820230153934-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502394/752-1091-20170820230153934-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/753-1092-20170820230353000-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 753-1092-20170820230353000-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502394/753-1092-20170820230353000-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503270325128.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502394 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502394 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502394/752-1091-20170820230153934-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/752-1091-20170820230153934-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502394/752-1091-20170820230153934-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do 
criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502394/753-1092-20170820230353000-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/753-1092-20170820230353000-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502394/753-1092-20170820230353000-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503270325128.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:34:00 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:34:00 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:34:00 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 01:05:25 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503270085049' - a harvest with this name already exists in database Skip ingest of harvest 'webdanica-trial-1503270205099' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:34:03 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:34:04 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null
    at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
    at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39)
    at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618)
    at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137)
    at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106)
    at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324)
    at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199)
    at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67)
    ... 4 more
ERROR: criteria ingest failed
Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503270325128.txt
Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503270835447.txt
Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503270835447.txt ..
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 01:13:55 CEST 2017 Found warcs: /opt/webdanica/ARKIV/754-1093-20170820231153071-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/754-1093-20170820231153071-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 754-1093-20170820231153071-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502444/754-1093-20170820231153071-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=1,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503270835447.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502444 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502444 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502444/754-1093-20170820231153071-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/754-1093-20170820231153071-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502444/754-1093-20170820231153071-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503270835447.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:34:38 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:34:39 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:34:39 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 01:13:55 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503270685382' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:34:42 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:34:42 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. 
aug 23, 2017 5:34:42 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:34:42 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:34:42 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:34:42 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:34:42 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:34:43 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503270835447.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503271405732.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503271405732.txt .. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 01:23:25 CEST 2017 Found warcs: /opt/webdanica/ARKIV/755-1095-20170820232123333-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/755-1095-20170820232123333-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 755-1095-20170820232123333-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502483/755-1095-20170820232123333-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Considering command successful: #successes=1,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503271405732.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502483 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502483 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ do criteria-analysis on file /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502483/755-1095-20170820232123333-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/755-1095-20170820232123333-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502483/755-1095-20170820232123333-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Executing bash ingestTool.sh /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503271405732.txt /opt/workflows//automatic-workflow 1.1-RC8 SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/workflows/automatic-workflow/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. aug 23, 2017 5:35:17 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'hbase-phoenix' for setting 'settings.database.system'. aug 23, 2017 5:35:17 PM dk.kb.webdanica.core.datamodel.dao.HBasePhoenixConnectionManager register INFO: Now created instance of 'org.apache.phoenix.jdbc.PhoenixDriver aug 23, 2017 5:35:17 PM dk.kb.webdanica.core.utils.SettingsUtilities getStringSetting INFO: Using value 'jdbc:phoenix:narcana-hbase01.statsbiblioteket.dk,narcana-hbase02.statsbiblioteket.dk,narcana-yarn01.statsbiblioteket.dk,narcana-yarn02.statsbiblioteket.dk,narcana-ambari01.statsbiblioteket.dk:2181:/hbase' for setting 'settings.database.connection'. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 01:23:25 CEST 2017 log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Skip ingest of harvest 'webdanica-trial-1503271255678' - a harvest with this name already exists in database The list of loaded data settings is empty. Is this OK?aug 23, 2017 5:35:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:35:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. 
aug 23, 2017 5:35:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:35:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:35:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:35:21 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:35:22 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. aug 23, 2017 5:35:22 PM dk.kb.webdanica.core.utils.SettingsUtilities getBooleanSetting INFO: Using value 'true' for setting 'settings.seeds.rejectDkUrls'. dk.kb.webdanica.core.datamodel.dao.DaoException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. DOMAINS.DOMAIN may not be null at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:71) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.processFile(CriteriaIngest.java:222) at dk.kb.webdanica.core.interfaces.harvesting.HarvestLog.processCriteriaResults(HarvestLog.java:184) at dk.kb.webdanica.core.datamodel.criteria.CriteriaIngest.ingest(CriteriaIngest.java:59) at dk.kb.webdanica.core.tools.CriteriaIngestTool.main(CriteriaIngestTool.java:71) Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. 
DOMAINS.DOMAIN may not be null at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39) at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618) at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137) at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106) at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:199) at dk.kb.webdanica.core.datamodel.dao.HBasePhoenixDomainsDAO.insertDomain(HBasePhoenixDomainsDAO.java:67) ... 4 more ERROR: criteria ingest failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503271405732.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503272126047.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503272126047.txt .. 
Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 01:35:26 CEST 2017 Found warcs: /opt/webdanica/ARKIV/757-1098-20170820233352976-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/756-1097-20170820233153488-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/757-1098-20170820233352976-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 757-1098-20170820233352976-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502522/757-1098-20170820233352976-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz ^CProcessing /opt/webdanica/ARKIV/756-1097-20170820233153488-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 756-1097-20170820233153488-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502522/756-1097-20170820233153488-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz ^CConsidering command successful: #successes=2,#failures=0 Finished parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503272126047.txt with success Executing : bash criteria-workflow-alt.sh /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502522 /opt/workflows//automatic-workflow/criteria-results-automatic/23-08-2017-1503502522 /opt/workflows//automatic-workflow /opt/workflows//pig-0.16.0/ ERROR: seqfile /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502522/756-1097-20170820233153488-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz/756-1097-20170820233153488-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz does not exist. 
The parsed-text computation must have gone wrong ERROR: criteria-workflow failed Processing done of harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503272126047.txt Processing harvestlog: /opt/workflows/harvestlogs/harvestLog-21-08-2017-1503272726364.txt Starting parsed-workflow on file /opt/workflows//automatic-workflow/working/harvestLog-21-08-2017-1503272726364.txt .. Ignoring line: Harvestlog for harvests initiated by the Webdanica webapp at Mon Aug 21 01:45:26 CEST 2017 Found warcs: /opt/webdanica/ARKIV/758-1099-20170820234153173-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz /opt/webdanica/ARKIV/759-1100-20170820234355442-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz Processing /opt/webdanica/ARKIV/758-1099-20170820234153173-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 758-1099-20170820234153173-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502525/758-1099-20170820234153173-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz ^CProcessing /opt/webdanica/ARKIV/759-1100-20170820234355442-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz do parsed-extract on file 759-1100-20170820234355442-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz with destination /opt/workflows//automatic-workflow/SEQ_AUTOMATIC/23-08-2017-1503502525/759-1100-20170820234355442-00000-narcana-webdanica01.statsbiblioteket.dk.warc.gz