| Parameter | Associated type | Type | Length | Description |
|---|---|---|---|---|
| E_ACCESS_DELAY | TREXS_CRUISER-COUNTER | N | 10 | time between two HTTP GETs in ms |
| E_CRAWLING_ALGORITHM | TREXS_CRUISER-NUMC1 | N | 1 | 0 = down the tree, 1 = same server, 2 = same domain, 3 = top-level domain, 4 = free |
| E_CREATE_SUSPENDED | TREXS_CRUISER-NUMC1 | N | 1 | 1 = cruise is suspended after creation |
| E_CREATION_TIME | TREXS_CRUISER-DATE_TIME | C | 20 | creation time of the crawl |
| E_CRUISE_DOCUMENT_INFO | TREXS_CRUISE_DOCUMENT_INFO | v | 24 | meta information such as the seed URL etc. |
| E_CRUISE_EXTENSION_LIST | TREXT_CRUISE_EXTENSION_LIST | h | 16 | TREX cruiser extension list for parameters |
| E_CRUISE_PP_EXTENSION_LIST | TREXT_CRUISE_EXTENSION_LIST | h | 16 | TREX cruiser extension list for parameters |
| E_DEINDEX_FLAG | TREXS_CRUISER-NUMC1 | N | 1 | 1 = de-index documents which got an error during an update crawl |
| E_DOCKEY_CONVERSION_LIST | TREXT_CRUISE_LIST | h | 16 | cruiser list for conversion (e.g. document keys) |
| E_DONT_INDEX_DIRS | TREXS_CRUISER-NUMC1 | N | 1 | 1 = do not index directories |
| E_EXCLUDE_DIRECTORIES | TREXT_EXCLUDE_DIRECTORIES | h | 8 | directories which must not be crawled |
| E_FILE_PATH | TREXS_CRUISER-STRING_FIELD | g | 8 | path where documents will be stored for connector type = 1 (file) |
| E_GET_ACCESS_RIGHTS | TREXS_CRUISER-NUMC1 | N | 1 | 1 = extract ACLs from files |
| E_HOST_INFORMATION | TREXT_CRUISE_HOST_INFORMATION | h | 280 | TREX cruiser host information |
| E_INDEX_ID | TREX_RFC-INDEX_ID | C | 64 | index ID |
| E_IS_MULTILANGUAGE_INDEX | TREXS_CRUISER-NUMC1 | N | 1 | 1 = a multi-language index is used |
| E_LANGUAGE | LAISO | C | 2 | language according to ISO 639 |
| E_MAX_DEPTH | TREXS_CRUISER-COUNTER | N | 10 | maximum depth which will be crawled |
| E_MAX_DOCSIZE | TREXS_CRUISER-COUNTER | N | 10 | maximum document size in bytes which will be crawled |
| E_MAX_RETRY_COUNT | TREXS_CRUISER-COUNTER | N | 10 | maximum number of HTTP GET retries |
| E_MIN_DOCSIZE | TREXS_CRUISER-COUNTER | N | 10 | minimum document size in bytes which will be crawled |
| E_NEGATIVE_FILE_EXTENSION | TREXT_NEGATIVE_FILE_EXTENSIONS | h | 8 | list of negative file extensions which will NOT be crawled |
| E_OPTIMIZE_EVERY | TREXS_CRUISER-COUNTER | N | 10 | call an optimize after indexing n documents |
| E_POSITIVE_FILE_EXTENSIONS | TREXT_POSITIVE_FILE_EXTENSIONS | h | 8 | list of positive file extensions which will be crawled |
| E_PREPROCESSOR_POOL_SIZE | TREXS_CRUISER-POOL_SIZE | N | 2 | number of parallel requests to the PreProcessor processes |
| E_PYTHON_COMMAND_ARGS | TREXS_CRUISER-STRING_FIELD | g | 8 | Python command for scheduled execution |
| E_REGULAR_EXPRESSION | TREXT_CRUISER_REG_EXPRESSION | h | 8 | regular expressions for the TREX cruiser/crawler |
| E_RESULT_CONNECTOR | TREXT_CRUISE_RESULT_CONNECTOR | h | 80 | TREX cruiser result connector |
| E_RESULT_CONNECTOR_TYPE | TREXS_CRUISER-NUMC1 | N | 1 | 0 = TREX index (default), 1 = file, 2 = dummy for Python commands |
| E_RETURN_CODE | TREX_RFC-RETURN_CODE | I | 4 | return code |
| E_RETURN_TEXT | TREX_RFC-RETURN_TEXT | C | 200 | return text |
| E_SCHEDULE_TIME | TREXS_CRUISER-STRING_FIELD | g | 8 | if not empty, the scheduler will be used |
| E_USER_AGENT | TREXS_CRUISER-STRING_FIELD | g | 8 | user agent for HTTP GET |
| E_USE_FREESTYLE_CONTAINER | TREXS_CRUISER-NUMC1 | N | 1 | 1 = concatenate all attribute values into one |
| E_USE_QUEUESERVER | TREXS_CRUISER-NUMC1 | N | 1 | X = use the queue server for crawling/indexing |
| E_USE_RAPTOR_ATTRIBUTES | TREXS_CRUISER-NUMC1 | N | 1 | 1 = extract standard attributes |
| E_USE_ROBOTRULES | TREXS_CRUISER-NUMC1 | N | 1 | 1 = crawler uses robot rules |
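Because these are export parameters of an RFC-enabled function module, an external RFC client receives them as one result structure per call. The sketch below shows how such a configuration read might look with the `pyrfc` library; it is only an illustration under stated assumptions: the function module name `TREX_CRUISER_GET_CONFIG`, the import parameter `I_INDEX_ID`, and all connection details are placeholders, not names taken from this document.

```python
# Minimal sketch (assumptions noted above): fetch the crawler configuration
# exports listed in the table and decode a few of the flag-style values.
from pyrfc import Connection

# Value meanings for E_CRAWLING_ALGORITHM, taken from the table above.
CRAWLING_ALGORITHMS = {
    "0": "down the tree",
    "1": "same server",
    "2": "same domain",
    "3": "top-level domain",
    "4": "free",
}

# Placeholder connection parameters; adjust to your own system landscape.
conn = Connection(
    ashost="sap-host.example.com",
    sysnr="00",
    client="100",
    user="RFC_USER",
    passwd="secret",
)

# pyrfc returns all export parameters of the call as a dict keyed by name.
# The function module name and I_INDEX_ID are hypothetical placeholders.
cfg = conn.call("TREX_CRUISER_GET_CONFIG", I_INDEX_ID="my_index")

# E_RETURN_CODE is an integer (type I); anything non-zero is treated as an error here.
if cfg["E_RETURN_CODE"] != 0:
    raise RuntimeError(cfg["E_RETURN_TEXT"])

# NUMC fields (type N) arrive as strings, so numeric values are converted explicitly.
print("Index:              ", cfg["E_INDEX_ID"])
print("Crawling algorithm: ", CRAWLING_ALGORITHMS.get(cfg["E_CRAWLING_ALGORITHM"], "unknown"))
print("Access delay (ms):  ", int(cfg["E_ACCESS_DELAY"]))
print("Max crawl depth:    ", int(cfg["E_MAX_DEPTH"]))
print("Queue server used:  ", cfg["E_USE_QUEUESERVER"] == "X")   # 'X' per the table
print("Robot rules obeyed: ", cfg["E_USE_ROBOTRULES"] == "1")
```

The same parameter names apply unchanged when the module is called from ABAP via `CALL FUNCTION ... DESTINATION`; only the client-side handling of the NUMC flag fields differs.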