Apify act for comparing crawler execution results
This act fetches results from two crawler executions ("old" and "new"), compares them and creates a new result set based on the act settings. By default the final result set will contain only new and updated records.
INPUT
Input is a JSON object with the following properties:
{
  "oldExec": OLD_EXECUTION_ID,
  "newExec": NEW_EXECUTION_ID,
  "idAttr": ID_ATTRIBUTE_NAME,
  "return": WHICH_RECORDS_TO_RETURN, // optional, default: "new, updated"
  "addStatus": ADD_TEXT_STATUS, // optional, default: false
  "statusAttr": STATUS_ATTR_NAME, // optional, default: "status"
  "addChanges": ADD_CHANGE_INFO, // optional, default: false
  "changesAttr": CHANGES_ATTR_NAME, // optional, default: "changes"
  "updatedIf": [ // optional, column list
    "column_1",
    "column_2",
    ...
  ],
  "useDataset": USE_DATASET_STORE // optional, default: false
}
The idAttr parameter is the name of the record attribute that will be used as the record's ID.
The return parameter tells the act which records to include in the final result set. Possible values are new, updated, deleted and unchanged. You can provide more than one value, separated by commas.
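As an illustration of the comma-separated format, here is a minimal Python sketch of how such a value could be split into individual status names. It is not the act's own code: the function name parse_return, the whitespace trimming, the lowercasing and the error on unknown names are all assumptions made for the example.

```python
# Hypothetical helper, not part of the act: parses a `return` value.
VALID_STATUSES = {"new", "updated", "deleted", "unchanged"}

def parse_return(value="new, updated"):
    """Split a comma-separated `return` value into a set of status names."""
    # Assumption: surrounding whitespace and letter case are ignored.
    statuses = {part.strip().lower() for part in value.split(",") if part.strip()}
    unknown = statuses - VALID_STATUSES
    if unknown:
        raise ValueError(f"unknown status name(s): {sorted(unknown)}")
    return statuses
```

With no argument, the sketch falls back to the documented default of "new, updated".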
The addStatus parameter determines whether the act adds a status attribute to each of the resulting records. If true, its value will be one of NEW, UPDATED, DELETED or UNCHANGED, depending on the value of the return parameter.
The statusAttr parameter overrides the default name of the column in which the status is stored.
The addChanges parameter tells the act to include a list of columns that contained changes. This list will be added to a new changes column.
The changesAttr parameter overrides the default name of the column in which the changes are stored.
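The effect of addStatus, statusAttr, addChanges and changesAttr on a single output record can be sketched in Python as follows. This is only an illustration under stated assumptions: the function annotate and its argument names are hypothetical, and the act's actual implementation may differ.

```python
# Hypothetical helper, not part of the act: attaches status and change info.
def annotate(record, status, changed_columns,
             add_status=False, status_attr="status",
             add_changes=False, changes_attr="changes"):
    """Return a copy of `record` with optional status and changes attributes."""
    out = dict(record)  # copy, so the input record is left untouched
    if add_status:
        out[status_attr] = status  # e.g. "NEW", "UPDATED", "DELETED", "UNCHANGED"
    if add_changes:
        out[changes_attr] = list(changed_columns)
    return out
```

The defaults mirror the ones listed in the input description: both flags are off, and the attribute names are "status" and "changes".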
The updatedIf parameter can contain an array of column names. If set, the record will be recognized as UPDATED if and only if there was a change in one of those columns. If addChanges is set to true, the changes array will contain the column names that had changes and are also present in the updatedIf array.
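The classification described above can be sketched in Python as follows. This is a minimal illustration, not the act's implementation: the function compare, its return shape, and the choice to treat a missing column the same as a None value are assumptions made for the example.

```python
# Hypothetical sketch, not the act's code: classifies records by ID.
def compare(old_records, new_records, id_attr, updated_if=None):
    """Return a list of (status, record, changed_columns) tuples."""
    old_by_id = {r[id_attr]: r for r in old_records}
    new_by_id = {r[id_attr]: r for r in new_records}
    results = []
    for rec_id, new in new_by_id.items():
        old = old_by_id.get(rec_id)
        if old is None:
            results.append(("NEW", new, []))
            continue
        # Columns whose values differ between the two executions.
        columns = sorted(set(old) | set(new))
        changed = [c for c in columns if old.get(c) != new.get(c)]
        if updated_if is not None:
            # Only changes in the listed columns count toward UPDATED.
            changed = [c for c in changed if c in updated_if]
        results.append(("UPDATED" if changed else "UNCHANGED", new, changed))
    for rec_id, old in old_by_id.items():
        if rec_id not in new_by_id:
            results.append(("DELETED", old, []))
    return results

old = [{"id": 1, "price": 10, "title": "A"}, {"id": 2, "price": 5, "title": "B"}]
new = [{"id": 1, "price": 12, "title": "A2"}, {"id": 3, "price": 7, "title": "C"}]
summary = [(s, r["id"], ch) for s, r, ch in compare(old, new, "id", updated_if=["price"])]
# summary == [("UPDATED", 1, ["price"]), ("NEW", 3, []), ("DELETED", 2, [])]
```

In the sample data, record 1 changed in both price and title, but only price is reported because title is not listed in updated_if. Had updated_if been ["id"], the same record would have been classified as UNCHANGED.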
The useDataset parameter sets where the result is stored. If true, the result is stored in an Apify dataset; otherwise (the default), it is stored in the key-value store under the OUTPUT key.
This act can also be run from a crawler webhook. In that case, the current execution is compared with the directly preceding execution (unless overridden). To use this act from a webhook, use the Finish webhook data in the crawler's advanced settings to set up the act.
Example webhook data:
{
  "idAttr": ID_ATTRIBUTE_NAME,
  "return": WHICH_RECORDS_TO_RETURN,
  "addStatus": ADD_TEXT_STATUS,
  "addChanges": ADD_CHANGE_INFO,
  ...
}
If you want to compare the current execution with a specific execution (not the one directly preceding it), you can use the oldExec parameter to override the default.
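As a rough illustration, webhook data with an oldExec override could be generated with a small Python helper like the one below. The helper build_webhook_data is hypothetical, and the values "url" and "OLD_EXECUTION_ID" are placeholders rather than real attribute names or execution IDs.

```python
import json

# Hypothetical helper, not part of the act: builds the webhook data JSON.
def build_webhook_data(id_attr, old_exec=None, **options):
    """Serialize webhook data; include oldExec only when an override is given."""
    data = {"idAttr": id_attr, **options}
    if old_exec is not None:
        data["oldExec"] = old_exec  # compare against this execution instead
    return json.dumps(data, indent=2)

# Placeholder values; replace them with your own attribute name and execution ID.
webhook_data = build_webhook_data("url", old_exec="OLD_EXECUTION_ID", addStatus=True)
```

Leaving old_exec unset omits the oldExec key, so the act falls back to the directly preceding execution.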