A tool for bulk uploading asset files from a local file system to AEM, driven by a CSV file. It also supports bulk metadata mapping through the Assets API via the same CSV file. A Canto metadata exporter is also provided, which will pre-populate your export CSV file.
This is a fork of https://github.com/adobe/aio-cli-plugin-aem, which is built with https://github.com/adobe/aem-upload
AEM Asset Upload via a CSV file
Upload of asset via nodejs
node bin/run aem:upload-csv -h https://<aem-hostname> -c jh-upload-service-account:<password> -i /path/to/upload.csv
Via compiled binary
./aem-asset-upload aem:upload-csv -h https://<aem-hostname> -c jh-upload-service-account:<password> -i /path/to/upload.csv
Transform the Canto exported XML file to a CSV format for use in the -i argument of the upload-csv command above.
Command flags:
- --inputxml or -i. Set the path to the Canto exported XML file which contains the asset metadata.
- --outputcsv or -o. Set the path and filename for the migrated CSV file for use in the upload-csv command.
- --targetfolder or -t. Set the path to the AEM DAM folder which will be added to the aem_target_folder column of the CSV file.
- --log or -l. The path and filename for the debug log file.
node bin/run xml:export-canto-csv -i "/Users/stewart.leach/project/john-holland/support/Sample Data/Canto XML Export Amaroo Main Sewer Project 5 Items.xml"
The format requires comma-separated columns. For the import to work successfully, the CSV file requires some specific column headings. The headings for defining metadata fields are optional.
- filepath
- The file to upload to AEM. This can be a relative or absolute path.
- uploaded
- This is a required column heading whose values should be left empty. It is automatically set to true after a successful file upload to AEM. Subsequent program runs will ignore any rows where this is set to true.
- aem_target_folder
- The AEM location where the file will be uploaded. This should be a path under /content/dam
After the required headings have been defined, you can optionally define additional column headings. Each one creates a metadata field named after the heading, with the value specified in the row. If the value is left blank, the metadata field won't be created for that file.
These are currently created on the AEM asset's metadata node.
For example, in the sample table below, file1.jpg would have its child jcr:content/metadata node defined with the property metadata_field_1 set to the value value1_a.
If you were to view an example as a spreadsheet, it would look like:
filepath | uploaded | aem_target_folder | metadata_field_1 | metadata_field_2 |
---|---|---|---|---|
/path/to/file/file1.jpg | | /content/dam/jh/example-folder-1 | value1_a | value2_a |
/path/to/file/file2.tif | | /content/dam/jh/example-folder-2 | value1_b | value2_b |
Which viewed as a text file would look like:
filepath,uploaded,aem_target_folder,metadata_field_1,metadata_field_2
/path/to/file/file1.jpg,,/content/dam/jh/example-folder-1,value1_a,value2_a
/path/to/file/file2.tif,,/content/dam/jh/example-folder-2,value1_b,value2_b
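To make the column rules concrete, here is a minimal sketch of how rows in this format could be interpreted. It is not the tool's actual implementation: the function names `parseCsv` and `toUploadPlan` are hypothetical, and the parsing is deliberately naive (it assumes no quoted fields or embedded commas). The second sample row is altered here to show a skipped row and a blank metadata value.

```javascript
// Naive CSV parsing: assumes no quoted fields or embedded commas.
function parseCsv(text) {
  const [headerLine, ...lines] = text.trim().split(/\r?\n/);
  const headers = headerLine.split(',');
  return lines.map((line) => {
    const cells = line.split(',');
    return Object.fromEntries(headers.map((h, i) => [h, (cells[i] || '').trim()]));
  });
}

// The three required headings; every other column is treated as metadata.
const REQUIRED = ['filepath', 'uploaded', 'aem_target_folder'];

// Turn one row into an upload plan, or null if the row was already uploaded.
function toUploadPlan(row) {
  if (row.uploaded === 'true') return null;
  const metadata = {};
  for (const [key, value] of Object.entries(row)) {
    // Blank values are skipped, so no metadata field is created for them.
    if (!REQUIRED.includes(key) && value !== '') metadata[key] = value;
  }
  return { file: row.filepath, target: row.aem_target_folder, metadata };
}

const csv = [
  'filepath,uploaded,aem_target_folder,metadata_field_1,metadata_field_2',
  '/path/to/file/file1.jpg,,/content/dam/jh/example-folder-1,value1_a,value2_a',
  '/path/to/file/file2.tif,true,/content/dam/jh/example-folder-2,value1_b,',
].join('\n');

const plans = parseCsv(csv).map(toUploadPlan).filter(Boolean);
console.log(JSON.stringify(plans, null, 2));
```

In this sketch only file1.jpg produces a plan, because the file2.tif row is already marked as uploaded.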
To enable changing the metadata field names from the source value to some other value (for example, if you want to populate a pre-existing AEM metadata field) you can create a metadata-mapping.csv file.
This should live at the root of the executable.
If it exists, the metadata headings (as described above under Optional Headings) will be transformed to the value in the to column.
If the metadata-mapping.csv isn't available at the root, then no mapping will be attempted.
Note: Currently, it only maps to the default AEM metadata location of jcr:content/metadata/<name>.
The format should look like:
from | to |
---|---|
Approval (Corp.) | approval-corp |
Asset Name | asset-identifier |
Which viewed as a text file would look like:
from,to
Approval (Corp.),approval-corp
Asset Identifier,asset-identifier
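As an illustration of the renaming described above, the following sketch applies a from/to mapping to a list of CSV headings. The helper names `loadMapping` and `renameHeadings` are hypothetical and this is not the tool's own code. It assumes that headings without a mapping entry are left unchanged, and it splits each mapping line on its last comma so that a `from` value may itself contain commas.

```javascript
// Build a lookup from the text of metadata-mapping.csv (naive, no quoted fields).
function loadMapping(text) {
  const [, ...lines] = text.trim().split(/\r?\n/); // skip the "from,to" header
  const mapping = new Map();
  for (const line of lines) {
    const idx = line.lastIndexOf(',');
    if (idx === -1) continue; // ignore malformed lines
    mapping.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim());
  }
  return mapping;
}

// Replace each heading that has a mapping entry; leave the rest untouched (assumption).
function renameHeadings(headings, mapping) {
  return headings.map((h) => (mapping.has(h) ? mapping.get(h) : h));
}

const mapping = loadMapping('from,to\nApproval (Corp.),approval-corp\nAsset Identifier,asset-identifier\n');
const renamed = renameHeadings(
  ['filepath', 'uploaded', 'aem_target_folder', 'Approval (Corp.)', 'Unmapped Field'],
  mapping
);
console.log(renamed.join(','));
```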
The node app can be packaged into a single binary executable using https://github.com/vercel/pkg
This requires the pkg module to be installed globally with npm install -g pkg
pkg . --targets node14-win-x64
Note: In the --targets option used to create the executable we can specify the target architecture according to:
- nodeRange: node${n} or latest → in our case, node14
- platform: freebsd, linux, alpine, macos, win → in our case, macos
- arch: x64, x86, armv6, armv7 → in our case, x64
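The target string is simply the three parts joined with hyphens. As a small sketch, the hypothetical helper `pkgTarget` below (not part of pkg or of this tool) composes and checks a target using only the values listed in the note above:

```javascript
// Allowed values, taken from the note above.
const PLATFORMS = ['freebsd', 'linux', 'alpine', 'macos', 'win'];
const ARCHS = ['x64', 'x86', 'armv6', 'armv7'];

// Compose a pkg target such as "node14-macos-x64", rejecting unknown parts.
function pkgTarget(nodeRange, platform, arch) {
  if (nodeRange !== 'latest' && !/^node\d+$/.test(nodeRange)) {
    throw new Error(`Unknown nodeRange: ${nodeRange}`);
  }
  if (!PLATFORMS.includes(platform)) throw new Error(`Unknown platform: ${platform}`);
  if (!ARCHS.includes(arch)) throw new Error(`Unknown arch: ${arch}`);
  return `${nodeRange}-${platform}-${arch}`;
}

console.log(`pkg . --targets ${pkgTarget('node14', 'macos', 'x64')}`);
```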
Then run the binary version like:
./aem-asset-upload aem:upload-csv -h https://<aem-hostname> -c jh-upload-service-account:<password> -i /path/to/upload.csv
./aem-asset-upload xml:export-canto-csv -i "/path/to/canto xml export.xml"
When dealing with AEM Assets, there may be some additional configurations or manual steps that will help resolve bugs or prevent issues. These are captured here until a more programmatic fix can be provided.
You will probably want to update the asset's title as part of the metadata mapping. By default AEM will display the value of dc:title as the title.
Attempting to set this directly via the Assets API (which this tool uses), for some reason I'm sure Adobe understands, causes the jcr:title to be updated instead.
This results in an unchanged title being displayed for the asset. This behaviour is documented at https://experienceleague.adobe.com/docs/experience-manager-64/assets/extending/mac-api-assets.html?lang=en#update-asset-metadata
Follow these steps to create a workflow which syncs the jcr:title to dc:title to resolve this issue.
- Create a workflow script. It can have the extension .js or .ecma (.ecma appears to be the AEM standard, but either works).
- Save it in crx/de as a file under /etc/workflow/scripts. It can also probably be saved under /apps/<project>/workflow/scripts when added to a code repo, but I haven't tried this yet. Details from https://helpx.adobe.com/au/experience-manager/6-2/sites/developing/using/wf-customizing-extending.html
// This is a fixed version of the script provided by the Adobe documentation
// https://experienceleague.adobe.com/docs/experience-manager-64/assets/extending/mac-api-assets.html?lang=en#update-asset-metadata
//
// workItem, workflowSession and log are provided by the AEM workflow engine.
var workflowData = workItem.getWorkflowData();
log.info("JH Content Sync executing script now...");
if (workflowData.getPayloadType() == "JCR_PATH") {
    // The payload is the asset's jcr:content node (see the launcher path below).
    var path = workflowData.getPayload().toString();
    log.info("JH Path at:" + path);
    var node = workflowSession.getSession().getItem(path);
    var metadataNode = node.getNode("metadata");
    if (metadataNode.hasProperty("jcr:title")) {
        // Copy jcr:title to dc:title so the displayed title is updated.
        var jcrTitle = metadataNode.getProperty("jcr:title");
        log.info("JH jcrTitle 2:" + jcrTitle.getString());
        metadataNode.setProperty("dc:title", jcrTitle.getString());
        metadataNode.save();
    }
}
- Create a new Workflow Model
- Add a new Process Step (filter with "process step" and it will show up) component.
- Edit the Process Step component. Set a title and description in the Common tab (leave everything else) and in the Process tab select our custom workflow script from the dropdown (this was previously added). Select the Handler Advance checkbox. Select Done.
- Select the Sync button on the model.
- Create a new Workflow Launcher so the model is automatically processed when an Asset is modified (in our case, its metadata is updated).
- Select Create > Add Launcher. Populate the following; all other fields can be left blank/default:
1. Event Type = Modified
2. Nodetype = dam:AssetContent
3. Path = /content/dam(/.*)/jcr:content
4. Workflow =
5. Description =
6. Active = selected/checked
Our custom script should then activate each time we update the asset metadata, e.g.
curl -X PUT -u admin:admin -H "Content-Type: application/json;" -d '{"class":"asset", "properties": {"metadata":{"jcr:title":"Also Working Title"}}}' http://localhost:4502/api/assets/wknd/en/activities/hiking/hiker-anapurna.jpg
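For anyone scripting this check from Node rather than curl, the same request can be assembled as below. This is only a sketch: `buildTitleUpdate` is a hypothetical helper, not part of this tool, and the host, credentials and asset path are the local sample values from the curl command. The request itself is not sent here.

```javascript
// Assemble the URL and options for a PUT that sets jcr:title via the Assets API.
function buildTitleUpdate(host, assetPath, title, user, password) {
  return {
    url: `${host}/api/assets${assetPath}`,
    options: {
      method: 'PUT',
      headers: {
        'Content-Type': 'application/json',
        // Basic auth header, equivalent to curl's -u user:password
        Authorization: 'Basic ' + Buffer.from(`${user}:${password}`).toString('base64'),
      },
      body: JSON.stringify({
        class: 'asset',
        properties: { metadata: { 'jcr:title': title } },
      }),
    },
  };
}

const req = buildTitleUpdate(
  'http://localhost:4502',
  '/wknd/en/activities/hiking/hiker-anapurna.jpg',
  'Also Working Title',
  'admin',
  'admin'
);
console.log(req.options.method, req.url);
```

On Node 18 or later, which ships a global `fetch`, the request could then be sent with `fetch(req.url, req.options)`.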
Tail the error.log and you'll see our log.info(...) output from the .ecma script.
When working with custom ecma scripts which we've added to a workflow step, you'll need to clear the script from the script cache each time it's updated in crx/de:
http://localhost:4502/system/console/scriptcache
Not doing this means any updates to the script aren't executed.