dmwm/DBS

File properties 'create_by' and 'creation_date' not supported in DBSServer


As tracked in this WMCore issue: dmwm/WMCore#9042

those two properties are always null and, if the client (WMAgent) provides them, the block fails to be inserted into the database, with the error message "HTTP Error 400: illegal variable name/number from input".

Further details can be found in the proposed fix within WMCore: dmwm/WMCore#9203

@amaltaro
Alan,

DBS supports the block-level create_by and creation_date. Here is an example of a block I inserted with the DBS bulk insert API a few minutes ago: https://cmsweb-testbed.cern.ch/dbs/int/global/DBSReader/blocks?block_name=/unittest_web_primary_ds_name_20190708/Summer2011-pstr-v10/GEN-SIM-RAW%23141444&detail=1

Here is the relevant part of the XML file I used:
'block': {u'create_by': 'Yuyi', u'creation_date': 1279999999, u'open_for_writing': 1, u'block_name': '/unittest_web_primary_ds_name_20190708/Summer2011-pstr-v10/GEN-SIM-RAW#141444', u'file_count': 10, u'origin_site_name': 'cmssrm.fnal.gov', u'block_size': 20122119010, 'origin_site_name': 'my_site'}
You will find that the creation_date stored is the date the block was actually created in DBS, not the date listed in the XML file. DBS does the same thing for create_by.

If you do provide these two fields, DBS will just ignore them. It will not raise an error.

Hi Alan/Yuyi,

I think there is some confusion here due to the issue title. These fields work at the block level, but they don't work at the file level. Take a look at the two cases here: dmwm/WMCore#9203

The second case is where the fields are never populated, because they are never set in the WMCore code. The proposed fix here dmwm/WMCore@5685cd1 adds those fields, which causes the error in DBSServer.
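
Until the server accepts these keys at the file level, a client could drop them from each file entry before the bulk insert. The snippet below is only a sketch: the payload layout (a dict with a "files" list next to "block") and the helper name strip_unsupported_file_keys are assumptions for illustration, not WMCore or DBS code.

```python
# Hypothetical helper; the payload layout used here is an assumption.
UNSUPPORTED_FILE_KEYS = ("create_by", "creation_date")


def strip_unsupported_file_keys(block_dump):
    """Return a copy of the payload whose file entries lack the
    keys that DBSServer rejects at the file level."""
    cleaned = dict(block_dump)  # shallow copy; the block-level keys stay as they are
    cleaned["files"] = [
        {key: value for key, value in file_info.items()
         if key not in UNSUPPORTED_FILE_KEYS}
        for file_info in block_dump.get("files", [])
    ]
    return cleaned
```

The block-level create_by and creation_date are left in place, since those are accepted (and overridden) by DBS.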

@yuyiguo Could you test this at the file level?

@yuyiguo can you please test this bulk insert with the information at file level, as suggested by Erik? Thanks

Sorry Erik and Yuyi, I fixed the issue title and reviewed what was initially reported. Indeed, block properties do work fine, but the file properties are always null, as can be seen in this example:
https://cmsweb.cern.ch/dbs/prod/global/DBSReader/files?block_name=/WminusH_HToZZTo4L_M190_13TeV_powheg2-minlo-HWJ_JHUGenV7011_pythia8/RunIIFall17MiniAODv2-PU2017_12Apr2018_94X_mc2017_realistic_v14-v2/MINIAODSIM%239fef307a-6fe4-4dce-9df1-ca700e5dc7a9&detail=true
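
For anyone who wants to reproduce the lookup for another block, a URL like the one above can be built as in this sketch. The function name dbs_files_url is made up for illustration; the only real requirement is that the "#" in the block name is percent-encoded as %23.

```python
from urllib.parse import quote, urlencode


def dbs_files_url(reader_url, block_name, detail=True):
    """Build a DBSReader 'files' query URL for one block.

    quote_via=quote with safe="/" keeps the slashes readable and
    encodes the '#' in the block name as %23.
    """
    query = urlencode(
        {"block_name": block_name, "detail": "true" if detail else "false"},
        quote_via=quote,
        safe="/",
    )
    return f"{reader_url}/files?{query}"
```

Fetching the URL still needs the usual grid certificate setup, which is left out here.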

@yuyiguo could you please run a quick test on your side to confirm what we see and evaluate whether we want/should provide this information or not?

@amaltaro
Did you mean the creation_date and create_by were null, right?
DBS always inserts files as part of a block, so the block and its files have the same creation_date and create_by. In addition, each file has last_modification_date and last_modified_by. If a file has not been updated, these two are the same as the creation_date and create_by.
Yuyi

Did you mean the creation_date and create_by were null, right?

Yes, Yuyi. However, if you look at the files API link I posted above, you can see those have null values, while the blocks API reports non-null values for the creation_date and create_by attributes.

So I'm lost here. Wasn't the files API supposed to respond with the same block level attributes?
Or is it WMAgent that must insert those attributes at file level as well?

Files:
have null in creation_date and create_by, but the last_modification_date and last_modified_by are filled with the creation time and creator when the files were created in DBS. In other words, creation_date and create_by are useless. Don't use creation_date and create_by.

Blocks:
creation_date and create_by were filled with the creation time and creator when the blocks were created in DBS.

Because all files in DBS are created together with their blocks, the files and blocks have the same creation time and creator, even though the file-level fields are not filled.

WMAgent will not need to fill the creation time and creator. DBS will get the creation time and creator automatically.
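
On the reading side, a client that wants a creation time for a file could fall back to the last-modification fields when the creation fields are null. This is a rough sketch only: effective_creation is a hypothetical helper, and the fallback is reliable only for files that were never updated after insertion. For updated files, the block-level values are the ones to trust.

```python
def effective_creation(file_record):
    """Return (creator, timestamp) for a DBS file record.

    Falls back to last_modified_by / last_modification_date when
    create_by / creation_date are null. The fallback matches the
    real creation values only if the file was never updated.
    """
    creator = file_record.get("create_by") or file_record.get("last_modified_by")
    created = file_record.get("creation_date")
    if created is None:
        created = file_record.get("last_modification_date")
    return creator, created
```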

Files:
have null in creation_date and create_by, but the last_modification_date and last_modified_by are filled with the creation time and creator when the files were created in DBS. In other words, creation_date and create_by are useless. Don't use creation_date and create_by.

Given their uselessness, you might then consider deprecating those fields in the future, so that we have fewer things to store in the database. The problem is that it might break some clients here and there that try to do something with them.

Anyways, thanks for following this issue up, Yuyi. Feel free to close it.
I'm also going to close the WMCore PR proposing changes to fix it.

@amaltaro
Alan,
As you know, DBS3 had to import data from DBS2, and some fields were there for the DBS2 data. We no longer think they are necessary, but they were used in the past.

I am closing the ticket.
Yuyi