Method get_event_attribute_values does not get attributes.
Catadanna opened this issue · 4 comments
Hello,
I have a problem filtering according to attributes. I load my data in a dataframe from a csv file. I have attributes such as 'participant', 'complexity', etc.
Here is my code:
import pandas as pd
import pm4py
df = pd.read_csv(file_path)
event_log = pm4py.format_dataframe(df, case_id='variant_instance_id', activity_key='name', timestamp_key='end_timestamp')
event_log_final = pm4py.convert_to_event_log(event_log)
resources = pm4py.get_event_attribute_values(event_log_final, "org:resource")
print(resources)
The print here is an empty dict.
On the other hand, the following call on the dataframe returns the activities correctly:
activities = pm4py.get_event_attribute_values(event_log, "concept:name")
Here is the error I get when I request "org:resource" from the dataframe (event_log) instead of the converted log:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py", line 3802, in get_loc
return self._engine.get_loc(casted_key)
File "index.pyx", line 153, in pandas._libs.index.IndexEngine.get_loc
File "index.pyx", line 182, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'org:resource'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
(XXX)
resources = pm4py.get_event_attribute_values(event_log, "org:resource")
File "/home/adminid/.local/lib/python3.10/site-packages/pm4py/stats.py", line 165, in get_event_attribute_values
return get.get_attribute_values(log, attribute, parameters=parameters)
File "/home/adminid/.local/lib/python3.10/site-packages/pm4py/statistics/attributes/pandas/get.py", line 158, in get_attribute_values
attributes_values_dict = df[attribute_key].value_counts().to_dict()
File "/usr/local/lib/python3.10/dist-packages/pandas/core/frame.py", line 4090, in __getitem__
indexer = self.columns.get_loc(key)
File "/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py", line 3809, in get_loc
raise KeyError(key) from err
KeyError: 'org:resource'
The problem concerns accessing the attributes.
Please investigate or tell me where the problem is.
Thank you in advance.
Dear @Catadanna,
Can you print the columns of the dataframe?
print(event_log.columns)
I think you do not have 'org:resource' among the attributes of your event log.
I do not have org:resource among the columns. I have time:timestamp and concept:name.
Do I have to declare org:resource?
Here are my columns :
Index(['name', 'taskId', 'complexity', 'retries', 'start_timestamp', 'end_timestamp',
'variant_instance_id', 'task_rank', 'variant_name',
'status', 'case:concept:name', 'concept:name', 'time:timestamp',
'@@index', '@@case_index'],
dtype='object')
I use Python 3.10 and pm4py version 2.7.10.1.
You can apply the get_event_attribute_values only for the attributes (columns of the CSV) that are in the file.
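To illustrate the point about columns, here is a minimal sketch using only pandas. It is not the pm4py implementation; it only reuses the `df[attribute_key].value_counts().to_dict()` call visible in the traceback above. The helper name `attribute_values` and the toy data are made up for the example.

```python
import pandas as pd


def attribute_values(df: pd.DataFrame, attribute: str) -> dict:
    """Count the values of one column (hypothetical helper, not part of pm4py)."""
    if attribute not in df.columns:
        # Fail with a message that lists what is actually available.
        raise KeyError(
            f"{attribute!r} is not a column; available: {list(df.columns)}"
        )
    # Same pandas call that appears in the traceback.
    return df[attribute].value_counts().to_dict()


# Toy data (assumption): two of the column names from the listing in this thread.
df = pd.DataFrame(
    {
        "concept:name": ["A", "B", "A"],
        "complexity": ["low", "high", "low"],
    }
)

# Works, because 'complexity' is a column of the dataframe.
complexity_counts = attribute_values(df, "complexity")
print(complexity_counts)

# 'org:resource' is absent, so asking for it cannot return anything useful.
has_resource = "org:resource" in df.columns
print(has_resource)
```

With the real dataframe, the same check (`"org:resource" in event_log.columns`) shows up front whether the attribute can be queried.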
That is what I did; I sent you the column names. What should I do? The library adds time:timestamp as a column name, so why don't I have org:resource?
I run this:
print("Ressources", resources)
And the result is:
Ressources {}
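A possible explanation, judging from the column listing above: format_dataframe was given a case id, an activity and a timestamp column, and the standard names case:concept:name, concept:name and time:timestamp appear next to them, but nothing was mapped to org:resource. One way to get that attribute would be to create the column yourself before converting the log. The sketch below uses only pandas and assumes the CSV contains a column holding the resource; the name 'participant' is taken from the first post, but it does not appear in the printed columns, so treat it as hypothetical.

```python
import pandas as pd

# Toy data (assumption): 'participant' stands in for whichever column
# of the real CSV identifies the resource.
df = pd.DataFrame(
    {
        "variant_instance_id": [1, 1, 2],
        "name": ["A", "B", "A"],
        "participant": ["alice", "bob", "alice"],
    }
)

# Copy the column under the standard attribute name, keeping the original.
df["org:resource"] = df["participant"].astype(str)

# Same counting call that appears in the traceback.
resources = df["org:resource"].value_counts().to_dict()
print(resources)
```

If the real data has no column that identifies a resource, there is nothing to map, and an empty result for "org:resource" is to be expected.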