bernardodionisi/differences

Having error when clustering standard errors

Opened this issue · 3 comments

Hi. I get the following error when I use cluster_var. The variable is a str. What could be the issue? I would greatly appreciate your help. Thanks in advance.


AttributeError Traceback (most recent call last)
in <cell line: 1>()
----> 1 att_g.fit(
2 formula='####', control_group = 'never_treated', cluster_var = 'user')

~/.cache/pypoetry/virtualenvs/python-kernel-OtKFaj5M-py3.9/lib/python3.9/site-packages/differences/attgt/attgt.py in fit(self, formula, weights_name, control_group, base_delta, est_method, as_repeated_cross_section, boot_iterations, random_state, alpha, cluster_var, split_sample_by, n_jobs, backend, progress_bar)
674 cluster_groups = None
675 if cluster_var:
--> 676 cluster_groups = get_cluster_groups(
677 data=(
678 self._data_matrix[cluster_var]

~/.cache/pypoetry/virtualenvs/python-kernel-OtKFaj5M-py3.9/lib/python3.9/site-packages/differences/attgt/mboot.py in get_cluster_groups(data, cluster_var)
178 raise ValueError("can't have more than 2 cluster variables")
179
--> 180 if find_time_varying_covars(data=data, covariates=cluster_var):
181 raise ValueError("can't have time-varying cluster variables")
182

~/.cache/pypoetry/virtualenvs/python-kernel-OtKFaj5M-py3.9/lib/python3.9/site-packages/differences/tools/panel_utility.py in find_time_varying_covars(data, covariates, rtol, atol)
346
347 if rtol is None and atol is None:
--> 348 varying = data.groupby([entity_name])[covariates].nunique().max(axis=0)
349 return list(varying[varying > 1].index)
350

~/.cache/pypoetry/virtualenvs/python-kernel-OtKFaj5M-py3.9/lib/python3.9/site-packages/pandas/core/base.py in __getitem__(self, key)
236
237 if isinstance(key, (list, tuple, ABCSeries, ABCIndex, np.ndarray)):
--> 238 if len(self.obj.columns.intersection(key)) != len(set(key)):
239 bad_keys = list(set(key).difference(self.obj.columns))
240 raise KeyError(f"Columns not found: {str(bad_keys)[1:-1]}")

~/.cache/pypoetry/virtualenvs/python-kernel-OtKFaj5M-py3.9/lib/python3.9/site-packages/pandas/core/generic.py in __getattr__(self, name)
5573 ):
5574 return self[name]
-> 5575 return object.__getattribute__(self, name)
5576
5577 def __setattr__(self, name: str, value) -> None:

AttributeError: 'Series' object has no attribute 'columns'

Hi Anuar, thanks for reporting this! I'll check and get back to you asap.

I may need to fix something for a second level of clustering, but at the moment I am not sure when I'll have the time to do that. I am guessing you are trying to cluster by a variable that is not your entity, right? If it is your entity, note that when bootstrapping, the clustering is on the entity by default.
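For what it's worth, the traceback suggests the failure happens because the data reaching find_time_varying_covars is a pandas Series rather than a DataFrame, and a Series has no .columns. Below is a minimal pandas-only sketch of that difference. It is not the library's code: the toy panel, the "entity" and "time" index names, and the helper time_varying_columns are all made up for illustration.

```python
import pandas as pd

# Hypothetical toy panel; the index level names are assumptions for this sketch.
index = pd.MultiIndex.from_product(
    [["a", "b"], [2019, 2020]], names=["entity", "time"]
)
panel = pd.DataFrame(
    {"user": ["a", "a", "b", "b"], "y": [1.0, 2.0, 3.0, 4.0]}, index=index
)

as_series = panel["user"]    # a str key returns a Series (no .columns)
as_frame = panel[["user"]]   # a list key keeps a DataFrame


def time_varying_columns(frame, columns, entity_level="entity"):
    """Illustrative helper: list the columns that take more than one
    distinct value within any entity."""
    counts = frame.groupby(level=entity_level)[columns].nunique().max(axis=0)
    return list(counts[counts > 1].index)


series_has_columns = hasattr(as_series, "columns")
varying = time_varying_columns(as_frame, ["user"])
```

In this sketch the check runs on the DataFrame selection because the grouped object still has columns to select from. Whether passing a list to cluster_var avoids the error in the library itself is something I have not verified.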

Yes, I am clustering by the entity. Thanks for the prompt response.