xinychen/transdim

LinAlgError: SVD did not converge using LRTC-TNN

lk1983823 opened this issue · 8 comments

I have non-random missing values, about 50% of the original values, across 5 features. I tried to use LRTC-TNN to restore the missing values, but it raises `LinAlgError: SVD did not converge`. What can I do? Or is there any other method I can use to impute my data? Thanks.
The original data is shown below (just ignore the last figure, the bottom-right one, which shows nothing):

[screenshot: plots of the five original time series]
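As an aside, the `LinAlgError` itself often comes from NumPy's default `gesdd` LAPACK driver failing on near-degenerate matrices, and can sometimes be worked around by falling back to SciPy's slower but more robust `gesvd` driver. A minimal sketch (the `robust_svd` wrapper is hypothetical, not part of transdim):

```python
import numpy as np
from scipy.linalg import svd as scipy_svd

def robust_svd(mat):
    """SVD that falls back to SciPy's 'gesvd' LAPACK driver
    when NumPy's default ('gesdd') fails to converge."""
    try:
        return np.linalg.svd(mat, full_matrices=False)
    except np.linalg.LinAlgError:
        # 'gesvd' is slower but converges on some matrices where 'gesdd' does not
        return scipy_svd(mat, full_matrices=False, lapack_driver='gesvd')

u, s, v = robust_svd(np.random.randn(8, 5))
print(s.shape)  # (5,)
```

Patching the SVD call inside LRTC-TNN this way only masks the symptom; if the error persists, the model or data layout likely needs rethinking, as discussed below.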

If I understand correctly, you have five time series. But the data do not involve a day dimension, so they do not form a tensor. Could you try some basic matrix factorization models instead, such as BTMF, available in this repository?

My data does involve a day dimension; it is sampled at a 1-second interval, ranging from 2023-02-08 05:02:02 to 2023-02-08 07:00:00. In addition, to use LRTC-TNN, I reshape the data to (num_feature, num_sample, time_interval), with time_interval set to 60.
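That reshape can be sketched as follows on synthetic data; the series length and feature count here are assumptions for illustration, with the length trimmed to a multiple of the 60-second window:

```python
import numpy as np

# Hypothetical shapes: 5 features, series length a multiple of 60 seconds.
n_feature, time_interval = 5, 60
series = np.random.randn(7080, n_feature)      # synthetic (num_sample * 60, 5) data
n_sample = series.shape[0] // time_interval

# (num_sample * 60, n_feature) -> (n_feature, num_sample, time_interval)
tensor = series[:n_sample * time_interval]
tensor = tensor.reshape(n_sample, time_interval, n_feature)
tensor = np.moveaxis(tensor, -1, 0)
print(tensor.shape)  # (5, 118, 60)
```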

> If I understand correctly, you have five time series. But the data do not involve a day dimension, so they do not form a tensor. Could you try some basic matrix factorization models instead, such as BTMF, available in this repository?

I am not sure what happened in your experiment. Would you mind trying another model and checking the imputation performance first?

Unfortunately, BTMF doesn't perform well. Here is my toy code:

# reshape (num_sample * 60, n_feature) into (n_feature, num_sample, 60)
sparse_tensor = x_values_wnan.reshape(-1, time_interval, n_feature)
sparse_tensor = np.moveaxis(sparse_tensor, -1, 0)
dim = sparse_tensor.shape
# unfold into a (n_feature, num_sample * 60) matrix for BTMF
sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]])

dim1, dim2 = sparse_mat.shape
rank = 10
time_lags = np.array([1, 2, 60])
init = {"W": 0.01 * np.random.randn(dim1, rank), "X": 0.01 * np.random.randn(dim2, rank)}
burn_iter = 1000
gibbs_iter = 200
# no ground-truth dense matrix is available, so the first argument is a placeholder
mat_hat, W, X, A = BTMF(_, sparse_mat, init, rank, time_lags, burn_iter, gibbs_iter)

The features above don't include a time feature, like timestamps.
The performance of one imputed feature is as follows:
[screenshot: imputation result for one feature]

Have you tried the model with denser time_lags, e.g., time_lags = np.arange(1, 60)? I know your data has a rather high resolution in the time dimension.

Another comment: if you only have 5 time series, please make sure that the rank is not greater than 5.
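Both suggestions applied to the earlier setup might look like the following sketch; the matrix dimensions are assumptions carried over from the toy code above:

```python
import numpy as np

# Hypothetical matrix shape: 5 series, 7080 time steps (5 features x 118 min x 60 s).
dim1, dim2 = 5, 7080
rank = 5                      # no greater than the number of time series
time_lags = np.arange(1, 60)  # denser lags: 1, 2, ..., 59

init = {"W": 0.01 * np.random.randn(dim1, rank),
        "X": 0.01 * np.random.randn(dim2, rank)}
# mat_hat, W, X, A = BTMF(dense_mat, sparse_mat, init, rank,
#                         time_lags, burn_iter=1000, gibbs_iter=200)
```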

> Another comment: if you only have 5 time series, please make sure that the rank is not greater than 5.

Thank you. I tried your suggestion, but it doesn't seem to work. What do you think about compressed sensing?
[screenshot: imputation result with the suggested settings]

Perhaps I would recommend Hankel tensor completion methods for your case. Would you mind taking a look? I don't have any code for that, but it should not be hard to implement.
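The core Hankelization step can be sketched as follows; the `hankelize` helper and the window size are illustrative, not from any particular library:

```python
import numpy as np

def hankelize(series, window):
    """Stack sliding windows of a 1-D series into a Hankel matrix
    of shape (window, len(series) - window + 1); entries are constant
    along anti-diagonals."""
    n = len(series) - window + 1
    return np.stack([series[i:i + window] for i in range(n)], axis=1)

# Hankelizing each of the 5 series and stacking would yield a
# (window, n, 5) tensor that low-rank tensor completion can operate on.
series = np.arange(10.0)
H = hankelize(series, window=4)
print(H.shape)  # (4, 7)
```

Missing values propagate into the Hankel structure, so a tensor completion model recovers them jointly from the overlapping windows; the imputed series is then read back off the anti-diagonals (e.g., by averaging).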