ChangWeiTan/MultiRocket

Some features have 0 standard deviation and others explode to very large values


Hi!

I'm working on a project where I use MultiRocket to extract features from multivariate time series signals. Each input signal has shape (250, 3), i.e. 250 time steps across 3 channels. We have 6900 input signals, and running them through MultiRocket produces output features of shape (6900, 50000) (roughly).

The signals are in the range -10 to 40 (after standardization).

We use the standard implementation from your code, with no changes.
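For context, our usage looks roughly like the sketch below. The import path, class name, and constructor/method signatures here are assumptions written for illustration and may not match the repository's actual API exactly:

```python
import numpy as np

# Assumed import path and API for illustration only; the actual entry point
# and method names in the repository may differ.
from multirocket.multirocket_multivariate import MultiRocket

# X: 6900 signals, 3 channels, 250 time steps each
X = np.random.randn(6900, 3, 250).astype(np.float32)
y = np.zeros(6900, dtype=np.int64)  # dummy labels, only needed for fitting

model = MultiRocket(num_features=50_000)  # assumed constructor argument
model.fit(X, y)
features = model.transform(X)  # roughly (6900, 50000)
```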

When inspecting the output features, we see that some of them have a standard deviation of 0, i.e. across all 6900 outputs those features take exactly the same value, completely independent of the input. There are 336 features for which this is the case. Their indices are:

1, 4, 7, 14, 15, 70, 72, 78, 112, 114, 117, 151, 174, 177, 185, 211, 240, 248, 308, 311, 316, 321, 342, 376, 384, 449, 452, 476, 478, 481, 486, 488, 491, 514, 520, 546, 551, 554, 559, 578, 580, 583, 585, 588, 787, 795, 826, 829, 850, 852, 884, 918, 921, 926, 934, 999, 1057, 1067, 1088, 1091, 1093, 1125, 1130, 1133, 1159, 1162, 1167, 1169, 1172, 1258, 1261, 1263, 1266, 1269, 1274, 1295, 1300, 1303, 1308, 1329, 1337, 1360, 1363, 1366, 1368, 1373, 1374, 1395, 1494, 1565, 1567, 1648, 1649, 1701, 1703, 1706, 1729, 1803, 1827, 1832, 1884, 1887, 1902, 2078, 2159, 2162, 2258, 2269, 2271, 2311, 2313, 2324, 2326, 2366, 2371, 2381, 2483, 2510, 2552, 2562, 2640, 2643, 2690, 2701, 2769, 2790, 2803, 2819, 2850, 2861, 2891, 2963, 2989, 3044, 3064, 3098, 3106, 3128, 3208, 3226, 3274, 3288, 3345, 3374, 3449, 3478, 3489, 3491, 3497, 3501, 3531, 3555, 3557, 3562, 3573, 3599, 3622, 3640, 3641, 3675, 3703, 3745, 3751, 3793, 3847, 3900, 3931, 3944, 3970, 3975, 4000, 4012, 4018, 4042, 4059, 4060, 4073, 4102, 4109, 4120, 4127, 4143, 4145, 4163, 4175, 4280, 4281, 4285, 4332, 4357, 4375, 4395, 4467, 4487, 4589, 4600, 4609, 4641, 4648, 4649, 4652, 4657, 4743, 4779, 4806, 4815, 4818, 4822, 4846, 4862, 4881, 4901, 4904, 4916, 4932, 4956, 4964, 4965, 4972, 4989, 5013, 5024, 5054, 5074, 5078, 5091, 5111, 5154, 5174, 5175, 5198, 5224, 5288, 5308, 5325, 5382, 5391, 5399, 5415, 5502, 5530, 5576, 5593, 5608, 5637, 5697, 5755, 5787, 5870, 5921, 6033, 6086, 12264, 12265, 12268, 12269, 12272, 12273, 12276, 12277, 12280, 12281, 12284, 12285, 12288, 12289, 12292, 12293, 12296, 12297, 12300, 12301, 12304, 12305, 12308, 12309, 12312, 12313, 12316, 12317, 12320, 12321, 12324, 12325, 12328, 12329, 12332, 12333, 12336, 12337, 12340, 12341, 12344, 12345, 12348, 12349, 12352, 12353, 12356, 12357, 12360, 12361, 12364, 12365, 12368, 12369, 12372, 12373, 12376, 12377, 12380, 12381, 12384, 12385, 12388, 12389, 12392, 12393, 12396, 12397, 12400, 12401, 12404, 12405, 12408, 12409, 12412, 12413, 12416, 12417, 12420, 12421, 12424, 12425, 12428, 12429

And they have the following means (a sketch of the check we use to find them is included after this list):

0.7640, 0.9080, 0.0560, 0.7280, 0.1120, 0.1200, 0.8800, 0.1760, 0.1640, 0.9240, 0.0720, 0.0600, 0.8440, 0.9880, 0.0480, 0.9760, 0.0560, 0.1120, 0.0280, 0.1760, 0.0840, 0.9920, 0.0160, 0.0040, 0.0600, 0.8840, 0.0320, 0.2000, 0.9600, 0.1080, 0.0200, 0.7800, 0.9240, 0.7120, 0.0080, 0.9320, 0.8440, 0.9880, 0.9000, 0.1600, 0.9200, 0.0680, 0.8320, 0.9760, 0.9880, 0.0480, 0.8840, 0.0320, 0.0560, 0.8160, 0.0400, 0.0280, 0.1720, 0.0840, 0.1400, 0.9640, 0.1200, 0.9360, 0.9600, 0.1080, 0.8680, 0.0960, 0.0040, 0.1520, 0.0840, 0.2280, 0.1360, 0.9000, 0.0480, 0.8920, 0.0440, 0.8040, 0.9480, 0.1000, 0.0080, 0.0280, 0.9360, 0.0840, 0.9920, 0.0160, 0.0720, 0.8560, 0.0040, 0.1480, 0.9080, 0.8200, 0.2040, 0.2240, 0.0400, 0.1600, 0.9200, 0.8600, 0.2440, 0.1080, 0.8680, 0.0160, 0.8000, 0.0680, 0.2360, 0.1440, 0.0080, 0.1520, 0.8800, 0.1080, 0.0480, 0.1920, 0.8600, 0.0640, 0.8240, 0.1080, 0.8680, 0.0720, 0.8320, 0.1160, 0.0240, 0.8400, 0.8040, 0.1200, 0.1600, 0.9760, 0.7720, 0.9160, 0.8680, 0.0720, 0.0480, 0.0680, 0.0360, 0.1440, 0.9840, 0.1880, 0.6440, 0.1480, 0.0800, 0.0880, 0.7240, 0.7120, 0.7680, 0.1720, 0.7280, 0.6040, 0.9360, 0.2880, 0.0600, 0.1360, 0.7800, 0.8600, 0.0640, 0.8240, 0.1200, 0.6440, 0.1040, 0.2720, 0.0360, 0.9440, 0.1480, 0.0800, 0.8600, 0.7360, 0.1200, 0.1080, 0.8000, 0.8440, 0.1360, 0.1800, 0.8040, 0.0520, 0.8880, 0.8560, 0.7840, 0.6960, 0.2480, 0.8280, 0.1240, 0.2880, 0.7800, 0.1640, 0.1320, 0.2080, 0.8800, 0.0840, 0.7560, 0.8640, 0.6320, 0.5080, 0.0920, 0.1960, 0.5800, 0.1080, 0.0600, 0.6080, 0.4840, 0.1240, 0.6240, 0.2640, 0.2240, 0.4240, 0.8600, 0.0880, 0.7600, 0.1440, 0.2880, 0.2000, 0.0480, 0.7960, 0.1120, 0.5480, 0.6920, 0.2240, 0.3880, 0.5000, 0.7560, 0.3960, 0.5440, 0.1280, 0.2400, 0.4040, 0.4600, 0.8400, 0.5160, 0.0120, 0.1800, 0.3800, 0.8360, 0.4760, 0.0080, 0.9680, 0.6080, 0.0360, 0.6720, 0.0560, 0.8400, 0.7720, 0.2200, 0.8560, 0.3520, 0.1240, 0.5600, 0.6160, 0.7280, 0.9560, 0.6520, 0.2240, 0.7160, 0.4480, 0.5240, 0.4440, 0.5960, 0.8160, 0.5240, 0.0040, 0.7800, 0.0280, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000
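For reference, we identify these constant features with a check along the following lines (a minimal sketch, assuming `features` is the (6900, ~50000) array produced by the transform above):

```python
import numpy as np

# features: (n_samples, n_features) array produced by the MultiRocket transform
stds = features.std(axis=0)
means = features.mean(axis=0)

constant_idx = np.flatnonzero(stds == 0)  # features identical across all 6900 samples
print(constant_idx.size)                  # 336 in our case
print(constant_idx)                       # the indices listed above
print(means[constant_idx])                # the means listed above
```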

Currently, we work around this by setting those features to 0 before passing them on to our neural network, but we would like to solve the underlying issue.
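Concretely, the workaround is just masking those feature columns (continuing from the check above):

```python
# Zero out the constant features before they reach the network
features_masked = features.copy()
features_masked[:, constant_idx] = 0.0
```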

Similarly, we found that some features explode to values on the order of 1e+32 (both positive and negative). This reliably happens at feature indices 4515, 4526, 4531 and 4842. We also tested a univariate version of the signal, but it had the same issue.

We have begun looking into why this is happening, but thought we would ask whether you have any ideas or pointers.

UPDATE: We found out why some of our features were exploding. In some cases the standard deviation of a feature is so small that, when we standardize the features, dividing by that tiny value blows the feature up. This is now fixed on our side; however, it still leaves us with the first problem of features being constant across inputs (affecting even more features now than before).
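For completeness, the fix on our side is simply to guard the standardization against near-zero standard deviations, roughly as follows (a sketch; the epsilon value is our own choice):

```python
import numpy as np

eps = 1e-8  # floor on the std; prevents division by near-zero values
mean = features.mean(axis=0)
std = features.std(axis=0)
features_standardized = (features - mean) / np.maximum(std, eps)
```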

Best regards,
David