faisaltheparttimecoder/mock-data

Error during committing data: ERROR #54000 index row requires 828856 bytes, maximum size is 8191

janpio opened this issue · 9 comments

Ok, this is a strange one:

mock-windows-amd64-v2.4.exe database -a localhost -w foo -f -v -d gitlab
...
7 148 167 33 67 144 136 252 232 163 164 136 0 178 186 157 232 105 251 53 155 27 27 63 104 30 77 67 212 150 66 6 94 5 251 78 21 119 42 197 158 178 36 254 137 22 194 54 130 18 245 123 27 151 206 49 160 30 202 158 155 86 32 205 93 123 50 118 203 71 108 249 50 165 160 86 18 114 180 224 107 86 31 231 125 71 117 138 145 234 42 196 3 88 53 16 109 238 49 182 19 140 53 32 20 37 89 189 122 176 102 127 38 178 195 232 15 238 240 193 45 240 190 159 9 36 118 223 99 152 141 34 51 72 119 144 243 63 42 199 230 207 160 117 254 224 186 231 159 42 91 64 154 92 89 172 90 172 160 41 56 242 77 214 229 100 30 149 211 179 201 97 51 10 115 149 90 206 151 252 104 201 199 164 20 151 171 115 103 21 168 190 218 211 220 80 3 125 200 245 155 243 149 43 226 192 81 91 36 235 231 63 6 193 69 75 148 106 54 198 58 142 115 24 83 99 228 47 246 176 111 192 59 84 185 240 14 79 1 212 184 243 18 162 16 134 81 42 161 38 102 129 151 179 89 105 79 242 38 200 69 199 70 6 243 15 151 95 178 213 103 78 254 20 119 84 44 113 156 234 245 195 148 253 150 252 121 155 38 45 32 73 36 105 175 57 186 196 212 69 183 164 92 200 104 76 137 177 141 23 17 156 83 161 40 79 130 226 103 147 195 189 246 174 199 162 141 89 83 147 127 63 236 196 122 133 72 86 177 170 136 200 80 245 109 142 213 7 76 73 103 208 13 220 247 170 251 167 149 161 191 243 162 40 162 112 171 51 37 235 120 130 180 41 224 216 164 154 254 107 154 51 118 136 209 78 152 35 1 55 33 143 193 98 37 80 117 177 176 123 182 169 202 164 98 114 47 172 20 49 14 118 249 102 109 6 44 141 235 218 29 104 231 91 245 164 86 165 179 72 243 135 84 57 157 46 247 17 21 133 231 61 226 117 209 252 133 66 4 154 5 104 7 189 55 8 253 28 62 0 90 232 13 35 145 158 107 84 97 66 231 143 200 111 159 159 42 198 146 11 45 179 80 177 15 101 107 70 107 59 189 71 93 156 109 248 1 36 0 86 202 165 63 48 194 113 69 222 61 78 32 215 194 212 157 40 15 26 90 200 71 239 213 143 179 18 245 116 18 194 159 225 132 18 139 121 236 40 233 108 100 229 190 87 18 208 90 51 185 75 246 4 127 172 202 240 27 46 95 203 
201 207 215 131 243 214 98 31 13 11 132 32 98 152 183 11 102 179 235 242 230 4 69 231 58 201 239 9 229 6 177 21 8 178 204 22 38 246 160 236 197 91 68 254 210 30 73 72 26 111 246 245 60 18 70 185 26 145 151 213 57 7 102 207 58 9 146 147 70 151 103 125 63 141 182 120 190 229 245 10 184 34 185 142 250 222 110 128 139 72 154 8 130 94 233 121 154 199 74 51 187 80 231 65 173 13 1 192 167 21 128 15 119 233 32 22 108 79 236 98 82 52 8 32 233 174 199 217 219 75 244 105 35 19 243 121 71 32 106 216 125 232 216 97 119 6 240 176 186 46 50 168 43 193 122 93 39 184 251 43 116 72 132 20 88 247 127 227 121 188 187 199 163 81 72 239 188 159 64 159 156 217 82 92 79 115 183 7 111 74 188 145 235 231 28 112 92 233 165 6 34 196 104 152 243 253 69 38 108 206 250 101 58 232 186 20 228 141 211 247 232 48 228 249 38 91 81 6 241 86 63 237 223 115 100 61 165 219 209 208 72 111 19 162 96 78 135 167 110 131 210 204 46 53 66 226 67 122 143 224 244 126 22 11 237 181 242 5 0 14 197 50 48 47 127 207 52 226 24 30 115 198 124 146 238 99 236 240 0 247 87 251 248 183 47 103 141 161 203 8 61 63 239 128 9 174 124 18 94 109 124 149 252 93 186 246 212 59 235 72 92 146 196 163 230 88 124 188 125 175 144 47 82 179 162 140 125 217 210 182 125 245 208 161 67 147 83 111 156 155 174 222 213 66 84 71 37 140 123 193 34 52 23 156 234 233 241 173 55 17 119 145 85 73 169 187 85 173 203 109 134 35 154 200 228 42 148 169 95 189 70 218 112 167 60 84 233 208 127 212 55 83 117 240 145 56 52 179 237 234 73 13 201 42 156 254 77 111 136 37 89 156 134 86 68 42 46 22 145 157 65 206 121 141 108 5 202 248 136 192 70 158 14 50 121 85 205 127 25 7 195 34 145 76 6 102 141 94 233 177 236 221 208 232 159 88 195 184 216 69 70 28 170 248 233 28 224 232 6 153 95 10 252 242 246 206 238 163 208 200 3 37 245 110 242 105 149 43 68 88 67 169 222 61 11 158 242 191 39 50 0 100 111 51 195 244 100 72 94 208 23 142 35 125 148 192 64 238 23 20 171 159 197 96 197 1 71 172 252 81 33 110 8 71 87 175 5 74 244 116 20 219 16 184 38 185 97 15 60 176 197 
157 33 230 150 211 0 245 88 99 49 254 26 51 179 3 68 40 144 2 164 233 103 106 61 230 74 158 92 19 18 126 152 55 69 253 111 93 220 252 105 72 33 44 220 50 127 236 93 212 94 70 13 224 21 93 45 71 180 181 8 85 39 149 147 66 54 186 203 107 243 54 12 146 188 124 163 29 0 250 97 132 40 158 137 246 55 111 23 153 172 159 140 207 234 181 57 220 32 81 59 165 119 96 109 9 253 117 219 207 161 58 94 201 103 187 8 248 182 115 193 165 226 136 243 192 179 85 161 8 125 4 4 72 156 146 30 94 50 176 116 124 24 92 151 196 193 167 86 179 186 181 206 199 95 54 198 250 228 70 105 57 30 48 114 126 46 216 155 21 254 84 224 180 55 173 53 148 66 102 204 77 62 14 61 149 81 104 6 186 74 125 31 208 59 184 34 168 120 26 229 75 38 43 222 241 154 231 49 153 64 156 208 66 141 88 249 71 102 151 144 238 161 60 131 47 82 224 78 183 37 173 69 78 21 19 200 208 209 171 214 51 132 9 138 235 230 104 69 175 196 113 206 103 148 116 22 209 22 64 178 175 8 248 105 99 176 15 234 197 10 99 139 151 36 146 61 213 194 96 232 188 161 87 219 236 5 123 241 87 75 240 207 45 145 85 237 144 180 121 211 30 246 213 248 41 24 59 125 133 105 203 178 93 205 33 58 163 228 48 28 221 36 19 41 38 126 8 164 169 90 163 152 117 70 39 14 107 190 48 241 250 210 130 215 215 154 250 188 20 158 137 41 93 151 134 139 133 12 179 77 209 16 2 218 247 152 72 199 125 1 135 42 184 225 117 17 126 184 80 168 253 93 31 20 49 0 228 49 89 155 121 8 21 22 128 163 35 159 10 152 245 207 163 16 0 137 191 198 180 251 157 27 230 169 207 109 142 87 150 241 127 111 16 227 154 53 229 216 98 68 202 112 125 98 253 110 184 216 159 129 87 105 104 37 97 10 237 128 171 6 211 159 2 129 45 248 6 35 27 18 162 58 24 109 40 245 149 212 40 254 60 69 20 224 184 73 32 65 148 117 158 159 139 22 216 4 84 246 136 248 224 109 221 111 186 68 72 209 100 45 85 158 50 139 175 1 2 112 154 186 113 131 26 236 221 205 80 8 18 224 48 204 166 90 212 128 116 253 20 69 140 143 161 211 216 193 236 137 38 253 221 47 31 15 208 112 230 187 28 19 77 254 219 251 72 9 91 208 244 47 181 76 176 
185 46 254 41 99 145 14 200 185 147 13 14 233 165 72 221 28 195 134 48 247 48 46 239 16 193 234 187 223 33 167 25 135 61 129 142 64 202 100 187 50 213 166 140 69 47 82 85 4 242 112 171 229 137 87 53 46 239 150 98 50 186 223 187 239 243 240 200 155 152 113 109 21 109 119 177 128 14 124 112 228 7 181 76 190 177 42 149 3 168 107 59 243 241 119 14 70 197 182 200 193 60 202 31 85 74 56 179 77 78 177 155 15 4 181 150 82 132 50 32 187 154 97 139 84 149 173 96 13 168 238 6 42 17 185 194 54 196 152 252 40 17 79 158 36 229 153 70 41 0 77 196 57 71 27 231 183 232 231 15 205 86 244 67 211 92 14 150 220 70 120 220 159 235 172 122 93 114 187 157 27 247 69 213 180 224 251 88 89 182 70 117 63 81 15 82 28 87 190 191 91 209 37 88 244 226 248 68 98 91 174 186 71 138 4 225 55 39 199 112 7 198 128 57 43 140 95 196 39 50 24 82 225 37 210 160 196 58 76 103 53 135 188 57 164 104 24 104 152 25 198 97 185 206 236 66 42 201 117 141 148 102 242 13 56 171 152 85 36 1 22 155 231 94 220 195 120 120 37 215 154 12 64 100 193 238 46 82 202 131 44 204 224 189 162 253 87 36 182 2 141 117 225 91 73 15 182 106 241 113 90 207 130 187 184 51 164 232 65 16 44 71 83 90 104 136 96 51 162 51 235 136 186 125 43 175 233 2 200 95 90 80 145 196 12 140 68 239 90 24 201 250 118 206 210 232 42 140 83]$tenetur dolore hic at excepturi. et numquam blanditiis quia molestiae rerum aut fugit eaque. molestiae animi ut ipsam ea aut. et quam ut omnis cum. vel adipisci dolores sit eius dolor. cumque alias nihil quis sit a amet ducimus temporibus.\tut quia et eius et vel nobis. accusantium ex ut qui ut. earum voluptatum quasi molestiae vel officia. nihil odit odit aut illum pariatur ipsa vel. blanditiis eveniet porro accusantium! error nostrum beatae perferendis ex est! similique et temporibus quidem alias dolores. minus vel nihil debitis impedit facilis rem aut aut.\tsapiente rem et sunt quam voluptatem. eum iste tenetur eum ut ut consequatur doloribus nihil. dolor sunt eum dicta est id. illum laborum vero ut. 
eius totam et pariatur qui aut ea adipisci. quis et facere ipsum non ratione. quis architecto odit quis quibusdam quod consequatur! asperiores ut aut ullam quaerat.$molestiae reiciendis nisi eaque repellat!$-2236$5158206" file="worker.go:170"
time="2020-03-08 16:50:37" level=fatal msg="Error during committing data: ERROR #54000 index row requires 828856 bytes, maximum size is 8191" file="worker.go:171"

Schema is https://github.com/prisma/database-schema-examples/blob/master/postgres/gitlab/schema.sql

(Output is unfortunately too long for my terminal, so I cannot post more context here.)

Piped the output into a file to understand what is going on:

...
time="2020-03-08 20:37:50" level=debug msg="Removing constraints for table: \"public\".\"gpg_signatures\"" file="constraintsBackup.go:96"
time="2020-03-08 20:37:50" level=debug msg="Extracting constraint info for table: \"public\".\"gpg_signatures\"" file="sql.go:314"

time="2020-03-08 20:37:50" level=debug msg="Table: \"public\".\"gpg_signatures\"" file="worker.go:168"
time="2020-03-08 20:37:50" level=debug msg="Copy Statement: COPY \"public\".\"gpg_signatures\"(\"created_at\",\"updated_at\",\"project_id\",\"gpg_key_id\",\"commit_sha\",\"gpg_key_primary_keyid\",\"gpg_key_user_name\",\"gpg_key_user_email\",\"verification_status\",\"gpg_key_subkey_id\") FROM STDIN WITH CSV DELIMITER '$' QUOTE e'\\x01'" file="worker.go:169"
time="2020-03-08 20:37:50" level=debug msg="Data: 2021-10-21 04:47:21.000000$2019-06-02 10:45:36.000000$-2415721$-2877003$[155 91 65 244 35 249 101 165 205 238 224 196 186 64 8 129 38 253 188 196 74 78 244 141 176 21 160 129 115 45 23 215 198 229 151 157 109 16 174 131 68 15 171 72 141 88 1 1 240 0 70 101 132 156 128 182 11 19 217 4 101 135 72 62 171 57 9 166 231 1 ...
...

Looks to me like it creates a far too large commit_sha, which the table does not accept.
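The bracketed, space-separated numbers in the Data line look like what Go's fmt package prints for a byte slice with %v. Whether the tool really renders the value that way is an assumption here; the log does not confirm it. A minimal sketch comparing that rendering with PostgreSQL's hex input format for bytea (renderDefault and renderHex are hypothetical helper names, not functions from mock-data):

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// renderDefault formats b the way fmt's %v verb does for a byte slice,
// e.g. "[155 91 65 244]". Hypothetical helper for illustration.
func renderDefault(b []byte) string {
	return fmt.Sprintf("%v", b)
}

// renderHex formats b in PostgreSQL's hex input format for bytea,
// e.g. "\x9b5b41f4". Hypothetical helper for illustration.
func renderHex(b []byte) string {
	return `\x` + hex.EncodeToString(b)
}

func main() {
	b := []byte{155, 91, 65, 244}
	fmt.Println(renderDefault(b)) // [155 91 65 244]
	fmt.Println(renderHex(b))     // \x9b5b41f4
}
```

If the value is sent as the first form, the stored text is longer than the raw bytes, since each byte takes up to three digits plus a separator.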

That column is of type bytea.

Index definition: https://github.com/prisma/database-schema-examples/blob/cdba7b880b5d9cba9ec5c97c83aa083ea85d89c3/postgres/gitlab/schema.sql#L9210-L9223

CREATE UNIQUE INDEX index_gpg_signatures_on_commit_sha ON public.gpg_signatures USING btree (commit_sha);
CREATE INDEX index_gpg_signatures_on_gpg_key_id ON public.gpg_signatures USING btree (gpg_key_id);

Seems to me the generated data and the index clash here in some way.

Indexing a bytea column isn't very useful in the real world, since it is a datatype that holds large blob objects like images etc. So do we need to fix this? Or do we reduce the size here to stay within the index limit? https://github.com/pivotal-gss/mock-data/blob/0fa82557d024ea6ba9355b63ca7a6bd1eba9ff56/engine.go#L157

Doubtful

Gitlab seems to think so in their schema 🤷‍♀

Am I understanding correctly that this is caused by the index only?
Could this become another rule - "if there is an index, generate a shorter byte value that can fit into the index" - maybe?
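The proposed rule could look roughly like this. It is only a sketch: the two size constants, byteaLen, and randomBytea are hypothetical names and values chosen for illustration, and how the tool would learn whether a column is indexed is left out.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// Hypothetical sizes, chosen for illustration only.
const (
	defaultByteaLen = 1024 * 1024 // large blob-like value when no index is involved
	indexedByteaLen = 64          // short value intended to fit into a btree index row
)

// byteaLen returns how many random bytes to generate for a bytea column,
// depending on whether the column is covered by an index.
func byteaLen(indexed bool) int {
	if indexed {
		return indexedByteaLen
	}
	return defaultByteaLen
}

// randomBytea returns n random bytes. It is a stand-in for the tool's own
// generator, not its actual implementation.
func randomBytea(n int) ([]byte, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return nil, err
	}
	return b, nil
}

func main() {
	for _, indexed := range []bool{false, true} {
		b, err := randomBytea(byteaLen(indexed))
		if err != nil {
			panic(err)
		}
		fmt.Printf("indexed=%v len=%d\n", indexed, len(b))
	}
}
```

Columns without an index would keep their large values, so only indexed columns pay the cost of less blob-like data.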

Yup, it's the index on the bytea column. I believe it's safer and easier to reduce the size.

Anyway, it's mock data, so who cares what the data is 🤪

Seems like this is not as easy as I thought. I reduced the size to the lowest value at which we can still expect genuine Bytea-type data:

return RandomBytea(64 * 64), nil

Anything lower than this is basically a lot of fake text, which defeats the purpose. Even at this setting the index creation ends with an error:

ERROR:  index row size 5480 exceeds maximum 2712 for index "t_idx"

Again, I understand we might not be able to dictate what data type the user needs to use, since the user can store any kind of data in a column (for example, text in a Bytea column). But for the type of data our tool introduces, it works to use an index like the one below, which reduces the size of the index entries:

create unique index t_idx on t(md5(a));
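The expression index above works because an MD5 digest has a fixed length no matter how large the input is. A small Go sketch of that property (md5Hex is a hypothetical helper; it mirrors the 32-character hex text that PostgreSQL's md5() returns):

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// md5Hex returns the MD5 digest of b as a 32-character hex string.
func md5Hex(b []byte) string {
	sum := md5.Sum(b)
	return hex.EncodeToString(sum[:])
}

func main() {
	small := make([]byte, 16)
	large := make([]byte, 64*64) // same size as RandomBytea(64 * 64)

	// Both digests have the same length, regardless of input size.
	fmt.Println(len(md5Hex(small)), len(md5Hex(large))) // 32 32
}
```

Note that a unique index on the digest enforces uniqueness of the hash rather than of the raw value, which for mock data is usually an acceptable trade-off.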

Yes we do have a bug with multiple brackets (#24).

At this point, let's close this issue as won't fix, since for the type of data we are introducing into Bytea columns an index is not possible due to the Postgres size restriction. I will add that to our README as a known issue.

Updated the README with known issues and closing, since for the type of Bytea data we are generating, index creation isn't possible.

I have reduced the size, and it is now possible to create an index on a bytea column. This will be available in the next release; marking as fixed.

Oh wow. Hope this does not make the tool "worse" for other users.

(If it does indeed, we can of course fall back to custom - or check for the existence of an index before reducing the size of the generated data)