IGS/gEAR

Some uploads failing due to character set

Closed this issue · 3 comments

Example dataset title: Stress-induced β cell early senescence .....

On upload the error is:

Exception: Failed to insert metadata: 1366 (HY000): Incorrect string value: '\xCE\xB2 cel...' for column 'title' at row 1
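
As a quick sanity check, the two bytes quoted in the error can be decoded in any MySQL session to confirm they are the UTF-8 encoding of the β in the title. This is only a sketch: it does not touch the dataset table, and the `decoded` alias is just an illustrative name.

```sql
-- Interpret the bytes from the error message (\xCE\xB2) as UTF-8 text.
-- On a client whose connection character set is utf8mb4 this should display β.
SELECT CONVERT(UNHEX('CEB2') USING utf8mb4) AS decoded;
```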

To support this, we need to change the encoding of the affected columns to utf8mb4.

Here are the commands needed to patch an existing DB. @adkinsrs, you'll need to run these on your local devel instances.

ALTER TABLE dataset DROP INDEX text_idx;
ALTER TABLE dataset DROP INDEX text_with_geo_idx;
ALTER TABLE dataset DROP INDEX text_with_geo_pubmed_idx;

ALTER TABLE dataset MODIFY COLUMN title VARCHAR(255)
    CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL;

ALTER TABLE dataset MODIFY COLUMN ldesc TEXT
    CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

ALTER TABLE dataset MODIFY COLUMN geo_id VARCHAR(50)
    CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

ALTER TABLE dataset MODIFY COLUMN pubmed_id VARCHAR(20)
    CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

ALTER TABLE dataset ADD FULLTEXT INDEX text_idx (title, ldesc);
ALTER TABLE dataset ADD FULLTEXT INDEX text_with_geo_idx (title, ldesc, geo_id);
ALTER TABLE dataset ADD FULLTEXT INDEX text_with_geo_pubmed_idx (title, ldesc, geo_id, pubmed_id);
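
After running the patch, something like the following could be used to check that it took effect. This is a hedged sketch, assuming the session's default database is the gEAR database (so that `DATABASE()` resolves to it); it only reads metadata and changes nothing.

```sql
-- Each of the four columns should now report utf8mb4 / utf8mb4_unicode_ci.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'dataset'
  AND COLUMN_NAME IN ('title', 'ldesc', 'geo_id', 'pubmed_id');

-- The three FULLTEXT indexes dropped at the start should be listed again.
SHOW INDEX FROM dataset WHERE Index_type = 'FULLTEXT';
```

If uploads still fail after the columns are converted, the client connection's character set may be worth checking as well, since it also has to be able to carry these characters.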

This has been deployed on NeMO.