Reactome - Discrepancy between internal database and web presentation?
nataled opened this issue · 3 comments
I found something odd. The following error messages were generated from my processing of the 'Liam' data. Each one flags a case where the sequence length stored in Reactome differs from the length in UniProtKB. What's odd is that these discrepancies are not visible in the web presentation (at least for the ones I spot-checked). An example:
https://www.reactome.org/content/detail/R-HSA-419522 (shows chain 1-1451)
https://reactome.org/content/schema/instance/browser/419522 (shows endCoordinate 1471)
EWAS R-HSA-9707859 GO:0005654 O15172 1 223 PGRMC2 [nucleoplasm] (O15172 length = 72)
EWAS R-HSA-1236845 GO:0030670 Q03518 1 808 TAP1 [phagocytic vesicle membrane] (Q03518 length = 748)
EWAS R-HSA-400122 GO:0005789 O60427 1 501 FADS1 [endoplasmic reticulum membrane] (O60427 length = 444)
EWAS R-HSA-5216184 GO:0005759 Q5K4L6 1 730 SLC27A3 [mitochondrial matrix] (Q5K4L6 length = 683)
EWAS R-HSA-391965 GO:0005789 Q93084 1 1043 ATP2A3 [endoplasmic reticulum membrane] (Q93084 length = 999)
EWAS R-HSA-198188 GO:0005886 P01850 1 177 TRBC1 [plasma membrane] (P01850 length = 176)
EWAS R-HSA-5690190 GO:0005654 Q9UK80 1 1055 USP21 [nucleoplasm] (Q9UK80 length = 565)
EWAS R-HSA-2682364 GO:0005829 Q9HBY8 1 427 SGK2 [cytosol] (Q9HBY8 length = 367)
EWAS R-HSA-9612177 GO:0005654 Q15742 1 585 NAB2 [nucleoplasm] (Q15742 length = 525)
EWAS R-HSA-419393 GO:0005886 Q9HBW0 1 351 LPAR2 [plasma membrane] (Q9HBW0 length = 348)
EWAS R-HSA-5216046 GO:0005886 Q0D2K0 1 466 NIPAL4 [plasma membrane] (Q0D2K0 length = 404)
EWAS R-HSA-3239030 GO:0005654 Q8TEK3 1 1739 DOT1L [nucleoplasm] (Q8TEK3 length = 1537)
EWAS R-HSA-195060 GO:0005829 O94827 1 1062 PLEKHG5 [cytosol] (O94827 length = 1006)
EWAS R-HSA-3244651 GO:0005789 Q9NSU2 1 369 TREX1 [endoplasmic reticulum membrane] (Q9NSU2 length = 314)
EWAS R-HSA-947474 GO:0005789 Q03518 1 808 TAP1 [endoplasmic reticulum membrane] (Q03518 length = 748)
EWAS R-HSA-418301 GO:0005886 P20020 1 1258 ATP2B1 [plasma membrane] (P20020 length = 1220)
EWAS R-HSA-9751650 GO:0005789 Q8NH41 1 348 OR4K15 [endoplasmic reticulum membrane] (Q8NH41 length = 324)
EWAS R-HSA-442723 GO:0005886 Q13972 1 1275 RASGRF1 [plasma membrane] (Q13972 length = 1273)
EWAS R-HSA-8866674 GO:0005829 P60896 1 70 SEM1 [cytosol] (P60896 length = )
EWAS R-HSA-174204 GO:0005654 Q9UJX3 1 599 ANAPC7 [nucleoplasm] (Q9UJX3 length = 565)
EWAS R-HSA-5625607 GO:0005829 P54252 1 364 ATXN3 [cytosol] (P54252 length = 361)
EWAS R-HSA-140222 GO:0005741 Q9BXH1 1 193 BBC3 [mitochondrial outer membrane] (Q9BXH1 length = )
EWAS R-HSA-5667017 GO:0005829 Q8TCX5 1 695 RHPN1 [cytosol] (Q8TCX5 length = 670)
EWAS R-HSA-1955379 GO:0005758 Q16635 1 292 TAZ [mitochondrial intermembrane space] (Q16635 length = 262)
EWAS R-HSA-3215094 GO:0005654 P11309 1 404 PIM1 [nucleoplasm] (P11309 length = 313)
EWAS R-HSA-3777103 GO:0005829 O95278 1 331 EPM2A [cytosol] (O95278 length = )
EWAS R-HSA-70451 GO:0005829 P60174 1 286 TPI1 [cytosol] (P60174 length = 249)
EWAS R-HSA-6809866 GO:0005829 Q8IUG1 1 177 KRTAP1-3 [cytosol] (Q8IUG1 length = 167)
EWAS R-HSA-391290 GO:0005829 Q5XUX1 1 488 FBXW9 [cytosol] (Q5XUX1 length = 458)
EWAS R-HSA-2023876 GO:0005789 Q96BZ4 1 762 PLD4(1-762) [endoplasmic reticulum membrane] (Q96BZ4 length = 506)
EWAS R-HSA-2454158 GO:0005654 Q96N38 1 555 ZNF714 [nucleoplasm] (Q96N38 length = 554)
EWAS R-HSA-63500 GO:0005654 P0DPB5 1 133 POLR1D [nucleoplasm] (P0DPB5 length = )
EWAS R-HSA-8849372 GO:0005829 Q86Y91 1 864 KIF18B [cytosol] (Q86Y91 length = 852)
EWAS R-HSA-400325 GO:0005654 P35398 1 556 RORA [nucleoplasm] (P35398 length = 523)
EWAS R-HSA-912371 GO:0005635 O94901 1 812 SUN1 [nuclear envelope] (O94901 length = 785)
EWAS R-HSA-1299435 GO:0005743 Q16635 1 292 TAZ [mitochondrial inner membrane] (Q16635 length = 262)
EWAS R-HSA-5228649 GO:0005654 Q9HCS4 1 598 TCF7L1 [nucleoplasm] (Q9HCS4 length = 588)
EWAS R-HSA-6806195 GO:0005576 Q8N1F8 1 1099 STK11IP [extracellular region] (Q8N1F8 length = 1088)
EWAS R-HSA-52425 GO:0005886 P29973 1 690 CNGA1 [plasma membrane] (P29973 length = 686)
EWAS R-HSA-400502 GO:0005886 Q5NUL3 1 377 FFAR4 [plasma membrane] (Q5NUL3 length = 361)
EWAS R-HSA-1472879 GO:0005829 Q96AX9 1 1013 MIB2 [cytosol] (Q96AX9 length = 955)
EWAS R-HSA-2454148 GO:0005654 A8MWA4 1 302 ZNF705E [nucleoplasm] (A8MWA4 length = 300)
EWAS R-HSA-446173 GO:0005654 O95718 1 508 ESRRB [nucleoplasm] (O95718 length = 433)
EWAS R-HSA-5212665 GO:0005829 Q9UK80 1 1055 USP21 [cytosol] (Q9UK80 length = 565)
EWAS R-HSA-416413 GO:0005886 Q9NPC1 1 389 LTB4R2 [plasma membrane] (Q9NPC1 length = 358)
EWAS R-HSA-140220 GO:0005829 Q9BXH1 1 193 BBC3 [cytosol] (Q9BXH1 length = )
EWAS R-HSA-9033109 GO:0005829 Q6QHF9 1 649 PAOX [cytosol] (Q6QHF9 length = 511)
EWAS R-HSA-4127440 GO:0005829 Q3LFD5 1 358 USP41 [cytosol] (Q3LFD5 length = )
EWAS R-HSA-9751614 GO:0005789 Q8NGV7 1 314 OR5H2 [endoplasmic reticulum membrane] (Q8NGV7 length = 309)
EWAS R-HSA-141345 GO:0005782 Q6QHF9 1 649 PAOX [peroxisomal matrix] (Q6QHF9 length = 511)
EWAS R-HSA-5683239 GO:0005886 Q9NZS2 1 232 KLRF1 [plasma membrane] (Q9NZS2 length = 231)
EWAS R-HSA-8874123 GO:0000139 O43889 1 395 CREB3 [Golgi membrane] (O43889 length = 371)
EWAS R-HSA-8866665 GO:0005654 P60896 1 70 SEM1 [nucleoplasm] (P60896 length = )
EWAS R-HSA-5625371 GO:0097542 Q13099 1 832 IFT88 [ciliary tip] (Q13099 length = 824)
EWAS R-HSA-421234 GO:0030054 Q9Y5I7 1 305 CLDN16 [cell junction] (Q9Y5I7 length = 235)
EWAS R-HSA-9818451 GO:0005654 Q17RH7 1 258 TPRXL [nucleoplasm] (Q17RH7 length = 139)
EWAS R-HSA-6783285 GO:0005654 P54252 1 364 ATXN3 [nucleoplasm] (P54252 length = 361)
EWAS R-HSA-426060 GO:0005886 Q13574 1 1117 DGKZ [plasma membrane] (Q13574 length = 928)
EWAS R-HSA-977490 GO:0005886 Q9UGI6 1 736 KCNN3 [plasma membrane] (Q9UGI6 length = 731)
EWAS R-HSA-427902 GO:0031095 Q93084 1 1043 ATP2A3 [platelet dense tubular network membrane] (Q93084 length = 999)
EWAS R-HSA-5419294 GO:0005654 Q9GZS1 1 481 POLR1E [nucleoplasm] (Q9GZS1 length = 419)
EWAS R-HSA-947542 GO:0005829 O96033 1 88 MOCS2 [cytosol] (O96033 length = )
EWAS R-HSA-9645669 GO:0005829 Q8N726 1 132 p14ARF [cytosol] (Q8N726 length = )
EWAS R-HSA-1964466 GO:0005829 P0DPB5 1 133 POLR1D [cytosol] (P0DPB5 length = )
EWAS R-HSA-5358384 GO:0005829 Q9BZG8 1 443 DPH1 [cytosol] (Q9BZG8 length = 438)
EWAS R-HSA-198175 GO:0005886 P01848 1 142 TRAC [plasma membrane] (P01848 length = 140)
EWAS R-HSA-5690794 GO:0005829 Q15843 1 88 NEDD8(1-88) [cytosol] (Q15843 length = 81)
EWAS R-HSA-174242 GO:0005829 Q9UJX3 1 599 ANAPC7 [cytosol] (Q9UJX3 length = 565)
EWAS R-HSA-8875419 GO:0005829 Q9UJ41 1 708 RABGEF1 [cytosol] (Q9UJ41 length = 491)
EWAS R-HSA-2029032 GO:0031901 Q9UJ41 1 708 RABGEF1 [early endosome membrane] (Q9UJ41 length = 491)
EWAS R-HSA-164387 GO:0005886 P63092 1 394 GNAS2 [plasma membrane] (P63092 length = )
EWAS R-HSA-186631 GO:0005829 P10997 1 828 IAPP(1-828) [cytosol] (P10997 length = 89)
EWAS R-HSA-975008 GO:0005654 A8MXY4 1 1036 ZNF99 [nucleoplasm] (A8MXY4 length = 864)
EWAS R-HSA-3322944 GO:0005654 O75486 1 399 SUPT3H [nucleoplasm] (O75486 length = 317)
EWAS R-HSA-419772 GO:0097381 P03999 1 348 OPN1SW [photoreceptor disc membrane] (P03999 length = 345)
EWAS R-HSA-8855200 GO:0005829 Q99871 1 368 HAUS7 [cytosol] (Q99871 length = 358)
EWAS R-HSA-9707674 GO:0005635 O15172 1 223 PGRMC2 [nuclear envelope] (O15172 length = 72)
EWAS R-HSA-390809 GO:0005886 P21917 1 467 DRD4 [plasma membrane] (P21917 length = 419)
EWAS R-HSA-59932 GO:0005743 P56556 1 154 NDUFA6 [mitochondrial inner membrane] (P56556 length = 128)
EWAS R-HSA-8854063 GO:0005829 Q96ME1 1 805 FBXL18 [cytosol] (Q96ME1 length = 718)
EWAS R-HSA-52777 GO:0005654 P16220 1 341 CREB1 [nucleoplasm] (P16220 length = 327)
EWAS R-HSA-6809637 GO:0005829 P19013 1 534 KRT4 [cytosol] (P19013 length = 520)
EWAS R-HSA-9645695 GO:0005759 Q8N726 1 132 p14ARF [mitochondrial matrix] (Q8N726 length = )
EWAS R-HSA-49925 GO:0005829 P23109 1 780 AMPD1 [cytosol] (P23109 length = 747)
EWAS R-HSA-976950 GO:0005576 P01160 1 153 NPPA(1-153) [extracellular region] (P01160 length = 151)
EWAS R-HSA-442469 GO:0005654 Q9Y618 1 2525 NCOR2 [nucleoplasm] (Q9Y618 length = 2514)
EWAS R-HSA-162563 GO:0005829 Q06124 1 597 PTPN11 [cytosol] (Q06124 length = 593)
EWAS R-HSA-380260 GO:0005829 Q99996 1 3911 AKAP9 [cytosol] (Q99996 length = 3907)
EWAS R-HSA-376257 GO:0005829 Q9NXR1 1 346 NDE1 [cytosol] (Q9NXR1 length = 335)
EWAS R-HSA-430060 GO:0005829 O43602 1 441 DCX [cytosol] (O43602 length = 365)
EWAS R-HSA-5610385 GO:0005929 Q13099 1 832 IFT88 [cilium] (Q13099 length = 824)
EWAS R-HSA-6810279 GO:0005829 P60409 1 375 KRTAP10-7 [cytosol] (P60409 length = 370)
EWAS R-HSA-6809608 GO:0005829 O76011 1 436 KRT34 [cytosol] (O76011 length = 394)
EWAS R-HSA-8847510 GO:0000139 Q13948 1 678 CUX1 [Golgi membrane] (Q13948 length = )
EWAS R-HSA-5667147 GO:0005886 Q9UNG2 1 199 TNFSF18 [plasma membrane] (Q9UNG2 length = 177)
EWAS R-HSA-870499 GO:0005829 Q93008 1 2570 USP9X [cytosol] (Q93008 length = 2554)
EWAS R-HSA-8937745 GO:0005654 Q06124 1 597 PTPN11 [nucleoplasm] (Q06124 length = 593)
EWAS R-HSA-376229 GO:0005829 P49454 1 3207 CENPF [cytosol] (P49454 length = 3114)
EWAS R-HSA-2671913 GO:0005886 Q96FT7 1 647 ASIC4 [plasma membrane] (Q96FT7 length = 539)
EWAS R-HSA-6785941 GO:0005829 Q9BUX1 1 264 CHAC1 [cytosol] (Q9BUX1 length = 222)
EWAS R-HSA-8863953 GO:0033116 Q03518 1 808 TAP1 [endoplasmic reticulum-Golgi intermediate compartment membrane] (Q03518 length = 748)
EWAS R-HSA-6799230 GO:0035578 Q8N1F8 1 1099 STK11IP [azurophil granule lumen] (Q8N1F8 length = 1088)
EWAS R-HSA-8874165 GO:0005789 O43889 1 395 CREB3 [endoplasmic reticulum membrane] (O43889 length = 371)
EWAS R-HSA-9751682 GO:0005789 A6NMZ5 1 311 OR4C45 [endoplasmic reticulum membrane] (A6NMZ5 length = 306)
EWAS R-HSA-429782 GO:0000139 Q86VZ5 1 419 SGMS1 [Golgi membrane] (Q86VZ5 length = 413)
EWAS R-HSA-419966 GO:0005886 Q9Y5I7 1 305 CLDN16 [plasma membrane] (Q9Y5I7 length = 235)
EWAS R-HSA-2872286 GO:0030667 O60645 1 756 EXOC3 [secretory granule membrane] (O60645 length = 745)
EWAS R-HSA-1629813 GO:0005654 Q8N726 1 132 p14ARF [nucleoplasm] (Q8N726 length = )
EWAS R-HSA-2872485 GO:0005886 O00476 1 498 SLC17A3(1-498) [plasma membrane] (O00476 length = 420)
EWAS R-HSA-174889 GO:0005654 Q96AP0 1 544 ACD [nucleoplasm] (Q96AP0 length = 458)
EWAS R-HSA-8943140 GO:0005789 Q9MY60 1 181 HLA-B B-60 [endoplasmic reticulum membrane] (Q9MY60 length = )
EWAS R-HSA-2980987 GO:0005829 Q9UBK8 1 725 MTRR [cytosol] (Q9UBK8 length = 698)
EWAS R-HSA-913703 GO:0005796 Q8WXI7 1 22152 MUC16 [Golgi lumen] (Q8WXI7 length = 14507)
EWAS R-HSA-2586609 GO:0005654 A6NNF4 1 738 ZNF726 [nucleoplasm] (A6NNF4 length = 616)
EWAS R-HSA-388592 GO:0005886 P41968 1 360 MC3R(1-360) [plasma membrane] (P41968 length = 323)
EWAS R-HSA-5623379 GO:0005829 O60645 1 756 EXOC3 [cytosol] (O60645 length = 745)
EWAS R-HSA-6803747 GO:0005829 P51606 1 427 RENBP [cytosol] (P51606 length = 417)
EWAS R-HSA-49927 GO:0005829 Q01433 1 879 AMPD2 [cytosol] (Q01433 length = 825)
EWAS R-HSA-9714330 GO:0005829 P42167 1 633 TMPO [cytosol] (P42167 length = )
EWAS R-HSA-419522 GO:0005654 Q9BXW9 1 1471 FANCD2 [nucleoplasm] (Q9BXW9 length = 1451)
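For anyone who wants to triage the list above programmatically, here is a minimal sketch of a parser for these report lines. The field layout is inferred from the lines as pasted; the names `LINE_RE`, `Mismatch`, and `parse_line` are hypothetical and not part of any Reactome or UniProt tooling. Lines where the UniProtKB length is blank are kept, with the length recorded as `None`.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Layout assumed from the report lines above:
# EWAS <stable id> <GO id> <accession> <start> <end> <name> [<location>] (<accession> length = <n>)
LINE_RE = re.compile(
    r"^EWAS\s+(?P<stable_id>R-HSA-\d+)\s+(?P<go_id>GO:\d+)\s+(?P<accession>\S+)\s+"
    r"(?P<start>\d+)\s+(?P<end>\d+)\s+(?P<name>.+?)\s+\[(?P<location>[^\]]+)\]\s+"
    r"\((?P=accession) length =\s*(?P<length>\d*)\)$"
)


@dataclass
class Mismatch:
    stable_id: str
    accession: str
    name: str
    end: int                       # end coordinate stored in Reactome
    uniprot_length: Optional[int]  # None when the report left the length blank

    @property
    def excess(self) -> Optional[int]:
        """Residues by which the stored end coordinate exceeds the UniProtKB length."""
        if self.uniprot_length is None:
            return None
        return self.end - self.uniprot_length


def parse_line(line: str) -> Optional[Mismatch]:
    """Parse one report line; return None if it does not match the assumed layout."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    length = int(m["length"]) if m["length"] else None
    return Mismatch(m["stable_id"], m["accession"], m["name"], int(m["end"]), length)
```

Applied to the FANCD2 line, `parse_line(...).excess` is 20 (1471 minus 1451), which matches the difference between the two pages linked in the example.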
Done. The fixes are logged here:
mismatch list.xlsx
This issue arose because, when we updated the Reactome local copies of UniProt records to conform to UniProt, we did not include a check for changed chain lengths in the UniProt records (with an alert to re-edit affected entityWithAccessionedSequence instances). We still need to implement that update check feature, so I'm re-opening the ticket to track progress there.
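As a rough illustration of what such an update check could look like (this is a sketch under assumptions, not the Reactome implementation): compare chain lengths between the previous and the refreshed UniProt snapshot, then list the instances that reference any accession whose length changed. The names `Ewas` and `instances_to_re_edit`, and the plain-dictionary inputs, are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Ewas(NamedTuple):
    """Hypothetical stand-in for an entityWithAccessionedSequence instance."""
    stable_id: str
    accession: str
    end_coordinate: int


def instances_to_re_edit(
    old_lengths: Dict[str, int],
    new_lengths: Dict[str, int],
    instances: List[Ewas],
) -> List[Ewas]:
    """Return instances whose accession has a different chain length in the new snapshot.

    Only accessions present in both snapshots are compared; new or
    withdrawn accessions would need separate handling.
    """
    changed = {
        acc
        for acc, new_len in new_lengths.items()
        if acc in old_lengths and old_lengths[acc] != new_len
    }
    return [inst for inst in instances if inst.accession in changed]
```

Flagging every instance on a changed accession, rather than only those whose end coordinate now exceeds the new length, is deliberate in this sketch: an instance describing a fragment may also need its coordinates reviewed after the sequence changes.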
The update check feature should now be implemented at Reactome, but the ticket stays open until the feature is confirmed to work as expected.