Making protein features more visible
Closed this issue · 28 comments
- 1. Change tab label in query builder to "protein features" (from protein motifs)
- 2. Add TM domains to the Examples help text (under search box)
- 3. Add a commonly used query for ALL protein features (motifs, signal sequences and TM domains)
Add a commonly used query for ALL protein features (motifs, signal sequences and TM domains)
Do we want everything with a SO term?: https://www.pombase.org/term/SO:0000001
Probably polypeptide_region (SO:0000839) would be a better term?
I've added the commonly used query as "Genes annotated with protein features". Let me know if you can think of a better wording.
Change tab label in query builder to "protein features" (from protein motifs)
Done
Add TM domains to the Examples help text (under search box)
Done. I added "TM domains, " but if you search for that string it doesn't find anything useful. Maybe we should change the text to "transmembrane helix"?
We only use SO in this way for protein features so I think
polypeptide_region (SO:0000839) would be best
The list from the query is like this:
https://www.pombase.org/results/from/id/384bce14-b710-48a0-9dbf-cfa3f18f1b82
can we make it like this?
https://www.pombase.org/term_genes/SO:0000839
so that people can access the subclasses?
Linking to that page from the commonly used queries will take a bit of work because it's not a standard results page.
But if we add the SO term ID to the query description, it will be a clickable link to that page.
Is that enough?
yes that would do just fine!
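For illustration, the idea of turning an SO term ID in a query description into a link could be sketched roughly as below. This is only a sketch under assumptions: `linkify_so_ids` is a hypothetical helper, not the actual PomBase code, and the URL pattern is simply copied from the term_genes link quoted in this thread.

```python
import re

# Assumed URL pattern, copied from the term_genes link quoted in this thread.
TERM_GENES_URL = "https://www.pombase.org/term_genes/{term_id}"

# The SO IDs mentioned in this thread are "SO:" followed by seven digits.
SO_ID_PATTERN = re.compile(r"\bSO:\d{7}\b")


def linkify_so_ids(description: str) -> str:
    """Wrap each SO term ID in an HTML link (hypothetical helper)."""

    def to_link(match: re.Match) -> str:
        term_id = match.group(0)
        url = TERM_GENES_URL.format(term_id=term_id)
        return f'<a href="{url}">{term_id}</a>'

    return SO_ID_PATTERN.sub(to_link, description)


if __name__ == "__main__":
    print(linkify_so_ids("Genes annotated with protein features (SO:0000839)"))
```

Text without an SO ID passes through unchanged, so the same helper could be applied to every query description.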
Actually linking directly to
https://www.pombase.org/term/SO:0000839
would be even better (i.e. it shows the subsets)
Perhaps we could have that link in a different list? It feels wrong to have it in the list of commonly used queries since it's not a query - it's just a link to a term page.
That's true. Let's chat about it on Tuesday.
Thoughts
- 1. I'm not sure how useful this list is if it can't be used to access the subsets, so maybe we don't require this "commonly used query" (because it's such a heterogeneous bin).
- 2. It seems slightly odd to have 2 separate ways to search for TM domains (under TM domains or with the SO term), but maybe this is OK.
  (resolved by removing "transmembrane domain" prompts from the SO search)
- 3. Should "signal sequence" in the prompts be "signal peptide" (name)?
- 4. Should we add "transit peptide" to the under-the-box prompt?
- 5. Actually, when I search on "transit peptide" or "transit_peptide" in "protein features" I don't find SO:0000725, and although SO:0000725 is recognised by the search it gives no results?
  (resolved by providing exact term name for searching)
- 6. Examples: short motifs such as TM domains, NLS or KEN box, signal sequence, cleaved region, helix
  -> Examples: NLS or KEN box, signal sequence, cleaved region, transit peptide, helix (I don't think helix is in here)
OK (Re point 5 and 6) we used mitochondrial_targeting_signal (SO:0001808)
because we thought it would be a more meaningful label so the prompt would need to be
"mitochondrial targeting signal"
helix (I don't think helix is in here)
There are some transmembrane_helix annotations:
https://www.pombase.org/term/SO:0001812
It seems slightly odd to have 2 separate ways to search for TM domains (under TM domains or with the SO term),
I agree. Searching with a SO term name is quite obscure so I'm not sure it will be a problem.
We have "TM domains" and "helix" separately in the help text. Perhaps we can combine those to just "TM helix"?
should "signal sequence" in the prompts be "signal peptide" (name)
Good point. I've fixed that. I'll re-release the site soon.
OK (Re point 5 and 6) we used mitochondrial_targeting_signal (SO:0001808)
because we thought it would be a more meaningful label so the prompt would need to be
"mitochondrial targeting signal"
True!
It would be helpful if "transit peptide" was a synonym, but SO:0001808 only has:
synonym: "mitochondrial signal sequence" EXACT []
synonym: "mitochondrial targeting signal" EXACT []
synonym: "MTS" EXACT []
I've changed "signal peptide" to "mitochondrial targeting signal" in the help text. The change will be on the main site soon.
Sorry, that's nonsense. I have a bit of a cold and it has made my brain mushy.
I've restored "signal peptide" to the help text and added "mitochondrial targeting signal".
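To illustrate why "transit peptide" doesn't find SO:0001808, here is a minimal sketch of matching a search string against a term's name and exact synonyms. The matching logic (`normalise`, `matches_term`) is an assumption made for illustration and not necessarily how the PomBase search works; the name and synonyms are the ones quoted above.

```python
# Term name and EXACT synonyms for SO:0001808, as quoted in this thread.
TERM = {
    "id": "SO:0001808",
    "name": "mitochondrial_targeting_signal",
    "synonyms": [
        "mitochondrial signal sequence",
        "mitochondrial targeting signal",
        "MTS",
    ],
}


def normalise(text: str) -> str:
    """Lower-case, treat underscores as spaces, collapse whitespace."""
    return " ".join(text.replace("_", " ").lower().split())


def matches_term(query: str, term: dict) -> bool:
    """True if the query equals the term name or one of its synonyms
    after normalisation (hypothetical matching rule)."""
    labels = [term["name"], *term["synonyms"]]
    return normalise(query) in {normalise(label) for label in labels}


if __name__ == "__main__":
    for query in ["mitochondrial targeting signal", "MTS", "transit peptide"]:
        print(query, "->", matches_term(query, TERM))
```

Under this rule "transit peptide" only matches if it is added as a synonym, which is why the help text needs to show the exact term name.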
helix (I don't think helix is in here)
Just make sure this "helix" refers to "TM helix" and not to "helix" as in protein secondary structure
We have "TM domains" and "helix" separately in the help text. Perhaps we can combine those to just "TM helix"?
I think I prefer to say "transmembrane" in full here for the label (as I expect most people will search on this).
Are all transmembrane domains helices? (I have no idea!)
ChatGPT:
No, not all transmembrane (TM) domains are helices, although many are. Transmembrane domains can adopt different structures depending on the type of protein, its function, and the environment. The two most common types of TM domain structures are:
Alpha helices: The most prevalent structure in transmembrane proteins found in the lipid bilayer, especially in eukaryotic cells, is the alpha helix. Alpha-helical TM domains are typical for single-pass or multi-pass membrane proteins, such as G protein-coupled receptors (GPCRs), ion channels, and transporters.
Beta-barrels: Some transmembrane proteins, particularly in the outer membranes of Gram-negative bacteria, mitochondria, and chloroplasts, form beta-barrels. These proteins are made up of beta-strands that come together to create a barrel-shaped structure. Examples include porins and some transporters in the outer membrane.
So, while alpha helices are very common, TM domains can also consist of beta-strands, or even less common structures depending on the protein’s nature.
So:
"transmembrane domain" (I don't think we need to distinguish the subtype at this juncture, but if we can import additional non-redundant transmembrane domains from UniProt we should; they should probably be completely non-overlapping with the existing ones)
I agree. Searching with a SO term name is quite obscure so I'm not sure it will be a problem.
I agree. I think we could actually omit mention of "transmembrane domains" from the help text for the SO search. If people find them this way, that's OK, but the default should be the TM query tool which allows the additional functionality of TM domain number.
Just make sure this "helix" refers to "TM helix" and not to "helix" as in protein secondary structure
Ignore this comment. I forgot it was referring to the search text.
I think we could actually omit mention of "transmembrane domains" from the help text for the SO search.
That makes sense. I'll do that now.
After the change it's:
Examples: short motifs such as NLS or KEN box, signal peptide, mitochondrial targeting signal,
cleaved region
Are there any other important features we could mention? There's plenty of space if we need a longer list.
That change is on pombase.org now. Can we close this?
Yep, thanks!