yjh126yjh/TianChi_Protein-Secondary-Structure-Prediction

About the initial features of proteins

Closed this issue · 2 comments

Hi, thanks for sharing your work!

As a novice, I am curious about the initial protein features. Besides the one-hot encoding of the amino acid, how are the other features obtained? Is there a reference?

Thanks for your help!~

Each of the last 17 dimensions represents a chemical or physical property of the amino acid. Here is the reference: Table of standard amino acid abbreviations and properties
The 38 features are:

  • 0: G
  • 1: P
  • 2: T
  • 3: E
  • 4: S
  • 5: K
  • 6: C
  • 7: L
  • 8: M
  • 9: V
  • 10: D
  • 11: A
  • 12: R
  • 13: I
  • 14: N
  • 15: H
  • 16: F
  • 17: W
  • 18: Y
  • 19: Q
  • 20: X
  • 21: Side Chain Class is acid
  • 22: Side Chain Class is aliphatic
  • 23: Side Chain Class is amide
  • 24: Side Chain Class is aromatic
  • 25: Side Chain Class is basic
  • 26: Side Chain Class is cyclic
  • 27: Side Chain Class is hydroxyl-containing
  • 28: Side Chain Class is sulfur-containing
  • 29: Side Chain Polarity is acidic polar
  • 30: Side Chain Polarity is nonpolar
  • 31: Side Chain Polarity is polar
  • 32: Side Chain Polarity is basic polar
  • 33: Side Chain Charge is negative
  • 34: Side Chain Charge is neutral
  • 35: Side Chain Charge is positive
  • 36: Hydropathy Index (not one-hot; the original value is divided by 5)
  • 37: Molecular Mass (not one-hot; the mean is subtracted from the original value, and the result is divided by 70)
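
For readers who want to see how the layout above fits together, here is a minimal sketch of building one 38-dimensional vector per residue. It is not the repository's code. The names `encode_residue`, `PROPERTIES`, and `mass_mean` are hypothetical, the property table is filled in for only four residues with values that should be checked against the referenced table, and treating `X` as "one-hot only, all property dimensions zero" is an assumption. The thread does not say which mean is subtracted from the molecular mass, so the sketch takes it as an argument rather than guessing it.

```python
# Index layout, following the feature list above.
AA_ORDER = "GPTESKCLMVDARINHFWYQX"           # dims 0-20 (X = unknown)
SIDE_CHAIN_CLASSES = ["acid", "aliphatic", "amide", "aromatic", "basic",
                      "cyclic", "hydroxyl-containing", "sulfur-containing"]  # dims 21-28
POLARITIES = ["acidic polar", "nonpolar", "polar", "basic polar"]            # dims 29-32
CHARGES = ["negative", "neutral", "positive"]                                # dims 33-35

# Illustrative subset only: (class, polarity, charge, hydropathy index, molecular mass).
# Verify these values against the referenced table and extend to all 20 residues.
PROPERTIES = {
    "G": ("aliphatic", "nonpolar", "neutral", -0.4, 75.067),
    "D": ("acid", "acidic polar", "negative", -3.5, 133.104),
    "K": ("basic", "basic polar", "positive", -3.9, 146.189),
    "S": ("hydroxyl-containing", "polar", "neutral", -0.8, 105.093),
}


def encode_residue(aa, mass_mean):
    """Return a 38-element list for a one-letter residue code.

    mass_mean is the mean molecular mass to subtract; the thread does not
    state how it was computed, so the caller must supply it.
    """
    vec = [0.0] * 38

    # Dims 0-20: one-hot residue identity; unrecognised letters map to X.
    idx = AA_ORDER.find(aa)
    vec[idx if idx >= 0 else AA_ORDER.index("X")] = 1.0

    # Assumption: residues without a table entry (X, or any residue missing
    # from this partial table) keep zeros in dims 21-37.
    props = PROPERTIES.get(aa)
    if props is None:
        return vec

    side_class, polarity, charge, hydropathy, mass = props
    vec[21 + SIDE_CHAIN_CLASSES.index(side_class)] = 1.0
    vec[29 + POLARITIES.index(polarity)] = 1.0
    vec[33 + CHARGES.index(charge)] = 1.0
    vec[36] = hydropathy / 5.0             # scaled hydropathy index
    vec[37] = (mass - mass_mean) / 70.0    # centred and scaled molecular mass
    return vec
```

A whole sequence would then be encoded with something like `[encode_residue(aa, mass_mean) for aa in sequence]`, giving one 38-dimensional row per residue.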

Understood. Thanks again for your answer!