Pinned Repositories
cruxeval
CRUXEval: Code Reasoning, Understanding, and Execution Evaluation
Diffusion-sources
Some sources about diffusion.
econvRBP
RNA-binding proteins (RBPs) govern RNA processing from synthesis to decay and play a significant role in RNA transport, translation, and degradation. Predicting RBP function from the amino acid sequence with computational methods has therefore become an important topic in genome annotation. Existing work, however, has not been fully successful, for three reasons: (1) Shallow features: sequence determines structure, but it is difficult to extract the essential features from the raw sequence alone. (2) Limited understanding: feature-based prediction methods emphasize feature extraction, while our incomplete understanding of proteins limits what feature engineering can achieve. (3) Feature fusion: multi-feature fusion is often used in RBP prediction, but the features are not well integrated. To address these challenges, we propose a novel ensemble convolutional neural network (econvRBP) to predict RBPs, and we provide a web server that biologists in this field can use to check other candidate RBPs. To capture the local and global features of RBPs simultaneously, Conjoint Triad and one-hot encoding are first used to transform the amino acid sequence into local and global features, respectively. The local and global features are then combined by an ensemble method, and convolutional neural networks extract higher-level features from them. We evaluated the method with 10-fold cross-validation, and the results show that it achieves the best performance among existing predictors. It correctly predicted 97% of 2875 RBPs and 99% of 6872 non-RBPs, with an accuracy of 0.99, a Matthews correlation coefficient (MCC) of 0.99, a precision of 0.99, and an area under the curve (AUC) of 0.99. In addition, we validated our models on the training and testing sets provided by RBPPred, after removing homologous sequences from the training set at a threshold of 25%; econvRBP achieved an accuracy of 0.87 on both the processed training set and the testing set. These results indicate that econvRBP is currently the best-performing method and can provide reliable guidance for the detection of RBPs.
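The two encodings named in the econvRBP description can be sketched as below. This is a minimal illustration, not the repository's code: the seven-class residue grouping is the one commonly used for Conjoint Triad, and the function names, the frequency normalization, and the zero-padding to a fixed `max_len` are assumptions made for this example.

```python
# Minimal sketch of Conjoint Triad and one-hot encoding for a protein sequence.
# All names below are hypothetical; econvRBP's own preprocessing may differ.

# Seven residue classes commonly used for Conjoint Triad (assumed grouping).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def conjoint_triad(seq):
    """Return a 343-dim vector of class-triad frequencies (7 * 7 * 7)."""
    counts = [0] * 343
    for i in range(len(seq) - 2):
        triad = seq[i:i + 3]
        # Skip triads that contain a residue outside the 20 standard ones.
        if all(aa in AA_TO_CLASS for aa in triad):
            a, b, c = (AA_TO_CLASS[aa] for aa in triad)
            counts[a * 49 + b * 7 + c] += 1
    total = sum(counts)
    # Assumption: plain frequency normalization over the counted triads.
    return [n / total for n in counts] if total else [0.0] * 343


def one_hot(seq, max_len):
    """Return a max_len x 20 matrix; truncate or zero-pad to max_len rows."""
    rows = []
    for aa in seq[:max_len]:
        row = [0] * 20
        if aa in AA_INDEX:  # unknown residues stay all-zero
            row[AA_INDEX[aa]] = 1
        rows.append(row)
    rows.extend([0] * 20 for _ in range(max_len - len(rows)))
    return rows


if __name__ == "__main__":
    triad_vec = conjoint_triad("MKRAGVLDEC")
    matrix = one_hot("MKRAGVLDEC", max_len=16)
    print(len(triad_vec), len(matrix), len(matrix[0]))
```

The triad vector summarizes short neighbourhoods of residue classes, while the one-hot matrix keeps the position of every residue along the sequence, which is why the description treats them as complementary inputs to the convolutional networks.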
gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
llama3
The official Meta Llama 3 GitHub site
neurips_llm_efficiency_challenge
neurips_llm_efficiency_challenge at USTC
RePair
This is the official implementation of Automated Program Repair with Process-based Feedback.
Semantic-Aligned-Code-Summarization
speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
tree-of-thought-llm
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
TnTWoW's Repositories
TnTWoW/RePair
This is the official implementation of Automated Program Repair with Process-based Feedback.
TnTWoW/Diffusion-sources
Some sources about diffusion.
TnTWoW/econvRBP
RNA-binding proteins (RBPs) govern RNA processing from synthesis to decay and play a significant role in RNA transport, translation, and degradation. Predicting RBP function from the amino acid sequence with computational methods has therefore become an important topic in genome annotation. Existing work, however, has not been fully successful, for three reasons: (1) Shallow features: sequence determines structure, but it is difficult to extract the essential features from the raw sequence alone. (2) Limited understanding: feature-based prediction methods emphasize feature extraction, while our incomplete understanding of proteins limits what feature engineering can achieve. (3) Feature fusion: multi-feature fusion is often used in RBP prediction, but the features are not well integrated. To address these challenges, we propose a novel ensemble convolutional neural network (econvRBP) to predict RBPs, and we provide a web server that biologists in this field can use to check other candidate RBPs. To capture the local and global features of RBPs simultaneously, Conjoint Triad and one-hot encoding are first used to transform the amino acid sequence into local and global features, respectively. The local and global features are then combined by an ensemble method, and convolutional neural networks extract higher-level features from them. We evaluated the method with 10-fold cross-validation, and the results show that it achieves the best performance among existing predictors. It correctly predicted 97% of 2875 RBPs and 99% of 6872 non-RBPs, with an accuracy of 0.99, a Matthews correlation coefficient (MCC) of 0.99, a precision of 0.99, and an area under the curve (AUC) of 0.99. In addition, we validated our models on the training and testing sets provided by RBPPred, after removing homologous sequences from the training set at a threshold of 25%; econvRBP achieved an accuracy of 0.87 on both the processed training set and the testing set. These results indicate that econvRBP is currently the best-performing method and can provide reliable guidance for the detection of RBPs.
TnTWoW/cruxeval
CRUXEval: Code Reasoning, Understanding, and Execution Evaluation
TnTWoW/gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
TnTWoW/llama3
The official Meta Llama 3 GitHub site
TnTWoW/neurips_llm_efficiency_challenge
neurips_llm_efficiency_challenge at USTC
TnTWoW/Semantic-Aligned-Code-Summarization
TnTWoW/speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
TnTWoW/tree-of-thought-llm
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models