- library(devtools)
- install_github('nehiljain/genewiseR')
- get_snp_ids - Gets the snp ids of the snps found in the study, using columns chr_no, snp_pos, ref_allele, and alt_allele
- generate_new_ids - Get new snp ids for snps not found in ref db.
- p_adjustment_genomewide - genome-wide multiple-testing correction (fdr is hard-coded)
- p_adjustment_chrwide - chromosome-wide multiple-testing correction with any method (bon, fdr, etc.)
- p_adjustment_summary - summary plots comparing adjusted and raw p-values, genome-wide and chromosome-wide
- get_significant_snps - filter significant snps
- get_nlp - add column with negative log p value
- get_max_and_mean - calculates snp count, max and mean on given column name and groups all the counts by chromosome
- get_topX_sample - Gets the mean nlp (negative log p-value) of the snps in the top x percent of each gene
- get_topQ - Gets the mean nlp (negative log p-value) of the snps in the top quartile of each gene
- explore_topQ - explore topq for 1,5,10,20,25,50
- snp_selection - snp-selection based on algorithm
- map_snps_to_gene - Finds all the snps in the genome that lie within a gene +/- window_size
- dir_rbind - Row-wise combines all the files in a directory on a distributed cluster
- dir_merge - Combines all the files in a directory using a full outer join, i.e. merge(.., all=T)
- norm_var_names - Converts a character vector to sanitised variable names
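
The correction and nlp steps in the list above can be illustrated with base R alone. This is only a sketch of the general technique, not the genewiseR implementation: it assumes a data frame with `chr_no` and `p_value` columns (names taken from the example call in this document), uses `p.adjust()` from the stats package, and assumes base-10 logs for nlp, which the list does not specify. The toy p-values are made up.

```r
# Toy GWAS table; values are invented for illustration.
gwas <- data.frame(
  chr_no  = c(1, 1, 1, 2, 2),
  snp_pos = c(101, 250, 900, 40, 77),
  p_value = c(0.001, 0.04, 0.5, 0.0002, 0.03)
)

# Genome-wide: adjust all p-values together (FDR, i.e. Benjamini-Hochberg).
gwas$p_genome <- p.adjust(gwas$p_value, method = "fdr")

# Chromosome-wide: adjust within each chromosome; the method is selectable.
gwas$p_chr <- ave(gwas$p_value, gwas$chr_no,
                  FUN = function(p) p.adjust(p, method = "bonferroni"))

# Negative log p-value (base 10 is an assumption of this sketch).
gwas$nlp <- -log10(gwas$p_value)

print(gwas)
```

Adjusting within a chromosome divides the testing burden by chromosome size, so chromosome-wide values are usually less conservative than genome-wide ones for the same method.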
- 14 and 15: same function?
- 1
- 2
- 3
- 4: is it an additional option? Not always required?
- 5
- 13
- 6: was this function used after the Hein correction of topQ? Is it only a statistic?
- 7
- 8
- 9: what is x? 1, 5, 10, 20, 25, 50 as default?
- 10: 9 and 10 are the same function (10 is 9 for the top 25%)
- 12
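
To make the relationship between the top-x and top-quartile statistics concrete, here is a minimal base R sketch. It is not the package's code: `top_x_mean`, the `gene_id` column, and the toy values are hypothetical, and it assumes "top x percent" means the snps whose nlp is at or above the (100 - x)th percentile within each gene. Under that assumption, x = 25 gives the top-quartile statistic.

```r
# Hypothetical helper: mean nlp of the snps in the top x percent of each gene.
top_x_mean <- function(df, x = 25) {
  stopifnot(x > 0, x <= 100)
  per_gene <- split(df$nlp, df$gene_id)
  vapply(per_gene, function(v) {
    # Cutoff at the (100 - x)th percentile of this gene's nlp values.
    cutoff <- quantile(v, probs = 1 - x / 100, names = FALSE)
    mean(v[v >= cutoff])
  }, numeric(1))
}

# Toy data: two genes with four snps each (values invented).
snps <- data.frame(
  gene_id = rep(c("geneA", "geneB"), each = 4),
  nlp     = c(1, 2, 3, 4, 0.5, 0.5, 2, 6)
)

# Top-quartile statistic, i.e. x = 25.
top_q <- top_x_mean(snps, x = 25)

# Same statistic over a grid of x values; one column per x, one row per gene.
grid <- sapply(c(1, 5, 10, 20, 25, 50), function(x) top_x_mean(snps, x))
```

With very few snps per gene, small x values all select only the single largest nlp, so the grid is mostly informative for genes with many snps.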
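library(genewiseR)
library(readr)  # provides read_tsv()
# Row-bind the raw GWAS files, read the reference VCF (skipping "##" meta lines), then look up snp ids
combine_gwas_df <- dir_rbind("/Users/nehiljain/code/personal/genewiseR_data/raw_data/", header = FALSE, col_names = c("chr_no", "snp_pos", "allele", "p_value"))
ref_df <- read_tsv("~/code/personal/genewiseR_data/ref/indels.Bos_taurus.vcf", comment = "##", progress = TRUE, trim_ws = TRUE)
result_df <- get_snp_ids(combine_gwas_df, ref_df)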
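
For readers without the data files, the idea behind the id lookup can be sketched with a plain left join in base R. This is an assumption about the general approach, not the actual `get_snp_ids` or `generate_new_ids` code: it matches on chromosome and position only, the reference table is a toy stand-in using the standard VCF column names CHROM, POS, and ID, and both the ids and the `new_<chr>_<pos>` placeholder format are made up for illustration.

```r
# Toy GWAS table using the column names from the call above.
gwas <- data.frame(
  chr_no  = c("1", "1", "2"),
  snp_pos = c(101L, 250L, 40L),
  p_value = c(0.001, 0.04, 0.0002),
  stringsAsFactors = FALSE
)

# Toy stand-in for the reference VCF body (ids are invented).
ref <- data.frame(
  CHROM = c("1", "2"),
  POS   = c(101L, 40L),
  ID    = c("rs0001", "rs0002"),
  stringsAsFactors = FALSE
)

# Left join: keep every GWAS snp and attach the reference id where one exists.
matched <- merge(gwas, ref,
                 by.x = c("chr_no", "snp_pos"),
                 by.y = c("CHROM", "POS"),
                 all.x = TRUE)

# Snps with no reference match get a placeholder id (format is hypothetical).
missing <- is.na(matched$ID)
matched$ID[missing] <- paste0("new_", matched$chr_no[missing],
                              "_", matched$snp_pos[missing])

print(matched)
```

A real lookup would likely also compare alleles, since several variants can share one position in a VCF.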