englianhu/binary.com-interview-question

Error in start_shell(master = master, spark_home = spark_home, spark_version = version, : Failed to find 'spark-submit2.cmd' under 'C:\Users\Owner\AppData\Local\spark\spark-3.0.0-bin-hadoop2.7', please verify - SPARK_HOME.


I tried to set up sparklyr for big-data work (see https://github.com/englianhu/binary.com-interview-question/blob/master/binary-Q1Inter-HFT.Rmd), but spark_connect() fails with the error below, even after calling spark_home_set().

> library('BBmisc')
> library('sparklyr')
> sc <- spark_connect(master = 'local')
Error in start_shell(master = master, spark_home = spark_home, spark_version = version,  :
  Failed to find 'spark-submit2.cmd' under 'C:\Users\Owner\AppData\Local\spark\spark-3.0.0-bin-hadoop2.7', please verify - SPARK_HOME.
> spark_home_dir()
[1] "C:\\Users\\Owner\\AppData\\Local/spark/spark-3.0.0-bin-hadoop2.7"
> spark_installed_versions()
  spark hadoop                                                              dir
1 3.0.0    2.7 C:\\Users\\Owner\\AppData\\Local/spark/spark-3.0.0-bin-hadoop2.7
> spark_home_set()
Setting SPARK_HOME environment variable to C:\Users\Owner\AppData\Local/spark/spark-3.0.0-bin-hadoop2.7
> sc <- spark_connect(master = 'local')
Error in start_shell(master = master, spark_home = spark_home, spark_version = version,  :
  Failed to find 'spark-submit2.cmd' under 'C:\Users\Owner\AppData\Local\spark\spark-3.0.0-bin-hadoop2.7', please verify - SPARK_HOME.
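
The error says the launcher script is missing from the Spark folder, which usually points to an incomplete download or extraction. A minimal sketch for checking this directly from R is shown here. `has_spark_submit()` is a hypothetical helper (not part of sparklyr), and it assumes the script sits in the `bin` subfolder, named `spark-submit2.cmd` on Windows and `spark-submit` elsewhere.

```r
# Hypothetical helper (not a sparklyr function): report whether the launcher
# script named in the error message exists under `spark_home`.
# Assumption: the script lives in <spark_home>/bin.
has_spark_submit <- function(spark_home,
                             windows = .Platform$OS.type == "windows") {
  script <- if (windows) "spark-submit2.cmd" else "spark-submit"
  file.exists(file.path(spark_home, "bin", script))
}

# Check the folder that SPARK_HOME currently points to.
has_spark_submit(Sys.getenv("SPARK_HOME"))
```

If this returns FALSE for the folder reported by spark_home_dir(), setting SPARK_HOME again will not help. The folder itself needs to be replaced with a complete Spark distribution.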

Solved
#1

Steps:

  1. Download Spark from https://spark.apache.org/downloads.html.
  2. Extract the downloaded archive to 'C:/Users/scibr/AppData/Local/spark/spark-3.0.1-bin-hadoop3.2'.
  3. Manually point sparklyr at the latest version: spark_home_set('C:/Users/scibr/AppData/Local/spark/spark-3.0.1-bin-hadoop3.2')
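
If several Spark folders end up side by side, the newest one can be picked programmatically instead of typing the path by hand. This is only a sketch: `latest_spark_home()` is a hypothetical helper, and it assumes the folders follow the `spark-<version>-bin-hadoop<version>` naming used by the official downloads.

```r
# Hypothetical helper: among folders named like "spark-3.0.1-bin-hadoop3.2"
# directly under `root`, return the path of the one with the highest Spark
# version, or NA if no such folder exists.
latest_spark_home <- function(root) {
  pattern <- "^spark-([0-9]+(\\.[0-9]+)*)-bin-hadoop.*$"
  dirs <- list.files(root, pattern = pattern, full.names = TRUE)
  dirs <- dirs[dir.exists(dirs)]  # keep directories only
  if (length(dirs) == 0) return(NA_character_)
  # Extract the Spark version from each folder name and compare numerically,
  # so that e.g. 2.4.10 sorts above 2.4.9.
  versions <- numeric_version(sub(pattern, "\\1", basename(dirs)))
  dirs[versions == max(versions)][1]
}

# Usage (requires sparklyr; the path is the one from step 2):
# spark_home_set(latest_spark_home("C:/Users/scibr/AppData/Local/spark"))
```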

library('sparklyr')
spark_home_set('C:/Users/scibr/AppData/Local/spark/spark-3.0.1-bin-hadoop3.2')
sc <- spark_connect(master = 'local')
connection_is_open(sc)
# TRUE
smp2_sc <- copy_to(sc, smp2)
smp2_sc
# A tibble: 137 x 9
   Date       Type        Bettor Turnover   Won        Cancelled Rebates Profit   Profit_rate
   <date>     <chr>        <dbl> <chr>      <chr>      <chr>     <chr>   <chr>    <chr>      
 1 2020-05-27 1分快3         698 7994708.00 7649465.34 85262.00  10892.~ 334350.~ 4.18%      
 2 2020-05-27 极速快3        150 1817943.00 1788182.30 0.00      3426.23 26334.47 1.45%      
 3 2020-05-27 3分快3         117 279209.00  264210.09  4180.00   392.13  14606.78 5.23%      
 4 2020-05-27 龙虎斗          45 201525.00  196699.20  0.00      748.04  4077.76  2.02%      
 5 2020-05-27 5分快3          44 121834.00  117366.68  77.00     60.53   4406.80  3.62%      
 6 2020-05-27 传统1分赛车     15 83977.00   79449.58   3656.00   33.35   4494.07  5.35%      
 7 2020-05-27 骰宝            10 81745.00   68456.00   0.00      779.70  12509.30 15.30%     
 8 2020-05-27 传统1分彩       19 76025.00   75857.69   0.00      125.99  41.33    0.05%      
 9 2020-05-27 1分六合         29 55400.00   58311.08   4.00      12.89   -2923.97 -5.28%     
10 2020-05-27 1分彩           38 53185.52   54269.48   5020.00   110.61  -1194.56 -2.25%     
# ... with 127 more rows
spark_available_versions()
  spark
1   1.6
2   2.0
3   2.1
4   2.2
5   2.3
6   2.4
7   3.0