/hadoopiii2016

hadoopiii2016

Primary LanguageJava

Big Data之處理與分析實務班

==========================

Slides

Centos 6.6 檔案下載

安裝步驟影片


CentOS 安裝

Ambari Server 安裝步驟

安裝步驟文字

prepare machine

  • su -
  • ssh-keygen
  • cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  • ssh localhost

setup hostname

  • ifconfig
  • vi /etc/hosts
  • hostname master
  • hostname -f

setup ntp

  • chkconfig ntpd on
  • service ntpd start

Java Download

於安裝主機上建立軟連結

  • ln -s /usr/java/jdk1.7.0_79 /usr/java/java

於安裝主機設定環境變數

  • vi /etc/profile
export JAVA_HOME=/usr/java/java
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib/rt.jar
export PATH=$PATH:$JAVA_HOME/bin

立即更新

  • source /etc/profile

檢查 Java 版本

  • java -version

於安裝主機關閉防火牆

  • chkconfig iptables off
  • service iptables stop

於安裝主機設定SELlinux

  • vi /etc/selinux/config 將 SELINUX=disabled
  • setenforce 0

HDP建議關閉 Transparent Huge Pages,於安裝主機上執行

  • echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
  • echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

確定登入帳號為 root 在 master 上執行 (需要網路環境好的環境下)

設定 Ambari

  • ambari-server setup

安裝 Ambari

  • ambari-server start

連線至Server

  • 0.0.0.0:8080

使用local repo (修改url)

private key

  • cat ~/.ssh/id_rsa

修改 replication

  • HDFS -> block replication -> 1

建議設定

使用Hive View

  • Services > HDFS > Configs.

  • Custom core-site -> Click Add Property:

hadoop.proxyuser.root.groups=*
hadoop.proxyuser.root.hosts=*

leave safemode

  • su - hdfs
  • hadoop dfsadmin -safemode leave

Ambari Memory Revision (with root)

  • vi /var/lib/ambari-server/ambari-env.sh
  • 將參數-Xmx2048m 修改成-Xmx4096m -XX:PermSize=128m -XX:MaxPermSize=128m
  • ambari-server stop
  • ambari-server start

get wiki count

eclipse (hadoop)

Use WordCount.java

eclipse include jar

  • a. /usr/hdp/2.3.4.0-3485/hadoop/client/*.jar
  • b. /usr/hdp/2.3.4.0-3485/hadoop-mapreduce/*.jar

Process Data

  • head part-r-00000
  • cat part-r-00000 | sort -k 2 -nr | head
  • cat part-r-00000 | sort -k 2 -nr | awk '{if(length($0)>10) print $0}' | head

link right java version (root)

  • su -
  • rm /usr/bin/java
  • ln -s /usr/java/java/bin/java /usr/bin/java
  • java -version

move jar

  • su -
  • mv wc* /home/hdfs/
  • chown hdfs:hdfs -R /home/hdfs/wc*
  • su - hdfs
  • hadoop jar wc.jar /tmp/wc.txt /tmp/out

##設定權限

  • su - hdfs
  • hadoop fs -mkdir /user/admin
  • hadoop fs -chown admin:hadoop /user/admin

Download Purchase Order

Examine DB (root)

  • su -
  • mysql
  • show databases;
  • use hive;
  • show tables;
  • select * from DBS;
  • select * from TBLS;
  • select * from COLUMNS_V2;