Building a Hadoop Cluster Environment and an R Analysis Server

※ Development server configuration

/etc/hosts

192.168.0.101   namenode
192.168.0.102   datanode01
192.168.0.103   datanode02
192.168.0.104   datanode03

Register these hosts on every server.

※ Hadoop environment setup

  • Install Hadoop
    $ wget http://apache.mirror.cdnetworks.com/hadoop/common/hadoop-2.6.2/hadoop-2.6.2.tar.gz
    $ tar xvfz hadoop-2.6.2.tar.gz
    $ mv hadoop-2.6.2 /usr/local/hadoop
    $ chown -R hadoop:hadoop /usr/local/hadoop
    
  • Create the hadoop account -> hadoop/hadoop (each server)
    $ useradd hadoop
    $ passwd hadoop
    
  • Distribute the hadoop account's SSH key
    $ ssh-keygen -t rsa
    $ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@namenode
    $ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@datanode01
    $ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@datanode02
    $ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@datanode03
    
  • Create the Hadoop data directories -> (each server)
    $ mkdir -p /data/hadoop/tmp
    $ mkdir -p /data/hadoop/dfs/name
    $ mkdir -p /data/hadoop/dfs/data
    
    $ chown -R hadoop:hadoop /data/hadoop/
    
  • Register environment variables -> (each server)

~/.bash_profile

# User specific environment and startup programs
############################################################
### Java
############################################################
export JAVA_HOME=/usr/java/jdk1.7.0_79
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin

############################################################
### Hadoop
############################################################
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

#export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native" 

alias HADOOP_START_ALL=$HADOOP_HOME/sbin/start-all.sh
alias HADOOP_STOP_ALL=$HADOOP_HOME/sbin/stop-all.sh
  • Configure Hadoop

$HADOOP_HOME/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_79
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
#export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
#export HADOOP_OPTS="${HADOOP_OPTS} -Djava.library.path=$HADOOP_HOME/lib" 

$HADOOP_HOME/etc/hadoop/core-site.xml

<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://192.168.0.101:9000</value>
        </property>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/data/hadoop/tmp/</value>
        </property>
</configuration>

$HADOOP_HOME/etc/hadoop/hdfs-site.xml -> namenode

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
        <property>
                <name>dfs.permissions.enabled</name>
                <value>false</value>
        </property>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>datanode01:50090</value>
        </property>
        <property>
                <name>dfs.namenode.secondary.https-address</name>
                <value>datanode01:50091</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>/data/hadoop/dfs/name</value>
        </property>
        <!--
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/data/hadoop/dfs/data</value>
        </property>
        -->
</configuration>

$HADOOP_HOME/etc/hadoop/hdfs-site.xml -> datanodes

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
        <property>
                <name>dfs.permissions.enabled</name>
                <value>false</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/data/hadoop/dfs/data</value>
        </property>
</configuration>

$HADOOP_HOME/etc/hadoop/mapred-site.xml

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
</configuration>

$HADOOP_HOME/etc/hadoop/yarn-site.xml

<configuration>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>namenode</value>
        </property>
        <property>
                <name>yarn.nodemanager.hostname</name>
                <value>namenode</value> <!-- or datanode01, datanode02, datanode03 on the respective nodes -->
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
</configuration>

$HADOOP_HOME/etc/hadoop/slaves

namenode
datanode01
datanode02
datanode03
  • Distribute the Hadoop installation and configuration files
    $ scp -r /usr/local/hadoop hadoop@datanode03:/usr/local/hadoop
    $ scp -r /usr/local/hadoop hadoop@datanode02:/usr/local/hadoop
    $ scp -r /usr/local/hadoop hadoop@datanode01:/usr/local/hadoop
    
    $ chown -R hadoop:hadoop /usr/local/hadoop
    
  • Start Hadoop --> namenode
    -- HDFS format
    $ hadoop namenode -format <- first run only
    
    -- start
    $ $HADOOP_HOME/sbin/start-all.sh
    
    -- stop
    $ $HADOOP_HOME/sbin/stop-all.sh
    
  • Verify the Hadoop processes
    $ jps
    -- namenode
        NodeManager
        ResourceManager
        NameNode
        DataNode
    
    -- datanode
        SecondaryNameNode (192.168.0.102 only)
        NodeManager
        DataNode
    
  • Hadoop web admin consoles

NameNode Information --> http://192.168.0.101:50070/
Nodes of the cluster --> http://192.168.0.101:8088/

  • Verify HDFS: add a file from the namenode or a datanode, then confirm it is visible from the namenode across the cluster (a small Java client check is sketched after the commands)

    $ hadoop fs -mkdir /input
    $ hadoop fs -copyFromLocal start-all.sh /input
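The same check can be done programmatically. The class below is a minimal sketch (not part of the original post) of an HDFS client that uploads a file and lists /input; it assumes the fs.defaultFS value from core-site.xml above and that the Hadoop client jars are on the classpath.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://192.168.0.101:9000");  // same value as core-site.xml

        FileSystem fs = FileSystem.get(conf);
        // Upload a local file into /input, then list the directory contents
        fs.copyFromLocalFile(new Path("start-all.sh"), new Path("/input/start-all.sh"));
        for (FileStatus status : fs.listStatus(new Path("/input"))) {
            System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
        }
        fs.close();
    }
}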

  • Verify MapReduce (a minimal WordCount sketch follows the commands)

    $ hadoop fs -copyFromLocal README.txt /input
    $ hadoop jar WordCount-1.0-SNAPSHOT.jar WordCount /output /input/README.txt
    $ hadoop fs -cat /output/part-r-00000 | more
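The WordCount-1.0-SNAPSHOT.jar used above is not included in the post; the sketch below is a standard Hadoop 2.x word-count job that would behave the same way. Note that the argument order follows the invocation shown above (output directory first, then input path); adjust it if your own driver expects input first.

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);   // emit (word, 1) for every token
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();           // sum all counts for this word
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Matches the invocation above: first argument is the output dir, second is the input path
        FileOutputFormat.setOutputPath(job, new Path(args[0]));
        FileInputFormat.addInputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}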

※ R environment setup

  • Install the EPEL repository (Extra Packages for Enterprise Linux)
    $ wget http://mirror.us.leaseweb.net/epel/6/x86_64/epel-release-6-8.noarch.rpm
    $ rpm -ivh epel-release-6-8.noarch.rpm
    
  • Install R base
    $ yum install R
    
  • Install RStudio Server
    $ wget https://download2.rstudio.org/rstudio-server-rhel-0.99.489-x86_64.rpm
    $ sudo yum install --nogpgcheck rstudio-server-rhel-0.99.489-x86_64.rpm
    
  • Verify the RStudio Server installation
    $ rstudio-server verify-installation
        rstudio-server stop/waiting
        rstudio-server start/running, process 32626
    
  • RStudio web console

http://192.168.0.101:8787/
Developer user: an OS account

※ ZooKeeper environment setup

  • Install ZooKeeper
    $ wget http://apache.mirror.cdnetworks.com/zookeeper/current/zookeeper-3.4.6.tar.gz
    $ tar xvzf zookeeper-3.4.6.tar.gz
    $ mv zookeeper-3.4.6 /usr/local/zookeeper
    $ chown -R hadoop:hadoop /usr/local/zookeeper
    
  • Create the ZooKeeper data directory -> (each server)
    $ mkdir -p /data/zookeeper
    
    $ chown -R hadoop:hadoop /data/zookeeper/
    
  • Register environment variables -> (each server)

~/.bash_profile

export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
  • Configure ZooKeeper

$ZOOKEEPER_HOME/conf/zoo.cfg <- set dataDir and register the servers

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

server.1=namenode:2888:3888
server.2=datanode01:2888:3888
server.3=datanode02:2888:3888
server.4=datanode03:2888:3888
  • Distribute the ZooKeeper installation and configuration files
    $ scp -r /usr/local/zookeeper root@datanode01:/usr/local/zookeeper
    $ scp -r /usr/local/zookeeper root@datanode02:/usr/local/zookeeper
    $ scp -r /usr/local/zookeeper root@datanode03:/usr/local/zookeeper
    
    $ chown -R hadoop:hadoop /usr/local/zookeeper
    
  • Create the ZooKeeper myid file --> each server's own unique ID (each server)

/data/zookeeper/myid

namenode -> 1
datanode01 -> 2
datanode02 -> 3
datanode03 -> 4
  • Start ZooKeeper -> (each server)
    $ zkServer.sh start
    
  • Verify the ZooKeeper process (a Java connectivity check is sketched below)
    $ jps
        QuorumPeerMain
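
Beyond jps, connectivity to the ensemble can be checked with a small ZooKeeper client. This is a minimal sketch (not from the original post), assuming the ZooKeeper client jar is on the classpath and the quorum addresses configured above.

import java.util.List;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ZkCheck {
    public static void main(String[] args) throws Exception {
        // Connect to the ensemble defined in zoo.cfg
        ZooKeeper zk = new ZooKeeper(
                "namenode:2181,datanode01:2181,datanode02:2181,datanode03:2181",
                5000,
                new Watcher() {
                    public void process(WatchedEvent event) {
                        // no-op watcher; only a synchronous check is needed here
                    }
                });
        List<String> children = zk.getChildren("/", false);
        System.out.println("znodes under /: " + children);  // e.g. [zookeeper]
        zk.close();
    }
}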
    

※ HBase environment setup

  • Install HBase
    $ wget http://mirror.apache-kr.org/hbase/0.98.16.1/hbase-0.98.16.1-hadoop2-bin.tar.gz
    $ tar xvzf hbase-0.98.16.1-hadoop2-bin.tar.gz
    $ mv hbase-0.98.16.1-hadoop2 /usr/local/hbase
    $ chown -R hadoop:hadoop /usr/local/hbase
    
  • Register environment variables -> (each server)
    export HBASE_HOME=/usr/local/hbase
    export PATH=$PATH:$HBASE_HOME/bin
    
  • Configure HBase

$HBASE_HOME/conf/hbase-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_79
export HBASE_CLASSPATH=/usr/local/hadoop/etc/hadoop <- hadoop conf dir
export HBASE_MANAGES_ZK=false

$HBASE_HOME/conf/regionservers -> only datanodes

datanode01
datanode02
datanode03

$HBASE_HOME/conf/hbase-site.xml

<configuration>
        <property>
                <name>hbase.rootdir</name>
                <value>hdfs://namenode:9000/hbase</value>
        </property>
        <property>
                <name>hbase.zookeeper.quorum</name>
                <value>namenode,datanode01,datanode02,datanode03</value>
        </property>
        <property>
                <name>hbase.zookeeper.property.dataDir</name>
                <value>/data/zookeeper</value>
        </property>
        <property>
                <name>hbase.cluster.distributed</name>
                <value>true</value>
        </property>
        <property>
                <name>hbase.dynamic.jars.dir</name>
                <value>/usr/local/hbase/lib</value>
        </property>
</configuration>
  • Distribute the HBase installation and configuration files
    $ scp -r /usr/local/hbase root@datanode01:/usr/local/hbase
    $ scp -r /usr/local/hbase root@datanode02:/usr/local/hbase
    $ scp -r /usr/local/hbase root@datanode03:/usr/local/hbase
    
    $ chown -R hadoop:hadoop /usr/local/hbase
    
  • Start HBase -> namenode
    $ start-hbase.sh
    
  • Verify the HBase processes
    $ jps
    -- namenode
        HMaster
    
    -- datanode
        HRegionServer
    
  • HBase web admin console

HBase Master: namenode --> http://192.168.0.101:60010/
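
As a quick functional check beyond the web console, the following is a minimal HBase 0.98 client sketch (not part of the original post). It assumes a table named test with column family cf has already been created (for example from the hbase shell) and that the HBase client jars and hbase-site.xml are on the classpath.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Same quorum as hbase.zookeeper.quorum in hbase-site.xml
        conf.set("hbase.zookeeper.quorum", "namenode,datanode01,datanode02,datanode03");

        // Assumes the table was created beforehand, e.g. create 'test', 'cf' in the hbase shell
        HTable table = new HTable(conf, "test");
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("col1"), Bytes.toBytes("value1"));
        table.put(put);

        // Read the value back to confirm the round trip through the RegionServers
        Result result = table.get(new Get(Bytes.toBytes("row1")));
        System.out.println(Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col1"))));
        table.close();
    }
}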

※ R-HBase integration environment setup

  • Install Apache Thrift

Install the dependencies for Thrift.

$ yum -y install automake libtool flex bison pkgconfig gcc-c++ boost-devel libevent-devel zlib-devel python-devel ruby-devel openssl-devel

Download, build, and install Thrift

$ wget http://archive.apache.org/dist/thrift/0.8.0/thrift-0.8.0.tar.gz
$ tar -xvzf thrift-0.8.0.tar.gz
$ cd thrift-0.8.0
$ ./configure --without-ruby --without-python
$ make
$ make install
$ ln -s /usr/local/lib/libthrift-0.8.0.so /usr/lib64
  • Register environment variables (root user)
    export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
    
  • Start the HBase Thrift server
    $ hbase-daemon.sh start thrift
    
  • Verify Thrift
    $ jps
        ThriftServer
    
  • Install the rhbase package (R CMD INSTALL)
    $ wget -O rhbase_1.2.1.tar.gz https://github.com/RevolutionAnalytics/rhbase/blob/master/build/rhbase_1.2.1.tar.gz?raw=true
    
    $ R CMD INSTALL rhbase_1.2.1.tar.gz



Implementing a Tree with Spring JAXB

An example of building a tree structure with JAXB from the Spring OXM module.

1. Add the spring-oxm library (via a Maven dependency)
jar download url : http://mvnrepository.com/artifact/org.springframework/spring-oxm/3.0.5.RELEASE


2. Spring bean configuration

Register a MarshallingView for XML responses and inject the class to be bound for XML output as a property.

<bean id="xmlViewer" class="org.springframework.web.servlet.view.xml.MarshallingView">

<constructor-arg>

<bean class="org.springframework.oxm.jaxb.Jaxb2Marshaller">

<property name="classesToBeBound">

<list>

<value>com.rurony.format.common.tree.TreeNode</value>

</list>

</property>

</bean>

</constructor-arg>

</bean>


3. The bound class for XML output

JAXB annotations are configured for XML output, and the class holds child objects of its own type so that it can form a tree.

@XmlRootElement(name="Tree")

@XmlAccessorType(XmlAccessType.FIELD)

public class TreeNode {

@XmlAttribute

private String option;

@XmlAttribute

private String status;

@XmlAttribute

private String name;

@XmlAttribute

private int depth;

@XmlAttribute

private int seq;

@XmlAttribute

private String type;

@XmlAttribute

private String parentId;

@XmlAttribute

private String id;

@XmlElement(name="Node")

protected List<TreeNode> nodes = new ArrayList<TreeNode>();

 

public TreeNode() {

}


public TreeNode(String id, String parentId, String type, int seq, String name, String status, String option) {

this.id = id;

this.parentId = parentId;

this.type = type;

this.seq = seq;

this.depth = 0;

this.name = name;

this.status = status;

this.option = option;

}


public void add(TreeNode node) {

if (id.equals(node.parentId)) {

node.depth = depth + 1;

nodes.add(node);

return;

}

Iterator<TreeNode> iter = nodes.iterator();

while (iter.hasNext()) {

iter.next().add(node);

}

}  


[... Setter/Getter ...]

public List<TreeNode> getNodes() {

return nodes;

}


public void setNodes(List<TreeNode> nodes) {

this.nodes = nodes;

}

}


4. MakeTreeNode Class

A class that builds the tree XML using the XML bound class.

public class MakeTreeNode {

    TreeNode treeNode;

    public MakeTreeNode(String id, String parentId, String type, int seq, String name, String status, String option) {
        treeNode = new TreeNode(id, parentId, type, seq, name, status, option);
    }

    public MakeTreeNode() {
        treeNode = new TreeNode("0", "-1", "1", 1, "ROOT", "1", "");
    }

    public void add(TreeNode node) {
        treeNode.add(node);
    }

    public void addAll(List<TreeNode> nodes) {
        Iterator<TreeNode> iter = nodes.iterator();
        while (iter.hasNext()) {
            add(iter.next());
        }
    }

    public TreeNode getTreeXmlNode() {
        return treeNode;
    }
}


5. Test Tree Util

Build a demo tree.

public static TreeNode getTestTreeXml() {
    List<TreeNode> treeList = new ArrayList<TreeNode>();
    treeList.add(new TreeNode("1", "0", "type", 1, "Node 1", "status", "option"));
    treeList.add(new TreeNode("2", "1", "type", 1, "Node 1-1", "status", "option"));
    treeList.add(new TreeNode("3", "1", "type", 2, "Node 1-2", "status", "option"));
    treeList.add(new TreeNode("4", "0", "type", 2, "Node 2", "status", "option"));
    treeList.add(new TreeNode("5", "4", "type", 1, "Node 2-1", "status", "option"));
    treeList.add(new TreeNode("6", "4", "type", 2, "Node 2-2", "status", "option"));
    treeList.add(new TreeNode("7", "5", "type", 1, "Node 2-1-1", "status", "option"));
    treeList.add(new TreeNode("8", "5", "type", 2, "Node 2-1-2", "status", "option"));
    treeList.add(new TreeNode("9", "0", "type", 3, "Node 3", "status", "option"));
    treeList.add(new TreeNode("10", "9", "type", 1, "Node 3-1", "status", "option"));

    MakeTreeNode tree = new MakeTreeNode("0", "-1", "1", 1, "Tree Root", "status", "option");
    tree.addAll(treeList);
    return tree.getTreeXmlNode();
}


6. Controller

Map the URL and return the tree using the @ResponseBody annotation.

@RequestMapping("/testTreeXML")

public @ResponseBody TreeNode treeXML() {

return MakeTreeUtil.getTestTreeXml();

}
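
To verify the marshalling outside the Spring MVC stack, the following sketch (not from the original post) feeds the demo tree through a standalone Jaxb2Marshaller; MakeTreeUtil is assumed to be the utility class holding getTestTreeXml() from step 5.

import java.io.StringWriter;
import javax.xml.transform.stream.StreamResult;
import org.springframework.oxm.jaxb.Jaxb2Marshaller;

public class TreeXmlSmokeTest {
    public static void main(String[] args) throws Exception {
        Jaxb2Marshaller marshaller = new Jaxb2Marshaller();
        marshaller.setClassesToBeBound(TreeNode.class);
        marshaller.afterPropertiesSet();   // initializes the underlying JAXBContext

        StringWriter writer = new StringWriter();
        marshaller.marshal(MakeTreeUtil.getTestTreeXml(), new StreamResult(writer));
        System.out.println(writer.toString());  // prints <Tree ...> with nested <Node ...> elements
    }
}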


7. Demo Page


Resolving the warning message when using DbUnit with MySQL

1. Warning message when using IDatabaseTester or IDatabaseConnection
WARN : org.dbunit.dataset.AbstractTableMetaData - Potential problem found: The configured data type factory 'class org.dbunit.dataset.datatype.DefaultDataTypeFactory' might cause problems with the current database 'MySQL' (e.g. some datatypes may not be supported properly). In rare cases you might see this message because the list of supported database products is incomplete (list=[derby]). If so please request a java-class update via the forums.If you are using your own IDataTypeFactory extending DefaultDataTypeFactory, ensure that you override getValidDbProducts() to specify the supported database products.


2. MySqlConnection usage example
Use the connection class for your DBMS, such as MySqlConnection, OracleConnection, or Db2Connection.
private final static String driverClass = "com.mysql.jdbc.Driver";
private final static String connectionUrl = "jdbc:mysql://localhost/testdb";
private final static String username = "testdb";
private final static String password = "testdb";
private final static String schema = "testdb";

//private IDatabaseTester databaseTester;

public static Connection getConnection() throws InstantiationException, IllegalAccessException, ClassNotFoundException {
    Connection conn = null;
    try {
        Class.forName(driverClass);
        conn = DriverManager.getConnection(connectionUrl, username, password);
    } catch (SQLException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    return conn;
}

@Before
public void setUp() throws Exception {
    //databaseTester = new JdbcDatabaseTester(driverClass, connectionUrl, username, password);
    //IDatabaseConnection connection = new DatabaseConnection(getConnection());
    IDatabaseConnection connection = new MySqlConnection(getConnection(), schema);
    try {
        IDataSet dataSet = new FlatXmlDataSetBuilder().build(new File("resetDB.xml"));
        //DatabaseOperation.CLEAN_INSERT.execute(databaseTester.getConnection(), dataSet);
        DatabaseOperation.CLEAN_INSERT.execute(connection, dataSet);
    } catch (Exception e) {
        // TODO: handle exception
        //databaseTester.getConnection().close();
        connection.close();
    }
}
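
If you prefer to keep a generic DatabaseConnection instead of MySqlConnection, the warning can also be silenced by registering DbUnit's MySQL data type factory explicitly. A minimal sketch of the alternative setUp lines:

// Alternative: a generic DatabaseConnection plus the MySQL data type factory.
// (Classes: org.dbunit.database.DatabaseConfig, org.dbunit.database.DatabaseConnection,
//  org.dbunit.ext.mysql.MySqlDataTypeFactory)
IDatabaseConnection connection = new DatabaseConnection(getConnection(), schema);
connection.getConfig().setProperty(DatabaseConfig.PROPERTY_DATATYPE_FACTORY,
        new MySqlDataTypeFactory());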