Monday, November 7, 2011

MYSQL with Cloudera-VM for Hadoop..

I will discuss about how to configure hive to run using MYSQL Db instead of default derby DB in cloudera's VM.

First step is to install mysql-server which is provided in the packages.
Go to terminal and start:

$ sudo apt-get install mysql-server
It will start the installation of MYSQL.

Keep the username and password as root

After it gets installed, u can check that it works fine. For it, u go to the location where u hav installed it (/etc/mysql) and give the command:

$ mysql -uroot -proot

or if u don't want to show the password:

$ mysql -uroot -p

(it will ask for the password)

Den if u want a seperate user for hadoop then u can do that:
create user 'hadoop'@'localhost' identified by 'hadoop'
grant all privileges on *.* to 'hadoop'@'localhost'with grant option

I have done it though using the root user only. So i don't need to explicitely give the permissions (it has by default)



Then u can 'exit' from it.

After this, u need to change the hive configuration so it can use MySQL.
(/etc/hive/conf/hive-site.xml or whereever u have installed)




Now u need to get the MYSQL JDBC driver to make the connection. Go to  this site  to get the latest version. If it is not compatible with the version of your MYSQL version, check for previous versions.

After downloading the file, untar it. Copy the jar file from inside it to the lib folder of hive. (default is /usr/lib/hive/lib/).

Now u can check that hive connects to the MYSQL DB. Go to hive (/etc/hive) and type command:

$ hive

hive> show tables;

U will see that the tables which u created in your derby DB are now not available in it.. It is now connected to MYSQL DB.

U can now start playing around hive with MYSQL as DB.





No comments:

Post a Comment