LuAndreCast/cloudpgm3


cloudpgm3

Program 3 for cloud programming: using Hadoop MapReduce to process a large Twitter dataset and determine which tweets report natural disasters.
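The heart of the job is a map-side filter that flags tweets containing disaster-related terms. The SocialMining source is not included in this README, so the following is a hypothetical, jar-free sketch of that filtering logic in plain Java (the class name, method name, and keyword list are illustrative, not taken from the repository):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the map-side filter used to flag
// disaster-related tweets. The real SocialMining class would run
// logic like this inside a Hadoop Mapper, emitting a (keyword, 1)
// pair for each matching tweet.
public class DisasterFilter {

    // Illustrative keyword list; the actual terms used by the
    // assignment are not given in this README.
    private static final Set<String> KEYWORDS = new HashSet<>(
            Arrays.asList("fire", "flood", "earthquake", "hurricane", "storm"));

    // Returns the first disaster keyword found in the tweet text,
    // or null if the tweet does not look like a natural disaster.
    public static String matchKeyword(String tweetText) {
        for (String token : tweetText.toLowerCase().split("\\W+")) {
            if (KEYWORDS.contains(token)) {
                return token;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(matchKeyword("King Fire spreads north of Sacramento")); // prints "fire"
        System.out.println(matchKeyword("great coffee this morning"));             // prints "null"
    }
}
```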

TightVNC installation: https://www.youtube.com/watch?v=x9xTyh63Tos

Connect to a slave VM from the master VM:

    ssh slave1

Connect from a slave VM to the master VM:

    ssh master

See all files in HDFS (from the master VM):

    cd hadoop-1.2.1
    bin/hadoop dfs -ls /

See the data in a specific folder/file (from the master VM); each record is output as a JSON object:

    bin/hadoop dfs -cat /Arthur | less

See the data (on a slave VM):

    [change directory to the data folder]
    cd ~/data/hadoop/dfs/data/current/
    [list the data]
    ls -l | less

[temp storage]

    cd ~/data/hadoop_local/ttprivate/taskTracker/visa/jobcache/job*

See the available DFS commands:

    bin/hadoop dfs help

[delete a folder]

    bin/hadoop dfs -rmr /Arthur_grep_output

[list all files]

    bin/hadoop dfs -ls /

DO NOT DELETE THE /Arthur AND /KingFire FOLDERS.

Location of the Hadoop examples:

    /home/visa/hadoop-1.2.1/src/examples/org/apache/hadoop/examples

Check the data in one of the three slaves' directories:

    cd /data/hadoop_local/userlogs/LASTJOB/ATTEMPT/
    less syslog

To check MapReduce status: http://localhost:50030

To check HDFS status: http://localhost:50070

To manage jobs: bin/hadoop job

To manage HDFS: bin/hadoop dfs

To run your job with multiple reduce tasks (recommended):

    bin/hadoop jar socialmining.jar WordCount -Dmapred.reduce.tasks=6 /arthur /arthur_output
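Note that generic options such as -D are only honored when the driver parses its arguments through ToolRunner/GenericOptionsParser. With mapred.reduce.tasks=6, which of the six reducers (and hence which of the six part files) a key lands in is decided by Hadoop's default HashPartitioner, i.e. (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. A stdlib-only sketch of that routing (the class name PartitionDemo is illustrative, not from the repository):

```java
// Stdlib-only sketch of Hadoop's default HashPartitioner logic:
// with mapred.reduce.tasks=6, each key is routed to one of six
// reducers (and one of six output part files) by this formula.
public class PartitionDemo {

    static int partitionFor(String key, int numReduceTasks) {
        // Mask off the sign bit so negative hash codes still map
        // to a non-negative reducer index.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"fire", "flood", "storm"}) {
            System.out.println(word + " -> reducer " + partitionFor(word, 6));
        }
    }
}
```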

To use the logger in your program:

    // import at the beginning of your program
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;

    // declare in your class
    private static final Log log = LogFactory.getLog(WordCount.class);

    // call log.info or log.debug where you want to log
    log.info(value.toString());

If you log inside of a map or reduce task, go to /data/hadoop_local/userlogs/jobxxx to check the log.

To use the JSON parser class:

Google json-20080701.jar and download it (the org.json jar dated 2008-07-01).

    // import the JSON parser at the beginning of your program
    import org.json.JSONObject;
    import org.json.JSONException;
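With the jar on the classpath, a mapper would typically pull the tweet text out of each JSON line with new JSONObject(line).getString("text"). Since org.json is an external jar, here is a dependency-free sketch of that extraction; it only handles a flat, unescaped "text" field, unlike the real parser, and the class name TweetText is illustrative:

```java
// Dependency-free sketch of pulling the "text" field out of a flat
// tweet JSON line, mimicking new JSONObject(line).getString("text").
// The real org.json parser handles nesting and escape sequences;
// this sketch assumes a simple "text":"..." value for illustration.
public class TweetText {

    public static String extractText(String jsonLine) {
        String key = "\"text\":\"";
        int start = jsonLine.indexOf(key);
        if (start < 0) {
            return null; // no text field found
        }
        start += key.length();
        int end = jsonLine.indexOf('"', start);
        return end < 0 ? null : jsonLine.substring(start, end);
    }

    public static void main(String[] args) {
        String line = "{\"id\":1,\"text\":\"flood warning issued\",\"lang\":\"en\"}";
        System.out.println(extractText(line)); // prints: flood warning issued
    }
}
```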

TUTORIALS (Hadoop and Java): https://www.youtube.com/watch?v=PAaIAdtx080

Compiling and running instructions

Step 0 START all Hadoop daemons

    cd hadoop-1.2.1
    sh start-all.sh

Step 1 prepare the program (compile it and put it into a jar file)

    compile the Java program:
    javac -classpath "../*:../lib/*" SocialMining.java

    [location of libraries] ~/hadoop-1.2.1/lib

    pack the program into a jar
    (RUN THIS FROM OUTSIDE THE DIRECTORY THAT IS BEING PACKED):
    jar -cvf socialmining.jar -C socialmining/ .
    
Step 2 run the program

    bin/hadoop jar socialmining.jar SocialMining /KingFire /KingFireout

    [hadoop examples are located
    inside the jar file hadoop-examples-1.2.1.jar
    or in /hadoop-1.2.1/src/examples/org/apache/hadoop/examples/
    ]
    bin/hadoop jar hadoop-examples-1.2.1.jar grep /Arthur /arthurOut12 rain

    [running without JSON]
    bin/hadoop jar wordcount.jar WordCount -Dmapred.reduce.tasks=6 /Kingfire /Kingfire_out1

    [running with JSON]
    bin/hadoop jar socialmining.jar SocialMining -libjars json-20080701.jar -Dmapred.reduce.tasks=6 /Kingfire /Kingfire_out1

Step 3 see the job run from a web browser (FROM THE MASTER VM ONLY)

    open a web browser:
    localhost:50030/jobtracker.jsp

Step 4 STOP all Hadoop daemons

    cd hadoop-1.2.1
    sh stop-all.sh
