christiandietze/hadoop-log-getter

hadoop-log-getter

A simple tool to show all logs of a Hadoop job

  • Takes the job id
  • Queries the YARN History Server REST API for the log URLs of all job attempts
  • Scrapes the log pages
  • Strips the HTML surrounding the log output
  • Returns everything on one page
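The HTML-stripping step above can be illustrated with a minimal sketch. It is not taken from the project's source: the class and method names are hypothetical, and it assumes the log pages wrap the raw log output in `<pre>` elements, which may not hold for every Hadoop version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the project's actual code.
class LogPageStripper {

    // Assumption: the log text sits inside <pre>...</pre> elements.
    // DOTALL lets '.' span line breaks; the reluctant '.*?' stops at the first </pre>.
    private static final Pattern PRE_BLOCK = Pattern.compile(
            "<pre(?:\\s[^>]*)?>(.*?)</pre>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Returns the decoded content of every <pre> block, in page order.
    static List<String> extractLogs(String html) {
        List<String> logs = new ArrayList<>();
        Matcher matcher = PRE_BLOCK.matcher(html);
        while (matcher.find()) {
            logs.add(unescape(matcher.group(1)));
        }
        return logs;
    }

    // Decodes only a handful of common entities. "&amp;" is handled last
    // so that an escaped "&amp;lt;" becomes "&lt;" rather than "<".
    static String unescape(String text) {
        return text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Logs</h1>"
                + "<pre>INFO a &lt; b\nWARN done</pre></body></html>";
        for (String log : extractLogs(page)) {
            System.out.println(log);
        }
    }
}
```

A regular expression keeps the sketch dependency-free. For pages with nested or malformed markup, a real HTML parser would be the more robust choice.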

Prerequisites

  • Hadoop >= 2.0, tested with 2.0.6-alpha
  • History Server installed and configured
  • Java 8
  • Maven for building

Usage

  • Set your history server hostname in src/main/resources/logGetter.properties. The server is assumed to be running on the standard port (19888).
  • Build the jar and run it:

    mvn package
    java -jar target/log-getter-1.0-SNAPSHOT.jar
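As a rough sketch of how the configured hostname and the fixed port could be combined into REST URLs: the property key `historyserver.hostname` and all class and method names below are hypothetical, since this README does not document the contents of logGetter.properties. The `/ws/v1/history/mapreduce` paths follow the MapReduce History Server REST API as documented for Hadoop 2.x, and may differ in other versions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper for illustration only; not the project's actual code.
class HistoryServerUrls {

    // Standard History Server web port, as stated in the usage notes.
    static final int DEFAULT_PORT = 19888;

    // Assumption: this key name is made up; check logGetter.properties for the real one.
    static final String HOST_KEY = "historyserver.hostname";

    // Builds the REST base URL from the configured hostname.
    static String baseUrl(Properties props) {
        String host = props.getProperty(HOST_KEY);
        if (host == null || host.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing property: " + HOST_KEY);
        }
        return "http://" + host.trim() + ":" + DEFAULT_PORT + "/ws/v1/history/mapreduce";
    }

    // URL listing all tasks of a job.
    static String tasksUrl(String baseUrl, String jobId) {
        return baseUrl + "/jobs/" + jobId + "/tasks";
    }

    // URL listing all attempts of one task.
    static String attemptsUrl(String baseUrl, String jobId, String taskId) {
        return tasksUrl(baseUrl, jobId) + "/" + taskId + "/attempts";
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Stands in for reading src/main/resources/logGetter.properties.
        props.load(new StringReader(HOST_KEY + "=history.example.com\n"));
        String base = baseUrl(props);
        System.out.println(tasksUrl(base, "job_1326381300833_0002"));
    }
}
```

The job and task ids in `main` are placeholders in the usual Hadoop id format, not ids from a real cluster.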

Limitations

  • Works only for jobs that have already finished
  • Returns all logs in a single request, which can be very slow
