Ever since I started working with Hadoop, I have felt the need for a tool to visualize the blocks of an HDFS file or directory. Unfortunately, I could not find any such tool in the open source Hadoop versions. Here is a simple online tool that visualizes the output of the hadoop fsck command in graphical format. To use the tool, run the following command and save the output to a text file. The tool then uses this text file to visualize the nodes, the blocks, and the total size on each slave node.
The following command collects the block details for the HDFS folder '/in' and writes them to the file fsck.txt. See my previous blog post for details.
hadoop fsck /in -files -blocks -locations -racks > fsck.txt
Click 'Choose File', select the fsck output file, and voila! You see the HDFS chunks in graphical format without installing any software.
The following snapshot shows the sample output:
Visualization Tool for HDFS blocks/chunks
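To give a rough idea of the kind of information the tool pulls out of fsck.txt, here is a minimal Python sketch that counts the blocks and the total bytes held on each node. It is not the tool's actual code. The function name summarize_fsck and the regular expression are my own illustration, and the sketch assumes the Hadoop 1.x style of block line (index, block id, len=, repl=, then a bracketed list of rack/host locations). Other Hadoop versions may print these lines differently.

```python
import re
from collections import defaultdict

# Assumed Hadoop 1.x block line (other versions may differ), for example:
# 0. blk_-123_1001 len=67108864 repl=2 [/default-rack/10.0.0.1:50010, /default-rack/10.0.0.2:50010]
BLOCK_LINE = re.compile(
    r"^\s*\d+\.\s+(\S*blk_\S+)\s+len=(\d+)\s+repl=\d+\s+\[(.*)\]"
)


def summarize_fsck(text):
    """Return {location: [block_count, total_bytes]} for fsck output text.

    Hypothetical helper for illustration; each replica of a block is
    counted on the node that stores it.
    """
    per_node = defaultdict(lambda: [0, 0])
    for line in text.splitlines():
        match = BLOCK_LINE.match(line)
        if not match:
            continue  # skip file headers, summary lines, and blank lines
        length = int(match.group(2))
        for location in match.group(3).split(","):
            location = location.strip()
            if location:
                per_node[location][0] += 1
                per_node[location][1] += length
    return dict(per_node)


if __name__ == "__main__":
    # fsck.txt is the file produced by the hadoop fsck command shown earlier.
    with open("fsck.txt") as handle:
        summary = summarize_fsck(handle.read())
    for node, (blocks, size) in sorted(summary.items()):
        print(node, blocks, "blocks,", size, "bytes")
```

Running the script in the directory that contains fsck.txt prints one line per rack/node location, which is the same per-node view the tool draws graphically.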
Just get started with the tool:
HDFS Blocks Visualization
Help: Click Choose File and select an fsck output file stored on your system to visualize the HDFS blocks or chunks across your Hadoop cluster. You can use the Sample Data button to see the output for a sample file, or press Show Sample Data to see the contents of the sample file.
Note that I have done only limited testing of the tool, mostly with Hadoop 1.x.x and Chrome.
Please share your comments to help improve the tool, and let me know if you see any issues. Also, because all the processing happens in your browser, try to use it with a single HDFS file or a directory with a limited number of blocks/files.