How to clear job history in Hadoop?



QEERT
11-07-2023, 08:37 PM
I'm having trouble clearing the job history in Hadoop and don't really understand how to do it. Does anyone know a detailed way to clear it without affecting cluster performance?

feev_red
11-07-2023, 08:40 PM
Clearing job history in Hadoop is not complicated. Note that 'mapred job -history <jobHistoryFile>' only displays the history of a job; it has no '-clear' option, so clearing the history means deleting the history files themselves. It's best to first consider whether this is necessary, because removing the history can affect later data analysis and job tracking. If you find it difficult to understand, here is a link (http://devhubby.com/thread/how-to-clear-hadoop-jobs-history) with more information about Hadoop job history.

benauther
11-09-2023, 10:47 AM
To clear job history in Hadoop, first stop the JobHistory Server with mapred --daemon stop historyserver. Then remove the existing job history data, which is typically stored in HDFS (hdfs dfs -rm -r /user/history/*). If history data is also stored locally, clear that directory as well (rm -rf /path/to/local/history/directory/*). Finally, restart the JobHistory Server with mapred --daemon start historyserver. The paths above are examples: check mapreduce.jobhistory.done-dir and mapreduce.jobhistory.intermediate-done-dir in your mapred-site.xml for the actual locations, verify that you have permission to delete from them, and take a backup before cleaning up.
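
The stop / delete / restart sequence above could be wrapped in a small script along these lines. This is only a sketch: the HISTORY_DIR and DRY_RUN variables and the run and clear_job_history helpers are names made up for illustration, and /user/history is simply the example path from the post, not necessarily where your cluster keeps its history. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: clear MapReduce job history (stop server, delete files, restart).
# Dry run by default: commands are printed, not executed.
set -eu

# Assumption: example path from the post; check mapreduce.jobhistory.done-dir
# and mapreduce.jobhistory.intermediate-done-dir for the real locations.
HISTORY_DIR="${HISTORY_DIR:-/user/history}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

clear_job_history() {
  run mapred --daemon stop historyserver
  # The glob is quoted so that the HDFS shell expands it, not the local shell.
  # -f keeps the exit status clean if nothing matches.
  run hdfs dfs -rm -r -f "${HISTORY_DIR}/*"
  run mapred --daemon start historyserver
}

clear_job_history
```

Run it as is to review the commands, then run it again with DRY_RUN=0 (and HISTORY_DIR set to your actual history directory) once you have a backup and are sure the path is right.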