Description

On our cluster, we have seen Pig (http://incubator.apache.org/pig/) fill up /tmp and fail.
This is also inefficient, since all the local tasks spill to the same disk.

Pig simply uses the Java API File.createTempFile:

http://java.sun.com/j2se/1.5.0/docs/api/java/io/File.html#createTempFile(java.lang.String,%20java.lang.String,%20java.io.File
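For illustration, here is a minimal sketch of how the two forms of File.createTempFile choose their directory. It is not taken from Pig's source; the class name TmpDirDemo, the helper names, and the "pig" prefix are hypothetical. The two-argument form falls back to the directory named by the java.io.tmpdir system property, while the three-argument form writes to an explicit directory.

```java
import java.io.File;
import java.io.IOException;

public class TmpDirDemo {
    // Two-argument form: the directory is taken from the
    // java.io.tmpdir system property (commonly /tmp on Unix).
    public static File createInDefaultTmp() throws IOException {
        File f = File.createTempFile("pig", ".tmp");
        f.deleteOnExit(); // best-effort cleanup when the JVM exits normally
        return f;
    }

    // Three-argument form: the caller chooses the directory explicitly.
    public static File createIn(File dir) throws IOException {
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("cannot create directory " + dir);
        }
        File f = File.createTempFile("pig", ".tmp", dir);
        f.deleteOnExit();
        return f;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("java.io.tmpdir = " + System.getProperty("java.io.tmpdir"));
        System.out.println("default:  " + createInDefaultTmp().getAbsolutePath());
        // "./tmp" is resolved against the process working directory.
        System.out.println("explicit: " + createIn(new File("./tmp")).getAbsolutePath());
    }
}
```

Running the same class with -Djava.io.tmpdir set to another existing directory should move the "default" file there without any code change, which is why a JVM option is enough for callers that use the two-argument form.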

Can we add -Djava.io.tmpdir="./tmp" somewhere, so that:

1) Tasks can use all disks for their temporary files.
2) Any undeleted temporary files are removed by the tasktracker when the task (job?) is done.

The easiest way is to set it in mapred.child.java.opts in the config, but that value is overwritten whenever users set their own task heap size.
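One way to picture a fix for the override problem is a small merge step that appends the tmpdir flag to whatever options the user supplied. The sketch below is only an assumption about how that could look, not the approach taken in the attached patches; the class ChildOptsSketch and the method withTmpDir are hypothetical names.

```java
public class ChildOptsSketch {
    static final String TMPDIR_FLAG = "-Djava.io.tmpdir=";

    // Returns the user's JVM options with a tmpdir flag appended,
    // unless the user already set java.io.tmpdir explicitly.
    public static String withTmpDir(String userOpts, String tmpDir) {
        String opts = (userOpts == null) ? "" : userOpts.trim();
        // Naive whitespace split: quoted option values are not handled.
        for (String token : opts.split("\\s+")) {
            if (token.startsWith(TMPDIR_FLAG)) {
                return opts; // respect the user's own choice
            }
        }
        String flag = TMPDIR_FLAG + tmpDir;
        return opts.isEmpty() ? flag : opts + " " + flag;
    }

    public static void main(String[] args) {
        // A user who only set the heap size still gets the tmpdir flag.
        System.out.println(withTmpDir("-Xmx512m", "./tmp"));
        // prints "-Xmx512m -Djava.io.tmpdir=./tmp"
    }
}
```

With a step like this, a user-provided value such as "-Xmx512m" keeps the per-task temporary directory, while a user who sets java.io.tmpdir on purpose is left alone.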

Attachments

  1. patch-2735.txt
    10 kB
    Amareshwari Sriramadasu
  2. patch-2735.txt
    10 kB
    Amareshwari Sriramadasu
  3. patch-2735.txt
    8 kB
    Amareshwari Sriramadasu
  4. patch-2735.txt
    8 kB
    Amareshwari Sriramadasu
  5. patch-2735.txt
    2 kB
    Amareshwari Sriramadasu
  6. patch-2735.txt
    2 kB
    Amareshwari Sriramadasu
  7. patch-2735.txt
    0.7 kB
    Amareshwari Sriramadasu
