java.io.IOException: Max block location exceeded for split

After our cluster was recently upgraded to Hadoop 2.0, a MapReduce job that used to run fine started failing with the following exception:

13/09/05 12:02:05 ERROR security.UserGroupInformation: PriviledgedActionException as:admin (auth:SIMPLE) cause:java.io.IOException: Max block location exceeded for split: hdfs://s001001.sqa:9000/test/baoniu/data/part-r-00002:268435456+289312157 splitsize: 20 maxsize: 10
Exception in thread "main" java.io.IOException: Max block location exceeded for split: hdfs://s001001.sqa:9000/test/baoniu/data/part-r-00002:268435456+289312157 splitsize: 20 maxsize: 10
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.writeNewSplits(JobSplitWriter.java:132)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:74)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:458)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:469)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1287)
        at com.taobao.group.group_inc.run(group_inc.java:332)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at com.taobao.group.group_inc.main(group_inc.java:349)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
13/09/05 12:02:05 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/admin/.staging/job_1377833725640_0284/job.split: File does not exist. Holder DFSClient_NONMAPREDUCE_-1494218586_1 does not have any open files.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2445)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2262)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2175)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:480)

Reading the code, the cause is that the input contains many small files, so a single split ends up sourced from more machines than the system default of 10. The check is in JobSplitWriter.writeNewSplits:

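      // From org.apache.hadoop.mapreduce.split.JobSplitWriter#writeNewSplits (Hadoop 2.x):
      // after serializing each split, the number of block locations it reports
      // is checked against mapreduce.job.max.split.locations (default 10).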
      int maxBlockLocations = conf.getInt(MRConfig.MAX_BLOCK_LOCATIONS_KEY,
          MRConfig.MAX_BLOCK_LOCATIONS_DEFAULT);
      long offset = out.getPos();
      for(T split: array) {
        long prevCount = out.getPos();
        Text.writeString(out, split.getClass().getName());
        Serializer<T> serializer =
          factory.getSerializer((Class<T>) split.getClass());
        serializer.open(out);
        serializer.serialize(split);
        long currCount = out.getPos();
        String[] locations = split.getLocations();
        if (locations.length > maxBlockLocations) {
          throw new IOException("Max block location exceeded for split: "
              + split + " splitsize: " + locations.length +
              " maxsize: " + maxBlockLocations);
        }

Looking at the old 0.20.x code, this condition did not throw an exception; it only printed a warning. It is unclear why the new version throws an IOException, since the condition is very easy to trigger.
Adding the following to the job's launch command:

 -Dmapreduce.job.max.split.locations=25

does the trick. Alternatively, you can reduce the number of splits per file by making the input files larger or by tuning the dfs block size. A driver-side equivalent of the flag is sketched below.
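For reference, the same override can be applied programmatically in the driver before submission. This is a minimal sketch rather than the original job's code; the class name and the argument-based input/output paths are stand-ins for illustration:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MaxSplitLocationsDemo {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same effect as -Dmapreduce.job.max.split.locations=25 on the command
        // line; must be set before submission, since the check runs in
        // JobSplitWriter while the splits are being serialized.
        conf.setInt("mapreduce.job.max.split.locations", 25);

        Job job = Job.getInstance(conf, "max-split-locations-demo"); // hypothetical job name
        job.setJarByClass(MaxSplitLocationsDemo.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // placeholder input path
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // placeholder output path
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

If the small files are being glued together by a CombineFileInputFormat (a common way a single split ends up spanning many hosts), capping the combined split size, e.g. via mapreduce.input.fileinputformat.split.maxsize, should also keep the per-split location count down; whether that applies depends on the job's input format.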

There is already a patch that changes this back to a warning, though it has not been merged into trunk yet: https://issues.apache.org/jira/browse/MAPREDUCE-5186
