How many threads can a JVM create: unable to create new native thread

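An application reported the exception "java.lang.OutOfMemoryError: unable to create new native thread". Even running a shell command on the machine failed with "-bash: fork: Resource temporarily unavailable". Other applications on the same machine, such as Hadoop, were affected as well: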

2013-08-21 20:15:48,496 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:640)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:524)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:456)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:128)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
        at java.lang.Thread.run(Thread.java:662)
2013-08-21 20:15:48,497 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..

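At first glance this looks like the machine ran out of memory and could not create a new thread, but the machine still had free memory. My guess was that some limit on thread creation had been reached.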

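First, rule out the operating system's limits on the number of threads. Following the article 《JVM中可生成的最大Thread数量》 ("The maximum number of threads that can be created in a JVM"), configure the operating system to allow 100,000 threads: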

echo "100000" > /proc/sys/kernel/threads-max
echo "100000" > /proc/sys/kernel/pid_max     # default: 32768
echo "200000" > /proc/sys/vm/max_map_count   # default: 65530
ulimit -u unlimited                          # sets "max user processes"
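
Before changing these values it helps to see what they currently are. The following is a minimal sketch, assuming a Linux machine where the three /proc files above are readable; the class name KernelLimits and its helper methods are made up for this illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original test setup.
public class KernelLimits {

  // Parses the content of a single-value /proc file such as "32768\n".
  static long parseLimit(String content) {
    return Long.parseLong(content.trim());
  }

  // Returns the value stored in the given file, or -1 if it cannot be
  // read or parsed (for example on a non-Linux system).
  static long readLimit(String file) {
    try {
      Path path = Paths.get(file);
      byte[] bytes = Files.readAllBytes(path);
      return parseLimit(new String(bytes, StandardCharsets.UTF_8));
    } catch (IOException | RuntimeException e) {
      return -1L;
    }
  }

  public static void main(String[] args) {
    String[] files = {
      "/proc/sys/kernel/threads-max",
      "/proc/sys/kernel/pid_max",
      "/proc/sys/vm/max_map_count"
    };
    for (String file : files) {
      System.out.printf("%-32s %d%n", file, readLimit(file));
    }
  }
}
```

The "max user processes" limit is a per-shell setting, so it is still easiest to check with ulimit -u in the shell that starts the JVM.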

The test environment was:

[admin@bufer108081.tbc ~]$ uname -a
Linux bufer108081.tbc 2.6.32-220.23.2.ali927.el5.x86_64 #1 SMP Mon Jan 28 14:57:06 CST 2013 x86_64 x86_64 x86_64 GNU/Linux
[admin@bufer108081.tbc ~]$ cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 5.7 (Tikanga)
[admin@bufer108081.tbc ~]$ java -version
java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
OpenJDK (Alibaba) 64-Bit Server VM (build 24.45-b08-internal, mixed mode)
[admin@bufer108081.tbc ~]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 387068
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
[admin@bufer108081.tbc ~/dev/baoniu]$ free -g
             total       used       free     shared    buffers     cached
Mem:            47         31         15          0          3         25
-/+ buffers/cache:          3         44
Swap:            0          0          0

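The test program is listed at the end of this article. Result: the JVM went past the 32,000-thread ceiling often quoted online and created close to 100,000 threads (99,410) before failing.
(/proc/sys/kernel/pid_max defaults to 32768, which is why many test programs found online report that a JVM can create only about 32,000 threads.)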

[admin@bufer108081.tbc ~/dev/baoniu]$ java -Xss128k MaxThreadsMain
The stack size specified is too small, Specify at least 228k
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
[admin@bufer108081.tbc ~/dev/baoniu]$ java -Xss228k MaxThreadsMain
4,000 threads: Time to create 4,000 threads was 0.846 seconds
8,000 threads: Time to create 4,000 threads was 2.425 seconds
12,000 threads: Time to create 4,000 threads was 4.813 seconds
16,000 threads: Time to create 4,000 threads was 7.229 seconds
20,000 threads: Time to create 4,000 threads was 10.443 seconds
24,000 threads: Time to create 4,000 threads was 14.480 seconds
28,000 threads: Time to create 4,000 threads was 19.709 seconds
32,000 threads: Time to create 4,000 threads was 24.742 seconds
36,000 threads: Time to create 4,000 threads was 31.181 seconds
40,000 threads: Time to create 4,000 threads was 36.629 seconds
44,000 threads: Time to create 4,000 threads was 42.796 seconds
48,000 threads: Time to create 4,000 threads was 48.659 seconds
52,000 threads: Time to create 4,000 threads was 55.030 seconds
56,000 threads: Time to create 4,000 threads was 60.130 seconds
60,000 threads: Time to create 4,000 threads was 67.419 seconds
64,000 threads: Time to create 4,000 threads was 73.507 seconds
68,000 threads: Time to create 4,000 threads was 79.416 seconds
72,000 threads: Time to create 4,000 threads was 85.261 seconds
76,000 threads: Time to create 4,000 threads was 92.201 seconds
80,000 threads: Time to create 4,000 threads was 98.087 seconds
84,000 threads: Time to create 4,000 threads was 108.263 seconds
88,000 threads: Time to create 4,000 threads was 114.840 seconds
92,000 threads: Time to create 4,000 threads was 121.841 seconds
96,000 threads: Time to create 4,000 threads was 127.714 seconds
After creating 99,410 threads, java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:713)
        at MaxThreadsMain.addThread(MaxThreadsMain.java:43)
        at MaxThreadsMain.main(MaxThreadsMain.java:13)

After more than 90,000 threads had been created, the process used VIRT=40.5g and RES=4.7g, and free -g still showed 9 GB of free memory on the system.


A reference formula for the maximum number of threads a JVM can start:

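(MaxProcessMemory - JVMMemory - ReservedOsMemory) / ThreadStackSize = Number of threads
  • MaxProcessMemory: the maximum address space of a process
  • JVMMemory: memory used by the JVM
  • ReservedOsMemory: memory reserved for the operating system, such as the native heap and JNI, usually a little over 100 MB
  • ThreadStackSize: the thread stack size, set with -Xss when the JVM starts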
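
As a worked example of the formula, here is a small sketch for a 32-bit Linux process. The concrete numbers (a 1024 MB heap, a 64 MB Perm area, 100 MB reserved, 256 KB stacks) are assumptions chosen for illustration, and the class name ThreadEstimate is made up.

```java
// Hypothetical calculator for the formula above; all inputs are in bytes.
public class ThreadEstimate {

  static final long KB = 1024L;
  static final long MB = 1024L * KB;

  // (MaxProcessMemory - JVMMemory - ReservedOsMemory) / ThreadStackSize
  static long estimateMaxThreads(long maxProcessMemory, long jvmMemory,
                                 long reservedOsMemory, long threadStackSize) {
    return (maxProcessMemory - jvmMemory - reservedOsMemory) / threadStackSize;
  }

  public static void main(String[] args) {
    long maxProcessMemory = 3072 * MB;  // 3 GB address space of a 32-bit Linux process
    long jvmMemory = (1024 + 64) * MB;  // assumed: -Xmx1024m plus -XX:MaxPermSize=64m
    long reservedOsMemory = 100 * MB;   // assumed: about 100 MB for native heap, JNI, etc.
    long threadStackSize = 256 * KB;    // assumed: -Xss256k

    // (3072 - 1088 - 100) MB = 1884 MB; 1884 MB / 256 KB = 7536
    System.out.println(estimateMaxThreads(
        maxProcessMemory, jvmMemory, reservedOsMemory, threadStackSize));
  }
}
```

The result is only an upper bound from address space; the operating system limits discussed earlier can stop thread creation well before it.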

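MaxProcessMemory: on 32-bit Linux a process can by default address at most 3 GB. A 64-bit operating system supports up to 46 bits (64 TB) of physical address space and 47 bits (128 TB) of virtual address space per process (see "linux 64位CPU内存限制", on the memory limits of 64-bit CPUs under Linux).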

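JVMMemory: made up of the heap and the Perm (permanent generation) area. -Xms and -Xmx set the heap size; -XX:PermSize and -XX:MaxPermSize set the Perm size (by default between 32 MB and 64 MB, depending on the JVM version).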

ThreadStackSize:

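In a Java program every thread has its own stack space. It is allocated separately and is unrelated to the heap size set with -Xmx and -Xms. The stack holds a stack frame for every method call, so a recursion that goes too deep can exhaust it and raise a StackOverflowError. The default quoted here is 256 KB for a 32-bit JVM and 512 KB for a 64-bit JVM, but the actual default depends on the JVM vendor, version and platform. The maximum also varies with the platform and the machine's configuration; if it is exceeded, a java/lang/OutOfMemoryError message is reported.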

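So reducing the thread stack size with -Xss lets the JVM start more threads, but the total is still bounded by the system's free memory and by operating system limits.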
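
Besides the global -Xss flag, java.lang.Thread has a constructor that takes a requested stack size for one thread. The sketch below shows that constructor; the class name StackSizeDemo and the 256 KB figure are assumptions for illustration, and the JVM is free to round the value up or ignore it, as the "Specify at least 228k" message in the test run suggests.

```java
// Hypothetical demo of requesting a stack size for a single thread.
public class StackSizeDemo {

  // Starts a thread that requests the given stack size (in bytes) and
  // returns true if the thread ran to completion.
  static boolean runWithStackSize(long stackSizeBytes) {
    final boolean[] ran = {false};
    Runnable task = new Runnable() {
      @Override
      public void run() {
        ran[0] = true;
      }
    };
    // The fourth argument is only a hint to the JVM, not a guarantee.
    Thread t = new Thread(null, task, "small-stack", stackSizeBytes);
    t.start();
    try {
      t.join(); // join() makes the write to ran[0] visible to this thread
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return ran[0];
  }

  public static void main(String[] args) {
    System.out.println(runWithStackSize(256 * 1024));
  }
}
```

This is useful when only a few thread types need small stacks and the rest of the application should keep the default.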

To summarize, the factors that affect the number of Java threads are:

  • The Java virtual machine itself: -Xms, -Xmx, -Xss;
  • System limits:
    /proc/sys/kernel/pid_max,
    /proc/sys/kernel/threads-max,
    max user processes (ulimit -u),
    /proc/sys/vm/max_map_count.
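
To judge how close a single JVM is to these limits, one option is to read its live and peak thread counts through the standard ThreadMXBean. This is a minimal sketch; the class name ThreadCountProbe is made up, and it only reports on the JVM it runs in, not on other processes on the machine.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical probe; in practice this would run inside the application.
public class ThreadCountProbe {

  // Number of live threads (daemon and non-daemon) in this JVM.
  static int liveThreads() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    return bean.getThreadCount();
  }

  // Highest live thread count since the JVM started or the peak was reset.
  static int peakThreads() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    return bean.getPeakThreadCount();
  }

  public static void main(String[] args) {
    System.out.printf("live threads: %d, peak threads: %d%n",
        liveThreads(), peakThreads());
  }
}
```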

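P.S. In the end the cause turned out to be a bug in one application on this machine that created too many threads; it hit the system limits and dragged down YARN and the other applications. In general, when a single machine runs too many threads, consider using a thread pool or adding servers.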
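
As an illustration of the thread pool suggestion, the sketch below runs many small tasks on a fixed number of worker threads instead of starting one thread per task. The class name BoundedPoolDemo, the pool size of 8 and the task count of 1,000 are assumptions for the example.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: many tasks, bounded number of threads.
public class BoundedPoolDemo {

  // Runs taskCount small tasks on a pool of poolSize threads and returns
  // how many distinct worker threads executed them.
  static int distinctWorkers(int poolSize, int taskCount) {
    final Set<String> workers =
        Collections.synchronizedSet(new HashSet<String>());
    ExecutorService pool = Executors.newFixedThreadPool(poolSize);
    try {
      for (int i = 0; i < taskCount; i++) {
        pool.execute(new Runnable() {
          @Override
          public void run() {
            // Record which pool thread picked up this task.
            workers.add(Thread.currentThread().getName());
          }
        });
      }
    } finally {
      pool.shutdown(); // accept no new tasks, let queued ones finish
    }
    try {
      pool.awaitTermination(1, TimeUnit.MINUTES);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return workers.size();
  }

  public static void main(String[] args) {
    // The number of worker threads never exceeds the pool size.
    System.out.println(distinctWorkers(8, 1000));
  }
}
```

However many tasks are submitted, the pool never holds more native threads than its configured size, which keeps one misbehaving application from exhausting the machine-wide limits.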


Appendix: the test program:

import java.util.ArrayList;
import java.util.List;

public class MaxThreadsMain {

  // Number of threads created in each round.
  public static final int BATCH_SIZE = 4000;

  public static void main(String... args) throws InterruptedException {
    List<Thread> threads = new ArrayList<Thread>();
    try {
      for (int i = 0; i <= 100 * 1000; i += BATCH_SIZE) {
        long start = System.currentTimeMillis();
        addThread(threads, BATCH_SIZE);
        long end = System.currentTimeMillis();
        Thread.sleep(1000);
        long delay = end - start;
        System.out.printf("%,d threads: Time to create %,d threads was %.3f seconds %n", threads.size(), BATCH_SIZE, delay / 1e3);
      }
    } catch (Throwable e) {
      // Thread.start() throws OutOfMemoryError once no more native threads can be created.
      System.err.printf("After creating %,d threads, ", threads.size());
      e.printStackTrace();
    }

  }

  // Starts num daemon threads that sleep until interrupted and records them in the list.
  private static void addThread(List<Thread> threads, int num) {
    for (int i = 0; i < num; i++) {
      Thread t = new Thread(new Runnable() {
        @Override
        public void run() {
          try {
            while (!Thread.interrupted()) {
              Thread.sleep(1000);
            }
          } catch (InterruptedException ignored) {
            // Interrupted while sleeping: let the thread exit.
          }
        }
      });
      t.setDaemon(true);
      t.setPriority(Thread.MIN_PRIORITY);
      threads.add(t);
      t.start();
    }
  }
}
