FAQ: Real-time job fails at runtime with "unable to create new native thread"

Problem description / exception stack

2022-01-17 14:28:23
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:717)
    at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1378)
    at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1375)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:991)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$5(StreamTask.java:887)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:860)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:820)
    at org.apache.flink.streaming.runtime.io.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:86)
    at org.apache.flink.streaming.runtime.io.CheckpointBarrierAligner.processBarrier(CheckpointBarrierAligner.java:177)
    at org.apache.flink.streaming.runtime.io.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:155)
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:133)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:69)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:311)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:187)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:487)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:470)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
    at java.lang.Thread.run(Thread.java:748)

Solution

System ulimit parameters and the pid_max parameter

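Run ulimit -a to view all current limits.
The ulimit parameters that matter here are mainly: open files, max user processes
To change them:
1. Raise the per-user soft nproc limit
# Edit /etc/security/limits.d/20-nproc.conf and change the value on the "* soft nproc" line to unlimited
# Shortcut (assumes the current value is 4096; adjust the pattern if it differs):    sed -i 's/4096/unlimited/g' /etc/security/limits.d/20-nproc.conf
*          soft    nproc     unlimited
root       soft    nproc     unlimited
2. Raise the limits for the current shell session
ulimit -n 1048576
ulimit -u 131072
3. Make the limits persistent (they apply to new login sessions)
# Append the following to the end of /etc/security/limits.conf:
* soft nofile 1048576
* hard nofile 1048576
* soft nproc 131072
* hard nproc 131072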
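
To confirm what a shell actually sees after these changes, a small check along the following lines may help. This is a minimal sketch assuming bash; check_limit is a hypothetical helper name, and the target values are simply the ones used in the steps above.

```shell
#!/usr/bin/env bash
# Target values taken from the steps above.
TARGET_NOFILE=1048576
TARGET_NPROC=131072

# check_limit NAME CURRENT TARGET (hypothetical helper)
# Prints "NAME ok (CURRENT)" when CURRENT is "unlimited" or >= TARGET,
# otherwise "NAME low (CURRENT < TARGET)".
check_limit() {
    local name="$1" current="$2" target="$3"
    if [ "$current" = "unlimited" ] || [ "$current" -ge "$target" ]; then
        echo "$name ok ($current)"
    else
        echo "$name low ($current < $target)"
    fi
}

# ulimit -S prints the soft limit of the current shell.
check_limit "open files" "$(ulimit -S -n)" "$TARGET_NOFILE"
check_limit "max user processes" "$(ulimit -S -u)" "$TARGET_NPROC"
```

Run it in a fresh login session as the user that starts the job, since limits set in limits.conf are applied at login and not to shells that are already open.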

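# View the current number of threads on the system (ps -eLf prints one line per thread, plus one header line):
ps -eLf | wc -l
# View the current pid_max value:
sysctl kernel.pid_max
or:
cat /proc/sys/kernel/pid_max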
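
The two numbers above are easier to judge side by side. The following is a rough sketch, assuming bash on Linux; usage_percent is a hypothetical helper, and the thread count is taken from /proc instead of ps so that no header line is included.

```shell
#!/usr/bin/env bash
# usage_percent USED MAX (hypothetical helper):
# integer percentage of MAX consumed by USED.
usage_percent() {
    echo $(( $1 * 100 / $2 ))
}

if [ -r /proc/sys/kernel/pid_max ]; then
    pid_max=$(cat /proc/sys/kernel/pid_max)
    # Every thread has a directory under /proc/<pid>/task, so counting
    # those directories counts threads without the ps header line.
    threads=$(ls -d /proc/[0-9]*/task/[0-9]* 2>/dev/null | wc -l)
    echo "threads: $threads / pid_max: $pid_max ($(usage_percent "$threads" "$pid_max")%)"
fi
```

A percentage close to 100 suggests that pid_max, and not only the per-user limit, is worth raising.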

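To change it:
Temporary (lost on reboot): echo 131072 > /proc/sys/kernel/pid_max
Or write it to the configuration file
# Add the following to /etc/sysctl.conf to make the change permanent
kernel.pid_max = 131072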

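Reload the kernel parameter configuration. This step is required for the /etc/sysctl.conf change to take effect without a reboot.
Command: sysctl -p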
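
To check that the running kernel value matches what was written to the file, something like the following could be used. It is only a sketch, assuming bash and awk on Linux; conf_value is a hypothetical helper that reads a simple "key = value" file and is not part of sysctl itself.

```shell
#!/usr/bin/env bash
# conf_value KEY FILE (hypothetical helper):
# print the last value assigned to KEY in a "key = value" style file.
conf_value() {
    awk -F= -v key="$1" '
        /^[ \t]*[#;]/ { next }                 # skip comment lines
        {
            k = $1; v = $2
            gsub(/[ \t]/, "", k)               # strip whitespace around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)     # trim the value
            if (k == key) last = v
        }
        END { if (last != "") print last }
    ' "$2"
}

if [ -r /etc/sysctl.conf ] && [ -r /proc/sys/kernel/pid_max ]; then
    want=$(conf_value kernel.pid_max /etc/sysctl.conf)
    have=$(cat /proc/sys/kernel/pid_max)
    if [ -z "$want" ]; then
        echo "kernel.pid_max is not set in /etc/sysctl.conf"
    elif [ "$want" = "$have" ]; then
        echo "kernel.pid_max applied: $have"
    else
        echo "kernel.pid_max mismatch: file=$want running=$have (run sysctl -p)"
    fi
fi
```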

Cause

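In most cases, the number of threads for the user on the node running the job has reached its limit. This error means the operating system refused to create another thread; it does not mean the Java heap is exhausted.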
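
Because a process keeps the limits it was started with, it can be useful to read the limits of the process itself. The sketch below assumes bash and awk on Linux; soft_limit_from is a hypothetical helper, and /proc/self/limits is used only as a stand-in for the limits file of the job's JVM.

```shell
#!/usr/bin/env bash
# soft_limit_from FILE NAME (hypothetical helper):
# print the soft limit for the row called NAME (for example
# "Max processes") in a /proc/<pid>/limits style file.
soft_limit_from() {
    awk -v name="$2" '
        index($0, name) == 1 {
            rest = substr($0, length(name) + 1)   # columns after the row name
            split(rest, cols, " ")                # cols[1] is the soft limit
            print cols[1]
        }
    ' "$1"
}

# /proc/self refers to the process that reads the file; it inherits
# the limits of this shell.
limits_file=/proc/self/limits
if [ -r "$limits_file" ]; then
    echo "Max processes (soft):  $(soft_limit_from "$limits_file" "Max processes")"
    echo "Max open files (soft): $(soft_limit_from "$limits_file" "Max open files")"
fi
```

To inspect a running job, point limits_file at the limits file under /proc for the PID of the TaskManager JVM. If it still shows the old values, the process was started before the limits were raised and needs to be restarted from a new session.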

Author: 张鸿运