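Troubleshooting Massive Thread Creation in Nacos

Problem description

Some users reported that while using Nacos, Java threads kept being created as the application ran, eventually reaching two to three thousand threads and driving the CPU load metric to 100%.

Resolution process

Looking at Nacos, the threads being created in large numbers turned out to be tied to the NacosConfigService object: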

public NacosConfigService(Properties properties) throws NacosException {
    String encodeTmp = properties.getProperty(PropertyKeyConst.ENCODE);
    if (StringUtils.isBlank(encodeTmp)) {
        encode = Constants.ENCODE;
    } else {
        encode = encodeTmp.trim();
    }
    initNamespace(properties);
    agent = new MetricsHttpAgent(new ServerHttpAgent(properties));
    agent.start();
    worker = new ClientWorker(agent, configFilterChainManager, properties);
}

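The object that actually owns the threads, however, is ClientWorker, which NacosConfigService creates in its constructor: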

@SuppressWarnings("PMD.ThreadPoolCreationRule")
public ClientWorker(final HttpAgent agent, final ConfigFilterChainManager configFilterChainManager, final Properties properties) {
    this.agent = agent;
    this.configFilterChainManager = configFilterChainManager;

    // Initialize the timeout parameter
    init(properties);

    // Single-threaded scheduler that periodically runs checkConfigInfo()
    executor = Executors.newScheduledThreadPool(1, new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r);
            t.setName("com.alibaba.nacos.client.Worker." + agent.getName());
            t.setDaemon(true);
            return t;
        }
    });

    // Long-polling pool with one core thread per available processor
    executorService = Executors.newScheduledThreadPool(Runtime.getRuntime().availableProcessors(), new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r);
            t.setName("com.alibaba.nacos.client.Worker.longPolling." + agent.getName());
            t.setDaemon(true);
            return t;
        }
    });

    executor.scheduleWithFixedDelay(new Runnable() {
        @Override
        public void run() {
            try {
                checkConfigInfo();
            } catch (Throwable e) {
                LOGGER.error("[" + agent.getName() + "] [sub-check] rotate check error", e);
            }
        }
    }, 1L, 10L, TimeUnit.MILLISECONDS);
}

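So my first suspicion was that users were creating a large number of NacosConfigService objects.

User jmap data

The user's jmap object histogram data

The histogram shows that the current JVM holds more than two thousand ClientWorker objects, and from the Nacos source analysis above we know that every ClientWorker object owns its own thread pools.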
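To make the link between object count and thread count concrete, here is a minimal sketch. `LeakyWorkerDemo` and its nested `Worker` are hypothetical stand-ins, not Nacos classes; they only mimic the pattern seen in the ClientWorker constructor, where every instance creates a scheduled pool and immediately schedules a periodic task on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo: every Worker instance owns its own scheduler thread. */
final class LeakyWorkerDemo {

    static final String PREFIX = "demo.Worker.";

    /** Stand-in for ClientWorker; the name and shape are assumptions for illustration. */
    static final class Worker {
        final ScheduledExecutorService executor;

        Worker(int id) {
            executor = Executors.newScheduledThreadPool(1, r -> {
                Thread t = new Thread(r);
                t.setName(PREFIX + id);
                t.setDaemon(true);
                return t;
            });
            // Scheduling the periodic task is what starts the pool thread.
            executor.scheduleWithFixedDelay(() -> { }, 1L, 10L, TimeUnit.MILLISECONDS);
        }
    }

    /** Counts live threads whose name starts with the demo prefix. */
    static long liveWorkerThreads() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(PREFIX))
                .count();
    }

    /** Creates n workers and returns them so the caller can shut them down. */
    static List<Worker> createWorkers(int n) {
        List<Worker> workers = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            workers.add(new Worker(i));
        }
        return workers;
    }

    public static void main(String[] args) {
        List<Worker> workers = createWorkers(50);
        System.out.println("live worker threads: " + liveWorkerThreads());
        // Without an explicit shutdown the pool threads would stay alive.
        workers.forEach(w -> w.executor.shutdownNow());
    }
}
```

Each worker that is created and never shut down keeps at least one thread alive, so a couple of thousand instances are enough to explain a couple of thousand threads.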

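User self-check

We first asked users to check whether they had created a large number of NacosConfigService instances themselves. At this point, some users confirmed that their own mistakes had indeed led to a large number of NacosConfigService objects being created.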
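One way to run such a self-check from inside the application is to count live threads by name. The sketch below relies on the two thread-name prefixes visible in the ClientWorker constructor; the class `NacosThreadCensus` and its methods are hypothetical helpers written for this illustration, not part of Nacos.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that groups thread names by the prefixes ClientWorker uses. */
final class NacosThreadCensus {

    static final String WORKER_PREFIX = "com.alibaba.nacos.client.Worker.";
    static final String LONG_POLLING_PREFIX = "com.alibaba.nacos.client.Worker.longPolling.";

    /** Classifies each name as "longPolling", "worker" or "other" and counts them. */
    static Map<String, Integer> countByKind(Iterable<String> threadNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : threadNames) {
            String kind;
            // Check the longer prefix first, since it also matches WORKER_PREFIX.
            if (name.startsWith(LONG_POLLING_PREFIX)) {
                kind = "longPolling";
            } else if (name.startsWith(WORKER_PREFIX)) {
                kind = "worker";
            } else {
                kind = "other";
            }
            counts.merge(kind, 1, Integer::sum);
        }
        return counts;
    }

    /** Takes a snapshot of the threads currently alive in this JVM. */
    static Map<String, Integer> snapshot() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return countByKind(names);
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

If the "worker" count keeps climbing between snapshots, new ClientWorker instances are still being created somewhere.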

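Spring-Cloud-Alibaba component check

However, other users said that they depended only on the spring-cloud-alibaba-nacos component and never touched NacosConfigService themselves, yet still saw large numbers of threads being created. Feedback from one user's self-check finally confirmed a bug in spring-cloud-alibaba-nacos: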

@ConfigurationProperties(NacosConfigProperties.PREFIX)
public class NacosConfigProperties {
    ...
    private ConfigService configService;
    ...
    @Deprecated
    public ConfigService configServiceInstance() {

        if (null != configService) {
            return configService;
        }

        Properties properties = new Properties();
        ...

        try {
            configService = NacosFactory.createConfigService(properties);
            return configService;
        }
        catch (Exception e) {
            log.error("create config service error!properties={},e=,", this, e);
            return null;
        }
    }
}

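This configuration class caches a ConfigService instance in a field, with the intent of maintaining a singleton on its own. In practice, however, the NacosConfigProperties bean is recreated every time the spring-cloud context is refreshed. So whenever a configuration is updated, the following chain is triggered: Context refresh -> NacosConfigProperties is recreated -> the ConfigService cache is lost -> a new ConfigService is created.

Because of this chain of cause and effect, the ConfigService cache stops working as soon as the Context is refreshed, and every newly created ConfigService brings a new ClientWorker with its own thread pools.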
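The failure mode can be reproduced without Spring. In the sketch below, `FakeConfigService` and `FakeProperties` are hypothetical stand-ins for ConfigService and NacosConfigProperties, and each loop iteration plays the role of one context refresh that rebuilds the bean.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo: an instance-field cache does not survive bean recreation. */
final class RefreshCacheDemo {

    static final AtomicInteger CREATED = new AtomicInteger();

    /** Stand-in for ConfigService; counts how many instances were built. */
    static final class FakeConfigService {
        FakeConfigService() {
            CREATED.incrementAndGet();
        }
    }

    /** Stand-in for NacosConfigProperties: caches the service in an instance field. */
    static final class FakeProperties {
        private FakeConfigService configService;

        FakeConfigService configServiceInstance() {
            if (configService != null) {
                return configService;
            }
            configService = new FakeConfigService();
            return configService;
        }
    }

    /** Simulates the given number of refreshes and returns how many services were created. */
    static int servicesCreatedAfter(int refreshes) {
        int before = CREATED.get();
        for (int i = 0; i < refreshes; i++) {
            FakeProperties bean = new FakeProperties(); // the bean is rebuilt on refresh
            bean.configServiceInstance();               // cache miss: new service
            bean.configServiceInstance();               // cache hit within the same bean
        }
        return CREATED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("services created: " + servicesCreatedAfter(5));
    }
}
```

The cache works within a single bean instance, but every rebuilt bean starts with an empty field, so the number of services grows with the number of refreshes.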

Fix PR

public class NacosConfigManager implements ApplicationContextAware {

    private ConfigService configService;

    public ConfigService getConfigService() {
        // Replaced by the PR: return configService;
        return ServiceHolder.getInstance().getService();
    }

    @Override
    public void setApplicationContext(ApplicationContext applicationContext)
            throws BeansException {
        NacosConfigProperties properties = applicationContext
                .getBean(NacosConfigProperties.class);
        // Replaced by the PR: configService = properties.configServiceInstance();
        ServiceHolder holder = ServiceHolder.getInstance();
        // Create the ConfigService only on the first refresh; later refreshes reuse it
        if (!holder.alreadyInit) {
            ServiceHolder.getInstance().setService(properties.configServiceInstance());
        }
    }

    // Static holder that outlives the beans recreated on context refresh
    static class ServiceHolder {
        private ConfigService service = null;

        private boolean alreadyInit = false;

        private static final ServiceHolder holder = new ServiceHolder();

        ServiceHolder() {
        }

        static ServiceHolder getInstance() {
            return holder;
        }

        void setService(ConfigService service) {
            alreadyInit = true;
            this.service = service;
        }

        ConfigService getService() {
            return service;
        }
    }

}
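The holder in the PR checks alreadyInit and calls setService as two separate steps. As a variation only, and not the code that was merged, the same create-once idea could be sketched with a synchronized method so that the check and the assignment happen atomically. `CreateOnceHolder` and `CreateOnceHolderDemo` are hypothetical names introduced for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical create-once holder; a variation on the idea, not the merged PR code. */
final class CreateOnceHolder<T> {

    private T value;

    /**
     * Returns the cached value, invoking the factory only while nothing is cached.
     * synchronized makes the null check and the assignment one atomic step.
     */
    synchronized T getOrCreate(Supplier<T> factory) {
        if (value == null) {
            value = factory.get();
        }
        return value;
    }
}

final class CreateOnceHolderDemo {

    static final AtomicInteger FACTORY_CALLS = new AtomicInteger();

    /** Simulates several context refreshes that all go through one shared holder. */
    static int factoryCallsAfter(int refreshes) {
        CreateOnceHolder<Object> holder = new CreateOnceHolder<>();
        int before = FACTORY_CALLS.get();
        for (int i = 0; i < refreshes; i++) {
            holder.getOrCreate(() -> {
                FACTORY_CALLS.incrementAndGet();
                return new Object();
            });
        }
        return FACTORY_CALLS.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("factory calls: " + factoryCallsAfter(5));
    }
}
```

In this sketch a factory that returns null leaves the holder empty, so the next call tries again. That choice matters here because configServiceInstance() returns null when creation fails.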