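Problem description
Users reported that when using nacos, Java threads kept being created as the application ran, reaching two to three thousand, which drove the CPU load to 100%.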
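Troubleshooting process
Looking at the nacos source shows that the threads being created in large numbers are ultimately tied to the NacosConfigService object: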
    public NacosConfigService(Properties properties) throws NacosException {
        String encodeTmp = properties.getProperty(PropertyKeyConst.ENCODE);
        if (StringUtils.isBlank(encodeTmp)) {
            encode = Constants.ENCODE;
        } else {
            encode = encodeTmp.trim();
        }
        initNamespace(properties);
        agent = new MetricsHttpAgent(new ServerHttpAgent(properties));
        agent.start();
        // Every NacosConfigService instance creates its own ClientWorker.
        worker = new ClientWorker(agent, configFilterChainManager, properties);
    }
More precisely, the object that actually owns the threads is ClientWorker:
    @SuppressWarnings("PMD.ThreadPoolCreationRule")
    public ClientWorker(final HttpAgent agent,
            final ConfigFilterChainManager configFilterChainManager,
            final Properties properties) {
        this.agent = agent;
        this.configFilterChainManager = configFilterChainManager;

        init(properties);

        // Single-threaded scheduler that drives the periodic config check.
        executor = Executors.newScheduledThreadPool(1, new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r);
                t.setName("com.alibaba.nacos.client.Worker." + agent.getName());
                t.setDaemon(true);
                return t;
            }
        });

        // Long-polling pool, sized to the number of available processors.
        executorService = Executors.newScheduledThreadPool(
                Runtime.getRuntime().availableProcessors(), new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r);
                t.setName("com.alibaba.nacos.client.Worker.longPolling." + agent.getName());
                t.setDaemon(true);
                return t;
            }
        });

        executor.scheduleWithFixedDelay(new Runnable() {
            @Override
            public void run() {
                try {
                    checkConfigInfo();
                } catch (Throwable e) {
                    LOGGER.error("[" + agent.getName() + "] [sub-check] rotate check error", e);
                }
            }
        }, 1L, 10L, TimeUnit.MILLISECONDS);
    }
My initial suspicion was therefore that users were creating a large number of NacosConfigService objects.
User jmap data

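The jmap data shows that the JVM currently holds more than two thousand ClientWorker objects, and the nacos source analysis above shows that every ClientWorker object holds its own thread pools.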
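To make the link between object count and thread count concrete, here is a minimal sketch in plain Java. DemoWorker and ThreadGrowthDemo are hypothetical names invented for this illustration, not nacos classes; DemoWorker only imitates the pattern of creating a scheduled pool in the constructor and naming its threads with a fixed prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an object that owns its own thread pool,
// in the way ClientWorker does; this is not nacos code.
final class DemoWorker {
    static final String THREAD_PREFIX = "demo.Worker.";

    private final ScheduledExecutorService executor;

    DemoWorker(String name) {
        executor = Executors.newScheduledThreadPool(1, r -> {
            Thread t = new Thread(r);
            t.setName(THREAD_PREFIX + name);
            t.setDaemon(true);
            return t;
        });
        // Scheduling a periodic task makes the pool start its core thread.
        executor.scheduleWithFixedDelay(() -> { }, 1L, 10L, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        executor.shutdownNow();
    }
}

final class ThreadGrowthDemo {
    // Counts live threads whose name starts with the given prefix.
    static long countThreads(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(prefix))
                .count();
    }

    // Creates `instances` workers, counts their live threads, then cleans up.
    static long threadsAfterCreating(int instances) {
        List<DemoWorker> workers = new ArrayList<>();
        try {
            for (int i = 0; i < instances; i++) {
                workers.add(new DemoWorker("w" + i));
            }
            return countThreads(DemoWorker.THREAD_PREFIX);
        } finally {
            workers.forEach(DemoWorker::shutdown);
        }
    }

    public static void main(String[] args) {
        System.out.println("worker threads: " + threadsAfterCreating(50));
    }
}
```

Counting threads by name prefix gives a quick in-process cross-check of what a heap histogram suggests: if the number of threads with the worker prefix tracks the number of worker objects, the objects are the source of the threads.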
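User self-check
First, users were asked to check whether they had created many NacosConfigService instances themselves. Some users confirmed that their own misuse had indeed created a large number of NacosConfigService objects.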
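Spring-Cloud-Alibaba component check
Other users, however, said they only depended on the spring-cloud-alibaba-nacos component and never touched NacosConfigService themselves, yet still saw large numbers of threads being created. Feedback from one user's own investigation eventually pinned down a bug in spring-cloud-alibaba-nacos: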
    @ConfigurationProperties(NacosConfigProperties.PREFIX)
    public class NacosConfigProperties {
        ...
        // Cached per bean instance, so it is lost when the bean is recreated.
        private ConfigService configService;
        ...
        @Deprecated
        public ConfigService configServiceInstance() {

            if (null != configService) {
                return configService;
            }

            Properties properties = new Properties();
            ...

            try {
                configService = NacosFactory.createConfigService(properties);
                return configService;
            } catch (Exception e) {
                log.error("create config service error!properties={},e=,", this, e);
                return null;
            }
        }
    }
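This configuration class caches a ConfigService instance, with the intent of maintaining it as a singleton. In practice, however, the NacosConfigProperties bean is recreated every time the spring-cloud context is refreshed. So whenever a configuration changes, the chain is: Context refresh -> NacosConfigProperties recreated -> ConfigService cache lost -> ConfigService recreated.
Because of this causal chain, the ConfigService cache stops working once the Context is refreshed. Each recreated ConfigService brings a new ClientWorker with its own thread pools, so the thread count grows with every refresh.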
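The failure mode can be reproduced without Spring. The sketch below is only an illustration: CachedService, RefreshingProperties and RefreshDemo are hypothetical names, and a "refresh" is simulated by constructing a new bean object, on the assumption that this is what the context refresh amounts to for the cached field.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive client such as ConfigService.
final class CachedService {
    static final AtomicInteger CREATED = new AtomicInteger();

    CachedService() {
        CREATED.incrementAndGet();
    }
}

// Hypothetical stand-in for a properties bean that caches the service in a field.
final class RefreshingProperties {
    private CachedService service;

    CachedService serviceInstance() {
        if (service != null) {
            return service; // cache hit, but only within this bean instance
        }
        service = new CachedService();
        return service;
    }
}

final class RefreshDemo {
    // Simulates `refreshes` context refreshes, each of which rebuilds the bean,
    // and returns how many services were created in total.
    static int servicesCreatedAfter(int refreshes) {
        int before = CachedService.CREATED.get();
        for (int i = 0; i < refreshes; i++) {
            RefreshingProperties bean = new RefreshingProperties(); // empty cache
            bean.serviceInstance();
            bean.serviceInstance(); // second call hits the per-instance cache
        }
        return CachedService.CREATED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("services created: " + servicesCreatedAfter(5));
    }
}
```

The instance-level cache works within one bean, since the second call in each iteration reuses the service, but it cannot survive the bean being replaced, so one service is created per simulated refresh.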
Fix PR
    import com.alibaba.nacos.api.config.ConfigService;
    import org.springframework.beans.BeansException;
    import org.springframework.context.ApplicationContext;
    import org.springframework.context.ApplicationContextAware;

    public class NacosConfigManager implements ApplicationContextAware {

        public ConfigService getConfigService() {
            // Read from the process-wide holder instead of a per-bean field.
            return ServiceHolder.getInstance().getService();
        }

        @Override
        public void setApplicationContext(ApplicationContext applicationContext)
                throws BeansException {
            NacosConfigProperties properties = applicationContext
                    .getBean(NacosConfigProperties.class);
            ServiceHolder holder = ServiceHolder.getInstance();
            // Create the ConfigService only once, even if the context is refreshed.
            if (!holder.alreadyInit) {
                holder.setService(properties.configServiceInstance());
            }
        }

        static class ServiceHolder {
            private ConfigService service = null;

            private boolean alreadyInit = false;

            // Static singleton: survives bean recreation on context refresh.
            private static final ServiceHolder holder = new ServiceHolder();

            ServiceHolder() {
            }

            static ServiceHolder getInstance() {
                return holder;
            }

            void setService(ConfigService service) {
                alreadyInit = true;
                this.service = service;
            }

            ConfigService getService() {
                return service;
            }
        }

    }
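The idea behind the fix, keeping the instance in static state so that it outlives any single bean, can be sketched in isolation. The code below is a variant and not the PR's code: SharedService, SharedServiceHolder and HolderDemo are hypothetical names, and the holder uses a synchronized create-if-absent method instead of the alreadyInit flag, on the assumption that refreshes might run concurrently.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for an expensive client such as ConfigService.
final class SharedService {
    static final AtomicInteger CREATED = new AtomicInteger();

    SharedService() {
        CREATED.incrementAndGet();
    }
}

// Static holder: its state lives as long as the class, not as long as a bean.
final class SharedServiceHolder {
    private static SharedService service;

    private SharedServiceHolder() {
    }

    // Creates the service on the first call and returns the same instance
    // afterwards; synchronized so two callers cannot both create one.
    static synchronized SharedService getOrCreate(Supplier<SharedService> factory) {
        if (service == null) {
            service = factory.get();
        }
        return service;
    }
}

final class HolderDemo {
    // Simulates `refreshes` context refreshes that all go through the holder
    // and returns how many services this call caused to be created.
    static int servicesCreatedAfter(int refreshes) {
        int before = SharedService.CREATED.get();
        for (int i = 0; i < refreshes; i++) {
            SharedServiceHolder.getOrCreate(SharedService::new);
        }
        return SharedService.CREATED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("services created: " + servicesCreatedAfter(5));
    }
}
```

However many refreshes are simulated, the factory runs at most once per class loader. The trade-off of any static holder is that the instance is shared by everything loaded through the same class loader and is never released on its own.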