Deploying Nacos in Cluster Mode on a Single Node with Docker

Background: see the earlier post on the gateway service reporting CORS errors caused by an unstable Nacos

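To fix that problem without provisioning several virtual machines dedicated to Nacos, each tying up a whole machine's resources, I chose to run three Nacos instances as a cluster on a single host with Docker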

The figure below is the cluster deployment architecture diagram from the official Nacos documentation

(Figure: deploy-dns-vip-mode)

This post uses the domain name + SLB mode: the Nacos domain name is bound on NGINX, and NGINX's load balancing forwards traffic to the Nacos nodes

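Building the Docker image

Because this setup needs a fair amount of customization and a specific Nacos version, I build the image myself

Before building, make sure the build directory is laid out as follows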

nacos_2.0.3_cluster/
├── Dockerfile
├── bin/
│   └── docker-startup.sh
└── conf/
    └── application.properties
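
A quick pre-flight check can catch a misplaced file before docker build does. The sketch below is only an illustration: the function name check_build_context is made up here, and the file list is taken from the tree above.

```shell
# check_build_context DIR
# Verifies that DIR contains the three files the Dockerfile references.
# Prints each missing path to stderr and returns non-zero if any is absent.
check_build_context() {
    dir=$1
    missing=0
    for f in Dockerfile bin/docker-startup.sh conf/application.properties; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Usage, assuming you run it from the parent directory: check_build_context nacos_2.0.3_cluster && echo "build context looks complete"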

Contents of the Dockerfile

FROM alpine:latest

LABEL maintainer="dauyohaN <song.daoyuan@qq.com>"
LABEL version="1.0"
LABEL document="https://songdaoyuan.github.io/2024/12/5/Docker-based-single-node-deployment-of-cluster-mode-Nacos-services"

# Nacos version and hot-fix flag
ARG NACOS_VERSION=2.0.3
ARG HOT_FIX_FLAG=""

# Install dependencies
RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.aliyun.com/g' /etc/apk/repositories \
&& apk update && apk add --no-cache \
openjdk8-jre-base \
curl \
iputils \
ncurses \
vim \
libcurl \
bash \
libc6-compat

# Set environment variables
ENV MODE="cluster" \
PREFER_HOST_MODE="ip"\
BASE_DIR="/home/nacos" \
CLASSPATH=".:/home/nacos/conf:$CLASSPATH" \
CLUSTER_CONF="/home/nacos/conf/cluster.conf" \
FUNCTION_MODE="all" \
JAVA_HOME="/usr/lib/jvm/java-1.8-openjdk" \
NACOS_USER="nacos" \
JAVA="/usr/lib/jvm/java-1.8-openjdk/bin/java" \
JVM_XMS="1g" \
JVM_XMX="1g" \
JVM_XMN="512m" \
JVM_MS="128m" \
JVM_MMS="320m" \
NACOS_DEBUG="n" \
TOMCAT_ACCESSLOG_ENABLED="false" \
TIME_ZONE="Asia/Shanghai"

WORKDIR $BASE_DIR

# Download and install Nacos
# Note: curl goes through a proxy here; if you do not need one, remove --proxy http://172.28.12.252:7897
RUN set -x \
&& curl --proxy http://172.28.12.252:7897 -SL "https://github.com/alibaba/nacos/releases/download/${NACOS_VERSION}${HOT_FIX_FLAG}/nacos-server-${NACOS_VERSION}.tar.gz" -o nacos-server.tar.gz \
&& tar -xzvf nacos-server.tar.gz -C /home \
&& rm -rf nacos-server.tar.gz /home/nacos/bin/* /home/nacos/conf/*.properties /home/nacos/conf/*.example /home/nacos/conf/nacos-mysql.sql \
&& ln -snf /usr/share/zoneinfo/$TIME_ZONE /etc/localtime && echo $TIME_ZONE > /etc/timezone

ADD bin/docker-startup.sh bin/docker-startup.sh
ADD conf/application.properties conf/application.properties

# Set up the startup log directory
RUN mkdir -p logs \
&& touch logs/start.out \
&& ln -sf /dev/stdout logs/start.out \
&& ln -sf /dev/stderr logs/start.out \
&& chmod +x bin/docker-startup.sh

EXPOSE 8848
EXPOSE 9848

ENTRYPOINT ["sh","bin/docker-startup.sh"]

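The base image is the latest alpine and Nacos is 2.0.3; a few essential tools are installed alongside it

Alpine uses musl libc rather than glibc, which causes errors at startup when the RocksDB native libraries are loaded. Installing the glibc compatibility package (libc6-compat) fixes this

If you hit network problems while pulling images, see "Configuring a proxy for Docker - the three kinds of Docker network proxy configuration explained"

The curl download of the Nacos release from GitHub inside the image goes through a proxy; if you do not need one, remove --proxy http://172.28.12.252:7897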
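
If the hard-coded proxy gets in the way, one option is to make it conditional. The following is a sketch, not part of the original image: DOWNLOAD_PROXY and both function names are hypothetical, while the URL shape is copied from the Dockerfile's curl line.

```shell
# nacos_release_url VERSION [HOT_FIX_FLAG]
# Builds the GitHub release URL in the same shape the Dockerfile uses:
# the tag is VERSION followed by HOT_FIX_FLAG, the file name uses VERSION only.
nacos_release_url() {
    version=$1
    hot_fix=${2:-}
    printf '%s\n' "https://github.com/alibaba/nacos/releases/download/${version}${hot_fix}/nacos-server-${version}.tar.gz"
}

# download_nacos VERSION [HOT_FIX_FLAG]
# Downloads the tarball to nacos-server.tar.gz. Passes --proxy to curl only
# when DOWNLOAD_PROXY (a hypothetical variable) is set and non-empty.
# -f makes curl fail on HTTP errors instead of saving an error page.
download_nacos() {
    url=$(nacos_release_url "$1" "${2:-}")
    if [ -n "${DOWNLOAD_PROXY:-}" ]; then
        curl --proxy "$DOWNLOAD_PROXY" -fSL "$url" -o nacos-server.tar.gz
    else
        curl -fSL "$url" -o nacos-server.tar.gz
    fi
}
```

With a proxy: DOWNLOAD_PROXY=http://172.28.12.252:7897 download_nacos 2.0.3. Without one, just download_nacos 2.0.3.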

Contents of docker-startup.sh

set -x
export CUSTOM_SEARCH_NAMES="application"
export CUSTOM_SEARCH_LOCATIONS=file:${BASE_DIR}/conf/
export MEMBER_LIST="$MEMBER_LIST"
PLUGINS_DIR="/home/nacos/plugins/peer-finder"
function print_servers() {
if [[ ! -d "${PLUGINS_DIR}" ]]; then
echo "" >"$CLUSTER_CONF"
for server in ${NACOS_SERVERS}; do
echo "$server" >>"$CLUSTER_CONF"
done
else
bash $PLUGINS_DIR/plugin.sh
sleep 30
fi
}

function join_if_exist() {
if [ -n "$2" ]; then
echo "$1$2"
else
echo ""
fi
}

#===========================================================================================
# JVM Configuration
#===========================================================================================
Xms=$(join_if_exist "-Xms" ${JVM_XMS})
Xmx=$(join_if_exist "-Xmx" ${JVM_XMX})
Xmn=$(join_if_exist "-Xmn" ${JVM_XMN})
XX_MS=$(join_if_exist "-XX:MetaspaceSize=" ${JVM_MS})
XX_MMS=$(join_if_exist "-XX:MaxMetaspaceSize=" ${JVM_MMS})

JAVA_OPT="${JAVA_OPT} -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:SurvivorRatio=8 "
if [[ "${MODE}" == "standalone" ]]; then
JAVA_OPT="${JAVA_OPT} $Xms $Xmx $Xmn"
JAVA_OPT="${JAVA_OPT} -Dnacos.standalone=true"
else
if [[ "${EMBEDDED_STORAGE}" == "embedded" ]]; then
JAVA_OPT="${JAVA_OPT} -DembeddedStorage=true"
fi
JAVA_OPT="${JAVA_OPT} -server $Xms $Xmx $Xmn $XX_MS $XX_MMS"
if [[ "${NACOS_DEBUG}" == "y" ]]; then
JAVA_OPT="${JAVA_OPT} -Xdebug -Xrunjdwp:transport=dt_socket,address=9555,server=y,suspend=n"
fi
JAVA_OPT="${JAVA_OPT} -XX:-OmitStackTraceInFastThrow -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=${BASE_DIR}/logs/java_heapdump.hprof"
JAVA_OPT="${JAVA_OPT} -XX:-UseLargePages"
print_servers
fi

#===========================================================================================
# Setting system properties
#===========================================================================================
# set mode that Nacos Server function of split
if [[ "${FUNCTION_MODE}" == "config" ]]; then
JAVA_OPT="${JAVA_OPT} -Dnacos.functionMode=config"
elif [[ "${FUNCTION_MODE}" == "naming" ]]; then
JAVA_OPT="${JAVA_OPT} -Dnacos.functionMode=naming"
fi
# set nacos server ip
if [[ ! -z "${NACOS_SERVER_IP}" ]]; then
JAVA_OPT="${JAVA_OPT} -Dnacos.server.ip=${NACOS_SERVER_IP}"
fi

if [[ ! -z "${USE_ONLY_SITE_INTERFACES}" ]]; then
JAVA_OPT="${JAVA_OPT} -Dnacos.inetutils.use-only-site-local-interfaces=${USE_ONLY_SITE_INTERFACES}"
fi

if [[ ! -z "${PREFERRED_NETWORKS}" ]]; then
JAVA_OPT="${JAVA_OPT} -Dnacos.inetutils.preferred-networks=${PREFERRED_NETWORKS}"
fi

if [[ ! -z "${IGNORED_INTERFACES}" ]]; then
JAVA_OPT="${JAVA_OPT} -Dnacos.inetutils.ignored-interfaces=${IGNORED_INTERFACES}"
fi

### If turn on auth system:
if [[ ! -z "${NACOS_AUTH_ENABLE}" ]]; then
JAVA_OPT="${JAVA_OPT} -Dnacos.core.auth.enabled=${NACOS_AUTH_ENABLE}"
fi

if [[ "${PREFER_HOST_MODE}" == "hostname" ]]; then
JAVA_OPT="${JAVA_OPT} -Dnacos.preferHostnameOverIp=true"
fi
JAVA_OPT="${JAVA_OPT} -Dnacos.member.list=${MEMBER_LIST}"

JAVA_MAJOR_VERSION=$($JAVA -version 2>&1 | sed -E -n 's/.* version "([0-9]*).*$/\1/p')
if [[ "$JAVA_MAJOR_VERSION" -ge "9" ]]; then
JAVA_OPT="${JAVA_OPT} -Xlog:gc*:file=${BASE_DIR}/logs/nacos_gc.log:time,tags:filecount=10,filesize=102400"
else
JAVA_OPT_EXT_FIX="-Djava.ext.dirs=${JAVA_HOME}/jre/lib/ext:${JAVA_HOME}/lib/ext"
JAVA_OPT="${JAVA_OPT} -Xloggc:${BASE_DIR}/logs/nacos_gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M"
fi

JAVA_OPT="${JAVA_OPT} -Dloader.path=${BASE_DIR}/plugins,${BASE_DIR}/plugins/health,${BASE_DIR}/plugins/cmdb,${BASE_DIR}/plugins/selector"
JAVA_OPT="${JAVA_OPT} -Dnacos.home=${BASE_DIR}"
JAVA_OPT="${JAVA_OPT} -jar ${BASE_DIR}/target/nacos-server.jar"
JAVA_OPT="${JAVA_OPT} ${JAVA_OPT_EXT}"
JAVA_OPT="${JAVA_OPT} --spring.config.additional-location=${CUSTOM_SEARCH_LOCATIONS}"
JAVA_OPT="${JAVA_OPT} --spring.config.name=${CUSTOM_SEARCH_NAMES}"
JAVA_OPT="${JAVA_OPT} --logging.config=${BASE_DIR}/conf/nacos-logback.xml"
JAVA_OPT="${JAVA_OPT} --server.max-http-header-size=524288"

echo "Nacos is starting, you can docker logs your container"
exec $JAVA ${JAVA_OPT}
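
For orientation, here is what the two helpers at the top of the script do in cluster mode when the peer-finder plugin directory is absent. This is a standalone re-creation for experimenting outside the container: write_cluster_conf is a name I made up for the first branch of print_servers, and it takes the file and the server list as arguments instead of reading CLUSTER_CONF and NACOS_SERVERS.

```shell
# write_cluster_conf FILE SERVERS
# Mirrors the non-plugin branch of print_servers: reset FILE to a single
# empty line, then append each whitespace-separated entry of SERVERS.
write_cluster_conf() {
    conf=$1
    servers=$2
    echo "" >"$conf"
    # $servers is intentionally unquoted so the shell splits it on whitespace.
    for server in $servers; do
        echo "$server" >>"$conf"
    done
}

# join_if_exist PREFIX VALUE
# Same idea as the helper in docker-startup.sh: emit PREFIX+VALUE when VALUE
# is non-empty, otherwise an empty line, so unset JVM_* variables simply
# drop out of JAVA_OPT.
join_if_exist() {
    if [ -n "${2:-}" ]; then
        printf '%s\n' "$1$2"
    else
        echo ""
    fi
}
```

For example, write_cluster_conf /tmp/cluster.conf "nacos-node1:8848 nacos-node2:8848 nacos-node3:8848" produces one node per line, and join_if_exist "-Xms" "1g" yields -Xms1g.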

Contents of application.properties

# spring
server.servlet.contextPath=${SERVER_SERVLET_CONTEXTPATH:/nacos}
server.contextPath=/nacos
server.port=${NACOS_APPLICATION_PORT:8848}
server.tomcat.accesslog.max-days=30
server.tomcat.accesslog.pattern=%h %l %u %t "%r" %s %b %D %{User-Agent}i %{Request-Source}i
server.tomcat.accesslog.enabled=${TOMCAT_ACCESSLOG_ENABLED:false}
server.error.include-message=ALWAYS
# default current work dir
server.tomcat.basedir=file:.
#*************** Config Module Related Configurations ***************#
### Deprecated configuration property, it is recommended to use `spring.sql.init.platform` replaced.
#spring.datasource.platform=${SPRING_DATASOURCE_PLATFORM:}
spring.sql.init.platform=${SPRING_DATASOURCE_PLATFORM:}
nacos.cmdb.dumpTaskInterval=3600
nacos.cmdb.eventTaskInterval=10
nacos.cmdb.labelTaskInterval=300
nacos.cmdb.loadDataAtStart=false
db.num=${MYSQL_DATABASE_NUM:1}
db.url.0=jdbc:mysql://${MYSQL_SERVICE_HOST}:${MYSQL_SERVICE_PORT:3306}/${MYSQL_SERVICE_DB_NAME}?${MYSQL_SERVICE_DB_PARAM:characterEncoding=utf8&connectTimeout=1000&socketTimeout=3000&autoReconnect=true&useSSL=false}
db.user.0=${MYSQL_SERVICE_USER}
db.password.0=${MYSQL_SERVICE_PASSWORD}
## DB connection pool settings
db.pool.config.connectionTimeout=${DB_POOL_CONNECTION_TIMEOUT:30000}
db.pool.config.validationTimeout=10000
db.pool.config.maximumPoolSize=20
db.pool.config.minimumIdle=2
### The auth system to use, currently only 'nacos' and 'ldap' is supported:
nacos.core.auth.system.type=${NACOS_AUTH_SYSTEM_TYPE:nacos}
### worked when nacos.core.auth.system.type=nacos
### The token expiration in seconds:
nacos.core.auth.plugin.nacos.token.expire.seconds=${NACOS_AUTH_TOKEN_EXPIRE_SECONDS:18000}
### The default token:
nacos.core.auth.plugin.nacos.token.secret.key=${NACOS_AUTH_TOKEN:}
### Turn on/off caching of auth information. By turning on this switch, the update of auth information would have a 15 seconds delay.
nacos.core.auth.caching.enabled=${NACOS_AUTH_CACHE_ENABLE:false}
nacos.core.auth.enable.userAgentAuthWhite=${NACOS_AUTH_USER_AGENT_AUTH_WHITE_ENABLE:false}
nacos.core.auth.server.identity.key=${NACOS_AUTH_IDENTITY_KEY:}
nacos.core.auth.server.identity.value=${NACOS_AUTH_IDENTITY_VALUE:}
## spring security config
### turn off security
nacos.security.ignore.urls=${NACOS_SECURITY_IGNORE_URLS:/,/error,/**/*.css,/**/*.js,/**/*.html,/**/*.map,/**/*.svg,/**/*.png,/**/*.ico,/console-fe/public/**,/v1/auth/**,/v1/console/health/**,/actuator/**,/v1/console/server/**}
# metrics for elastic search
management.metrics.export.elastic.enabled=false
management.metrics.export.influx.enabled=false
nacos.naming.distro.taskDispatchThreadCount=10
nacos.naming.distro.taskDispatchPeriod=200
nacos.naming.distro.batchSyncKeyCount=1000
nacos.naming.distro.initDataRatio=0.9
nacos.naming.distro.syncRetryDelay=5000
nacos.naming.data.warmup=true
nacos.console.ui.enabled=true
nacos.core.param.check.enabled=true
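
Most values in this file are ${NAME:default} placeholders filled from environment variables. To preview what db.url.0 would look like for a given environment, the defaulting can be approximated in the shell. This is only a sketch: preview_db_url is a made-up helper, and it imitates the placeholder defaults with ${NAME-default}; it does not reproduce how the application reacts to a placeholder that has no value and no default.

```shell
# preview_db_url
# Prints an approximation of the resolved db.url.0, reading the same
# variable names the properties file uses. ${NAME-default} applies the
# default only when NAME is unset.
preview_db_url() {
    default_params='characterEncoding=utf8&connectTimeout=1000&socketTimeout=3000&autoReconnect=true&useSSL=false'
    printf '%s\n' "jdbc:mysql://${MYSQL_SERVICE_HOST-}:${MYSQL_SERVICE_PORT-3306}/${MYSQL_SERVICE_DB_NAME-}?${MYSQL_SERVICE_DB_PARAM-$default_params}"
}
```

Running it with only MYSQL_SERVICE_HOST and MYSQL_SERVICE_DB_NAME set shows the port and the connection parameters falling back to the defaults written in the file.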

Building the image

Once the directory layout is correct and the files are in place, build the image

docker build --no-cache -t nacos_cluster:v2.0.3 .

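Managing the services with docker-compose

docker-compose.yml: configuration and notes

To start several nodes quickly and manage their configuration in one place, the containers are orchestrated with docker-compose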

version: "3.8"
services:
  nacos-node1:
    hostname: nacos-node1
    container_name: nacos-node1
    image: nacos_cluster:v2.0.3
    volumes:
      - ./cluster-logs/nacos-node1:/home/nacos/logs
    ports:
      - "7848:7848"
      - "8848:8848"
      - "9868:9848"
      - "9850:9849"
    env_file:
      - nacos-cluster.env
    restart: always
  nacos-node2:
    hostname: nacos-node2
    image: nacos_cluster:v2.0.3
    container_name: nacos-node2
    volumes:
      - ./cluster-logs/nacos-node2:/home/nacos/logs
    ports:
      - "7849:7848"
      - "8849:8848"
      - "9869:9848"
      - "9851:9849"
    env_file:
      - nacos-cluster.env
    restart: always
  nacos-node3:
    hostname: nacos-node3
    image: nacos_cluster:v2.0.3
    container_name: nacos-node3
    volumes:
      - ./cluster-logs/nacos-node3:/home/nacos/logs
    ports:
      - "7850:7848"
      - "8850:8848"
      - "9870:9848"
      - "9852:9849"
    env_file:
      - nacos-cluster.env
    restart: always

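As the compose file shows, no MySQL instance is started or initialized together with the Nacos cluster. The goal here is to migrate from a single-instance Nacos deployment to a cluster, so the existing MySQL database is reused. For a fresh deployment, follow the official Nacos Docker deployment documentation and add the steps that start and initialize the MySQL database

The compose file maps quite a few ports per container. The official deployment architecture notes explain why:

Nacos 2.x adds gRPC communication, so two more ports are needed
The new ports are derived automatically by applying fixed offsets to the configured main port (server.port, default 8848). The ports and offsets are listed below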

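Port    Offset from main port    Description
9848    +1000                    Client gRPC port: clients connect and send requests to the server on it
9849    +1001                    Server gRPC port: used for server-to-server traffic such as synchronization
7848    -1000                    JRaft port: handles Raft requests between servers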
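
The offsets are plain arithmetic on the main port, which a few lines of shell can make concrete. The helper name nacos_ports and its output format are my own; only the three offsets come from the table.

```shell
# nacos_ports MAIN_PORT
# Prints the three derived ports for a given server.port, using the offsets
# from the table: +1000 (client gRPC), +1001 (server gRPC), -1000 (JRaft).
nacos_ports() {
    main=$1
    echo "client_grpc=$((main + 1000)) server_grpc=$((main + 1001)) jraft=$((main - 1000))"
}
```

Calling nacos_ports 8848 gives the 9848, 9849 and 7848 that appear on the container side of every mapping in the compose file; only the host-side ports differ between nodes. As far as I know, Nacos 2.x clients apply the same +1000 offset to whatever main port they connect to, so if 2.x clients reach a node directly through a host port, it is worth checking that host port + 1000 actually leads to that node's 9848.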

nacos-cluster.env: configuration and notes

The nacos-cluster.env file referenced by the compose file contains

PREFER_HOST_MODE=hostname
NACOS_SERVERS=nacos-node1:8848 nacos-node2:8848 nacos-node3:8848
SPRING_DATASOURCE_PLATFORM=mysql
MYSQL_SERVICE_HOST=192.168.6.192
MYSQL_SERVICE_DB_NAME=nacos
MYSQL_SERVICE_PORT=3306
MYSQL_SERVICE_USER=root
MYSQL_SERVICE_PASSWORD=123456
MYSQL_SERVICE_DB_PARAM=characterEncoding=utf8&connectTimeout=1000&socketTimeout=3000&autoReconnect=true&useSSL=false&allowPublicKeyRetrieval=true
NACOS_AUTH_IDENTITY_KEY=2222
NACOS_AUTH_IDENTITY_VALUE=2xxx
NACOS_AUTH_TOKEN=SecretKey012345678901234567890123456789012345678901234567890123456789

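This file controls three core settings of Nacos: node discovery, the data source, and authentication. Replace the values with the ones that match your environment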
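
Since a missing or empty entry here tends to surface only as a startup failure, a small check before starting the containers can save a round trip. This is a sketch under assumptions: check_env_file is a made-up name, and the list of keys it treats as required simply mirrors the variables set in this post.

```shell
# check_env_file FILE
# Reports every key from the list below that is missing or empty in FILE.
# Returns 0 when all keys are present with a value, 1 otherwise.
check_env_file() {
    file=$1
    status=0
    for key in NACOS_SERVERS SPRING_DATASOURCE_PLATFORM MYSQL_SERVICE_HOST \
               MYSQL_SERVICE_DB_NAME MYSQL_SERVICE_USER MYSQL_SERVICE_PASSWORD \
               NACOS_AUTH_IDENTITY_KEY NACOS_AUTH_IDENTITY_VALUE NACOS_AUTH_TOKEN; do
        # A key counts as present only if "KEY=" is followed by at least one character.
        if ! grep -q "^${key}=." "$file"; then
            echo "missing or empty: $key" >&2
            status=1
        fi
    done
    return "$status"
}
```

Usage: check_env_file nacos-cluster.env && echo "env file looks complete"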

Starting the cluster

With everything prepared, check that the docker-compose directory looks like this

nacos_2.0.3_cluster/
├── docker-compose.yml
└── nacos-cluster.env

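Then check that the host names in the compose file match those in the env file and that the MySQL data source settings are correct. Once everything checks out, start the cluster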

docker-compose up -d
docker logs nacos-node1
---
2024-12-05 14:55:23,831 INFO Nacos started successfully in cluster mode. use external storage
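
Instead of reading the logs of each node by hand, the readiness endpoint can be polled until every node answers. A minimal sketch, assuming the three HTTP ports are published on the local host as in the compose file; wait_until and nacos_ready are names I made up, and the path is the console health endpoint that also appears in nacos.security.ignore.urls.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted, sleeping
# DELAY_SECONDS after each failed try. Returns 0 on success, 1 on timeout.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# nacos_ready HOST_PORT
# Probes the readiness endpoint of one node through its published HTTP port.
# -f makes curl return non-zero on HTTP error statuses.
nacos_ready() {
    curl -fsS "http://127.0.0.1:$1/nacos/v1/console/health/readiness"
}
```

For example: for port in 8848 8849 8850; do wait_until 30 2 nacos_ready "$port" || echo "node on port $port is not ready"; done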

NGINX load balancing configuration

Load balancing with NGINX mostly comes down to configuring the upstream block

upstream test-nacos.com {
    # server 192.168.6.192:8848;
    server 192.168.6.141:8848 max_fails=3 fail_timeout=5s;
    server 192.168.6.141:8849 max_fails=3 fail_timeout=5s;
    server 192.168.6.141:8850 max_fails=3 fail_timeout=5s;
}
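
The server lines have to stay in step with the host ports published in docker-compose.yml, so generating the block from a port list is one way to avoid drift. This is a sketch only: render_upstream is a hypothetical helper, and the max_fails and fail_timeout values are copied from the block above.

```shell
# render_upstream NAME HOST PORT [PORT...]
# Prints an NGINX upstream block with one server line per port.
render_upstream() {
    name=$1
    host=$2
    shift 2
    echo "upstream $name {"
    for port in "$@"; do
        echo "    server $host:$port max_fails=3 fail_timeout=5s;"
    done
    echo "}"
}
```

For example, render_upstream test-nacos.com 192.168.6.141 8848 8849 8850 prints the three active server lines shown above; redirect the output into whichever file your NGINX setup includes, then validate it with nginx -t before reloading.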

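After configuring, restart NGINX, open the domain in a browser, and check the state of the Nacos nodes under Cluster Management > Node List

(Figure: nacos-node-status)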