Introduction to SkyWalking

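SkyWalking is an open-source distributed application performance monitoring (APM) system. Agents collect call traces and performance metrics from applications (for example CPU, memory, database, and HTTP) and send them to OAP (the core analysis platform) for processing and storage (Elasticsearch, MySQL, and others are supported). The UI front end then visualizes service topology, metrics, and alerts, helping developers and operators quickly locate performance bottlenecks and anomalies in microservice, containerized, or cloud-native applications.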

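SkyWalking vs. Zabbix comparison

| Feature | SkyWalking | Zabbix |
| --- | --- | --- |
| Type | Distributed application performance monitoring (APM) | Infrastructure monitoring and alerting system |
| Main monitoring targets | Microservice applications, containers, cloud-native environments | Servers, network devices, application services |
| Core features | Distributed tracing, metrics monitoring, service topology, alerting | Host monitoring, network monitoring, service availability, alerting |
| Data collection | Agent embedded in the application; collects call traces and metrics automatically | SNMP, Agent, IPMI, JMX, and other protocols |
| Data storage | Elasticsearch, MySQL, H2, TiDB, etc. | Backing database (MySQL, PostgreSQL); extensible |
| Visualization | Web UI showing call topology, metric trends, and traces | Web UI showing host/service status, graphs, and reports |
| Alerting | Based on metrics and abnormal events; customizable rules | Threshold- and event-triggered alerts via email, SMS, or Webhook |
| Use cases | Microservice performance analysis, call-chain tracing, monitoring of containerized applications | Infrastructure monitoring, network device monitoring, server health checks |
| Deployment complexity | Medium; requires Agent + OAP + storage | Low to medium; install Zabbix Server + Agent |
| Strengths | End-to-end tracing; pinpoints service performance bottlenecks | Broad coverage of IT infrastructure; mature and stable |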

GitHub: apache/skywalking (APM, Application Performance Monitoring System)

Official site: Apache SkyWalking

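Deploying SkyWalking (without persistence)

1. Prepare the images

Docker and docker-compose must be installed.

You also need a docker-compose.yml. In an offline environment, additionally prepare three image archives, one for each of the following images: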

apache/skywalking-ui:9.5.0
apache/skywalking-oap-server:9.5.0
docker.elastic.co/elasticsearch/elasticsearch:7.17.10

In an offline environment, change to the resources directory and load the images:

cd /data/resources
docker load -i skywalking-ui9.5.0.tar
docker load -i skywalking-oap-server9.5.0.tar
docker load -i elasticsearch7.17.10.tar
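
The load step can also be scripted, which helps when you want to confirm that all archives are present before loading anything. The following is only a minimal sketch: the archive names and the /data/resources directory are taken from this section and may differ on your machine, and the helper names `build_load_commands`, `missing_archives`, and `main` are hypothetical.

```python
import subprocess
from pathlib import Path

# Archive names as used in this section; adjust them to match your files.
ARCHIVES = [
    "skywalking-ui9.5.0.tar",
    "skywalking-oap-server9.5.0.tar",
    "elasticsearch7.17.10.tar",
]


def build_load_commands(resource_dir, archives=ARCHIVES):
    """Return one `docker load -i <file>` argument list per archive."""
    base = Path(resource_dir)
    return [["docker", "load", "-i", str(base / name)] for name in archives]


def missing_archives(resource_dir, archives=ARCHIVES):
    """Return the archive names that are not present in resource_dir."""
    base = Path(resource_dir)
    return [name for name in archives if not (base / name).is_file()]


def main(resource_dir="/data/resources"):
    """Check that every archive exists, then load them one by one."""
    missing = missing_archives(resource_dir)
    if missing:
        raise SystemExit(f"missing archives: {missing}")
    for cmd in build_load_commands(resource_dir):
        subprocess.run(cmd, check=True)  # raises if a docker load fails
```

Calling `main()` on the offline host performs the same three loads as the commands above, but stops early if an archive is missing.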

2. Start the containers with docker-compose

Run the following in the directory that contains docker-compose.yml:

docker-compose up -d

SkyWalking is now deployed. No data volumes have been mounted for persistence yet.

Contents of docker-compose.yml:

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.10
    container_name: skywalking-es
    restart: always
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms1g -Xmx1g
      - TZ=Asia/Shanghai
      - ELASTIC_USERNAME=elastic
      - ELASTIC_PASSWORD=123456
      - xpack.security.enabled=false
    ports:
      - "9200:9200"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    networks:
      - skywalking-net

  oap:
    image: apache/skywalking-oap-server:9.5.0
    container_name: skywalking-oap
    depends_on:
      - elasticsearch
    restart: always
    environment:
      - TZ=Asia/Shanghai
      - SW_STORAGE=elasticsearch
      - SW_STORAGE_ES_CLUSTER_NODES=elasticsearch:9200
      - SW_ES_USER=elastic
      # No quotes here: in list syntax they would become part of the value
      - SW_ES_PASSWORD=123456
      - SW_ES_SSL_VERIFY_CERTIFICATE=false
    ports:
      - "11800:11800"
      - "12800:12800"
    healthcheck:
      test: ["CMD-SHELL", "curl -s -X POST http://localhost:12800/graphql -H 'Content-Type: application/json' -d '{\"query\": \"query { version }\"}' | grep -q '\"data\"'"]
      interval: 30s
      timeout: 120s
      retries: 3
      start_period: 90s
    networks:
      - skywalking-net

  ui:
    image: apache/skywalking-ui:9.5.0
    container_name: skywalking-ui
    depends_on:
      oap:
        condition: service_healthy
    restart: always
    environment:
      - TZ=Asia/Shanghai
      - SW_OAP_ADDRESS=http://oap:12800
    ports:
      - "8080:8080"
    networks:
      - skywalking-net

networks:
  skywalking-net:
    driver: bridge
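
The oap healthcheck in this compose file posts a GraphQL `version` query to port 12800 and looks for `"data"` in the reply. The same probe can be run from the host to see whether OAP is ready before opening the UI. This is a rough sketch using only the standard library; the default address `http://127.0.0.1:12800` assumes you run it on the Docker host, and the function names are hypothetical.

```python
import json
import urllib.request

# The same GraphQL query that the oap healthcheck sends.
VERSION_QUERY = {"query": "query { version }"}


def build_request(base_url):
    """Build the POST request for the OAP GraphQL endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/graphql",
        data=json.dumps(VERSION_QUERY).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_healthy(response_body):
    """Roughly mirror the healthcheck's grep: a JSON object with a "data" key."""
    try:
        parsed = json.loads(response_body)
    except ValueError:
        return False
    return isinstance(parsed, dict) and "data" in parsed


def check_oap(base_url="http://127.0.0.1:12800", timeout=5):
    """Return True when OAP answers the version query, False otherwise."""
    try:
        with urllib.request.urlopen(build_request(base_url), timeout=timeout) as resp:
            return is_healthy(resp.read().decode("utf-8"))
    except OSError:  # connection refused, timeout, or HTTP error
        return False
```

`check_oap()` returning `False` right after `docker-compose up -d` is expected for a while, since the compose file itself allows a 90 second start period for OAP.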

3. Check skywalking-ui

Open skywalking-ui in a browser using the host IP address and port, for example: http://192.168.72.128:8080/general

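4. Check skywalking-oap

You can write a Python script that sends simulated data. If the data shows up in the UI, all three components are working and the deployment has succeeded.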

Python script:

# -*- coding: utf-8 -*-
import os
import time
import requests

# Set the agent options before importing skywalking so they are picked up
os.environ["SW_AGENT_NAME"] = "python-test-service-2"
os.environ["SW_AGENT_COLLECTOR_BACKEND_SERVICES"] = "127.0.0.1:11800"

from skywalking import agent


# Start the agent
agent.start()


# Business function that generates a trace
def do_work():
    try:
        # requests is patched automatically by the agent, which produces a span
        r = requests.get("http://httpbin.org/get")
        print(f"[Trace] Request status: {r.status_code}")
    except Exception as e:
        print("[Trace] Request failed:", e)


if __name__ == "__main__":
    print("SkyWalking Python agent test started")
    while True:
        do_work()
        time.sleep(5)

Deploying SkyWalking (with persistence)

1. Prepare the directories

[root@localhost skywalking]# tree -L 2
.
├── docker-compose.yml
├── es
│   ├── config
│   ├── data
│   └── logs
├── oap
│   ├── config
│   ├── ext-config
│   ├── ext-libs
│   └── logs
[root@localhost skywalking]# pwd
/data/soft/skywalking/
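
The directory layout shown in the tree listing can be created in one pass. The sketch below assumes the sub-directory names from that listing; `create_layout` is a hypothetical helper, and the root path is whatever directory holds your docker-compose.yml. Depending on the host, es/data and es/logs may also need to be writable by the user that Elasticsearch runs as inside the container, which this sketch does not handle.

```python
from pathlib import Path

# Sub-directories from the tree listing, relative to the skywalking root.
SUBDIRS = [
    "es/config", "es/data", "es/logs",
    "oap/config", "oap/ext-config", "oap/ext-libs", "oap/logs",
]


def create_layout(root):
    """Create the es/ and oap/ sub-directories under root and return their paths."""
    created = []
    for sub in SUBDIRS:
        path = Path(root) / sub
        path.mkdir(parents=True, exist_ok=True)  # safe to run more than once
        created.append(path)
    return created
```

For the setup in this article the call would be `create_layout("/data/soft/skywalking")`.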

2. Prepare the configuration files

es

In the es config directory, create elasticsearch.yml with the following content:

cluster.name: "docker-cluster"
network.host: 0.0.0.0

oap

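Put the default configuration into the oap config directory. To obtain it, start a throwaway container and use docker cp to copy the configuration files out of it.

  1. Start the container
docker run -d --name skywalking-temp apache/skywalking-oap-server:9.5.0
  2. Copy the contents of /skywalking/config from the container into ./config (run this from the oap directory; the trailing /. copies the directory contents even if ./config already exists)
docker cp skywalking-temp:/skywalking/config/. ./config
  3. Remove the temporary container when you are done
docker rm -f skywalking-temp
  4. Edit log4j2.xml in the config directory so that logs are persisted
vim ./config/log4j2.xml

Overwrite the original file with the following content: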

<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Licensed to the Apache Software Foundation (ASF) under one or more
~ contributor license agreements. See the NOTICE file distributed with
~ this work for additional information regarding copyright ownership.
~ The ASF licenses this file to You under the Apache License, Version 2.0
~ (the "License"); you may not use this file except in compliance with
~ the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
~
-->

<Configuration status="INFO">
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout>
                <LevelPatternSelector defaultPattern="%d %c %L [%t] %-5p %x - %m%n">
                    <PatternMatch key="ERROR" pattern="%d %c %L [%t] %-5p %x - [%swversion] %m%n" />
                </LevelPatternSelector>
            </PatternLayout>
        </Console>
        <File name="FileLog" fileName="/skywalking/logs/oap-server.log">
            <PatternLayout pattern="[%d] [%t] %-5level %logger{36} - %msg%n"/>
        </File>
    </Appenders>
    <Loggers>
        <Root level="INFO">
            <AppenderRef ref="Console"/>
            <AppenderRef ref="FileLog"/>
        </Root>
    </Loggers>
</Configuration>
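
Because a malformed log4j2.xml can keep OAP from logging where you expect, it may be worth checking the edited file before starting the containers. The sketch below uses the standard library XML parser to confirm the file is well-formed, list each `<File>` appender with its `fileName`, and show which appenders the `<Root>` logger references. The function names are hypothetical, and the check covers only the element layout used in the file above.

```python
import xml.etree.ElementTree as ET


def file_appender_paths(xml_text):
    """Return {appender name: fileName} for every <File> appender."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is not well-formed
    return {
        node.get("name"): node.get("fileName")
        for node in root.findall("./Appenders/File")
    }


def root_appender_refs(xml_text):
    """Return the appender names referenced by the <Root> logger."""
    root = ET.fromstring(xml_text)
    return [ref.get("ref") for ref in root.findall("./Loggers/Root/AppenderRef")]
```

For the configuration above, the file appender should point into /skywalking/logs, which is the directory that gets mounted to the host for log persistence, and the root logger should reference both `Console` and `FileLog`.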

3. Prepare docker-compose.yml

The docker-compose.yml is as follows:

version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.10
    container_name: skywalking-es
    restart: always
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms1g -Xmx1g
      - TZ=Asia/Shanghai
      - ELASTIC_USERNAME=elastic
      - ELASTIC_PASSWORD=123456
      - xpack.security.enabled=false
    ports:
      - "9200:9200"
      - "9300:9300"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ./es/data:/usr/share/elasticsearch/data
      - ./es/logs:/usr/share/elasticsearch/logs
      - ./es/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9200/_cluster/health"]
      interval: 30s
      timeout: 120s
      retries: 5
      start_period: 40s
    networks:
      - skywalking-net

  oap:
    image: apache/skywalking-oap-server:9.5.0
    container_name: skywalking-oap
    depends_on:
      elasticsearch:
        condition: service_healthy
    restart: always
    environment:
      - TZ=Asia/Shanghai
      - SW_STORAGE=elasticsearch
      - SW_STORAGE_ES_CLUSTER_NODES=elasticsearch:9200
      - SW_ES_USER=elastic
      # No quotes here: in list syntax they would become part of the value
      - SW_ES_PASSWORD=123456
      - SW_ES_SSL_VERIFY_CERTIFICATE=false
      - INGEST_GEOIP_DOWNLOADER_ENABLED=false
    ports:
      - "11800:11800"
      - "12800:12800"
    volumes:
      - ./oap/logs:/skywalking/logs
      - ./oap/config:/skywalking/config
      - ./oap/ext-config:/skywalking/ext-config
      - ./oap/ext-libs:/skywalking/ext-libs
    healthcheck:
      test: ["CMD-SHELL", "curl -s -X POST http://localhost:12800/graphql -H 'Content-Type: application/json' -d '{\"query\": \"query { version }\"}' | grep -q '\"data\"'"]
      interval: 30s
      timeout: 120s
      retries: 3
      start_period: 90s
    networks:
      - skywalking-net

  ui:
    image: apache/skywalking-ui:9.5.0
    container_name: skywalking-ui
    depends_on:
      oap:
        condition: service_healthy
    restart: always
    environment:
      - TZ=Asia/Shanghai
      - SW_OAP_ADDRESS=http://oap:12800
    ports:
      - "8080:8080"
    networks:
      - skywalking-net

networks:
  skywalking-net:
    driver: bridge

4. Start the containers

docker-compose up -d

5. Check the containers

1. Check container status

docker ps
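
If you would rather not read the `docker ps` table by eye, the check can be automated. This is a sketch under a few assumptions: it relies on the container names from the compose file (skywalking-es, skywalking-oap, skywalking-ui), on the `--format` option of `docker ps` with the `.Names` and `.Status` fields, and on running containers reporting a status that starts with "Up". The function names are hypothetical.

```python
import subprocess

# Container names as set by container_name in docker-compose.yml.
EXPECTED = ("skywalking-es", "skywalking-oap", "skywalking-ui")

# Go template that prints "name<TAB>status" for each container.
PS_FORMAT = "{{.Names}}\t{{.Status}}"


def parse_ps(output):
    """Parse tab-separated name/status lines into a {name: status} dict."""
    status = {}
    for line in output.splitlines():
        if "\t" in line:
            name, state = line.split("\t", 1)
            status[name.strip()] = state.strip()
    return status


def not_running(status, expected=EXPECTED):
    """Return the expected container names whose status does not start with 'Up'."""
    return [name for name in expected if not status.get(name, "").startswith("Up")]


def check_containers():
    """Run docker ps and return the expected containers that are not up."""
    result = subprocess.run(
        ["docker", "ps", "--format", PS_FORMAT],
        capture_output=True, text=True, check=True,
    )
    return not_running(parse_ps(result.stdout))
```

An empty list from `check_containers()` means all three containers are listed as up.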

2. Check persistence

ls ./es/data
ls ./es/logs
ls ./oap/logs
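
The three `ls` commands above confirm persistence by showing that the mounted directories contain files. The same check as a small script, assuming it is run against the directory that holds docker-compose.yml (`empty_dirs` is a hypothetical helper):

```python
from pathlib import Path

# Host directories that should receive data once the containers are running.
PERSISTED = ("es/data", "es/logs", "oap/logs")


def empty_dirs(root, subdirs=PERSISTED):
    """Return the sub-directories that are missing or contain no entries."""
    result = []
    for sub in subdirs:
        path = Path(root) / sub
        if not path.is_dir() or not any(path.iterdir()):
            result.append(sub)
    return result
```

`empty_dirs(".")` returning an empty list corresponds to all three `ls` commands printing at least one entry.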

6. Check skywalking-ui

Open skywalking-ui in a browser using the host IP address and port, for example: http://192.168.72.128:8080/general

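7. Check skywalking-oap

You can write a Python script that sends simulated data. If the data shows up in the UI, all three components are working and the deployment has succeeded.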

Python script:

# -*- coding: utf-8 -*-
import os
import time
import requests

# Set the agent options before importing skywalking so they are picked up
os.environ["SW_AGENT_NAME"] = "python-test-service-2"
os.environ["SW_AGENT_COLLECTOR_BACKEND_SERVICES"] = "127.0.0.1:11800"

from skywalking import agent


# Start the agent
agent.start()


# Business function that generates a trace
def do_work():
    try:
        # requests is patched automatically by the agent, which produces a span
        r = requests.get("http://httpbin.org/get")
        print(f"[Trace] Request status: {r.status_code}")
    except Exception as e:
        print("[Trace] Request failed:", e)


if __name__ == "__main__":
    print("SkyWalking Python agent test started")
    while True:
        do_work()
        time.sleep(5)