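Introduction to SkyWalking
SkyWalking is an open-source distributed application performance monitoring (APM) system. Agents collect call traces and performance metrics (such as CPU, memory, database, and HTTP) from applications and send them to the OAP (Observability Analysis Platform), which processes and stores the data (Elasticsearch, MySQL, and other backends are supported). The UI then visualizes the service topology, metrics, and alerts, helping developers and operators quickly locate performance bottlenecks and faults in microservice, containerized, or cloud-native applications.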
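SkyWalking vs. Zabbix comparison

| Feature | SkyWalking | Zabbix |
|---|---|---|
| Type | Distributed application performance monitoring (APM) | Infrastructure monitoring and alerting system |
| Main monitoring targets | Microservice applications, containers, cloud-native environments | Servers, network devices, application services |
| Core features | Distributed tracing, metrics monitoring, service topology, alerting | Host monitoring, network monitoring, service availability, alerting |
| Data collection | Agent embedded in the application; collects traces and metrics automatically | SNMP, Agent, IPMI, JMX, and other protocols |
| Data storage | Elasticsearch, MySQL, H2, TiDB, etc. | Database backend (MySQL, PostgreSQL), extensible |
| Visualization | Web UI showing call topology graphs, metric trends, and traces | Web UI showing host/service status, graphs, and reports |
| Alerting | Based on metrics and abnormal events; rules are customizable | Threshold- and event-triggered alerts via email, SMS, or Webhook |
| Use cases | Microservice performance analysis, call-chain tracing, containerized application monitoring | Infrastructure monitoring, network device monitoring, server health checks |
| Deployment complexity | Medium; requires Agent + OAP + storage | Medium to low; install Zabbix Server + Agent |
| Strengths | End-to-end tracing; pinpoints service performance bottlenecks | Broad coverage of IT infrastructure; mature and stable |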
GitHub repository: apache/skywalking (APM, Application Performance Monitoring System)
Official website: Apache SkyWalking
Deploying SkyWalking (without persistence)
1. Prepare the images
A working docker and docker-compose environment is required, along with a docker-compose.yml file. In an offline environment, three image files must also be prepared:

    apache/skywalking-ui:9.5.0
    apache/skywalking-oap-server:9.5.0
    docker.elastic.co/elasticsearch/elasticsearch:7.17.10

In the offline environment, change to the resource folder and load the images:

    cd /data/resources
    docker load -i skywalking-ui9.5.0.tar
    docker load -i skywalking-oap-server9.5.0.tar
    docker load -i elasticsearch7.17.10.tar
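The load step can also be scripted so that archives missing from the resource folder are noticed before anything is loaded. The following is a minimal sketch, not part of the original procedure: the helper names `tar_name` and `load_commands` are made up for illustration, and the archive naming rule (image name plus tag plus `.tar`) is an assumption inferred from the file names in the commands above.

```python
from pathlib import Path

# The three images listed above.
IMAGES = [
    "apache/skywalking-ui:9.5.0",
    "apache/skywalking-oap-server:9.5.0",
    "docker.elastic.co/elasticsearch/elasticsearch:7.17.10",
]


def tar_name(image: str) -> str:
    """Assumed naming rule: last path component, colon removed, plus .tar."""
    return image.rsplit("/", 1)[-1].replace(":", "") + ".tar"


def load_commands(resource_dir: str) -> list:
    """Return one `docker load` argument list per archive found in resource_dir."""
    commands = []
    for image in IMAGES:
        archive = Path(resource_dir) / tar_name(image)
        if archive.is_file():  # archives that are absent are skipped
            commands.append(["docker", "load", "-i", str(archive)])
    return commands
```

Each returned list can be handed to `subprocess.run`; comparing the number of commands with the number of images shows whether an archive is missing.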
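2. Start the containers with docker-compose
Run docker-compose in the directory that contains docker-compose.yml (typically docker-compose up -d).
SkyWalking is now deployed. Data volumes are not yet mounted, so nothing is persisted.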
Source of docker-compose.yml:

    services:
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.17.10
        container_name: skywalking-es
        restart: always
        environment:
          - discovery.type=single-node
          - ES_JAVA_OPTS=-Xms1g -Xmx1g
          - TZ=Asia/Shanghai
          - ELASTIC_USERNAME=elastic
          - ELASTIC_PASSWORD=123456
          # Security is disabled, so the credentials above are not enforced.
          - xpack.security.enabled=false
        ports:
          - "9200:9200"
        ulimits:
          memlock:
            soft: -1
            hard: -1
        networks:
          - skywalking-net

      oap:
        image: apache/skywalking-oap-server:9.5.0
        container_name: skywalking-oap
        depends_on:
          - elasticsearch
        restart: always
        environment:
          - TZ=Asia/Shanghai
          - SW_STORAGE=elasticsearch
          - SW_STORAGE_ES_CLUSTER_NODES=elasticsearch:9200
          - SW_ES_USER=elastic
          # No quotes here: in list form they would become part of the value.
          - SW_ES_PASSWORD=123456
          - SW_ES_SSL_VERIFY_CERTIFICATE=false
        ports:
          - "11800:11800"   # gRPC, used by agents
          - "12800:12800"   # HTTP, used by the UI and GraphQL queries
        healthcheck:
          test: ["CMD-SHELL", "curl -s -X POST http://localhost:12800/graphql -H 'Content-Type: application/json' -d '{\"query\": \"query { version }\"}' | grep -q '\"data\"'"]
          interval: 30s
          timeout: 120s
          retries: 3
          start_period: 90s
        networks:
          - skywalking-net

      ui:
        image: apache/skywalking-ui:9.5.0
        container_name: skywalking-ui
        depends_on:
          oap:
            condition: service_healthy
        restart: always
        environment:
          - TZ=Asia/Shanghai
          - SW_OAP_ADDRESS=http://oap:12800
        ports:
          - "8080:8080"
        networks:
          - skywalking-net

    networks:
      skywalking-net:
        driver: bridge
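Once the containers are up, a plain TCP connect is a quick way to see whether the published ports answer. The following is a minimal sketch rather than part of the original setup: the names `PORTS`, `port_open`, and `check_ports` are illustrative, and the default host assumes the script runs on the machine that runs Docker.

```python
import socket

# Ports published by the compose file above; the labels are only for display.
PORTS = {
    "elasticsearch-http": 9200,
    "oap-grpc": 11800,
    "oap-http": 12800,
    "ui": 8080,
}


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host: str = "127.0.0.1") -> dict:
    """Map each label in PORTS to whether its port currently accepts connections."""
    return {name: port_open(host, port) for name, port in PORTS.items()}


if __name__ == "__main__":
    for name, ok in check_ports().items():
        print(f"{name}: {'open' if ok else 'closed'}")
```

An open port only shows that something is listening. OAP can take a while to finish starting, which is why the compose file gives its healthcheck a 90s start period.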
3. Check skywalking-ui
Open skywalking-ui in a browser using the host IP address and port: http://192.168.72.128:8080/general
4. Check skywalking-oap
A Python script can simulate sending data. If the data shows up in the UI, all three components are working and the deployment has succeeded.
Python script:

    import os
    import time

    import requests

    # Requires the apache-skywalking and requests packages.
    # The agent settings are placed in the environment before the agent is imported.
    os.environ["SW_AGENT_NAME"] = "python-test-service-2"
    os.environ["SW_AGENT_COLLECTOR_BACKEND_SERVICES"] = "127.0.0.1:11800"

    from skywalking import agent

    agent.start()


    def do_work():
        """Make one outbound HTTP request so the agent has something to trace."""
        try:
            r = requests.get("http://httpbin.org/get", timeout=10)
            print(f"[Trace] Request status: {r.status_code}")
        except Exception as e:
            print("[Trace] Request failed:", e)


    if __name__ == "__main__":
        print("SkyWalking Python agent test started")
        while True:
            do_work()
            time.sleep(5)
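OAP can also be probed directly, without an agent, by sending the same GraphQL query that the compose healthcheck uses. The sketch below is one possible way to do that with the standard library only. The function names are illustrative, and the default base URL assumes the script runs on the Docker host with port 12800 published as in the compose file.

```python
import json
import urllib.request

# Same query the compose healthcheck sends to OAP.
VERSION_QUERY = {"query": "query { version }"}


def build_request(base_url: str) -> urllib.request.Request:
    """Build the POST request for the OAP GraphQL endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/graphql",
        data=json.dumps(VERSION_QUERY).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def has_data(body: str) -> bool:
    """Stricter variant of the healthcheck's grep: valid JSON with a top-level "data" key."""
    try:
        return "data" in json.loads(body)
    except (ValueError, TypeError):
        return False


def oap_alive(base_url: str = "http://127.0.0.1:12800", timeout: float = 5.0) -> bool:
    """Return True if OAP answers the version query."""
    try:
        with urllib.request.urlopen(build_request(base_url), timeout=timeout) as resp:
            return has_data(resp.read().decode("utf-8"))
    except OSError:  # covers connection errors and HTTP error responses
        return False
```

Calling `oap_alive()` returns `False` while OAP is still starting, so it can be polled in a loop before the agent script is launched.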
Deploying SkyWalking (with persistence)
1. Prepare the folders

    [root@localhost skywalking]# tree -L 2
    .
    ├── docker-compose.yml
    ├── es
    │   ├── config
    │   ├── data
    │   └── logs
    ├── oap
    │   ├── config
    │   ├── ext-config
    │   ├── ext-libs
    │   └── logs
    [root@localhost skywalking]# pwd
    /data/soft/skywalking/
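The folder layout shown in the tree output can be created in one pass. This is a small sketch under the assumption that the project root is passed in by the caller; the names `LAYOUT` and `create_layout` are made up for illustration.

```python
from pathlib import Path

# Sub-directories from the tree output, relative to the project root.
LAYOUT = [
    "es/config",
    "es/data",
    "es/logs",
    "oap/config",
    "oap/ext-config",
    "oap/ext-libs",
    "oap/logs",
]


def create_layout(root: str) -> list:
    """Create every directory in LAYOUT under root and return the paths."""
    created = []
    for rel in LAYOUT:
        path = Path(root) / rel
        path.mkdir(parents=True, exist_ok=True)  # safe to run more than once
        created.append(path)
    return created
```

For the path shown above this would be `create_layout("/data/soft/skywalking")`. One thing to keep in mind: the Elasticsearch image runs as uid 1000, so the bind-mounted es/data and es/logs directories need to be writable by that user.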
2. Prepare the configuration files
es
In the es config folder, create elasticsearch.yml with the following content:

    cluster.name: "docker-cluster"
    network.host: 0.0.0.0
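oap
Place the default configuration in the oap config folder. This requires starting a throwaway container and copying the configuration files out of it with docker cp.
Start the container:

    docker run -d --name skywalking-temp apache/skywalking-oap-server:9.5.0

Copy the contents of /skywalking/config from the container into ./config. Run this from the oap directory. The trailing /. copies the directory contents, so nothing ends up nested in config/config when ./config already exists:

    docker cp skywalking-temp:/skywalking/config/. ./config

Remove the temporary container when done:

    docker rm -f skywalking-temp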
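A quick check after the copy can catch a common slip: when the destination directory already exists, `docker cp` with a plain directory source places that directory inside the destination, which produces config/config. The sketch below looks for that and for two files the OAP config directory is expected to contain. The function name and the `EXPECTED` tuple are illustrative choices, not part of SkyWalking.

```python
from pathlib import Path

# Files expected in the OAP config directory; log4j2.xml is edited in the next step.
EXPECTED = ("application.yml", "log4j2.xml")


def check_config_dir(config_dir: str) -> list:
    """Return a list of problems found in the copied config directory."""
    root = Path(config_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    # A nested config/ directory means the copy landed one level too deep.
    if (root / "config").is_dir():
        problems.append(f"nested directory found: {root / 'config'}")
    for name in EXPECTED:
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")
    return problems
```

An empty list from `check_config_dir("./oap/config")` suggests the directory is ready to be bind-mounted.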
Modify log4j2.xml to persist the logs
Overwrite the original file with the following:

    <?xml version="1.0" encoding="UTF-8"?>
    <!--
      ~ Licensed to the Apache Software Foundation (ASF) under one or more
      ~ contributor license agreements.  See the NOTICE file distributed with
      ~ this work for additional information regarding copyright ownership.
      ~ The ASF licenses this file to You under the Apache License, Version 2.0
      ~ (the "License"); you may not use this file except in compliance with
      ~ the License.  You may obtain a copy of the License at
      ~
      ~     http://www.apache.org/licenses/LICENSE-2.0
      ~
      ~ Unless required by applicable law or agreed to in writing, software
      ~ distributed under the License is distributed on an "AS IS" BASIS,
      ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      ~ See the License for the specific language governing permissions and
      ~ limitations under the License.
      ~
      -->

    <Configuration status="INFO">
        <Appenders>
            <Console name="Console" target="SYSTEM_OUT">
                <PatternLayout>
                    <LevelPatternSelector defaultPattern="%d %c %L [%t] %-5p %x - %m%n">
                        <PatternMatch key="ERROR" pattern="%d %c %L [%t] %-5p %x - [%swversion] %m%n" />
                    </LevelPatternSelector>
                </PatternLayout>
            </Console>

            <!-- Added file appender: writes to the directory that is bind-mounted to the host -->
            <File name="FileLog" fileName="/skywalking/logs/oap-server.log">
                <PatternLayout pattern="[%d] [%t] %-5level %logger{36} - %msg%n" />
            </File>
        </Appenders>
        <Loggers>
            <Root level="INFO">
                <AppenderRef ref="Console" />
                <AppenderRef ref="FileLog" />
            </Root>
        </Loggers>
    </Configuration>
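Because the whole file is overwritten by hand, it can be worth confirming that every appender referenced under Loggers is actually defined under Appenders, and that the file appender writes to the directory that will be mounted. The following sketch does this with the standard XML parser. It is only a structural check written for this article and does not validate the file against Log4j2 itself; both function names are made up.

```python
import xml.etree.ElementTree as ET


def undefined_appender_refs(xml_text: str) -> list:
    """Return AppenderRef names that have no matching appender definition."""
    root = ET.fromstring(xml_text)
    appenders = root.find("Appenders")
    defined = set()
    if appenders is not None:
        # Each direct child of <Appenders> is one appender definition.
        defined = {a.get("name") for a in appenders}
    refs = [r.get("ref") for r in root.iter("AppenderRef")]
    return [ref for ref in refs if ref not in defined]


def file_appender_paths(xml_text: str) -> list:
    """Return the fileName attribute of every <File> appender."""
    root = ET.fromstring(xml_text)
    return [f.get("fileName") for f in root.iter("File")]
```

For the configuration above, the first function should return an empty list and the second should return the path under /skywalking/logs, which is the container directory mapped to ./oap/logs.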
3. Prepare docker-compose.yml
The docker-compose.yml is as follows:

    version: '3.8'

    services:
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.17.10
        container_name: skywalking-es
        restart: always
        environment:
          - discovery.type=single-node
          - ES_JAVA_OPTS=-Xms1g -Xmx1g
          - TZ=Asia/Shanghai
          - ELASTIC_USERNAME=elastic
          - ELASTIC_PASSWORD=123456
          # Security is disabled, so the credentials above are not enforced.
          - xpack.security.enabled=false
        ports:
          - "9200:9200"
          - "9300:9300"
        ulimits:
          memlock:
            soft: -1
            hard: -1
        volumes:
          # Persist data, logs, and the custom configuration on the host.
          - ./es/data:/usr/share/elasticsearch/data
          - ./es/logs:/usr/share/elasticsearch/logs
          - ./es/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost:9200/_cluster/health"]
          interval: 30s
          timeout: 120s
          retries: 5
          start_period: 40s
        networks:
          - skywalking-net

      oap:
        image: apache/skywalking-oap-server:9.5.0
        container_name: skywalking-oap
        depends_on:
          elasticsearch:
            condition: service_healthy
        restart: always
        environment:
          - TZ=Asia/Shanghai
          - SW_STORAGE=elasticsearch
          - SW_STORAGE_ES_CLUSTER_NODES=elasticsearch:9200
          - SW_ES_USER=elastic
          # No quotes here: in list form they would become part of the value.
          - SW_ES_PASSWORD=123456
          - SW_ES_SSL_VERIFY_CERTIFICATE=false
          - INGEST_GEOIP_DOWNLOADER_ENABLED=false
        ports:
          - "11800:11800"
          - "12800:12800"
        volumes:
          - ./oap/logs:/skywalking/logs
          - ./oap/config:/skywalking/config
          - ./oap/ext-config:/skywalking/ext-config
          - ./oap/ext-libs:/skywalking/ext-libs
        healthcheck:
          test: ["CMD-SHELL", "curl -s -X POST http://localhost:12800/graphql -H 'Content-Type: application/json' -d '{\"query\": \"query { version }\"}' | grep -q '\"data\"'"]
          interval: 30s
          timeout: 120s
          retries: 3
          start_period: 90s
        networks:
          - skywalking-net

      ui:
        image: apache/skywalking-ui:9.5.0
        container_name: skywalking-ui
        depends_on:
          oap:
            condition: service_healthy
        restart: always
        environment:
          - TZ=Asia/Shanghai
          - SW_OAP_ADDRESS=http://oap:12800
        ports:
          - "8080:8080"
        networks:
          - skywalking-net

    networks:
      skywalking-net:
        driver: bridge
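4. Start the containers
Run docker-compose in the directory that contains docker-compose.yml (typically docker-compose up -d).
5. Check the containers
1. Check the containers
Confirm that all three containers are running (for example with docker ps).
2. Check persistence

    ls ./es/data
    ls ./es/logs
    ls ./oap/logs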
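The three `ls` checks can be folded into a small script that reports which mounted directories are still empty. This is a sketch for convenience only; the names `PERSISTED` and `empty_mounts` are made up, and the root is assumed to be the directory that holds docker-compose.yml.

```python
from pathlib import Path

# Host directories bind-mounted by the compose file, relative to the project root.
PERSISTED = ("es/data", "es/logs", "oap/logs")


def empty_mounts(root: str = ".") -> list:
    """Return the persisted directories that are missing or still empty."""
    empty = []
    for rel in PERSISTED:
        path = Path(root) / rel
        if not path.is_dir() or not any(path.iterdir()):
            empty.append(rel)
    return empty
```

An empty result means each directory contains at least one entry. A directory that stays empty after the containers have been running for a while points to a mount path or permission problem.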
6. Check skywalking-ui
Open skywalking-ui in a browser using the host IP address and port: http://192.168.72.128:8080/general
7. Check skywalking-oap
A Python script can simulate sending data. If the data shows up in the UI, all three components are working and the deployment has succeeded.
Python script:

    import os
    import time

    import requests

    # Requires the apache-skywalking and requests packages.
    # The agent settings are placed in the environment before the agent is imported.
    os.environ["SW_AGENT_NAME"] = "python-test-service-2"
    os.environ["SW_AGENT_COLLECTOR_BACKEND_SERVICES"] = "127.0.0.1:11800"

    from skywalking import agent

    agent.start()


    def do_work():
        """Make one outbound HTTP request so the agent has something to trace."""
        try:
            r = requests.get("http://httpbin.org/get", timeout=10)
            print(f"[Trace] Request status: {r.status_code}")
        except Exception as e:
            print("[Trace] Request failed:", e)


    if __name__ == "__main__":
        print("SkyWalking Python agent test started")
        while True:
            do_work()
            time.sleep(5)