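Topic
198 - Performance Optimization Platform in Practice
Course objectives
- Understand how to design and implement a performance optimization platform
- Become familiar with system performance monitoring techniques and tools
- Implement an application performance analysis system
- Learn database optimization techniques
- Learn network optimization techniques
- Build the frontend and backend of the performance optimization platform
1. System performance monitoring
1.1 Monitoring techniques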
1.1.1 Prometheus + Grafana
bash
# Install Prometheus
sudo apt install prometheus
# Install Grafana (assumes the Grafana APT repository has already been added)
sudo apt install grafana
# Start the services and enable them at boot
sudo systemctl start prometheus
sudo systemctl start grafana-server
sudo systemctl enable prometheus
sudo systemctl enable grafana-server
# Open Grafana
# http://localhost:3000
1.1.2 Node Exporter
bash
# Install Node Exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
tar xvfz node_exporter-1.3.1.linux-amd64.tar.gz
cd node_exporter-1.3.1.linux-amd64
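sudo cp node_exporter /usr/local/bin/
# Create the system user that the unit file runs as
sudo useradd --system --no-create-home --shell /usr/sbin/nologin node_exporter
# Create the systemd service
sudo nano /etc/systemd/system/node_exporter.service
ini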
[Unit]
Description=Node Exporter
After=network.target
[Service]
Type=simple
User=node_exporter
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
bash
# Start the service
sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter
1.1.3 Blackbox Exporter
bash
# Install Blackbox Exporter
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.19.0/blackbox_exporter-0.19.0.linux-amd64.tar.gz
tar xvfz blackbox_exporter-0.19.0.linux-amd64.tar.gz
cd blackbox_exporter-0.19.0.linux-amd64
sudo cp blackbox_exporter /usr/local/bin/
# Create the system user and the configuration directory
sudo useradd --system --no-create-home --shell /usr/sbin/nologin blackbox_exporter
sudo mkdir -p /etc/blackbox_exporter
# Create the configuration file
sudo nano /etc/blackbox_exporter/config.yml
yaml
modules:
  http_2xx:
    prober: http
    timeout: 5s
    http:
      valid_status_codes: [200]
      method: GET
  icmp:
    prober: icmp
    timeout: 5s
    icmp:
      # Documented values are ip4 and ip6
      preferred_ip_protocol: ip4
bash
# Create the systemd service
sudo nano /etc/systemd/system/blackbox_exporter.service
ini
[Unit]
Description=Blackbox Exporter
After=network.target
[Service]
Type=simple
User=blackbox_exporter
ExecStart=/usr/local/bin/blackbox_exporter --config.file=/etc/blackbox_exporter/config.yml
[Install]
WantedBy=multi-user.target
bash
# Start the service
sudo systemctl daemon-reload
sudo systemctl start blackbox_exporter
sudo systemctl enable blackbox_exporter
1.2 Monitoring system design
1.2.1 Architecture
- Frontend: Vue.js + Element Plus + ECharts
- Backend: Python + FastAPI
- Monitoring: Prometheus + Grafana
- Storage: InfluxDB
- Alerting: Alertmanager
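To make the alerting part of this architecture more concrete, here is a minimal sketch that compares the latest value of each series in a Prometheus instant-query response with a configured threshold. `MonitorRule` and `evaluate` are hypothetical names introduced for illustration, and the sample values are made up; the payload only mirrors the JSON shape of the Prometheus HTTP API, where each `value` is a `[timestamp, "value"]` pair. In a real deployment, Prometheus alerting rules together with Alertmanager would normally do this job.

```python
from dataclasses import dataclass


@dataclass
class MonitorRule:
    """Hypothetical monitor configuration: fire when the metric exceeds the threshold."""
    name: str
    threshold: float
    severity: str = "warning"


def evaluate(rule: MonitorRule, payload: dict) -> list[dict]:
    """Return one alert dict per series whose latest value is above rule.threshold."""
    alerts = []
    for series in payload.get("data", {}).get("result", []):
        # Prometheus encodes sample values as strings, e.g. [1700000000, "93.5"]
        value = float(series["value"][1])
        if value > rule.threshold:
            alerts.append({
                "monitor_name": rule.name,
                "instance": series.get("metric", {}).get("instance", "unknown"),
                "value": value,
                "threshold": rule.threshold,
                "status": rule.severity,
            })
    return alerts


# Sample payload shaped like a Prometheus instant-query response (values are made up)
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"instance": "web-1:9100"}, "value": [1700000000, "93.5"]},
            {"metric": {"instance": "web-2:9100"}, "value": [1700000000, "41.0"]},
        ],
    },
}

cpu_rule = MonitorRule(name="High CPU", threshold=90.0, severity="critical")
alerts = evaluate(cpu_rule, sample)
print(alerts)
```

The alert dictionaries reuse the field names shown in the alert table of the frontend (`monitor_name`, `value`, `threshold`, `status`), so they could be stored and listed without further mapping.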
1.2.2 Backend implementation
python
from typing import Optional

import requests
from fastapi import Depends, FastAPI, HTTPException
from sqlalchemy.orm import Session

from database import get_db
from models import Alert, Monitor
from schemas import AlertResponse, MonitorCreate, MonitorResponse, MonitorUpdate

app = FastAPI()

PROMETHEUS_URL = "http://localhost:9090/api/v1"


# Fetch metrics from Prometheus.
# The endpoints use plain `def` because `requests` and the SQLAlchemy Session are
# blocking; FastAPI runs sync endpoints in a thread pool, so the event loop stays free.
@app.get("/monitoring/metrics")
def get_metrics(
    metric_name: Optional[str] = None,
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
    step: str = "1m",
):
    if start_time and end_time:
        # Range query
        url = f"{PROMETHEUS_URL}/query_range"
        params = {
            "query": metric_name or "up",
            "start": start_time,
            "end": end_time,
            "step": step,
        }
    else:
        # Instant query
        url = f"{PROMETHEUS_URL}/query"
        params = {"query": metric_name or "up"}
    try:
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        return response.json()
    except requests.RequestException as e:
        raise HTTPException(status_code=500, detail=f"Failed to get metrics: {e}")


# List monitor configurations
@app.get("/monitoring/configs", response_model=list[MonitorResponse])
def get_monitors(skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):
    return db.query(Monitor).offset(skip).limit(limit).all()


# Create a monitor configuration
@app.post("/monitoring/configs", response_model=MonitorResponse)
def create_monitor(monitor: MonitorCreate, db: Session = Depends(get_db)):
    db_monitor = Monitor(**monitor.dict())
    db.add(db_monitor)
    db.commit()
    db.refresh(db_monitor)
    return db_monitor


# Delete a monitor configuration (the frontend's delete button calls this route;
# assumes the Monitor model has an integer `id` primary key)
@app.delete("/monitoring/configs/{monitor_id}")
def delete_monitor(monitor_id: int, db: Session = Depends(get_db)):
    db_monitor = db.query(Monitor).filter(Monitor.id == monitor_id).first()
    if db_monitor is None:
        raise HTTPException(status_code=404, detail="Monitor not found")
    db.delete(db_monitor)
    db.commit()
    return {"message": "Monitor deleted"}


# List alerts
@app.get("/monitoring/alerts", response_model=list[AlertResponse])
def get_alerts(skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):
    return db.query(Alert).offset(skip).limit(limit).all()
1.2.3 Frontend implementation
vue
<template>
<div class="system-monitoring">
<el-card>
<template #header>
<div class="card-header">
<span>System performance monitoring</span>
<el-button type="primary" @click="dialogVisible = true">Create monitor</el-button>
</div>
</template>
<el-tabs v-model="activeTab">
<el-tab-pane label="Dashboard" name="dashboard">
<div class="dashboard-container">
<el-row :gutter="20">
<el-col :span="8">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
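<span>CPU usage</span>
</div>
</template>
<div class="metric-value">{{ cpuUsage }}%</div>
<div class="metric-chart">
<!-- el-chart / el-line-chart are placeholders here and in the other cards: Element Plus ships no chart components. With ECharts, register a wrapper such as vue-echarts and render the series through it. -->
<el-chart>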
<el-line-chart :data="cpuHistory" />
</el-chart>
</div>
</el-card>
</el-col>
<el-col :span="8">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
<span>Memory usage</span>
</div>
</template>
<div class="metric-value">{{ memoryUsage }}%</div>
<div class="metric-chart">
<el-chart>
<el-line-chart :data="memoryHistory" />
</el-chart>
</div>
</el-card>
</el-col>
<el-col :span="8">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
<span>Disk usage</span>
</div>
</template>
<div class="metric-value">{{ diskUsage }}%</div>
<div class="metric-chart">
<el-chart>
<el-line-chart :data="diskHistory" />
</el-chart>
</div>
</el-card>
</el-col>
</el-row>
<el-row :gutter="20" style="margin-top: 20px;">
<el-col :span="12">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
<span>Network traffic</span>
</div>
</template>
<div class="metric-chart">
<el-chart>
<el-line-chart :data="networkHistory" />
</el-chart>
</div>
</el-card>
</el-col>
<el-col :span="12">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
<span>System load</span>
</div>
</template>
<div class="metric-chart">
<el-chart>
<el-line-chart :data="loadHistory" />
</el-chart>
</div>
</el-card>
</el-col>
</el-row>
</div>
</el-tab-pane>
<el-tab-pane label="Monitor configuration" name="configs">
<el-table :data="monitors" style="width: 100%">
<el-table-column prop="id" label="ID" width="80" />
<el-table-column prop="name" label="Name" />
<el-table-column prop="metric" label="Metric" />
<el-table-column prop="threshold" label="Threshold" width="120" />
<el-table-column prop="status" label="Status" width="100">
<template #default="{ row }">
<el-tag :type="getStatusType(row.status)">{{ row.status }}</el-tag>
</template>
</el-table-column>
<el-table-column label="Actions" width="150">
<template #default="{ row }">
<el-button size="small" @click="editMonitor(row)">Edit</el-button>
<el-button size="small" type="danger" @click="deleteMonitor(row.id)">Delete</el-button>
</template>
</el-table-column>
</el-table>
</el-tab-pane>
<el-tab-pane label="Alerts" name="alerts">
<el-table :data="alerts" style="width: 100%">
<el-table-column prop="id" label="ID" width="80" />
<el-table-column prop="monitor_name" label="Monitor name" />
<el-table-column prop="metric" label="Metric" />
<el-table-column prop="value" label="Value" width="100" />
<el-table-column prop="threshold" label="Threshold" width="100" />
<el-table-column prop="status" label="Status" width="100">
<template #default="{ row }">
<el-tag :type="getStatusType(row.status)">{{ row.status }}</el-tag>
</template>
</el-table-column>
<el-table-column prop="created_at" label="Created at" width="180" />
</el-table>
</el-tab-pane>
</el-tabs>
</el-card>
<!-- Create-monitor dialog -->
<el-dialog v-model="dialogVisible" title="Create monitor">
<el-form :model="form" label-width="120px">
<el-form-item label="Name">
<el-input v-model="form.name" />
</el-form-item>
<el-form-item label="Metric">
<el-input v-model="form.metric" placeholder="e.g. node_cpu_seconds_total" />
</el-form-item>
<el-form-item label="Threshold">
<el-input v-model.number="form.threshold" type="number" />
</el-form-item>
<el-form-item label="Severity">
<el-select v-model="form.severity">
<el-option label="Info" value="info" />
<el-option label="Warning" value="warning" />
<el-option label="Critical" value="critical" />
</el-select>
</el-form-item>
</el-form>
<template #footer>
<span class="dialog-footer">
<el-button @click="dialogVisible = false">Cancel</el-button>
<el-button type="primary" @click="createMonitor">Create</el-button>
</span>
</template>
</el-dialog>
</div>
</template>
<script setup>
import { ref, onMounted } from 'vue'
import { ElMessage } from 'element-plus'
import axios from 'axios'
const activeTab = ref('dashboard')
const cpuUsage = ref(0)
const memoryUsage = ref(0)
const diskUsage = ref(0)
const cpuHistory = ref([])
const memoryHistory = ref([])
const diskHistory = ref([])
const networkHistory = ref([])
const loadHistory = ref([])
const monitors = ref([])
const alerts = ref([])
const dialogVisible = ref(false)
const form = ref({
name: '',
metric: '',
threshold: 0,
severity: 'warning'
})
// Fetch the current system metrics
const getSystemMetrics = async () => {
try {
// CPU usage
const cpuResponse = await axios.get('/api/monitoring/metrics', {
params: { metric_name: '100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100' }
})
cpuUsage.value = Math.round(cpuResponse.data.data.result[0].value[1])
// Memory usage
const memoryResponse = await axios.get('/api/monitoring/metrics', {
params: { metric_name: '(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100' }
})
memoryUsage.value = Math.round(memoryResponse.data.data.result[0].value[1])
// Disk usage
const diskResponse = await axios.get('/api/monitoring/metrics', {
params: { metric_name: '(node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_free_bytes{mountpoint="/"}) / node_filesystem_size_bytes{mountpoint="/"} * 100' }
})
diskUsage.value = Math.round(diskResponse.data.data.result[0].value[1])
} catch (error) {
ElMessage.error('Failed to fetch system metrics')
console.error(error)
}
}
// Fetch historical data
const getHistoryData = async () => {
try {
// CPU history for the last hour
const cpuResponse = await axios.get('/api/monitoring/metrics', {
params: {
metric_name: '100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100',
start_time: new Date(Date.now() - 3600000).toISOString(),
end_time: new Date().toISOString(),
step: '1m'
}
})
cpuHistory.value = cpuResponse.data.data.result[0].values.map(item => ({
time: new Date(item[0] * 1000).toLocaleTimeString(),
value: Math.round(item[1])
}))
// Fetch the history of the other metrics in the same way...
} catch (error) {
console.error('Failed to fetch historical data:', error)
}
}
// Fetch monitor configurations
const getMonitors = async () => {
try {
const response = await axios.get('/api/monitoring/configs')
monitors.value = response.data
} catch (error) {
ElMessage.error('Failed to fetch monitor configurations')
console.error(error)
}
}
// Fetch the alert list
const getAlerts = async () => {
try {
const response = await axios.get('/api/monitoring/alerts')
alerts.value = response.data
} catch (error) {
ElMessage.error('Failed to fetch alerts')
console.error(error)
}
}
// Create a monitor
const createMonitor = async () => {
try {
await axios.post('/api/monitoring/configs', form.value)
ElMessage.success('Monitor created')
dialogVisible.value = false
getMonitors()
} catch (error) {
ElMessage.error('Failed to create monitor')
console.error(error)
}
}
// Edit a monitor
const editMonitor = (monitor) => {
form.value = { ...monitor }
dialogVisible.value = true
}
// Delete a monitor
const deleteMonitor = async (id) => {
try {
await axios.delete(`/api/monitoring/configs/${id}`)
ElMessage.success('Monitor deleted')
getMonitors()
} catch (error) {
ElMessage.error('Failed to delete monitor')
console.error(error)
}
}
// Map a status to an Element Plus tag type
const getStatusType = (status) => {
const typeMap = {
'ok': 'success',
'warning': 'warning',
'critical': 'danger',
'info': 'info'
}
return typeMap[status] || 'info'
}
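// Initial load, then refresh every 30 seconds
import { onUnmounted } from 'vue'
const refreshAll = () => {
getSystemMetrics()
getHistoryData()
getMonitors()
getAlerts()
}
let refreshTimer = null
onMounted(() => {
refreshAll()
refreshTimer = setInterval(refreshAll, 30000)
})
// Stop polling when the component is destroyed so the timer does not leak
onUnmounted(() => clearInterval(refreshTimer))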
</script>
<style scoped>
.system-monitoring {
padding: 20px;
}
.card-header {
display: flex;
justify-content: space-between;
align-items: center;
}
.dashboard-container {
margin-top: 20px;
}
.metric-card {
height: 250px;
}
.metric-header {
display: flex;
justify-content: center;
font-weight: bold;
}
.metric-value {
font-size: 36px;
font-weight: bold;
text-align: center;
margin: 20px 0;
color: #1E40AF;
}
.metric-chart {
height: 150px;
}
.dialog-footer {
width: 100%;
display: flex;
justify-content: flex-end;
}
</style>
2. Application performance analysis
2.1 Analysis techniques
2.1.1 OpenTelemetry
bash
# Install the OpenTelemetry Collector
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.58.0/otelcol_0.58.0_linux_amd64.tar.gz
tar xvfz otelcol_0.58.0_linux_amd64.tar.gz
cd otelcol_0.58.0_linux_amd64
sudo cp otelcol /usr/local/bin/
# Create the system user and the configuration directory
sudo useradd --system --no-create-home --shell /usr/sbin/nologin otelcol
sudo mkdir -p /etc/otelcol
# Create the configuration file
sudo nano /etc/otelcol/config.yaml
yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889
  jaeger:
    endpoint: localhost:14250
    tls:
      insecure: true
processors:
  batch:
# Pipelines must be nested under the top-level service key
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
bash
# Create the systemd service
sudo nano /etc/systemd/system/otelcol.service
ini
[Unit]
Description=OpenTelemetry Collector
After=network.target
[Service]
Type=simple
User=otelcol
ExecStart=/usr/local/bin/otelcol --config=/etc/otelcol/config.yaml
[Install]
WantedBy=multi-user.target
bash
# Start the service
sudo systemctl daemon-reload
sudo systemctl start otelcol
sudo systemctl enable otelcol
2.1.2 Jaeger
bash
# Run Jaeger (all-in-one image) with Docker
docker run -d --name jaeger \
-e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
-p 5775:5775/udp \
-p 6831:6831/udp \
-p 6832:6832/udp \
-p 5778:5778 \
-p 16686:16686 \
-p 14268:14268 \
-p 14250:14250 \
-p 9411:9411 \
jaegertracing/all-in-one:1.35
# Open the Jaeger UI
# http://localhost:16686
2.2 Application performance analysis system design
2.2.1 Architecture
- Frontend: Vue.js + Element Plus + ECharts
- Backend: Python + FastAPI
- Monitoring: OpenTelemetry + Jaeger
- Storage: Elasticsearch
2.2.2 Backend implementation
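The /performance/stats endpoint in this section returns hard-coded placeholder numbers. As a rough sketch of how such figures could be derived, the code below aggregates a list of span records into the same five fields. The record shape (`duration_ms`, `error`), the `summarize` helper, and the nearest-rank percentile method are assumptions made for illustration; they are not part of Jaeger's API.

```python
import math


def percentile(sorted_values: list[float], pct: float) -> float:
    """Nearest-rank percentile of an ascending list (pct in 0..100)."""
    if not sorted_values:
        return 0.0
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(spans: list[dict], window_seconds: float) -> dict:
    """Aggregate hypothetical span records of the form {"duration_ms": float, "error": bool}."""
    durations = sorted(s["duration_ms"] for s in spans)
    count = len(durations)
    errors = sum(1 for s in spans if s["error"])
    return {
        "average_response_time": sum(durations) / count if count else 0.0,
        "p95_response_time": percentile(durations, 95),
        "p99_response_time": percentile(durations, 99),
        # Fraction between 0 and 1; multiply by 100 before showing it as a percentage
        "error_rate": errors / count if count else 0.0,
        # Requests per second over the observed window
        "throughput": count / window_seconds if window_seconds else 0.0,
    }


# 100 made-up spans with durations 1..100 ms; every 20th one is marked as failed
spans = [{"duration_ms": float(i), "error": i % 20 == 0} for i in range(1, 101)]
stats = summarize(spans, window_seconds=10)
print(stats)
```

Nearest-rank is only one of several percentile definitions; interpolating methods give slightly different values on small samples.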
python
# Application performance analysis API
# Continues the FastAPI app from section 1.2.2 (reuses app, requests, HTTPException)
from typing import Optional

JAEGER_URL = "http://localhost:16686/api"


# Search traces
@app.get("/performance/traces")
def get_traces(
    service_name: Optional[str] = None,
    operation_name: Optional[str] = None,
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
    limit: int = 100,
):
    # Build the Jaeger query parameters
    params = {"limit": limit}
    if service_name:
        params["service"] = service_name
    if operation_name:
        params["operation"] = operation_name
    # Assumption: Jaeger's HTTP query API expects start/end as Unix timestamps in
    # microseconds, so formatted dates from the client must be converted first
    if start_time:
        params["start"] = start_time
    if end_time:
        params["end"] = end_time
    try:
        response = requests.get(f"{JAEGER_URL}/traces", params=params, timeout=10)
        response.raise_for_status()
        return response.json()
    except requests.RequestException as e:
        raise HTTPException(status_code=500, detail=f"Failed to get traces: {e}")


# List services
@app.get("/performance/services")
def get_services():
    try:
        response = requests.get(f"{JAEGER_URL}/services", timeout=10)
        response.raise_for_status()
        return response.json()
    except requests.RequestException as e:
        raise HTTPException(status_code=500, detail=f"Failed to get services: {e}")


# List the operations of a service
@app.get("/performance/services/{service_name}/operations")
def get_operations(service_name: str):
    try:
        response = requests.get(
            f"{JAEGER_URL}/services/{service_name}/operations", timeout=10
        )
        response.raise_for_status()
        return response.json()
    except requests.RequestException as e:
        raise HTTPException(status_code=500, detail=f"Failed to get operations: {e}")


# Performance statistics
@app.get("/performance/stats")
def get_performance_stats(
    service_name: Optional[str] = None,
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
):
    # Query filter (not used yet by the placeholder below)
    query = {
        "service_name": service_name,
        "start_time": start_time,
        "end_time": end_time,
    }
    # Real statistics logic goes here, for example computing the average
    # response time and the error rate. The values below are placeholders.
    return {
        "average_response_time": 123.45,
        "p95_response_time": 234.56,
        "p99_response_time": 345.67,
        "error_rate": 0.02,  # fraction, i.e. 2%
        "throughput": 1234,
    }
2.2.3 Frontend implementation
vue
<template>
<div class="application-performance">
<el-card>
<template #header>
<div class="card-header">
<span>Application performance analysis</span>
</div>
</template>
<el-form :inline="true" :model="searchForm" class="search-form">
<el-form-item label="Service">
<el-select v-model="searchForm.service" placeholder="Select a service">
<el-option v-for="service in services" :key="service" :label="service" :value="service" />
</el-select>
</el-form-item>
<el-form-item label="Operation">
<el-select v-model="searchForm.operation" placeholder="Select an operation">
<el-option v-for="operation in operations" :key="operation" :label="operation" :value="operation" />
</el-select>
</el-form-item>
<el-form-item label="Time range">
<el-date-picker
v-model="searchForm.timeRange"
type="datetimerange"
range-separator="to"
start-placeholder="Start time"
end-placeholder="End time"
format="YYYY-MM-DD HH:mm:ss"
value-format="YYYY-MM-DD HH:mm:ss"
/>
</el-form-item>
<el-form-item>
<el-button type="primary" @click="searchTraces">Search</el-button>
<el-button @click="resetForm">Reset</el-button>
</el-form-item>
</el-form>
<el-tabs v-model="activeTab">
<el-tab-pane label="Overview" name="overview">
<div class="overview-container">
<el-row :gutter="20">
<el-col :span="6">
<div class="stat-card">
<div class="stat-value">{{ stats.average_response_time }}ms</div>
<div class="stat-label">Average response time</div>
</div>
</el-col>
<el-col :span="6">
<div class="stat-card">
<div class="stat-value">{{ stats.p95_response_time }}ms</div>
<div class="stat-label">P95 response time</div>
</div>
</el-col>
<el-col :span="6">
<div class="stat-card">
<div class="stat-value">{{ (stats.error_rate * 100).toFixed(2) }}%</div>
<div class="stat-label">Error rate</div>
</div>
</el-col>
<el-col :span="6">
<div class="stat-card">
<div class="stat-value">{{ stats.throughput }}/s</div>
<div class="stat-label">Throughput</div>
</div>
</el-col>
</el-row>
<el-row :gutter="20" style="margin-top: 20px;">
<el-col :span="12">
<el-card class="chart-card">
<template #header>
<div class="chart-header">
<span>Response time trend</span>
</div>
</template>
<div class="chart-content">
<el-chart>
<el-line-chart :data="responseTimeTrend" />
</el-chart>
</div>
</el-card>
</el-col>
<el-col :span="12">
<el-card class="chart-card">
<template #header>
<div class="chart-header">
<span>Error rate trend</span>
</div>
</template>
<div class="chart-content">
<el-chart>
<el-line-chart :data="errorRateTrend" />
</el-chart>
</div>
</el-card>
</el-col>
</el-row>
</div>
</el-tab-pane>
<el-tab-pane label="Trace details" name="traces">
<el-table :data="traces" style="width: 100%">
<el-table-column prop="traceID" label="Trace ID" width="300" />
<el-table-column prop="spans[0].operationName" label="Operation" />
<el-table-column prop="spans[0].duration" label="Duration (μs)" width="120" />
<el-table-column prop="spans[0].startTime" label="Start time" width="180" />
<el-table-column label="Actions" width="100">
<template #default="{ row }">
<el-button size="small" @click="viewTraceDetails(row.traceID)">View</el-button>
</template>
</el-table-column>
</el-table>
<div class="pagination">
<el-pagination
v-model:current-page="currentPage"
v-model:page-size="pageSize"
:page-sizes="[10, 20, 50, 100]"
layout="total, sizes, prev, pager, next, jumper"
:total="total"
@size-change="handleSizeChange"
@current-change="handleCurrentChange"
/>
</div>
</el-tab-pane>
<el-tab-pane label="Service dependencies" name="dependencies">
<div class="dependencies-container">
<el-card class="chart-card">
<template #header>
<div class="chart-header">
<span>Service dependency graph</span>
</div>
</template>
<div class="chart-content">
<el-chart>
<el-graph-chart :data="serviceDependencies" />
</el-chart>
</div>
</el-card>
</div>
</el-tab-pane>
</el-tabs>
</el-card>
<!-- Trace details dialog -->
<el-dialog v-model="traceDetailsVisible" title="Trace details" width="80%">
<div class="trace-details">
<el-tree :data="traceDetails" :props="treeProps" />
</div>
</el-dialog>
</div>
</template>
<script setup>
import { ref, onMounted, watch } from 'vue'
import { ElMessage } from 'element-plus'
import axios from 'axios'
const activeTab = ref('overview')
const services = ref([])
const operations = ref([])
const traces = ref([])
const stats = ref({
average_response_time: 0,
p95_response_time: 0,
error_rate: 0,
throughput: 0
})
const responseTimeTrend = ref([])
const errorRateTrend = ref([])
const serviceDependencies = ref([])
const traceDetails = ref([])
const currentPage = ref(1)
const pageSize = ref(10)
const total = ref(0)
const traceDetailsVisible = ref(false)
const searchForm = ref({
service: '',
operation: '',
timeRange: []
})
const treeProps = {
children: 'spans',
label: 'operationName'
}
// Fetch the service list
const getServices = async () => {
try {
const response = await axios.get('/api/performance/services')
services.value = response.data.data
} catch (error) {
ElMessage.error('Failed to fetch services')
console.error(error)
}
}
// Fetch the operations of the selected service
const getOperations = async () => {
if (!searchForm.value.service) {
operations.value = []
return
}
try {
const response = await axios.get(`/api/performance/services/${searchForm.value.service}/operations`)
operations.value = response.data.data
} catch (error) {
ElMessage.error('Failed to fetch operations')
console.error(error)
}
}
// Search traces
const searchTraces = async () => {
try {
const params = {
service_name: searchForm.value.service,
operation_name: searchForm.value.operation,
start_time: searchForm.value.timeRange[0],
end_time: searchForm.value.timeRange[1],
limit: pageSize.value
}
const response = await axios.get('/api/performance/traces', { params })
traces.value = response.data.data
total.value = response.data.total
} catch (error) {
ElMessage.error('Failed to search traces')
console.error(error)
}
}
// Fetch performance statistics
const getPerformanceStats = async () => {
try {
const params = {
service_name: searchForm.value.service,
start_time: searchForm.value.timeRange[0],
end_time: searchForm.value.timeRange[1]
}
const response = await axios.get('/api/performance/stats', { params })
stats.value = response.data
} catch (error) {
ElMessage.error('Failed to fetch performance statistics')
console.error(error)
}
}
// View the details of a trace
const viewTraceDetails = async (traceId) => {
try {
const response = await axios.get(`/api/performance/traces/${traceId}`)
traceDetails.value = response.data
traceDetailsVisible.value = true
} catch (error) {
ElMessage.error('Failed to fetch trace details')
console.error(error)
}
}
// Reset the search form
const resetForm = () => {
searchForm.value = {
service: '',
operation: '',
timeRange: []
}
operations.value = []
}
// Pagination handlers
const handleSizeChange = (size) => {
pageSize.value = size
searchTraces()
}
const handleCurrentChange = (current) => {
currentPage.value = current
searchTraces()
}
// Reload data when the selected service changes
watch(() => searchForm.value.service, () => {
getOperations()
getPerformanceStats()
searchTraces()
})
// Initial load
onMounted(() => {
getServices()
getPerformanceStats()
searchTraces()
})
</script>
<style scoped>
.application-performance {
padding: 20px;
}
.card-header {
display: flex;
justify-content: space-between;
align-items: center;
}
.search-form {
margin-bottom: 20px;
padding: 15px;
background-color: #f5f7fa;
border-radius: 8px;
}
.overview-container {
margin-top: 20px;
}
.stat-card {
background-color: #f5f7fa;
border-radius: 8px;
padding: 20px;
text-align: center;
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
}
.stat-value {
font-size: 24px;
font-weight: bold;
color: #1E40AF;
}
.stat-label {
font-size: 14px;
color: #64748B;
margin-top: 5px;
}
.chart-card {
margin-top: 20px;
}
.chart-header {
display: flex;
justify-content: center;
font-weight: bold;
}
.chart-content {
height: 300px;
}
.pagination {
margin-top: 20px;
display: flex;
justify-content: flex-end;
}
.trace-details {
max-height: 600px;
overflow-y: auto;
}
.dependencies-container {
margin-top: 20px;
}
</style>
3. Database optimization
3.1 Optimization techniques
3.1.1 Index optimization
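The statements in this section target MySQL. To experiment with the same idea (inspect the query plan before and after creating an index) without a MySQL server, the following sketch uses Python's built-in `sqlite3` module instead. SQLite's `EXPLAIN QUERY PLAN` is a stand-in for MySQL's `EXPLAIN`; its output format differs, and the table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

QUERY = "SELECT * FROM users WHERE email = 'user42@example.com'"


def plan(sql: str) -> str:
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute(f"EXPLAIN QUERY PLAN {sql}").fetchall()
    return "; ".join(row[-1] for row in rows)


plan_before = plan(QUERY)  # no index on email yet, so the table is scanned
conn.execute("CREATE INDEX idx_users_email ON users(email)")
plan_after = plan(QUERY)   # the planner can now search via idx_users_email

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan reports a scan and the second one names the new index.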
sql
-- Create single-column indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_created_at ON orders(created_at);
-- Create a composite index
CREATE INDEX idx_orders_user_id_created_at ON orders(user_id, created_at);
-- List the indexes on a table
SHOW INDEX FROM users;
-- Check whether a query uses an index
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';
-- Drop an unused index (MySQL requires the table name)
DROP INDEX idx_users_old_index ON users;
3.1.2 Query optimization
sql
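-- Avoid SELECT *; list only the columns you need
SELECT id, name, email FROM users WHERE active = 1;
-- Use LIMIT to bound the result set
SELECT * FROM users ORDER BY created_at DESC LIMIT 10;
-- Avoid applying functions to columns in the WHERE clause
-- Bad practice: YEAR() prevents the use of an index on created_at
SELECT * FROM users WHERE YEAR(created_at) = 2023;
-- Good practice: a half-open range can use the index and covers all of 2023, including 31 December
SELECT * FROM users WHERE created_at >= '2023-01-01' AND created_at < '2024-01-01';
-- Use JOIN instead of a subquery
-- Subquery
SELECT * FROM users WHERE id IN (SELECT user_id FROM orders WHERE amount > 1000);
-- JOIN (GROUP BY removes the duplicates produced by users with several matching orders)
SELECT u.* FROM users u JOIN orders o ON u.id = o.user_id WHERE o.amount > 1000 GROUP BY u.id;
-- Avoid OR on the same column; use IN instead
-- Bad practice
SELECT * FROM users WHERE status = 'active' OR status = 'pending';
-- Good practice
SELECT * FROM users WHERE status IN ('active', 'pending');
3.1.3 Table structure optimization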
sql
-- Choose appropriate data types
-- Bad practice
CREATE TABLE users (
  id INT,
  name VARCHAR(255),
  email VARCHAR(255),
  status VARCHAR(20),
  created_at DATETIME
);
-- Good practice
CREATE TABLE users (
  id INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(100),
  email VARCHAR(100) UNIQUE,
  status ENUM('active', 'pending', 'inactive'),
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Partitioned table
-- MySQL requires every unique key (including the primary key) to contain the partitioning column,
-- and YEAR() cannot be applied to a TIMESTAMP column in a partitioning expression,
-- so created_at is a DATETIME here and is part of the primary key
CREATE TABLE logs (
  id INT NOT NULL AUTO_INCREMENT,
  level VARCHAR(20),
  message TEXT,
  created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (YEAR(created_at)) (
  PARTITION p2023 VALUES LESS THAN (2024),
  PARTITION p2024 VALUES LESS THAN (2025),
  PARTITION p2025 VALUES LESS THAN (2026)
);
-- Manual sharding (splitting one logical table into several tables)
-- Create the shards of the users table
CREATE TABLE users_0 (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  email VARCHAR(100) UNIQUE,
  status ENUM('active', 'pending', 'inactive'),
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE users_1 (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  email VARCHAR(100) UNIQUE,
  status ENUM('active', 'pending', 'inactive'),
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
3.1.4 Configuration tuning
ini
# MySQL configuration tuning
# /etc/mysql/my.cnf
[mysqld]
# Memory
innodb_buffer_pool_size = 4G
key_buffer_size = 256M
# Query cache (MySQL 5.7 and earlier only: the query cache was removed in MySQL 8.0,
# where these two settings are unknown variables and prevent the server from starting)
query_cache_type = 1
query_cache_size = 64M
# Connections
max_connections = 1000
wait_timeout = 60
# Logging
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 1
# Storage engine
default-storage-engine = InnoDB
# InnoDB
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 16M
innodb_log_file_size = 256M
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000
bash
# Restart MySQL
sudo systemctl restart mysql
3.2 Database optimization system design
3.2.1 Architecture
- Frontend: Vue.js + Element Plus + ECharts
- Backend: Python + FastAPI
- Databases: MySQL + PostgreSQL
- Monitoring: Prometheus + Grafana
3.2.2 Backend implementation
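The /database/slow-queries endpoint in this section returns a hard-coded sample entry. As a sketch of the parsing step it alludes to, the code below reads text in the layout of MySQL's slow query log and extracts the fields the endpoint returns. It assumes every entry starts with a `# Time:` header followed by a `# Query_time:` line; the sample log content and the `parse_slow_log` helper are made up for illustration.

```python
import re

SAMPLE_LOG = """\
# Time: 2023-01-01T12:00:00.000000Z
# User@Host: app[app] @ localhost []  Id:     8
# Query_time: 1.230000  Lock_time: 0.000100 Rows_sent: 1  Rows_examined: 1000000
SET timestamp=1672574400;
SELECT * FROM users WHERE email = 'user@example.com';
# Time: 2023-01-01T12:05:00.000000Z
# User@Host: app[app] @ localhost []  Id:     9
# Query_time: 2.500000  Lock_time: 0.000000 Rows_sent: 10  Rows_examined: 50000
SET timestamp=1672574700;
SELECT * FROM orders WHERE amount > 1000;
"""

TIME_RE = re.compile(r"^# Time: (\S+)")
STATS_RE = re.compile(
    r"^# Query_time: ([\d.]+)\s+Lock_time: ([\d.]+)\s+Rows_sent: (\d+)\s+Rows_examined: (\d+)"
)


def parse_slow_log(text: str) -> list[dict]:
    """Parse slow query log text into one dict per logged statement."""
    entries = []
    current = None
    for line in text.splitlines():
        if (m := TIME_RE.match(line)):
            # A "# Time:" header starts a new entry
            current = {"created_at": m.group(1), "duration": None,
                       "rows_examined": None, "query": ""}
            entries.append(current)
        elif current is None:
            continue  # skip any preamble before the first entry
        elif (m := STATS_RE.match(line)):
            current["duration"] = float(m.group(1))
            current["rows_examined"] = int(m.group(4))
        elif line.startswith("#") or line.startswith("SET timestamp="):
            continue  # headers and bookkeeping statements we do not need
        else:
            # Statements may span several lines; join them with spaces
            current["query"] = (current["query"] + " " + line.strip()).strip()
    return entries


entries = parse_slow_log(SAMPLE_LOG)
slowest = sorted(entries, key=lambda e: e["duration"], reverse=True)
print(slowest[0])
```

Sorting by `duration` and slicing the result would give the endpoint its `limit` behaviour; filtering on `created_at` would cover `start_time` and `end_time`.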
python
# Database monitoring API
# Continues the FastAPI app from section 1.2.2 (reuses app, requests, HTTPException)
from typing import Optional

PROMETHEUS_URL = "http://localhost:9090/api/v1"
# Current open connections; mysql_global_status_connections is a cumulative
# counter of connection attempts, not the current connection count
DEFAULT_DB_METRIC = "mysql_global_status_threads_connected"


@app.get("/database/metrics")
def get_database_metrics(
    database: Optional[str] = None,
    metric_name: Optional[str] = None,
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
    step: str = "1m",
):
    if start_time and end_time:
        # Range query
        url = f"{PROMETHEUS_URL}/query_range"
        params = {
            "query": metric_name or DEFAULT_DB_METRIC,
            "start": start_time,
            "end": end_time,
            "step": step,
        }
    else:
        # Instant query
        url = f"{PROMETHEUS_URL}/query"
        params = {"query": metric_name or DEFAULT_DB_METRIC}
    try:
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        return response.json()
    except requests.RequestException as e:
        raise HTTPException(status_code=500, detail=f"Failed to get metrics: {e}")


# Slow query log
@app.get("/database/slow-queries")
def get_slow_queries(
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
    limit: int = 100,
):
    # Slow query log parsing goes here, for example reading the entries
    # from the log file or from a database table. Placeholder data below.
    return [
        {
            "id": 1,
            "query": "SELECT * FROM users WHERE email = 'user@example.com'",
            "duration": 1.23,
            "created_at": "2023-01-01T12:00:00Z",
        }
    ]


# Table information
@app.get("/database/tables")
def get_tables(database: Optional[str] = None):
    # Table lookup goes here, for example via SHOW TABLES or
    # information_schema. Placeholder data below.
    return [
        {
            "name": "users",
            "rows": 1000000,
            "size": "100MB",
            "engine": "InnoDB",
        }
    ]


# Analyze a table
@app.post("/database/tables/{table_name}/analyze")
def analyze_table(table_name: str, database: str = "default"):
    # Table analysis goes here, for example running ANALYZE TABLE.
    # Placeholder recommendations below.
    return {
        "message": f"Table {table_name} analyzed successfully",
        "recommendations": [
            "Add index on column 'email'",
            "Optimize query: SELECT * FROM users WHERE created_at > '2023-01-01'",
        ],
    }
3.2.3 Frontend implementation
vue
<template>
<div class="database-optimization">
<el-card>
<template #header>
<div class="card-header">
<span>Database optimization</span>
</div>
</template>
<el-tabs v-model="activeTab">
<el-tab-pane label="Database monitoring" name="monitoring">
<div class="monitoring-container">
<el-row :gutter="20">
<el-col :span="12">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
<span>Connections</span>
</div>
</template>
<div class="metric-value">{{ connectionCount }}</div>
<div class="metric-chart">
<el-chart>
<el-line-chart :data="connectionHistory" />
</el-chart>
</div>
</el-card>
</el-col>
<el-col :span="12">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
<span>Query rate</span>
</div>
</template>
<div class="metric-value">{{ queryRate }}/s</div>
<div class="metric-chart">
<el-chart>
<el-line-chart :data="queryRateHistory" />
</el-chart>
</div>
</el-card>
</el-col>
</el-row>
<el-row :gutter="20" style="margin-top: 20px;">
<el-col :span="12">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
<span>Slow queries</span>
</div>
</template>
<div class="metric-value">{{ slowQueryCount }}</div>
<div class="metric-chart">
<el-chart>
<el-line-chart :data="slowQueryHistory" />
</el-chart>
</div>
</el-card>
</el-col>
<el-col :span="12">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
<span>Cache hit rate</span>
</div>
</template>
<div class="metric-value">{{ cacheHitRate }}%</div>
<div class="metric-chart">
<el-chart>
<el-line-chart :data="cacheHitRateHistory" />
</el-chart>
</div>
</el-card>
</el-col>
</el-row>
</div>
</el-tab-pane>
<el-tab-pane label="Slow query analysis" name="slow-queries">
<el-table :data="slowQueries" style="width: 100%">
<el-table-column prop="id" label="ID" width="80" />
<el-table-column prop="query" label="Query">
<template #default="{ row }">
<div class="query-content">{{ row.query }}</div>
</template>
</el-table-column>
<el-table-column prop="duration" label="Duration (s)" width="120" />
<el-table-column prop="created_at" label="Executed at" width="180" />
<el-table-column label="Actions" width="150">
<template #default="{ row }">
<el-button size="small" @click="explainQuery(row.query)">Explain</el-button>
<el-button size="small" @click="optimizeQuery(row.query)">Optimize</el-button>
</template>
</el-table-column>
</el-table>
</el-tab-pane>
<el-tab-pane label="Table analysis" name="tables">
<el-table :data="tables" style="width: 100%">
<el-table-column prop="name" label="Table" />
<el-table-column prop="rows" label="Rows" width="120" />
<el-table-column prop="size" label="Size" width="120" />
<el-table-column prop="engine" label="Engine" width="100" />
<el-table-column label="Actions" width="200">
<template #default="{ row }">
<el-button size="small" @click="analyzeTable(row.name)">Analyze</el-button>
<el-button size="small" @click="optimizeTable(row.name)">Optimize</el-button>
<el-button size="small" @click="viewIndexes(row.name)">View indexes</el-button>
</template>
</el-table-column>
</el-table>
</el-tab-pane>
<el-tab-pane label="Configuration" name="config">
<el-form :model="configForm" label-width="120px">
<el-form-item label="innodb_buffer_pool_size">
<el-input v-model="configForm.innodb_buffer_pool_size" />
</el-form-item>
<el-form-item label="max_connections">
<el-input v-model.number="configForm.max_connections" type="number" />
</el-form-item>
<el-form-item label="long_query_time">
<el-input v-model.number="configForm.long_query_time" type="number" step="0.1" />
</el-form-item>
<el-form-item label="query_cache_size">
<el-input v-model="configForm.query_cache_size" />
</el-form-item>
<el-form-item>
<el-button type="primary" @click="saveConfig">Save configuration</el-button>
<el-button @click="loadConfig">Load current configuration</el-button>
</el-form-item>
</el-form>
</el-tab-pane>
</el-tabs>
</el-card>
<!-- Query analysis (EXPLAIN) dialog -->
<el-dialog v-model="explainDialogVisible" title="Query analysis">
<div class="explain-dialog">
<el-table :data="explainResult" style="width: 100%">
<el-table-column prop="id" label="ID" />
<el-table-column prop="select_type" label="Select type" />
<el-table-column prop="table" label="Table" />
<el-table-column prop="type" label="Access type" />
<el-table-column prop="possible_keys" label="Possible keys" />
<el-table-column prop="key" label="Key used" />
<el-table-column prop="key_len" label="Key length" />
<el-table-column prop="ref" label="Ref" />
<el-table-column prop="rows" label="Rows examined" />
<el-table-column prop="Extra" label="Extra" />
</el-table>
</div>
</el-dialog>
<!-- Index management dialog -->
<el-dialog v-model="indexDialogVisible" title="Index management">
<div class="index-dialog">
<el-table :data="indexes" style="width: 100%">
<el-table-column prop="Table" label="Table" />
<el-table-column prop="Non_unique" label="Non-unique" />
<el-table-column prop="Key_name" label="Index name" />
<el-table-column prop="Seq_in_index" label="Sequence" />
<el-table-column prop="Column_name" label="Column" />
<el-table-column prop="Collation" label="Collation" />
<el-table-column prop="Cardinality" label="Cardinality" />
<el-table-column prop="Sub_part" label="Sub-part" />
<el-table-column prop="Packed" label="Packed" />
<el-table-column prop="Null" label="Nullable" />
<el-table-column prop="Index_type" label="Index type" />
<el-table-column prop="Comment" label="Comment" />
</el-table>
</div>
</el-dialog>
</div>
</template>
<script setup>
import { ref, onMounted, onUnmounted } from 'vue'
import { ElMessage } from 'element-plus'
import axios from 'axios'

const activeTab = ref('monitoring')
const connectionCount = ref(0)
const queryRate = ref(0)
const slowQueryCount = ref(0)
const cacheHitRate = ref(0)
const connectionHistory = ref([])
const queryRateHistory = ref([])
const slowQueryHistory = ref([])
const cacheHitRateHistory = ref([])
const slowQueries = ref([])
const tables = ref([])
const indexes = ref([])
const explainResult = ref([])
const explainDialogVisible = ref(false)
const indexDialogVisible = ref(false)
const configForm = ref({
  innodb_buffer_pool_size: '4G',
  max_connections: 1000,
  long_query_time: 1,
  // query_cache_size only exists up to MySQL 5.7; the query cache was removed in MySQL 8.0
  query_cache_size: '64M'
})
let refreshTimer = null

// Run one instant PromQL query through the backend and return the first sample as a number.
// Prometheus returns sample values as strings; an empty or non-finite result is reported as 0.
const queryScalar = async (expr) => {
  const response = await axios.get('/api/database/metrics', {
    params: { metric_name: expr }
  })
  const sample = response.data?.data?.result?.[0]
  const value = sample ? Number(sample.value[1]) : 0
  return Number.isFinite(value) ? value : 0
}

// Load the database metrics
const getDatabaseMetrics = async () => {
  try {
    // Currently open connections (Threads_connected); the Connections counter only counts connection attempts
    connectionCount.value = await queryScalar('mysql_global_status_threads_connected')
    // Queries per second over the last minute
    queryRate.value = Math.round(await queryScalar('rate(mysql_global_status_queries[1m])'))
    // Total slow queries since server start
    slowQueryCount.value = await queryScalar('mysql_global_status_slow_queries')
    // Query cache hit rate in percent (only meaningful on MySQL 5.7 and earlier)
    cacheHitRate.value = Math.round(await queryScalar(
      '(mysql_global_status_qcache_hits / (mysql_global_status_qcache_hits + mysql_global_status_qcache_inserts)) * 100'
    ))
  } catch (error) {
    ElMessage.error('Failed to load database metrics')
    console.error(error)
  }
}

// Load the slow query log
const getSlowQueries = async () => {
  try {
    const response = await axios.get('/api/database/slow-queries')
    slowQueries.value = response.data
  } catch (error) {
    ElMessage.error('Failed to load the slow query log')
    console.error(error)
  }
}

// Load table information
const getTables = async () => {
  try {
    const response = await axios.get('/api/database/tables')
    tables.value = response.data
  } catch (error) {
    ElMessage.error('Failed to load table information')
    console.error(error)
  }
}

// Run EXPLAIN on a query and show the plan in the dialog
const explainQuery = async (query) => {
  try {
    const response = await axios.post('/api/database/explain', { query })
    explainResult.value = response.data
    explainDialogVisible.value = true
  } catch (error) {
    ElMessage.error('Failed to analyze the query')
    console.error(error)
  }
}

// Request optimization suggestions for a query
const optimizeQuery = async (query) => {
  try {
    const response = await axios.post('/api/database/optimize-query', { query })
    ElMessage.success('Optimization suggestions generated')
    // The suggestions are only logged here; render them in the UI as needed
    console.log('Optimization suggestions:', response.data)
  } catch (error) {
    ElMessage.error('Failed to optimize the query')
    console.error(error)
  }
}

// Analyze a table
const analyzeTable = async (tableName) => {
  try {
    const response = await axios.post(`/api/database/tables/${encodeURIComponent(tableName)}/analyze`)
    ElMessage.success(`Table ${tableName} analyzed`)
    // The result is only logged here; render it in the UI as needed
    console.log('Analysis result:', response.data)
  } catch (error) {
    ElMessage.error('Failed to analyze the table')
    console.error(error)
  }
}

// Optimize a table
const optimizeTable = async (tableName) => {
  try {
    await axios.post(`/api/database/tables/${encodeURIComponent(tableName)}/optimize`)
    ElMessage.success(`Table ${tableName} optimized`)
  } catch (error) {
    ElMessage.error('Failed to optimize the table')
    console.error(error)
  }
}

// Show the indexes of a table
const viewIndexes = async (tableName) => {
  try {
    const response = await axios.get(`/api/database/tables/${encodeURIComponent(tableName)}/indexes`)
    indexes.value = response.data
    indexDialogVisible.value = true
  } catch (error) {
    ElMessage.error('Failed to load indexes')
    console.error(error)
  }
}

// Save the configuration
const saveConfig = async () => {
  try {
    await axios.post('/api/database/config', configForm.value)
    ElMessage.success('Configuration saved')
  } catch (error) {
    ElMessage.error('Failed to save the configuration')
    console.error(error)
  }
}

// Load the current configuration
const loadConfig = async () => {
  try {
    const response = await axios.get('/api/database/config')
    configForm.value = response.data
    ElMessage.success('Configuration loaded')
  } catch (error) {
    ElMessage.error('Failed to load the configuration')
    console.error(error)
  }
}

// Initial load
onMounted(() => {
  getDatabaseMetrics()
  getSlowQueries()
  getTables()
  loadConfig()
  // Refresh metrics and slow queries every 30 seconds
  refreshTimer = setInterval(() => {
    getDatabaseMetrics()
    getSlowQueries()
  }, 30000)
})

// Stop polling when the component is destroyed
onUnmounted(() => {
  clearInterval(refreshTimer)
})
</script>
<style scoped>
.database-optimization {
padding: 20px;
}
.card-header {
display: flex;
justify-content: space-between;
align-items: center;
}
.monitoring-container {
margin-top: 20px;
}
.metric-card {
height: 250px;
}
.metric-header {
display: flex;
justify-content: center;
font-weight: bold;
}
.metric-value {
font-size: 36px;
font-weight: bold;
text-align: center;
margin: 20px 0;
color: #1E40AF;
}
.metric-chart {
height: 150px;
}
.query-content {
white-space: pre-wrap;
font-family: monospace;
font-size: 12px;
background-color: #f5f7fa;
padding: 10px;
border-radius: 4px;
max-height: 100px;
overflow-y: auto;
}
.explain-dialog {
max-height: 600px;
overflow-y: auto;
}
.index-dialog {
max-height: 600px;
overflow-y: auto;
}
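</style>
4. Network Optimization
4.1 Optimization Techniques
4.1.1 Network Monitoring
bash
# Install network monitoring tools
sudo apt install net-tools iftop nethogs tcpdump wireshark traceroute
# Monitor traffic with iftop (replace eth0 with your interface; list interfaces with `ip link`)
sudo iftop -i eth0
# Monitor per-process network usage with nethogs
sudo nethogs
# Capture traffic on port 80 with tcpdump
sudo tcpdump -i eth0 port 80
# Measure latency with ping (-c 4 stops after four probes)
ping -c 4 google.com
# Trace the route with traceroute
traceroute google.com
# Measure per-hop latency and packet loss with mtr
sudo apt install mtr
sudo mtr google.com
4.1.2 Network Parameter Tuning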
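Before applying kernel parameters like the ones in this section, it can help to see which settings would actually change on the machine. The sketch below is a minimal, Linux-only helper that parses sysctl-style `key = value` text and compares it with the live values under `/proc/sys`. The function names (`parse_sysctl`, `sysctl_path`, `read_live`, `pending_changes`) are invented for this example, and the `read` parameter exists only so the comparison can be exercised without touching the real system.

```python
from pathlib import Path
from typing import Callable, Dict, Optional, Tuple


def parse_sysctl(text: str) -> Dict[str, str]:
    """Parse sysctl.conf-style text into {name: value}, skipping comments and blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Collapse whitespace so multi-value settings compare cleanly
        settings[name.strip()] = " ".join(value.split())
    return settings


def sysctl_path(name: str) -> Path:
    """Map a sysctl name such as net.core.somaxconn to its file under /proc/sys."""
    return Path("/proc/sys") / name.replace(".", "/")


def read_live(name: str) -> Optional[str]:
    """Return the running kernel's value, or None if the parameter does not exist."""
    try:
        return " ".join(sysctl_path(name).read_text().split())
    except OSError:
        return None


def pending_changes(
    wanted: Dict[str, str],
    read: Callable[[str], Optional[str]] = read_live,
) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {name: (current, wanted)} for every setting that differs from the live value."""
    changes = {}
    for name, value in wanted.items():
        current = read(name)
        if current != value:
            changes[name] = (current, value)
    return changes


if __name__ == "__main__":
    wanted = parse_sysctl("net.core.somaxconn = 65535\nnet.ipv4.tcp_fin_timeout = 30\n")
    for name, (current, new) in pending_changes(wanted).items():
        print(f"{name}: {current} -> {new}")
```

A current value of `None` means the running kernel does not expose that parameter, which is a useful warning before `sysctl -p` reports an error for it.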
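bash
# Linux network tuning: add the following settings to /etc/sysctl.conf
# Maximum number of open file handles, system-wide.
# Check the current value first (cat /proc/sys/fs/file-max): many current distributions
# already default to a much larger number, in which case leave this setting out.
fs.file-max = 65535
# Core network parameters
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# TCP parameters
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_tw_reuse = 1
# tcp_tw_recycle was removed in Linux 4.12 (it broke clients behind NAT); on current kernels
# the line below makes `sysctl -p` report an error, so it is commented out
# net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_slow_start_after_idle = 0
# Apply the settings
sudo sysctl -p
4.1.3 CDN Optimization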
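The `proxy_cache_path` directive used in this section's Nginx configuration sets `levels=1:2`, which spreads cached responses over a two-level directory tree derived from the MD5 of the cache key. The sketch below reproduces that mapping so a cached response can be located on disk when debugging. It assumes the default `proxy_cache_key` (`$scheme$proxy_host$request_uri`); the helper names are invented for illustration.

```python
import hashlib
from typing import Sequence


def cache_path_for_hash(
    md5_hex: str,
    cache_dir: str = "/var/cache/nginx",
    levels: Sequence[int] = (1, 2),
) -> str:
    """Build the on-disk path of a cache entry from the MD5 hex digest of its key.

    Directory names are taken from the end of the digest: with levels=(1, 2)
    the first directory is the last character and the second directory is
    the two characters before it.
    """
    parts = []
    end = len(md5_hex)
    for width in levels:
        parts.append(md5_hex[end - width:end])
        end -= width
    return "/".join([cache_dir.rstrip("/"), *parts, md5_hex])


def cache_path_for_request(scheme: str, proxy_host: str, request_uri: str, **kwargs) -> str:
    """Same as above, starting from the parts of the default cache key."""
    key = f"{scheme}{proxy_host}{request_uri}"
    return cache_path_for_hash(hashlib.md5(key.encode()).hexdigest(), **kwargs)


if __name__ == "__main__":
    # With `proxy_pass http://backend;` the value of $proxy_host is "backend"
    print(cache_path_for_request("http", "backend", "/index.html"))
```

If the configuration sets a custom `proxy_cache_key`, the key string has to be built the same way before hashing.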
bash
# Using Cloudflare as the CDN
# 1. Log in to the Cloudflare dashboard
# 2. Add the domain
# 3. Configure the DNS records
# 4. Enable the CDN feature
# Nginx configuration for the origin behind the CDN
# /etc/nginx/nginx.conf
http {
    # Gzip compression
    gzip on;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
    gzip_comp_level 6;
    gzip_buffers 16 8k;
    gzip_min_length 256;
    gzip_vary on;
    # Proxy cache storage
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m max_size=10g inactive=60m use_temp_path=off;
    server {
        listen 80;
        server_name example.com;
        # Proxied content, cached at the origin
        location / {
            # "backend" must be defined as an upstream block (see 4.1.4)
            proxy_pass http://backend;
            proxy_cache my_cache;
            proxy_cache_valid 200 302 60m;
            proxy_cache_valid 404 1m;
            proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
            proxy_cache_background_update on;
            proxy_cache_lock on;
            # Cache control: expose the cache status (HIT, MISS, ...) for debugging
            add_header X-Proxy-Cache $upstream_cache_status;
            expires 7d;
        }
        # Static files
        location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
            root /var/www/html;
            expires 30d;
            add_header Cache-Control "public, immutable";
        }
    }
}
4.1.4 Load Balancing
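With weighted load balancing, Nginx does not send five requests in a row to the heaviest server; it interleaves the servers so that load stays smooth. The sketch below simulates smooth weighted round-robin selection in plain Python, using the 5/3/2 weights from the commented-out weighted option in this section's configuration. It illustrates the algorithm only: it is not Nginx's code, and it ignores failed servers and runtime weight adjustment.

```python
from collections import Counter
from typing import Dict, List


def smooth_weighted_round_robin(weights: Dict[str, int], requests: int) -> List[str]:
    """Return the server chosen for each of `requests` consecutive requests."""
    current = {server: 0 for server in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(requests):
        # Every server gains its own weight on each round...
        for server, weight in weights.items():
            current[server] += weight
        # ...the server with the highest running score is chosen (first one wins a tie)...
        chosen = max(current, key=current.get)
        # ...and pays back the total weight, which pushes it behind the others.
        current[chosen] -= total
        picks.append(chosen)
    return picks


if __name__ == "__main__":
    weights = {
        "192.168.1.10:8080": 5,
        "192.168.1.11:8080": 3,
        "192.168.1.12:8080": 2,
    }
    picks = smooth_weighted_round_robin(weights, 10)
    print(picks)
    print(Counter(picks))
```

Over one full cycle of ten requests the three servers receive five, three, and two requests respectively, spread across the cycle rather than delivered in bursts.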
bash
# Nginx load-balancing configuration
# /etc/nginx/nginx.conf
http {
    upstream backend {
        # Round robin (default)
        server 192.168.1.10:8080;
        server 192.168.1.11:8080;
        server 192.168.1.12:8080;
        # Weighted (use these lines instead of the three above)
        # server 192.168.1.10:8080 weight=5;
        # server 192.168.1.11:8080 weight=3;
        # server 192.168.1.12:8080 weight=2;
        # IP hash (requests from one client IP go to the same server)
        # ip_hash;
        # Least connections
        # least_conn;
    }
    server {
        listen 80;
        server_name example.com;
        location / {
            proxy_pass http://backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
}
4.2 Network Optimization System Design
4.2.1 Architecture
- Frontend: Vue.js + Element Plus + ECharts
- Backend: Python + FastAPI
- Monitoring: Prometheus + Grafana
- Storage: InfluxDB
4.2.2 Backend Implementation
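One detail of the metrics endpoint deserves care: a label matcher such as `{device="eth0"}` can only be appended to a bare metric name, not to an expression like `rate(...)`, and the label value has to be escaped. A minimal sketch of a helper that enforces both; `with_label` is a hypothetical name for this example, not part of any Prometheus client library.

```python
import re

# Character rules for metric and label names in the Prometheus data model
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def with_label(metric: str, label: str, value: str) -> str:
    """Return `metric{label="value"}` with the value escaped for a PromQL string."""
    if not METRIC_NAME_RE.fullmatch(metric):
        raise ValueError(f"not a bare metric name: {metric!r}")
    if not LABEL_NAME_RE.fullmatch(label):
        raise ValueError(f"not a valid label name: {label!r}")
    escaped = (
        value.replace("\\", "\\\\")  # backslashes first, so later escapes are not doubled
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'{metric}{{{label}="{escaped}"}}'


if __name__ == "__main__":
    selector = with_label("node_network_receive_bytes_total", "device", "eth0")
    # Wrap the selector in functions afterwards, never the other way round
    print(f"rate({selector}[1m])")
```

Building the selector first and wrapping it in `rate(...)` afterwards keeps the query valid for per-interface traffic rates.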
python
# Network monitoring API
# `app` is assumed to be the FastAPI instance created elsewhere in the backend.
# The handlers are plain `def` functions so that FastAPI runs the blocking
# requests/subprocess calls in its thread pool instead of on the event loop.
import re
import subprocess
from typing import Optional

import requests
from fastapi import HTTPException
from pydantic import BaseModel

PROMETHEUS_URL = "http://localhost:9090/api/v1"
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
INTERFACE_RE = re.compile(r"[A-Za-z0-9_.:@-]+")
# Plain hostnames and IP addresses only; must not start with "-"
TARGET_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9.:-]*")


@app.get("/network/metrics")
def get_network_metrics(
    interface: Optional[str] = None,
    metric_name: Optional[str] = None,
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
    step: str = "1m",
):
    query = metric_name or "node_network_receive_bytes_total"
    if interface:
        # A label matcher can only be appended to a bare metric name, not to an expression
        if not METRIC_NAME_RE.fullmatch(query) or not INTERFACE_RE.fullmatch(interface):
            raise HTTPException(status_code=400, detail="interface filter requires a plain metric name and a valid interface")
        query += f'{{device="{interface}"}}'
    if start_time and end_time:
        # Range query
        url = f"{PROMETHEUS_URL}/query_range"
        params = {"query": query, "start": start_time, "end": end_time, "step": step}
    else:
        # Instant query
        url = f"{PROMETHEUS_URL}/query"
        params = {"query": query}
    try:
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        return response.json()
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to get metrics: {str(e)}")


# List the network interfaces known to Prometheus
@app.get("/network/interfaces")
def get_network_interfaces():
    params = {"query": "count by(device) (node_network_receive_bytes_total)"}
    try:
        response = requests.get(f"{PROMETHEUS_URL}/query", params=params, timeout=10)
        response.raise_for_status()
        return [result["metric"]["device"] for result in response.json()["data"]["result"]]
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to get interfaces: {str(e)}")


# Request body for the network test; the frontend sends these fields as JSON
class NetworkTestRequest(BaseModel):
    target: str
    type: str = "ping"


# Command line and timeout (seconds) for each supported test type
TEST_COMMANDS = {
    "ping": (["ping", "-c", "5"], 10),
    "traceroute": (["traceroute"], 30),
}


# Network quality test
@app.post("/network/test")
def test_network(req: NetworkTestRequest):
    if req.type not in TEST_COMMANDS:
        raise HTTPException(status_code=400, detail="Invalid test type")
    # Reject anything that could be interpreted as a command-line option
    if not TARGET_RE.fullmatch(req.target):
        raise HTTPException(status_code=400, detail="Invalid target")
    command, timeout = TEST_COMMANDS[req.type]
    try:
        result = subprocess.run(
            command + [req.target],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return {
            "type": req.type,
            "target": req.target,
            "stdout": result.stdout,
            "stderr": result.stderr,
            "returncode": result.returncode,
        }
    except Exception as e:
        return {"type": req.type, "target": req.target, "error": str(e)}
4.2.3 Frontend Implementation
vue
<template>
<div class="network-optimization">
<el-card>
<template #header>
<div class="card-header">
<span>Network Optimization</span>
</div>
</template>
<el-tabs v-model="activeTab">
<el-tab-pane label="Network Monitoring" name="monitoring">
<div class="monitoring-container">
<el-row :gutter="20">
<el-col :span="12">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
<span>Inbound traffic</span>
</div>
</template>
<div class="metric-value">{{ inboundTraffic }}</div>
<div class="metric-chart">
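<!-- el-chart and el-line-chart are placeholder names: Element Plus ships no chart components, so bind these to ECharts (for example through vue-echarts) -->
<el-chart>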
<el-line-chart :data="inboundHistory" />
</el-chart>
</div>
</el-card>
</el-col>
<el-col :span="12">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
<span>Outbound traffic</span>
</div>
</template>
<div class="metric-value">{{ outboundTraffic }}</div>
<div class="metric-chart">
<el-chart>
<el-line-chart :data="outboundHistory" />
</el-chart>
</div>
</el-card>
</el-col>
</el-row>
<el-row :gutter="20" style="margin-top: 20px;">
<el-col :span="24">
<el-card class="metric-card">
<template #header>
<div class="metric-header">
<span>Interface traffic</span>
</div>
</template>
<div class="interface-selector">
<el-select v-model="selectedInterface" placeholder="Select an interface">
<el-option v-for="iface in interfaces" :key="iface" :label="iface" :value="iface" />
</el-select>
</div>
<div class="metric-chart">
<el-chart>
<el-line-chart :data="interfaceHistory" />
</el-chart>
</div>
</el-card>
</el-col>
</el-row>
</div>
</el-tab-pane>
<el-tab-pane label="Network Test" name="test">
<div class="test-container">
<el-form :inline="true" :model="testForm" class="test-form">
<el-form-item label="Target">
<el-input v-model="testForm.target" placeholder="e.g. google.com" />
</el-form-item>
<el-form-item label="Test type">
<el-select v-model="testForm.type">
<el-option label="Ping" value="ping" />
<el-option label="Traceroute" value="traceroute" />
</el-select>
</el-form-item>
<el-form-item>
<el-button type="primary" @click="runTest">Run test</el-button>
</el-form-item>
</el-form>
<div class="test-result">
<el-card>
<template #header>
<div class="result-header">
<span>Test result</span>
</div>
</template>
<pre>{{ testResult }}</pre>
</el-card>
</div>
</div>
</el-tab-pane>
<el-tab-pane label="Parameter Tuning" name="config">
<el-form :model="configForm" label-width="120px">
<el-form-item label="net.core.somaxconn">
<el-input v-model.number="configForm.net_core_somaxconn" type="number" />
</el-form-item>
<el-form-item label="net.core.netdev_max_backlog">
<el-input v-model.number="configForm.net_core_netdev_max_backlog" type="number" />
</el-form-item>
<el-form-item label="net.ipv4.tcp_max_syn_backlog">
<el-input v-model.number="configForm.net_ipv4_tcp_max_syn_backlog" type="number" />
</el-form-item>
<el-form-item label="net.ipv4.tcp_fin_timeout">
<el-input v-model.number="configForm.net_ipv4_tcp_fin_timeout" type="number" />
</el-form-item>
<el-form-item label="net.ipv4.tcp_keepalive_time">
<el-input v-model.number="configForm.net_ipv4_tcp_keepalive_time" type="number" />
</el-form-item>
<el-form-item>
<el-button type="primary" @click="saveConfig">Save configuration</el-button>
<el-button @click="loadConfig">Load current configuration</el-button>
</el-form-item>
</el-form>
</el-tab-pane>
</el-tabs>
</el-card>
</div>
</template>
<script setup>
import { ref, onMounted, onUnmounted } from 'vue'
import { ElMessage } from 'element-plus'
import axios from 'axios'

const activeTab = ref('monitoring')
const inboundTraffic = ref('0 MB/s')
const outboundTraffic = ref('0 MB/s')
const inboundHistory = ref([])
const outboundHistory = ref([])
const interfaces = ref([])
const selectedInterface = ref('')
const interfaceHistory = ref([])
const testForm = ref({
  target: 'google.com',
  type: 'ping'
})
const testResult = ref('')
const configForm = ref({
  net_core_somaxconn: 65535,
  net_core_netdev_max_backlog: 65535,
  net_ipv4_tcp_max_syn_backlog: 65535,
  net_ipv4_tcp_fin_timeout: 30,
  net_ipv4_tcp_keepalive_time: 1200
})
let refreshTimer = null

// Format a bytes-per-second value as MB/s
const toMBps = (bytesPerSecond) => `${(bytesPerSecond / (1024 * 1024)).toFixed(2)} MB/s`

// Run one instant PromQL query through the backend and return the first sample as a number.
// Prometheus returns sample values as strings; an empty or non-finite result is reported as 0.
const queryScalar = async (expr) => {
  const response = await axios.get('/api/network/metrics', {
    params: { metric_name: expr }
  })
  const sample = response.data?.data?.result?.[0]
  const value = sample ? Number(sample.value[1]) : 0
  return Number.isFinite(value) ? value : 0
}

// Load the network metrics
const getNetworkMetrics = async () => {
  try {
    // sum() adds up all interfaces; without it only one arbitrary interface would be shown
    inboundTraffic.value = toMBps(await queryScalar('sum(rate(node_network_receive_bytes_total[1m]))'))
    outboundTraffic.value = toMBps(await queryScalar('sum(rate(node_network_transmit_bytes_total[1m]))'))
  } catch (error) {
    ElMessage.error('Failed to load network metrics')
    console.error(error)
  }
}

// Load the list of network interfaces
const getInterfaces = async () => {
  try {
    const response = await axios.get('/api/network/interfaces')
    interfaces.value = response.data
    if (interfaces.value.length > 0) {
      selectedInterface.value = interfaces.value[0]
    }
  } catch (error) {
    ElMessage.error('Failed to load network interfaces')
    console.error(error)
  }
}

// Run a network test
const runTest = async () => {
  try {
    const response = await axios.post('/api/network/test', {
      target: testForm.value.target,
      type: testForm.value.type
    })
    if (response.data.stdout) {
      testResult.value = response.data.stdout
    } else if (response.data.error) {
      testResult.value = `Error: ${response.data.error}`
    } else if (response.data.stderr) {
      // The command ran but produced only error output (for example an unknown host)
      testResult.value = response.data.stderr
    }
  } catch (error) {
    ElMessage.error('Failed to run the network test')
    console.error(error)
  }
}

// Save the configuration
const saveConfig = async () => {
  try {
    await axios.post('/api/network/config', configForm.value)
    ElMessage.success('Configuration saved')
  } catch (error) {
    ElMessage.error('Failed to save the configuration')
    console.error(error)
  }
}

// Load the current configuration
const loadConfig = async () => {
  try {
    const response = await axios.get('/api/network/config')
    configForm.value = response.data
    ElMessage.success('Configuration loaded')
  } catch (error) {
    ElMessage.error('Failed to load the configuration')
    console.error(error)
  }
}

// Initial load
onMounted(() => {
  getNetworkMetrics()
  getInterfaces()
  loadConfig()
  // Refresh the metrics every 30 seconds
  refreshTimer = setInterval(() => {
    getNetworkMetrics()
  }, 30000)
})

// Stop polling when the component is destroyed
onUnmounted(() => {
  clearInterval(refreshTimer)
})
</script>
<style scoped>
.network-optimization {
padding: 20px;
}
.card-header {
display: flex;
justify-content: space-between;
align-items: center;
}
.monitoring-container {
margin-top: 20px;
}
.metric-card {
height: 250px;
}
.metric-header {
display: flex;
justify-content: center;
font-weight: bold;
}
.metric-value {
font-size: 24px;
font-weight: bold;
text-align: center;
margin: 20px 0;
color: #1E40AF;
}
.metric-chart {
height: 150px;
}
.interface-selector {
margin-bottom: 10px;
display: flex;
justify-content: center;
}
.test-container {
margin-top: 20px;
}
.test-form {
margin-bottom: 20px;
padding: 15px;
background-color: #f5f7fa;
border-radius: 8px;
}
.result-header {
display: flex;
justify-content: center;
font-weight: bold;
}
.test-result pre {
background-color: #f5f7fa;
padding: 15px;
border-radius: 8px;
max-height: 400px;
overflow-y: auto;
font-family: monospace;
font-size: 14px;
}
</style>
5. Platform Integration and Deployment
5.1 Platform Integration
5.1.1 API Integration
python
# API gateway for the performance optimization platform
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from routers import monitoring, performance, database, network

app = FastAPI()

# Configure CORS.
# With allow_credentials=True the allowed origins must be listed explicitly:
# a wildcard origin is not permitted for credentialed requests.
# The origin below is a placeholder for the frontend's real address.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Register the routers
app.include_router(monitoring.router, prefix="/api/monitoring", tags=["monitoring"])
app.include_router(performance.router, prefix="/api/performance", tags=["performance"])
app.include_router(database.router, prefix="/api/database", tags=["database"])
app.include_router(network.router, prefix="/api/network", tags=["network"])


# Health check
@app.get("/health")
async def health_check():
    return {"status": "healthy"}
5.1.2 Frontend Integration
javascript
// api.js
import axios from 'axios'

const api = axios.create({
  baseURL: '/api',
  timeout: 10000,
  headers: {
    'Content-Type': 'application/json'
  }
})

// Response interceptor: log every failed request, then pass the error on to the caller
api.interceptors.response.use(
  response => response,
  error => {
    console.error('API Error:', error)
    return Promise.reject(error)
  }
)

export default api

// main.js: entry point of the performance optimization platform
import { createApp } from 'vue'
import App from './App.vue'
import router from './router'
import ElementPlus from 'element-plus'
import 'element-plus/dist/index.css'

const app = createApp(App)
app.use(ElementPlus)
app.use(router)
app.mount('#app')
5.2 Deployment
5.2.1 Docker Deployment
yaml
# docker-compose.yml
version: '3'
services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    depends_on:
      - prometheus
      - grafana
      - influxdb
  frontend:
    build: ./frontend
    ports:
      - "80:80"
    depends_on:
      - backend
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
  influxdb:
    image: influxdb:2.0
    ports:
      - "8086:8086"
    environment:
      - DOCKER_INFLUXDB_INIT_MODE=setup
      - DOCKER_INFLUXDB_INIT_USERNAME=admin
      - DOCKER_INFLUXDB_INIT_PASSWORD=password
      - DOCKER_INFLUXDB_INIT_ORG=myorg
      - DOCKER_INFLUXDB_INIT_BUCKET=metrics
  node_exporter:
    image: prom/node-exporter
    ports:
      - "9100:9100"
  blackbox_exporter:
    image: prom/blackbox-exporter
    volumes:
      - ./blackbox.yml:/etc/blackbox_exporter/config.yml
    ports:
      - "9115:9115"
5.2.2 Kubernetes Deployment
yaml
# performance-platform-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: performance-platform-backend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: performance-platform-backend
  template:
    metadata:
      labels:
        app: performance-platform-backend
    spec:
      containers:
        - name: backend
          image: performance-platform-backend:latest
          ports:
            - containerPort: 8000
          env:
            - name: PROMETHEUS_URL
              value: "http://prometheus:9090"
            - name: GRAFANA_URL
              value: "http://grafana:3000"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: performance-platform-frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: performance-platform-frontend
  template:
    metadata:
      labels:
        app: performance-platform-frontend
    spec:
      containers:
        - name: frontend
          image: performance-platform-frontend:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: performance-platform-backend
spec:
  selector:
    app: performance-platform-backend
  ports:
    - port: 8000
      targetPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: performance-platform-frontend
spec:
  selector:
    app: performance-platform-frontend
  ports:
    - port: 80
      targetPort: 80
  type: LoadBalancer
6. Best Practices
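6.1 Best Practices for System Performance Monitoring
- Full coverage: monitor every layer of the system, including CPU, memory, disk, and network
- Real-time monitoring: watch system state continuously so that anomalies are caught early
- Threshold alerting: set sensible thresholds and alert as soon as a metric crosses one
- Historical analysis: analyze stored monitoring data to identify performance trends
- Visualization: show system state on dashboards so it can be read at a glance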
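Threshold alerting works best when a single spike does not trigger an alert. The sketch below is a minimal rule that fires only after a metric stays above its threshold for several consecutive samples, similar in spirit to the `for:` clause of a Prometheus alerting rule. The class name, the metric name, and the sample values are illustrative.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    name: str
    threshold: float
    for_samples: int = 3  # consecutive breaches required before the alert fires
    _breaches: int = 0    # internal counter of consecutive breaches so far

    def observe(self, value: float) -> bool:
        """Feed one sample; return True while the alert is firing."""
        if value > self.threshold:
            self._breaches += 1
        else:
            # Any sample back under the threshold resets the streak
            self._breaches = 0
        return self._breaches >= self.for_samples


if __name__ == "__main__":
    rule = ThresholdRule("cpu_usage_percent", threshold=90, for_samples=3)
    for sample in [95, 97, 50, 91, 92, 93, 94, 10]:
        state = "FIRING" if rule.observe(sample) else "ok"
        print(f"{rule.name}={sample}: {state}")
```

In this run the two-sample spike at the start never fires; the alert starts on the third consecutive breach (93) and clears as soon as the value drops back to 10.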
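6.2 Best Practices for Application Performance Analysis
- End-to-end monitoring: follow each request from the user through to the backend services
- Distributed tracing: trace how a request moves between services
- Bottleneck analysis: use the collected data to locate the application's performance bottlenecks
- Code-level analysis: profile critical code paths to find what to optimize
- Continuous optimization: make performance work an ongoing, iterative process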
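For code-level analysis in a Python backend, the standard library's `cProfile` is often enough to find the hot function before reaching for heavier tooling. A minimal sketch follows; `slow_sum` and `handle_request` are stand-ins for real application code, and `profile` is a small helper written for this example.

```python
import cProfile
import io
import pstats


def slow_sum(n: int) -> int:
    """Deliberately naive loop that should dominate the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def handle_request() -> list:
    """Stand-in for a request handler that calls the slow function repeatedly."""
    return [slow_sum(20_000) for _ in range(20)]


def profile(func, top: int = 5) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(profile(handle_request))
```

Sorting by cumulative time puts the functions that account for the most wall-clock time, including their callees, at the top of the report.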
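6.3 Best Practices for Database Optimization
- Index optimization: create and use indexes deliberately to speed up queries
- Query optimization: rewrite SQL statements to reduce query time
- Schema optimization: design tables sensibly and choose appropriate data types
- Partitioning and sharding: split large tables to improve query performance
- Configuration tuning: adjust the database configuration to the actual workload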
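The effect of an index is easiest to see in the query plan. The sketch below uses Python's built-in `sqlite3` purely as a stand-in so that it runs without a database server; on MySQL the same experiment would use `EXPLAIN`. The table, index, and helper names are invented for the example.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params=()) -> str:
    """Return SQLite's plan for a query as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each row holds the human-readable plan step
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(10_000)],
)

sql = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index the filter requires a full table scan
before = query_plan(conn, sql, (42,))

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index the same query becomes an index search
after = query_plan(conn, sql, (42,))

print("before:", before)
print("after: ", after)
```

The exact wording of the plan differs between SQLite versions, but the first plan reports a scan of `orders` and the second names `idx_orders_customer`.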
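6.4 Best Practices for Network Optimization
- Network monitoring: watch network state in real time so that problems are found early
- Parameter tuning: adjust network parameters to the actual workload
- CDN acceleration: serve static resources through a CDN
- Load balancing: spread network traffic across servers
- Connection optimization: reduce the time spent establishing connections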
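Connection reuse is one of the cheapest connection optimizations: every new TCP connection costs at least one extra round trip, and more with TLS. The sketch below starts a throwaway HTTP server on the loopback interface and counts how many TCP connections it accepts for five requests, first with a new connection per request and then with one persistent connection. It uses only the Python standard library; the handler and function names are illustrative.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_handler(counter: dict):
    class Handler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # needed for keep-alive responses

        def setup(self):
            # setup() runs once per accepted TCP connection
            super().setup()
            with counter["lock"]:
                counter["connections"] += 1

        def do_GET(self):
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the demo output quiet

    return Handler


def fetch_new_connection_each_time(port: int, n: int) -> None:
    for _ in range(n):
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        conn.getresponse().read()
        conn.close()


def fetch_keep_alive(port: int, n: int) -> None:
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    for _ in range(n):
        conn.request("GET", "/")
        conn.getresponse().read()  # read fully so the connection can be reused
    conn.close()


def count_connections(fetch, n: int = 5) -> int:
    """Serve on a free loopback port, run fetch, and return the accepted connection count."""
    counter = {"connections": 0, "lock": threading.Lock()}
    server = ThreadingHTTPServer(("127.0.0.1", 0), make_handler(counter))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        fetch(server.server_address[1], n)
    finally:
        server.shutdown()
        server.server_close()
    return counter["connections"]


if __name__ == "__main__":
    print("new connection per request:", count_connections(fetch_new_connection_each_time))
    print("persistent connection:     ", count_connections(fetch_keep_alive))
```

The same idea applies to upstream connections in a proxy and to database connection pools: the handshake is paid once and amortized over many requests.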
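6.5 Best Practices for the Performance Optimization Platform
- Integrated platform: bring the individual tools and techniques together in one platform
- Automation: automate performance monitoring, analysis, and optimization
- Intelligence: use AI techniques for alerting, analysis, and optimization
- Extensibility: design the platform so that new metrics and analysis methods can be added
- Usability: keep the interface friendly and the operations simple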
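7. Summary
7.1 Course Content
This course covered the design and implementation of a performance optimization platform:
- System performance monitoring: monitoring techniques, monitoring system design, frontend implementation
- Application performance analysis: analysis techniques, analysis system design, frontend implementation
- Database optimization: optimization techniques, database optimization system design, frontend implementation
- Network optimization: optimization techniques, network optimization system design, frontend implementation
- Platform integration and deployment: API integration, frontend integration, Docker deployment, Kubernetes deployment
- Best practices: system performance monitoring, application performance analysis, database optimization, network optimization, and the platform itself
7.2 Technology Stack
- Frontend: Vue.js + Element Plus + ECharts
- Backend: Python + FastAPI
- Monitoring: Prometheus + Grafana + Node Exporter
- Storage: InfluxDB + Elasticsearch
- Containers: Docker + Kubernetes
- Network: Nginx + CDN
- Databases: MySQL + PostgreSQL
7.3 Learning Outcomes
After completing this course, learners should be able to:
- Design and implement a performance optimization platform end to end
- Use the main performance monitoring techniques and tools
- Apply optimization methods for system, application, database, and network performance
- Build an enterprise-grade performance optimization platform
- Apply performance optimization best practices and industry standards
Performance optimization is an important part of modern enterprise IT operations. With the material in this course, learners can build more efficient and stable IT systems and improve the performance and reliability of business applications.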