
198-Performance Optimization Platform in Practice

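Course Objectives

  • Design and implement a performance optimization platform
  • Become familiar with system performance monitoring techniques and tools
  • Build an application performance analysis system
  • Master database optimization techniques
  • Master network optimization techniques
  • Develop the front end and back end of the performance optimization platform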

1. System Performance Monitoring

1.1 Monitoring Techniques

1.1.1 Prometheus + Grafana

bash
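# Install Prometheus
sudo apt install prometheus

# Install Grafana (on Debian/Ubuntu this typically requires adding the Grafana APT repository first)
sudo apt install grafana

# Start the services and enable them at boot
sudo systemctl start prometheus
sudo systemctl start grafana-server
sudo systemctl enable prometheus
sudo systemctl enable grafana-server

# Open Grafana in a browser
# http://localhost:3000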

1.1.2 Node Exporter

bash
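# Install Node Exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
tar xvfz node_exporter-1.3.1.linux-amd64.tar.gz
cd node_exporter-1.3.1.linux-amd64
sudo cp node_exporter /usr/local/bin/

# Create the dedicated user referenced by the unit file below
sudo useradd --system --no-create-home --shell /usr/sbin/nologin node_exporter

# Create the systemd service
sudo nano /etc/systemd/system/node_exporter.service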
ini
[Unit]
Description=Node Exporter
After=network.target

[Service]
Type=simple
User=node_exporter
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
bash
# Start the service and enable it at boot
sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter
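
Once Node Exporter is running it serves metrics in the Prometheus text exposition format (by default on port 9100 at /metrics). The sketch below shows one way to read that format and derive a memory usage percentage from it. The `parse_metrics` helper and the `SAMPLE` payload are illustrative assumptions, not part of the course code, and the parser assumes lines carry no trailing timestamp.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}.

    Assumption: each sample line is "<series> <value>" with no timestamp.
    """
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comments
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces survive
        series, _, value = line.rpartition(" ")
        try:
            samples[series] = float(value)
        except ValueError:
            continue
    return samples


# Hypothetical excerpt of what /metrics might return
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.52
node_memory_MemTotal_bytes 8.0e+09
node_memory_MemAvailable_bytes 2.0e+09
"""

metrics = parse_metrics(SAMPLE)
total = metrics["node_memory_MemTotal_bytes"]
available = metrics["node_memory_MemAvailable_bytes"]
memory_used_percent = (total - available) / total * 100
print(f"memory used: {memory_used_percent:.1f}%")
```

In practice Prometheus scrapes and stores these samples itself; parsing by hand is mainly useful for a quick check that the exporter is reporting what you expect.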

1.1.3 Blackbox Exporter

bash
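# Install Blackbox Exporter
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.19.0/blackbox_exporter-0.19.0.linux-amd64.tar.gz
tar xvfz blackbox_exporter-0.19.0.linux-amd64.tar.gz
cd blackbox_exporter-0.19.0.linux-amd64
sudo cp blackbox_exporter /usr/local/bin/

# Create the dedicated user referenced by the unit file below
sudo useradd --system --no-create-home --shell /usr/sbin/nologin blackbox_exporter

# Create the configuration file
sudo mkdir -p /etc/blackbox_exporter
sudo nano /etc/blackbox_exporter/config.yml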
yaml
modules:
  http_2xx:
    prober: http
    timeout: 5s
    http:
      valid_status_codes: [200]
      method: GET
  icmp:
    prober: icmp
    timeout: 5s
    icmp:
      preferred_ip_protocol: ipv4
bash
# Create the systemd service
sudo nano /etc/systemd/system/blackbox_exporter.service
ini
[Unit]
Description=Blackbox Exporter
After=network.target

[Service]
Type=simple
User=blackbox_exporter
ExecStart=/usr/local/bin/blackbox_exporter --config.file=/etc/blackbox_exporter/config.yml

[Install]
WantedBy=multi-user.target
bash
# Start the service and enable it at boot
sudo systemctl daemon-reload
sudo systemctl start blackbox_exporter
sudo systemctl enable blackbox_exporter
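
Blackbox Exporter probes a target on demand through its /probe endpoint, which takes `target` and `module` query parameters (the module names come from config.yml above). The following sketch builds such a probe URL. The `build_probe_url` helper is a hypothetical name, and the exporter address assumes the default port 9115 on the local machine.

```python
from urllib.parse import urlencode


def build_probe_url(
    target: str,
    module: str = "http_2xx",
    exporter: str = "http://localhost:9115",  # assumed default address
) -> str:
    """Return the Blackbox Exporter probe URL for one target and module."""
    query = urlencode({"target": target, "module": module})
    return f"{exporter}/probe?{query}"


http_probe = build_probe_url("https://example.com")
icmp_probe = build_probe_url("192.0.2.10", module="icmp")
print(http_probe)
print(icmp_probe)
```

Opening one of these URLs in a browser or with curl is a quick way to confirm that a module behaves as intended before wiring it into a Prometheus scrape job.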

1.2 Monitoring System Design

1.2.1 Architecture

  • Front end: Vue.js + Element Plus + ECharts
  • Back end: Python + FastAPI
  • Monitoring: Prometheus + Grafana
  • Storage: InfluxDB
  • Alerting: Alertmanager

1.2.2 Back-End Implementation

python
from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session
from models import Monitor, Alert
from schemas import MonitorCreate, MonitorUpdate, MonitorResponse, AlertResponse
from database import get_db
import requests

app = FastAPI()

# Fetch monitoring metrics by proxying a PromQL query to Prometheus
@app.get("/monitoring/metrics")
async def get_metrics(
    metric_name: str = None,
    start_time: str = None,
    end_time: str = None,
    step: str = "1m"
):
    # Base URL of the Prometheus HTTP API
    prometheus_url = "http://localhost:9090/api/v1"

    if start_time and end_time:
        # Range query
        url = f"{prometheus_url}/query_range"
        params = {
            "query": metric_name or "up",
            "start": start_time,
            "end": end_time,
            "step": step
        }
    else:
        # Instant query
        url = f"{prometheus_url}/query"
        params = {
            "query": metric_name or "up"
        }

    try:
        # A timeout keeps a hung Prometheus from blocking the request indefinitely
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        return response.json()
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to get metrics: {str(e)}")

# List monitor configurations
@app.get("/monitoring/configs", response_model=list[MonitorResponse])
async def get_monitors(skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):
    monitors = db.query(Monitor).offset(skip).limit(limit).all()
    return monitors

# Create a monitor configuration
@app.post("/monitoring/configs", response_model=MonitorResponse)
async def create_monitor(monitor: MonitorCreate, db: Session = Depends(get_db)):
    db_monitor = Monitor(**monitor.dict())
    db.add(db_monitor)
    db.commit()
    db.refresh(db_monitor)
    return db_monitor

# List alerts
@app.get("/monitoring/alerts", response_model=list[AlertResponse])
async def get_alerts(skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):
    alerts = db.query(Alert).offset(skip).limit(limit).all()
    return alerts
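
The /monitoring/metrics endpoint returns the Prometheus response unchanged, so callers have to dig the number out of `data.result[0].value[1]` and cope with an empty result. A minimal sketch of that extraction follows. The `first_sample_value` helper and the sample payloads are assumptions for illustration; only the response shape (an instant vector whose samples are `[timestamp, "value"]` pairs) comes from the Prometheus HTTP API.

```python
from typing import Any, Optional


def first_sample_value(payload: dict[str, Any]) -> Optional[float]:
    """Return the first sample of an instant-query response, or None if absent."""
    if payload.get("status") != "success":
        return None
    results = payload.get("data", {}).get("result", [])
    if not results:
        # The query matched no series (for example, the exporter is down)
        return None
    # Each instant-vector sample is [unix_timestamp, "value as a string"]
    _timestamp, raw_value = results[0]["value"]
    return float(raw_value)


# Hypothetical responses for demonstration
ok_payload = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {"instance": "localhost:9100"}, "value": [1700000000.0, "42.5"]}],
    },
}
empty_payload = {"status": "success", "data": {"resultType": "vector", "result": []}}

print(first_sample_value(ok_payload))
print(first_sample_value(empty_payload))
```

Returning None for an empty result lets the dashboard show "no data" rather than raising when a series disappears.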

1.2.3 Front-End Implementation

vue
<template>
  <div class="system-monitoring">
    <el-card>
      <template #header>
        <div class="card-header">
          <span>System Performance Monitoring</span>
          <el-button type="primary" @click="dialogVisible = true">Create Monitor</el-button>
        </div>
      </template>
      
      <el-tabs v-model="activeTab">
        <el-tab-pane label="Dashboard" name="dashboard">
          <div class="dashboard-container">
            <el-row :gutter="20">
              <el-col :span="8">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
                      <span>CPU Usage</span>
                    </div>
                  </template>
                  <div class="metric-value">{{ cpuUsage }}%</div>
                  <div class="metric-chart">
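                    <!-- Note: el-chart and el-line-chart are placeholder names; Element Plus does not ship chart components, so map them to an ECharts wrapper -->
                    <el-chart>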
                      <el-line-chart :data="cpuHistory" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
              <el-col :span="8">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
                      <span>Memory Usage</span>
                    </div>
                  </template>
                  <div class="metric-value">{{ memoryUsage }}%</div>
                  <div class="metric-chart">
                    <el-chart>
                      <el-line-chart :data="memoryHistory" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
              <el-col :span="8">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
                      <span>Disk Usage</span>
                    </div>
                  </template>
                  <div class="metric-value">{{ diskUsage }}%</div>
                  <div class="metric-chart">
                    <el-chart>
                      <el-line-chart :data="diskHistory" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
            </el-row>
            <el-row :gutter="20" style="margin-top: 20px;">
              <el-col :span="12">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
                      <span>Network Traffic</span>
                    </div>
                  </template>
                  <div class="metric-chart">
                    <el-chart>
                      <el-line-chart :data="networkHistory" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
              <el-col :span="12">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
                      <span>System Load</span>
                    </div>
                  </template>
                  <div class="metric-chart">
                    <el-chart>
                      <el-line-chart :data="loadHistory" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
            </el-row>
          </div>
        </el-tab-pane>
        <el-tab-pane label="Monitor Configs" name="configs">
          <el-table :data="monitors" style="width: 100%">
            <el-table-column prop="id" label="ID" width="80" />
            <el-table-column prop="name" label="Name" />
            <el-table-column prop="metric" label="Metric" />
            <el-table-column prop="threshold" label="Threshold" width="120" />
            <el-table-column prop="status" label="Status" width="100">
              <template #default="{ row }">
                <el-tag :type="getStatusType(row.status)">{{ row.status }}</el-tag>
              </template>
            </el-table-column>
            <el-table-column label="Actions" width="150">
              <template #default="{ row }">
                <el-button size="small" @click="editMonitor(row)">Edit</el-button>
                <el-button size="small" type="danger" @click="deleteMonitor(row.id)">Delete</el-button>
              </template>
            </el-table-column>
          </el-table>
        </el-tab-pane>
        <el-tab-pane label="Alerts" name="alerts">
          <el-table :data="alerts" style="width: 100%">
            <el-table-column prop="id" label="ID" width="80" />
            <el-table-column prop="monitor_name" label="Monitor Name" />
            <el-table-column prop="metric" label="Metric" />
            <el-table-column prop="value" label="Value" width="100" />
            <el-table-column prop="threshold" label="Threshold" width="100" />
            <el-table-column prop="status" label="Status" width="100">
              <template #default="{ row }">
                <el-tag :type="getStatusType(row.status)">{{ row.status }}</el-tag>
              </template>
            </el-table-column>
            <el-table-column prop="created_at" label="Created At" width="180" />
          </el-table>
        </el-tab-pane>
      </el-tabs>
    </el-card>
    
    <!-- Create-monitor dialog -->
    <el-dialog v-model="dialogVisible" title="Create Monitor">
      <el-form :model="form" label-width="120px">
        <el-form-item label="Name">
          <el-input v-model="form.name" />
        </el-form-item>
        <el-form-item label="Metric">
          <el-input v-model="form.metric" placeholder="e.g. node_cpu_seconds_total" />
        </el-form-item>
        <el-form-item label="Threshold">
          <el-input v-model.number="form.threshold" type="number" />
        </el-form-item>
        <el-form-item label="Severity">
          <el-select v-model="form.severity">
            <el-option label="Info" value="info" />
            <el-option label="Warning" value="warning" />
            <el-option label="Critical" value="critical" />
          </el-select>
        </el-form-item>
      </el-form>
      <template #footer>
        <span class="dialog-footer">
          <el-button @click="dialogVisible = false">Cancel</el-button>
          <el-button type="primary" @click="createMonitor">Create</el-button>
        </span>
      </template>
    </el-dialog>
  </div>
</template>

<script setup>
import { ref, onMounted } from 'vue'
import { ElMessage } from 'element-plus'
import axios from 'axios'

const activeTab = ref('dashboard')
const cpuUsage = ref(0)
const memoryUsage = ref(0)
const diskUsage = ref(0)
const cpuHistory = ref([])
const memoryHistory = ref([])
const diskHistory = ref([])
const networkHistory = ref([])
const loadHistory = ref([])
const monitors = ref([])
const alerts = ref([])
const dialogVisible = ref(false)
const form = ref({
  name: '',
  metric: '',
  threshold: 0,
  severity: 'warning'
})

// Fetch the current system metrics
const getSystemMetrics = async () => {
  try {
    // CPU usage
    const cpuResponse = await axios.get('/api/monitoring/metrics', {
      params: { metric_name: '100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100' }
    })
    cpuUsage.value = Math.round(cpuResponse.data.data.result[0].value[1])
    
    // Memory usage
    const memoryResponse = await axios.get('/api/monitoring/metrics', {
      params: { metric_name: '(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100' }
    })
    memoryUsage.value = Math.round(memoryResponse.data.data.result[0].value[1])
    
    // Disk usage of the root filesystem
    const diskResponse = await axios.get('/api/monitoring/metrics', {
      params: { metric_name: '(node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_free_bytes{mountpoint="/"}) / node_filesystem_size_bytes{mountpoint="/"} * 100' }
    })
    diskUsage.value = Math.round(diskResponse.data.data.result[0].value[1])
  } catch (error) {
    ElMessage.error('Failed to fetch system metrics')
    console.error(error)
  }
}

// Fetch historical data
const getHistoryData = async () => {
  try {
    // CPU history for the past hour
    const cpuResponse = await axios.get('/api/monitoring/metrics', {
      params: {
        metric_name: '100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100',
        start_time: new Date(Date.now() - 3600000).toISOString(),
        end_time: new Date().toISOString(),
        step: '1m'
      }
    })
    cpuHistory.value = cpuResponse.data.data.result[0].values.map(item => ({
      time: new Date(item[0] * 1000).toLocaleTimeString(),
      value: Math.round(item[1])
    }))
    
    // Fetch the history of the other metrics in the same way...
  } catch (error) {
    console.error('Failed to fetch historical data:', error)
  }
}

// Fetch monitor configurations
const getMonitors = async () => {
  try {
    const response = await axios.get('/api/monitoring/configs')
    monitors.value = response.data
  } catch (error) {
    ElMessage.error('Failed to fetch monitor configurations')
    console.error(error)
  }
}

// Fetch the alert list
const getAlerts = async () => {
  try {
    const response = await axios.get('/api/monitoring/alerts')
    alerts.value = response.data
  } catch (error) {
    ElMessage.error('Failed to fetch alerts')
    console.error(error)
  }
}

// Create a monitor
const createMonitor = async () => {
  try {
    await axios.post('/api/monitoring/configs', form.value)
    ElMessage.success('Monitor created')
    dialogVisible.value = false
    getMonitors()
  } catch (error) {
    ElMessage.error('Failed to create monitor')
    console.error(error)
  }
}

// Edit a monitor (the dialog still submits through createMonitor, so real edits need an update endpoint)
const editMonitor = (monitor) => {
  form.value = { ...monitor }
  dialogVisible.value = true
}

// Delete a monitor
const deleteMonitor = async (id) => {
  try {
    await axios.delete(`/api/monitoring/configs/${id}`)
    ElMessage.success('Monitor deleted')
    getMonitors()
  } catch (error) {
    ElMessage.error('Failed to delete monitor')
    console.error(error)
  }
}

// Map a status to an Element Plus tag type
const getStatusType = (status) => {
  const typeMap = {
    'ok': 'success',
    'warning': 'warning',
    'critical': 'danger',
    'info': 'info'
  }
  return typeMap[status] || 'info'
}

// Initial load
onMounted(() => {
  getSystemMetrics()
  getHistoryData()
  getMonitors()
  getAlerts()
  
  // Refresh every 30 seconds
  setInterval(() => {
    getSystemMetrics()
    getHistoryData()
    getMonitors()
    getAlerts()
  }, 30000)
})
</script>

<style scoped>
.system-monitoring {
  padding: 20px;
}

.card-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
}

.dashboard-container {
  margin-top: 20px;
}

.metric-card {
  height: 250px;
}

.metric-header {
  display: flex;
  justify-content: center;
  font-weight: bold;
}

.metric-value {
  font-size: 36px;
  font-weight: bold;
  text-align: center;
  margin: 20px 0;
  color: #1E40AF;
}

.metric-chart {
  height: 150px;
}

.dialog-footer {
  width: 100%;
  display: flex;
  justify-content: flex-end;
}
</style>

2. Application Performance Analysis

2.1 Analysis Techniques

2.1.1 OpenTelemetry

bash
# Install the OpenTelemetry Collector
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.58.0/otelcol_0.58.0_linux_amd64.tar.gz
tar xvfz otelcol_0.58.0_linux_amd64.tar.gz
cd otelcol_0.58.0_linux_amd64
sudo cp otelcol /usr/local/bin/

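# Create the dedicated user referenced by the unit file below
sudo useradd --system --no-create-home --shell /usr/sbin/nologin otelcol

# Create the configuration file
sudo mkdir -p /etc/otelcol
sudo nano /etc/otelcol/config.yaml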
yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889
  jaeger:
    endpoint: localhost:14250
    tls:
      insecure: true

processors:
  batch:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
bash
# Create the systemd service
sudo nano /etc/systemd/system/otelcol.service
ini
[Unit]
Description=OpenTelemetry Collector
After=network.target

[Service]
Type=simple
User=otelcol
ExecStart=/usr/local/bin/otelcol --config=/etc/otelcol/config.yaml

[Install]
WantedBy=multi-user.target
bash
# Start the service and enable it at boot
sudo systemctl daemon-reload
sudo systemctl start otelcol
sudo systemctl enable otelcol

2.1.2 Jaeger

bash
# Run Jaeger (all-in-one image) with Docker
docker run -d --name jaeger \
  -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 14250:14250 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.35

# Open the Jaeger UI in a browser
# http://localhost:16686
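
The same port also serves the HTTP query API that the Jaeger UI uses. As far as that (internal, not formally stable) API is concerned, the `start` and `end` parameters of a trace search are expressed in microseconds since the Unix epoch rather than as date strings. The sketch below converts a "YYYY-MM-DD HH:mm:ss" string into that form. The `to_epoch_micros` helper is a hypothetical name, and treating the input as UTC is an assumption.

```python
from datetime import datetime, timezone


def to_epoch_micros(value: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> int:
    """Convert a timestamp string (assumed to be UTC) to epoch microseconds."""
    parsed = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
    # Whole seconds only, so integer arithmetic keeps the result exact
    return int(parsed.timestamp()) * 1_000_000


start_micros = to_epoch_micros("2023-01-01 12:00:00")
end_micros = to_epoch_micros("2023-01-01 13:00:00")
print(start_micros, end_micros)
```

A back end that accepts date strings from a form can apply a conversion like this before forwarding the search to Jaeger.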

2.2 Application Performance Analysis System Design

2.2.1 Architecture

  • Front end: Vue.js + Element Plus + ECharts
  • Back end: Python + FastAPI
  • Monitoring: OpenTelemetry + Jaeger
  • Storage: Elasticsearch

2.2.2 Back-End Implementation

python
# Application performance analysis API
@app.get("/performance/traces")
async def get_traces(
    service_name: str = None,
    operation_name: str = None,
    start_time: str = None,
    end_time: str = None,
    limit: int = 100
):
    # Build the Jaeger query URL
    jaeger_url = "http://localhost:16686/api/traces"
    params = {
        "limit": limit
    }
    
    if service_name:
        params["service"] = service_name
    if operation_name:
        params["operation"] = operation_name
    if start_time:
        params["start"] = start_time
    if end_time:
        params["end"] = end_time
    
    try:
        response = requests.get(jaeger_url, params=params)
        response.raise_for_status()
        return response.json()
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to get traces: {str(e)}")

# List services
@app.get("/performance/services")
async def get_services():
    jaeger_url = "http://localhost:16686/api/services"
    
    try:
        response = requests.get(jaeger_url)
        response.raise_for_status()
        return response.json()
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to get services: {str(e)}")

# List the operations of a service
@app.get("/performance/services/{service_name}/operations")
async def get_operations(service_name: str):
    jaeger_url = f"http://localhost:16686/api/services/{service_name}/operations"
    
    try:
        response = requests.get(jaeger_url)
        response.raise_for_status()
        return response.json()
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to get operations: {str(e)}")

# Performance statistics
@app.get("/performance/stats")
async def get_performance_stats(
    service_name: str = None,
    start_time: str = None,
    end_time: str = None
):
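    # Build the query
    query = {
        "service_name": service_name,
        "start_time": start_time,
        "end_time": end_time
    }

    # Placeholder: real statistics (average response time, error rate, and so on)
    # would be computed here; the values returned below are hard-coded samples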
    
    return {
        "average_response_time": 123.45,
        "p95_response_time": 234.56,
        "p99_response_time": 345.67,
        "error_rate": 0.02,
        "throughput": 1234
    }
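
The statistics endpoint above returns fixed numbers. One way to compute them from raw span durations is sketched below, using the nearest-rank method for percentiles. The `percentile` and `summarize` helpers, their parameters, and the sample data are all illustrative assumptions; the course does not specify how the figures should be derived.

```python
import math


def percentile(sorted_values: list[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("percentile() needs at least one value")
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(durations_ms: list[float], error_count: int, window_seconds: float) -> dict:
    """Build the same fields that /performance/stats returns."""
    ordered = sorted(durations_ms)
    count = len(ordered)
    return {
        "average_response_time": sum(ordered) / count,
        "p95_response_time": percentile(ordered, 95),
        "p99_response_time": percentile(ordered, 99),
        "error_rate": error_count / count,      # fraction of requests, not percent
        "throughput": count / window_seconds,   # requests per second
    }


# Hypothetical data: 100 requests taking 1..100 ms within a 10-second window
stats = summarize([float(ms) for ms in range(1, 101)], error_count=2, window_seconds=10)
print(stats)
```

Note that `error_rate` here is a fraction (0.02 means 2%), so a front end that appends a percent sign should multiply by 100 first.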

2.2.3 Front-End Implementation

vue
<template>
  <div class="application-performance">
    <el-card>
      <template #header>
        <div class="card-header">
          <span>Application Performance Analysis</span>
        </div>
      </template>
      
      <el-form :inline="true" :model="searchForm" class="search-form">
        <el-form-item label="Service">
          <el-select v-model="searchForm.service" placeholder="Select a service">
            <el-option v-for="service in services" :key="service" :label="service" :value="service" />
          </el-select>
        </el-form-item>
        <el-form-item label="Operation">
          <el-select v-model="searchForm.operation" placeholder="Select an operation">
            <el-option v-for="operation in operations" :key="operation" :label="operation" :value="operation" />
          </el-select>
        </el-form-item>
        <el-form-item label="Time Range">
          <el-date-picker
            v-model="searchForm.timeRange"
            type="datetimerange"
            range-separator="to"
            start-placeholder="Start time"
            end-placeholder="End time"
            format="YYYY-MM-DD HH:mm:ss"
            value-format="YYYY-MM-DD HH:mm:ss"
          />
        </el-form-item>
        <el-form-item>
          <el-button type="primary" @click="searchTraces">Search</el-button>
          <el-button @click="resetForm">Reset</el-button>
        </el-form-item>
      </el-form>
      
      <el-tabs v-model="activeTab">
        <el-tab-pane label="Overview" name="overview">
          <div class="overview-container">
            <el-row :gutter="20">
              <el-col :span="6">
                <div class="stat-card">
                  <div class="stat-value">{{ stats.average_response_time }}ms</div>
                  <div class="stat-label">Average Response Time</div>
                </div>
              </el-col>
              <el-col :span="6">
                <div class="stat-card">
                  <div class="stat-value">{{ stats.p95_response_time }}ms</div>
                  <div class="stat-label">P95 Response Time</div>
                </div>
              </el-col>
              <el-col :span="6">
                <div class="stat-card">
                  <div class="stat-value">{{ stats.error_rate }}%</div>
                  <div class="stat-label">Error Rate</div>
                </div>
              </el-col>
              <el-col :span="6">
                <div class="stat-card">
                  <div class="stat-value">{{ stats.throughput }}/s</div>
                  <div class="stat-label">Throughput</div>
                </div>
              </el-col>
            </el-row>
            <el-row :gutter="20" style="margin-top: 20px;">
              <el-col :span="12">
                <el-card class="chart-card">
                  <template #header>
                    <div class="chart-header">
                      <span>Response Time Trend</span>
                    </div>
                  </template>
                  <div class="chart-content">
                    <el-chart>
                      <el-line-chart :data="responseTimeTrend" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
              <el-col :span="12">
                <el-card class="chart-card">
                  <template #header>
                    <div class="chart-header">
                      <span>Error Rate Trend</span>
                    </div>
                  </template>
                  <div class="chart-content">
                    <el-chart>
                      <el-line-chart :data="errorRateTrend" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
            </el-row>
          </div>
        </el-tab-pane>
        <el-tab-pane label="Trace Details" name="traces">
          <el-table :data="traces" style="width: 100%">
            <el-table-column prop="traceID" label="Trace ID" width="300" />
            <el-table-column prop="spans[0].operationName" label="Operation" />
            <el-table-column prop="spans[0].duration" label="Duration (μs)" width="120" />
            <el-table-column prop="spans[0].startTime" label="Start Time" width="180" />
            <el-table-column label="Actions" width="100">
              <template #default="{ row }">
                <el-button size="small" @click="viewTraceDetails(row.traceID)">View</el-button>
              </template>
            </el-table-column>
          </el-table>
          <div class="pagination">
            <el-pagination
              v-model:current-page="currentPage"
              v-model:page-size="pageSize"
              :page-sizes="[10, 20, 50, 100]"
              layout="total, sizes, prev, pager, next, jumper"
              :total="total"
              @size-change="handleSizeChange"
              @current-change="handleCurrentChange"
            />
          </div>
        </el-tab-pane>
        <el-tab-pane label="Service Dependencies" name="dependencies">
          <div class="dependencies-container">
            <el-card class="chart-card">
              <template #header>
                <div class="chart-header">
                  <span>Service Dependency Graph</span>
                </div>
              </template>
              <div class="chart-content">
                <el-chart>
                  <el-graph-chart :data="serviceDependencies" />
                </el-chart>
              </div>
            </el-card>
          </div>
        </el-tab-pane>
      </el-tabs>
    </el-card>
    
    <!-- Trace details dialog -->
    <el-dialog v-model="traceDetailsVisible" title="Trace Details" width="80%">
      <div class="trace-details">
        <el-tree :data="traceDetails" :props="treeProps" />
      </div>
    </el-dialog>
  </div>
</template>

<script setup>
import { ref, onMounted, watch } from 'vue'
import { ElMessage } from 'element-plus'
import axios from 'axios'

const activeTab = ref('overview')
const services = ref([])
const operations = ref([])
const traces = ref([])
const stats = ref({
  average_response_time: 0,
  p95_response_time: 0,
  error_rate: 0,
  throughput: 0
})
const responseTimeTrend = ref([])
const errorRateTrend = ref([])
const serviceDependencies = ref([])
const traceDetails = ref([])
const currentPage = ref(1)
const pageSize = ref(10)
const total = ref(0)
const traceDetailsVisible = ref(false)
const searchForm = ref({
  service: '',
  operation: '',
  timeRange: []
})
const treeProps = {
  children: 'spans',
  label: 'operationName'
}

// Fetch the service list
const getServices = async () => {
  try {
    const response = await axios.get('/api/performance/services')
    services.value = response.data.data
  } catch (error) {
    ElMessage.error('Failed to fetch services')
    console.error(error)
  }
}

// Fetch the operations of the selected service
const getOperations = async () => {
  if (!searchForm.value.service) {
    operations.value = []
    return
  }
  
  try {
    const response = await axios.get(`/api/performance/services/${searchForm.value.service}/operations`)
    operations.value = response.data.data
  } catch (error) {
    ElMessage.error('Failed to fetch operations')
    console.error(error)
  }
}

// Search traces
const searchTraces = async () => {
  try {
    const params = {
      service_name: searchForm.value.service,
      operation_name: searchForm.value.operation,
      start_time: searchForm.value.timeRange[0],
      end_time: searchForm.value.timeRange[1],
      limit: pageSize.value
    }
    const response = await axios.get('/api/performance/traces', { params })
    traces.value = response.data.data
    total.value = response.data.total
  } catch (error) {
    ElMessage.error('Failed to search traces')
    console.error(error)
  }
}

// Fetch performance statistics
const getPerformanceStats = async () => {
  try {
    const params = {
      service_name: searchForm.value.service,
      start_time: searchForm.value.timeRange[0],
      end_time: searchForm.value.timeRange[1]
    }
    const response = await axios.get('/api/performance/stats', { params })
    stats.value = response.data
  } catch (error) {
    ElMessage.error('Failed to fetch performance statistics')
    console.error(error)
  }
}

// View trace details (the back end above does not define this route yet; it has to be added)
const viewTraceDetails = async (traceId) => {
  try {
    const response = await axios.get(`/api/performance/traces/${traceId}`)
    traceDetails.value = response.data
    traceDetailsVisible.value = true
  } catch (error) {
    ElMessage.error('Failed to fetch trace details')
    console.error(error)
  }
}

// Reset the search form
const resetForm = () => {
  searchForm.value = {
    service: '',
    operation: '',
    timeRange: []
  }
  operations.value = []
}

// Pagination handlers
const handleSizeChange = (size) => {
  pageSize.value = size
  searchTraces()
}

const handleCurrentChange = (current) => {
  currentPage.value = current
  searchTraces()
}

// Reload data when the selected service changes
watch(() => searchForm.value.service, () => {
  getOperations()
  getPerformanceStats()
  searchTraces()
})

// Initial load
onMounted(() => {
  getServices()
  getPerformanceStats()
  searchTraces()
})
</script>

<style scoped>
.application-performance {
  padding: 20px;
}

.card-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
}

.search-form {
  margin-bottom: 20px;
  padding: 15px;
  background-color: #f5f7fa;
  border-radius: 8px;
}

.overview-container {
  margin-top: 20px;
}

.stat-card {
  background-color: #f5f7fa;
  border-radius: 8px;
  padding: 20px;
  text-align: center;
  box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
}

.stat-value {
  font-size: 24px;
  font-weight: bold;
  color: #1E40AF;
}

.stat-label {
  font-size: 14px;
  color: #64748B;
  margin-top: 5px;
}

.chart-card {
  margin-top: 20px;
}

.chart-header {
  display: flex;
  justify-content: center;
  font-weight: bold;
}

.chart-content {
  height: 300px;
}

.pagination {
  margin-top: 20px;
  display: flex;
  justify-content: flex-end;
}

.trace-details {
  max-height: 600px;
  overflow-y: auto;
}

.dependencies-container {
  margin-top: 20px;
}
</style>

3. Database Optimization

3.1 Optimization Techniques

3.1.1 Index Optimization

sql
-- Create single-column indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_created_at ON orders(created_at);

-- Create a composite index
CREATE INDEX idx_orders_user_id_created_at ON orders(user_id, created_at);

-- List the indexes on a table
SHOW INDEX FROM users;

-- Check whether a query uses an index
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';

-- Drop an unused index (MySQL requires the table name)
DROP INDEX idx_users_old_index ON users;

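3.1.2 Query Optimization

sql
-- Avoid SELECT *
SELECT id, name, email FROM users WHERE active = 1;

-- Use LIMIT to cap the result set
SELECT id, name, email FROM users ORDER BY created_at DESC LIMIT 10;

-- Avoid wrapping an indexed column in a function in the WHERE clause
-- Bad: YEAR() keeps an index on created_at from being used
SELECT * FROM users WHERE YEAR(created_at) = 2023;
-- Better: a half-open range covers all of 2023, including times on December 31
SELECT * FROM users WHERE created_at >= '2023-01-01' AND created_at < '2024-01-01';

-- Consider a JOIN instead of a subquery
-- Subquery
SELECT * FROM users WHERE id IN (SELECT user_id FROM orders WHERE amount > 1000);
-- JOIN (GROUP BY u.id removes duplicates when a user has several matching orders)
SELECT u.* FROM users u JOIN orders o ON u.id = o.user_id WHERE o.amount > 1000 GROUP BY u.id;

-- Prefer IN over a chain of ORs on the same column
-- Verbose
SELECT * FROM users WHERE status = 'active' OR status = 'pending';
-- More concise (the MySQL optimizer generally handles both forms the same way)
SELECT * FROM users WHERE status IN ('active', 'pending');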
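
The effect of wrapping an indexed column in a function can be observed directly in a query plan. The sketch below uses SQLite from the Python standard library as a stand-in, since it needs no server; this is an assumption made for convenience, and MySQL's EXPLAIN output looks different even though the principle is the same. `strftime('%Y', ...)` plays the role of `YEAR(...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, created_at TEXT)")
conn.execute("CREATE INDEX idx_users_created_at ON users(created_at)")


def plan(sql: str) -> str:
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Function applied to the indexed column: the index cannot be used for lookup
function_plan = plan(
    "SELECT * FROM users WHERE strftime('%Y', created_at) = '2023'"
)

# Half-open range on the bare column: the index can be used
range_plan = plan(
    "SELECT * FROM users "
    "WHERE created_at >= '2023-01-01' AND created_at < '2024-01-01'"
)

print("function:", function_plan)
print("range:   ", range_plan)
```

The range version should report a search using `idx_users_created_at`, while the function version falls back to scanning the table.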

3.1.3 Table Structure Optimization

sql
-- Choose appropriate data types
-- Bad
CREATE TABLE users (
  id INT,
  name VARCHAR(255),
  email VARCHAR(255),
  status VARCHAR(20),
  created_at DATETIME
);
-- Better
CREATE TABLE users (
  id INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(100),
  email VARCHAR(100) UNIQUE,
  status ENUM('active', 'pending', 'inactive'),
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

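-- Partitioned table
-- MySQL requires every unique key (including the primary key) to contain the partitioning column,
-- and does not allow YEAR() on a TIMESTAMP column in a partitioning expression, so created_at is a DATETIME here
CREATE TABLE logs (
  id INT NOT NULL AUTO_INCREMENT,
  level VARCHAR(20),
  message TEXT,
  created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (YEAR(created_at)) (
  PARTITION p2023 VALUES LESS THAN (2024),
  PARTITION p2024 VALUES LESS THAN (2025),
  PARTITION p2025 VALUES LESS THAN (2026)
);

-- Table sharding
-- Create shards of the users table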
CREATE TABLE users_0 (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  email VARCHAR(100) UNIQUE,
  status ENUM('active', 'pending', 'inactive'),
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE users_1 (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  email VARCHAR(100) UNIQUE,
  status ENUM('active', 'pending', 'inactive'),
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
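
With the data split across `users_0` and `users_1`, the application has to decide which table holds a given row. A minimal sketch of modulo routing on the user ID follows. The `shard_table` helper and the choice of modulo routing are assumptions; the course shows only the table definitions, not the routing rule.

```python
SHARD_COUNT = 2  # matches the two tables users_0 and users_1


def shard_table(user_id: int, base: str = "users", shards: int = SHARD_COUNT) -> str:
    """Return the shard table name for a user ID using modulo routing."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return f"{base}_{user_id % shards}"


for uid in (10, 7):
    print(uid, "->", shard_table(uid))
```

Because each shard enforces `UNIQUE` on `email` only within itself, uniqueness across all shards has to be checked by the application or by routing on `email` instead.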

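3.1.4 Configuration Tuning

ini
# MySQL configuration tuning
# /etc/mysql/my.cnf

[mysqld]
# Memory
innodb_buffer_pool_size = 4G
key_buffer_size = 256M

# Query cache (MySQL 5.7 and earlier only; the query cache was removed in MySQL 8.0,
# where these settings prevent the server from starting)
# query_cache_type = 1
# query_cache_size = 64M

# Connections
max_connections = 1000
wait_timeout = 60

# Logging
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 1

# Storage engine
default-storage-engine = InnoDB

# InnoDB
innodb_file_per_table = 1
# A value of 2 trades durability for speed: about one second of commits can be lost on an OS crash or power failure
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 16M
innodb_log_file_size = 256M
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000
bash
# Restart MySQL
sudo systemctl restart mysql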

3.2 Database Optimization System Design

3.2.1 Architecture

  • Front end: Vue.js + Element Plus + ECharts
  • Back end: Python + FastAPI
  • Databases: MySQL + PostgreSQL
  • Monitoring: Prometheus + Grafana

3.2.2 Back-End Implementation

python
# Database monitoring API
@app.get("/database/metrics")
async def get_database_metrics(
    database: str = None,
    metric_name: str = None,
    start_time: str = None,
    end_time: str = None,
    step: str = "1m"
):
    # Base URL of the Prometheus HTTP API
    prometheus_url = "http://localhost:9090/api/v1"
    
    if start_time and end_time:
        # Range query
        url = f"{prometheus_url}/query_range"
        params = {
            "query": metric_name or "mysql_global_status_connections",
            "start": start_time,
            "end": end_time,
            "step": step
        }
    else:
        # Instant query
        url = f"{prometheus_url}/query"
        params = {
            "query": metric_name or "mysql_global_status_connections"
        }
    
    try:
        response = requests.get(url, params=params)
        response.raise_for_status()
        return response.json()
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to get metrics: {str(e)}")

# 获取慢查询日志
@app.get("/database/slow-queries")
async def get_slow_queries(
    start_time: str = None,
    end_time: str = None,
    limit: int = 100
):
    # 这里可以实现慢查询日志解析逻辑
    # 例如从文件或数据库中读取慢查询日志
    
    return [
        {
            "id": 1,
            "query": "SELECT * FROM users WHERE email = 'user@example.com'",
            "duration": 1.23,
            "created_at": "2023-01-01T12:00:00Z"
        }
    ]

# 获取数据库表信息
@app.get("/database/tables")
async def get_tables(
    database: str = None
):
    # 这里可以实现获取数据库表信息的逻辑
    # 例如通过 SHOW TABLES 或 information_schema 查询
    
    return [
        {
            "name": "users",
            "rows": 1000000,
            "size": "100MB",
            "engine": "InnoDB"
        }
    ]

# 分析表
@app.post("/database/tables/{table_name}/analyze")
async def analyze_table(
    table_name: str,
    database: str = "default"
):
    # 这里可以实现表分析逻辑
    # 例如执行 ANALYZE TABLE 语句
    
    return {
        "message": f"Table {table_name} analyzed successfully",
        "recommendations": [
            "Add index on column 'email'",
            "Optimize query: SELECT * FROM users WHERE created_at > '2023-01-01'"
        ]
    }
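
The slow-query endpoint above returns sample data and leaves the log parsing open. A minimal sketch of that parsing step, assuming the file format written by MySQL 5.7 and later, where every entry starts with a `# Time:` line followed by a `# Query_time:` line; `parse_slow_log` and the returned field names are illustrative and simply mirror the sample response:

```python
import re
from typing import Iterable, Iterator

TIME_RE = re.compile(r"^# Time: (\S+)")
STATS_RE = re.compile(r"^# Query_time: ([\d.]+)\s+Lock_time: ([\d.]+)")


def parse_slow_log(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per entry found in a MySQL slow query log."""
    entry, sql = None, []
    for raw in lines:
        line = raw.rstrip("\n")
        started = TIME_RE.match(line)
        if started:
            # A new entry begins: emit the previous one first.
            if entry and sql:
                entry["query"] = " ".join(sql)
                yield entry
            entry, sql = {"created_at": started.group(1)}, []
            continue
        if entry is None:
            continue  # server banner before the first entry
        stats = STATS_RE.match(line)
        if stats:
            entry["duration"] = float(stats.group(1))
        elif line.startswith("#") or line.startswith("SET timestamp="):
            continue  # other metadata lines
        elif line.strip():
            sql.append(line.strip())
    if entry and sql:
        entry["query"] = " ".join(sql)
        yield entry
```

The sketch keeps `use <database>;` lines as part of the statement text and does not apply the `start_time`, `end_time` or `limit` filters; both would need handling in a real endpoint.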

3.2.3 Front-End Implementation

vue
<template>
  <div class="database-optimization">
    <el-card>
      <template #header>
        <div class="card-header">
          <span>Database Optimization</span>
        </div>
      </template>
      
      <el-tabs v-model="activeTab">
        <el-tab-pane label="Database Monitoring" name="monitoring">
          <div class="monitoring-container">
            <el-row :gutter="20">
              <el-col :span="12">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
                      <span>Connections</span>
                    </div>
                  </template>
                  <div class="metric-value">{{ connectionCount }}</div>
                  <div class="metric-chart">
                    <!-- el-chart and el-line-chart are placeholders: Element Plus ships
                         no chart components, so register an ECharts-based component
                         under these names or swap in your own chart wrapper -->
                    <el-chart>
                      <el-line-chart :data="connectionHistory" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
              <el-col :span="12">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
                      <span>Query Rate</span>
                    </div>
                  </template>
                  <div class="metric-value">{{ queryRate }}/s</div>
                  <div class="metric-chart">
                    <el-chart>
                      <el-line-chart :data="queryRateHistory" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
            </el-row>
            <el-row :gutter="20" style="margin-top: 20px;">
              <el-col :span="12">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
                      <span>Slow Queries</span>
                    </div>
                  </template>
                  <div class="metric-value">{{ slowQueryCount }}</div>
                  <div class="metric-chart">
                    <el-chart>
                      <el-line-chart :data="slowQueryHistory" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
              <el-col :span="12">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
                      <span>Cache Hit Rate</span>
                    </div>
                  </template>
                  <div class="metric-value">{{ cacheHitRate }}%</div>
                  <div class="metric-chart">
                    <el-chart>
                      <el-line-chart :data="cacheHitRateHistory" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
            </el-row>
          </div>
        </el-tab-pane>
        <el-tab-pane label="Slow Query Analysis" name="slow-queries">
          <el-table :data="slowQueries" style="width: 100%">
            <el-table-column prop="id" label="ID" width="80" />
            <el-table-column prop="query" label="Query">
              <template #default="{ row }">
                <div class="query-content">{{ row.query }}</div>
              </template>
            </el-table-column>
            <el-table-column prop="duration" label="Duration (s)" width="120" />
            <el-table-column prop="created_at" label="Executed At" width="180" />
            <el-table-column label="Actions" width="150">
              <template #default="{ row }">
                <el-button size="small" @click="explainQuery(row.query)">Explain</el-button>
                <el-button size="small" @click="optimizeQuery(row.query)">Optimize</el-button>
              </template>
            </el-table-column>
          </el-table>
        </el-tab-pane>
        <el-tab-pane label="Table Analysis" name="tables">
          <el-table :data="tables" style="width: 100%">
            <el-table-column prop="name" label="Table" />
            <el-table-column prop="rows" label="Rows" width="120" />
            <el-table-column prop="size" label="Size" width="120" />
            <el-table-column prop="engine" label="Engine" width="100" />
            <el-table-column label="Actions" width="200">
              <template #default="{ row }">
                <el-button size="small" @click="analyzeTable(row.name)">Analyze</el-button>
                <el-button size="small" @click="optimizeTable(row.name)">Optimize</el-button>
                <el-button size="small" @click="viewIndexes(row.name)">Indexes</el-button>
              </template>
            </el-table-column>
          </el-table>
        </el-tab-pane>
        <el-tab-pane label="Configuration" name="config">
          <el-form :model="configForm" label-width="120px">
            <el-form-item label="innodb_buffer_pool_size">
              <el-input v-model="configForm.innodb_buffer_pool_size" />
            </el-form-item>
            <el-form-item label="max_connections">
              <el-input v-model.number="configForm.max_connections" type="number" />
            </el-form-item>
            <el-form-item label="long_query_time">
              <el-input v-model.number="configForm.long_query_time" type="number" step="0.1" />
            </el-form-item>
            <el-form-item label="query_cache_size">
              <el-input v-model="configForm.query_cache_size" />
            </el-form-item>
            <el-form-item>
              <el-button type="primary" @click="saveConfig">Save Configuration</el-button>
              <el-button @click="loadConfig">Load Current Configuration</el-button>
            </el-form-item>
          </el-form>
        </el-tab-pane>
      </el-tabs>
    </el-card>
    
    <!-- Query analysis (EXPLAIN) dialog -->
    <el-dialog v-model="explainDialogVisible" title="Query Analysis">
      <div class="explain-dialog">
        <el-table :data="explainResult" style="width: 100%">
          <el-table-column prop="id" label="ID" />
          <el-table-column prop="select_type" label="Select Type" />
          <el-table-column prop="table" label="Table" />
          <el-table-column prop="type" label="Access Type" />
          <el-table-column prop="possible_keys" label="Possible Keys" />
          <el-table-column prop="key" label="Key Used" />
          <el-table-column prop="key_len" label="Key Length" />
          <el-table-column prop="ref" label="Ref" />
          <el-table-column prop="rows" label="Rows Examined" />
          <el-table-column prop="Extra" label="Extra" />
        </el-table>
      </div>
    </el-dialog>
    
    <!-- Index management dialog -->
    <el-dialog v-model="indexDialogVisible" title="Index Management">
      <div class="index-dialog">
        <el-table :data="indexes" style="width: 100%">
          <el-table-column prop="Table" label="Table" />
          <el-table-column prop="Non_unique" label="Non-Unique" />
          <el-table-column prop="Key_name" label="Key Name" />
          <el-table-column prop="Seq_in_index" label="Seq in Index" />
          <el-table-column prop="Column_name" label="Column" />
          <el-table-column prop="Collation" label="Collation" />
          <el-table-column prop="Cardinality" label="Cardinality" />
          <el-table-column prop="Sub_part" label="Sub-Part" />
          <el-table-column prop="Packed" label="Packed" />
          <el-table-column prop="Null" label="Nullable" />
          <el-table-column prop="Index_type" label="Index Type" />
          <el-table-column prop="Comment" label="Comment" />
        </el-table>
      </div>
    </el-dialog>
  </div>
</template>

<script setup>
import { ref, onMounted } from 'vue'
import { ElMessage } from 'element-plus'
import axios from 'axios'

const activeTab = ref('monitoring')
const connectionCount = ref(0)
const queryRate = ref(0)
const slowQueryCount = ref(0)
const cacheHitRate = ref(0)
const connectionHistory = ref([])
const queryRateHistory = ref([])
const slowQueryHistory = ref([])
const cacheHitRateHistory = ref([])
const slowQueries = ref([])
const tables = ref([])
const indexes = ref([])
const explainResult = ref([])
const explainDialogVisible = ref(false)
const indexDialogVisible = ref(false)
const configForm = ref({
  innodb_buffer_pool_size: '4G',
  max_connections: 1000,
  long_query_time: 1,
  query_cache_size: '64M'
})

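// Read the first sample of a Prometheus instant-query response as a number.
// Prometheus returns values as strings, and an empty result list when no series matches.
const firstValue = (response) => {
  const sample = response.data?.data?.result?.[0]
  return sample ? Number(sample.value[1]) : 0
}

// Fetch database metrics
const getDatabaseMetrics = async () => {
  try {
    // Open connections (mysql_global_status_connections is a cumulative
    // counter of connection attempts, not the number of open connections)
    const connectionResponse = await axios.get('/api/database/metrics', {
      params: { metric_name: 'mysql_global_status_threads_connected' }
    })
    connectionCount.value = firstValue(connectionResponse)
    
    // Query rate
    const queryResponse = await axios.get('/api/database/metrics', {
      params: { metric_name: 'rate(mysql_global_status_queries[1m])' }
    })
    queryRate.value = Math.round(firstValue(queryResponse))
    
    // Slow queries (cumulative since server start)
    const slowQueryResponse = await axios.get('/api/database/metrics', {
      params: { metric_name: 'mysql_global_status_slow_queries' }
    })
    slowQueryCount.value = firstValue(slowQueryResponse)
    
    // Query cache hit rate (these metrics exist only on MySQL 5.7 and earlier,
    // because MySQL 8.0 removed the query cache)
    const cacheHitResponse = await axios.get('/api/database/metrics', {
      params: { metric_name: '(mysql_global_status_qcache_hits / (mysql_global_status_qcache_hits + mysql_global_status_qcache_inserts)) * 100' }
    })
    cacheHitRate.value = Math.round(firstValue(cacheHitResponse))
  } catch (error) {
    ElMessage.error('Failed to load database metrics')
    console.error(error)
  }
}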

// Fetch the slow query log
const getSlowQueries = async () => {
  try {
    const response = await axios.get('/api/database/slow-queries')
    slowQueries.value = response.data
  } catch (error) {
    ElMessage.error('Failed to load the slow query log')
    console.error(error)
  }
}

// Fetch table information
const getTables = async () => {
  try {
    const response = await axios.get('/api/database/tables')
    tables.value = response.data
  } catch (error) {
    ElMessage.error('Failed to load table information')
    console.error(error)
  }
}

// Explain a query
// Note: the back end in 3.2.2 does not define /explain, /optimize-query,
// /tables/{name}/optimize, /tables/{name}/indexes or /config; those
// endpoints have to be added before the matching buttons will work.
const explainQuery = async (query) => {
  try {
    const response = await axios.post('/api/database/explain', { query })
    explainResult.value = response.data
    explainDialogVisible.value = true
  } catch (error) {
    ElMessage.error('Failed to explain the query')
    console.error(error)
  }
}

// Request optimization suggestions for a query
const optimizeQuery = async (query) => {
  try {
    const response = await axios.post('/api/database/optimize-query', { query })
    ElMessage.success('Optimization suggestions generated')
    // Show the suggestions
    console.log('Optimization suggestions:', response.data)
  } catch (error) {
    ElMessage.error('Failed to optimize the query')
    console.error(error)
  }
}

// Analyze a table
const analyzeTable = async (tableName) => {
  try {
    const response = await axios.post(`/api/database/tables/${tableName}/analyze`)
    ElMessage.success(`Table ${tableName} analyzed`)
    // Show the analysis result
    console.log('Analysis result:', response.data)
  } catch (error) {
    ElMessage.error('Failed to analyze the table')
    console.error(error)
  }
}

// Optimize a table
const optimizeTable = async (tableName) => {
  try {
    await axios.post(`/api/database/tables/${tableName}/optimize`)
    ElMessage.success(`Table ${tableName} optimized`)
  } catch (error) {
    ElMessage.error('Failed to optimize the table')
    console.error(error)
  }
}

// View indexes
const viewIndexes = async (tableName) => {
  try {
    const response = await axios.get(`/api/database/tables/${tableName}/indexes`)
    indexes.value = response.data
    indexDialogVisible.value = true
  } catch (error) {
    ElMessage.error('Failed to load indexes')
    console.error(error)
  }
}

// Save configuration
const saveConfig = async () => {
  try {
    await axios.post('/api/database/config', configForm.value)
    ElMessage.success('Configuration saved')
  } catch (error) {
    ElMessage.error('Failed to save the configuration')
    console.error(error)
  }
}

// Load configuration
const loadConfig = async () => {
  try {
    const response = await axios.get('/api/database/config')
    configForm.value = response.data
    ElMessage.success('Configuration loaded')
  } catch (error) {
    ElMessage.error('Failed to load the configuration')
    console.error(error)
  }
}

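// Initial load
onMounted(() => {
  getDatabaseMetrics()
  getSlowQueries()
  getTables()
  loadConfig()
  
  // Refresh every 30 seconds. In a real component, keep the timer id and
  // call clearInterval in onUnmounted so polling stops with the component.
  setInterval(() => {
    getDatabaseMetrics()
    getSlowQueries()
  }, 30000)
})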
</script>

<style scoped>
.database-optimization {
  padding: 20px;
}

.card-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
}

.monitoring-container {
  margin-top: 20px;
}

.metric-card {
  height: 250px;
}

.metric-header {
  display: flex;
  justify-content: center;
  font-weight: bold;
}

.metric-value {
  font-size: 36px;
  font-weight: bold;
  text-align: center;
  margin: 20px 0;
  color: #1E40AF;
}

.metric-chart {
  height: 150px;
}

.query-content {
  white-space: pre-wrap;
  font-family: monospace;
  font-size: 12px;
  background-color: #f5f7fa;
  padding: 10px;
  border-radius: 4px;
  max-height: 100px;
  overflow-y: auto;
}

.explain-dialog {
  max-height: 600px;
  overflow-y: auto;
}

.index-dialog {
  max-height: 600px;
  overflow-y: auto;
}
</style>
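
The component above reads `data.result[0].value[1]` from each Prometheus response. In the Prometheus HTTP API, an instant query returns sample values as strings, and the result list is empty when no series matches. A minimal Python sketch of the same guard, which the back end could apply before responding so that the front end receives a number or null; `first_sample` is an illustrative name:

```python
from typing import Optional


def first_sample(payload: dict) -> Optional[float]:
    """Return the first value of an instant-query response, or None."""
    if payload.get("status") != "success":
        return None
    result = payload.get("data", {}).get("result", [])
    if not result:
        return None  # the query matched no series
    # Each vector sample looks like {"metric": {...}, "value": [timestamp, "value"]}
    _timestamp, raw = result[0]["value"]
    return float(raw)
```

Returning `None` rather than `0` keeps "no data" distinguishable from a real zero reading on the dashboard.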

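IV. Network Optimization

4.1 Optimization Techniques

4.1.1 Network Monitoring

bash
# Install network monitoring tools
sudo apt install net-tools iftop nethogs tcpdump wireshark traceroute

# Monitor network traffic with iftop (replace eth0 with your interface name)
sudo iftop -i eth0

# Monitor per-process network usage with nethogs
sudo nethogs

# Capture packets with tcpdump
sudo tcpdump -i eth0 port 80

# Measure latency with ping (-c 4 stops after four packets)
ping -c 4 google.com

# Trace the route with traceroute
traceroute google.com

# Measure network quality with mtr
sudo apt install mtr
sudo mtr google.com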

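4.1.2 Network Parameter Tuning

bash
# Tune Linux network parameters
# /etc/sysctl.conf

# System-wide maximum number of open file handles
# (check the current value with `sysctl fs.file-max` first; if the kernel
# default is already higher, leave this line out)
fs.file-max = 65535

# Core network parameters
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

# TCP parameters
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_tw_reuse = 1
# net.ipv4.tcp_tw_recycle was removed in Linux 4.12; on newer kernels the key
# no longer exists and `sysctl -p` reports an error for it, so it is omitted
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_slow_start_after_idle = 0

# Apply the configuration
sudo sysctl -p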
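
After `sysctl -p`, it can be useful to confirm which values the kernel actually accepted. A minimal sketch, assuming a Linux host where each sysctl key maps to a file under `/proc/sys` (dots become path separators); the function names are illustrative, and the `root` parameter exists only so the logic can be exercised against a test directory:

```python
from pathlib import Path


def sysctl_path(key: str, root: str = "/proc/sys") -> Path:
    """Map a key such as net.core.somaxconn to its /proc/sys file."""
    return Path(root, *key.split("."))


def read_sysctl(key: str, root: str = "/proc/sys"):
    """Return the current value as a string, or None if the key is missing."""
    try:
        return sysctl_path(key, root).read_text().strip()
    except OSError:
        return None


def diff_sysctl(desired: dict, root: str = "/proc/sys") -> dict:
    """Return {key: (current, wanted)} for every key whose value differs."""
    changes = {}
    for key, wanted in desired.items():
        current = read_sysctl(key, root)
        if current != str(wanted):
            changes[key] = (current, str(wanted))
    return changes
```

For example, `diff_sysctl({"net.core.somaxconn": 65535})` returns an empty dict once the setting is live. The comparison is a plain string match, so multi-value keys such as `net.ipv4.tcp_rmem` would need their whitespace normalized first.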

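4.1.3 CDN Optimization

bash
# Using the Cloudflare CDN
# 1. Log in to the Cloudflare dashboard
# 2. Add the domain
# 3. Configure the DNS records
# 4. Enable the CDN feature

# Configure Nginx to work with a CDN
# /etc/nginx/nginx.conf

http {
  # Gzip compression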
  gzip on;
  gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
  gzip_comp_level 6;
  gzip_buffers 16 8k;
  gzip_min_length 256;
  gzip_vary on;
  
  # Cache configuration
  proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m max_size=10g inactive=60m use_temp_path=off;
  
  server {
    listen 80;
    server_name example.com;
    
    # CDN configuration
    location / {
      # "backend" must be defined as an upstream block, as in 4.1.4
      proxy_pass http://backend;
      proxy_cache my_cache;
      proxy_cache_valid 200 302 60m;
      proxy_cache_valid 404 1m;
      proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
      proxy_cache_background_update on;
      proxy_cache_lock on;
      
      # Cache control
      add_header X-Proxy-Cache $upstream_cache_status;
      expires 7d;
    }
    
    # Static files
    location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
      root /var/www/html;
      expires 30d;
      add_header Cache-Control "public, immutable";
    }
  }
}
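
The `X-Proxy-Cache` header in the configuration above exposes `$upstream_cache_status`, whose values include `HIT`, `MISS`, `EXPIRED`, `STALE`, `UPDATING`, `REVALIDATED` and `BYPASS`. A small sketch of how a cache hit ratio could be derived from those values once they have been collected, for instance from sampled responses; `cache_hit_ratio` is an illustrative name, and counting only `HIT` as a hit is a simplifying assumption:

```python
from collections import Counter
from typing import Iterable


def cache_hit_ratio(statuses: Iterable[str]) -> float:
    """Percentage of responses whose X-Proxy-Cache value was HIT."""
    counts = Counter(s.upper() for s in statuses if s)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # nothing observed yet
    return counts["HIT"] / total * 100
```

A persistently low ratio on cacheable paths is a hint to review `proxy_cache_valid` and the cache key before adding more cache capacity.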

4.1.4 Load Balancing

bash
# Nginx load balancing configuration
# /etc/nginx/nginx.conf

http {
  upstream backend {
    # Round robin (default)
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
    server 192.168.1.12:8080;
    
    # Weighted
    # server 192.168.1.10:8080 weight=5;
    # server 192.168.1.11:8080 weight=3;
    # server 192.168.1.12:8080 weight=2;
    
    # IP hash
    # ip_hash;
    
    # Least connections
    # least_conn;
  }
  
  server {
    listen 80;
    server_name example.com;
    
    location / {
      proxy_pass http://backend;
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Forwarded-Proto $scheme;
    }
  }
}
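
To make the effect of the `weight=5/3/2` variant concrete, the sketch below simulates a smooth weighted round-robin selection in Python. It illustrates the general idea of weighted distribution and is not taken from the nginx source; the class name and server labels are illustrative:

```python
class SmoothWeightedRoundRobin:
    """Pick servers in proportion to their weights, interleaving the picks."""

    def __init__(self, weights: dict):
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be a non-empty dict of positive numbers")
        self.weights = dict(weights)
        self.current = {server: 0 for server in weights}
        self.total = sum(weights.values())

    def pick(self) -> str:
        # Every server gains its weight, the leader is chosen,
        # and the leader then pays back the sum of all weights.
        for server, weight in self.weights.items():
            self.current[server] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen
```

Over any ten consecutive picks with weights 5, 3 and 2, the three servers are chosen five, three and two times respectively, and the heavier server's turns are spread out rather than sent in one burst.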

4.2 Network Optimization System Design

4.2.1 Architecture

  • Front end: Vue.js + Element Plus + ECharts
  • Back end: Python + FastAPI
  • Monitoring: Prometheus + Grafana
  • Storage: InfluxDB

4.2.2 Back-End Implementation

python
import os
import re
import subprocess
from typing import Optional

import requests
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

# Base URL of the Prometheus HTTP API; override with PROMETHEUS_URL
PROMETHEUS_API = os.environ.get("PROMETHEUS_URL", "http://localhost:9090") + "/api/v1"

# Patterns for a plain metric name, an interface name, and a host name or IP address
METRIC_RE = re.compile(r"^[a-zA-Z_:][a-zA-Z0-9_:]*$")
INTERFACE_RE = re.compile(r"^[\w.:@-]+$")
TARGET_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9.:-]*$")

# Network monitoring API
# Plain `def` lets FastAPI run the blocking requests call in its thread pool
@app.get("/network/metrics")
def get_network_metrics(
    interface: Optional[str] = None,
    metric_name: Optional[str] = None,
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
    step: str = "1m"
):
    query = metric_name or "node_network_receive_bytes_total"

    if interface:
        # A label selector can only be appended to a bare metric name,
        # not to an expression such as rate(...[1m])
        if not METRIC_RE.match(query) or not INTERFACE_RE.match(interface):
            raise HTTPException(status_code=400, detail="interface requires a plain metric name")
        query += f'{{device="{interface}"}}'

    if start_time and end_time:
        # Range query
        url = f"{PROMETHEUS_API}/query_range"
        params = {"query": query, "start": start_time, "end": end_time, "step": step}
    else:
        # Instant query
        url = f"{PROMETHEUS_API}/query"
        params = {"query": query}

    try:
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        return response.json()
    except requests.RequestException as e:
        raise HTTPException(status_code=500, detail=f"Failed to get metrics: {str(e)}")

# List network interfaces
@app.get("/network/interfaces")
def get_network_interfaces():
    params = {"query": "count by(device) (node_network_receive_bytes_total)"}

    try:
        response = requests.get(f"{PROMETHEUS_API}/query", params=params, timeout=10)
        response.raise_for_status()
        return [result["metric"]["device"] for result in response.json()["data"]["result"]]
    except (requests.RequestException, KeyError) as e:
        raise HTTPException(status_code=500, detail=f"Failed to get interfaces: {str(e)}")

# Request body of the network test; the front end posts it as JSON
class NetworkTestRequest(BaseModel):
    target: str
    type: str = "ping"

# Command and timeout (seconds) for each supported test type
TEST_COMMANDS = {
    "ping": (["ping", "-c", "5"], 10),
    "traceroute": (["traceroute"], 30),
}

# Network quality test
@app.post("/network/test")
def test_network(req: NetworkTestRequest):
    if req.type not in TEST_COMMANDS:
        raise HTTPException(status_code=400, detail="Invalid test type")
    # Accept only host names and IP addresses, so the target
    # can never be interpreted as a command-line option
    if not TARGET_RE.match(req.target):
        raise HTTPException(status_code=400, detail="Invalid target")

    command, timeout = TEST_COMMANDS[req.type]
    try:
        result = subprocess.run(
            command + [req.target],
            capture_output=True,
            text=True,
            timeout=timeout
        )
        return {
            "type": req.type,
            "target": req.target,
            "stdout": result.stdout,
            "stderr": result.stderr,
            "returncode": result.returncode
        }
    except Exception as e:
        return {
            "type": req.type,
            "target": req.target,
            "error": str(e)
        }
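
The test endpoint returns the raw `ping` output as text. If the dashboard should chart latency or packet loss, the summary lines can be parsed into numbers. A minimal sketch, assuming the summary format printed by the Linux iputils `ping` (the BSD/macOS wording is matched on a best-effort basis); `parse_ping` and the result keys are illustrative:

```python
import re

LOSS_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received.*?([\d.]+)% packet loss"
)
RTT_RE = re.compile(
    r"(?:rtt|round-trip) min/avg/max/(?:mdev|stddev) = ([\d.]+)/([\d.]+)/([\d.]+)/[\d.]+ ms"
)


def parse_ping(stdout: str) -> dict:
    """Extract packet counts, loss and round-trip times from ping output."""
    summary = {}
    loss = LOSS_RE.search(stdout)
    if loss:
        summary["sent"] = int(loss.group(1))
        summary["received"] = int(loss.group(2))
        summary["loss_percent"] = float(loss.group(3))
    rtt = RTT_RE.search(stdout)
    if rtt:
        # The rtt line is absent when every packet was lost.
        summary["rtt_min_ms"] = float(rtt.group(1))
        summary["rtt_avg_ms"] = float(rtt.group(2))
        summary["rtt_max_ms"] = float(rtt.group(3))
    return summary
```

The function returns an empty dict when no summary is found, which lets the caller fall back to showing the raw text.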

4.2.3 Front-End Implementation

vue
<template>
  <div class="network-optimization">
    <el-card>
      <template #header>
        <div class="card-header">
          <span>Network Optimization</span>
        </div>
      </template>
      
      <el-tabs v-model="activeTab">
        <el-tab-pane label="Network Monitoring" name="monitoring">
          <div class="monitoring-container">
            <el-row :gutter="20">
              <el-col :span="12">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
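                      <span>Inbound Traffic</span>
                    </div>
                  </template>
                  <div class="metric-value">{{ inboundTraffic }}</div>
                  <div class="metric-chart">
                    <!-- el-chart and el-line-chart are placeholders: Element Plus ships
                         no chart components, so register an ECharts-based component
                         under these names or swap in your own chart wrapper -->
                    <el-chart>
                      <el-line-chart :data="inboundHistory" />
                    </el-chart>
                  </div>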
                </el-card>
              </el-col>
              <el-col :span="12">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
                      <span>Outbound Traffic</span>
                    </div>
                  </template>
                  <div class="metric-value">{{ outboundTraffic }}</div>
                  <div class="metric-chart">
                    <el-chart>
                      <el-line-chart :data="outboundHistory" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
            </el-row>
            <el-row :gutter="20" style="margin-top: 20px;">
              <el-col :span="24">
                <el-card class="metric-card">
                  <template #header>
                    <div class="metric-header">
                      <span>Per-Interface Traffic</span>
                    </div>
                  </template>
                  <div class="interface-selector">
                    <el-select v-model="selectedInterface" placeholder="Select an interface">
                      <!-- "interface" is a reserved word in strict-mode JavaScript, so the loop variable is named iface -->
                      <el-option v-for="iface in interfaces" :key="iface" :label="iface" :value="iface" />
                    </el-select>
                  </div>
                  <div class="metric-chart">
                    <el-chart>
                      <el-line-chart :data="interfaceHistory" />
                    </el-chart>
                  </div>
                </el-card>
              </el-col>
            </el-row>
          </div>
        </el-tab-pane>
        <el-tab-pane label="Network Test" name="test">
          <div class="test-container">
            <el-form :inline="true" :model="testForm" class="test-form">
              <el-form-item label="Target">
                <el-input v-model="testForm.target" placeholder="e.g. google.com" />
              </el-form-item>
              <el-form-item label="Test Type">
                <el-select v-model="testForm.type">
                  <el-option label="Ping" value="ping" />
                  <el-option label="Traceroute" value="traceroute" />
                </el-select>
              </el-form-item>
              <el-form-item>
                <el-button type="primary" @click="runTest">Run Test</el-button>
              </el-form-item>
            </el-form>
            <div class="test-result">
              <el-card>
                <template #header>
                  <div class="result-header">
                    <span>Test Result</span>
                  </div>
                </template>
                <pre>{{ testResult }}</pre>
              </el-card>
            </div>
          </div>
        </el-tab-pane>
        <el-tab-pane label="Parameter Tuning" name="config">
          <el-form :model="configForm" label-width="120px">
            <el-form-item label="net.core.somaxconn">
              <el-input v-model.number="configForm.net_core_somaxconn" type="number" />
            </el-form-item>
            <el-form-item label="net.core.netdev_max_backlog">
              <el-input v-model.number="configForm.net_core_netdev_max_backlog" type="number" />
            </el-form-item>
            <el-form-item label="net.ipv4.tcp_max_syn_backlog">
              <el-input v-model.number="configForm.net_ipv4_tcp_max_syn_backlog" type="number" />
            </el-form-item>
            <el-form-item label="net.ipv4.tcp_fin_timeout">
              <el-input v-model.number="configForm.net_ipv4_tcp_fin_timeout" type="number" />
            </el-form-item>
            <el-form-item label="net.ipv4.tcp_keepalive_time">
              <el-input v-model.number="configForm.net_ipv4_tcp_keepalive_time" type="number" />
            </el-form-item>
            <el-form-item>
              <el-button type="primary" @click="saveConfig">Save Configuration</el-button>
              <el-button @click="loadConfig">Load Current Configuration</el-button>
            </el-form-item>
          </el-form>
        </el-tab-pane>
      </el-tabs>
    </el-card>
  </div>
</template>

<script setup>
import { ref, onMounted } from 'vue'
import { ElMessage } from 'element-plus'
import axios from 'axios'

const activeTab = ref('monitoring')
const inboundTraffic = ref('0 MB/s')
const outboundTraffic = ref('0 MB/s')
const inboundHistory = ref([])
const outboundHistory = ref([])
const interfaces = ref([])
const selectedInterface = ref('')
const interfaceHistory = ref([])
const testForm = ref({
  target: 'google.com',
  type: 'ping'
})
const testResult = ref('')
const configForm = ref({
  net_core_somaxconn: 65535,
  net_core_netdev_max_backlog: 65535,
  net_ipv4_tcp_max_syn_backlog: 65535,
  net_ipv4_tcp_fin_timeout: 30,
  net_ipv4_tcp_keepalive_time: 1200
})

// Fetch network metrics
const getNetworkMetrics = async () => {
  try {
    // Inbound traffic. sum() adds up all interfaces (loopback included),
    // so result[0] is the host total rather than an arbitrary interface.
    const inboundResponse = await axios.get('/api/network/metrics', {
      params: { metric_name: 'sum(rate(node_network_receive_bytes_total[1m]))' }
    })
    // Prometheus returns values as strings; result is empty when nothing matches
    const inboundValue = Number(inboundResponse.data.data.result[0]?.value[1] ?? 0)
    inboundTraffic.value = `${(inboundValue / (1024 * 1024)).toFixed(2)} MB/s`
    
    // Outbound traffic
    const outboundResponse = await axios.get('/api/network/metrics', {
      params: { metric_name: 'sum(rate(node_network_transmit_bytes_total[1m]))' }
    })
    const outboundValue = Number(outboundResponse.data.data.result[0]?.value[1] ?? 0)
    outboundTraffic.value = `${(outboundValue / (1024 * 1024)).toFixed(2)} MB/s`
  } catch (error) {
    ElMessage.error('Failed to load network metrics')
    console.error(error)
  }
}

// Fetch network interfaces
const getInterfaces = async () => {
  try {
    const response = await axios.get('/api/network/interfaces')
    interfaces.value = response.data
    if (interfaces.value.length > 0) {
      selectedInterface.value = interfaces.value[0]
    }
  } catch (error) {
    ElMessage.error('Failed to load network interfaces')
    console.error(error)
  }
}

// Run a network test
const runTest = async () => {
  try {
    const response = await axios.post('/api/network/test', {
      target: testForm.value.target,
      type: testForm.value.type
    })
    if (response.data.stdout) {
      testResult.value = response.data.stdout
    } else if (response.data.error) {
      testResult.value = `Error: ${response.data.error}`
    }
  } catch (error) {
    ElMessage.error('Failed to run the network test')
    console.error(error)
  }
}

// Save configuration
const saveConfig = async () => {
  try {
    await axios.post('/api/network/config', configForm.value)
    ElMessage.success('Configuration saved')
  } catch (error) {
    ElMessage.error('Failed to save the configuration')
    console.error(error)
  }
}

// Load configuration
const loadConfig = async () => {
  try {
    const response = await axios.get('/api/network/config')
    configForm.value = response.data
    ElMessage.success('Configuration loaded')
  } catch (error) {
    ElMessage.error('Failed to load the configuration')
    console.error(error)
  }
}

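// Initial load
onMounted(() => {
  getNetworkMetrics()
  getInterfaces()
  loadConfig()
  
  // Refresh every 30 seconds. In a real component, keep the timer id and
  // call clearInterval in onUnmounted so polling stops with the component.
  setInterval(() => {
    getNetworkMetrics()
  }, 30000)
})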
</script>

<style scoped>
.network-optimization {
  padding: 20px;
}

.card-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
}

.monitoring-container {
  margin-top: 20px;
}

.metric-card {
  height: 250px;
}

.metric-header {
  display: flex;
  justify-content: center;
  font-weight: bold;
}

.metric-value {
  font-size: 24px;
  font-weight: bold;
  text-align: center;
  margin: 20px 0;
  color: #1E40AF;
}

.metric-chart {
  height: 150px;
}

.interface-selector {
  margin-bottom: 10px;
  display: flex;
  justify-content: center;
}

.test-container {
  margin-top: 20px;
}

.test-form {
  margin-bottom: 20px;
  padding: 15px;
  background-color: #f5f7fa;
  border-radius: 8px;
}

.result-header {
  display: flex;
  justify-content: center;
  font-weight: bold;
}

.test-result pre {
  background-color: #f5f7fa;
  padding: 15px;
  border-radius: 8px;
  max-height: 400px;
  overflow-y: auto;
  font-family: monospace;
  font-size: 14px;
}
</style>
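
The traffic figures in this component come from `rate(...[1m])` over byte counters that only ever increase. The sketch below shows the core of that calculation in Python, under simplifying assumptions: it averages the increase between the first and last sample and treats any drop as a counter reset, whereas Prometheus additionally extrapolates to the window boundaries. Function names are illustrative:

```python
from typing import List, Optional, Tuple


def counter_rate(samples: List[Tuple[float, float]]) -> Optional[float]:
    """Average per-second increase over (timestamp, counter value) samples."""
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero, e.g. after a reboot.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


def format_rate(bytes_per_second: float) -> str:
    """Format a byte rate the way the dashboard does (1 MB = 1024 * 1024 bytes)."""
    return f"{bytes_per_second / (1024 * 1024):.2f} MB/s"
```

This is why the raw counter is never displayed directly: only its rate of change says anything about current traffic.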

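V. Platform Integration and Deployment

5.1 Platform Integration

5.1.1 API Integration

python
# API gateway of the performance optimization platform
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from routers import monitoring, performance, database, network

app = FastAPI()

# Configure CORS
# Credentialed requests cannot be combined with a wildcard origin, so list
# the front-end origins explicitly (http://localhost is a placeholder)
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Register routers
# Each router declares its paths relative to the prefix: @router.get("/metrics")
# in routers/database.py is served at /api/database/metrics
app.include_router(monitoring.router, prefix="/api/monitoring", tags=["monitoring"])
app.include_router(performance.router, prefix="/api/performance", tags=["performance"])
app.include_router(database.router, prefix="/api/database", tags=["database"])
app.include_router(network.router, prefix="/api/network", tags=["network"])

# Health check
@app.get("/health")
async def health_check():
    return {"status": "healthy"}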

5.1.2 Front-End Integration

javascript
// api.js
import axios from 'axios'

const api = axios.create({
  baseURL: '/api',
  timeout: 10000,
  headers: {
    'Content-Type': 'application/json'
  }
})

// Response interceptor
api.interceptors.response.use(
  response => response,
  error => {
    console.error('API Error:', error)
    return Promise.reject(error)
  }
)

export default api

// Main application entry of the platform (a separate file, e.g. main.js)
import { createApp } from 'vue'
import App from './App.vue'
import router from './router'
import ElementPlus from 'element-plus'
import 'element-plus/dist/index.css'

const app = createApp(App)
app.use(ElementPlus)
app.use(router)
app.mount('#app')

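5.2 Deployment Options

5.2.1 Docker Deployment

yaml
# docker-compose.yml
version: '3'
services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    environment:
      # Inside the Compose network, Prometheus is reached by service name, not localhost
      - PROMETHEUS_URL=http://prometheus:9090
    depends_on:
      - prometheus
      - grafana
      - influxdb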
  frontend:
    build: ./frontend
    ports:
      - "80:80"
    depends_on:
      - backend
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
  influxdb:
    image: influxdb:2.0
    ports:
      - "8086:8086"
    environment:
      # Example credentials only; replace them before deploying
      - DOCKER_INFLUXDB_INIT_MODE=setup
      - DOCKER_INFLUXDB_INIT_USERNAME=admin
      - DOCKER_INFLUXDB_INIT_PASSWORD=password
      - DOCKER_INFLUXDB_INIT_ORG=myorg
      - DOCKER_INFLUXDB_INIT_BUCKET=metrics
  node_exporter:
    image: prom/node-exporter
    ports:
      - "9100:9100"
  blackbox_exporter:
    image: prom/blackbox-exporter
    volumes:
      - ./blackbox.yml:/etc/blackbox_exporter/config.yml
    ports:
      - "9115:9115"

5.2.2 Kubernetes Deployment

yaml
# performance-platform-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: performance-platform-backend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: performance-platform-backend
  template:
    metadata:
      labels:
        app: performance-platform-backend
    spec:
      containers:
      - name: backend
        image: performance-platform-backend:latest
        ports:
        - containerPort: 8000
        env:
        - name: PROMETHEUS_URL
          value: "http://prometheus:9090"
        - name: GRAFANA_URL
          value: "http://grafana:3000"

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: performance-platform-frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: performance-platform-frontend
  template:
    metadata:
      labels:
        app: performance-platform-frontend
    spec:
      containers:
      - name: frontend
        image: performance-platform-frontend:latest
        ports:
        - containerPort: 80

---
apiVersion: v1
kind: Service
metadata:
  name: performance-platform-backend
spec:
  selector:
    app: performance-platform-backend
  ports:
  - port: 8000
    targetPort: 8000

---
apiVersion: v1
kind: Service
metadata:
  name: performance-platform-frontend
spec:
  selector:
    app: performance-platform-frontend
  ports:
  - port: 80
    targetPort: 80
  type: LoadBalancer

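6. Best Practices

6.1 System Performance Monitoring Best Practices

  • Comprehensive monitoring: cover every layer of the system, including CPU, memory, disk, and network
  • Real-time monitoring: track system state in real time so anomalies are detected promptly
  • Threshold alerting: set reasonable thresholds and alert as soon as a metric exceeds them
  • Historical analysis: analyze historical monitoring data to identify performance trends
  • Visualization: present system state through dashboards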
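
Threshold alerting from the list above can be sketched as follows. This is an illustration only, not the platform's alerting engine: the class names `ThresholdRule` and `ThresholdAlerter`, the metric name `cpu_percent`, and the "fire only after N consecutive breaches" policy are all assumptions, chosen to show how a short spike can be kept from triggering an alert.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    metric: str            # hypothetical metric name, e.g. "cpu_percent"
    threshold: float       # a sample breaches the rule when value > threshold
    for_samples: int = 3   # consecutive breaches required before firing


class ThresholdAlerter:
    def __init__(self, rules):
        self.rules = {rule.metric: rule for rule in rules}
        self.breaches = {rule.metric: 0 for rule in rules}

    def observe(self, metric, value):
        """Record one sample; return True while the alert is firing."""
        rule = self.rules[metric]
        if value > rule.threshold:
            self.breaches[metric] += 1
        else:
            # Any healthy sample resets the consecutive-breach counter.
            self.breaches[metric] = 0
        return self.breaches[metric] >= rule.for_samples
```

With `for_samples=3`, two high samples followed by a normal one do not fire; only a run of three or more consecutive breaches does.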

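6.2 Application Performance Analysis Best Practices

  • End-to-end monitoring: monitor the full path from the user request to the backend services
  • Distributed tracing: use distributed tracing to follow a request as it flows between services
  • Bottleneck analysis: analyze the collected data to locate the application's performance bottlenecks
  • Code-level analysis: profile critical code paths to find optimization opportunities
  • Continuous optimization: establish an ongoing, iterative process for performance optimization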
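
For code-level analysis, one possible starting point is Python's built-in `cProfile` module. The sketch below profiles a single call and returns a text report sorted by cumulative time. `profile_call` and `slow_sum` are hypothetical names introduced for illustration.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=5, **kwargs):
    """Run func under cProfile; return (result, report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def slow_sum(n):
    """Deliberately naive workload used only as a profiling target."""
    total = 0
    for i in range(n):
        total += i * i
    return total
```

Calling `profile_call(slow_sum, 100_000)` returns the computed sum together with a report listing `slow_sum` among the profiled functions.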

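6.3 Database Optimization Best Practices

  • Index optimization: create and use indexes judiciously to speed up queries
  • Query optimization: tune SQL statements to reduce query time
  • Schema optimization: design table structures sensibly and choose appropriate data types
  • Partitioning and sharding: partition or split large tables to improve query performance
  • Configuration tuning: tune the database configuration to the actual workload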
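
The effect of index optimization can be observed by comparing the query plan before and after creating an index. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a stand-in, so that it runs without a server. The course stack uses MySQL and PostgreSQL, where `EXPLAIN` plays the same role but produces different output. The `orders` table, its columns, and the helper `query_plan` are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would execute sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each row holds the human-readable detail text.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

sql = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index the planner has to scan the whole table.
plan_before = query_plan(conn, sql, (42,))

# With an index on the filtered column it can search the index instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan reports a full table scan, while the second one names `idx_orders_customer`. The exact wording of both lines varies between SQLite versions.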

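6.4 Network Optimization Best Practices

  • Network monitoring: monitor network state in real time to detect problems promptly
  • Parameter tuning: adjust network parameters to match actual conditions
  • CDN acceleration: serve static assets through a CDN
  • Load balancing: use load balancing to distribute network traffic
  • Connection optimization: optimize network connections to reduce connection setup time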
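
To make the load-balancing item concrete, here is a minimal round-robin sketch. It is only one of several possible strategies (weighted and least-connections are common alternatives), and in production this job is normally handled by a dedicated component such as Nginx rather than by application code. The class name `RoundRobinBalancer` and the backend addresses are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        """Return the backend that should receive the next request."""
        return next(self._cycle)


# Hypothetical backend addresses, for illustration only.
balancer = RoundRobinBalancer(["10.0.0.1:8000", "10.0.0.2:8000", "10.0.0.3:8000"])
picks = [balancer.next_backend() for _ in range(6)]
```

Over six requests each of the three backends is chosen exactly twice, in the order in which they were listed.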

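6.5 Performance Optimization Platform Best Practices

  • Integrated platform: build a unified performance optimization platform that brings the various tools and techniques together
  • Automation: automate performance monitoring, analysis, and optimization
  • Intelligence: use AI techniques for intelligent alerting, analysis, and optimization
  • Extensibility: design the platform so that new monitoring metrics and analysis methods can be added easily
  • Usability: keep the interface friendly and simple to operate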

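7. Summary

7.1 Course Content Summary

This course covered the design and implementation of a performance optimization platform in detail, including:

  • System performance monitoring: monitoring techniques, monitoring system design, frontend implementation
  • Application performance analysis: analysis techniques, analysis system design, frontend implementation
  • Database optimization: optimization techniques, database optimization system design, frontend implementation
  • Network optimization: optimization techniques, network optimization system design, frontend implementation
  • Platform integration and deployment: API integration, frontend integration, Docker deployment, Kubernetes deployment
  • Best practices: for system performance monitoring, application performance analysis, database optimization, network optimization, and the performance optimization platform itself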

7.2 Technology Stack Summary

  • Frontend: Vue.js + Element Plus + ECharts
  • Backend: Python + FastAPI
  • Monitoring: Prometheus + Grafana + Node Exporter
  • Storage: InfluxDB + Elasticsearch
  • Containers: Docker + Kubernetes
  • Network: Nginx + CDN
  • Databases: MySQL + PostgreSQL

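7.3 Learning Outcomes

After completing this course, learners can:

  1. Carry out the complete design and implementation workflow of a performance optimization platform
  2. Use a range of performance monitoring techniques and tools
  3. Understand optimization methods for system, application, database, and network performance
  4. Build an enterprise-grade performance optimization platform
  5. Apply performance optimization best practices and industry standards

Performance optimization is an important part of modern enterprise IT operations. With the material covered in this course, learners will be able to build more efficient and stable IT systems for their organizations and improve the performance and reliability of business systems.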
