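Chapter 3: Data Preprocessing Basics

Data preprocessing is one of the most important steps in a machine learning project. Real-world data is usually "dirty": it contains missing values, outliers, features on different scales, and similar problems. This chapter walks through how to preprocess data with Scikit-learn.

3.1 Why Is Data Preprocessing Needed?

In real projects, raw data typically has the following problems:

Missing values: gaps left during data collection
Outliers: measurement errors or extreme cases
Different scales: numeric ranges differ widely across features
Inconsistent data types: numeric and categorical data mixed together
Duplicate data: the same record appears more than once

3.2 Creating a Sample Dataset

First, let's create a sample dataset that contains several of these problems: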

python

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
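from sklearn.preprocessing import (
    StandardScaler, MinMaxScaler, RobustScaler,
    LabelEncoder, OneHotEncoder
)
# SimpleImputer and KNNImputer live in sklearn.impute, not sklearn.preprocessing
from sklearn.impute import SimpleImputer, KNNImputer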
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Set the random seed for reproducibility
np.random.seed(42)

# Number of samples to generate
n_samples = 1000

# Generate the base data
data = {
    'age': np.random.normal(35, 10, n_samples),
    'income': np.random.exponential(50000, n_samples),
    'education_years': np.random.normal(14, 3, n_samples),
    'credit_score': np.random.normal(650, 100, n_samples),
    'gender': np.random.choice(['Male', 'Female'], n_samples),
    'city': np.random.choice(['Beijing', 'Shanghai', 'Guangzhou', 'Shenzhen'], n_samples),
    'loan_approved': np.random.choice([0, 1], n_samples, p=[0.3, 0.7])
}

# Build the DataFrame
df = pd.DataFrame(data)

# Deliberately introduce some data quality problems
# 1. Missing values: 50 in income and 50 in education_years
missing_indices = np.random.choice(df.index, size=int(0.1 * n_samples), replace=False)
df.loc[missing_indices[:50], 'income'] = np.nan
df.loc[missing_indices[50:], 'education_years'] = np.nan

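# 2. Outliers: 20 implausible ages between 100 and 120
outlier_indices = np.random.choice(df.index, size=20, replace=False)
df.loc[outlier_indices, 'age'] = np.random.uniform(100, 120, 20)

# 3. Negative values (invalid data): 10 credit scores between -100 and 0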
negative_indices = np.random.choice(df.index, size=10, replace=False)
df.loc[negative_indices, 'credit_score'] = np.random.uniform(-100, 0, 10)

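print("Original dataset info:")
df.info()  # info() prints its report directly and returns None
print("\nFirst 5 rows:")
print(df.head())

3.3 Data Exploration and Problem Identification

3.3.1 Basic Statistics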

python

# Summary statistics
print("Summary statistics:")
print(df.describe())

# Missing values per column
print("\nMissing value statistics:")
missing_stats = df.isnull().sum()
missing_percent = (missing_stats / len(df)) * 100
missing_df = pd.DataFrame({
    'missing_count': missing_stats,
    'missing_pct': missing_percent
})
print(missing_df[missing_df['missing_count'] > 0])

# Column data types
print("\nData types:")
print(df.dtypes)

3.3.2 Visualizing the Distributions

python

# Create a 2x3 grid: four histograms, one pie chart, one bar chart
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
fig.suptitle('Data Distributions', fontsize=16)
axes = axes.ravel()  # flatten so the six panels can be indexed 0..5

# Distributions of the numeric features (panels 0-3)
numeric_cols = ['age', 'income', 'education_years', 'credit_score']
for i, col in enumerate(numeric_cols):
    # Histogram (missing values dropped)
    axes[i].hist(df[col].dropna(), bins=30, alpha=0.7, edgecolor='black')
    axes[i].set_title(f'{col} distribution')
    axes[i].set_xlabel(col)
    axes[i].set_ylabel('Frequency')

# Distribution of a categorical feature (panel 4)
gender_counts = df['gender'].value_counts()
axes[4].pie(gender_counts, labels=gender_counts.index, autopct='%1.1f%%')
axes[4].set_title('Gender distribution')

# Distribution of the target variable (panel 5)
df['loan_approved'].value_counts().plot(kind='bar', ax=axes[5])
axes[5].set_title('Loan approval')
axes[5].set_xlabel('Loan approved (0 = no, 1 = yes)')

plt.tight_layout()
plt.show()

3.3.3 Outlier Detection

python

# Detect outliers with box plots
plt.figure(figsize=(12, 8))
numeric_cols = ['age', 'income', 'education_years', 'credit_score']

for i, col in enumerate(numeric_cols, 1):
    plt.subplot(2, 2, i)
    plt.boxplot(df[col].dropna())
    plt.title(f'{col} box plot')
    plt.ylabel(col)

plt.tight_layout()
plt.show()

# Identify outliers with the IQR method
def detect_outliers_iqr(data, column):
    Q1 = data[column].quantile(0.25)
    Q3 = data[column].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    
    outliers = data[(data[column] < lower_bound) | (data[column] > upper_bound)]
    return outliers, lower_bound, upper_bound

# Check each numeric column for outliers
for col in numeric_cols:
    outliers, lower, upper = detect_outliers_iqr(df, col)
    print(f"\nOutlier check for {col}:")
    print(f"Normal range: [{lower:.2f}, {upper:.2f}]")
    print(f"Number of outliers: {len(outliers)}")
    if len(outliers) > 0:
        print(f"Sample outliers: {outliers[col].head().tolist()}")
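
To make the IQR arithmetic concrete, here is a minimal worked sketch on a handful of made-up ages (the values are hypothetical and not drawn from the sample dataset). It applies the same 1.5 × IQR rule as `detect_outliers_iqr`, so the bounds can be checked by hand.

```python
import pandas as pd

# Hypothetical toy values; 95 plays the role of an obviously extreme age.
ages = pd.Series([22, 25, 27, 30, 31, 33, 36, 95])

q1 = ages.quantile(0.25)   # 26.5 with pandas' default linear interpolation
q3 = ages.quantile(0.75)   # 33.75
iqr = q3 - q1              # 7.25
lower = q1 - 1.5 * iqr     # 15.625
upper = q3 + 1.5 * iqr     # 44.625

# Anything outside [lower, upper] is flagged as an outlier.
flagged = ages[(ages < lower) | (ages > upper)]
print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"bounds=[{lower}, {upper}]")
print(flagged.tolist())
```

Only 95 falls outside the bounds here. The quartile values depend on the interpolation rule, so other tools may report slightly different bounds for small samples.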

3.4 Handling Missing Values

3.4.1 Simple Imputation Strategies

python

# Work on a copy so the original data stays untouched
df_processed = df.copy()

# Method 1: fill numeric features with the mean
print("Missing values before imputation:")
print(df_processed.isnull().sum())

# Use SimpleImputer
numeric_imputer = SimpleImputer(strategy='mean')
numeric_cols_with_missing = ['income', 'education_years']

df_processed[numeric_cols_with_missing] = numeric_imputer.fit_transform(
    df_processed[numeric_cols_with_missing]
)

print("\nAfter mean imputation:")
print(df_processed.isnull().sum())

# Method 2: fill categorical features with the most frequent value
categorical_imputer = SimpleImputer(strategy='most_frequent')
categorical_cols = ['gender', 'city']

# Only runs if the categorical features contain missing values
# (the sample dataset has none, so this branch is skipped here)
if df_processed[categorical_cols].isnull().sum().sum() > 0:
    df_processed[categorical_cols] = categorical_imputer.fit_transform(
        df_processed[categorical_cols]
    )
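
The choice between mean and median matters most for skewed columns such as income. The sketch below uses a small made-up income column (hypothetical values, in thousands) to show how a single large value pulls the mean fill upward while the median fill stays near the bulk of the data.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical incomes (in thousands): one very large value and one gap.
income = np.array([[30.0], [35.0], [40.0], [45.0], [500.0], [np.nan]])

mean_filled = SimpleImputer(strategy="mean").fit_transform(income)
median_filled = SimpleImputer(strategy="median").fit_transform(income)

# The last row held the missing value; compare what each strategy put there.
print("mean fill:  ", mean_filled[-1, 0])    # mean of the five observed values
print("median fill:", median_filled[-1, 0])  # median of the five observed values
```

The mean fill is 130 and the median fill is 40. For a right-skewed feature, the median is often the safer default, although the right choice still depends on the data.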

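3.4.2 Advanced Imputation Strategies

python

# Method 3: KNN imputation
df_knn = df.copy()

# Encode the categorical variables first, since KNNImputer needs numeric input.
# Caveats: integer codes impose an arbitrary order on the categories, and because
# the features are unscaled, distances are dominated by large-valued columns such as income.
le_gender = LabelEncoder()
le_city = LabelEncoder()

df_knn['gender_encoded'] = le_gender.fit_transform(df_knn['gender'])
df_knn['city_encoded'] = le_city.fit_transform(df_knn['city'])

# Select the numeric features used for KNN imputation
numeric_features = ['age', 'income', 'education_years', 'credit_score', 'gender_encoded', 'city_encoded']
knn_imputer = KNNImputer(n_neighbors=5)

df_knn[numeric_features] = knn_imputer.fit_transform(df_knn[numeric_features])

print("Statistics after KNN imputation:")
print(df_knn[['income', 'education_years']].describe())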
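
To see what `KNNImputer` actually computes, the following sketch runs it on a tiny hypothetical matrix where the answer can be worked out by hand. With the default uniform weights, the missing entry becomes the average of that feature over the nearest rows, with distance measured on the features both rows have.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical 4x2 matrix; row 2 is missing its second feature.
X = np.array([
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, np.nan],
    [10.0, 100.0],
])

imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)

# On the first feature, the rows closest to 3.0 are 2.0 and 1.0,
# so the fill is the average of their second features: (20 + 10) / 2.
print(X_filled[2])
```

The imputed value is 15. The far-away row `[10, 100]` does not contribute, which is the point of using neighbors rather than a global mean.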

3.4.3 Comparing Imputation Results

python

# Compare the effect of the different imputation methods
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

# Original data (missing values dropped)
axes[0].hist(df['income'].dropna(), bins=30, alpha=0.7, label='Original')
axes[0].set_title('Original income distribution')
axes[0].set_xlabel('Income')
axes[0].set_ylabel('Frequency')

# Mean imputation
axes[1].hist(df_processed['income'], bins=30, alpha=0.7, label='Mean imputation', color='orange')
axes[1].set_title('Income after mean imputation')
axes[1].set_xlabel('Income')

# KNN imputation
axes[2].hist(df_knn['income'], bins=30, alpha=0.7, label='KNN imputation', color='green')
axes[2].set_title('Income after KNN imputation')
axes[2].set_xlabel('Income')

plt.tight_layout()
plt.show()

3.5 Handling Outliers

3.5.1 Removing Outliers

python

# Method 1: remove outliers outright
def remove_outliers_iqr(data, columns):
    """Remove outliers using the IQR method."""
    data_clean = data.copy()

    for col in columns:
        Q1 = data_clean[col].quantile(0.25)
        Q3 = data_clean[col].quantile(0.75)
        IQR = Q3 - Q1
        lower_bound = Q1 - 1.5 * IQR
        upper_bound = Q3 + 1.5 * IQR

        # Keep only rows inside the bounds (rows with NaN in this column are dropped too)
        data_clean = data_clean[
            (data_clean[col] >= lower_bound) & 
            (data_clean[col] <= upper_bound)
        ]

    return data_clean

# Remove the age outliers
df_no_outliers = remove_outliers_iqr(df_processed, ['age'])
print(f"Samples before removing outliers: {len(df_processed)}")
print(f"Samples after removing outliers: {len(df_no_outliers)}")

3.5.2 Capping Outliers

python

# Method 2: cap outliers to a reasonable range
def cap_outliers(data, column, lower_percentile=5, upper_percentile=95):
    """Clip values in `column` to the given percentile range; returns a new DataFrame."""
    data = data.copy()  # avoid modifying the caller's DataFrame in place
    lower_bound = data[column].quantile(lower_percentile / 100)
    upper_bound = data[column].quantile(upper_percentile / 100)

    data[column] = data[column].clip(lower_bound, upper_bound)
    return data

# Handle the negative credit scores
df_capped = cap_outliers(df_processed, 'credit_score', 1, 99)

print("Before and after capping:")
print(f"Credit score range before: [{df_processed['credit_score'].min():.2f}, {df_processed['credit_score'].max():.2f}]")
print(f"Credit score range after: [{df_capped['credit_score'].min():.2f}, {df_capped['credit_score'].max():.2f}]")
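
Percentile capping depends on how many bad values there are. When a value is impossible rather than merely extreme, such as a negative credit score, an alternative is to apply a domain rule: mark the value as missing and then impute it. The sketch below illustrates that idea on hypothetical scores. The valid range of 300 to 850 is an assumption made for this example, not something defined by the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical credit scores; two of them are impossible negative values.
scores = pd.Series([640.0, -35.0, 710.0, 580.0, -80.0], name="credit_score")

# Assumption for this sketch: valid scores lie in [300, 850].
valid = scores.between(300, 850)
scores_marked = scores.where(valid)  # invalid entries become NaN

# Fill the newly missing entries with the median of the valid ones.
scores_fixed = scores_marked.fillna(scores_marked.median())
print(scores_fixed.tolist())
```

This keeps every row while preventing impossible values from distorting later steps such as scaling. Whether imputing or dropping such rows is preferable depends on how much data can be spared.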

3.6 Feature Scaling

3.6.1 Standardization (StandardScaler)

python

# Select the numeric features to scale
numeric_features = ['age', 'income', 'education_years', 'credit_score']

# Standardization: mean 0, standard deviation 1
scaler_standard = StandardScaler()
df_standard = df_capped.copy()
df_standard[numeric_features] = scaler_standard.fit_transform(df_capped[numeric_features])

print("Statistics after standardization:")
print(df_standard[numeric_features].describe())
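
As a quick check of what standardization does, this sketch compares `StandardScaler` with the formula z = (x − mean) / std on five made-up numbers. It assumes nothing beyond NumPy and scikit-learn.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

x = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])

z_sklearn = StandardScaler().fit_transform(x)
# np.std defaults to ddof=0 (population standard deviation),
# which is what StandardScaler uses as well.
z_manual = (x - x.mean()) / x.std()

print(z_sklearn.ravel().round(3))
print(np.allclose(z_sklearn, z_manual))
```

One consequence is that `DataFrame.describe()` reports the sample standard deviation (ddof=1), so after standardization it shows a value slightly above 1 rather than exactly 1.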

3.6.2 Min-Max Scaling (MinMaxScaler)

python

# Min-max scaling: rescale each feature to the [0, 1] range
scaler_minmax = MinMaxScaler()
df_minmax = df_capped.copy()
df_minmax[numeric_features] = scaler_minmax.fit_transform(df_capped[numeric_features])

print("Statistics after min-max scaling:")
print(df_minmax[numeric_features].describe())
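
Min-max scaling follows (x − min) / (max − min), which makes it sensitive to extreme values. The sketch below uses a hypothetical column with one outlier to show how the remaining values get squeezed toward 0.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical values; 1000 is a deliberate outlier.
x = np.array([[10.0], [20.0], [30.0], [40.0], [1000.0]])

scaled = MinMaxScaler().fit_transform(x)
manual = (x - x.min()) / (x.max() - x.min())

print(scaled.ravel().round(3))
print(np.allclose(scaled, manual))
```

The four ordinary values all land below 0.05, so most of the [0, 1] range is spent on the single outlier. This is one reason to handle outliers before applying min-max scaling.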

3.6.3 Robust Scaling (RobustScaler)

python

# Robust scaling: uses the median and interquartile range, so it is less sensitive to outliers
scaler_robust = RobustScaler()
df_robust = df_capped.copy()
df_robust[numeric_features] = scaler_robust.fit_transform(df_capped[numeric_features])

print("Statistics after robust scaling:")
print(df_robust[numeric_features].describe())
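
With its default settings, `RobustScaler` computes (x − median) / IQR. The following sketch checks that on a small hypothetical column with one outlier, where the median and quartiles are easy to read off.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Hypothetical values; 100 is a deliberate outlier.
x = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

scaled = RobustScaler().fit_transform(x)

median = np.median(x)                    # 3.0
q1, q3 = np.percentile(x, [25, 75])      # 2.0 and 4.0
manual = (x - median) / (q3 - q1)

print(scaled.ravel())
```

The ordinary values map to -1, -0.5, 0 and 0.5, while the outlier stays large at 48.5. Robust scaling keeps the outlier from distorting the scale of the other points, but it does not remove or shrink the outlier itself.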

3.6.4 Comparing Scaling Methods

python

# Visualize the effect of the different scaling methods
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
fig.suptitle('Comparison of Scaling Methods', fontsize=16)

panels = [
    (axes[0, 0], df_capped, 'Original data'),
    (axes[0, 1], df_standard, 'Standardization'),
    (axes[1, 0], df_minmax, 'Min-max scaling'),
    (axes[1, 1], df_robust, 'Robust scaling'),
]

for ax, frame, title in panels:
    ax.boxplot([frame[col] for col in numeric_features])
    # Set tick labels separately: newer Matplotlib releases deprecate the
    # boxplot `labels` keyword in favor of `tick_labels`
    ax.set_xticklabels(numeric_features, rotation=45)
    ax.set_title(title)

plt.tight_layout()
plt.show()

3.7 Encoding Categorical Features

3.7.1 Label Encoding

python

# Label encoding: convert each category to an integer.
# Note: scikit-learn documents LabelEncoder as a tool for target labels;
# it is used on features here for illustration only.
df_encoded = df_standard.copy()

# Label-encode gender
le_gender = LabelEncoder()
df_encoded['gender_encoded'] = le_gender.fit_transform(df_encoded['gender'])

print("Gender label encoding:")
gender_mapping = dict(zip(le_gender.classes_, le_gender.transform(le_gender.classes_)))
print(gender_mapping)

# Label-encode city
le_city = LabelEncoder()
df_encoded['city_encoded'] = le_city.fit_transform(df_encoded['city'])

print("\nCity label encoding:")
city_mapping = dict(zip(le_city.classes_, le_city.transform(le_city.classes_)))
print(city_mapping)
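
The integer codes produced by `LabelEncoder` come from sorting the class names, not from any meaningful ranking. This small sketch uses the four city names from the sample dataset to show the mapping and how to reverse it.

```python
from sklearn.preprocessing import LabelEncoder

cities = ["Shenzhen", "Beijing", "Shanghai", "Guangzhou", "Beijing"]

le = LabelEncoder()
codes = le.fit_transform(cities)

print(list(le.classes_))                      # classes in sorted order
print(codes.tolist())                         # integer code for each input
print(le.inverse_transform([0, 3]).tolist())  # map codes back to names
```

Because the codes follow alphabetical order, a model that treats them as numbers would see "Shenzhen" (3) as greater than "Beijing" (0), which has no real-world meaning. That is the main reason one-hot encoding is usually preferred for nominal features in linear or distance-based models.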

3.7.2 One-Hot Encoding

python

# One-hot encoding: create one binary feature per category
df_onehot = df_standard.copy()

# One-hot encode with pandas
df_onehot = pd.get_dummies(df_onehot, columns=['gender', 'city'], prefix=['gender', 'city'])

print("Features after one-hot encoding:")
print(df_onehot.columns.tolist())
print(f"Number of features: {len(df_onehot.columns)}")

# Inspect the one-hot encoded columns
print("\nOne-hot encoding sample (first 5 rows):")
onehot_cols = [col for col in df_onehot.columns if col.startswith(('gender_', 'city_'))]
print(df_onehot[onehot_cols].head())

3.7.3 Comparing Encoding Methods

python

# Compare how the encoding method affects model performance.
# Note: loan_approved was generated independently of the features, so both
# accuracies should sit near the majority-class rate (about 0.7). The comparison
# illustrates the workflow rather than a real performance difference.
def compare_encoding_methods():
    """Compare label encoding with one-hot encoding."""

    # Prepare the data
    X_label = df_encoded[['age', 'income', 'education_years', 'credit_score', 'gender_encoded', 'city_encoded']]
    X_onehot = df_onehot.drop(['gender', 'city', 'loan_approved'], axis=1, errors='ignore')
    y = df_encoded['loan_approved']

    results = {}

    # Label encoding
    X_train, X_test, y_train, y_test = train_test_split(X_label, y, test_size=0.2, random_state=42)
    model_label = RandomForestClassifier(random_state=42)
    model_label.fit(X_train, y_train)
    acc_label = model_label.score(X_test, y_test)
    results['Label encoding'] = acc_label

    # One-hot encoding
    X_train, X_test, y_train, y_test = train_test_split(X_onehot, y, test_size=0.2, random_state=42)
    model_onehot = RandomForestClassifier(random_state=42)
    model_onehot.fit(X_train, y_train)
    acc_onehot = model_onehot.score(X_test, y_test)
    results['One-hot encoding'] = acc_onehot

    return results

encoding_results = compare_encoding_methods()
print("Encoding method comparison:")
for method, accuracy in encoding_results.items():
    print(f"{method}: {accuracy:.4f}")
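
Accuracy numbers are easier to interpret next to a trivial baseline. The sketch below is a self-contained illustration on synthetic data, not the chapter's dataset. Like the `loan_approved` column from section 3.2, the labels are drawn at random and are independent of the features, so a model cannot be expected to beat a classifier that always predicts the majority class.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                    # features carry no signal
y = rng.choice([0, 1], size=500, p=[0.3, 0.7])   # labels drawn independently of X

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Baseline: always predict the most frequent class seen in training.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

baseline_acc = baseline.score(X_test, y_test)
forest_acc = forest.score(X_test, y_test)
print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"forest accuracy:   {forest_acc:.3f}")
```

If two encodings give accuracies close to the baseline, the difference between them is more likely noise than evidence that one encoding is better.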

3.8 A Complete Preprocessing Pipeline

3.8.1 Building the Preprocessing Pipeline

python

from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer

def create_preprocessing_pipeline():
    """Build a complete data preprocessing pipeline."""

    # Define the numeric and categorical features
    numeric_features = ['age', 'income', 'education_years', 'credit_score']
    categorical_features = ['gender', 'city']

    # Pipeline for numeric features
    numeric_pipeline = Pipeline([
        ('imputer', SimpleImputer(strategy='median')),  # fill missing values
        ('scaler', StandardScaler())  # standardize
    ])

    # Pipeline for categorical features
    # (sparse_output requires scikit-learn 1.2+; older versions use sparse=False)
    categorical_pipeline = Pipeline([
        ('imputer', SimpleImputer(strategy='most_frequent')),  # fill missing values
        ('onehot', OneHotEncoder(drop='first', sparse_output=False))  # one-hot encode
    ])

    # Combine both pipelines
    preprocessor = ColumnTransformer([
        ('num', numeric_pipeline, numeric_features),
        ('cat', categorical_pipeline, categorical_features)
    ])

    return preprocessor

# Create and use the preprocessing pipeline
preprocessor = create_preprocessing_pipeline()

# Prepare the data
X = df[['age', 'income', 'education_years', 'credit_score', 'gender', 'city']]
y = df['loan_approved']

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit on the training set only, then apply the same transformation to the test set
X_train_processed = preprocessor.fit_transform(X_train)
X_test_processed = preprocessor.transform(X_test)

print(f"Training set shape after preprocessing: {X_train_processed.shape}")
print(f"Test set shape after preprocessing: {X_test_processed.shape}")

3.8.2 A Complete Machine Learning Pipeline

python

# Build a full pipeline that combines preprocessing and the model
full_pipeline = Pipeline([
    ('preprocessor', preprocessor),
    ('classifier', RandomForestClassifier(random_state=42))
])

# Train the model
full_pipeline.fit(X_train, y_train)

# Predict and evaluate
y_pred = full_pipeline.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"Full pipeline accuracy: {accuracy:.4f}")

# Rebuild the feature names (drop='first' removes the first category of each feature)
feature_names = (
    ['age', 'income', 'education_years', 'credit_score'] +  # numeric features
    [f'gender_{cat}' for cat in preprocessor.named_transformers_['cat']['onehot'].categories_[0][1:]] +  # gender features
    [f'city_{cat}' for cat in preprocessor.named_transformers_['cat']['onehot'].categories_[1][1:]]  # city features
)

print(f"\nFeatures after preprocessing: {feature_names}")

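3.9 Data Preprocessing Best Practices

3.9.1 Order of Operations

python

def preprocessing_best_practices():
    """Example of data preprocessing best practices."""

    print("Data preprocessing best practices:")
    print("1. Explore and understand the data")
    print("2. Handle duplicates")
    print("3. Handle missing values")
    print("4. Handle outliers")
    print("5. Encode features")
    print("6. Scale features")
    print("7. Select features (optional)")

    # Example: cleaning steps that come before the preprocessing pipeline
    df_clean = df.copy()

    # 1. Remove duplicate rows
    df_clean = df_clean.drop_duplicates()
    print(f"\nSamples after removing duplicates: {len(df_clean)}")

    # 2. Remove clearly invalid records
    df_clean = df_clean[df_clean['age'] > 0]  # age must be positive
    df_clean = df_clean[df_clean['credit_score'] >= 300]  # minimum credit score is 300
    print(f"Samples after removing invalid records: {len(df_clean)}")

    # 3. Select the features and target to feed into the preprocessing pipeline
    X_clean = df_clean[['age', 'income', 'education_years', 'credit_score', 'gender', 'city']]
    y_clean = df_clean['loan_approved']

    return X_clean, y_clean

X_clean, y_clean = preprocessing_best_practices()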
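
The sample dataset happens to contain no duplicate rows, so the deduplication step above has no visible effect. The following sketch uses a tiny hypothetical frame with one repeated record to show what `duplicated()` and `drop_duplicates()` do.

```python
import pandas as pd

# Hypothetical records; the third row repeats the first.
records = pd.DataFrame({
    "age": [25, 32, 25, 47],
    "city": ["Beijing", "Shanghai", "Beijing", "Shenzhen"],
})

n_dupes = int(records.duplicated().sum())  # rows identical to an earlier row
deduped = records.drop_duplicates().reset_index(drop=True)

print(f"duplicate rows: {n_dupes}")
print(f"rows after deduplication: {len(deduped)}")
```

By default both methods compare all columns and keep the first occurrence. Passing `subset=[...]` restricts the comparison to selected columns, which is useful when a record ID defines what counts as a duplicate.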

3.9.2 Avoiding Data Leakage

python

def avoid_data_leakage_example():
    """Show the correct way to avoid data leakage."""

    # Wrong: preprocess before splitting the data
    print("❌ Wrong approach:")
    X_wrong = df[['age', 'income', 'education_years', 'credit_score', 'gender', 'city']]
    y_wrong = df['loan_approved']

    # Fit the scaler on the entire dataset first (wrong!)
    scaler_wrong = StandardScaler()
    X_wrong_scaled = scaler_wrong.fit_transform(X_wrong.select_dtypes(include=[np.number]))

    # Then split the data
    X_train_wrong, X_test_wrong, y_train_wrong, y_test_wrong = train_test_split(
        X_wrong_scaled, y_wrong, test_size=0.2, random_state=42
    )

    print("This leaks data: statistics from the test set were used to preprocess the training set")

    # Right: split first, then fit the preprocessing on the training set only
    print("\n✅ Correct approach:")
    X_correct = df[['age', 'income', 'education_years', 'credit_score', 'gender', 'city']]
    y_correct = df['loan_approved']

    # Split the data first
    X_train_correct, X_test_correct, y_train_correct, y_test_correct = train_test_split(
        X_correct, y_correct, test_size=0.2, random_state=42
    )

    # Fit the preprocessor on the training set
    preprocessor_correct = create_preprocessing_pipeline()
    X_train_processed = preprocessor_correct.fit_transform(X_train_correct)

    # Apply the fitted preprocessor to the test set (no refitting)
    X_test_processed = preprocessor_correct.transform(X_test_correct)

    print("This avoids leakage: the test set stays fully independent of the fitting step")

avoid_data_leakage_example()
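
The same rule applies during cross-validation: the preprocessing has to be refit on each training fold. Wrapping the scaler and the model in one pipeline and passing that pipeline to `cross_val_score` handles this automatically. The sketch below uses synthetic data, with `LogisticRegression` as an arbitrary stand-in model chosen for this example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(loc=50, scale=10, size=(200, 3))
# Synthetic target that depends on the first feature plus noise.
y = (X[:, 0] + rng.normal(scale=5, size=200) > 50).astype(int)

# The scaler is part of the estimator, so each fold fits it on
# that fold's training portion only.
model = make_pipeline(StandardScaler(), LogisticRegression())
scores = cross_val_score(model, X, y, cv=5)

print(scores.round(3))
print(f"mean accuracy: {scores.mean():.3f}")
```

Scaling `X` once up front and then cross-validating the bare model would let every fold see statistics from its own validation rows, which is the leakage pattern described above.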

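3.10 Exercises

Exercise 1: Handling Missing Values

Create a dataset in which 30% of the values are missing
Compare the results of mean, median, most-frequent, and KNN imputation
Analyze which method suits your data best

Exercise 2: Outlier Detection

Implement outlier detection with the Z-score method
Compare how the IQR and Z-score methods differ
Visualize the outlier detection results

Exercise 3: Feature Scaling

Create a dataset whose features are on different scales
Compare model performance without scaling and with each scaling method
Analyze which scaling method suits which algorithms

Exercise 4: Encoding Methods

Create a dataset with a high-cardinality categorical feature
Compare the results of label encoding, one-hot encoding, and target encoding
Analyze how each encoding method affects model performance and training time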

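3.11 Summary

In this chapter, we covered:

Core Concepts

Data quality problems: missing values, outliers, differing scales
Why preprocessing matters: it improves model performance and stability
Avoiding data leakage: preprocessing at the right point in the workflow

Main Techniques

Missing value handling: SimpleImputer, KNNImputer
Outlier handling: the IQR method, capping
Feature scaling: StandardScaler, MinMaxScaler, RobustScaler
Categorical encoding: LabelEncoder, OneHotEncoder
Pipeline construction: Pipeline, ColumnTransformer

Best Practices

Split the data first, then fit the preprocessing on the training set
Choose an appropriate imputation strategy
Choose the scaling method to match the algorithm
Consider the cardinality of categorical features

Key Takeaways

Data preprocessing is key to machine learning success
Different preprocessing methods suit different scenarios
Pipelines help avoid data leakage and make code easier to reuse
Preprocessing decisions should be based on the characteristics of the data and an understanding of the business context

3.12 Next Steps

You now have the core data preprocessing skills. The next chapter, Linear Regression in Depth, covers the first major machine learning algorithm, linear regression, and shows how to predict continuous values.

Chapter recap:

✅ Identified and handled data quality problems
✅ Used Scikit-learn's preprocessing tools
✅ Understood when each preprocessing method applies
✅ Built preprocessing pipelines
✅ Learned best practices for avoiding data leakage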
