# Run model diagnostics
After the model is built, you must assess convergence, debug the model if needed, and then assess the model fit.
Assess convergence
------------------

You assess the model convergence to help ensure the integrity of your model.
The `plot_rhat_boxplot` command under `visualizer.ModelDiagnostics()` summarizes and calculates the Gelman & Rubin (1992) potential scale reduction for chain convergence, commonly referred to as R-hat. This convergence diagnostic measures the degree to which the variance (of the means) between the chains exceeds what you would expect if the chains were identically distributed.
There is a single R-hat value for each model parameter. The box plot summarizes the distribution of R-hat values across indexes. For example, the box corresponding to the `beta_gm` x-axis label summarizes the distribution of R-hat values across both the geo index `g` and the channel index `m`.
Values close to 1.0 indicate convergence. R-hat < 1.2 indicates approximate convergence and is a reasonable threshold for many problems (Brooks & Gelman, 1998). A lack of convergence typically has one of two culprits: either the model is poorly specified for the data, which can be in the likelihood (model specification) or in the prior, or there is not enough burn-in, meaning `n_adapt + n_burnin` is not large enough.
If you have difficulty getting convergence, see Getting MCMC convergence.
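For intuition about what the R-hat diagnostic measures, the following pure-Python sketch computes the Gelman & Rubin statistic on synthetic chains. Meridian computes this internally over real MCMC samples; the `rhat` helper and the toy chains here are illustrative, not Meridian APIs:

```python
import random

def rhat(chains):
    """Gelman & Rubin potential scale reduction for equal-length chains."""
    m = len(chains)      # number of chains
    n = len(chains[0])   # draws per chain
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    # Between-chain variance of the chain means.
    b = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    # Average within-chain sample variance.
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    var_hat = (n - 1) / n * w + b / n
    return (var_hat / w) ** 0.5

random.seed(0)
# Chains sampling the same distribution: R-hat close to 1.0.
mixed = [[random.gauss(0, 1) for _ in range(1000)] for _ in range(2)]
# Chains stuck around different values: R-hat well above 1.2.
stuck = [[random.gauss(0, 1) for _ in range(1000)],
         [random.gauss(5, 1) for _ in range(1000)]]
print(rhat(mixed))  # close to 1.0
print(rhat(stuck))  # well above the 1.2 threshold
```

When the chains disagree, the between-chain variance `b` inflates the pooled variance estimate relative to the within-chain variance `w`, which pushes R-hat above 1.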
### Generate an R-hat boxplot
Run the following commands to generate an R-hat boxplot:
model_diagnostics = visualizer.ModelDiagnostics(meridian)
model_diagnostics.plot_rhat_boxplot()
**Example output:**

### Generate trace and density plots
You can generate trace and density plots for Markov Chain Monte Carlo (MCMC) samples to help assess convergence and stability across chains. Each trace in the trace plot represents the sequence of values generated by the MCMC algorithm as it explores the parameter space. It shows how the algorithm moves through different values of the parameters over successive iterations. In the trace plots, try to avoid flat areas, where the chain stays in the same state for too long or takes too many consecutive steps in one direction.
The density plots on the left visualize the density distribution of sampled values for one or more parameters obtained through the MCMC algorithm. In the density plots, you want to see that the chains have converged to a stable density distribution.
The following example shows how to generate trace and density plots:
import arviz as az

parameters_to_plot = ["roi_m"]
for params in parameters_to_plot:
  az.plot_trace(
      meridian.inference_data,
      var_names=params,
      compact=False,
      backend_kwargs={"constrained_layout": True},
  )
**Example output:**

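The visual check for flat regions can also be approximated numerically. The sketch below is a hypothetical helper (not part of Meridian or ArviZ) that flags the longest run of consecutive identical draws in a chain, which is what a flat stretch in a trace plot corresponds to:

```python
def longest_flat_run(chain):
    """Length of the longest run of consecutive identical states."""
    longest = run = 1
    for prev, cur in zip(chain, chain[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

moving = [0.1, 0.3, -0.2, 0.4, 0.0, 0.2]
stuck = [0.1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.2]
print(longest_flat_run(moving))  # 1: the chain never repeats a state
print(longest_flat_run(stuck))   # 5: a flat region worth investigating
```

A long flat run suggests the sampler repeatedly rejected proposals at that point, which is exactly the flat area the text above warns about.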
Check prior and posterior distributions
---------------------------------------
When there is little information in the data, the prior and the posterior will be similar. For more information, see When the posterior is the same as the prior.
Channels with low spend are particularly susceptible to having an ROI posterior similar to the ROI prior. To remediate the issue, we recommend either dropping the channels with very low spend or combining them with other channels when preparing the data for MMM.
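As a sketch of the combining step, the following example merges channels whose share of total spend falls below a threshold into a single combined channel before the data is handed to the MMM. The `spend` dictionary, the 1% cutoff, and the "other" label are all illustrative assumptions, not Meridian requirements:

```python
# Illustrative channel spend; in practice this comes from your MMM input data.
spend = {"tv": 120000.0, "search": 80000.0,
         "podcast": 900.0, "newsletter": 400.0}

threshold = 0.01  # merge channels under 1% of total spend (assumed cutoff)
total = sum(spend.values())
kept = {ch: s for ch, s in spend.items() if s / total >= threshold}
merged = sum(s for ch, s in spend.items() if s / total < threshold)
if merged:
    kept["other"] = merged
print(kept)  # {'tv': 120000.0, 'search': 80000.0, 'other': 1300.0}
```

The same aggregation would need to be applied to the corresponding media execution columns so that spend and media stay aligned per channel.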
Run the following commands to plot the ROI posterior distribution against the ROI prior distribution for each media channel:
model_diagnostics = visualizer.ModelDiagnostics(meridian)
model_diagnostics.plot_prior_and_posterior_distribution()
**Example output:** (*Click the image to enlarge.*)

By default, `plot_prior_and_posterior_distribution()` generates the posterior and prior for ROI. However, you can pass specific model parameters to `plot_prior_and_posterior_distribution()`, as shown in the following example:
model_diagnostics.plot_prior_and_posterior_distribution('beta_m')
Assess model fit
----------------

After your model convergence is optimized, assess the model fit. For more information, see Assess the model fit in *Post-modeling*.
With marketing mix modeling (MMM), you must rely on indirect measures to assess causal inference and look for results that make sense. Two good ways to do this are:

- Running metrics for R-squared, Mean Absolute Percentage Error (MAPE), and Weighted Mean Absolute Percentage Error (wMAPE)
- Generating plots for expected versus actual revenue or KPI, contingent upon the `kpi_type` and the availability of `revenue_per_kpi`

**Note:** wMAPE is weighted by actual revenue.
### Run R-squared, MAPE, and wMAPE metrics
Goodness-of-fit metrics can be used as a confidence check that the model structure is appropriate and not overparameterized. `ModelDiagnostics` calculates the `R-Squared`, `MAPE`, and `wMAPE` goodness-of-fit metrics. If `holdout_id` is set in Meridian, then `R-squared`, `MAPE`, and `wMAPE` are also calculated for the `Train` and `Test` subsets. Be aware that goodness-of-fit metrics are a measure of predictive accuracy, which is not typically the goal of an MMM. However, these metrics still serve as a useful confidence check.
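For intuition about what these metrics report, here is a pure-Python sketch of the three computations on toy actual/expected series. The formulas follow their standard definitions, with wMAPE weighted by actual values; the data is illustrative:

```python
def r_squared(actual, expected):
    """1 minus the ratio of residual to total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, expected))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def mape(actual, expected):
    """Mean absolute percentage error."""
    return sum(abs(a - e) / abs(a) for a, e in zip(actual, expected)) / len(actual)

def wmape(actual, expected):
    """MAPE weighted by actual values, so large periods count more."""
    return sum(abs(a - e) for a, e in zip(actual, expected)) / sum(actual)

actual = [100.0, 200.0, 400.0]
expected = [110.0, 190.0, 380.0]
print(round(mape(actual, expected), 4))   # 0.0667
print(round(wmape(actual, expected), 4))  # 0.0571
```

Note how wMAPE is lower than MAPE here: the largest period (400.0) has the smallest relative error, and wMAPE lets it dominate.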
Run the following commands to generate R-squared, MAPE, and wMAPE metrics:
model_diagnostics = visualizer.ModelDiagnostics(meridian)
model_diagnostics.predictive_accuracy_table()
**Example output:**
### Generate expected versus actual plots
Using expected versus actual plots can be helpful as an indirect way to assess the model fit.
#### National: Expected versus actual plots
You can plot the actual revenue (or KPI) alongside the model's expected revenue (or KPI) at the national level to help assess the model fit. *Baseline* is the model's counterfactual estimate of revenue (or KPI) if there were no media execution. Estimating revenue as close to actual revenue as possible isn't necessarily the goal of an MMM, but it does serve as a useful confidence check.
Run the following commands to plot actual revenue (or KPI) versus expected revenue (or KPI) for national data:
model_fit = visualizer.ModelFit(meridian)
model_fit.plot_model_fit()
**Example output:**

#### Geo: Expected versus actual plots
You can create expected versus actual plots at the geo level to help assess the model fit. Because there can be many geos, you might want to show only the largest geos.
Run the following commands to plot actual revenue (or KPI) versus expected revenue (or KPI) for geo-level data:
model_fit = visualizer.ModelFit(meridian)
model_fit.plot_model_fit(
    n_top_largest_geos=2,
    show_geo_level=True,
    include_baseline=False,
    include_ci=False,
)
**Example output:**

After you are satisfied with your model fit, analyze the model results.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated (UTC): 2025-08-04.