[Question] How do you evaluate your data?

neha · April 29, 2024, 10:03pm

This is a thread to discuss how to evaluate your synthetic data. The DataCebo team will provide some guidance and code snippets below.

neha · April 29, 2024, 10:03pm

From @Plamen:

Here is a code snippet for evaluating your synthetic data. run_diagnostic runs some basic checks to ensure that the synthetic data is usable, while evaluate_quality measures how well your synthetic data captures the mathematical properties of your real data.

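from sdv.evaluation.single_table import run_diagnostic, evaluate_quality

# your_real_data and your_synthetic_data are pandas DataFrames;
# your_metadata is the SDV metadata object describing the table.

# Basic checks that the synthetic data is usable
diagnostic_report = run_diagnostic(
    real_data=your_real_data,
    synthetic_data=your_synthetic_data,
    metadata=your_metadata)

# How well the synthetic data captures properties of the real data
quality_report = evaluate_quality(
    real_data=your_real_data,
    synthetic_data=your_synthetic_data,
    metadata=your_metadata)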
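
To give a feel for what "captures mathematical properties" can mean in practice, here is a rough standard-library sketch of two per-column similarity scores: a Kolmogorov-Smirnov complement for numerical columns and a total variation complement for categorical ones. As far as I understand, the quality report's column-level scores are based on metrics of this kind, but the function names and toy data below are my own illustration and not SDV's implementation.

```python
from bisect import bisect_right
from collections import Counter


def ks_complement(real, synthetic):
    """1 minus the largest gap between the two empirical CDFs (numerical data)."""
    real_sorted, synth_sorted = sorted(real), sorted(synthetic)
    max_gap = 0.0
    # The largest gap between two step-function CDFs occurs at an observed value.
    for x in set(real_sorted) | set(synth_sorted):
        cdf_real = bisect_right(real_sorted, x) / len(real_sorted)
        cdf_synth = bisect_right(synth_sorted, x) / len(synth_sorted)
        max_gap = max(max_gap, abs(cdf_real - cdf_synth))
    return 1.0 - max_gap


def tv_complement(real, synthetic):
    """1 minus the total variation distance between category frequencies."""
    freq_real, freq_synth = Counter(real), Counter(synthetic)
    categories = set(freq_real) | set(freq_synth)
    distance = 0.5 * sum(
        abs(freq_real[c] / len(real) - freq_synth[c] / len(synthetic))
        for c in categories
    )
    return 1.0 - distance


# Toy data (hypothetical): identical columns score 1.0, disjoint ones score 0.0.
print(ks_complement([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(ks_complement([1, 2], [3, 4]))              # 0.0
print(tv_complement(["a", "a", "b", "b"], ["a", "a", "a", "b"]))  # 0.75
```

Both scores range from 0 to 1, with higher meaning the synthetic column looks more like the real one.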

kalyan · April 12, 2024, 9:32am

@ashok.kumar.muthimen just highlighting this thread. Were you able to run the quality report to check the statistical quality of your synthetic data?

For convenience, we put the code snippet above.
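
Once you have a report object, a small helper like the one below can pull out the headline numbers. This is only a sketch: summarize_report is a hypothetical helper name, and it assumes the report exposes get_score(), get_properties(), and get_details(property_name=...), which is my understanding of recent SDMetrics report objects. Older versions may differ, so please check against the version you have installed.

```python
def summarize_report(report, property_name):
    """Collect the headline numbers from a report object.

    Assumes the report provides get_score(), get_properties(), and
    get_details(property_name=...); verify this for your installed version.
    """
    return {
        "overall_score": report.get_score(),        # single overall score
        "properties": report.get_properties(),      # one score per property
        "details": report.get_details(property_name=property_name),
    }


# Usage, assuming quality_report came from evaluate_quality(...) as in the
# snippet earlier in this thread:
# summary = summarize_report(quality_report, "Column Shapes")
# print(summary["overall_score"])
# print(summary["details"])
```

The per-property details are usually the most useful part, since they show which columns are pulling the overall score down.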
