[Question] How do you evaluate your data?

This is a thread to discuss how to evaluate your synthetic data. The DataCebo team will provide some guidance and code snippets below.


From @Plamen:

Here is a code snippet showing how to evaluate your synthetic data. run_diagnostic runs some basic checks to ensure that the synthetic data is usable, while evaluate_quality measures how well your synthetic data captures the mathematical properties of your real data.

from sdv.evaluation.single_table import run_diagnostic, evaluate_quality

# Basic checks that the synthetic data is usable
diagnostic_report = run_diagnostic(
    real_data=your_real_data,
    synthetic_data=your_synthetic_data,
    metadata=your_metadata)

# How well the synthetic data captures the mathematical properties of the real data
quality_report = evaluate_quality(
    real_data=your_real_data,
    synthetic_data=your_synthetic_data,
    metadata=your_metadata)
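
Once you have a report, you will probably want to look at the numbers behind it. Here is a minimal sketch of how that could look. It assumes a recent SDV version in which the object returned by evaluate_quality exposes get_score() and get_details(property_name=...), and that "Column Shapes" is one of the reported properties. The helper name summarize_quality is hypothetical and only used for illustration.

```python
def summarize_quality(quality_report, property_name="Column Shapes"):
    """Return the overall score and the detailed breakdown for one property.

    Assumption: `quality_report` is the object returned by evaluate_quality
    and exposes get_score() and get_details(property_name=...).
    """
    # Overall score, expected to be a number between 0 and 1
    overall = quality_report.get_score()
    # Breakdown for the chosen property (assumed to be one row per column)
    details = quality_report.get_details(property_name=property_name)
    return overall, details


# Usage with the quality_report created in the snippet above:
# overall, details = summarize_quality(quality_report)
# print(f"Overall quality score: {overall:.2%}")
# print(details)
```

If your version of SDV uses different method or property names, adjust the two calls inside the helper accordingly.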

@ashok.kumar.muthimen just highlighting this thread. Are you able to run the quality report to check the statistical quality of your synthetic data?

For convenience, we have put the code snippet above.