What’s New?
We’ve significantly enhanced the SegmentSynthesizer. This specialty synthesizer recognizes that a single dataset can contain different segments, each with completely different patterns. Under the hood, it learns the patterns of each segment separately to ensure a high degree of realism in the synthetic data.
Algorithmically determine segments. You may know that your data contains segments without knowing what they are. By default, SegmentSynthesizer determines the segments algorithmically from your data.
Supply your own segments. If you know what the segments are, you can supply them. This is especially useful for ML training datasets, where each outcome that you want to predict (e.g., 0 vs. 1) is a different segment. You can also specify the algorithm to use under the hood for each segment.
# learn different patterns based on whether a user has made a purchase or not
synthesizer = SegmentSynthesizer(
    metadata,
    segmentation_params={
        'method': 'exact_values',
        'column_name': 'made_purchase'
    }
)
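To make the idea of per-segment learning more concrete, here is a minimal, library-free sketch. It is not how SegmentSynthesizer is implemented. It only illustrates what splitting on the exact values of `made_purchase` and learning a separate (deliberately trivial) pattern for each segment could look like. The toy rows, the `session_minutes` column, and all helper names are hypothetical.

```python
import random
import statistics
from collections import defaultdict

# Hypothetical toy data; only the 'made_purchase' column name comes from the example above.
rows = [
    {"made_purchase": 0, "session_minutes": 2.0},
    {"made_purchase": 0, "session_minutes": 3.0},
    {"made_purchase": 0, "session_minutes": 4.0},
    {"made_purchase": 1, "session_minutes": 10.0},
    {"made_purchase": 1, "session_minutes": 14.0},
]


def split_by_exact_values(rows, column_name):
    """Group rows into segments keyed by the exact value of one column."""
    segments = defaultdict(list)
    for row in rows:
        segments[row[column_name]].append(row)
    return dict(segments)


def fit_segment(segment_rows, numeric_column):
    """Learn a very simple 'pattern' for one segment: mean and population stdev."""
    values = [row[numeric_column] for row in segment_rows]
    return {"mean": statistics.mean(values), "stdev": statistics.pstdev(values)}


def sample_segment(model, segment_value, num_rows, rng):
    """Draw synthetic rows for one segment from its own learned pattern."""
    return [
        {
            "made_purchase": segment_value,
            "session_minutes": rng.gauss(model["mean"], model["stdev"]),
        }
        for _ in range(num_rows)
    ]


segments = split_by_exact_values(rows, "made_purchase")
# One independent model per segment, so purchasers and non-purchasers never mix.
models = {value: fit_segment(seg, "session_minutes") for value, seg in segments.items()}

rng = random.Random(0)
synthetic_purchasers = sample_segment(models[1], 1, num_rows=5, rng=rng)
```

A real synthesizer learns far richer patterns than a mean and a standard deviation, but the structure is the same: one model per segment, each fitted only on that segment's rows.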
Conditionally sample segments. Use the conditional sampling feature to sample from the different segments. This is especially useful for rebalancing imbalanced datasets.
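As a rough illustration of rebalancing, the sketch below works out how many extra rows each under-represented segment would need to match the largest one. The helper `rebalancing_plan` and the toy label counts are hypothetical. The comments at the end show how such a plan might be expressed with the `Condition` class and `sample_from_conditions` method from the public SDV library; that SegmentSynthesizer accepts them in exactly this form is an assumption here.

```python
from collections import Counter


def rebalancing_plan(labels):
    """Return how many extra rows each segment needs to match the largest segment."""
    counts = Counter(labels)
    target = max(counts.values())
    return {value: target - count for value, count in counts.items() if count < target}


# Hypothetical imbalanced data: 8 non-purchasers, 2 purchasers.
labels = [0] * 8 + [1] * 2
plan = rebalancing_plan(labels)

# Each plan entry becomes one conditional-sampling request.
requests = [
    {"num_rows": num_rows, "column_values": {"made_purchase": value}}
    for value, num_rows in plan.items()
]

# With SDV installed, the requests could be turned into conditions like this
# (assumption: SegmentSynthesizer supports the standard conditional sampling API):
#
#   from sdv.sampling import Condition
#   conditions = [Condition(**request) for request in requests]
#   extra_rows = synthesizer.sample_from_conditions(conditions=conditions)
```

Appending the extra rows to the original data would give both segments the same number of rows.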
Get Started
These release notes correspond to SDV Enterprise v0.40.0. The SegmentSynthesizer is available as part of the XSynthesizers bundle, an optional add-on to SDV Enterprise.