Release Date: May 21, 2024
This month’s release allows for a better experience getting started with SDV and creating more realistic data. We’ve also prioritized bug fixes that affected our SDV Enterprise customers.
↔️ Subset multi-table data while maintaining connections. If your multi-table dataset is too large, you can now use utility functions to sample a smaller subset for use with SDV. This feature ensures that the subsetting will maintain referential integrity – aka valid connections between your tables.
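To make "maintaining referential integrity" concrete, here is a minimal sketch in plain Python. It is an illustration of the idea only, not SDV's utility function: the `subset_with_integrity` helper, the table names, and the toy data are all hypothetical. It samples rows from a parent table, then keeps only the child rows whose foreign key points at a sampled parent.

```python
import random


def subset_with_integrity(parents, children, parent_key, foreign_key, num_rows, seed=0):
    """Sample parent rows, then keep only child rows that reference a sampled parent.

    Hypothetical helper for illustration; this is not part of SDV.
    """
    rng = random.Random(seed)
    sampled_parents = rng.sample(parents, min(num_rows, len(parents)))
    kept_ids = {row[parent_key] for row in sampled_parents}
    # Dropping children of unsampled parents avoids dangling foreign keys.
    kept_children = [row for row in children if row[foreign_key] in kept_ids]
    return sampled_parents, kept_children


# Toy data (made up): 10 users, each with exactly 3 orders.
users = [{"user_id": i} for i in range(10)]
orders = [{"order_id": 100 + i, "user_id": i % 10} for i in range(30)]

small_users, small_orders = subset_with_integrity(
    users, orders, parent_key="user_id", foreign_key="user_id", num_rows=3
)
```

Every order in `small_orders` refers to a user in `small_users`, so the smaller dataset still has valid connections between its tables.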
[Beta!] Read from & write to Excel sheets. If your data is available in a local Excel file, you can now import it directly and later export your synthetic data back into an Excel spreadsheet using the ExcelHandler.
Random ID generation. SDV Enterprise users now have access to a premium feature to completely randomize ID generation for primary or foreign keys. This makes your synthetic data look even more realistic than before.
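The notes above do not describe how SDV Enterprise implements this, so the following is only a rough sketch of the general idea under our own assumptions: give each primary key a fresh random value, then apply the same mapping to the foreign keys so the tables stay connected. The `randomize_ids` helper and the toy tables are hypothetical.

```python
import secrets


def randomize_ids(parents, children, parent_key, foreign_key, nbytes=4):
    """Replace primary keys with random hex IDs and remap foreign keys to match.

    Hypothetical helper for illustration; this is not the SDV Enterprise feature.
    """
    mapping = {}
    used = set()
    for row in parents:
        new_id = secrets.token_hex(nbytes)
        # Regenerate on the rare collision so primary keys stay unique.
        while new_id in used:
            new_id = secrets.token_hex(nbytes)
        used.add(new_id)
        mapping[row[parent_key]] = new_id

    new_parents = [{**row, parent_key: mapping[row[parent_key]]} for row in parents]
    # Reusing the same mapping keeps every foreign key pointing at a real parent.
    new_children = [{**row, foreign_key: mapping[row[foreign_key]]} for row in children]
    return new_parents, new_children


# Toy data (made up): sequential IDs that look obviously synthetic.
guests = [{"guest_id": i} for i in range(5)]
bookings = [{"booking_id": i, "guest_id": i % 5} for i in range(12)]

random_guests, random_bookings = randomize_ids(guests, bookings, "guest_id", "guest_id")
```

The resulting IDs no longer follow a sequential pattern, while each booking still refers to an existing guest.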
Additional updates
- Importing and exporting CSV data is now also more streamlined with the new CSVHandler.
- You can now use SDV with data where your column names are integers (0, 1, 2, …).
- We’ve fixed issues in the HSASynthesizer that caused it to print many warnings during sampling, or to crash if you had too few rows.