Hi @neha, I'm facing an issue while creating a dataframe from synthesized data.
hsas = HSASynthesizer(metadata)
hsas.fit(data)
Traceback (most recent call last):
File "", line 1, in
File "/sasdata/python3.8/lib/python3.8/site-packages/sdv/multi_table/base.py", line 385, in fit
processed_data = self.preprocess(data)
File "packaging/sdv_enterprise/sdv/multi_table/hsa/hsa.pyx", line 29, in sdv_enterprise.sdv.multi_table.hsa.hsa.expirable.wrapper
File "packaging/sdv_enterprise/sdv/multi_table/hsa/hsa.pyx", line 89, in sdv_enterprise.sdv.multi_table.hsa.hsa.HSASynthesizer.preprocess
File "/sasdata/python3.8/lib/python3.8/site-packages/sdv/multi_table/base.py", line 338, in preprocess
processed_data[table_name] = synthesizer._preprocess(table_data)
File "/sasdata/python3.8/lib/python3.8/site-packages/sdv/single_table/base.py", line 347, in _preprocess
self._data_processor.fit(data)
File "/sasdata/python3.8/lib/python3.8/site-packages/sdv/data_processing/data_processor.py", line 754, in fit
self._fit_hyper_transformer(constrained)
File "/sasdata/python3.8/lib/python3.8/site-packages/sdv/data_processing/data_processor.py", line 669, in _fit_hyper_transformer
self._hyper_transformer.fit(data)
File "/sasdata/python3.8/lib/python3.8/site-packages/rdt/hyper_transformer.py", line 749, in fit
data = self._fit_field_transformer(data, field, self.field_transformers[field])
File "/sasdata/python3.8/lib/python3.8/site-packages/rdt/hyper_transformer.py", line 671, in _fit_field_transformer
transformer.fit(data, field)
File "/sasdata/python3.8/lib/python3.8/site-packages/rdt/transformers/base.py", line 55, in wrapper
return function(self, *args, **kwargs)
File "/sasdata/python3.8/lib/python3.8/site-packages/rdt/transformers/base.py", line 390, in fit
self._fit(columns_data)
File "packaging/sdv_enterprise/rdt/transformers/email/email.pyx", line 189, in sdv_enterprise.rdt.transformers.email.email.DomainBasedAnonymizer._fit
File "packaging/sdv_enterprise/rdt/transformers/email/utils.pyx", line 50, in sdv_enterprise.rdt.transformers.email.utils.validate_email_address
rdt.errors.InvalidDataError: The input data must be email addresses. Data contains ('Y', 'Y', + 887 more).
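In case it helps narrow this down, here is a minimal sketch of how I could check which values in a column fail a basic email check. The column name `contact_email`, the sample data, and the helper `find_non_email_values` are all hypothetical, and the regular expression is only a rough approximation, not the validation that rdt actually performs.

```python
import pandas as pd

# Loose pattern: something@something.something, with no spaces.
# This is an assumption for illustration, not rdt's internal check.
EMAIL_PATTERN = r"^[^@\s]+@[^@\s]+\.[^@\s]+$"


def find_non_email_values(table, column):
    """Return the distinct non-null values in `column` that do not look like emails."""
    values = table[column].dropna().astype(str)
    looks_like_email = values.str.match(EMAIL_PATTERN)
    return sorted(values[~looks_like_email].unique())


# Hypothetical sample data standing in for one of my real tables.
sample = pd.DataFrame({
    "contact_email": ["a@example.com", "Y", "Y", None, "b@example.org"],
})

bad_values = find_non_email_values(sample, "contact_email")
print(bad_values)
```

Running this against each column that the metadata marks as an email should show whether a flag-style column (values like 'Y') was detected as an email column by mistake.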
hsas.save('/saswork/sample_single.pkl')
synthesizer = HSASynthesizer.load('/saswork/sample_single.pkl')
synthetic_data = synthesizer.sample(scale=1)
Traceback (most recent call last):
File "", line 1, in
File "packaging/sdv_enterprise/sdv/multi_table/hsa/hsa.pyx", line 29, in sdv_enterprise.sdv.multi_table.hsa.hsa.expirable.wrapper
File "packaging/sdv_enterprise/sdv/multi_table/hsa/hsa.pyx", line 113, in sdv_enterprise.sdv.multi_table.hsa.hsa.HSASynthesizer.sample
File "/sasdata/python3.8/lib/python3.8/site-packages/sdv/multi_table/base.py", line 414, in sample
sampled_data = self._sample(scale=scale)
File "/sasdata/python3.8/lib/python3.8/site-packages/sdv/sampling/independent_sampler.py", line 148, in _sample
num_rows = int(self._table_sizes[table] * scale)
KeyError: 'master'