We are currently working on models for CI/CD inside a Databricks environment. When we run scripts in Databricks, we need to initialize the environment by installing the various libraries. On our team we often install the model workflow from our GitHub repo using a .pkl file, and then run pip install -r requirements.txt for the libraries. However, I see that installing the enterprise version of SDV requires a username and password. From some basic testing, it's not entirely clear how to do this in Databricks.
Are there any command-line arguments available, e.g. pip install sdv.enterprise --username --password? If you have any guidance on getting this working in a Databricks environment, that would be very helpful.
@giovanni.circo, am I correct in understanding your workflow as follows?
- You have developed a synthetic model and saved it as a pickle file.
- You have a CI/CD pipeline in Databricks in which you want to install a fresh copy of SDV Enterprise each time a CI/CD action is triggered, and then sample from the .pkl model.
On a related note, have you already built an SDV model successfully?
Yes, I have already created and used multiple SDV models locally. My primary question is: what is the suggested method to install and run the enterprise version inside a Databricks environment?
If the above workflow description is correct, you can install SDV Enterprise in Databricks with the following command:
python -m pip install sdv_enterprise --index-url https://EMAIL_ESCAPED:LICENSE_KEY@pypi.datacebo.com --timeout 600
Please replace EMAIL_ESCAPED with your email address in URL-encoded (percent-escaped) form, and LICENSE_KEY with your actual license key.
For example, with gaurav@datacebo.com:
python -m pip install sdv_enterprise --index-url https://gaurav%40datacebo.com:LICENSE_KEY@pypi.datacebo.com --timeout 600
- Notice the @ symbol was changed to %40
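If you would rather not escape the email by hand in a CI/CD script, the escaping can be done with Python's standard library. The following is only a minimal sketch: the helper names build_index_url and build_pip_command are hypothetical (they are not part of SDV or pip), and it assumes the package name and index host shown in the command above.

```python
from urllib.parse import quote


def build_index_url(email: str, license_key: str, host: str = "pypi.datacebo.com") -> str:
    """Return a pip index URL with the credentials percent-encoded.

    Hypothetical helper for illustration; not part of SDV or pip.
    """
    # safe="" escapes every reserved character, so "@" becomes "%40".
    user = quote(email, safe="")
    key = quote(license_key, safe="")
    return f"https://{user}:{key}@{host}"


def build_pip_command(email: str, license_key: str) -> list[str]:
    """Assemble the install command from this thread as an argument list."""
    return [
        "python", "-m", "pip", "install", "sdv_enterprise",
        "--index-url", build_index_url(email, license_key),
        "--timeout", "600",
    ]


if __name__ == "__main__":
    # Placeholder values only; in a real pipeline, read the license key
    # from a secret store rather than hard-coding it.
    print(" ".join(build_pip_command("gaurav@datacebo.com", "LICENSE_KEY")))
```

The argument list can be passed to subprocess.run in a job script. Keep in mind that printing the command also prints the license key, so avoid echoing it in real pipeline logs.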
Tested this in our Databricks environment, and it is working. Thank you!
@giovanni.circo glad to hear that. I am closing this discussion.
Feel free to open a new discussion if you run into any issues.