update for cluster config and template #941
base: main
Conversation
This change alone will not work - you need to deploy the sample e2e and test:
- The deployment itself (running deploy.sh): when the deployment tries to install the libraries on the cluster, they will fail on this version because DBFS will no longer be available on the cluster.
- The ADF pipelines: they will fail because the libraries will not be properly installed.

My suggestion is to deploy the solution with the Spark 15.4 version and start fixing the problems in the interface. Once you know what to change, integrate it into the automation, run the deployment e2e again, and make sure it is working before submitting the PR.
Also, please update the metadata in the PR properly and describe the Type of PR - you left all the bullets in. Include the validation steps and which issues will be closed or referenced when the PR closes. Right now the body of the PR looks just like the template. Thanks!
Documenting the issues @ydaponte listed:
- Create JSON file for library installation: `json_file="./databricks/config/libs.config.json"`. Databricks recommends using Unity Catalog or the workspace instead.
- All of the notebooks reference a DBFS location (still testing if this will work).
- That way, files and init scripts are in Unity Catalog (see the sketch after this exchange).
@thesqlpro - yes, precisely, that was a temporary solution until we have Unity Catalog + the new Spark version in place. I mentioned in the very first sprint that the DBFS file system will no longer be available with the 15.4 version.
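For illustration, a minimal sketch of what a DBFS-free library configuration could look like, written as a small Python script that emits the JSON. The file path comes from the comment above; the `libraries` structure mirrors the Databricks Libraries API shape, and the package name and `/Volumes/...` wheel path are placeholders, not the repo's actual values.

```python
import json
import os

# Hypothetical contents for ./databricks/config/libs.config.json.
# The "libraries" array mirrors the Databricks Libraries API shape; the
# package name and wheel path are placeholders for illustration only.
libs_config = {
    "libraries": [
        # PyPI packages are unaffected by DBFS going away.
        {"pypi": {"package": "example-package"}},
        # Custom wheels move from dbfs:/... to a Unity Catalog volume
        # (or a /Workspace/... file) so installation keeps working on 15.4.
        {"whl": "/Volumes/main/default/libs/example-1.0-py3-none-any.whl"},
    ]
}

# Write the config where the deploy script expects it.
os.makedirs("./databricks/config", exist_ok=True)
with open("./databricks/config/libs.config.json", "w") as config_file:
    json.dump(libs_config, config_file, indent=2)
```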
Type of PR
Template change for Databricks Cluster Configuration
Purpose
Update the cluster configuration and template file: newer Spark version and a change to auto-termination (reduced to 10 minutes).
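As a rough illustration of the fields involved (not the repo's actual template), a sketch assuming the 15.4 LTS runtime string. Only `spark_version` and `autotermination_minutes` correspond to the changes described here; the remaining values are placeholders.

```python
import json

# Illustrative cluster template fields. Only spark_version and
# autotermination_minutes reflect this PR; the rest are placeholders.
cluster_template = {
    "cluster_name": "example-cluster",      # placeholder name
    "spark_version": "15.4.x-scala2.12",    # assumed 15.4 LTS runtime string
    "node_type_id": "Standard_DS3_v2",      # placeholder node type
    "num_workers": 2,                       # placeholder worker count
    "autotermination_minutes": 10,          # reduced to 10 minutes
}

print(json.dumps(cluster_template, indent=2))
```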
Does this introduce a breaking change? If yes, details on what can break
Configurations and notebooks that reference DBFS (the internal Databricks file system). Investigation notes are in the comments on this PR.
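To make the breaking change concrete, a sketch of the kind of reference that has to move. A cluster init script declared with a `dbfs` destination stops working once DBFS is unavailable; a Unity Catalog volume (or workspace file) destination is the replacement. Both paths below are hypothetical.

```python
# Legacy init-script reference that breaks when DBFS is unavailable on the
# cluster (placeholder path, not taken from this repo):
legacy_init_scripts = [
    {"dbfs": {"destination": "dbfs:/databricks/scripts/configure-env.sh"}},
]

# Replacement using a Unity Catalog volume destination (placeholder path);
# {"workspace": {"destination": ...}} is the workspace-file equivalent.
unity_catalog_init_scripts = [
    {"volumes": {"destination": "/Volumes/main/default/init/configure-env.sh"}},
]
```

Notebook reads and writes against `dbfs:/...` paths need the same kind of migration to volume or workspace paths.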
Author pre-publish checklist
Issues Closed or Referenced