PREVIEW capability

Automated ML now supports Azure Databricks as a local compute to perform training (public preview). Azure Databricks is a managed Spark offering on Azure, and customers already use it for advanced analytics. It provides a collaborative, notebook-based environment with CPU- or GPU-based compute clusters.

  • Customers who use Azure Databricks for advanced analytics can now use the same cluster to run automated machine learning experiments.
  • You can keep the data within the same cluster.
  • You can leverage the local worker nodes with autoscale and auto termination capabilities.
  • You can use multiple cores of your Azure Databricks cluster to perform simultaneous training.
  • You can further tune the model generated by automated machine learning if you choose to.
  • Every run (including the best run) is available as a pipeline.
  • The model from the pipeline can be registered in an Azure Machine Learning workspace and then deployed to Azure managed compute (ACI or AKS) using the Azure Machine Learning SDK.

Create Azure Databricks Cluster:

Select New Cluster and fill in the following details:

  • Cluster name: yourclustername
  • Cluster Mode: Any. High Concurrency preferred
  • Databricks Runtime: Any 4.x runtime.
  • Python version: 3
  • Workers: 2 or higher.
  • The maximum number of concurrent iterations in the Automated ML settings must be less than or equal to the number of worker nodes in your Databricks cluster.
  • Worker node VM types: Memory optimized VM preferred.
  • Uncheck Enable Autoscaling

It will take a few minutes to create the cluster. Please ensure that the cluster state is Running before proceeding further.
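If you prefer to script the cluster instead of using the UI, the settings above can be sketched as a request body for the Databricks Clusters API. This is a minimal sketch, not part of the original instructions: the helper name `build_cluster_spec` is hypothetical, and the default `spark_version` and `node_type_id` values are assumptions standing in for "any 4.x runtime" and "a memory optimized VM". It does not set the cluster mode.

```python
import json


def build_cluster_spec(name, num_workers,
                       spark_version="4.3.x-scala2.11",   # assumed example of a 4.x runtime
                       node_type_id="Standard_DS13_v2"):  # assumed memory optimized VM size
    """Return a Clusters API style body that mirrors the UI settings above."""
    if num_workers < 2:
        raise ValueError("Workers should be 2 or higher")
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        # A fixed worker count with no "autoscale" block corresponds to
        # leaving Enable Autoscaling unchecked.
        "num_workers": num_workers,
        # Selects Python 3 on the cluster.
        "spark_env_vars": {"PYSPARK_PYTHON": "/databricks/python3/bin/python3"},
    }


spec = build_cluster_spec("yourclustername", num_workers=2)
print(json.dumps(spec, indent=2))
```

The printed JSON is only the request body; sending it to your workspace requires an authenticated call that is not shown here.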

Install Azure ML with Automated ML SDK on your Azure Databricks cluster

  • Select Import library

  • Source: Upload Python Egg or PyPI

  • PyPi Name: azureml-sdk[automl_databricks]

  • Click Install Library

  • Do not select Attach automatically to all clusters. If you selected it earlier, go to your Home folder and deselect it.

  • Select the check box Attach next to your cluster name

(More details on how to attach and detach libraries are here: https://docs.databricks.com/user-guide/libraries.html#attach-a-library-to-a-cluster )

  • Ensure that no errors appear before the Status changes to Attached. It may take a couple of minutes.

Note - If you have an old build installed, deselect it from the cluster’s installed libs and move it to trash. Install the new build and restart the cluster. If there is still an issue, detach and reattach your cluster.
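The same PyPI install can also be expressed as a request body for the Databricks Libraries API, which may help if you automate cluster setup. This is a rough sketch under assumptions: the helper name `build_install_request` and the cluster ID value are hypothetical placeholders, and only the package name comes from the steps above.

```python
import json

# Package name taken from the install steps above.
AUTOML_PACKAGE = "azureml-sdk[automl_databricks]"


def build_install_request(cluster_id):
    """Return a Libraries API style body that installs the SDK on one cluster."""
    if not cluster_id:
        raise ValueError("cluster_id is required")
    return {
        # Targeting a single cluster matches the advice not to attach the
        # library automatically to all clusters.
        "cluster_id": cluster_id,
        "libraries": [{"pypi": {"package": AUTOML_PACKAGE}}],
    }


# "your-cluster-id" is a placeholder, not a real cluster identifier.
print(json.dumps(build_install_request("your-cluster-id"), indent=2))
```

As with the UI flow, check the library status on the cluster afterwards and confirm there are no errors before running a notebook.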

Now you can run the Automated ML sample notebook on your Azure Databricks cluster. Please let us know your feedback.
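When you adapt the sample notebook, keep the concurrency limit from the cluster settings in mind. The sketch below is one way to derive settings that respect it; the helper name `automl_databricks_settings` is hypothetical, and it assumes the resulting dictionary is passed as keyword arguments to `AutoMLConfig` along with the task and data arguments from the sample notebook.

```python
def automl_databricks_settings(num_workers, requested_concurrency, spark_context=None):
    """Build Automated ML keyword arguments that respect the worker count."""
    if num_workers < 2:
        raise ValueError("the cluster should have 2 or more workers")
    # Clamp so concurrent iterations never exceed the number of worker nodes.
    concurrency = max(1, min(requested_concurrency, num_workers))
    return {
        "max_concurrent_iterations": concurrency,
        # In a Databricks notebook this would be the predefined `sc` object.
        "spark_context": spark_context,
    }


settings = automl_databricks_settings(num_workers=4, requested_concurrency=8)
print(settings["max_concurrent_iterations"])  # prints 4
```

In a notebook with the SDK attached, these settings would typically be unpacked into the configuration object, for example `AutoMLConfig(task="classification", **settings)` plus the remaining arguments the sample notebook supplies.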