This file contains code from the demos in Cloud Academy's Running Spark on Azure Databricks course.
%fs ls
%fs ls databricks-datasets
%fs head --maxBytes=1000 dbfs:/databricks-datasets/Rdatasets/data-001/csv/Ecdat/Computers.csv
DROP TABLE IF EXISTS computers;
CREATE TABLE computers
USING csv
OPTIONS (path "/databricks-datasets/Rdatasets/data-001/csv/Ecdat/Computers.csv", header "true", inferSchema "true")
MNIST notebook: https://docs.databricks.com/_static/notebooks/decision-trees.html
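Print decision tree accuracy (assumes model and test are already defined by the MNIST notebook):
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
// Use the "accuracy" metric so the evaluated value matches the printed label
val evaluator = new MulticlassClassificationEvaluator().setLabelCol("indexedLabel").setMetricName("accuracy")
// Score the held-out test set with the trained model
val prediction = model.transform(test)
println(s"accuracy = ${evaluator.evaluate(prediction)}")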
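Accuracy and weighted precision are different metrics and can disagree on the same predictions, so the metric name passed to the evaluator matters. The following is a minimal sketch in plain Scala (no Spark required) of how the two could be computed by hand. The object name MetricSketch and the four sample (label, prediction) pairs are hypothetical and are not part of the course demos. Weighted precision is assumed here to mean per-class precision weighted by each class's share of the true labels.

```scala
// Hypothetical helper, not part of the course code: computes two
// classification metrics from (label, prediction) pairs by hand.
object MetricSketch {
  // Fraction of rows whose prediction equals the label
  def accuracy(pairs: Seq[(Double, Double)]): Double =
    pairs.count { case (label, pred) => label == pred }.toDouble / pairs.size

  // Per-class precision, weighted by each class's share of the true labels
  def weightedPrecision(pairs: Seq[(Double, Double)]): Double = {
    val classes = pairs.map(_._1).distinct
    classes.map { c =>
      val predictedAsC = pairs.count { case (_, pred) => pred == c }
      val correctAsC = pairs.count { case (label, pred) => label == c && pred == c }
      // Treat precision as 0 when the class is never predicted
      val precision = if (predictedAsC == 0) 0.0 else correctAsC.toDouble / predictedAsC
      val weight = pairs.count { case (label, _) => label == c }.toDouble / pairs.size
      weight * precision
    }.sum
  }

  // Made-up sample: three rows of class 0.0 (one misclassified) and one row of class 1.0
  val sample: Seq[(Double, Double)] = Seq((0.0, 0.0), (0.0, 0.0), (0.0, 1.0), (1.0, 1.0))

  def main(args: Array[String]): Unit = {
    println(s"accuracy = ${accuracy(sample)}")                   // accuracy = 0.75
    println(s"weightedPrecision = ${weightedPrecision(sample)}") // weightedPrecision = 0.875
  }
}
```

On this sample the two values differ (0.75 versus 0.875), which is why a number printed as "accuracy" should come from the accuracy metric.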
The archive file containing sample AzureML notebooks that was previously at https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/azure-databricks/Databricks_AMLSDK_1-4_6.dbc is no longer available. You can now find the individual sample notebooks at https://github.com/cloudacademy/azure-databricks/tree/master/amlsdk.
Azure Databricks documentation: https://docs.azuredatabricks.net/
Support: support@cloudacademy.com