DL-1

import pandas as pd: This line imports the pandas library and assigns it the alias pd. Pandas is a powerful library for data manipulation and analysis.

from sklearn.model_selection import train_test_split: This line imports the train_test_split function from the model_selection module in scikit-learn. This function is used to split datasets into random train and test subsets.

X = df.loc[:, df.columns != 'MEDV']: Here, the variable X is assigned the subset of the DataFrame df that contains all columns except the 'MEDV' column. This subset represents the input features of the dataset.

y = df.loc[:, df.columns == 'MEDV']: Similarly, the variable y is assigned the subset of the DataFrame df that contains only the 'MEDV' column. This subset represents the target variable we aim to predict.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=123): The train_test_split function is used to split the dataset into training and testing sets. It takes the input features X and the target variable y as its arguments. The test_size parameter is set to 0.3, indicating that 30% of the data will be used for testing, while the remaining 70% will be used for training. The random_state parameter is set to 123 to ensure reproducibility of the split. The function returns four sets of data: X_train (training set of input features), X_test (testing set of input features), y_train (training set of target variable), and y_test (testing set of target variable).
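
Putting the data-preparation steps together, a minimal sketch might look like the following. The DataFrame df is assumed to hold the Boston housing data with the target column 'MEDV'; the file name housing.csv is only a placeholder for however the notebook actually loads the data.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Load the dataset into a DataFrame (the file name is a placeholder;
# the notebook may load the Boston housing data differently).
df = pd.read_csv('housing.csv')

# Features: every column except the target 'MEDV'.
X = df.loc[:, df.columns != 'MEDV']
# Target: only the 'MEDV' column (median home value).
y = df.loc[:, df.columns == 'MEDV']

# 70/30 train/test split with a fixed seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=123)
```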

from sklearn.preprocessing import MinMaxScaler: This line imports the MinMaxScaler class from the preprocessing module in scikit-learn. The MinMaxScaler is a preprocessing technique used to scale numeric features to a specific range, typically between 0 and 1.

mms = MinMaxScaler(): Here, an instance of the MinMaxScaler class is created and assigned to the variable mms. This instance will be used to scale the data.

mms.fit(X_train): The fit method of the MinMaxScaler instance is called with X_train as the argument. This computes the minimum and maximum values of each feature in X_train and stores them as attributes in mms.

X_train = mms.transform(X_train): The transform method of the MinMaxScaler instance is applied to X_train. This scales the values of the features in X_train based on the minimum and maximum values computed in the previous step. The transformed values are then assigned back to X_train.

X_test = mms.transform(X_test): Similarly, the transform method is applied to X_test, scaling the features in the test set based on the minimum and maximum values computed from the training set. This ensures that the test set is scaled consistently with the training set.
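
The scaling steps described above can be collected in one place. Note that the scaler is fit only on the training data, so no information from the test set leaks into the preprocessing.

```python
from sklearn.preprocessing import MinMaxScaler

mms = MinMaxScaler()               # scales each feature to the [0, 1] range
mms.fit(X_train)                   # learn per-feature min and max from the training set only
X_train = mms.transform(X_train)   # scale the training features
X_test = mms.transform(X_test)     # scale the test features using the training statistics
```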

from tensorflow.keras.models import Sequential: This line imports the Sequential class from the models module in the keras package of TensorFlow. The Sequential class is a container that allows us to build models layer by layer.

from tensorflow.keras.layers import Dense: This line imports the Dense class from the layers module in the keras package of TensorFlow. The Dense class represents a fully connected layer in a neural network.

model = Sequential(): This line creates an instance of the Sequential class and assigns it to the variable model. This instance will be used to build the neural network model.

model.add(Dense(128, input_shape=(13, ), activation='relu', name='dense_1')): The add method of the model is used to add a Dense layer to the model. This specific layer has 128 units/neurons and takes an input shape of (13,), indicating that it expects input data with 13 features. The activation function used is ReLU (Rectified Linear Unit), which introduces non-linearity to the network. The name parameter assigns a name to this layer for later reference.

model.add(Dense(64, activation='relu', name='dense_2')): Another Dense layer is added to the model, this time with 64 units/neurons. It does not require an input shape since it follows the previous layer. The activation function used is again ReLU, and a name is assigned to this layer.

model.add(Dense(1, activation='linear', name='dense_output')): The final Dense layer is added, consisting of a single unit/neuron. It represents the output layer of the model. The linear activation leaves the weighted sum unchanged, so the layer can output any real value, which is what a continuous regression target like MEDV requires. A name is assigned to this layer as well.

model.compile(optimizer='adam', loss='mse', metrics=['mae']): The compile method of the model is called to configure the learning process. The 'adam' optimizer is used, which is a popular optimization algorithm. The loss function is set to 'mse' (mean squared error), which is commonly used for regression problems. Additionally, the 'mae' (mean absolute error) metric is specified to evaluate the model's performance.

model.summary(): This line prints a summary of the model, showing the layers, the number of parameters, and other useful information. It provides an overview of the model's architecture and helps to verify that it has been constructed correctly.
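
Collecting the model-building lines gives the following network definition; it mirrors the calls described above, with two hidden ReLU layers and a single linear output unit for the regression target.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
# First hidden layer: 128 units, expects 13 input features.
model.add(Dense(128, input_shape=(13,), activation='relu', name='dense_1'))
# Second hidden layer: 64 units.
model.add(Dense(64, activation='relu', name='dense_2'))
# Output layer: a single unit with linear activation for the continuous MEDV prediction.
model.add(Dense(1, activation='linear', name='dense_output'))

# Mean squared error loss with the Adam optimizer; track mean absolute error as a metric.
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.summary()
```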

X_train and y_train: These are the first two arguments passed to the model's fit method, which trains the network. X_train contains the input data samples, and y_train contains the corresponding target values.

epochs=100: This parameter specifies the number of times the entire training dataset will be passed through the neural network during training. In this case, the training process will iterate over the dataset 100 times.

validation_split=0.05: This parameter indicates the portion of the training data that will be used for validation during training. In this case, 5% of the training data will be set aside for validation, while the remaining 95% will be used for actual training.

verbose=1: This parameter controls the verbosity of the training process. A value of 1 indicates that training progress and performance metrics will be displayed during training, providing updates on the loss and metrics values for each epoch.
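
Those parameters belong to the call that trains the network. A sketch of that call is shown below; assigning the result to history is an assumption, made so the per-epoch loss and metric values remain available afterwards.

```python
# Train for 100 epochs, holding out 5% of the training data for validation.
# Keeping the returned History object is optional (an assumption here);
# it records the loss and MAE for each epoch.
history = model.fit(X_train, y_train,
                    epochs=100,
                    validation_split=0.05,
                    verbose=1)
```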

mse_nn, mae_nn = model.evaluate(X_test, y_test): The evaluate() method of the model is called with the test dataset (X_test and y_test) as input. This method evaluates the model's performance on the test data and computes the specified metrics. In this case, it returns the calculated MSE and MAE values, which are assigned to the variables mse_nn and mae_nn, respectively.

print('Mean squared error on test data: ', mse_nn): This line prints the MSE metric on the test data. The mse_nn variable contains the calculated MSE value, which is displayed in the output.

print('Mean absolute error on test data: ', mae_nn): This line prints the MAE metric on the test data. The mae_nn variable contains the calculated MAE value, which is displayed in the output.
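
Taken together, the evaluation step looks like this. Because the model was compiled with loss='mse' and metrics=['mae'], evaluate returns the two values in that order.

```python
# Evaluate on the held-out test set; returns [loss (MSE), MAE].
mse_nn, mae_nn = model.evaluate(X_test, y_test)

print('Mean squared error on test data: ', mse_nn)
print('Mean absolute error on test data: ', mae_nn)
```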