Quiz-Data-Analytics

Code to remove duplicate records from a dataset named df A. df.drop_duplicates()
Code to slice list to get the first 3 values holidays= ['Thanksgiving', 'Christmas', 'Halloween', 'Easter', 'Hanunukah', 'Cinco De Mayo', 'Kwanzaa'] A. holidays[0:4]
code to give number of rows and columns of data A. df.shape
Write the code to import the travelDestinations.csv dataset into a data frame.

Your Answer: import pandas as pd import numpy as np

location = "travelDestinations.csv" df = pd.read_csv(location) df

Code to list column names along with number of records with missing values in those columns A. df.isnull().sum()
What does the melt function do?

Your Answer: reshapes data from wide to long

7.What does the y-intercept tell you?

Your Answer: The dependent variable, also where the line intersects the y-axis

What can violin plots tell us?

Your Answer: The range of the dependent variable on the y-axis plotted as points and also the range of the independent variable along the x-axis plotted as points

We can have columns with string values when building a linear regression model. A. False
What can a correlation table tell you?

Your Answer: whether 2 variables have a positive correlation (when both variables near +1 and both variables near -1) or negative correlation (when one increases and the other decreases). Correlation coeffcient's >0.5 are good to look at.

Which library is this line of code from? lm.fit(X, df.price)

A. sklearn

Which library in Python is the seaborn library built on top of?

A. matplotlib

Which library do we need to create a boxplot as such: sns.boxplot(data=df)

A. seaborn

What library do we need in order to use decision trees?

A. sklearn

Which library do we need to import a spreadsheet?

A. pandas

df.head(50)

A. Your Answer: print out the first 50 rows of the table

What does this code do?

df.fillna('awesome')

Your Answer: Fills the word "awesome" in the space with missing values

What does this code do?

df.groupby(['gender'])['age'].mean() Your Answer: It groups by average for gender and age

What does this line of code output?

df.corr() Your Answer: correlation coefficient

How big is our training data set here?

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=90)

A. 75%

cnewton1428/Quiz-Data-Analytics

Quiz-Data-Analytics