more info in:

PandasOverviewCommands

Contents

1. Installation
- 1.1 How to install
- 1.2 How to import
2. Exploring data: methods and attributes
3. Indexing and selection
4. Data cleaning
5. Data manipulation

1. Installation

1.1 How to install

Install Pandas: You can install Pandas from your command prompt or terminal with either pip or conda. Run whichever command matches your environment:

pip install pandas
conda install pandas

1.2 How to import

Import Pandas: To use Pandas in your Python script, you first need to import it:

import pandas as pd
pd is the conventional alias for Pandas, and the examples in this overview use it.

2. Exploring data: methods and attributes

Load Data: You can load data into a Pandas DataFrame using various methods, such as read_csv(), read_excel(), read_sql(), etc. Here's an example using read_csv():

df = pd.read_csv('data.csv')
This will load the data from the data.csv file into a Pandas DataFrame called df.
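
To try read_csv() without a file on disk, here is a minimal sketch that reads the same kind of data from an in-memory string. The CSV content and its column names (name, age, city) are made up for illustration and stand in for data.csv:

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for data.csv
csv_text = """name,age,city
Ana,34,Lisbon
Ben,28,Porto
Cara,41,Faro
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)             # (number of rows, number of columns)
print(df.columns.tolist())  # column names taken from the header line
```

Passing a path such as 'data.csv' instead of the StringIO object works the same way.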

Exploring Data: You can explore the data in your DataFrame using various methods and attributes. Some useful ones are:

df.head() # display the first 5 rows (pass a number to change the count)
df.tail() # display the last 5 rows (pass a number to change the count)
df.info() # display column names, dtypes, non-null counts, and memory usage
df.describe() # display summary statistics for the numeric columns
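
The following sketch runs these methods on a small DataFrame so the results are easy to check by eye. The product, price, and stock columns are invented for the example:

```python
import pandas as pd

# Hypothetical data, used only to illustrate the exploration methods
df = pd.DataFrame({
    "product": ["pen", "book", "lamp", "desk", "chair", "mug"],
    "price": [1.5, 12.0, 30.0, 150.0, 85.0, 6.5],
    "stock": [100, 40, 15, 3, 7, 60],
})

first_two = df.head(2)   # first 2 rows instead of the default 5
last_two = df.tail(2)    # last 2 rows
summary = df.describe()  # count, mean, std, min, quartiles, max

df.info()                # prints dtypes and non-null counts
print(summary.loc["mean"])
```

Note that describe() skips the text column by default, so summary only has the price and stock columns.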

3. Indexing and selection

Indexing and Selection: You can select data from your DataFrame using various methods, such as indexing by position, indexing by label, boolean indexing, etc. Here are some examples:

df.iloc[0] # select the first row by position
df.loc[0] # select the row whose index label is 0
df[df['column'] > 0] # select rows where 'column' is greater than 0
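
Position and label only coincide when the index is the default 0, 1, 2, ... The sketch below uses a deliberately shuffled index (an assumption made for the example) to show that iloc[0] and loc[0] can return different rows:

```python
import pandas as pd

# Hypothetical data with a non-default index, so position != label
df = pd.DataFrame(
    {"column": [5, -2, 9, 0]},
    index=[10, 0, 20, 30],
)

by_position = df.iloc[0]         # first row in order: label 10, value 5
by_label = df.loc[0]             # row labeled 0: value -2
positive = df[df["column"] > 0]  # boolean mask keeps rows with value > 0

print(by_position["column"], by_label["column"])
print(positive.index.tolist())
```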

4. Data cleaning

Data Cleaning: You can clean your data by handling missing values, removing duplicates, renaming columns, etc. Each method below returns a new DataFrame and leaves df unchanged, so the examples assign the result back to df:

df = df.dropna() # remove rows with missing values
df = df.drop_duplicates() # remove duplicate rows
df = df.rename(columns={'old_name': 'new_name'}) # rename columns
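
Because each cleaning method returns a DataFrame, the calls can be chained. This is a small sketch on invented data (the old_name and other columns are placeholders) that also shows the original DataFrame is not modified:

```python
import pandas as pd

# Hypothetical data: one duplicated row and one missing value
df = pd.DataFrame({
    "old_name": [1.0, 1.0, None, 4.0],
    "other": ["a", "a", "b", "c"],
})

cleaned = (
    df.dropna()                                 # drops the row with None
      .drop_duplicates()                        # drops the repeated (1.0, "a") row
      .rename(columns={"old_name": "new_name"})
)

print(cleaned)
print(len(df), len(cleaned))  # the original df still has all 4 rows
```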

5. Data manipulation

Data Manipulation: You can manipulate your data by adding, deleting, or modifying columns, grouping data, sorting data, etc. Here are some examples:

df['new_column'] = df['column1'] + df['column2'] # add a new column
del df['new_column'] # delete a column in place
df.groupby('column').mean(numeric_only=True) # group by 'column' and average the numeric columns in each group
df.sort_values('column', ascending=False) # return a copy sorted by 'column' in descending order
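
The operations above can be combined as in the following sketch. The group, column1, column2, and total names are assumptions chosen for the example, and numeric_only=True is passed so that any non-numeric columns are skipped when averaging:

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "group": ["a", "b", "a", "b"],
    "column1": [1, 2, 3, 4],
    "column2": [10, 20, 30, 40],
})

df["total"] = df["column1"] + df["column2"]  # add a column: 11, 22, 33, 44
del df["column2"]                            # delete a column in place

# Average the numeric columns within each group
means = df.groupby("group").mean(numeric_only=True)

# sort_values returns a sorted copy; df keeps its original order
ranked = df.sort_values("total", ascending=False)

print(means)
print(ranked)
```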