Pandas Zoo

🐼 A zoo for pandas methods, functions, and usages! 🐼

Search by keyword for the feature you need.

Each feature comes with a short code snippet that you can apply directly to your own work. The collection is still being organized so that features are easy to find.

Before copying and pasting a snippet, make sure you have imported pandas and NumPy as below.

import pandas as pd
import numpy as np


Example

  1. Applying a lambda function to a pandas Series

[Keywords]
  • lambda, function, mapping
  • apply, mapping, custom function, column

# Daily cumulative bamboo and water consumption in the pandas zoo
panda_consumption = {
    'Day': [1, 2, 3, 4, 5, 6],
    'Cum_Bamboo': [20, 30, 60, 100, 120, 150],
    'Cum_Water': [15, 30, 45, 70, 90, 120],
}

# Build a DataFrame from the dict of lists
DF = pd.DataFrame.from_dict(panda_consumption)

# Add 50 to every water value; apply returns a new Series and leaves DF unchanged
DF['Cum_Water'].apply(lambda x: x + 50)


Contents

  1. DataFrame
  2. Descriptive
  3. Missing Values and Imputation
  4. Filter
  5. Indexing
  6. Apply a Function
  7. Groupby / Aggregating
  8. Time Series
  9. Plot
  10. Etc

1. DataFrame: Create and Load

  1. Creating a new DataFrame from lists

  2. Creating a new DataFrame from a dict of lists

  3. Merging multiple (more than two) DataFrames

  4. Appending a new row to an existing DataFrame

  5. Replacing values in existing columns with values from another DataFrame, without adding new columns
    DataFrame.update
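
The five items above can be sketched in one short block. This is an illustrative sketch, not the repo's own notebook code: the frame names, column names, and values are made up, and `functools.reduce` is just one common way to merge more than two frames.

```python
from functools import reduce

import pandas as pd

# Hypothetical data (illustrative only)
names = ['Bao', 'Lin', 'Mei']
weight_values = [95, 110, 102]

# 1. From lists: zip the lists into rows and name the columns
weights = pd.DataFrame(list(zip(names, weight_values)), columns=['Name', 'Weight'])

# 2. From a dict of lists: keys become column names
ages = pd.DataFrame({'Name': names, 'Age': [4, 7, 5]})
diets = pd.DataFrame({'Name': names, 'Bamboo': [20, 30, 25]})

# 3. Merge more than two DataFrames by folding pd.merge over the list
merged = reduce(lambda left, right: pd.merge(left, right, on='Name'),
                [weights, ages, diets])

# 4. Append a new row; pd.concat is used because DataFrame.append
#    was removed in pandas 2.0
new_row = pd.DataFrame([{'Name': 'Fu', 'Weight': 88, 'Age': 3, 'Bamboo': 15}])
appended = pd.concat([merged, new_row], ignore_index=True)

# 5. Overwrite values in existing columns, in place, without adding columns.
#    update aligns on the index, so only row 0 ('Bao') gets the new weight.
patch = pd.DataFrame({'Weight': [97]}, index=[0])
appended.update(patch)
```

Note that `DataFrame.update` modifies the frame in place and returns `None`, so its result should not be assigned back.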


2. Descriptive

  1. Getting descriptive statistics from a DataFrame
  2. Checking specific statistics for each column
  3. Counting samples and calculating proportions per category in a column
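
A minimal sketch of the three items above, assuming a small made-up frame with one categorical and one numeric column (`Species` and `Weight` are hypothetical names):

```python
import pandas as pd

# Hypothetical data: one row per animal (illustrative only)
df = pd.DataFrame({
    'Species': ['giant', 'giant', 'red', 'giant'],
    'Weight': [95.0, 110.0, 6.0, 102.0],
})

# 1. Summary statistics (count, mean, std, min, quartiles, max) for numeric columns
summary = df.describe()

# 2. Specific statistics for chosen columns
stats = df.agg({'Weight': ['min', 'max', 'mean']})

# 3. Count and proportion of rows per category in a column
counts = df['Species'].value_counts()
proportions = df['Species'].value_counts(normalize=True)
```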

3. Missing Values and Imputation

  1. Counting the number of NA values in each column

  2. Dropping rows/columns with NA values
    DataFrame.dropna
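
As a short sketch of both items, assuming a made-up frame in which one column is entirely missing (the column names are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values (illustrative only)
df = pd.DataFrame({
    'Bamboo': [20.0, np.nan, 60.0],
    'Water': [15.0, 30.0, np.nan],
    'Notes': [np.nan, np.nan, np.nan],
})

# 1. Number of NA values in each column
na_counts = df.isna().sum()

# 2. Drop columns that are entirely NA, then rows that still contain any NA
no_empty_cols = df.dropna(axis=1, how='all')
complete_rows = no_empty_cols.dropna(axis=0, how='any')
```

`dropna` returns a new DataFrame by default, so the original `df` is left unchanged.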


4. Filter

  1. Getting rows that contain a specific text/word/string
    pd.Series.str.contains

  2. Using a list of values to select rows from a DataFrame
    pd.Series.isin

  3. Dropping duplicated/repeated/redundant rows, with respect to one or more columns or the full row
    DataFrame.drop_duplicates
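
The three filters above can be sketched as follows. The data and the column names `Name` and `Zone` are made up for illustration.

```python
import pandas as pd

# Hypothetical data with one fully duplicated row (illustrative only)
df = pd.DataFrame({
    'Name': ['Bao Bao', 'Lin Lin', 'Bao Bao', 'Mei Xiang'],
    'Zone': ['North', 'South', 'North', 'East'],
})

# 1. Rows whose 'Name' contains a substring
#    (na=False treats missing values as "no match")
has_bao = df[df['Name'].str.contains('Bao', na=False)]

# 2. Rows whose 'Zone' is one of the listed values
selected_zones = df[df['Zone'].isin(['North', 'East'])]

# 3. Drop duplicates over the full row, or over a subset of columns
unique_rows = df.drop_duplicates()
unique_zones = df.drop_duplicates(subset=['Zone'])
```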

5. Indexing

  1. Assigning a new column using the loc indexer

  2. Selecting columns by specific data types
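
A minimal sketch of both items, assuming a made-up frame with mixed column types. The `Group` column and the age threshold are hypothetical.

```python
import pandas as pd

# Hypothetical data (illustrative only)
df = pd.DataFrame({
    'Name': ['Bao', 'Lin', 'Mei'],
    'Weight': [95.0, 110.0, 102.0],
    'Age': [4, 7, 5],
})

# 1. Assign a new column with loc: a default for all rows,
#    then a conditional overwrite for the rows that match
df.loc[:, 'Group'] = 'young'
df.loc[df['Age'] >= 5, 'Group'] = 'adult'

# 2. Select columns by dtype
numeric_cols = df.select_dtypes(include='number')
other_cols = df.select_dtypes(exclude='number')
```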


6. Apply

  1. Applying a lambda function to a pandas Series
  2. Getting the last n characters from a string column
  3. Applying an if/else statement to a column
  4. Passing multiple arguments to apply on a DataFrame
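
Items 2 to 4 can be sketched as below. The data, the column names, and the helper `scale` are hypothetical, and item 4 is shown on a single column through `Series.apply`; the repo's own snippet may take a different route.

```python
import numpy as np
import pandas as pd

# Hypothetical data (illustrative only)
df = pd.DataFrame({
    'Tag': ['panda-001', 'panda-002', 'panda-103'],
    'Bamboo': [20, 35, 50],
})

# 2. Last n characters of a string column (n = 3 here)
df['TagNo'] = df['Tag'].str[-3:]

# 3. if/else on a column: a lambda with a conditional expression,
#    and the vectorized equivalent with np.where
df['Appetite'] = df['Bamboo'].apply(lambda x: 'big' if x >= 30 else 'small')
df['Appetite_np'] = np.where(df['Bamboo'] >= 30, 'big', 'small')

# 4. Extra arguments for the applied function, passed via args and keywords
def scale(value, factor, offset=0):
    """Hypothetical helper: multiply by factor, then add offset."""
    return value * factor + offset

df['Scaled'] = df['Bamboo'].apply(scale, args=(2,), offset=1)
```

For simple conditions on large columns, the `np.where` form is usually faster than `apply` because it avoids a Python-level call per row.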

7. Groupby

  1. Getting the average value of each column per group

  2. One-hot encoding multi-level column data per group
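
A sketch of both items on made-up data. For item 2, "multi-level" is read here as a column in which each group can hold several category levels; that reading is an assumption, as are all names and values.

```python
import pandas as pd

# Hypothetical data: several food records per panda (illustrative only)
df = pd.DataFrame({
    'Panda': ['Bao', 'Bao', 'Lin', 'Lin', 'Lin'],
    'Food': ['bamboo', 'apple', 'bamboo', 'carrot', 'bamboo'],
    'Amount': [20, 2, 30, 5, 10],
})

# 1. Mean of the numeric column per group
mean_amount = df.groupby('Panda')['Amount'].mean()

# 2. One row per group, with a 0/1 indicator for each level seen in that group:
#    one-hot encode every record, then take the per-group maximum
dummies = pd.get_dummies(df['Food'], dtype=int)
one_hot = dummies.groupby(df['Panda']).max()
```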


8. Time Series

  1. Getting the difference between rows (differencing)
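
As a minimal sketch, differencing can turn the cumulative bamboo figures from the Example section into per-day amounts. The new column names `Daily_Bamboo` and `Daily_Bamboo_filled` are made up for illustration.

```python
import pandas as pd

# Cumulative bamboo consumption, as in the Example section
DF = pd.DataFrame({
    'Day': [1, 2, 3, 4, 5, 6],
    'Cum_Bamboo': [20, 30, 60, 100, 120, 150],
})

# Difference from the previous row; the first row has no predecessor, so it is NaN
DF['Daily_Bamboo'] = DF['Cum_Bamboo'].diff()

# Optionally treat the first cumulative value as the first day's amount
DF['Daily_Bamboo_filled'] = DF['Daily_Bamboo'].fillna(DF['Cum_Bamboo'])
```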

9. Plot

  1. Drawing a quick plot from a pandas Series

10. Etc

  1. Expanding the pandas output that is shown
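
One way to expand the output is through the pandas display options. The sketch below assumes a made-up wide frame and uses `pd.option_context` so the change only applies inside the `with` block.

```python
import pandas as pd

# Hypothetical wide frame: 30 columns, more than the default display limit
wide = pd.DataFrame({f'col_{i}': range(3) for i in range(30)})

# Temporarily lift the column limit and widen the output;
# the previous settings are restored when the block exits
with pd.option_context('display.max_columns', None, 'display.width', 200):
    expanded = repr(wide)
    print(expanded)

# Persistent alternative for the whole session:
# pd.set_option('display.max_rows', 100)
```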