Data Analysis, first-place solutions from Dacon competitions

Primary language: Jupyter Notebook

Kaggle-Dacon

Data Analysis

Data analysis is the process of working through a given dataset to obtain the information and conclusions you are looking for.


Data analysis usually consists of the following steps.

  • Topic selection
  • Understanding the data structure
  • Data preprocessing
  • Data analysis

Topic selection

Set the purpose of the analysis before you start: which data to select, which hypotheses to form from that data, and what conclusions you hope to reach.


Understanding the data structure

To analyze the data, you need to know in advance how it is stored: its structure, its data types, and its variable names. Applying statistical functions to a data frame also lets you see the distribution and general tendencies of the data.
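As a rough illustration of this step, the sketch below inspects a small pandas data frame. The data and the column names (`age`, `city`, `income`) are made up for the example and do not come from any dataset in this repository.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [23, 35, 41, 29],
    "city": ["Seoul", "Busan", "Seoul", "Daegu"],
    "income": [3200.0, 4100.0, 5300.0, 3900.0],
})

shape = df.shape            # (number of rows, number of columns)
columns = list(df.columns)  # variable names
dtypes = df.dtypes          # data type of each column
summary = df.describe()     # summary statistics for the numeric columns

print(shape)
print(columns)
print(dtypes)
print(summary)
```

With mixed column types, `describe()` reports only the numeric columns by default, so `city` does not appear in the summary table.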


Data preprocessing

Before the analysis, extract only the variables you need, or compute new variables from existing ones. If the data contains missing values or outliers, you must handle them properly at this stage (for example, by removing them) so that the analysis results can be trusted.
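A minimal preprocessing sketch is shown below, assuming a hypothetical table of heights and weights. The 1.5 × IQR rule used for outliers is one common convention chosen for the example, not a method prescribed by this repository, and dropping rows is only one of several ways to treat missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with one missing height and one implausible weight.
raw = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "height_cm": [170.0, 165.0, np.nan, 180.0, 175.0, 160.0],
    "weight_kg": [65.0, 58.0, 70.0, 82.0, 400.0, 52.0],
    "note": ["a", "b", "c", "d", "e", "f"],
})

# 1. Keep only the variables needed for the analysis.
df = raw[["height_cm", "weight_kg"]].copy()

# 2. Drop rows that contain missing values.
df = df.dropna()

# 3. Remove weight outliers using the 1.5 * IQR rule (an assumed convention).
q1 = df["weight_kg"].quantile(0.25)
q3 = df["weight_kg"].quantile(0.75)
iqr = q3 - q1
in_range = df["weight_kg"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
df = df[in_range].copy()

# 4. Compute a new variable from existing ones (body mass index).
df["bmi"] = df["weight_kg"] / (df["height_cm"] / 100) ** 2

print(df)
```

Here the new variable is computed after cleaning so that the outlier does not propagate into it. Whether to clean first or derive first depends on the dataset.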


Data analysis

In this step you test the hypothesis set in the topic selection stage, or obtain the information you want, by computing on and processing the data with numpy and pandas. Visualization is also used to present the findings effectively.
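The sketch below shows what such a computation might look like for a made-up hypothesis: "group B scores higher than group A, and score rises with study hours". The data, the column names, and the hypothesis itself are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data; not taken from any competition in this repository.
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B"],
    "hours": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "score": [50.0, 55.0, 60.0, 70.0, 75.0, 80.0],
})

# Compare the mean score of each group with pandas.
group_means = df.groupby("group")["score"].mean()
diff = group_means["B"] - group_means["A"]

# Measure the linear relationship between hours and score with numpy.
corr = np.corrcoef(df["hours"], df["score"])[0, 1]

print(group_means)
print("difference in means (B - A):", diff)
print("correlation between hours and score:", round(corr, 3))
```

A summary table like `group_means` is also a natural input for a bar chart when the result needs to be shown visually.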