Data analysis using Python programming

Data cleaning and preprocessing

Drop unimportant columns and remove duplicate rows.
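A minimal sketch of this cleaning step with pandas. The sample data and the column names (`id`, `specialization`, `notes`) are hypothetical and only stand in for the real dataset; which columns count as unimportant depends on the analysis.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "specialization": ["Cardiology", "Dermatology", "Dermatology", "Neurology"],
    "notes": ["a", "b", "b", "c"],
})

# Drop a column assumed to be unimportant for the analysis.
cleaned = df.drop(columns=["notes"])

# Remove exact duplicate rows, keeping the first occurrence.
cleaned = cleaned.drop_duplicates().reset_index(drop=True)

print(cleaned)
```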

Data Handling

  • Reduce the number of specializations: put every specialization that occurs fewer than 10 times into a single category called "other"
  • Identify outliers with a scatter plot
  • Fill null values with the mean of their column
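The grouping and imputation steps above could look roughly like this in pandas. This is a sketch under assumptions: the threshold of 10 is read as a row count per specialization, and the numeric column `fee` is a hypothetical example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one frequent and one rare specialization.
df = pd.DataFrame({
    "specialization": ["Cardiology"] * 12 + ["Neurology"] * 3,
    "fee": [100.0] * 11 + [np.nan] + [200.0, np.nan, 400.0],
})

MIN_COUNT = 10  # assumed meaning of "below 10": fewer than 10 rows

# Relabel every specialization with fewer than MIN_COUNT rows as "other".
counts = df["specialization"].value_counts()
rare = counts[counts < MIN_COUNT].index
df.loc[df["specialization"].isin(rare), "specialization"] = "other"

# Fill missing numeric values with the column mean (NaNs are ignored by mean()).
df["fee"] = df["fee"].fillna(df["fee"].mean())

print(df["specialization"].value_counts())
```

Mean imputation is sensitive to extreme values, so it may be worth inspecting the outliers first and filling the nulls afterwards.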

Data visualization

Performed using:

  • histogram
  • scatterplot
  • count plot
  • pie chart
  • heat map, to represent the correlation between variables
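The plot types listed above can be sketched with matplotlib alone. The data is randomly generated and the column names (`age`, `fee`, `specialization`) are hypothetical. The count plot is drawn here as a bar chart of value counts, and the heat map with `imshow`; a library such as seaborn would be an alternative for both.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical random data, for illustration only.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(20, 60, size=50),
    "fee": rng.normal(300, 50, size=50),
    "specialization": rng.choice(["Cardiology", "Neurology", "other"], size=50),
})

fig, axes = plt.subplots(2, 3, figsize=(15, 8))

# Histogram of a numeric column.
axes[0, 0].hist(df["fee"], bins=10)
axes[0, 0].set_title("Histogram")

# Scatter plot of two numeric columns (also useful for spotting outliers).
axes[0, 1].scatter(df["age"], df["fee"])
axes[0, 1].set_title("Scatter plot")

# Count plot: number of rows per category.
counts = df["specialization"].value_counts()
axes[0, 2].bar(list(counts.index), counts.values)
axes[0, 2].set_title("Count plot")

# Pie chart of the same category shares.
axes[1, 0].pie(counts.values, labels=list(counts.index), autopct="%1.0f%%")
axes[1, 0].set_title("Pie chart")

# Heat map of the correlation matrix between numeric variables.
corr = df[["age", "fee"]].corr()
im = axes[1, 1].imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
axes[1, 1].set_xticks(range(len(corr.columns)))
axes[1, 1].set_xticklabels(corr.columns)
axes[1, 1].set_yticks(range(len(corr.columns)))
axes[1, 1].set_yticklabels(corr.columns)
axes[1, 1].set_title("Heat map")
fig.colorbar(im, ax=axes[1, 1])

axes[1, 2].axis("off")  # unused panel
fig.tight_layout()
fig.savefig("plots.png")
```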