Data Analytics Question

The dataset for this assignment is a modified version of the Superstore dataset we used in class. It contains information about orders placed, including sales, profits, and customer details.

Your Colab Notebook should meet the following Learning Objectives:

  1. Comment on the structure of the dataset (10)
  2. Identify and handle missing values: Detect missing data and apply appropriate strategies to handle them. (10)
  3. Detect and comment on outliers: Identify potential outliers in numerical columns and comment on whether they should be addressed. (10)
  4. Identify and remove duplicates: Find and eliminate duplicate rows/records in the dataset. (5)
  5. Modify data types: Change the data type of date columns. (5)
  6. Remove unnecessary columns: Eliminate columns that are not needed for analysis. (5)
  7. Summarize both numerical and categorical variables, and comment on the statistical summary. (5+5+5)
  8. Apply filtering: Filter the dataset to create a subset where “Profit is more than 75%”. (10)
  9. Show the top 5 most profitable transactions. (15)
  10. What do you observe about the top 5 most profitable transactions from the perspective of Segment, City, Product Category, and others? (Write down your observations in a Text cell.) (1
  11. The assignment is to be completed in Google Colab.
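
The cleaning and filtering objectives above could be approached in pandas roughly as follows. This is only a sketch on a small made-up DataFrame: the column names ("Row ID", "Order Date", "Sales", "Profit", "Segment", "City") are assumptions based on the usual Superstore layout and may not match the modified file, and "Profit is more than 75%" is read here as "Profit above the 75th percentile", which is one possible interpretation rather than a confirmed one.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the assignment file; real columns may differ.
df = pd.DataFrame({
    "Row ID": [1, 2, 3, 4, 5, 6, 7, 7],
    "Order Date": ["2017-01-05", "2017-02-10", "2017-03-15", "2017-04-20",
                   "2017-05-25", "2017-06-30", "2017-07-04", "2017-07-04"],
    "Segment": ["Consumer", "Corporate", "Consumer", "Home Office",
                "Consumer", "Corporate", "Consumer", "Consumer"],
    "City": ["Seattle", "Austin", "Seattle", "Boston",
             None, "Austin", "Denver", "Denver"],
    "Sales": [100.0, 250.0, np.nan, 80.0, 120.0, 5000.0, 60.0, 60.0],
    "Profit": [20.0, 50.0, 10.0, -5.0, 30.0, 2000.0, 12.0, 12.0],
})

# 1. Structure: number of rows/columns and the type of each column.
n_rows, n_cols = df.shape
column_types = df.dtypes

# 2. Missing values: count them per column, then fill with a simple strategy
#    (median for a numeric column, a placeholder label for a text column).
missing_counts = df.isna().sum()
df["Sales"] = df["Sales"].fillna(df["Sales"].median())
df["City"] = df["City"].fillna("Unknown")

# 4. Duplicates: drop rows that are identical across all columns.
df = df.drop_duplicates().reset_index(drop=True)

# 5. Data types: convert the date column from text to datetime.
df["Order Date"] = pd.to_datetime(df["Order Date"])

# 6. Unnecessary columns: an identifier column adds nothing to the analysis.
df = df.drop(columns=["Row ID"])

# 3. Outliers: flag Profit values outside 1.5 * IQR of the quartiles.
q1 = df["Profit"].quantile(0.25)
q3 = df["Profit"].quantile(0.75)
iqr = q3 - q1
outliers = df[(df["Profit"] < q1 - 1.5 * iqr) | (df["Profit"] > q3 + 1.5 * iqr)]

# 7. Summaries: describe() for numeric columns, value_counts() for a categorical one.
numeric_summary = df[["Sales", "Profit"]].describe()
segment_counts = df["Segment"].value_counts()

# 8. Filtering (assumed reading): keep rows with Profit above the 75th percentile.
high_profit = df[df["Profit"] > q3]

# 9. Top 5 most profitable transactions.
top5 = df.nlargest(5, "Profit")
```

In a Colab notebook, each intermediate result (for example `missing_counts`, `outliers`, or `top5`) can be displayed by placing it on the last line of a cell, with the written observations going in a Text cell beneath it.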