AIT 580 – Big Data Analytics Final Project
Write the final analytics project report. Make sure your report includes all the sections below, each with a detailed explanation.
Select a “sufficiently large” and “complex” dataset from publicly available sources
(See potential public sources listed in Course Resources on Blackboard)
“Sufficiently large” means enough data to provide reasonably detailed analysis
o But not too large for your computing resources (desktop, laptop, AWS, etc.)
o Approximate size: between 200 MB and 500 MB
“Complex” means it contains a variety of data types, such as categorical, nominal, interval, ratio, etc.
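As a quick check of the size guideline above, a candidate file can be tested before you commit to it. The following is a minimal sketch, not a required tool: the function names are hypothetical, and it assumes 1 MB = 1,000,000 bytes. Since the 200 MB to 500 MB range is approximate, treat the result as a guide rather than a hard rule.

```python
import os

MB = 1_000_000  # assumption: 1 MB = 10**6 bytes (not 2**20)


def classify_size(num_bytes, low_mb=200, high_mb=500):
    """Classify a byte count against the suggested 200-500 MB range."""
    size_mb = num_bytes / MB
    if size_mb < low_mb:
        return "too small"
    if size_mb > high_mb:
        return "too large"
    return "within range"


def classify_file(path):
    """Classify a file on disk by its size in bytes."""
    return classify_size(os.path.getsize(path))
```

For example, calling classify_file on the path of your downloaded dataset returns one of the three labels above.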
Introduction (about overall project)
Briefly explain the project and its scope
Briefly describe the dataset
Who (company, agency, organization) collected the data?
Who are they, and what do they do?
What is their role/purpose?
Why did they collect this data?
What potential questions could be answered by studying this data?
o List some specific questions, and be sure to answer them in your analysis
Are there any privacy, quality, or other issues with this data?
Requirements and Resources needed
• What software and hardware resources are needed to study this data?
Dataset Description
Explore the dataset using relevant tools discussed in the course
Prepare and describe relevant metadata (types of data in the dataset)
Include the SQL schema if you are using a database. If your dataset is in a CSV, XLS, or TXT file, describe all the columns/attributes in detail; that description will be considered your dataset schema.
Prepare relevant descriptive statistics and visualizations for selected data
o You don’t need to analyze every item in the dataset
Graphics must follow good visualization practices discussed in course lectures
Interpret the results
o What conclusions can be supported?
o This should reflect answers to the specific questions specified in the “Need” section.
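The column descriptions and descriptive statistics asked for in this section can be drafted with a short script. The following is a minimal sketch using only the Python standard library, not the tools prescribed by the course: the function name, the numeric/categorical rule (a column is numeric only if every non-empty value parses as a number), and the sample data are all assumptions made for illustration.

```python
import csv
import io
import statistics
from collections import Counter


def describe_columns(csv_text, max_categories=5):
    """Summarize each column of CSV text that has a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    summary = {}
    for name in reader.fieldnames or []:
        # Skip missing values (empty strings, or None for short rows).
        values = [r[name] for r in rows if r[name] not in ("", None)]
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            numbers = []  # at least one value is not a number
        if numbers:
            summary[name] = {
                "type": "numeric",
                "count": len(numbers),
                "min": min(numbers),
                "max": max(numbers),
                "mean": statistics.mean(numbers),
                "median": statistics.median(numbers),
            }
        else:
            counts = Counter(values)
            summary[name] = {
                "type": "categorical",
                "count": len(values),
                "distinct": len(counts),
                "top": counts.most_common(max_categories),
            }
    return summary


# Made-up sample data, for illustration only.
SAMPLE = "city,temp_c,visits\nFairfax,21.5,3\nReston,19.0,5\nFairfax,,4\n"
```

Calling describe_columns(SAMPLE) reports city as categorical with two distinct values, and temp_c and visits as numeric with their count, minimum, maximum, mean, and median. The inferred types are only a starting point; the written schema should still state whether each attribute is nominal, interval, ratio, and so on.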
Explain/define terms
Include explanation of any technical terms relevant to the project
Provide appropriate citations and references
o Include a citation for the dataset
Submission
Submit the report and code files only
Provide a link to the dataset used in the report in case the instructor needs to download it. Do not submit the dataset itself with the report on Blackboard, because of its large size.
If you have a specific area of interest (security, social media, system performance, commerce, etc.), try to choose a dataset in that area.

