The P assignments are designed to consolidate some of the principal concepts covered in the respective course modules.
Assignment P1 has four questions, and your answer to each should be about 1~1.5 pages. Each of the four questions has three subquestions, and your answer to each subquestion should be about half a page on average. This length is set intentionally, on the expectation that your submission will include illustrative diagrams, pictures, tables, etc. You may use any text editor you like, such as MS Word or Google Docs, with single spacing and a 12-point font in Arial or Times New Roman.
The total length of assignment P1 is limited to a maximum of 8 pages (excluding references). The recommended per-question lengths intentionally add up to less than 8 pages to leave you room to decide where to go into more detail. 8 pages is a maximum, not a target. Any content beyond 8 pages will not be considered for a grade. You are required to cite the articles you read from the reading list and the public domain, but references do not count towards the 8-page maximum.
If you would like to include additional information beyond the page limit, you may include it in appendices. These materials will not be used in grading your assignment, but they may help you get better feedback from your classmates and grader.
Question 1 (from Module 1 Block 2): 1~1.5 pages
Everyone has direct experience with many Internet-fueled innovations. We can classify such innovations into three broad categories: (1) Transforming the physical world into the cyber world; (2) Combining the physical world and the cyber world; and (3) Enriching the physical world.
First, describe three Internet-fueled innovations (one from each of the three categories) using your own experience to date, and discuss the impact of these innovations on human society and the physical world in which we live.
Second, discuss what these three innovations have in common and how they differ from one another. For example, you may compare them in terms of usability, user communities, and the Internet computing infrastructure, communication, and storage technologies used to support each of the three innovations you describe.
Finally, compare the insights you gained from each of the previous discussions and elaborate on what sorts of improvements are commonly recognized or expected in the future. Then describe what improvements you would suggest, especially if your suggestions differ from the commonly expected ones.
Question 2 (from Module 2 Block 3): 1~1.5 pages
The World Wide Web (Web) is a disruptive killer application of the Internet. As a user of both the surface (shallow) Web and the deep Web, you may have extensive experience with them (technical or non-technical).
First, select one activity you perform with the surface Web and one activity you perform with the deep Web, and describe what these two activities have in common, how they differ, and why.
Second, discuss why today's search engines cannot search the deep Web. You are encouraged to find some example search activities to illustrate your points.
Finally, discuss your idea for designing a deep Web search engine: is it possible, and why or why not? List up to three challenges you perceive as a designer of a deep Web search engine.
Question 3 (from Module 2 Block 3): 1~1.5 pages
We have discussed Web crawlers using an example crawler, PULSE, written by a student from this course many years ago (slides 32~42).
First, discuss why the example crawler PULSE is naive and inefficient for crawling the Web.
Second, describe how you would design a crawler if you were asked to crawl the Georgia Tech domain, including which performance metrics are most important for your design. Example metrics are crawling speed (# pages per time unit), minimizing duplicate pages, storage cost, etc.
Finally, describe how fast your crawler would need to crawl the Web (i.e., # pages per hour) if your target is to crawl 1 billion pages per year. [Hint: make an assumption about your crawler's throughput of X pages/second; you may vary X to answer this question.]
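The hint above amounts to a unit conversion between pages/second, pages/hour, and pages/year. As a minimal sketch, assuming a 365-day year and a crawler that runs continuously with no downtime (both are assumptions, and the function names are hypothetical), the effect of varying X could be explored like this:

```python
SECONDS_PER_HOUR = 3600
# Assumption: a 365-day year and a crawler that never pauses.
HOURS_PER_YEAR = 365 * 24


def required_pages_per_hour(pages_per_year: float) -> float:
    """Pages per hour needed to reach a yearly target at a constant rate."""
    return pages_per_year / HOURS_PER_YEAR


def pages_per_year_at(pages_per_second: float) -> float:
    """Pages crawled in one year at a constant throughput of X pages/second."""
    return pages_per_second * SECONDS_PER_HOUR * HOURS_PER_YEAR


if __name__ == "__main__":
    target = 1_000_000_000  # 1 billion pages per year, as stated in the question
    per_hour = required_pages_per_hour(target)
    per_second = per_hour / SECONDS_PER_HOUR
    print(f"Required: {per_hour:,.0f} pages/hour ({per_second:.1f} pages/second)")

    # Vary the assumed throughput X (pages/second) and compare with the target.
    for x in (10, 50, 100):
        yearly = pages_per_year_at(x)
        status = "meets" if yearly >= target else "misses"
        print(f"X = {x:>3} pages/second -> {yearly:,.0f} pages/year ({status} the target)")
```

A real crawler would not sustain a constant rate, so politeness delays, failures, and re-crawls of changed pages should be factored into whatever X you assume.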
Question 4 (from Module 2 Block 4): 1~1.5 pages
This last question is about search engine ranking algorithms. You are encouraged to use concrete search examples to illustrate your answers.
First, describe why ranking algorithms are important for a search engine from both a usability perspective and an efficiency perspective.
Second, compare query-dependent ranking methods and query-independent ranking methods in terms of their advantages and disadvantages. You are encouraged to use figures or tables to summarize your comparison.
Finally, describe when Google PageRank is useful and when it may fall short, based on your own search experience.
