====== L: 06/11/2020 ======

**Master in Informatics and Computing Engineering\\
Information Description, Storage and Retrieval\\
Instance: 2020/2021** \\
--- \\

====== Lecture #7 :: 06/11/2020 ======

===== Goals =====

By the end of this class, the student should be able to:

  * enumerate the features of the web as a document collection;
  * explain the principles of link analysis;
  * describe the PageRank algorithm;
  * describe the HITS algorithm.

===== Topics =====

  - Web information retrieval
    * Web characteristics
    * Ranking
    * Link analysis

===== Materials =====

  * {{.:dapi2021-web-ir.pdf|Information Retrieval on the Web}}
  * Ricardo Baeza-Yates, Berthier Ribeiro-Neto, //Modern Information Retrieval: The Concepts and Technology behind Search//, [[http://grupoweb.upf.es/mir2ed/pdf/chapter11.pdf|Chapter 11: Web Retrieval]], accessed November 2020
  * Ricardo Baeza-Yates, Berthier Ribeiro-Neto, //Modern Information Retrieval: The Concepts and Technology behind Search//, [[http://grupoweb.upf.es/mir2ed/pdf/slides_chap11.pdf|Slides for Chapter 11: Web Retrieval]], accessed November 2020
  * Ricardo Baeza-Yates, Berthier Ribeiro-Neto, //Modern Information Retrieval: The Concepts and Technology behind Search//, [[http://grupoweb.upf.es/mir2ed/pdf/slides_chap12.pdf|Slides for Chapter 12: Web Crawling]], accessed November 2020
  * Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, //Introduction to Information Retrieval//, [[http://nlp.stanford.edu/IR-book/html/htmledition/irbook.html|Chapters 19, 20 and 21]], accessed November 2020
  * D. Easley, J. Kleinberg, [[http://www.cs.cornell.edu/home/kleinber/networks-book/|Networks, Crowds, and Markets: Reasoning About a Highly Connected World]]. Cambridge University Press, 2010

===== Tasks =====

  * Exercises in link analysis (PageRank and HITS).
  * Milestone #2: experiment with and evaluate retrieval results.

===== Summary =====

  * Web information retrieval. Ranking. Link analysis.
  * Work on Milestone #2, "Information Retrieval".
  * Indexing and retrieval experiments on the working collection.

--- //MCR, JCL, SSN//

[[06|« Previous]] | [[index|Index]] | [[08|Next »]]