Showing posts with label Link Analysis. Show all posts

Wednesday, April 5, 2023

Link Analysis

Most IR functionality so far has been based purely on text. Link analysis extends it by using the hyperlinks between documents for:

Scoring and ranking
Link-based clustering – topical structure from links
Links as features in classification – documents that link to one another are likely to be on the same subject

Assumptions
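1. Reputed sites: a hyperlink is a quality signal; a page that many sites link to is likely to be a reputable one.
2. Annotation of target: the anchor text of a hyperlink describes the page it points to.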


PageRank
 A scoring measure based on the link structure of web pages.
 PageRank = long-term visit rate of a random surfer = steady-state probability of being on the page.
 
Markov chains
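 A Markov chain is a discrete-time stochastic process: a process that occurs in a series of time-steps, in each of which a random choice is made.
 A Markov chain consists of N states, plus an N × N transition probability matrix P. Entry P[i][j] is the probability that the next state is j given that the current state is i, so each row of P sums to 1.
 For the web, state = page.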
 At each step, we are on exactly one of the pages.
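
The definitions above can be sketched in a few lines of Python. This is only an illustration: the three-page graph `links` and the helper names `transition_matrix` and `step` are made up for this example, and it assumes every page has at least one outgoing link and that the surfer picks among a page's out-links uniformly at random.

```python
# Hypothetical 3-page web graph: page index -> pages it links to.
links = {0: [1, 2], 1: [2], 2: [0]}
N = len(links)

def transition_matrix(links, n):
    """Build P where P[i][j] = probability of moving from page i to page j."""
    P = [[0.0] * n for _ in range(n)]
    for i, outs in links.items():
        for j in outs:
            # Assumption: each out-link of page i is followed with equal probability.
            P[i][j] = 1.0 / len(outs)
    return P

def step(x, P):
    """Advance one time-step: the new distribution is the row vector x times P."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

P = transition_matrix(links, N)
x0 = [1.0, 0.0, 0.0]   # the surfer starts on page 0 with certainty
x1 = step(x0, P)       # probability of being on each page after one step
```

Here `x0` puts all probability on page 0, matching "at each step, we are on exactly one of the pages"; `x1` then spreads that probability over the pages that page 0 links to.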



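Teleporting – to get us out of dead ends (pages with no outgoing links): instead of following a link, the random surfer jumps to a randomly chosen page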
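
A minimal sketch of PageRank with teleporting, computed by repeatedly stepping the random-surfer distribution until it settles. The function name `pagerank`, the example graph, the teleport probability of 0.1, and the fixed iteration count are all assumptions for illustration, not values taken from these notes. At a dead end the surfer always jumps to a uniformly random page; elsewhere the surfer teleports with probability `teleport` and otherwise follows a random out-link.

```python
def pagerank(links, n, teleport=0.1, iters=100):
    """Approximate the steady-state visit probabilities by power iteration."""
    x = [1.0 / n] * n                      # start from the uniform distribution
    for _ in range(iters):
        nxt = [0.0] * n
        for i in range(n):
            outs = links.get(i, [])
            if not outs:
                # Dead end: teleport to any page with equal probability.
                for j in range(n):
                    nxt[j] += x[i] / n
            else:
                # With probability `teleport`, jump to a random page ...
                for j in range(n):
                    nxt[j] += teleport * x[i] / n
                # ... otherwise follow one of the out-links at random.
                for j in outs:
                    nxt[j] += (1.0 - teleport) * x[i] / len(outs)
        x = nxt
    return x

# Hypothetical chain 0 -> 1 -> 2, where page 2 is a dead end.
links = {0: [1], 1: [2], 2: []}
ranks = pagerank(links, 3)
```

Without teleporting, all probability in this graph would get stuck at page 2; with it, every page keeps a non-zero long-term visit rate and the scores still sum to 1.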

HITS – Hyperlink-Induced Topic Search

Hubs: A hub page is a good list of links to pages answering the information need.
Authorities: An authority page is a direct answer to the information need.

A good hub page for a topic links to many authority pages for that topic
A good authority page for a topic is linked to by many hub pages for that topic
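
The mutual definition of hubs and authorities suggests an iterative computation, sketched below under some assumptions: the function name `hits`, the four-page example graph, the iteration count, and the choice to normalize scores to unit length after each round are illustrative choices and are not stated in these notes.

```python
import math

def hits(links, n, iters=50):
    """Iteratively compute hub and authority scores for pages 0..n-1."""
    hub = [1.0] * n
    auth = [1.0] * n
    for _ in range(iters):
        # Authority score: sum of the hub scores of the pages that link to it.
        auth = [0.0] * n
        for i, outs in links.items():
            for j in outs:
                auth[j] += hub[i]
        # Hub score: sum of the authority scores of the pages it links to.
        hub = [sum(auth[j] for j in links.get(i, [])) for i in range(n)]
        # Normalize so the scores do not grow without bound.
        a_norm = math.sqrt(sum(a * a for a in auth)) or 1.0
        h_norm = math.sqrt(sum(h * h for h in hub)) or 1.0
        auth = [a / a_norm for a in auth]
        hub = [h / h_norm for h in hub]
    return hub, auth

# Hypothetical graph: pages 0 and 1 are link lists pointing at pages 2 and 3.
links = {0: [2, 3], 1: [2, 3], 2: [], 3: []}
hub, auth = hits(links, 4)
```

In this toy graph pages 0 and 1 end up with all the hub score and pages 2 and 3 with all the authority score, which matches the two statements above: good hubs link to many authorities, and good authorities are linked to by many hubs.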
