A First Look at References from the Dark to Surface Web World

Tor is one of the best-known networks for protecting the identity of both content providers and their clients against tracking or tracing on the Internet. So far, most research has focused on investigating the security and privacy concerns of Tor and on characterizing the topics or hyperlink structure of its hidden services. However, there is still a lack of knowledge about the information leakage attributable to links from Tor hidden services to the surface Web. This work addresses that gap by presenting a broad evaluation of the network of references from Tor to the surface Web and by investigating to what extent Tor hidden services are vulnerable to this type of information leakage. The analyses also consider how linking to surface websites can change the overall hyperlink structure of Tor hidden services, and they report on the types of information and services provided by Tor domains. The results show that the dark-to-surface network forms a single massive connected component in which over 90% of Tor hidden services have at least one link to the surface world, despite their interest in being isolated from surface Web tracking. We find that Tor directories have the closest proximity to all other Web resources and contribute significantly to both communication and information dissemination through the network, which emphasizes the main application of Tor as an information provider to the public. Our study is the product of crawling nearly 2 million pages from 23,145 onion seed addresses over a three-month period.
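The two headline measurements, the share of hidden services that link to the surface Web and the number of connected components, can be illustrated on a toy link list. The following is a minimal sketch, not the authors' pipeline: the function names, the `.onion` suffix test used to separate dark from surface domains, and the `toy_edges` data are all assumptions made for illustration.

```python
from collections import defaultdict, deque

def is_onion(domain):
    # Assumption: a domain is a Tor hidden service iff it ends in ".onion".
    return domain.endswith(".onion")

def surface_link_fraction(edges):
    """Fraction of onion domains with at least one link to a surface domain."""
    onions, linked = set(), set()
    for src, dst in edges:
        for domain in (src, dst):
            if is_onion(domain):
                onions.add(domain)
        if is_onion(src) and not is_onion(dst):
            linked.add(src)
    return len(linked) / len(onions) if onions else 0.0

def connected_components(edges):
    """Connected components of the link graph, ignoring edge direction."""
    adj = defaultdict(set)
    for src, dst in edges:
        adj[src].add(dst)
        adj[dst].add(src)
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), {start}
        while queue:  # breadth-first search from an unvisited node
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    comp.add(nxt)
                    queue.append(nxt)
        components.append(comp)
    return components

# Hypothetical (source domain, target domain) hyperlinks, not real crawl data.
toy_edges = [
    ("dir.onion", "shop.onion"),
    ("dir.onion", "example.com"),
    ("shop.onion", "example.org"),
    ("forum.onion", "dir.onion"),
]

fraction = surface_link_fraction(toy_edges)
components = connected_components(toy_edges)
```

On this toy input, two of the three onion domains link to a surface site and all five domains fall into one component. A real analysis at the scale of the crawl would more likely rely on a graph library, and the proximity of directories to other resources would be measured with a centrality metric such as closeness, which this sketch does not compute.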