Dave Sifry has begun a series of articles explaining the inner workings of his brilliant blog-mining service, Technorati.
1) We spider weblogs, and correlate each weblog's outbound links to any page on your blog/site.
2) Technorati works on any URL – not just URLs for weblogs. For example, you can see what people are saying about an interesting article or favorite company, and get an instant read on the conversations going on around that article or site.
3) The simplest way to get your weblog included in the Technorati index is to ping us whenever you update your weblog. That puts you in the high-priority queue for indexing. You can save the page as a bookmark, or you can program your weblog software to do it automatically.
4) To calculate the inbound blog list, we use the outbound links from the blog homepage, not from the archives.
5) We do process RSS feeds and other metadata, but that doesn't affect your inbound blog stats. As long as you produce HTML, you're OK.
6) Nightly, we go through the database and re-calculate the number of inbound blogs and links to every weblog we track, which helps us double-check our work and also allows us to create the interesting newcomers list, the interesting recent blogs list, etc.
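To make points 1, 4 and 6 concrete, here is a minimal Python sketch of how inbound blogs and inbound links could be counted from homepage HTML alone. It is not Technorati's code. The names LinkCollector, site_of and inbound_stats are hypothetical, and grouping pages by hostname is an assumption, since the articles don't say how a link is matched to a blog or site.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the absolute target of every <a href="..."> on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page they appear on.
                self.links.append(urljoin(self.base_url, href))


def site_of(url):
    # Assumption: a "site" is identified by its lowercased hostname.
    return urlsplit(url).netloc.lower()


def inbound_stats(homepages):
    """Map each linked-to site to (inbound_blogs, inbound_links).

    `homepages` maps a blog's homepage URL to that page's HTML. Only
    homepages are passed in, mirroring point 4 (archives are ignored).
    """
    linking_blogs = defaultdict(set)   # target site -> distinct source sites
    link_counts = defaultdict(int)     # target site -> total links seen

    for source_url, html in homepages.items():
        parser = LinkCollector(source_url)
        parser.feed(html)
        source = site_of(source_url)
        for target_url in parser.links:
            target = site_of(target_url)
            # Skip links without a host (mailto: etc.) and self-links.
            if not target or target == source:
                continue
            linking_blogs[target].add(source)
            link_counts[target] += 1

    return {
        site: (len(linking_blogs[site]), link_counts[site])
        for site in link_counts
    }
```

Rerunning inbound_stats over the whole set of stored homepages would be one simple way to imitate the nightly recalculation in point 6. Lists such as the interesting newcomers would need extra data, such as the previous night's counts, that this sketch doesn't model.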