Posted on 17th November 2023

Resurrecting The Website Directory

Remember those good old days? You know the ones I mean, where you could go to web sites that had directories of other web sites, all neatly categorised. Unfortunately most web directories have long since disappeared, and have even been expunged from the search engines with which they were once intertwined.

tbd
DMOZ was an online directory which, surprisingly, was not shut down until 2017

Perhaps you're old enough to remember when there was a special book in your home. You would open it up and it had hundreds of pages of information about local businesses. It was carefully categorised, alphabetised and indexed to help you find what you wanted in a variety of ways. That book was how you found someone who could fix your car. It was how a parent discovered where they could buy their child's first bicycle. It was how a budding pianist would find a piano for sale, someone to tune it, and someone to teach them how to play.

Physical directories of categorised and indexed information existed eons before web search was ever a thing. The closest thing to search when printed directories arrived in every home was perhaps a human you spoke to on the phone who could tell you the telephone number or address of a business. Guess where they looked up that information?

Is a directory useful anymore?

When I was analysing the evolution of search engines recently I was reminded that one thing we seem to have left behind is the display of hierarchical categorisation. The capability to browse rather than search online has been diminished now that directories have all but vanished from search engines and the web in general.

I wondered how we might take advantage of this kind of taxonomic approach again and if it could still prove useful.

The biggest problem with categorising large amounts of information is, paradoxically, its biggest advantage. When the diversity and volume of data are so large, building a directory becomes a resource-intensive task. But if you can achieve it, you can make sifting that information easier for everyone else.

Let's think back to the good old days, when search engines offered directories to browse. Personally, I would use those to find specific types of sites. Being quite young and lacking a lot of general knowledge, I found this useful. For example, I might have wanted to find news websites but didn't know about Reuters or Bloomberg, so I would use the directory. And because there were many levels of 'sub-categories', if I was interested in a more specific type of news website (such as one covering a particular topic or location), I could find a website that specialised in it.

The emotional aspect of browsing information

A quick aside that I won't dwell too long on as it's a gargantuan topic. But perhaps like me you visited libraries as a child. Do you remember when you used to hunt through the books to find something interesting to read? I remember the experience of visiting the library well and I distinctly remember enjoying it. Being immersed in that information and having the time to explore it is a highly valuable experience. So it could be argued there is a benefit to being able to browse information online in this way too.

Clearly people have different preferences and engage in different modes of finding information at different times. Why then are we prioritising a highly goal-focused, rather than journey-focused, approach? Especially when you consider how different approaches can tap into people's varied mental models and emotional states to make for a more meaningful experience. Like wandering through a library and picking up books and looking at them... just because it was fun. That we learned something from it was a happy by-product.

Web site discovery vs. web page search

A website represents an identity, whereas a web page represents a specific slice of information. So web directories are useful for finding similar organisations, people and products that match the criteria we are looking for, not a specific webpage out of all the pages in the world. We can then explore those sites by going to their home page, which provides a common starting point that could be bookmarked. And sometimes when we're using search, that's actually what we want to achieve.

It is also the case that by not driving more users to website homepages we lose a certain amount of 'discovery' capability and users do not benefit from the information architecture of the homepages that website owners have invested in. It's entirely possible a particular website is better optimised for the user's goals than a generalised search engine.

I think even in the age of smart search engines with AI capabilities we are missing a piece of functionality which, as the following examples will illustrate, is not fulfilled by modern search engines like Google. For particular use cases these search engines fail.

Examples of what happens when search gets it wrong

When I search for "websites like Bloomberg" I mostly get annoying articles on different websites telling me "the top 10 alternatives to Bloomberg", none of which are particularly trustworthy or high quality sources. All I want in this case is an easy way to find a list of sites representing similar entities. Having to wade through these articles to find this information is a poor user experience. And I don't think the answer is just better SEO or weeding out AI generated content (though that would definitely help).

I think there's value in being able to browse categories of information, and that it would also be useful if this could be integrated into search again.

Consider the top sections of the search results page for the query 'google':

tbd

Old-school Google here gives us a match for the category related to the search, as well as a match for the category the websites displayed belong to. That's pretty clever if you ask me, especially for a search engine from the 90s. Of course it relies on a lot of human input, but maybe that's not such a bad thing. Maybe it's those rose-tinted glasses but I think at the time it worked well.

tbd

A current Google search gives us some filters. These buttons could be quite useful in many contexts. But what it seems incapable of doing is recognising what the things you've just searched for actually are (how a human would categorise them) and allowing you to see similar things. Even if I click the 'Search' filter, I just get the same website duplicated over and over again:

tbd

This is the result of applying ever more complex algorithms to present filters that the search engine has guessed you're interested in.

Hacking a search engine to emulate a directory

A small tweak that could be made to a search engine is to display the best estimation of the category of websites the query represents and the categories the search result websites belong to. Then you could click a category to get a list of similar things. In the above example from Google, a category called 'Search Engines' is all we need. When you select that category, a list of websites that match this category is shown.

What's required is a differentiation between:

and

Note: in the context of this article I'm using category to refer to the kind of category we might find in a directory. In this case I'm assuming all websites can be defined by a small number of categories that they belong to, or ideally just a single category. These categories would take precedence over and above the many possible concepts a website may be related to.

Obviously this is somewhat subjective, but that doesn't mean we can't attempt to implement it. After all, Google already presents a subjectively determined and prioritised list of filter buttons which narrow the list of results displayed.

I'm sure most search engines already have algorithmically generated categories for each website in the background. While these may not be perfect, giving the user visibility of them and the choice to find all websites that 'belong to' a category would enable directory-like behaviour within the search results.

For example my search for 'google' might match the category 'Tech Companies' or it might match the category 'Search Engines', or if we allow multiple categories it could match both. This seems quite a simple ask, but weirdly neither of these filters is offered to the user by a current Google search for 'google'.
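To make the idea a little more concrete, here is a minimal Python sketch of the behaviour described above. It is only an illustration under my own assumptions: the `Site` record, the tiny hand-labelled `SITES` index and the function names are all hypothetical, and a real search engine would derive categories and rankings very differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A website treated as a single entity (hypothetical record)."""
    name: str
    home_page: str
    categories: tuple  # directory-style categories the site belongs to


# Illustrative, hand-labelled data; not taken from any real index.
SITES = [
    Site("Google", "https://www.google.com", ("Search Engines", "Tech Companies")),
    Site("Bing", "https://www.bing.com", ("Search Engines",)),
    Site("DuckDuckGo", "https://duckduckgo.com", ("Search Engines",)),
    Site("Apple", "https://www.apple.com", ("Tech Companies",)),
]


def search_sites(query, sites=SITES):
    """Naive stand-in for search: match the query against site names."""
    q = query.casefold()
    return [site for site in sites if q in site.name.casefold()]


def site_categories(results):
    """Collect the categories of the result sites, in first-seen order."""
    seen = {}
    for site in results:
        for category in site.categories:
            seen.setdefault(category, None)
    return list(seen)


def sites_in_category(category, sites=SITES):
    """What the user sees after clicking a category: similar websites."""
    return [site.name for site in sites if category in site.categories]


results = search_sites("google")
print(site_categories(results))
print(sites_in_category("Search Engines"))
```

With this toy data, searching for 'google' surfaces both 'Search Engines' and 'Tech Companies' as clickable site categories, and selecting 'Search Engines' lists the other search engines in the index.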

Here's an updated version of my previous attempt at an improved search engine interface, which incorporates the hybrid directory approach:

tbd

In this instance I've opted to use the terminology 'Site Categories' because, in my search prototype, we are not hiding the concept of a 'site' or 'page' from the user: I'm taking the stance that websites and pages are tangible things. This allows us to distinguish between categories used for filtering pages and categories used to filter websites. There are of course other ways to make this distinction.

The fix I propose is essentially hacking search to discover categories, creating a 'hybrid' directory if you will, which could enhance the search engine with better 'discovery' capabilities. This is a step forward in my opinion, but it doesn't address how we could create a comprehensive browsable directory of everything.

A browsable directory of everything!?

I have a theory which, if it works, would require minimal effort to build: you create a directory that shows the top 'X' websites for the top 'Y' categories. Let's say you decide your directory should have a maximum of 200 categories, which can be hierarchical. Within each category we display only the top 10 websites, plus any sub-categories. Your directory would then contain a maximum of 2,000 websites (200 × 10), which of course would mean the most popular few thousand websites (and categories of information) on the internet will have a bit of a monopoly on it. I'll leave it up to your imagination what that might look like!
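A rough Python sketch of that capping rule might look like the following. Everything here is an assumption for illustration: the `(home_page, category, popularity)` input format, the idea of ranking categories by the total popularity of their sites, and the `.example` domains are mine, and the hierarchy of sub-categories is left out to keep it short.

```python
from collections import defaultdict

MAX_CATEGORIES = 200      # the 'Y' in "top Y categories"
SITES_PER_CATEGORY = 10   # the 'X' in "top X websites"


def build_directory(ranked_sites,
                    max_categories=MAX_CATEGORIES,
                    per_category=SITES_PER_CATEGORY):
    """Build a capped directory from (home_page, category, popularity) rows.

    Higher popularity means more popular. Categories are ranked by the
    summed popularity of their sites (an assumption, not a known method).
    """
    by_category = defaultdict(list)
    for home_page, category, popularity in ranked_sites:
        by_category[category].append((popularity, home_page))

    top_categories = sorted(
        by_category,
        key=lambda c: sum(p for p, _ in by_category[c]),
        reverse=True,
    )[:max_categories]

    # Keep only the most popular sites within each surviving category.
    return {
        category: [home for _, home in
                   sorted(by_category[category], reverse=True)[:per_category]]
        for category in top_categories
    }


# Hypothetical sample rows using reserved .example domains.
sample = [
    ("a.example", "News", 90),
    ("b.example", "News", 50),
    ("c.example", "News", 70),
    ("d.example", "Search Engines", 100),
    ("e.example", "Blogs", 5),
]
print(build_directory(sample, max_categories=2, per_category=2))
print(MAX_CATEGORIES * SITES_PER_CATEGORY)  # upper bound on listed sites
```

The two caps are what keep the manual moderation effort bounded: however large the underlying index is, the directory never lists more than the product of the two limits.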

So firstly you do a little manual moderation to sift out the naughty stuff. Then within each category you add a 'show more' button that allows you to find more websites similar to those in that category. Most search engines are now smart enough under the hood to do a reasonable job of finding similar things. Regardless of whether your search engine has already categorised every website indexed, if you manually categorise the top few thousand websites all you need to do is hook up the 'show more' button to the search engine's 'find similar' behaviour and make sure it's configured to only show the home pages of websites.

Even a less advanced search engine could at least offer search results for the category, by simply plugging the category name in as a search query when the user presses "show more" while viewing the websites inside a particular category.
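Here is one way the 'show more' button could be wired up, sketched in Python. The `search` and `find_similar` callables are hypothetical stand-ins for whatever the search engine actually exposes; the only real library call is `urllib.parse.urlsplit`, used to reduce each result to its site's home page.

```python
from urllib.parse import urlsplit


def home_page(url):
    """Reduce any page URL to the home page of its website."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def show_more(category, listed, search, find_similar=None, limit=10):
    """Return up to `limit` home pages not already listed in the category.

    `find_similar(listed)` is used when the engine offers it; otherwise the
    category name is simply plugged into `search(category)` as a query.
    Both callables are assumed to return an iterable of result URLs.
    """
    if find_similar is not None:
        candidates = find_similar(listed)
    else:
        candidates = search(category)

    already_shown = {home_page(url) for url in listed}
    more = []
    for url in candidates:
        if len(more) >= limit:
            break
        site = home_page(url)
        if site not in already_shown:  # one entry per website, no repeats
            already_shown.add(site)
            more.append(site)
    return more


# Hypothetical search backend returning page-level results.
def fake_search(query):
    return [
        "https://one.example/article-a",
        "https://two.example/section/page",
        "https://one.example/article-b",
        "https://listed.example/about",
    ]


print(show_more("News", ["https://listed.example/"], fake_search))
```

Collapsing results to home pages is what turns a page-level search into website discovery: several pages from the same site become a single directory entry, and sites already shown in the category are skipped.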

Searchable categories and customised feeds

For bonus points you can make the categories themselves searchable and matching categories can appear with the search results or be separately searchable. You can also use various techniques to further refine and optimise the categories over time. You could even allow users to follow particular categories which might be used to create a customised feed or home page... basically they could then build their own directory.
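As a small illustration of both ideas, the Python sketch below assumes the directory is just a mapping from category name to a list of home pages. The function names, the substring matching and the sample data are my own hypothetical choices, not a description of any existing product.

```python
def search_categories(query, directory):
    """Return the category names that contain the query text."""
    q = query.casefold()
    return [category for category in directory if q in category.casefold()]


def build_feed(followed, directory):
    """A user's personal directory: only the categories they follow."""
    return {c: directory[c] for c in followed if c in directory}


# Hypothetical directory using reserved .example domains.
directory = {
    "News": ["https://daily.example/", "https://world.example/"],
    "Business News": ["https://markets.example/"],
    "Search Engines": ["https://find.example/"],
}

print(search_categories("news", directory))
print(build_feed(["Search Engines", "Cooking"], directory))
```

Followed categories that don't exist in the directory (like 'Cooking' here) are simply ignored, so the feed only ever contains moderated categories.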

And that's how The Website Directory can be resurrected

With a tweak of the search behaviour I think you can make search work with categories again. And beyond that you only need to categorise and moderate a relatively small number of websites, which could even be done using machine learning to save time, and you've built something which behaves very much like a directory of everything on the web. You can then use this directory to build new features that make use of the categories in your directory, even giving users more control over the type of content they see.

Of course I'm being a little overly simplistic, but my main point is there are options available to search engine developers which could bring back the good old days of the website directory if they wanted to. The question is how much value there is in doing so, but that's a discussion for another day. As I've already mentioned, the different methods available to people to find information can not only broaden their world view, but they can also elicit an emotional response.

Search engines have a powerful role in our lives and a tangible effect on our mental state. Surely it is worth questioning how they currently operate, and considering the alterations and alternatives that could improve how we connect with the world?