There are usually many more links between the pages of one site than links from the pages of one site to the pages of another. So our contingency table will be densely filled with ones in some places, but most of it will be empty. We could fold the table row by row, turning each row into the list of pages that the page links to.
For our example table, it would look like this:
Main page → Category 1, Category 2, Page 2
Category 1 → Main page, Page 1, Page 2
Category 2 → Main page
Page 1 → Main page, Page 2
Page 2 → Main page, Page 1
This gives us the so-called direct index. But let’s look at these lists and try to answer the question that excites many SEO specialists: which pages link to Page 2? We would have to go through all the lists and check whether Page 2 is in each of them. This is easy to do when there are five such lists. But there are billions of pages on the Internet, and checking that many lists becomes a very time-consuming task.
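The row-by-row folding and the list-by-list search described above can be sketched in a few lines of Python. This is only an illustration: the 0/1 values in `TABLE` are reconstructed from the lists above, and the names `PAGES`, `TABLE`, `fold_rows` and `find_linking_pages_by_scan` are made up for this example.

```python
PAGES = ["Main page", "Category 1", "Category 2", "Page 1", "Page 2"]

# TABLE[i][j] == 1 means PAGES[i] links to PAGES[j].
# Values are reconstructed from the lists in the article (an assumption).
TABLE = [
    [0, 1, 1, 0, 1],  # Main page
    [1, 0, 0, 1, 1],  # Category 1
    [1, 0, 0, 0, 0],  # Category 2
    [1, 0, 0, 0, 1],  # Page 1
    [1, 0, 0, 1, 0],  # Page 2
]


def fold_rows(pages, table):
    """Fold the table row by row: page -> list of pages it links to."""
    return {
        source: [pages[j] for j, cell in enumerate(row) if cell]
        for source, row in zip(pages, table)
    }


def find_linking_pages_by_scan(direct_index, target):
    """Answer 'who links to target?' by checking every list in the index."""
    return [source for source, targets in direct_index.items() if target in targets]


direct_index = fold_rows(PAGES, TABLE)
print(direct_index["Main page"])
print(find_linking_pages_by_scan(direct_index, "Page 2"))
```

Note that `find_linking_pages_by_scan` has to visit every list in the index, which is exactly the cost that grows with the number of pages.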
To answer the question that worries us so much, we can fold the contingency table by columns instead. As a result, we get, for each page, the list of pages that link to it:
Main page ← Category 1, Category 2, Page 1, Page 2
Category 1 ← Main page
Category 2 ← Main page
Page 1 ← Category 1, Page 2
Page 2 ← Main page, Category 1, Page 1
Now, to find the answer, we only need to locate the one list that belongs to Page 2; we don’t need to go through the contents of every list. This is the backlink index. It is in this form that Serpstat stores data about links to your site. This is a very simplified model, but its basic principles are correct.
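As a rough sketch of the same idea in Python, the column-by-column folding might look like this. It illustrates only the simplified model, not how Serpstat actually implements its storage; the 0/1 values in `TABLE` are reconstructed from the lists above, and `PAGES`, `TABLE` and `fold_columns` are hypothetical names.

```python
PAGES = ["Main page", "Category 1", "Category 2", "Page 1", "Page 2"]

# TABLE[i][j] == 1 means PAGES[i] links to PAGES[j].
# Values are reconstructed from the lists in the article (an assumption).
TABLE = [
    [0, 1, 1, 0, 1],  # Main page
    [1, 0, 0, 1, 1],  # Category 1
    [1, 0, 0, 0, 0],  # Category 2
    [1, 0, 0, 0, 1],  # Page 1
    [1, 0, 0, 1, 0],  # Page 2
]


def fold_columns(pages, table):
    """Fold the table column by column: page -> list of pages that link to it."""
    return {
        target: [pages[i] for i, row in enumerate(table) if row[j]]
        for j, target in enumerate(pages)
    }


backlink_index = fold_columns(PAGES, TABLE)

# One dictionary lookup replaces the scan over every list.
print(backlink_index["Page 2"])
```

With a dictionary keyed by the target page, the lookup cost no longer depends on how many other lists the index holds.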