Personal Hierarchical Tag System

Modern file systems are dominated by hierarchical directory structures. These structures often fail to meet users’ needs, because real-world objects rarely fit a single hierarchy. For example, movies can be categorized by genre, director, year, and so on, but a directory tree lets you organize them along only one dimension at a time.

Many have tried implementing a tag system on a virtual file system or as an auxiliary app (Tagsistant, TMSU). A semantic file system, which uses tags and other metadata to structure files, might be an answer to this problem, but it remains a niche. That said, tags are already an indispensable part of most documentation applications, such as Confluence, Notion, Obsidian, and Apple Notes.

I bring them up because they are a good starting point for implementing a small tag system for personal use. I intentionally do not call it a “tag file system”, to keep the discussion at the level of file management and metadata management.

The documentation applications I mentioned still treat the hierarchical structure as the primary way to organize files, with tags in a supporting role. There are many reasons for that, and I think two of them are the most obvious:

  1. A hierarchical structure is easy to maintain. The hierarchy effectively assigns each file a single tag, whether you call it the path or the hierarchy. A real tag system must let us add and remove tags freely, and it is easy to foresee that managing them by hand would be a disaster for personal use.
  2. A hierarchical structure maps onto real-world artifacts, such as a library or a drawer.

Both reasons look weaker once we dig into them.

The first one is partially solved by machine learning. We may still need to decide which tags exist in the system, but machine learning can do the rest of the dirty work. Classification is comparatively easy here, because most documentation applications use a single file format or a limited number of formats.

The second one is an example of how tools shape our mental model, even though the brain does not necessarily work that way. It is better to design tools that follow our mental model than to reshape the mental model to fit the tools.

I will discuss the practices I use to tag files and how to implement the tag system in Notion.

Definitions

  1. Database: A collection of interrelated records.
  2. Record: A collection of fields, possibly of different data types, typically in a fixed number and sequence. It is called Page in Notion.
  3. Field: A single piece of data in a record, such as text, references to tags, or a list. It is called Property in Notion.
  4. Tag: I don’t know of a commonly agreed definition, but I think of it simply as a set of records, which may be empty.
  5. Tag Tree: Explained in Principle 2.
  6. Path to Tag: Explained in step 2 of Implementation in Notion.
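
The definitions above can be sketched as a small data model. This is only an illustration in Python, not how Notion stores anything; the names `Record`, `database`, `tags`, and `records_with_tag`, as well as the sample records and the `Medium` field, are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A record (a Page in Notion): a titled collection of fields."""
    title: str
    # Fields (Properties in Notion) keyed by name; values can be any data.
    fields: dict[str, object] = field(default_factory=dict)


# A database is a collection of records (sample data, purely illustrative).
database = [
    Record("Dune", {"Medium": "novel"}),
    Record("Call of Duty: Modern Warfare", {"Medium": "video game"}),
]

# A tag is a set of records, identified here by title. It may be empty.
tags: dict[str, set[str]] = {
    "Book": {"Dune"},
    "Game": {"Call of Duty: Modern Warfare"},
    "Music": set(),
}


def records_with_tag(tag: str) -> list[Record]:
    """Return the records in the database that belong to the given tag."""
    members = tags.get(tag, set())
    return [record for record in database if record.title in members]
```

Treating a tag as a set keeps the definition symmetric: asking which records a tag contains and asking which tags a record has are two views of the same data.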

Principles

  1. Single Source of Truth: There is one main database. Every other database is one derived view of the main database.

  2. Tree Structure: Tags are stored in a tree. A record’s entry can be tracked by multiple tags.

  3. Hierarchical: Tags are hierarchical. That is to say, the only relationships that exist between one tag and another are “child of” and “parent of”. If a record’s entry appears under one tag, it also belongs to all of that tag’s ancestor tags. For example, in the following tag system, record 1 is tracked in Tag E, so it is also tracked in Tag D and Tag A.

    Example of Tag Tree Structure
  4. Adapt to Changes: The position of tags can be changed in the tag tree. Tags can be added and removed.

    1. Example of adding a parent tag: We already have a tag Book. A new tag Entertainment is added, which is the parent of Book. All the records that are tagged Book should also be tagged with Entertainment after the update.
    2. Example of adding a child tag: We already have a tag Book. A new tag Sci-fi is added, which is the child of Book. All the records that are tagged Book should be checked to see whether they fit into Sci-fi; those that do not are left in Book only.
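
Principles 2 to 4 can be sketched in a few lines of Python. This is a minimal sketch under my own assumptions, not part of the Notion setup: the tree is stored as a child-to-parent map, its shape follows the example (Tag E under Tag D under Tag A, and Tag B under Tag A), and the names `parent`, `direct_tags`, `ancestors`, `effective_tags`, and `add_parent` are hypothetical.

```python
from typing import Optional

# Each tag maps to its parent; a root tag has parent None.
# Assumed shape of the example tree: A > D > E and A > B.
parent: dict[str, Optional[str]] = {
    "Tag A": None,
    "Tag B": "Tag A",
    "Tag D": "Tag A",
    "Tag E": "Tag D",
}

# Tags applied directly to each record.
direct_tags: dict[str, set[str]] = {"record 1": {"Tag E"}}


def ancestors(tag: str) -> list[str]:
    """Return the tag's ancestors, from the nearest parent up to the root."""
    chain = []
    current = parent[tag]
    while current is not None:
        chain.append(current)
        current = parent[current]
    return chain


def effective_tags(record: str) -> set[str]:
    """Direct tags plus every ancestor of each direct tag (principle 3)."""
    result = set()
    for tag in direct_tags[record]:
        result.add(tag)
        result.update(ancestors(tag))
    return result


def add_parent(new_tag: str, child: str) -> None:
    """Insert new_tag between child and child's current parent (principle 4)."""
    parent[new_tag] = parent[child]
    parent[child] = new_tag
```

Here `effective_tags("record 1")` gives Tag E, Tag D, and Tag A. Because membership in parent tags is derived rather than stored, adding Entertainment as the parent of Book with `add_parent("Entertainment", "Book")` needs no re-tagging: every record tagged Book picks up Entertainment the next time its effective tags are computed. Adding a child tag is different, since someone (or a classifier) still has to decide which records move down into it.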

Implementation in Notion

  1. Add a database with a property named Tags.
  2. The tags are the paths to the records in a tag tree. For example, in the figure Example of Tag Tree Structure, record 1 should be tagged with Tag A/Tag D/Tag E and Tag A/Tag B and, because of principle 3, also with Tag A and Tag A/Tag D. Many wiki systems, such as Wiki.js, store tags in this format.
  3. Once all possible paths in the tag tree have been added as options of the Tags property, new records can be autofilled with Notion AI.
Adding Paths to the Tags in Notion
  4. However, AI is not always accurate. I noticed that Notion AI tends to include the most specific tag while neglecting its parent tags. For example, the record named Call of Duty: Modern Warfare is tagged with Entertainment/Game/FPS but not with Entertainment or Entertainment/Game. We can work around this by creating a formula property that stores all the tags separately. The following formula extracts every individual tag from the tag paths:
unique(flat(map(flat(prop("Tags")), split(current, "/"))))
Creating a Formula Property Storing All Tags
  5. Finally, the view can be customized with a filter on the formula property.
Configuring View Filter
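
To make the parent-tag workaround easier to reason about outside Notion, here is a minimal Python sketch of the same idea. It does not call any Notion API. The function names are hypothetical: `all_tag_names` mirrors what the formula computes (split every path on "/" and keep the unique names), `with_parent_paths` shows the alternative of restoring the missing parent paths themselves, and `matches` stands in for a view filter on the formula property.

```python
def with_parent_paths(paths: list[str]) -> list[str]:
    """Expand each path into itself plus all of its parent paths.

    Order of first appearance is preserved and duplicates are dropped.
    """
    seen: dict[str, None] = {}
    for path in paths:
        parts = path.split("/")
        for i in range(1, len(parts) + 1):
            seen.setdefault("/".join(parts[:i]), None)
    return list(seen)


def all_tag_names(paths: list[str]) -> list[str]:
    """Mirror of the formula: split every path on "/" and keep unique names."""
    seen: dict[str, None] = {}
    for path in paths:
        for name in path.split("/"):
            seen.setdefault(name, None)
    return list(seen)


def matches(paths: list[str], tag_name: str) -> bool:
    """Stand-in for a view filter that checks the formula property."""
    return tag_name in all_tag_names(paths)


if __name__ == "__main__":
    tags = ["Entertainment/Game/FPS"]  # what Notion AI filled in
    print(with_parent_paths(tags))
    print(all_tag_names(tags))
    print(matches(tags, "Entertainment"))
```

For the Call of Duty record, `all_tag_names` yields Entertainment, Game, and FPS, so a filter on Entertainment matches even though only the most specific path was stored. One trade-off of splitting into bare names is that two tags with the same name under different parents become indistinguishable; `with_parent_paths` avoids that by keeping full paths.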