A Syntax for Self-Tracking

For a while now, I have been doing self-tracking in a text file. The reason is that I want to track not only one aspect of life, like fitness or health or nutrition, but anything that I suspect might be interesting to analyze later. Like how the time I wake up impacts my mood. Or which eye drops work best to prevent dry eyes. Or how the temperature of my bedroom impacts my energy the next day. Did I wake up with a headache because I ate pizza late at night? Or because I slept with the window wide open? Maybe I should avoid one of these in the future?

If a completely context-free self-tracking app exists, I am not aware of it. Every tracking app seems to apply only to a certain narrow topic - often sport or food. And then all the apps send the data to a central server, which makes me uncomfortable.

To get started experimenting with context-free self-tracking, I tried it in a simple text file. As it turns out, it is surprisingly doable, and it has led me to a bunch of interesting results already.

It has led to a data structure that I find useful to do context-free self-tracking. A bit like the Git data structure: you can manipulate it with simple tools, and if you are nerd enough you might stick with those. I, for example, use only raw Git to do version control. And I only use Vim to operate on the self-tracking data structure I describe in this article. The general population would probably prefer higher-level tools (i.e., an app), which I will probably write later on. But first I want to get the data structure right.

I started with a simple space-separated approach of "date time event":

2020-05-28 18:41 Eat Pizza
2020-05-29 09:00 Slept with the window open
2020-05-29 09:00 Headaches

Everything is freeform. There is just one rule:

1: Every line starts with the date and time

This gives me the complete freedom to log whatever data I want. But I knew there would be more than just free text logging in the future. Structured logging of quantities (how many kilometers I ran, how many hours I slept), fine-grain data on multiple levels (How strong is the headache? Is it on the right or left side?), proper A/B tests, etc. I wanted to keep the option to introduce all this into the syntax without losing the freedom of logging free text. So I made this second rule:

2: Everything until [^a-z ] describes an observation

(A non-technical way to put it: Everything that only consists of the characters A-Z plus the space sign describes an observation.)

To do away with any confusion about capitalization, I decided to make everything case-insensitive:

3: Uppercase characters equal their lowercase counterparts

This way, I can write anything into the log that comes to my mind without thinking about syntax at all. As long as I stick to the characters A-Z and the space, I can log anything in any way I like.

Another early design decision about my self-tracking was that it is OK to write events into the log at any time. So I wrote all three log entries above at 09:00 in the morning, even the one about 18:41 of the last day. It is impossible to proactively track everything that could be of interest later. This is different to scientific studies where you would usually define upfront which causes to measure against which outcomes. But I think it is still useful to retroactively log data in the hopes that you can later make sense of it. That is how the human mind works. It's normal to think about past events when you try to find causes for the current situation. And I think a proper lifelong log can help us with this, even if we do not set up A/B tests - but we will, further down in this article.

As you can see, I also took the liberty to log observations about the past. "Slept with the window open." An acceptable alternative is to retroactively put the beginning of the event into the log, like I did with the pizza:

2020-05-28 18:41 Eat Pizza
2020-05-28 23:20 Go to bed with the window open
2020-05-29 09:00 Headaches

The idea is to log a lot of data quickly when I feel like logging it. The data can be dirty. No problem - clean it up later. Keep it in Git to have a track record how it changed.

A nice convenience in Vim is that you can make it suggest the next word by pressing CTRL+N. This makes logging very fast. Instead of typing "Headache" you can just type H and CTRL+N and it will give you a list of every word with H you already have in your log. It also prevents typos and makes the data cleaner.

For even greater convenience, I added another rule to my syntax:

4: _ equals space.

This means that instead of writing "Slept with the window open" I can write "Slept_with_the_window_open". From a data perspective, the two are equivalent. But for typing, now all I have to do is type S-CTRL+N and I get the whole event suggested by Vim "Slept_with_the_window_open". Which makes typing this event a matter of three keystrokes and keeps the data clean as I will always write it the same way.

At this point, writing the events was already super fast. The most cumbersome part of logging was typing the date and time manually. So I added a shortcut to Vim:

nnoremap <space>t o<C-r>=strftime("%F %H:%M ")<cr>

Now all I have to do to add a log line is to hit space+t and I will be on a line that already has the date and time. So I can directly start typing the event that I want to log. Making a log entry now usually only takes about three seconds as the date/time is automatically inserted and the event is usually suggested too after I type the first few characters.

After dabbling with freeform log events for a while, I wanted multiple levels of an observation. So instead of

2020-06-01 18:41 Meeting with Hugo Mayer

I started writing:

2020-06-01 17:00 Meeting: Hugo Mayer

So the colon has a special meaning:

5: A colon begins another level of the observation

I use this for measurements all the time:

2020-06-02 11:00 Temperature: 22°C
2020-06-02 11:00 Humidity: 43%

And also for subjective measurements:

2020-06-02 12:15 Mood: Very Good
2020-06-04 13:20 Sore eyes: Medium

There can be multiple colons in one line. For example, the following would log that I had sore eyes and felt it mostly in my left eye:

2020-06-04 13:20 Sore eyes: Medium: Mostly Left

A/B Tests

What about A/B tests? Maybe it is not the pizza that causes headaches the next day, but that eating pizza and having a headache the next day have the same root cause, like not eating enough for breakfast?

Here comes the question mark:

2020-06-05 22:30 Eat: Pizza? Yes

A question mark marks a coin flip. So if I say to myself, "I am hungry, but should I really eat pizza at this time?" then I write down the thing I am about to do and add a question mark. This means I will now have to do a coin flip and decide between Yes and No. Yes means the event left to the coin flip took place. No means it did not.

To make this easier, I added this shortcut to my bashrc:

alias coindecide='if (( RANDOM % 2 == 0 )); then echo Yes; else echo No; fi'

So now I can just type "coindecide<enter>" to get a decision by coin flip. And since bash has autocompletion, I usually just type "coi<tab><enter>" and have my decision. Super fast.

I could have put a coindecide macro into Vim, of course. But it is a nice tool in many situations, not only when writing. So I added it to the shell instead.

Additional information

To be able to put more info into the log even in lines structured according to the aforementioned six rules, I use parenthesis:

2020-07-08 12:30 Take a walk? Yes (60min)

7: Parenthesis can be used to add additional information

Order

Since I want a human-friendly format, the time is only tracked by the minute. This means that the order of events that are happening within the same minute is defined by their order in the log. This is important when using tools on the log to convert, filter, or merge it with other logs. Or when importing it into a database. Order always has to be preserved.

8: Order is important

So these are the eight rules I have been developing over the last five months of self-tracking:

1: Every line starts with the date and time
2: Everything until [^a-z ] describes an observation
3: Uppercase characters equal their lowercase counterparts
4: _ equals space
5: A colon begins another level of the observation
6: A question mark indicates a coin flip
7: Parenthesis can be used to add more information
8: Order is important

Discussions about this text are taking place on Twitter and Lobste.rs