Why Loss and Accuracy Metrics Conflict?

Posted by Jussi Huotari on 17 January 2018, 9:18 am

A loss function is used to optimize a machine learning algorithm. An accuracy metric is used to measure the algorithm’s performance (accuracy) in an interpretable way. It goes against my intuition that these two sometimes conflict: loss is getting better while accuracy is getting worse, or vice versa.

I’m working on a classification problem and once again got these conflicting results on the validation set.

Loss vs Accuracy

Accuracy (orange) finally rises to a bit over 90%, while the loss (blue) drops nicely until epoch 537 and then starts deteriorating. Around epoch 50 there’s a strange drop in accuracy even though the loss is smoothly and quickly getting better.

My loss function here is categorical cross-entropy that is used to predict class probabilities. The target values are one-hot encoded so the loss is the best when the model’s output is very close to 1 for the right category and very close to 0 for other categories. The loss is a continuous variable.

Accuracy or more precisely categorical accuracy gets a discrete true or false value for a particular sample.

It’s easy to construct a concrete test case showing conflicting values. Case 1 has less error but worse accuracy than case 2:

Loss vs Accuracy example spreadsheet

For reference, calculating categorical cross-entropy in Keras for one sample:

truth = K.variable([[1., 0., 0.]])
prediction = K.variable([[.50, .25, .25]])
loss = K.eval(K.categorical_crossentropy(truth, prediction))

In what kind of situations does loss-vs-accuracy discrepancy occur?

When the predictions get more confident, loss gets better even though accuracy stays the same. The model is thus more robust, as there’s a wider margin between classes.
If the model becomes over-confident in its predictions, a single false prediction will increase the loss unproportionally compared to the (minor) drop in accuracy. An over-confident model can have good accuracy but bad loss. I’d assume over-confidence equals over-fitting.
Imbalanced distributions: if 90% of the samples are “apples”, then the model would have good accuracy score if it simply predicts “apple” every time.

Accuracy metric is easier to interprete, at least for categorical training data. Accuracy however isn’t differentiable so it can’t be used for back-propagation by the learning algorithm. We need a differentiable loss function to act as a good proxy for accuracy.

Spell Out Convolution 1D (in CNN’s)

Posted by Jussi Huotari on 20 December 2017, 5:20 pm

I’m working on a text analysis problem and got slightly better results using a CNN than RNN. The CNN is also (much) faster than a recurrent neural net. I wanted to tune it further but had difficulties understanding the Conv1D on the nuts and bolts level. There are multiple great resources explaining 2D convolutions, see for example CS231n Convolutional Neural Networks for Visual Recognition, but I couldn’t find a really simple 1D example. So here it is.

1D Convolution in Numpy

import numpy as np
conv1d_filter = np.array([1,2])
data = np.array([0, 3, 4, 5])
result = []
for i in range(3):
 print(data[i:i+2], "*", conv1d_filter, "=", data[i:i+2] * conv1d_filter)
 result.append(np.sum(data[i:i+2] * conv1d_filter))
print("Conv1d output", result)

[0 3] * [1 2] = [0  6]
[3 4] * [1 2] = [3  8]
[4 5] * [1 2] = [4 10]

Conv1d output [6, 11, 14]

The input data is four items. The 1D convolution slides a size two window across the data without padding. Thus, the result is an array of three values. In Keras/Tensorflow terminology I believe the input shape is (1, 4, 1) i.e. one sample of four items, each item having one channel (feature). The Convolution1D shape is (2, 1) i.e. one filter of size 2.

1D Convolution Kernel Size 2

The Same 1D Convolution Using Keras

Set up a super simple model with some toy data. The convolution weights are initialized to random values. After fitting, the convolution weights should be the same as above, i.e. [1, 2].

from keras import backend as K
from keras.models import Sequential
from keras.optimizers import Adam
from keras.layers import Convolution1D
K.clear_session()
toyX = np.array([0, 3, 4, 5]).reshape(1,4,1)
toyY = np.array([6, 11, 14]).reshape(1,3,1)

toy = Sequential([
 Convolution1D(filters=1, kernel_size=2, strides=1, padding='valid', use_bias=False, input_shape=(4,1), name='c1d')
])
toy.compile(optimizer=Adam(lr=5e-2), loss='mae')
print("Initial random guess conv weights", toy.layers[0].get_weights()[0].reshape(2,))

Initial random guess conv weights [-0.99698746 -0.00943983]

Fit the model and print out the convolution layer weights on every 20th epoch.

for i in range(200):
  h = toy.fit(toyX, toyY, verbose=0)
  if i%20 == 0:
    print("{:3d} {} \t {}".format(i, toy.layers[0].get_weights()[0][:,0,0], h.history))

  0 [-0.15535446  0.85394686] 	 {'loss': [7.5967063903808594]}
 20 [ 0.84127212  1.85057342] 	 {'loss': [1.288176417350769]}
 40 [ 0.96166265  1.94913495] 	 {'loss': [0.14810483157634735]}
 60 [ 0.9652133   1.96624792] 	 {'loss': [0.21764929592609406]}
 80 [ 0.98313904  1.99099088] 	 {'loss': [0.0096222562715411186]}
100 [ 1.00850654  1.99999714] 	 {'loss': [0.015038172714412212]}
120 [ 1.00420749  1.99828601] 	 {'loss': [0.02622222900390625]}
140 [ 0.99179339  1.9930582 ] 	 {'loss': [0.040729362517595291]}
160 [ 1.00074089  2.00894833] 	 {'loss': [0.019978681579232216]}
180 [ 0.99800795  2.01140881] 	 {'loss': [0.056981723755598068]}

Looks good. The convolution weights gravitate towards the expected values.

1D Convolution and Channels

Let’s add another dimension: ‘channels’. In the beginning this was confusing me. Why is it 1D conv if input data is 2D? In 2D convolutions (e.g. image classification CNN’s) the channels are often R, G, and B values for each pixel. In 1D text case the channels could be e.g. word embedding dimension: a 300-dimensional word embedding would introduce 300 channels in the data and the input shape for single ten words long sentence would be (1, 10, 300).

K.clear_session()
toyX = np.array([[0, 0], [3, 6], [4, 7], [5, 8]]).reshape(1,4,2)
toyy = np.array([30, 57, 67]).reshape(1,3,1)
toy = Sequential([
 Convolution1D(filters=1, kernel_size=2, strides=1, padding='valid', use_bias=False, input_shape=(4,2), name='c1d')
])
toy.compile(optimizer=Adam(lr=5e-2), loss='mae')
print("Initial random guess conv weights", toy.layers[0].get_weights()[0].reshape(4,))

Initial random guess conv weights [-0.08896065 -0.1614058   0.04483104  0.11286306]

And fit the model. We are expecting the weights to be [[1, 3], [2, 4]].

# Expecting [1, 3], [2, 4]
for i in range(200):
  h = toy.fit(toyX, toyy, verbose=0)
  if i%20 == 0:
    print("{:3d} {} \t {}".format(i, toy.layers[0].get_weights()[0].reshape(4,), h.history))

  0 [-0.05175393 -0.12419909  0.08203775  0.15006977] 	 {'loss': [51.270969390869141]}
 20 [ 0.93240339  0.85995835  1.06619513  1.13422716] 	 {'loss': [34.110202789306641]}
 40 [ 1.94146633  1.8690213   2.07525849  2.14329076] 	 {'loss': [16.292699813842773]}
 60 [ 2.87350631  2.8022306   3.02816415  3.09959674] 	 {'loss': [2.602280855178833]}
 80 [ 2.46597505  2.39863443  2.96766996  3.09558153] 	 {'loss': [1.5677350759506226]}
100 [ 2.30635262  2.25579095  3.12806559  3.31454086] 	 {'loss': [0.59721755981445312]}
120 [ 2.15584421  2.15907145  3.18155575  3.42609954] 	 {'loss': [0.39315733313560486]}
140 [ 2.12784624  2.19897866  3.14164758  3.41657996] 	 {'loss': [0.31465086340904236]}
160 [ 2.08049321  2.22739816  3.12482786  3.44010139] 	 {'loss': [0.2942861020565033]}
180 [ 2.0404942   2.26718307  3.09787416  3.45212555] 	 {'loss': [0.28936195373535156]}
...
  n [ 0.61243659  3.15884042  2.47074366  3.76123118] 	 {'loss': [0.0091807050630450249]}

Converges slowly, and looks like it found another fitting solution to the problem.

1D Convolution and Multiple Filters

Another dimension to consider is the number of filters that the conv1d layer will use. Each filter will create a separate output. The neural net should learn to use one filter to recognize edges, another filter to recognize curves, etc. Or that’s what they’ll do in the case of images. This excercise resulted from me thinking that it would be nice to figure out what the filters recognize in the 1D text data.

K.clear_session()
toyX = np.array([0, 3, 4, 5]).reshape(1,4,1)
toyy = np.array([[6, 12], [11, 25], [14, 32]]).reshape(1,3,2)
toy = Sequential([
 Convolution1D(filters=2, kernel_size=2, strides=1, padding='valid', use_bias=False, input_shape=(4,1), name='c1d')
])
toy.compile(optimizer=Adam(lr=5e-2), loss='mae')
print("Initial random guess conv weights", toy.layers[0].get_weights()[0].reshape(4,))

Initial random guess conv weights [-0.67918062  0.06785989 -0.33681798  0.25181985]

After fitting, the convolution weights should be [[1, 2], [3, 4]].

for i in range(200):
  h = toy.fit(toyX, toyy, verbose=0)
  if i%20 == 0:
    print("{:3d} {} \t {}".format(i, toy.layers[0].get_weights()[0][:,0,0], h.history))

  0 [-0.62918061 -0.286818  ] 	 {'loss': [17.549871444702148]}
 20 [ 0.36710593  0.70946872] 	 {'loss': [11.24349308013916]}
 40 [ 1.37513924  1.71750224] 	 {'loss': [4.8558430671691895]}
 60 [ 1.19629359  1.83141077] 	 {'loss': [1.5090690851211548]}
 80 [ 1.00554276  1.95577395] 	 {'loss': [0.55822056531906128]}
100 [ 0.97921425  2.001688  ] 	 {'loss': [0.18904542922973633]}
120 [ 1.01318741  2.00818276] 	 {'loss': [0.064717374742031097]}
140 [ 1.01650512  2.01256871] 	 {'loss': [0.085219539701938629]}
160 [ 0.986902    1.98773074] 	 {'loss': [0.022377887740731239]}
180 [ 0.98553228  1.99929678] 	 {'loss': [0.043018341064453125]}

Okay, looks like first filter weights got pretty close to [1, 2]. How about the 2nd filter?

# Feature 2 weights should be 3 and 4
toy.layers[0].get_weights()[0][:,0,1]

array([ 3.00007081,  3.98896456], dtype=float32)

Okay, looks like the simple excercise worked. Now back to the real work.

Less Sexism in Finnish

Posted by Jussi Huotari on 18 September 2017, 10:22 pm

Machine learning models are only as good as the data you use to train them. AI sexism and racist machines have made the news in 2017.

Yesterday, Slava Akhmechet tweeted about his word2vec language model test. According to the tweet, the test was trained on the Google News dataset that contains one billion words from news articles in English.

Word2vec can be used to find similar meanings between words. For example: (Man | King) would be analogous to (Woman | ________) … Can you guess? Yes, “Queen”.

Image from Tensorflow Tutorial

However, Slava’s tweet shows that according to news text, while “he” is “persuasive”, “she” is “seductive”, and so on:

Got word2vec trained on the Google News dataset working on my laptop. Holy fuck… pic.twitter.com/1kl43gtuWi

— Slava Akhmechet (@spakhm) September 17, 2017

I’ve ran some word2vec tests in Finnish. I thought it would be interesting to see if a similar bias exists in Finnish as well. You know, Finland being one the most gender-equal countries, where we speak a language that has only gender-neutral pronouns. In Finnish, both “he” and “she” are the same word: “hän”.

Results

Finnish is more equal than English.

For example, in Finnish, both “man” and “woman” have similarity to “gynecologist” and “general practitioner”. (Top-10 similarity for woman included “nurse”, which was missing for man.)

(Man+persuasive) is mostly equal to (Woman+persuasive): both are “credible”, “dashing”, “impassioned”.

Mostly equal in traffic as well. (Man+biker) has similarity score 0.54; (Woman+biker) = 0.51.

Personal qualities:
 (Man+credible)    = 0.24 vs (Woman+credible)  = 0.25
 (Man+dependable)  = 0.21 vs (Woman+dependable)= 0.22

It’s not all that rosy, though.

 (Man+manager)     = 0.24 vs (Woman+manager)   = 0.17
 (Man+sensitive)   = 0.23 vs (Woman+sensitive) = 0.30

The training data for this word2vec model is the Finnish Internet, i.e. articles, news, discussions, and online forums in Finnish (by Turku BioNLP Group)

You Should Use Deep Learning for That!

Posted by Jussi Huotari on 17 August 2017, 9:31 am

I have been studying machine learning and artificial intelligence lately. Then run into this tweet that resonated deeply. 😉

The image is a clever update of XKCD’s original obnoxious physicist.

Computer Backup Using Restic on Mac

Posted by Jussi Huotari on 14 January 2017, 3:40 pm

Restic is an open source backup program. I really like Restic’s design goals, so I chose to try Restic for backing up my computer at home. Restic is still a young project and it was (for me) somewhat difficult to figure out the right way the utilize it. I wrote down my approach – maybe it’ll be useful for someone else in the future. So here’s a how-to describing a Restic setup on a Mac. This is neither a recommended setup nor a guideline, but may give useful ideas for your own setup.

My objective is to have a reliable, automatic backup process for my Mac at home. There’s around 500GB of data to backup to a network drive on the same LAN. I also want to make an off-site backup over WAN but, to keep things simple, the plan is to use rsync to copy the Restic repository, instead of utilizing Restic in that part of the process.

What To Back Up?

I want to backup data only, such as photos that I’ve taken and that can’t be replaced if lost. I chose not to do a full backup, so I’ll exclude e.g. the operating system, installed applications, and purchased media that can be replaced or re-downloaded from the vendor. I’ll backup the data that resides in my home folder:

/Users/jussi
- /Applications (for the few app-specific settings)
- /Documents
- /Movies (my GoPro stuff lives here)
- /Pictures
- /Music/GarageBand (most of Music is .mp3’s but I’m only backing up my own projects)

At first I thought I’d make an include-list containing the folders I want in the backup. Restic supports includes and excludes, but the way the support is currently done, it’s easier to include the whole home folder and then exclude some folders. This is probably better anyway, because new subfolders will be included in the backup by default, and you don’t need to remember to add the new folder to the includes list for backups.

I found it the easiest to list the home dir contents and use the result as the basis for the excludes file, which turned out along these lines:

$ ls -a1 ~/
*.DS_Store
*.swp
.Trash
.Xauthority
.bash_sessions

[...] 

Music/**/*.mp3
Music/**/*.jpg
Music/**/*.png

Step 1: Initialize

Restic’s user manual is a great source, and describes clearly how to install the program.

My network drive is mounted simply as an SMB share. I decided to create a folder called restic for the backups. Then I figured that because Restic supports SFTP repositories, I might share that space with friends for their off-site backup needs. So I made a subfolder for my own backups: restic/jussi. My first idea was to name the backup folder by device, so I would have something like /laptop, /imac, and /raspberry but then I realised it’s better to back up all devices to the same folder. This is because Restic’s repository structure already distinguishes different hosts, and because there may be even greater de-duplication if all data goes to the same repository. So here we go:

$ ./restic init --repo /Volumes/restic/jussi

Step 2: Environment

Restic encrypts backup data so repository access requires a key. To avoid inputing the password or configuring it in multiple scripts, I simply stored my password to a local plaintext file. The password secures the backed up data in the (remote) repository. In the local system, if one can access the plaintext password file, they can access the filesystem anyway. So even if it feels wrong to store passwords as plaintext, I assume it’s ok here.

$ cat mypassword > repo_pwd.txt

Step 3: Testing

Let’s try it out. Back up a bit of data:

$ ./restic -r /Volumes/restic/jussi -p repo_pwd.txt backup --exclude-file exclude.txt ~/Music/GarageBand

See what went to the repository:

$ ./restic -r /Volumes/restic/jussi -p repo_pwd.txt snapshots

Ok, looks good. I would have liked to browse the repository but didn’t have FUSE installed and Restic’s WebDAV support is still pending (restic#485) so I just did a restore:

$ ./restic -r /Volumes/restic/jussi -p repo_pwd.txt restore bcd46723 --target ~/temp/test

Step 4: Full Initial Backup

$ nice -n 10 ./restic -r /Volumes/restic/jussi -p repo_pwd.txt backup --exclude-file exclude.txt --one-file-system --tag initial /Users/jussi

I probably won’t be deleting much of the data once it’s backed up. I tend to save all pics and videos for some obscure future use. Therefore I used the initial tag for the snapshot to help later, when devising a policy for deleting old backups.

This step took many hours. I think the bottleneck was my LAN capacity. At this point, Restic reports that there are 411GiB to back up. After the process finished, I did

$ du -hc -d 1 /Volumes/restic/jussi

to find out the repository size but got weird numbers well over terabytes (possibly because it’s an SMB share?). So, instead, I opened a Finder window with total size field turned on, and now the repository size was reported as 301GB. Interesting.

Step 5: Housekeeping i.e. Forget and Prune

To save disk space, I set up a process for deleting old backups periodically. My mindset here is that if a file is changed on my computer, it’s also changed in the backup. There won’t be multiple versions of the file or kind of undo-feature for deleted files. However, I think it saves so much time and bandwidth that it’s worthwhile to keep the initial backup and build on that.

The Restic flow is to first forget snapshots from the index and then actually delete the data from disk using prune.

$ /restic -r /Volumes/restic/jussi -p repo_pwd.txt forget --keep-tag initial --keep-weekly 2

$ ./restic -r /Volumes/restic/jussi -p repo_pwd.txt prune

Step 6: Automate

Restic is (currently) optimized for fast incremental backups. But prune may be slow. So I’ll do this in two parts: a daily backup and a weekly housekeeping, using launchd.

The weekly housekeeping, or forget and prune, requires running two parts so I made a separate bash script for it. Launchd is then configured to run this script.

$ launchctl load ~/Library/LaunchAgents/local.restic_backup.plist
$ launchctl load ~/Library/LaunchAgents/local.restic_housekeeping.plist
$ launchctl list | grep -v com.apple. # To see if they loaded properly
...
$ launchctl start local.restic_backup # Do a test run

Manually starting the process is a good way to catch launchd errors. If all is well, you should see logging in /tmp/restic.log.

TODO: these processes only log to /tmp. It would be much better to alert if things go wrong, and maybe also report weekly/monthly that things are going ok. Maybe restic#667 will help here.

TODO 2: the backup process is very slow even over a good wifi connection. For the 400GB it took 2 hours 23 minutes. As a matter of fact, it wouldn’t matter if a backup process takes time, but Restic needs to communicate with the repository in a way that renders my LAN slow. One cure could be to force Restic to read the local source data in total and avoid querying the remote repository using –force.

[EDIT] Samba on Mac is s l o w . The –force switch did not help and I almost decided to ditch Restic because the daily backup was just too much for my wifi. But then I found quite a few reports about Apple’s slow SMB implementation. I switched from using Samba to sftp, and the daily backup is now fast and does not clog up my network.

For Reference

$ cat ~/Library/LaunchAgents/local.restic_backup.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
 <dict>
  <key>Label</key>
  <string>local.restic_backup</string>
  <key>WorkingDirectory</key>
  <string>/Users/jussi/restic</string>
  <key>ProgramArguments</key>
  <array>
   <string>./restic</string>
   <string>-r</string>
   <string>/Volumes/restic/jussi</string>
   <string>-p</string>
   <string>repo_pwd.txt</string>
   <string>backup</string>
   <string>--exclude-file</string>
   <string>exclude.txt</string>
   <string>--one-file-system</string>
   <string>/Users/jussi</string>
  </array>
  <key>StandardOutPath</key>
  <string>/tmp/restic.log</string>
  <key>StandardErrorPath</key>
  <string>/tmp/restic.log</string>
  <key>StartCalendarInterval</key>
  <dict>
   <key>Hour</key>
   <integer>12</integer>
   <key>Minute</key>
   <integer>0</integer>
  </dict>
  <key>Nice</key>
  <integer>10</integer>
 </dict>
</plist>

$ cat ~/Library/LaunchAgents/local.restic_housekeeping.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
 <dict>
  <key>Label</key>
  <string>local.restic_housekeeping</string>
  <key>WorkingDirectory</key>
  <string>/Users/jussi/restic</string>
  <key>ProgramArguments</key>
  <array>
   <string>./housekeeping.sh</string>
  </array>
  <key>StandardOutPath</key>
  <string>/tmp/restic.log</string>
  <key>StandardErrorPath</key>
  <string>/tmp/restic.log</string>
  <key>StartCalendarInterval</key>
  <dict>
   <key>Weekday</key>
   <integer>7</integer>
   <key>Hour</key>
   <integer>15</integer>
   <key>Minute</key>
   <integer>0</integer>
  </dict>
  <key>Nice</key>
  <integer>10</integer>
 </dict>
</plist>

$ cat housekeeping.sh 
#!/bin/bash
./restic -r /Volumes/restic/jussi -p repo_pwd.txt forget --keep-tag initial --keep-weekly 2
./restic -r /Volumes/restic/jussi -p repo_pwd.txt prune

Management by Counting on 110% Effort

Posted by Jussi Huotari on 6 December 2016, 9:35 am

There’s a massively delayed subway construction project in Espoo, Finland. Based on the public info, the project seems to be a mess of mismanagement. In a discussion regarding the project’s management style, a friend of mine quipped:

You know these managers telling their team will get this done by pulling a few all-nighters, or that she’ll be able to push the team to give a 110% effort? Anybody listening to these kind of arguments must immediately file the message to the La-la Land catagory.

Well said. It’s easy to agree that wishful thinking is not the right way forward. A manager can’t generate positive outcome by force of will. You can’t force facts to inexistence by ignoring them.

But how about the flipside? Taking a team’s velocity as granted is not right either. It is possible for a team to perform much better than they are used to, if led properly. There are numerous examples of sports teams achieving victories that everyone deemed impossible. Think about how Iceland made it to the quarter finals in Euro 2016. It wasn’t supposed to be possible, but inspiration, confidence, and attitude will work wonders.

I have a fond memory of a time I took a friend for a 15km run. For someone not used to running or biking, the distance sounded insurmountable. It took a while to get her even thinking about being able to run the distance. Even when we set out, she was making contingency plans. My job was to show that it is possible, assure that she can do it, and try to maintain speed and direction. Pretty trivial, eh? The thing is, when we finished, she had done something she didn’t think was possible. That was a powerful and inspiring moment.

If, as a leader, you settle for the usual, “the team’s 100% effort”, you won’t reach the full potential. There are famous anecdotes about Steve Jobs:

Those who did not know Jobs interpreted the Reality Distortion Field as a euphemism for bullying and lying. But those who worked with him admitted that the trait, infuriating as it might be, led them to perform extraordinary feats. Because Jobs felt that life’s ordinary rules didn’t apply to him, he could inspire his team to change the course of computer history with a small fraction of the resources that Xerox or IBM had. [HBR 2012-4]

I tend to think those anecdotes represent wishful thinking and belong to the La-la Land.

However, I believe it’s worthwhile to invest time and effort to pushing the limits. The famous scene from Invictus resonates with me. “But how to get them to be better than they think they can be?”

The million dollar question is “How to inspire ourselves to greatness when nothing less will do?”

What Does a CDO Do?

Posted by Jussi Huotari on 16 November 2016, 1:06 pm

Your board of directors is incomplete if you don’t employ a Chief Digital Officer. Right? Coming from tech startups, this sounds odd to me. For us, digitalization is the air we breath.

So what do the companies need a CDO for?

IIC Partners published a global survey last September. According to the Rise of Digital Leadership:

1/4 of companies employ a CDO currently.
1/3 of companies plan to hire a CDO in the next two years.
From that we can calculate that 50% of businesses will soon have a CDO

From the survey, the CDO’s responsibilities are:

Ensuring digital assets work across business units
Articulating the strategic vision to other leadership teams
Championing increased value through use of digital transformation

The premise seems to be that digitalization is complex and significant enough to require a specific function, on par with CFO, CMO, and CIO. Digitalization enables a business to build new capabilities and move quickly. McKinsey’s Markovitch and Willmott write that Customers want a quick and seamless digital experience, and they want it now.

Most definitions of CDO’s role seem to focus on business transformation, and digitalization affecting all business functions. To me that sounds just like the good ol’ business development manager. And maybe it is: CDO is non-permanent role. When digitalization is integrated into business, there won’t be any particular digital strategy, but just strategy in the digital world.

Freelance Journalists Should Be In Twitter

Posted by Jussi Huotari on 10 June 2016, 8:45 pm

Last Sunday, I read an interesting article in a print magazine. It’s a rare occasion nowadays so I turned back to the story’s first page to find out the writer’s name. Then I opened the Twitter application on my smartphone.

I’ve learned that Twitter is the best way to follow interesting writers, journalists and essayists.

You’ll type in the writer’s name, find her Twitter account and click follow. Many writers don’t tweet often, which I think is a blessing. Instead of daily pseudo-clever notions about everyday life, I’d rather see an occasional well-thought tweet with a link to read more about an interesting topic.

I don’t want to follow magazines or publications tweeting about their most popular stories in a clickbait style. I’d rather pick myself the few interesting writers to follow, and get updated on their latest stories. This especially applies to freelancers who write for multiple publications at an irregular interval.

See, there’s at least one interesting use case for the 140 character tweets!

Back to last Sunday. To my astonishment and disappointment this guy didn’t have a Twitter account. What’s going on, Hannu Pesonen? It looks like Mr Pesonen has both the experience and expertise to cover Africa and Middle East in an interesting manner. I’m sure I wouldn’t be the only one interested in finding out more about these topics. We want to know when a new article or book is out there.

I don’t know about this particular case but, in general, a freelance writer would benefit from being active in social media. Having an established follower base helps to convince the potential publishers.

Content Marketing by a Stock Exchange

Posted by Jussi Huotari on 25 February 2016, 3:27 pm

The CEO of a public company tweeted today about an interview. In the interview, the president of Nasdaq Helsinki sets up a portable studio and dicusses with Marimekko’s CEO Mika Ihamuotila about issues that are interesting to shareholders and traders.

Great stuff! Finally a business that makes real, original content to engage its customers. This is not just a consultant repeating fancy ideas about inbound marketing, death of cold calls, content strategy, and social selling. This is not just a poor community manager tasked with fabricating stories, sans facts and figures. This is not just a campaign with polished edges, made by your shiniest and most innovative digital agency.

This is the real thing. The president of a stock exchange himself grabs a camera and goes to interview the head honcho of a publicly traded firm. The interviewer of that stature is bound to have access to interesting people. And respectively, the interviewee calls the bets in their company. Sounds like an excellent foundation to build on.

As neither the interviewer or interviewee is a media professional, the result is a bit rough. Furthermore, neither is talking their first language, which takes away from conversation flow. But that’s not the point! The point is the fabled, hard-to-reach authenticity that differentiates this piece of content from what most companies are producing. I’d say this the kind of content businesses should create. You know, if you’re not putting out stories, you basically don’t exist.

Alas, reality is not quite so good. The video is almost two year old and looks like it’s the only episode of Lauri’s Portable Talkshow. Why so? Probably because it takes significant time to produce such content. It’s easier just to do what everyone else is doing, i.e. create a separate “social media team” and invest in Facebook ads so that you can focus on real work.

Communicating Organization Change – Case Zenefits

Posted by Jussi Huotari on 17 February 2016, 10:35 am

Zenefits is a company in trouble. They missed revenue projections, face regulatory charges, and their founder/CEO was forced to resign. How to pick up from here?

The new CEO, David Sacks sent an email to all employees. The email was said to be “brilliant”. Let’s see what’s in it.

Stripping the email down to essential, it says:

Parker has resigned because of his mistakes. We will work on governance and compliance to fix the problems. I’ll appoint Josh Stein as our Chief Compliance Officer. Our company culture is outdated, here are our new values: 1. integrity, 2. customer success, 3. job satisfaction. As a CEO I will first focus on culture and values. Then we’ll sharpen our strategic focus. We’ll focus on small business market because it’s huge, underserved market with product-market fit for us. Purpose of Zenefits is to make it easier to hire, onboard and manage employees of small businesses.

That’s a good structure. Lay down the facts, explain how the problems will be solved, give a vision of the future. However, the email is loooong. It takes 1000+ words to tell the above information, which is less than 100 words. What else is in the email?

Here’s the original email (from Business Insider):

By now all of you have heard the news that Parker has resigned as CEO. I know that this will come as a shock. Parker was not only the founder of this company but also its driving force until this day.

I know it will take time for people to absorb and process this news, and it will raise many questions about the company. I believe that Zenefits has a great future ahead, but only if we do the right things. We sell insurance in a highly regulated industry. In order to do that, we must be properly licensed. For us, compliance is like oxygen. Without it, we die.

The fact is that many of our internal processes, controls, and actions around compliance have been inadequate, and some decisions have just been plain wrong.

As a result, Parker has resigned. In order for us to move forward as a company, we cannot seek to hide or downplay the problem. We must admit it and remediate it as soon as possible. In December, we hired a Big Four auditing firm to conduct an independent third-party review of our licensing procedures that we will turn over to regulators as soon as possible.

I will expand that effort into a top-to-bottom review to ensure appropriate and best-in-class corporate governance, compliance and accountability. I am also appointing Josh Stein as our Chief Compliance Officer. I know that he will bring the same rigor to this job as his did in his previous experience as a federal prosecutor. Josh is already in communication with regulators to advise and update them of our compliance issues. These steps are a start to fixing the problem, but they are not enough.

We must admit that the problem goes much deeper than just process. Our culture and tone have been inappropriate for a highly regulated company.

Zenefits’ company values were forged at a time when the emphasis was on discovering a new market, and the company did that brilliantly.

Now we have moved into a new phase of delivering at scale and needing to win the trust of customers, regulators, and other stakeholders. As an entrepreneur myself, I know that Zenefits can never lose its innovativeness and willingness to experiment.

But at the same time, I believe a new set of values are necessary to take us to the next level.

Effective immediately, this company’s values are: #1 Operate with integrity. #2 Put the customer first. #3 Make this a great place to work for employees.

In order to be a great company, integrity must be at the core of what we do. We must have integrity in our business practices, compliance obligations and internal processes. We must have integrity in our product. We must have integrity in our data and infrastructure.

And we must have integrity in the way we treat each other. We must also put customer success at the heart of what we do. Everything we do should further the goal of earning and extending our customers’ trust. We want customers for life, and if we can’t reasonably expect to make a customer successful, we shouldn’t sign them in the first place.

Finally, we must make this a great place to work for employees, because we’re all in this together, and if we’re not enjoying ourselves, what’s the point? This is not to say that there won’t be major challenges and tough days ahead, but that must be balanced with a feeling of fun, fellowship, and esprit-de-corps. I want all of us to feel excited to come into work every day.

We all have a role in that, but I’m going to try my best to do my part. I’m making my first actions as CEO about culture and values because I believe these things are fundamental to a company’s success and who we are and want to be. I want to push down decision-making ability into the company.

Culture and values enable us to do that by ensuring that everyone is aligned around the right goals.

Once we’re aligned as a team around core values, the next thing we’re going to do is sharpen our strategic focus. When you raise $500 million and have a vision as big as Zenefits’, it’s tempting to think you can do everything at once. But no matter your size or resources, companies execute better when they ruthlessly prioritize and sequence their efforts.

For us, that means hyper-focusing on the small business market where we have product-market fit.

This is a great market for us because (1) it’s huge (with many million small businesses in the US); (2) it’s “greenfield” (meaning that it’s under-served by technology — in fact an Excel spreadsheet is often the main competitor); and (3) the free aspect of our product is extremely compelling.

There is one other critical thing that I want to do, and that is make Zenefits a purpose-driven company. I’m glad that Zenefits is one of the fastest-growing business software companies, but that’s not our mission. It alone doesn’t fill my life with any meaning, and I doubt it does yours.

However, Zenefits does serve an important purpose in the world, by making the lives of small business owners so much easier. We further the dreams of anyone who wants to start a new company. We help them achieve something larger than themselves, by making it easier to hire, onboard and manage employees. By doing that, Zenefits makes entrepreneurship more accessible to everyone.

I believe this is a very powerful and real mission in the world, and personally I’m proud to be part of it. There are very few startups that ever get to this level of scale and importance.

I believe Zenefits has a great opportunity, vision, and future if we can align it to the right execution. As a friend once told me, “successful companies have multiple founding moments.”

This is one of them. This is a founding moment. You are all co-founders in this new path forward. I know how much hard work everyone has put into this company and that we will all come together on this journey.

This is Day 1.

David