Go scheduler: Ms, Ps & Gs

I decided to learn a bit more about Go internals and since it was long time that somebody wrote about Go scheduler, I though it would be an interesting post. So, let’s get to it!


The Go runtime manages scheduling, garbage collection, and the runtime environment for goroutines. Here, I will focus only on the scheduler.

Runtime scheduler runs goroutines by mapping them onto operating system threads. Goroutines are lightweight version of threads, with very low cost of starting up. Each goroutine is described by a struct called G, which contains fields necessary to keep track of its stack and current status. So, G = goroutine.

Runtime keeps track of each G and maps them onto Logical Processors, named P. P can be seen as a abstract resource or a context, which needs to be acquired, so that OS thread (called M, or Machine) can execute G .

You can control how many logical processor your runtime has by calling runtime.GOMAXPROCS(numLogicalProcessors), if you are planning to tweak this parameter (you probably shouldn’t), set it once and forget it, because it needs “stop the world” GC pause.

Essentially, Operating System runs threads, which runs your code. The Go’s trick is that compiler inserts calls into Go runtime in various places, e.g. sending a value thru the channel, making a call to runtime package), so that Go can notify scheduler and take action.

This is how go program, runtime & operating system fits together

Note: Idea for figure below taken from Analysis of the Go runtime scheduler.

The dance between Ms, Ps & Gs

The interaction between Ms, Ps & Gs is a bit complicated. Take a look at this amazing workflow graph, which is from  go runtime scheduler slides by Gao Chao.

This is what happens when you create a new goroutine



Here we can see, that there are two types of queues for G: a global queue in the schedt struct (which is rarely used) and each P maintains a queue of runnable G‘s.

In order to execute a goroutine, M needs to be holding context P. Machine, then just pops it’s P queue of goroutines and executes code.

When you schedule a new goroutine (do a go func() call) it is placed into P‘s queue. There is this interesting work-stealing scheduling algorithm, which runs when M finishes executing some G and then it tries to take another G out of a queue, which is empty, then it randomly chooses another P and tries to steal a half of runnable G’s from it!

Interesting things happen when your goroutine makes a blocking syscall. Blocking syscall will be intercepted, if there are Gs to run, runtime will detach the thread from the P and create a new OS thread (if idle thread doesn’t exist) to service that processor.

When a system calls resumes, the goroutine is placed back into a local run queue and the thread will park itself (meaning thread won’t be running) and insert itself in the list of idle threads.

If a goroutine makes a network call, runtime will do a similar action. The call will be intercepted, but because Go has integrated network poller, which has its own thread, it will be assigned to it.

Essentially Go runtime will run a different goroutine, if current goroutine is blocked on:

  • blocking syscall (for example opening a file),
  • network input,
  • channel operations,
  • primitives in the sync package.

Scheduler Tracing

Go allows to trace runtime scheduler. This is done via GODEBUG environment variable:

$ GODEBUG=scheddetail=1,schedtrace=1000 ./program

Here is an example of output it gives:

SCHED 0ms: gomaxprocs=8 idleprocs=7 threads=2 spinningthreads=0 idlethreads=0 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=0 syscalltick=0 m=0 runqsize=0 gfreecnt=0
  P1: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P2: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P3: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P4: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P5: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P6: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P7: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  M0: p=0 curg=1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=false blocked=false lockedg=1
  G1: status=8() m=0 lockedm=0

Note that it uses the same concepts of G, M & P and their state, such as P’s queue size. Usually, you don’t need that much detail, so you can just use:

$ GODEBUG=schedtrace=1000 ./program

There is this great article by William Kennedy, which explains how to interpret this and the detailed version of the trace.

Also, there is an advanced tool called go tool trace, which has an UI and allows you to explore, what your program and runtime is doing. You can read how to use it in Pusher article.


yaml url.URL parsing

I was using gopkg.in/yaml.v2 to unmarshal a .yaml file into a struct and couldn’t find a solution to unmarshal an url string into url.URL.

Trying to just use url.URL  fails with:

Error initializing configuration: yaml: 
unmarshal errors:\n  line 54:
cannot unmarshal !!str `http://...` into url.URL

This is because, url.URL doesn’t adhere to yaml.v2.Unmarshaler interface.

So here is my solution, feel free to copy paste it!

type YAMLURL struct {

func (j *YAMLURL) UnmarshalYAML(unmarshal func(interface{}) error) error {
	var s string
	err := unmarshal(&s)
	if err != nil {
		return err
	url, err := url.Parse(s)
	j.URL = url
	return err

And here is how to use it:

type Config struct{
    url YAMLURL `yaml:"url"`

Thanks for reading!

Think about your dependencies

A little copying is better than a little dependency, says Go proverb.

Go has a huge standard library, which has support for many, many things. I think many developers, don’t utilize it fully. I mean I have seen many new gophers try to use packages for everything. For example, do you really need a dependency on github.com/stretchr/testify ?  If you are just using assert, maybe you can just can copy the assert function from benbjohnson/testing? Ben Johnson has  published these convenience functions (MIT license):

func assert(tb testing.TB, condition bool, msg string) 
func ok(tb testing.TB, err error) 
func equals(tb testing.TB, exp, act interface{}) 

If you are using the TestSuite stuff, maybe you can just avoid it entirely? The same goes for all the cool HTTP muxes. It’s cool, but if you don’t have a lot of endpoints, maybe you can just avoid that dependency and use standard library’s http package.

But why am I against dependencies? Well, for new developers learning a new language is a lot, if you complicate this by adding custom packages and frameworks it will be harder. And it will be harder for you too. If you get back to a project after a break, you will notice that you have to relearn the dependencies you have used.

With less dependencies you will spend less time managing them. Once, I spent 3 hours fixing a dependency problem with Glide. The problem was that one of my dependency (let’s call this dependency A) had a dependency on an older version of a package C and another dependency (let’s call it B) had a dependency on a newer version of a package C. The biggest problem was that package hadn’t used semantic versioning, so I had to try random library C commits and try to get this thing working… After doing that for an hour, I took a break and realized, that I can easily rewrite the code and remove a dependency on B.

But aren’t packages from standard library are dependencies too? Well yes, but by utilizing the standard library, you are learning the language. You will start to remember public functions, structs and interfaces, after awhile. Also, Go has version 1 compatibility guarantee, that means that standard library’s API won’t break down on you, for go 1.* versions.

Of course don’t go and reimplement Go MySQL driver or anything crazy like that. Just think before adding another dependency!

That’s all from me. Do you enjoy these articles? Follow me on twitter @PofkeVe and join my mailing list 🙂

Go schema migration tools

In this post I did a comparison of following tools:


TL;DR If your looking for schema migration tool you can use:

mattes/migrate, SQL defined schema migrations, with a well defined and documented API, large database support and a useful CLI tool. This tool is actively maintained, has a lot of stars and an A+ from goreport.

rubenv/sql-migrate, go struct based or SQL defined schema migrations, with a config file, migration history, prod-dev-test environments. The only drawback is that it got B from goreport.

markbates/pop, use this if you are looking for an ORM like library. It has awesome model generation capabilities and a custom DSL language for writing schema changes.

go-gormigrate/gormigrate, use this if you are using GORM, this helper adds proper schema versioning and rollback capabilities.

Why these tools? Read below:

Comparing by Stars

mattes/migrate 961
rubenv/sql-migrate 605
markbates/pop 605
liamstask/goose – bitbucket project; stars not available
DavidHuie/gomigrate 110
pressly/goose 80
BurntSushi/migration 56
tanel/dbmigrate 38
GuiaBolso/darwin 29
go-gormigrate/gormigrate 22
pravasan/pravasan 14

Note: I put goose into 4th place because it has a lot of watchers.

I use Github stars as a metric to see how widely project is used. Because it is almost impossible to see how many users actually use it. I bolded out the tools that win in this category.

Comparing by last activity

markbates/pop Feb 14, 2017
mattes/migrate Feb 10, 2017
GuiaBolso/darwin Feb 10, 2017
rubenv/sql-migrate Feb 7, 2017
go-gormigrate/gormigrate Feb 4, 2017
pressly/goose Dec 9, 2016
DavidHuie/gomigrate Aug 9, 2016
tanel/dbmigrate Feb 23, 2016
pravasan/pravasan Mar 20, 2015
liamstask/goose Jan 16, 2015
BurntSushi/migration Jan 25, 2014

Note: I scrapped this info on Wednesday, February 15, 2017 3:00 pm EET. 

More often than not you want to use projects that are maintained. So last activity can be seen as a measure of project’s maintainability. It is important because if there is a bug you want to have an ability to submit a PR or create an issue, which hopefully would get resolved. Bolded out tools win in this category.

Comparing by goreportcard

markbates/pop A+
GuiaBolso/darwin A+
go-gormigrate/gormigrate A+
mattes/migrate A
liamstask/goose A
pressly/goose A
DavidHuie/gomigrate A
tanel/dbmigrate A
rubenv/sql-migrate B
BurntSushi/migration B
pravasan/pravasan D

Note: I scrapped this info on Wednesday, February 15, 2017 3:00 pm EET. 

Goreportcard allows you to check the code quality of any open source project written in go and gives an overall score (A+, A, B,..). I bolded out the tools that win in this category.

Comparing by usability

In this comparison I will try out some of the tools and give my opinion.

I will be playing around with a table MyGuests, which looks like this:

   firstname VARCHAR(30) NOT NULL,
   lastname VARCHAR(30) NOT NULL,
   email VARCHAR(50),
   reg_date TIMESTAMP


This is a simple tool, which does migrations based on files. It comes with a Go library and a CLI tool, which helps you to create SQL migration files and manages schema version. Let’s take a look at the example usage of CLI tool below:

$ migrate -url mysql://root@tcp( -path ./migrations 
 create initial_users_table

creates 2 files and a table called schema_migrations:

Version 1487240220 migration files created in ./migrations:

I added CREATE TABLE MyGuests … statement in file called *up.sql and DROP TABLE MyGuests.. statement in *down.sql

Running migration using CLI is simple:

$ migrate -url mysql://root@tcp( 
  -path ./migrations up

This creates a table and sets a row in schema_migrations table:

mysql> select * from schema_migrations;
| version |
| 1487240220 |
1 row in set (0.00 sec)

Here is an example of running  a “down” migration, which drops the table and removes a row from schema_migrations:

$ migrate -url mysql://root@tcp( 
  -path ./migrations down

CLI tool also allows going to specific schema version,  rollbacking previous migrations, etc.

The provided go library is also pretty simple, it allows you to run migrations from your code and provides you with synchronous and asynchronous implementations. Probably you will only be using UpSync function from your code. Take a look at the example below:

package main

import (
   _ "github.com/mattes/migrate/driver/mysql"

func main() {
   allErrors, ok := migrate.UpSync("mysql://root@tcp(", "./migrations")
   if !ok {

I like this library for it’s simplicity. It supports PostgreSQL, Cassandra, SQLite, MySQL, Neo4j, Ql, MongoDB, CrateDb. But it has a caveat: MySQL support is only experimental.


Playing around with this library was a bit painful for me. For about 20 minutes I couldn’t figure out what was wrong with my connection info. I was continuously getting Invalid DBConf errors, with no explanations:

2017/02/16 13:14:54 Invalid DBConf: 
   {mysql  root@tcp(  }

It appears to me now that I had left a space after specifying database type!

So in goose you have to create a dir called db and add a file called dbconf.yaml , which contains connection information. This is how my file looked:

  driver: mysql
  open: root@tcp(

In this config you are also allowed to choose your SQL dialect and import a different db driver.

Creating a migration with goose is easy:

goose create initial_users_table

which creates a  file called 20170216132820_initial_users_table.go, which contains 2 go functions:

func Up_20170216132820(txn *sql.Tx) {


func Down_20170216132820(txn *sql.Tx) {


Here is how I filled these functions:

// Up is executed when this migration is applied
func Up_20170216132820(txn *sql.Tx) {
   res, err := txn.Exec(`CREATE TABLE MyGuests (
      firstname VARCHAR(30) NOT NULL,
      lastname VARCHAR(30) NOT NULL,
      email VARCHAR(50),
      reg_date TIMESTAMP

// Down is executed when this migration is rolled back
func Down_20170216132820(txn *sql.Tx) {
   res, err := txn.Exec("DROP TABLE MyGuests;")

Executing up/down migrations is also easy:

goose up

goose down

Internally goose maintains a table called goose_db_version:

 mysql> select * from goose_db_version;
| id | version_id | is_applied | tstamp |
| 1 | 0 | 1 | 2017-02-16 13:37:20 |
| 2 | 20170216132820 | 1 | 2017-02-16 13:37:47 |
| 3 | 20170216132820 | 0 | 2017-02-16 13:39:32 |
| 4 | 20170216132820 | 1 | 2017-02-16 13:40:04 |
| 5 | 20170216134743 | 1 | 2017-02-16 13:51:30 |
| 6 | 20170216134743 | 0 | 2017-02-16 13:51:34 |
6 rows in set (0.00 sec)

This tools also allows you to specify migration using SQL files, by default this tool supports postgres, mysql, sqlite3 and has a go library. Mostly I liked that you specify connection info in a config file, which simplifies your work with the CLI. Also, writing db migrations as go code looks interesting! Overall, not a bad tool.


pop is more like an “ORM”, which helps you to create models and sql schema for you. pop also comes with migration capabilities in a CLI tool called soda and a DSL for specifying migrations called fizz.

At the start you have to specify database connection config in a database.yaml file. Mine looked like this:

  dialect: "mysql"
  database: "pop"
  host: "localhost"
  port: "3306"
  user: "root"
  password: ""

Then you can create/drop a database using CLI tool:

soda create -e development
soda drop -e development

Generating a model with it’s migration script based on fizz DSL is simple:

soda generate model MyGuest firstname:text lastname:text email:text 

Generated DSL looks like this:

create_table("my_guests", func(t) {
   t.Column("id", "uuid", {"primary": true})
   t.Column("firstname", "text", {})
   t.Column("lastname", "text", {})
   t.Column("email", "text", {})
   t.Column("reg_date", "timestamp", {})

and model:

type MyGuest struct {
        ID uuid.UUID `json:"id" db:"id"`
        CreatedAt time.Time `json:"created_at" db:"created_at"`
        UpdatedAt time.Time `json:"updated_at" db:"updated_at"`
        Firstname string `json:"firstname" db:"firstname"`
        Lastname string `json:"lastname" db:"lastname"`
        Email string `json:"email" db:"email"`
        RegDate time.Time `json:"reg_date" db:"reg_date"`

Migrate up/down:

soda migrate up 
soda migrate down

On migration Fizz DSL produced the following schema:

mysql> desc my_guests;
| Field | Type | Null | Key | Default | Extra |
| created_at | datetime | NO | | NULL | |
| updated_at | datetime | NO | | NULL | |
| id | char(36) | NO | PRI | NULL | |
| firstname | text | NO | | NULL | |
| lastname | text | NO | | NULL | |
| email | text | NO | | NULL | |
| reg_date | datetime | NO | | NULL | |
7 rows in set (0.00 sec)

Internally this tool maintains a table called schema_migration, which holds schema version number.

I liked this library a lot, but it feel like you should only use this then you are looking for “ORM” like library. Generating models and migrations looks cool! Also, bonus points for Fizz DSL, which looks a lot like go 🙂

One drawback is that pop supports only : PostgreSQL (>= 9.3), MySQL (>= 5.7) and SQLite (>= 3.x).


Gormigrate is a migration helper for GORM library. This helper adds proper schema versioning and rollback cababilities. I like schema versioning + schema migration definition in a list of structs. This is how this looks with MyGuests example:

func main() {
        db, err := gorm.Open("mysql",
        if err != nil {
                panic("failed to connect database")

        if err = db.DB().Ping(); err != nil {


        defer db.Close()

        m := gormigrate.New(db, gormigrate.DefaultOptions, 
                        ID: "201702200906",
                        Migrate: func(tx *gorm.DB) error {
                                type MyGuest struct {
                                        Firstname string
                                        Lastname  string
                                        Email     string
                                        RegDate   time.Time
                                return tx.AutoMigrate(&MyGuest{}).Error
                        Rollback: func(tx *gorm.DB) error {
                                return tx.DropTable("MyGuest").Error

        if err = m.Migrate(); err != nil {
                log.Fatalf("Could not migrate: %v", err)

This migration creates a table, with the following schema:

mysql> desc my_guests;
| Field      | Type             | Null | Key | Default | Extra          |
| id         | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| created_at | timestamp        | YES  |     | NULL    |                |
| updated_at | timestamp        | YES  |     | NULL    |                |
| deleted_at | timestamp        | YES  | MUL | NULL    |                |
| firstname  | varchar(255)     | YES  |     | NULL    |                |
| lastname   | varchar(255)     | YES  |     | NULL    |                |
| email      | varchar(255)     | YES  |     | NULL    |                |
| reg_date   | timestamp        | YES  |     | NULL    |                |
8 rows in set (0.00 sec)

I would definitely use this with GORM.


sql-migrate is a look like goose, you specify connection info in a yaml file, migrations are written in SQL files. It has a CLI, which generates a template for your migration:

$ sql-migrate new MyGuests

My schema change looked like this:

-- +migrate Up
           firstname VARCHAR(30) NOT NULL,
           lastname VARCHAR(30) NOT NULL,
           email VARCHAR(50),
           reg_date TIMESTAMP

-- +migrate Down

Applying migration with CLI tool is pretty straightforward:

$ sql-migrate up
Applied 1 migration

It also has nice looking library, with well defined API, which supports having migrations in a struct or in a directory. The only drawback I can think of is that goreport card gave a B for this library.


Is a really simple toolkit, which only allows you to run migrations from go code:

err := migrator.Migrate()
err := migrator.Rollback()

migration are defined in sql files named { id }}_{{ name }}_{{ “up” or “down” }}.sql, which you have to manage yourself, because it’s only a library. mattes/migrate seems to cover the same functionality and add much more, so I would prefer to use it over this library.


It’s a library, which tracks schema changes in a struct and only allows up migrations. I love the idea of storing all migrations in a slice:

var (
   migrations = []darwin.Migration{
   Version: 1,
   Description: "Creating table MyGuests",
   Script: `CREATE TABLE MyGuests (
              firstname VARCHAR(30) NOT NULL,
              lastname VARCHAR(30) NOT NULL,
              email VARCHAR(50),
              reg_date TIMESTAMP

func main() {
   database, err := sql.Open("mysql", "root@tcp(")

   if err != nil {

   driver := darwin.NewGenericDriver(database, darwin.MySQLDialect{})

   d := darwin.New(driver, migrations, nil)
   err = d.Migrate()

   if err != nil {

I don’t think there is more to say about this library, but I guess it’s a good thing.


Simple library for PostgreSQL or Cassandra. Runs migrations (.sql or .cql files) sorted using their file name. mattes/migrate seems to cover the same functionality and add much more, so I would prefer to use it over this.


It’s a fork of goose, which drops support for config files and custom drivers. Migrations can be run with any driver that is compatible with database/sql. This tool looks like liamstask/goose and mattes/migrate had a baby 🙂 It takes good things from both projects: good CLI, no configuration, write migrations using .sql or .go files, store history of migrations in a goose_db_version table. But there are some things that this tools lacks: it doesn’t support Cassandra or any other non SQL database, CLI can migrate SQL files only.

Then trying to use it I had problems with this tool:

I created SQL based migration:

$ goose mysql "root@tcp(" create initial_users_table sql
Created sql migration at 20170301085637_initial_users_table.sql

Filled 20170301085637_initial_users_table.sql file with:

-- +goose Up
-- SQL in section 'Up' is executed when this migration is applied

	firstname VARCHAR(30) NOT NULL,
	lastname VARCHAR(30) NOT NULL,
	email VARCHAR(50),
	reg_date TIMESTAMP

-- +goose Down
-- SQL section 'Down' is executed when this migration is rolled back
$ goose mysql "root@tcp(" up
2017/03/01 08:57:41 WARNING: Unexpected unfinished SQL query: -- +goose Up
-- SQL in section 'Up' is executed when this migration is applied

	firstname VARCHAR(30) NOT NULL,
	lastname VARCHAR(30) NOT NULL,
	email VARCHAR(50),
	reg_date TIMESTAMP
). Missing a semicolon?
OK    20170301085637_initial_users_table.sql
goose: no migrations to run. current version: 20170301085637
$ goose mysql "root@tcp(" down
2017/03/01 08:57:47 FAIL 20170301085637_initial_users_table.sql (Error 1051: Unknown table 'pressly.myguests'), quitting migration.
$ goose mysql "root@tcp(" up
goose: no migrations to run. current version: 20170301085637

I did miss the semicolon, but this tool wrote into it’s table that migration ran successfully, so I was left at state where table doesn’t exist, but tool thinks it’s exist.

mysql> select * from goose_db_version;
| id | version_id     | is_applied | tstamp              |
|  1 |              0 |          1 | 2017-03-01 08:44:45 |
|  2 | 20170301085637 |          1 | 2017-03-01 08:57:41 |

I had to manually delete the row in goose_db_version the table and restart the migration…
After that everything worked normally:

$ pressly goose mysql "root@tcp(" up
OK    20170301085637_initial_users_table.sql
goose: no migrations to run. current version: 20170301085637
$ pressly goose mysql "root@tcp(" down
OK    20170301085637_initial_users_table.sql

Also I tried to run migration based on go code, but wasn’t successful:
I created the migration and compiled the example cmd file provided in the repo for running migrations:

$ goose mysql "root@tcp(" create initial_users_table
 Created go migration at 20170301083810_initial_users_table.go

and filled with my migration code:

package migration

import (


func init() {
	goose.AddMigration(Up_20170301083810, Down_20170301083810)

func Up_20170301083810(tx *sql.Tx) error {
	res, err := tx.Exec(`CREATE TABLE MyGuests (
	       firstname VARCHAR(30) NOT NULL,
	       lastname VARCHAR(30) NOT NULL,
	       email VARCHAR(50),
	       reg_date TIMESTAMP
	return err

func Down_20170301083810(tx *sql.Tx) error {
	res, err := txn.Exec("DROP TABLE MyGuests;")
	return err

Then I tried to run:

./pressly --dir=migrations/ mysql "root@tcp(" up
2017/03/01 09:02:24 FAIL 00002_rename_root.go 
(Error 1146: Table 'pressly.users' doesn't exist), quitting migration.

my migrations directory contains only 1 file called 20170301083810_initial_users_table.go, there is no file called 00002_rename_root.go, so I don’t know what the hell this tool is doing, but I really don’t like that it tries to run file called rename_root.go, which I didn’t write and don’t know nothing about.

So be careful with this tool!

Apache Mesos in the face of network partitions

In MesosCon Europe 2015 welcome speech, Benjamin Hindman said that Mesos was designed to support building and running distributed systems. In order to survive production, systems have to learn to adapt occasional network blips and outages. As research shows network can have problems and thinking that network  is reliable is one of the 8 fallacies of distributed computing.

In this post, I will cover what happens when Mesos Master has network problems with it’s Mesos Agent and how Mesos applications (frameworks) gets notified. So, let’s consider the following architecture:

Mesos arch1 - Page 1 (1) 2.pngThis is a bit simplified architecture, because Scheduler can reside in a different host. White boxes are Mesos components, red shaded boxes are components, which are developed by the Mesos app (framework) developer. So, let’s consider what happens when Mesos Master cannot connect to the Mesos Agent.

Mesos Master detects the network blip using pings. The detection is controlled using
–agent_ping_timeout (default 15s) and –max_agent_ping_timeouts (default 5), so the Agent which does not answer to a ping after agent_ping_timeout times max_agent_ping_timeouts seconds, will be considered lost.  In this case, on reconnection Mesos agent will kill all tasks, executors and shutdown. The scheduler will be informed using task status updates, it will receive TASK_LOST for each task on that Agent and an agent failure event.
Here is log of reconnected agent shutting everything down:

I0203 14:31:00.725201 128352256 slave.cpp:809] Agent asked to shut down by master@ because ‘Agent attempted to re-register after removal’
I0203 14:31:00.725296 128352256 slave.cpp:2218] Asked to shut down framework e1637465-8791-460c-8c66-fadaa19f8148-0000 by master@
I0203 14:31:00.725313 128352256 slave.cpp:2243] Shutting down framework e1637465-8791-460c-8c66-fadaa19f8148-0000
I0203 14:31:00.725334 128352256 slave.cpp:4407] Shutting down executor ‘59160602-24bc-4a44-9c53-26a43d32402e’ of framework e1637465-8791-460c-8c66-fadaa19f8148-0000 (via HTTP)
E0203 14:31:00.725385 131571712 process.cpp:2105] Failed to shutdown socket with fd 14: Socket is not connected
I0203 14:31:00.725507 128352256 slave.cpp:4407] Shutting down executor ‘7eebc364-a1ab-464d-8624-0f785afccc38’ of framework e1637465-8791-460c-8c66-fadaa19f8148-0000 (via HTTP)

If the network blip is shorter than agent_ping_timeout times max_agent_ping_timeouts seconds, everything should still work.

This is not a very good approach as it doesn’t allow app developer to change this behavior in any way. For example, if you run Spark on Mesos, maybe this behavior is fine (some tasks will be killed, but automatically will be rescheduled on the different Agents), but when using Cassandra on Mesos,  you probably wouldn’t want your Cassandra nodes to be killed after 80s of network blip?

Thats why in Mesos 1.1. there was added experimental support for partition-aware Mesos frameworks. If the framework developer will opt in into this feature, tasks and executors wont be killed when agent reconnects. This allows frameworks to define their own policies for how to handle partitioned tasks. Also, it adds new task states, which allows handling network partitions differently.

Let’s take a look at the new task states:

  • TASK_UNREACHABLE – is sent when Mesos Master detects the network blip. On Mesos Agent rejoining the the task will be set as TASK_RUNNING again (if that was the case).
  • TASK_DROPPED – is sent when a task fails to launch because of a transient error. The task is not running.
  • TASK_GONE – is sent during task reconciliation process, when Master knows that task was running on an Agent that has been terminated. So, the task is not running. 
  • TASK_GONE_BY_OPERATOR – As I understand mesos will provide some maintenance primitive for operators to mark tasks as gone, so this task state has the possibility of human error. So, the task is probably gone.
  • TASK_UNKNOWN – will be sent during task reconciliation process, when the Master does not know about the task’s agent. So, the task may still be running.

The left work with partition-aware frameworks can be tracked in this Jira ticket.

Thanks for reading! If you have any questions or want to provide feedback, you can contact me via twitter @PofkeVe.


Cool feature of jUnit 5

One thing I love in the upcoming release of jUnit is the @DisplayName annotation. This annotation allows to use language to name a unit test. Kevlin Henney in his talk “What We Talk About When We Talk About Unit Testing” has shown power of using English language to specify what unit test is doing. For example:

@DisplayName("ISBNs with more than 13 Digits are Malformed")
public void isValid() {
    ISBNValidator isbnValidator = new ISBNValidator();

    Boolean isValid = isbnValidator.isValid("1111222233334444");

    assertEquals(false, isValid);

Which results in unit tests being a great specification for a component. Furthermore you can use this cool IntelliJ IDEA feature, which allows to see all component tests by name. So the final result looks like this:


By looking at the image, we can clearly know what component does and what it’s corner cases are, which gives us a lot of insight. Actually, we even know that it does what it say it does, because test ran fine!

Although you can do this using regular camel case method naming, but it is really hard to read:

public void isbnsWithMoreThan13DigitsAreMalformed() {
    ISBNValidator isbnValidator = new ISBNValidator();

    Boolean isValid = isbnValidator.isValid("1111222233334444");

    assertEquals(false, isValid);

and the resulting spec:



This example is taken from the talk mentioned above, I really suggest you to watch it.

Thanks for reading ! Until next time!

Unit tests

Probably you have heard many advantages of unit testing such as getting bugs out of the code, producing better code design, having overall good code quality. But for me it’s all about reducing the fear of making changes in production. It gives me confidence in my code so that my new code additions didn’t break something. Also it shortens the development-testing cycle, which allows me to always publish code which works.

It’s important to distinguish unit tests from other kind of tests. If you put all kind of tests into same place it quickly can become a mess: some tests run slowly, some require testing database, a queue or other external dependency. At least in my company people don’t have a clear understanding what a unit test is and so we ended up with ice cream cone anti-pattern, where we have many end to end tests, a lot of manual testing, zero unit tests and few integration tests. Why this pattern is an anti pattern you can read in this blog post from Google testing blog.

So, what is a unit test? There are many different definitions what are the unit tests, I tend too look at them as the low-level tests, that test a single unit: function call, method, class or a group of classes. These tests should run quickly, use test stubs or mocks for external dependencies, so that they should test only the system under test and not the external dependencies. These tests are really really fast. you can run thousands of the in a second or two.

Lastly, I wanted to share some cool articles about what is and what isn’t a unit test: