Rewriting Git history

One of the things I often do in Git is rewrite my local Git history. For example, when I correct the message of the last commit, combine consecutive commits into one, or rebase my feature branch onto master to pick up its latest changes. I always thought that such a rewrite modified my existing commits, but that is not the case: it merely adds new commits. In this blog post I explain what actually happens.

A rewrite example: combining, or squashing, commits

To show what happens when you rewrite your local Git history, I start with a repo that contains a single README. The repo has seen three commits: the first adds the README and the second and third update it. I will combine the second and third commits into one.

The Git log starts out like this:

$> git log
commit 71afd173843d044ef5c8cc7a911b61ac0449153c
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 15:00:38 2017 +0100

    Add second message to README

commit 97bbcd0893269d5ae3bd680c4a88bb83d341fe4a
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 14:59:45 2017 +0100

    Add message to README

commit 56b3a080cd720d6086b634fb6b18b04d3c037aaf
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 14:58:36 2017 +0100

    Add README

The following ASCII art shows the sequence of commit IDs:

[master] 56b3a --> 97bbc --> 71afd
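
Git itself can produce a similar compact view; with the commits above, "git log --oneline" prints something like this (the abbreviated hashes may be longer on other setups):

$> git log --oneline
71afd17 Add second message to README
97bbcd0 Add message to README
56b3a08 Add README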

Now I squash the last two commits into a single one:

$> git reset --soft HEAD~2
$> git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   README.md

$> git commit -am "Update README" 
[master 65d0c05] Update README
 1 file changed, 2 insertions(+)

This gives me the following log:

$> git log
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 15:15:49 2017 +0100

    Update README

commit 56b3a080cd720d6086b634fb6b18b04d3c037aaf
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 14:58:36 2017 +0100

    Add README

As mentioned, I thought that a rewrite of my Git history modified my commits, but the original commits are still there:

$> git log 71afd173843d044ef5c8cc7a911b61ac0449153c
commit 71afd173843d044ef5c8cc7a911b61ac0449153c
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 15:00:38 2017 +0100

    Add second message to README

commit 97bbcd0893269d5ae3bd680c4a88bb83d341fe4a
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 14:59:45 2017 +0100

    Add message to README

commit 56b3a080cd720d6086b634fb6b18b04d3c037aaf
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 14:58:36 2017 +0100

    Add README

This means that the history looks like this:

[master] 56b3a --> 65d0c           
	      \
	       --> 97bbc --> 71afd

What happens when I check out the previous HEAD of the master branch? Well, a checkout of commit 71afd gives me the following:

Note: checking out '71afd173843d044ef5c8cc7a911b61ac0449153c'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 71afd17... Add second message to README

So I can still work with it as if it were an ordinary branch and, as soon as it is appropriate, I can "convert" the chain of commits into an actual branch.
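
For example, while still in detached HEAD state at commit 71afd, I can follow the advice from the message above and create a branch that keeps the original chain of commits reachable (the branch name original-history is just an example):

$> git checkout -b original-history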

Will the original commits be in my repo forever? It depends. If they are on a branch or tag, they will remain in the repo. If not, Git uses a garbage collector to delete objects that it considers unreachable. The original commits are still reachable from my (local) reflog and are, as such, safe from deletion. However, this is only a temporary safety net, as reflog entries are pruned after a certain time, by default after 90 days. Once that happens, the original commits can be deleted [1].
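
As a rough sketch of this safety net at work: the reflog lists the commits HEAD has pointed at, including those that are no longer on any branch, and garbage collection leaves them alone until the corresponding reflog entries have expired:

$> git reflog     # 71afd and 97bbc still show up here
$> git gc         # does not delete them yet
$> git reflog expire --expire=now --all && git gc --prune=now   # this would delete them for real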

Conclusion

The example above shows that a rewrite of Git history does not modify commits; it adds new ones. The old ones are still available, albeit only temporarily. Knowing this, rewriting my Git history has become a lot less scary.

Footnotes:

[1] The information came from this StackOverflow question: What's the difference between git reflog and git log?

Looking back at 2016

A single post in the whole of 2016... It is not that I did not do anything that year, quite the opposite. The one thing I did not do was write about the things that kept me busy. So here it is, my summary of 2016.

Qt/QML

In 2016 I worked as a freelancer on multiple projects involving "control system software for milling machines". I cannot say much more about that, apart from the fact that it involves the development of an application in C++ using Qt/QML. The following description is from the Wikipedia article on QML:

QML is mainly used for [...] applications where touch input, fluid animations (60 FPS) and user experience are crucial. QML documents describe an object tree of elements [both] graphical (e.g., rectangle, image) and behavioral (e.g., state, transition, animation). These elements can be combined to build components ranging in complexity from simple buttons and sliders, to complete [...] [applications].

The UI of the application, the "view", is written in the JavaScript-like QML language, often in a declarative way. The "models" could be written in QML, but more often than not they are implemented in C++ and Qt. Hooking up view and model can be done either in QML or in C++.

My first Qt project was in 2003 and I have been using Qt ever since. I am selling Qt short here, but Qt, without the QML part, allows you to build the desktop applications you know from the late 90s and early 2000s. You have your main window with standard menus, buttons, dialogs etc. You can style these standard UI elements, but that is not necessary to end up with a good-looking, recognizable application.

QML is different: standard QML provides you with a blank canvas to place your UI elements on, with no styling whatsoever. You do not (have to) place a predefined button on that canvas. You create a rectangle on the canvas, create a so-called touch area in that rectangle and add behavior to be executed when a touch is registered. That rectangle does not have any styling associated with it; you have to style it yourself. Styling can be done by graphical designers - no programming knowledge required. It probably even should be done by a designer to get a UI that is easy on the eye and consistent.

Apart from the great learning experience, QML opens up new areas for me to work in as it can be used to develop desktop applications, mobile applications and embedded applications.

ClearCase

The aforementioned Qt/QML application had to be developed at my customer's site, where code is version controlled using ClearCase. Sigh... On the positive side, ClearCase can version control your code. I do not want to turn this overview into a rant about ClearCase, but the model behind it and the tools to use it "take some time getting used to". To name some objective drawbacks:

  • files are version controlled individually; there are no changesets;
  • a file has to be explicitly checked out before modification;
  • it requires access to a "ClearCase server" for VC operations.

Maybe tooling exists that fixes these drawbacks, but such tools were not available at the site where I worked. There were also other ClearCase-related drawbacks that were specific to the site:

  • the working area that contains the source code is located on network drives, which tremendously slows down compilation;
  • version control actions could not be executed on the development machine, as it did not have access to the ClearCase server: you had to go to another machine to check out and check in files on the network drives.

After a few weeks I started to use the clearcase.el package to check in and check out files from Emacs. In this way I did not have to leave my IDE (and visit the ClearCase GUI tooling) to be able to change a file. However, if checking in and checking out were all you needed, Git would probably never have been developed. There you have it: I wanted to use Git.

I could not use Git on the network drives as it would have tainted the ClearCase "view". Fortunately a Git-ClearCase bridge exists that allows you to sync ClearCase repos with locally stored Git repos, namely git-cc. git-cc is a Python package that uses the command-line ClearCase and Git tools to do the work. It allowed me to store my code locally, version controlled by Git, and to sync from (and to) ClearCase when necessary. The use of Git sped up the build times enormously: a full rebuild went from 15 minutes to 1 minute 30 seconds.

The local Git workflow spread to other developers as well and we presented our results to management. It turned out the company had a Bitbucket license and management allowed our project to migrate from a ClearCase workflow to a Bitbucket & Git one. That will not only improve our personal workflow but also our collaboration. For example, with ClearCase our code reviews were done using file-compare tools and copy-and-paste in emails. Bitbucket allows a web-based review process, with comments and replies.

Although our project uses Bitbucket, we have to keep using ClearCase as the final store of code. This means the sync between ClearCase and Bitbucket will remain necessary. I made some changes to git-cc to further improve the sync workflow and I merged these changes upstream. This was a major moment for me: it was my first open-source contribution that was not paid for by a company and that is used outside the company.

Bitbucket & Jenkins

As mentioned, ClearCase does not have the concept of a changeset. This meant that as a developer, you had to manually kick off a Jenkins job that built the code and ran all tests. We were able to set up a Jenkins job that was triggered when changes in ClearCase occurred, but the result was not entirely satisfactory. It could happen that a build was triggered before all modifications were checked in. Such builds seldom succeeded.

As Bitbucket & Git were being introduced, we had another look at the automated builds. Using the right combination of Bitbucket and Jenkins plugins, each push to Bitbucket now automatically triggers a build. This sounds easy, but in practice it involved a lot of trial and error. Especially in the Jenkins plugin ecosystem it can be difficult to find the plugin you need and, when you think you have found it, to configure it correctly. Not all Jenkins plugins are properly documented and sometimes it turns out that a plugin has been superseded by another one.

Another problem we faced was that the setup of our system test environment wasn't (and still isn't) a one-click affair. The system test environment that Jenkins used for the ClearCase builds was set up by another department. It took quite some communication, and unfortunately also some miscommunication, to reproduce that setup at another site. We documented each step to create this setup, but ideally that process is fully automated.

Google Mock

The current code base has a lot of unit tests, although some of these "unit tests" are really integration tests. I do not mind that too much unless it makes testing difficult. What I often see is that an integration test cannot exercise specific code paths in the Class Under Test: the indirection at play makes it hard to inject certain behavior into the dependencies. This problem was acknowledged by the team and some of my team members started using manually coded stubs when testing the Class Under Test.

In a previous project I used Google Mock, a C++ framework that allows you to define mock classes and assign expected behavior to them. So I asked for time to demonstrate its applicability to the current code base. That demonstration turned out really well and since then Google Mock has been part of the tool chest of the developers on that project.

Pharo & Smalltalk

Now for the non-work-related things. In my one post of 2016, I mentioned I was looking into Pharo, an open-source implementation of Smalltalk, both a programming language and a development environment. That post outlined the struggles I had with the language and environment, struggles which made me doubt whether it was a wise decision to invest time in it. At the end of the post I stated that I had enrolled in a 7-week online course on Pharo to help me decide whether I should continue that investment.

I finished the course, with an "Attestation of Achievement" :) This does not mean I am fluent in Smalltalk. But it did show me the power of viewing your application as a live environment. To mention just one thing that has already spoiled me: Pharo allows you to pause your application, go back to an earlier stack frame, modify code and data, and proceed from that modified frame. That is very different from an environment that requires repeated (lengthy) builds and restarts of program runs to reproduce the error...

Is it wise to invest more time in Pharo? To be honest, I do not think I can put it to use in a business setting to the extent I was able to use Python when I learned it in the early 2000s, at least not in the short run. I found a comment on the Slashdot article Can learning Smalltalk make you a better programmer that eloquently explains my current position on Smalltalk:

I tend to recommend Lisp, Smalltalk and Haskell as languages to train how you think about programming. A basic grasp of these three does wonders for how you think about programming, at least for high level stuff. There is a big difference between languages which help train your brain, and languages which help you get stuff done. There is considerable overlap, of course, but by sticking only to languages which get stuff done you limit your capacity to think about your programming.

And Smalltalk does show me new ways of doing things. To give a really simple example, I seldom use the debugger when I am developing in Python. Especially for my own code I mostly rely on print statements. The power of the Smalltalk debugger convinced me to keep the console-based Python debugger pudb within reach and that did improve my workflow.

So I am continuing my journey with Pharo. Currently I am working my way through the book Enterprise Pharo, which is geared towards web development using Pharo. That brings me to the next section.

Elm

Elm is a framework/language for front-end web development. As a developer I want to be well-rounded, and front-end web development is one of the things I feel is missing from my resume. I have worked as a (Django) web developer, but mostly on the back-end. As with most back-end work, regardless of how good a job you did, it is more often than not the front-end that makes people go "wow". In short, I have always been a bit jealous of front-end developers.

So the lure is there, but the technology stack of HTML, CSS, JavaScript, etc. never was that alluring to me. The Elm Architecture offers you a much more high-level view that consists of (1) a model (state), (2) a view to display the model and (3) an update function to update your model state. In a way, it is very similar to QML development, which makes it all come full circle.

I read the docs, worked my way through some toy examples and this whetted my appetite enough to enroll in a two-day online Elm workshop.

After that, I definitely wanted to spend more time on it, but it turned out that with all the other things that kept me busy, I was spreading myself thin. The teacher of the online workshop, Richard Feldman, is writing the book Elm in Action which should be finished in the summer of 2017. I hope to revisit Elm by then, unless Pharo allows me to scratch my (front-end) web-development itch.

New Year's resolution

This overview turned out a lot longer than I initially expected. And that is with leaving out most of the details and even skipping things... As mentioned, I worked on a lot of things, I just did not blog about them. Better file this one in the "Missed opportunities" category. Well, it does give me an idea for another New Year's resolution...

Struggling with Smalltalk and Pharo

In April of 2015, I came across the release announcement of Pharo 4.0 on Hacker News. Pharo is an open-source implementation of Smalltalk, a programming language and development environment.

I first heard about Smalltalk in the early 1990s. Back then, I studied Computer Science and one professor mentioned how Smalltalk allowed him to model and simulate systems with tremendous ease. By the way, he was a professor of Mechanical Engineering :). Since then, I never worked with Smalltalk or even met someone who did. My languages of choice became C, C++ and later Python.

Every once in a while I came across an article or other online reference in which someone looked back on it favorably. Take for example this tweet from Kent Beck from December 20, 2012:

great joy today coding in smalltalk with an old friend. the design space is
HUGE compared to Java, PHP, and C++.

and the reply from Ron Jeffries:

@KentBeck i miss smalltalk a lot. nothing like it

References such as those kept my interest in Smalltalk alive. It had become clear to me that Smalltalk was, or had been, something special. This answer to the question what is so special about Smalltalk says it all:

the highly interactive style of programming you experience in Smalltalk is
simpler, more powerful and direct than anything you can do with Java or C# or
Ruby... and you can't really understand how well agile methods can work until
you've tried to do extreme programming in Smalltalk.

Go and read the full answer: it gives an impressive list of innovations that Smalltalk introduced.

So when I was looking for a small project for the summer holidays of 2015, I decided to spend some time with Pharo.

First impressions

It quickly became clear that Pharo is a development and execution environment in one. Compare this to Visual Studio (VS), which is an Integrated Development Environment (IDE). It allows you to build software that runs under one or more flavours of Microsoft Windows. VS is the development environment and Windows is the execution environment. With Pharo, these environments are the same.

Another way of looking at it is to see Pharo as a Virtual Machine (VM) on which the code runs. But in contrast to the Java Virtual Machine (JVM), which remains more-or-less hidden from the user, the Pharo virtual machine is very visible to the user. The following screenshot shows several applications running inside the Pharo environment:

/images/PharoScreenshot.png

This has several consequences. Pharo applications will always run inside their own top-level Pharo window. Furthermore, because the Pharo environment has its own distinct look-and-feel, Pharo applications will not look native to the host OS [1].

And what about command-line applications? Apparently it is possible to run Pharo without a visible environment, called "headless". I have not spent the time to find out how to do so and what it means for the development and execution of command-line applications.

What also became clear is that all development had to be done inside the Pharo environment. Everything that is done in the Pharo environment is stored in a special set of files: the image file and a changes file. I realized I would not be able to use the tooling I have grown accustomed to. This was not the most pleasant realization for an ardent command-line user like me.

This meant no Emacs, grep, find, sed etc. What about Git? Nope, Pharo uses its own distributed Version Control System (VCS) [2]. I did find that surprising. To me open-source software development is very much a "standing on the shoulders of others" kind of activity. And here I have something that really wants to do it (everything?) its own way... But maybe "its own way" is better than what I have been doing, so I plodded along.

Pharo 4.0 relies heavily on the mouse. But I am the kind of developer that really likes to keep his hands on the keyboard, at all times. Of course there are keyboard shortcuts, but some of them are context dependent: whether they work or not depends on the item that has focus.

Switching windows is possible using the keyboard shortcut ALT-TAB. This shortcut is often reserved by the host OS to switch applications. In Pharo, I did not find an easy way to bind window switching to another shortcut. All in all, I have to use the mouse a lot more than I would like.

There are more things I found cumbersome or "rough around the edges". For example, tabbing to the next UI element is inconsistent and the text editor is rather basic when you are used to Emacs. But enough about that.

Development in Pharo

To help me on my way with Pharo, I started out with two books, viz. Pharo by Example and Dynamic Web Development using Seaside. I worked my way through them, well mostly [3], but it did not become clear to me what Smalltalk provides that makes it such a productive environment. That might be due to the size of the examples used. For example, the books show that you can interact with live objects and modify them. That might be nice for small examples, but in my experience, if an application throws an exception that is not handled, the state is such that the best thing to do is to close the application. All in all, I am not (yet?) sold on the feature of working with live objects.

This is made worse by the fact that the Pharo environment keeps feeling alien. The tools remain cumbersome to use; it is as if I am developing with one hand tied behind my back.

Let's go back to 2002, when I was a C/C++ developer who started spending some time with Python. Almost immediately the benefits of Python became clear: a batteries-included language that did not require compilation and that allowed me to develop small applications & scripts in no time. Compare that to the time I have spent on Smalltalk, after which I am still left wondering what benefits it brings.

I have to acknowledge that my self-taught Python knowledge got a big boost when I started to work for a company that used Python professionally. From that moment on I had colleagues to consult and to learn from. However, a Smalltalk contract will be much more difficult to find.

Doubts

Doubts started to creep in about whether I should keep investigating Pharo: the environment required me to leave behind familiar tools and kept feeling alien, my learning progress was slow, and I still did not understand why it can be so productive.

Thanks to the internet, if you are looking for confirmation, you will find it. Just as you can find articles that praise Smalltalk, you can also find ones that criticize it, for example the C2 Wiki page Why is Smalltalk dead [4], the blog post What killed Smalltalk [5] and the (in)famous 2009 Rails Conference keynote by Robert C. Martin, What Killed Smalltalk Could Kill Ruby. To some of the objections I could already relate, viz. the lack of integration with the OS and the outside world in general.

The Smalltalk community appears small. The volume on the Smalltalk subreddit and the Pharo developers mailing list is low but, to be honest, I might be looking at the wrong online channels. Also, a small community can still be very much alive. But the fact that several Smalltalk-related websites were not up-to-date (or had nothing new to tell for several years) did not instill much confidence. An example is the Seaside website, whose homepage shows "latest news" from 2013 and whose Success Stories page has a lot of dead links. Another example is the PharoCasts website, which contains screencasts of Pharo. The last entry is from September 2012...

What's next?

At the beginning of this post, I mentioned that I was looking for a small project for the summer holidays of 2015. Well, after the summer holidays I concluded, reluctantly, that Smalltalk was not for me.

The following quote is from the website of Object Arts, which developed a Smalltalk implementation specifically for Microsoft Windows:

Smalltalk is dangerous. It is a drug. My advice to you would be don't try it;
it could ruin your life. Once you take the time to learn it (to REALLY learn
it) you will see that there is still nothing out there that can quite touch
it.

There must be something to Pharo and I feel that I just do not get it, yet.

A month ago I learned that a Massive Open Online Course (MOOC) on Pharo would start at the beginning of May 2016, Live Object Programming in Pharo. It is a 7-week course developed and given by Smalltalk developers and Pharo contributors. Among them is Stéphane Ducasse, one of the driving forces behind Pharo. The course looked like the ideal way to finally determine whether Pharo can be productive for me, so I enrolled.

At the time of writing, I have just finished the first week, which introduced Pharo and the Smalltalk language, and which ended with a screencast of a small programming exercise. I especially liked how the screencast showed, almost casually, some minor usage tips. Let's see how it goes in the following weeks!

[1] Dolphin Smalltalk is a Smalltalk implementation for Windows that allows you to develop applications that look native. There may be others.
[2] I know one can use Git with Pharo, but the standard way to do so is to use Monticello.
[3] To be honest, I only got about halfway through both of them.
[4] The resulting Hacker News discussion at https://news.ycombinator.com/item?id=10071681 contains some interesting comments.
[5] The resulting Hacker News discussion at https://news.ycombinator.com/item?id=10099304 contains some interesting comments.

What makes development "agile"?

Edit: Originally this post was titled "Agile Development as I interpret it"

Currently the company I work for, FEI, has several software-related positions open and in the last few months we interviewed software architects, team leads and project leads. The job descriptions mention a preference for people with experience in the area of "agile development". So when a resume mentioned that the applicant has experience in that area, I asked that person what, according to him or her, makes software development "agile". This blog post is about the answer I would give.

To me, "agile software development" means that you deliver user-facing functionality in a continuous sequence of short iterations [1]. Because of the length of each iteration, you have to limit the scope of the functionality you promise to deliver. If the scope is too broad, it will not fit time-wise.

Limiting the scope of functionality is not enough; you also have to limit the scope of design and implementation. In part this is due to the limited duration of each iteration. But as it is uncertain what functionality will have to be supported in the coming iterations, designing and implementing code in this iteration for iterations to come can turn out to be a waste of time.

Limiting the scope of design and implementation does not mean you deliver sub-standard code. Unless the project is scrapped, you will build upon that code in the coming iterations and refactor it to support new functionality. So the code had better be in good shape, and remain in good shape, to be able to keep up your pace in the iterations to come.

So that would have been my answer, I hope. When I compare my answer to the Agile Manifesto, I realize it is woefully incomplete. However, I do think that the elements of software development my answer touches upon are required for it to be called agile software development.

[1] Personally I prefer iterations of two weeks, and three weeks at the most.

The Nature of Software Development

In this blog post I talk about the book The Nature of Software Development, written by Ron Jeffries and published in February of this year. To quote Wikipedia, Ron Jeffries is one of the three founders of the Extreme Programming (XP) software development methodology, which originated in 1996. My introduction to XP came about four years later. His website XProgramming.com was one of the first websites I mined for information about XP. He literally has a lifetime of software development experience and whenever he writes, blogs or tweets, I take note.

What the book is about

Every software developer knows that building a product can be a painful experience. I am not so much talking about the technical aspects but more about the questions of what to build and when. There are a lot of things that can make a seemingly simple job difficult, e.g. requirements that are unclear or change, or deadlines that turn out to be unreachable. Mr. Jeffries wants to show us that there is a safe way through the field of lava that a software development project might resemble.

In the Introduction of his book, Mr. Jeffries states the following:

Come along with me, and explore how we can make software development simpler by focusing on frequent delivery of visible value.

The gist of the book is a familiar one, namely that we should deliver value to the customer feature by feature. The following quotes are from Chapter 2, "Value Is What We Want":

We need to build pieces that make sense to us, and to our users. These are often called minimal marketable features (MMFs).

and

we [...] benefit from providing business features at an even finer grain than the usual MMF.

The book discusses these notions and their impact on other aspects of software development, such as planning, design and quality.

My $0.02

A lot, if not most, of the things in the book you will have heard before. But the value of The Nature of Software Development lies in the way it presents their combination to the reader. When you read the book, it feels like Mr. Jeffries is sitting next to you, as if you are having a conversation. If you are open to the approach (to software development) he advocates, the book will invigorate you to tackle that difficult software project at work.

However, if you are skeptical about his approach, I can imagine that Mr. Jeffries alone will not convince you. That it sounds so deceptively simple does not help. He kind-of addresses that simplicity in Chapter 13, "Not that simple". There he acknowledges that "real business" will complicate things, but that "it's all about deciding what we want, guiding ourselves toward what we really want". True, but the "we" can be a large "we": stakeholders such as developers, architects, project managers and customers have to be on board also. You will have to overcome any skepticism from any of these parties, so be prepared to hear "that will never work for our situation".

I have seen software that was late due to features that were "really necessary for the first release" but that needed a lot of rework afterwards. I have also seen late software end up in a drawer because, apart from being late, it was also unusable. To avoid that I prefer to build an application feature by feature, slice by slice. The Nature of Software Development confirms my personal experience and beliefs with a clarity and completeness my own thoughts on this subject lack. I can thoroughly recommend it.

GitHub issues and Emacs Lisp unit tests

In my current project I spend a significant amount of time reviewing other people's code. Progress on those pull requests is not always continuous and it is easy to lose track of the ones that require my attention. To avoid that, I keep a list of these pull requests in a separate org-mode file in Emacs. Then, with the right key press(es), my Agenda shows me the ones I need to have a(nother) look at.

When I add a pull request to the org-mode file, I also copy the title and URL from the GitHub website. This proved to be cumbersome and error-prone, so I wrote a small Lisp package to retrieve that information automatically. Read on for details on its development and the use of automated tests in Emacs Lisp.

You can find the code of the aforementioned Lisp package in my github-query repo at Bitbucket.

General idea

I wanted a function that asks for an issue number and automatically retrieves the title and the URL of that issue [1]. This proved to be relatively easy to do as

  1. GitHub can be accessed through a Web API that is nicely documented.
  2. Emacs comes with the url package that you can use, among other things, to send requests over HTTP and HTTPS and receive the responses.

Of course, the devil is in the details and it took me some time to develop the following function:

(defun github-query-get-issue(owner repo issue-number)
  "Return issue ISSUE-NUMBER of GitHub repo OWNER/REPO.

This function returns the response as an association list, but
you can also use `github-query-get-attribute' for that."

The following snippet shows how you can use that function to retrieve the url of issue 1 from the GitHub repo bbatsov/projectile:

(let ((response (github-query-get-issue "bbatsov" "projectile" 1)))
  (let ((url (github-query-get-attribute 'html_url response)))
    url))
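
Under the hood this essentially boils down to a single HTTP GET request to the GitHub API. To get a feel for the raw JSON response, which contains attributes such as number, title and html_url, you can issue the same request from the command line (unauthenticated requests are rate-limited):

$> curl https://api.github.com/repos/bbatsov/projectile/issues/1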

The nicest thing was that I used automated unit tests during the development of the package, something that I had never done for Lisp code [2].

Automated tests in Emacs Lisp

Emacs comes standard with the library ert, "a tool for automated testing in Emacs Lisp". That library provides the macro "ert-deftest" that allows you to define a test as an ordinary function. It also provides you with several test assertions such as "should" and "should-not". The following test from the github-query package shows how it can be used:

(ert-deftest github-query--get-issue-retrieves-correct-response()
  (let ((response (github-query-get-issue "bbatsov" "projectile" 1)))
    (should (equal 1 (github-query-get-attribute 'number response)))
    (should (equal "Obey .gitignore, .bzrignore, etc."
                (github-query-get-attribute 'title response)))
    (should (equal "https://github.com/bbatsov/projectile/issues/1"
                (github-query-get-attribute 'html_url response)))))

Being able to run automated tests helped me enormously. Without them I find it easy to end up in a spot where I am thinking "but this was working before, or wasn't it?" [3], especially with the interactivity that the REPL provides.

I did have some minor issues with ert. There are two modes in which you can run tests, viz.

  • in interactive mode, where you execute the tests in the current Emacs process, and
  • in batch-mode, where you start a new Emacs that runs your tests and exits.

Working in interactive mode means that you always have to explicitly reload the code under test when you have changed it. Fail to do so and ert uses the code as it was during the previous run. To avoid that explicit reload, I use ert in batch-mode. Then I know for sure that all code under test is (re)loaded. Unfortunately, in batch-mode it is more difficult to specify which tests to run. There are also things that you can only do in interactive mode, for example running only the tests that failed during the last run.
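
To give an idea of what running ert in batch-mode looks like, something along these lines works; the file names are just an example and assume the tests live in github-query-tests.el next to github-query.el:

$> emacs -Q --batch -L . -l github-query.el -l github-query-tests.el -f ert-run-tests-batch-and-exit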

Conclusion

Well, ert is here to stay for me. I cannot imagine doing Emacs Lisp development without automated tests anymore (not that I develop a lot of Emacs Lisp code :).

I do find ert a bit cumbersome to use in batch-mode, but on the positive side, it forces me to have quick unit tests. There is another test runner for ert, ert-runner, that seems to make it easier to specify which tests to run in batch-mode, so I will have a look at that one.

[1] In my current project, I am part of a team that works in a single (private) repo.
[2] Well, not really the first time. I once worked on a Lisp package to run all functions in an Emacs Lisp file whose name started with test_ . I bootstrapped that functionality during its development. However, that code was neither completed nor published.
[3] For me this is not limited to Emacs Lisp, it holds for any programming language.

Python unittests from Emacs

At the company I currently work for, most of my coworkers use PyCharm to develop the Python application we are working on. I tried PyCharm several times and although I can understand why it is so popular, I still prefer Emacs :) One of the nice PyCharm features is the functionality to run the unit test the cursor resides in. So I decided to support that functionality in Emacs and in this post I describe how. You can find the code I developed for that in my skempy repo at Bitbucket.

General idea

The general idea to be able to run the unit test "at point" was simple:

  1. Write a Python script that, given a Python source file and line number, returns the name of the unit test in that file at that line.
  2. In Emacs, call that Python script with the file name of the current buffer and the line number at point to retrieve the name of the unit test.
  3. In Emacs, use the compile-command to run the unit test and display the output in compilation mode.

It is easy to run a specific unit test using standard Python functionality, e.g. the command:

$> python -m unittest source_code.MyTestSuite.test_a

executes test method test_a of test class MyTestSuite in file source_code.py.

I wanted to have all the complexity in the Python script, so the output of the Python script had to be something like:

source_code.MyTestSuite.test_a

which the Emacs Lisp code could then pickup to build the compile-command that Emacs should use. This idea resulted in the Python package skempy and the command-line utility skempy-find-test.

skempy

The following text, which is from the skempy README, explains how to use skempy-find-test:

$ skempy-find-test --help
usage: skempy-find-test [-h] [--version] file_path line_no

Retrieve the method in the given Python file and at the given line.

positional arguments:
  file_path   Python file including path
  line_no     line number

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

Assume you have the Python file tests/source_code.py:

import unittest


class TestMe(unittest.TestCase):

    def test_a(self):
        print "Hello World!"
        return

The following snippet shows the output of skempy-find-test on that Python file at line 7, which is the line that contains the print statement:

$ skempy-find-test tests/source_code.py 7
source_code.TestMe.test_a

Emacs integration

The root of the repo contains the Emacs Lisp file skempy.el, which provides a function that retrieves the test method at point and executes that test as a compile command:

(defun sks-execute-python-test()
  (interactive)
  (let ((test-method (shell-command-to-string (format "skempy-find-test %s %d" (buffer-file-name) (line-number-at-pos)))))
    (compile (concat "python -m unittest " test-method)))
  )

If you bind it to a key then running the test at point is a single keystroke away, e.g.:

(add-hook 'python-mode-hook
          '(lambda () (local-set-key [C-f7] 'sks-execute-python-test)))

Implementation details

Initially I wanted to parse the Python file that contains the unit test, reading the file line-by-line and using regular expressions to do some pattern matching. You might know the quote [1]

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

Indeed, before too long my spike in this direction was becoming overly complex.

I searched for another approach and this quickly led me to the Python ast module. This module "helps Python applications to process trees of the Python abstract syntax grammar". In other words, it helps you parse Python files.

To parse a Python file, I used the following exports of the ast module:

  • function ast.parse to create a tree of syntax nodes of a given Python file;
  • class ast.NodeVisitor which implements the visitor pattern to inspect the tree of nodes.

To put it bluntly, each syntax node represents a statement and contains additional information such as the line numbers of that statement. When you call ast.NodeVisitor.visit and pass the tree of nodes, visit calls the appropriate ast.NodeVisitor method for each node. If you want a specific behavior for a node type, you override the method for that node type. This resulted in the following code:

import ast


class LineFinder(ast.NodeVisitor):

    def __init__(self, line_no):
        self.line_no = line_no
        self.path = ""

    def visit_ClassDef(self, node):
        self.class_name = node.name

        if node.lineno <= self.line_no:
            self.path = self.class_name

        self.generic_visit(node)

    def visit_FunctionDef(self, node):

        max_lineno = node.lineno
        for statement_node in node.body:
            max_lineno = max(max_lineno, statement_node.lineno)

        if node.lineno <= self.line_no <= max_lineno:
            self.path = "%s.%s" % (self.class_name, node.name)

        if not self.path:
            self.generic_visit(node)


def get_path_in_code(source_code, line_no):

    tree = ast.parse(source_code)
    line_finder = LineFinder(line_no)
    line_finder.visit(tree)

    return line_finder.path

This code does not support all possible edge cases but it supports the use cases I currently have, which is enough for me.

Making it complete

The ast code alone is not enough. For example, the previous code snippet only returns a class and method name. That is not enough for the Python unit test runner, which also needs the package and module path. So we had to go from

TestMe.test_a

to

package.source_code.TestMe.test_a

It was easy to support this. You can find the complete code including unit tests, documentation, setup etc. in my skempy repo at Bitbucket. If you want to try it out, please have a look at the README, which explains how to install it.

[1] This is the quote as you might know it and I only use it in jest. The actual quote only warns against the overuse of regular expressions, as explained in this post on Coding Horror.

Emacs configuration with Cask

This post is about how I use Emacs package management in combination with Cask to manage my personal Emacs configuration. If you just want to have a look at that configuration, you can find it in my emacs-config repo at Bitbucket. Read on if you want to know how I got there...

DIY package management

When you use Emacs, you will rely on a plethora of Emacs packages that do not belong to the default installation. Until October of last year, I used a Makefile to retrieve those external Emacs packages. The following snippet is from that Makefile and shows how I retrieved the package ace-jump-mode:

install-ace-jump-mode:
        - rm -rf externals/ace-jump-mode
        cd externals ; git clone git://github.com/winterTTr/ace-jump-mode.git

The Makefile contained a lot of awfully similar rules to install the other external packages I relied on. Ideally it would have also contained rules to update those packages, but I never got round to implementing that.

The actual configuration of the external packages was done in the Emacs startup file init.el. The next Lisp snippet from that file shows the configuration of ace-jump-mode:

;use ace-jump-mode to improve navigation
(add-to-list 'load-path "~/.emacs.d/externals/ace-jump-mode")
(autoload
  'ace-jump-mode
  "ace-jump-mode"
  "Emacs quick move minor mode"
  t)

This home-grown solution to package management served me well over the years. It enabled me to get a new install of Emacs up-and-running rather quickly.

Using Emacs package management directly

Emacs has had a package management infrastructure since version 24.1 [1]. It consists of a set of online repositories and a library to interact with them. Since then it has become the way to install packages and in October of last year, I finally started using it. Indeed it is a breeze to use, the biggest benefits being

  • the use of a global list of available packages: no need to search through GitHub or other online sources,
  • the one-click install of an interesting package: no need to add an additional rule to my Makefile, and
  • automatic support for upgrading installed packages.

Initially I used the Emacs package package.el directly to access the package management infrastructure. The following snippet from my configuration file shows how that looked:

(require 'package)
(package-initialize)
(add-to-list 'package-archives '("melpa" . "http://melpa.milkbox.net/packages/") t)

;Bozhidar Batsov at
;
;  http://batsov.com/articles/2012/02/19/package-management-in-emacs-the-good-the-bad-and-the-ugly/
;
;Thank you!

(defvar sks-packages
  '(ace-jump-mode     ;;to enable text-specific cursor jumps
    )
  "A list of packages that should be installed at Emacs startup.")

(require 'cl)

(defun sks-packages-installed-p ()
  (loop for p in sks-packages
        when (not (package-installed-p p)) do (return nil)
        finally (return t)))

(unless (sks-packages-installed-p)
  ;; check for new packages (package versions)
  (message "%s" "Emacs is now refreshing its package database...")
  (package-refresh-contents)
  (message "%s" " done.")
  ;; install the missing packages
  (dolist (p sks-packages)
    (when (not (package-installed-p p))
      (package-install p))))

(eval-after-load "ace-jump-mode-autoloads"
  '(progn
     (autoload
       'ace-jump-mode
       "ace-jump-mode"
       "Emacs quick move minor mode"
       t)
     (define-key global-map (kbd "C-0") 'ace-jump-mode)))

Please note that the value of variable sks-packages in this snippet specifies a single external package to keep this example concise. The original Lisp file specifies a dozen more packages.

This approach was a big improvement over my own solution. Installation and configuration were located in the same file and gone were the dependencies on an additional Makefile and VC clients.

Emacs package management via Cask

The code that makes sure all packages in sks-packages are installed at startup, in case they are not yet installed, is from Bozhidar Batsov. Since then, Cask has appeared, which is, and I quote from its documentation,

[...] a project management tool for Emacs Lisp to automate the package development cycle; development, dependencies, testing, building, packaging and more.

Cask can also be used to manage dependencies for your local Emacs configuration.

Because of both those design goals, I decided to use Cask for my Emacs package management purposes.

Cask uses a so-called Cask file where you specify the external packages you rely on. For example, to install ace-jump-mode, my Cask file would look like this:

(source melpa) ;;archive of VCS snapshots built automatically from upstream repositories

(depends-on "ace-jump-mode")

(eval-after-load "ace-jump-mode-autoloads"
  '(progn
     (autoload
       'ace-jump-mode
       "ace-jump-mode"
       "Emacs quick move minor mode"
       t)
     (define-key global-map (kbd "C-0") 'ace-jump-mode)))

The first line specifies the online repository that should be searched and the third line specifies the external package itself [2]. To install this dependency, I execute the following command:

$ .emacs.d> cask

Cask installs the package in ~/.emacs.d/.cask/24.3.50.1/elpa, where 24.3.50.1 is my current Emacs version.

To update any installed dependencies, I just have to do:

$ .emacs.d> cask update

To close it off

As mentioned, you can find my Emacs configuration in my emacs-config repo. Although the repo is hosted at Bitbucket, it is a Git repo and not a Mercurial one as one might expect. I am using Git almost full-time now as my main client relies on it and have become more proficient with Git than with Mercurial. Furthermore, there is not much that can beat the excellent Emacs mode for interacting with Git, magit.

To conclude: previously my Emacs configuration was accessible from one of my public Launchpad repos. I hosted my configuration there as Launchpad uses the Bazaar version control system, which was the first distributed VCS I used. Over the last few years Bazaar adoption has declined and its development has slowed down, so I do not gain much, if anything, from hosting my configuration there.

[1] Emacs version 24.1 was released on 2012-06-10.
[2] The Cask documentation lists the repos it supports here. The comments in this snippet are from that documentation.

Build Qt 5.2 from source (Ubuntu 13.10)

Qt 5.2 was released on the 12th of December, 2013. I wanted to give it a spin and I downloaded the source tarball to build it myself. This proved to be more difficult than expected but I managed in the end.

The biggest hurdle was to get Qt Quick (2) working. Qt Quick uses OpenGL so you need the OpenGL development headers. If these are not installed, which was the case with my new laptop, the output of the Qt configure script mentions the lack of OpenGL support. Unfortunately it took me quite some time to connect that to the fact that my build did not contain Qt Quick.

The remainder of this blog post describes how to create out-of-tree builds for Qt 5.2 and Qt Creator 3.0. I have created these builds on Ubuntu 13.10 but the information should be applicable to other flavors of Linux also.

Prerequisites

As mentioned, for Qt Quick you need to have the OpenGL development headers installed. Execute the following command to install them:

$> sudo apt-get install libgl1-mesa-dev

To have your Qt5 applications blend in with your GTK desktop, you need the GTK 2.0 development headers:

$> sudo apt-get install libgtk2.0-dev

This enables GTK theme support but even with that working, Qt5 applications use a different theme by default. To force the use of a specific style, use the -style parameter when you start the Qt5 application, for example:

$> standarddialogs -style gtk+

For Qt4 applications you can set a default style with the qtconfig-qt4 utility, but Qt5 applications ignore its settings.

Build Qt 5.2

Download the Qt 5.2 tarball and unpack it:

$> wget http://download.qt-project.org/official_releases/qt/5.2/5.2.0/single/qt-everywhere-opensource-src-5.2.0.tar.gz
$> tar xvzf qt-everywhere-opensource-src-5.2.0.tar.gz

To build Qt for the local platform, execute the following commands [1]:

$> cd qt-everywhere-opensource-src-5.2.0
$> mkdir -p builds/local && cd builds/local
$> export PATH=$PWD/qtbase/bin:$PATH
$> ../../configure -prefix $PWD/qtbase -opensource -qt-xcb -nomake tests
$> make -j 4

We use a so-called out-of-source build to make it easy to rebuild Qt without having to worry that previous build artifacts influence the new build.

With the above value of the -prefix parameter, you do not have to install Qt using the make install command.

Note the -qt-xcb parameter for the configure command. It is there to, and I quote,

[...] get rid of most xcb- dependencies. Only libxcb will still be linked dynamically, since it will be most likely be pulled in via other dependencies anyway. This should allow for binaries that are portable across most modern Linux distributions.

This is mentioned in qtbase/src/plugins/platforms/README in the source tree.

The "-j 4" parameter to "make" specifies to run 4 jobs simultaneously. My laptop has 4 processing cores, so theoretically this could speed up compilation by a factor of 4. I did notice one drawback of using multiple jobs: when one of the jobs fails, it can be difficult to determine which compilation step failed as the messages from the failing job already have scrolled off the screen.

Build Qt Creator 3.0

Download the Qt Creator 3.0 source tarball and unpack it:

$> wget http://qt-mirror.dannhauer.de/official_releases/qtcreator/3.0/3.0.0/qt-creator-opensource-src-3.0.0.tar.gz
$> tar xvzf qt-creator-opensource-src-3.0.0.tar.gz

To build Qt Creator, execute the following commands from the root of the extracted tarball:

$> cd qt-creator-opensource-src-3.0.0
$> mkdir -p builds/local && cd builds/local
$> qmake -r ../..
$> make -j 4

Again we create an out-of-source build.
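
Once the build has finished, the freshly built Qt Creator can be started directly from the build directory; the binary should end up in the bin subdirectory (the exact location may differ per version):

$> ./bin/qtcreator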

Potentially dangerous tip

If you accidentally build Qt in its source directory, you can clean that directory using the following command:

$> find . -type f -mtime -1 -exec rm {} \;

This command deletes all files that are less than a day old. It does this silently except for:

$PWD/qtbase/src/corelib/global/qconfig.cpp

This file is read-only and you have to explicitly acknowledge that you want to delete it. That is safe to do, as the file will be regenerated during the next configure/build run. The following remarks are appropriate when you use this command:

- Be very, very careful where you execute that command. I once had it delete
  my new Qt build but much worse things can happen.
- This command only works if the original Qt files are more than a day old,
  which is the case for the version we are building here.
[1] These commands are inspired by this page of the MeeGo 1.2 Developer Documentation

Scrum, tasks & task estimates

At the start of an iteration our Scrum team determines the tasks required to realize each story and estimates them. This is an activity that can take a lot of time and, if you're not careful, a lot of energy. When that happens, be prepared for whispers or even loud complaints about "micromanagement". In this post I explain why we need tasks and estimates.

In Scrum, the user stories to be realized in the next iteration have already been estimated at a high level. Teams often estimate them in so-called story points, which indicate a relative effort required to deliver that story: the higher the number (of story points), the more effort required. At my current employer, we use the values 1 (extra small), 2, 4, 8, 16 and 32 (extra large).

The team assigns story points by comparing the new stories to older, realized ones and the story points assigned to them. When the number of story points that the team has realized in previous iterations is (relatively) stable, we use it to predict the velocity of the team in the next iteration. The stories that are selected for that iteration should fit the velocity of the team.

As mentioned, these story points are a high-level estimate and, unless you are a really experienced and gelled team, the stories contain too many unknowns to plan and monitor the current iteration. This is where tasks and estimates come in. They should provide us with

  1. a better understanding of the work that needs to be done,
  2. a shared understanding of the work that needs to be done, and
  3. a burndown chart to track our progress [1].

The better understanding should lead to better estimates that tell us whether the iteration is overloaded or underloaded. We use these estimates to track progress throughout the sprint so we can

  • keep stakeholders informed during the sprint,
  • re-allocate people and resources when necessary,
  • add, remove or modify stories.

It can be tempting to define the perfect breakdown and find the perfect estimates, whatever those may be. If that works for your team, good for you, but avoid drowning in a kind of mini-waterfall. For me the big a-ha moment was that we track progress on stories and not on tasks. The task breakdown should provide you with a better understanding of the work that has to be done and hopefully better estimates. That is its purpose. Keep in mind that the best understanding often comes from doing the actual work.

So what if the actual work deviates from the breakdown? In that case you should adapt the amount of work remaining according to your new insights. In this way the burndown reflects that slow-down or speed-up.

[1] This list has been inspired by the article at http://www.scrumalliance.org/articles/116-a-cure-for-task-estimation-obsession