bloggo ergo sum

Dedicated window-buffer mapping with Emacs

I use CScope to navigate source code from within Emacs. It’s very, very useful and integrates well into Emacs. However, I’ve been wanting a way to control how CScope updates the buffer/window mappings as it locates search results. Sometimes I like that CScope updates the buffer where I initiated the search to reflect the results, and it’s easy to get back to the point of origin using the C-c s u command.

However, sometimes I want CScope to leave my origin buffer alone and show the result location in another window so I can see both at the same time. It’s bothersome to have to arrange the buffers manually after performing a search, so I asked on stackoverflow.com, and voila! I got a good answer – create a simple keybinding to a function for dedicating a window/buffer mapping:

;; keybindings
(global-set-key [pause] 'toggle-window-dedicated)

;; buffer dedication (mostly for cscope)
(defun toggle-window-dedicated ()
  "Toggle whether the currently selected window is dedicated to its buffer."
  (interactive)
  (message
   (if (let ((window (get-buffer-window (current-buffer))))
         (set-window-dedicated-p window
                                 (not (window-dedicated-p window))))
       "Window '%s' is dedicated"
     "Window '%s' is normal")
   (current-buffer)))

Now, using this, I can just hit the pause button on my keyboard when I want to pin down my main source buffer.

pre-commit script for submodule hygiene

Our team at work is using git submodules to track re-usable code across projects, and it’s been pretty good so far, but we have hit minor snags along the way (such as the absence of a ‘git submodule rm’ command!). Another one is that using submodules adds a step to the sequence of things you have to do to publish changes: pushing submodule commits. It’s an easy thing to forget, but it’s a headache for anyone on the other end of a pull when git-checkout-index fails. This pre-commit hook script will cause the commit to fail if the commit contains new submodule moments and those moments are not present in the corresponding submodule origin.

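#!/bin/bash

# Succeeds (returns 0) if $1 equals any of the remaining arguments.
function array_has
{
    local needle=$1 item
    shift
    for item in "$@"
    do
        if [ "$item" = "$needle" ]; then
            return 0
        fi
    done

    return 1
}

# Paths staged for this commit, one per line.
diffs=$(git diff --cached --name-only)

# Split command output on newlines only.
IFS=$'\n'
for smstat in $(git submodule status 2>/dev/null)
do
    # Drop the one-character status flag in front of the commit-id.
    smstat=${smstat:1}

    path=$(echo "$smstat" | awk '{print $2}')
    # The moment staged in the index for this submodule.
    moment=$(git ls-files -s -- "$path" | awk '{print $2}')

    unpub=0
    if array_has "$path" $diffs; then
        # Unpublished until some head on origin is found to contain it.
        unpub=1
        pushd "$path" >/dev/null || exit 1
        for rhead in $(git ls-remote -h origin | awk '{print $1}')
        do
            # Empty rev-list output means $moment is reachable from $rhead.
            # rev-list fails if $rhead has not been fetched locally; such
            # heads are skipped.
            if missing=$(git rev-list "$moment" "^$rhead" 2>/dev/null) &&
               [ -z "$missing" ]
            then
                unpub=0
            fi
        done
        popd >/dev/null
    fi

    if [ "$unpub" -gt 0 ]; then
        echo -n "ERROR: you are trying to commit "
        echo -n "unpublished changes to the $path "
        echo    "submodule."
        exit 1
    fi

done

exit 0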
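
The hook leans on the shape of each line printed by git submodule status: a one-character flag, the commit-id, the path, and usually a describe string in parentheses. As a rough sketch (the helper name sm_path is made up for illustration and is not part of the hook), the path can be pulled out with plain parameter expansion, which also copes with spaces in the path:

```shell
# Hypothetical helper: print the path field of one `git submodule status`
# line, assumed to look like "<flag><commit-id> <path> (<describe>)".
sm_path()
{
    line=${1#?}                    # drop the one-character flag
    line=${line#* }                # drop the commit-id and the space after it
    printf '%s\n' "${line% (*}"    # drop a trailing " (<describe>)", if any
}
```

For example, sm_path "$smstat" could stand in for the echo/awk pipeline that extracts the path, assuming it is called before the flag is stripped from the line.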

submodule moment

Seems that new features and concepts appear in Git at such a steady pace that it’s difficult for the jargon to keep up. I don’t follow the git mailing list as closely as I should: it’s too high-traffic and I already don’t have enough time to do the work I have on my plate.

Right now what’s getting tongue-tied is referring to the commit-id of a submodule stored in the HEAD of a containing repository. As far as I know, there’s not a good, succinct and unambiguous term for this commit-id.

I have decided to call it the ‘submodule moment’, because moment captures the ideas nicely:

  • a submodule at a particular point in its history (which we typically think of as linear in time)
  • a property of the submodule as it relates to its parent repository (a moment in mathematics is a statistical property of random variables)

More importantly, though, this term disambiguates itself from the other verbal references to commit identifiers.
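
To make the term concrete: git ls-tree lists a submodule as an entry with mode 160000 and type commit, and the object name on that line is the moment. Here is a small sketch that filters those entries out of ls-tree output read from stdin. The function name moments is my own invention, and it assumes paths without whitespace:

```shell
# Hypothetical helper: read `git ls-tree` output on stdin and print
# "<commit-id> <path>" for every submodule entry (mode 160000, type commit).
# Assumes paths contain no whitespace.
moments()
{
    awk '$1 == "160000" && $2 == "commit" { print $3, $4 }'
}
```

Running git ls-tree -r HEAD | moments in the containing repository would then print one moment per submodule.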

Extended Git Submodule Status

At work I’m involved in some projects that will very likely make heavy use of submodules. The reason is that submodules make it very convenient to make use of a set of “common” code without a ton of duplication. We’re currently breaking our “common” code into packages that can be included in a project independent of each other, and they will likely exist as submodules.

The challenge is that submodule support in Git isn’t quite as polished as you’d like it to be. What do you do if you have 20+ submodules, some of which may be on a branch and contain uncommitted changes that need to be dealt with? What if it’s been two weeks since you last looked at it?

One solution would be to write a git-all script like this (which is simpler than a real git-all would be, and may actually be incorrect):

#!/bin/bash
# git-all

for repo in $(find . -name '.git' -type d | xargs -n 1 dirname); do
    pushd "$repo" >/dev/null
    git "$@"
    popd >/dev/null
done

We used this approach, particularly before the advent of git-submodule, and some projects still use it. It’s not a bad approach, but it’s not really what you want.

Here’s another solution that’s a little more integrated into git itself:

#!/bin/bash
# git-submodule-changes

. git-sh-setup

status=0

for sm in $(git submodule status |
              sed 's/^[[:space:]]*\(.*\)/\1/' |
              cut -d ' ' -f 2)
do
    pushd "$sm" >/dev/null
    # Collect the distinct single-character status tags for this submodule.
    substat=$(git ls-files -d -m -o -s -u -t |
                  cut -d ' ' -f 1 | sort | uniq)
    substat=$(echo $substat | tr -d '[:space:]')

    if [ "$substat" != "" ]; then
        status=1
    fi

    printf "%7s %s\n" "$substat" "$sm"
    popd >/dev/null
done

exit $status

I think this works better. An added benefit is that you can do line-based scripting with tools like grep, sed, and awk (and tons of other unix utilities) with it, because all the info appears on one line:

{master}$ git submodule-changes
      ? sub0
        sub1

This isn’t terribly illuminating, but the format is just “[HMRCK?] submodule-path” on each line. The optional codes are the single-character status codes from git-ls-files.
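
As an example of the line-based scripting this allows, here is a sketch of a filter that keeps only the submodules that have something to report. It assumes the output format described above and submodule paths without spaces; the name dirty_submodules is just for illustration:

```shell
# Hypothetical filter: read `git submodule-changes` output on stdin and
# print the paths of submodules that carry at least one status code.
dirty_submodules()
{
    # A clean submodule's line holds only the path, so a second field
    # means a status code is present. Assumes paths contain no spaces.
    awk 'NF >= 2 { print $2 }'
}
```

With the sample output above, git submodule-changes | dirty_submodules would print only sub0.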

An even better solution would be to build this into git-submodule.sh, which I will look into as I get time.

Lisp: no easy download

Over at LispCast, eric’s got a new post up called No easy download. In it he talks about the difficulty of being new to Lisp, wanting to learn, and not knowing where to start.

He’s right. It’s way too hard – and I’d argue it’s one of the primary factors that hampers Lisp adoption. As one of his commenters points out:

Perl, Python, and Ruby have the advantage that they are the only or at least canonical implementation of that particular language. Therefore, you go to perl.com or python.org and there it is.

When I run lisp, I don’t type ‘lisp.’ My implementation isn’t at lisp.org or anything like that because there are a bunch of implementations, none of them strictly canonical. If you’d gone to sbcl.org or cons.org (CMUCL) for instance, you would have found an accessible download link, although not really a mascot.

The comment is correct that there isn’t a canonical implementation of Lisp. But why should it matter? There can be a canonical place to go to for Lisp resources. There’s common-lisp.net, which isn’t bad, but as eric points out, it doesn’t lend itself to the “getting started” meme as well as python.org and ruby.org do. What about lisp.com? Well, it redirects to yeah.com, which has nothing to do with Lisp that I can tell. I can’t be sure though, because my employer blocks it:

[Image: Lisp.com redirected to yeah.com]

Alright then, what about lisp.org?

Ok, it’s there and it’s Lisp-related, but it’s mostly about some group called the Association of Lisp Users. What about Lisp? Where can I download it?

[Image: Directory index of http://lisp.org]

I can learn about the history of this group, read about their board, their members, their meetings, and their sponsors. A little further down, I can read about past conferences. Further down, I can read about Lisp resources.

If the ruby.org page had been this way, there might not have ever been a Ruby on Rails, and there’d be no Basecamp or Twitter (ok, there would be, but they’d be made from PHP).

Now, to be fair, the ALU has a CL Gardeners project that is aimed specifically at dealing with this issue, and the ALU Wiki is actually pretty good – but the wiki should be the index of lisp.org, and that page should have a link to the alu.org homepage.

Of course, eric’s post spurred some discussion over at News.YC, in which I participated. Said radu_floricica:

Arguing between 3 or 4 clicks doesn’t change anything. There are a lot of problems with beginners and lisp, but they’re far from how to get to the first implementation. As a newbee myself, googling for lisp environments was damn easy and instructive: i knew within 20 minutes that I should use linux, and if i really wanted windows (as I did at first) there was only one choice. When I finally had access to a decent linux it took me maybe another 20 to confirm that sbcl is the one I wanted. All that gave me lots of background information on the side.

Now the first brush with asdf on the other hand was a nightmare. I still don’t understand why I have to become an expert in pgp (definitely not just a beginner) just to use asdf-install.

I think the main problem for beginners is the effort it takes to install a reasonable environment. I tried two versions for a web application server: ucw and weblocks. Weblocks meant installing more then 10 different packages and source code is pretty much standard documentation – but it’s ok because it’s understandable and officially beta anyways. UCW is the standard – after a year and a half I still couldn’t produce an installation running on a port different then 8080 (I use 8080 for tomcat for all my machines).

All this effort is way disproportionate to the effects. Installing emacs and slime is may not be a breeze, but once you discover that googling “emacs configuration” brings up a bonanza it’s worth it. (Also that .emacs in windows is _emacs… that’s half a day I’ll never have back). But so much effort just to get a server running… if you want to help beginners write a better ucw tutorial, don’t just rearrange the links on the frontpage of lisp sites.

And my response:

Arguing between 3 or 4 clicks doesn’t change anything.

You’re wrong. And besides, the argument isn’t between 3 or 4 clicks. It should be no more than 2, and preferably one.

There are a lot of problems with beginners and lisp, but
they’re far from how to get to the first implementation.
As a newbee myself, googling for lisp environments was
damn easy and instructive: i knew within 20 minutes that
I should use linux, and if i really wanted windows (as I
did at first) there was only one choice.

You’re still wrong. Not every newbie even knows what to google for, and even if they did, it shouldn’t take them 20 minutes of web searching to figure out what they should download. And even that didn’t lead you to the right place – there is more than one option for Windows (cusp, allegro, lispworks, and clisp for starters – that’s not even counting scheme stuff), Linux isn’t the only way, and SBCL isn’t the only thing going on Linux. But a newbie either

  a) has no way to figure all that out in a reasonably short amount of time
  b) isn’t going to try

And beyond that, you’re still missing the point: he shouldn’t have to.

Once you’re concerned with learning emacs and slime and customizing your environment and installing weblocks and getting asdf and asdf-install to work, you’re already committed. Difficulty doesn’t matter nearly as much by that point. Being able to easily download and dick around in the REPL is the number one hurdle that keeps people from trying Lisp.

The rest of the post I agree with, and I don’t argue that there are hurdles beyond the first one that are bigger and harder to conquer, but that’s irrelevant if newbies aren’t willing to jump the first one.

Smarter Git Prompt

After posting my git prompt shell snippets, someone on news.YC pointed out my daftness. All you need is the following in your .bashrc:

PS1='$(git branch &>/dev/null;
    if [ $? -eq 0 ]; then
        echo "\[\033[00m\]{$(git branch | grep "^\*" | sed s/\*\ //)}";
    fi)\$\[\033[00m\]'
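
The one-liner runs git branch twice, once to test for a repository and once to read the branch. A variation that runs it once might look like the sketch below. The function names current_branch and git_prompt_tag are made up for illustration, and the color codes are left out:

```shell
# Hypothetical helper: read `git branch` output on stdin and print the
# name of the branch marked with a leading "* ".
current_branch()
{
    sed -n 's/^\* //p'
}

# Hypothetical helper: print "{branch}" when inside a git repository and
# nothing at all otherwise.
git_prompt_tag()
{
    branch=$(git branch 2>/dev/null | current_branch)
    if [ -n "$branch" ]; then
        printf '{%s}' "$branch"
    fi
}
```

With both functions defined in .bashrc, the prompt itself shrinks to PS1='$(git_prompt_tag)\$ '.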

debugging macro snippet

I wish I could say I came up with the below, and I suppose I did, but I more or less gathered two or three techniques I didn’t come up with into a few lines of code to make a macro useful for debugging. Its primary merits are readable output and extreme ease of use.

#include <stdio.h>   /* printf */
#include <stdlib.h>  /* free */

#define MY_MODULE_DEBUG_CATEGORY1  0x0001
#define MY_MODULE_DEBUG_CATEGORY2  0x0002
#define MY_MODULE_DEBUG_CATEGORY3  0x0004

#if !defined(MY_MODULE_DEBUG)
#  define MY_MODULE_DEBUG                 0x0000
#endif
#define _pp_string(x) #x
#define _pp_str(x)     _pp_string(x)

#if (__GNUC__ > 3)
#  define dbg(cat, fmt, ...)					\
   do {								\
      if (MY_MODULE_DEBUG_##cat & MY_MODULE_DEBUG)		\
      {								\
	 printf(__FILE__":"_pp_str(__LINE__)" [%s] "fmt,	\
		__func__, ##__VA_ARGS__);			\
      }								\
   } while (0)
#else
#  define dbg(cat, fmt, args...)				\
   do {								\
      if (MY_MODULE_DEBUG_ ## cat & MY_MODULE_DEBUG)		\
      {								\
	 printf(__FILE__":"_pp_str(__LINE__)" [%s] "fmt,	\
		__FUNCTION__ , ##args);				\
      }								\
   } while (0)
#endif
#if defined(_SOME_OS_)
#  define os_err(cat, errval)					\
   do {								\
      char *errstr = 0;						\
      error_string(errval, &errstr, 80, ERR_GET_ALL);		\
      if (errstr)						\
      {								\
	 dbg(cat, "OS ERROR:\n%s\n", errstr);			\
	 free(errstr);						\
      }								\
      else							\
      {								\
	 dbg(cat, "OS ERROR: %#lx - error_string() failed!\n",	\
	     errval);						\
      }								\
   } while (0)
#endif

Notice the “os_err” macro above, for use with OS-specific error string functions. This particular incantation is oriented toward a particular OS I interact with at work (names changed to protect the innocent), but the same idea would be applicable to any system, really.

Picking up a Lisp

I’ve recently gone back to learning Lisp. I used it a little bit in graduate school to do some homework assignments in one of my algorithms classes and learned some of the basics there, but since I started my professional career, I haven’t gone back to it at all until now. I got Paul Graham’s book On Lisp and have made my way through several chapters, and it’s quite enjoyable.

One thing I keep noticing is how similar Lisp and C++ are. Many Lispers, of course, would regard this as heresy, but to me it makes perfect sense. For example, take this passage from the book:

Many languages offer some form of macro, but Lisp macros are singularly powerful. When a file of Lisp is compiled, a parser reads the source code and sends its output to the compiler. Here’s the stroke of genius: the output of the parser consists of lists of Lisp objects. With macros, we can manipulate the program while it’s in this intermediate form between parser and compiler. If necessary, these manipulations can be very extensive.

C++ templates fill a similar role: they, too, let you write code that runs at compile time and shapes what the compiler finally sees. Of course, the two languages aren’t exactly the same in this regard. C++ templates can’t be manipulated in the same way first-class C++ expressions can be; they are powerful, but really only give you full control over the type system. Being able to say lots of things about types in C++ will get you a long way, but not as far as Lisp macros will get you. Still, I think the similarities are striking, especially considering some of the newer techniques that have been discovered with C++ templates that let you go a bit beyond just the type system by doing some really obscure things with types.

I think a number of other similarities are significant. Lisp and C++ are the two languages with the best multi-paradigm support to which I’ve been exposed. Lisp tends to tilt functional, while C++ tilts procedural, but they both support all the major programming paradigms to practical degrees (although, to be fair, it would be easier to write an entirely procedural and/or object-oriented program in Lisp than it would be to write an entirely functional program in C++).

Anyway, just blurting some thoughts out loud while reading On Lisp.

Useful Git Prompt

Here’s a snippet of bash script that I wrote to make my prompt tell me when I’m in a git repository and what branch I’m on.

And to my pleasant surprise, the PROMPT_COMMAND mechanism will respect color codes, so if you have your branch listings in git colorized, it will be reflected in your prompt.

function rfind
{
    dir=$PWD
    while [ ! -e "$1" ]; do
        if [ "$PWD" == "/" ]; then
            command cd "$dir"
            return 1
        else
            command cd ..
        fi
    done

    rfdir=$PWD
    command cd "$dir"

    return 0
}

function git_dir
{
     typeset str
     rfdir=""
     rfind '.git'
     if [ $? -eq 0 ]; then
        str="{`git branch | grep '*' | cut -d ' ' -f 2`}"
        rfdir=`echo $rfdir | sed "s#$HOME#~#"`
     fi

    echo "$str"
}

function mkprompt
{
    typeset branch
    branch=`git_dir`
    PS1="${branch}$ "
}

PROMPT_COMMAND=mkprompt
export PROMPT_COMMAND
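
One wart in rfind is that it changes directory on every step up and has to remember to change back. A sketch of the same upward search done purely on strings, with no cd at all, could look like this. The name find_up and its optional second argument are my own additions for illustration:

```shell
# Hypothetical replacement for rfind: print the nearest ancestor of the
# starting directory ($2, defaulting to $PWD) that contains an entry
# named $1. Returns 1 if no ancestor has one.
find_up()
{
    dir=${2:-$PWD}
    while :; do
        if [ -e "$dir/$1" ]; then
            printf '%s\n' "${dir:-/}"
            return 0
        fi
        # Strip the last path component; stop once nothing is left to strip.
        parent=${dir%/*}
        [ "$parent" = "$dir" ] && return 1
        dir=$parent
    done
}
```

In git_dir, the call rfind '.git' could then become something like rfdir=$(find_up .git), with the exit status checked the same way.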

More functional support in C++

Sutter just announced that C++0x will have support for lambdas and closures.

It looks from the N2550 report that these things are motivated by a desire for some syntactic sugar to make using STL algorithms easier (which is understandable), but I hope this will lead to more first-class support for the functional programming paradigm. C++ has long had support for function objects, but it would be difficult to write predominantly functional programs in it. I hope it just got a little bit easier.

My question(s): will these lambdas be serializable? In Lisp, there are a couple of ways you could ship around a closure. Ultimately, if you really wanted to, you could express the lambda as a regular list, transmit it, and then call “compile” on it. Since C++ doesn’t have a built-in way to write and compile code from within the language itself, it will have to be somewhat different (or maybe it’s time to start writing a functor serializer for the boost library).

I am a huge C++ aficionado, and this news excites me. A comment over on news.yc captures my sentiments pretty well:

Lambda functions and closures in C++. E-gads man! Wasn’t the language complex enough already!

I have to say: I love it. C++ has always been my favorite language that I don’t program in. It’s like kung-fu for programmers. You can attack any problem. But most times we don’t need kung-fu.

There’s nothing like “getting in the zone” with a good C++ program. It’s a thing of beauty. And if you’re doing stuff that you have to use the tool, like writing drivers, low-level stuff, doing some “break the rules” programming, it’s the best there is. I wrote a shell extender for windows many years ago in C++. I got to change the desktop and file system to work the way I wanted it to. Very neat stuff.

The sad fact is that most of the time you can make more money writing something with a much higher level language. And you can make it faster and adapt to the market faster. So C++ is a tougher language to use for startups, in my opinion (although it is still my favorite)

Can you imagine the trouble you can get into with macros, lambda functions, and unsafe pointers? How about throwing i some operator overloading? It’s like a playground for nuclear weapons.