25 Things I’ve learned in Software Development

21 07 2014
  1. Developers don’t think they need marketing people. Until they try to market their own products.
  2. Developers don’t think they need business people. Until they spend a years writing software no one wants and is eventually abandoned.
  3. Developers tend to make user interfaces a reflection of the underlying data structures. The more intuitive user interfaces often involve duplicate sections, intertwined complexity, and the base assumption that the user has no idea what they’re doing (eg, the anti-developer).
  4. The more users you have, the more risk you have when deploying new code.
  5. The more risk you have, the higher quality you tend to make code.
  6. High quality code (design documents, unit tests, integration tests, code reviews…) isn’t usually as fun as high speed hacked together prototypes.
  7. At software companies, you’ll be treated like rockstars. At other engineering companies, you may be stuffed in a dark corner with the IT people, second class citizens to the electrical engineers, finance people, or whoever else is a part of the main company’s mission.
  8. The popular misquote of Linus’s law, “with many eyeballs, all bugs are shallow,” is bullshit. Reading other people’s code is hard. Understanding other people’s code is even harder. Don’t expect code reviews to catch all the bugs.
  9. With many users, all bugs are shallow. If you own enterprise software with 5 clients, it’s probably going to be full of bugs. If you own software with public facing APIs and 2 million people calling them, you’ll find bugs rather quickly when you push to production.
  10. Aggressive, disagree and commit style arguments, are stressful. Passive aggressive, disagree and give the silent treatment arguments, are time wasting.
  11. Become a master of your chosen source control tools. Your everyday coding work will be easier and your coworkers will thank you when you can rattle off in 4 commands how to move their commits from one branch to another or fix their broken merges.
  12. Writing tests is the one time in your life when you’ll be happy you find a bug. Nothing’s worse than spending an hour writing tests only to find your code works just like you expected it to.
  13. Keep notes a text file, wiki, zim, etc. It doesn’t have to be pretty, but 3 months after you run an obscure SQL query you’d only need to do once, it’s really useful to be able to search through you notes file and find it again. Organization isn’t important, Ctrl + F will get you where you need to go.
  14. You will forget the details of basically everything you work on within a few months. I saw a code review where someone was having trouble getting Joda DateTimes to work with the Jackson JSON serializer. I immediately knew that I had encountered the exact same problem about a month ago, but had absolutely no idea where that code was. Luckily our code review system is easily searchable so I could find my old commit.
  15. You will realize that forgetting is irrelevant as long as you can quickly navigate your way back back to what you once knew. Knowing what’s at the end of the road and how to get there is more important than being everywhere at once.
  16. Bugs really do become features. Telling people years later, “that’s a bug, not a feature,” doesn’t go over well when you tell them how you “fixed” it.
  17. Deprecating internally used software interfaces is hard.
  18. Deprecating publicly used  software interfaces is next to impossible.
  19. Algorithms are important for the majority job interviews.
  20. Algorithms aren’t very important for the majority of programming jobs.
  21. Time goes by faster when you have no idea what time it is. Disable your system clocks.
  22. Meetings are easily missed when you have no idea what time it is. Use calendar notifications.
  23. It’s too easy to get tunnel vision. Walk away from your computer with a sheet of paper and brainstorm every now and then.
  24. You can identify who wrote code from the constants they used in their unit tests. I’m personally a fan of 108. Coworkers favor 1984, 42, 69, and 8675309.
  25. Beware the 2 minutes of time it takes to build your project. It can easily turn into 20 minutes reading blogs.

Google Software Engineer Interview

12 12 2013

I got contacted by a Google recruiter back in August 2013 who found me via some of my open source work on Github, and after a couple of phone interviews I got flown out to the Mountain View office for a day of in person interviews. I didn’t end up getting an offer, but I thought I’d share some advice on the things I wish I had studied more or done differently.

I got there in the morning and had a slight chat with the recruiter who had contacted me. He brought me to a conference room and told me about the process of what was to come. In my case there were 5 interviews with a break for lunch in the middle where a Google employee gave me a tour around the campus and spent some time talking to me. It was made clear that the lunch tour and conversation would be off the record and wasn’t part of the interview, so it was time you could relax a bit.

Each interviewer presented 1 or 2 problems, where solutions needed to be coded on a whiteboard (I used C++). I can’t go into the exact problems due to NDAs I had to sign, but it’s not very important for the purpose of this post anyway.

Interview 1

The first interviewer seemed very mathematically inclined and only had a single problem for me. The only thing I’ll tell you is that it was a sort of combinatorial optimization problem that involved a couple of math tricks to get the running time down. I got to the correct answer eventually, but I needed a few hints. I have a hard time deciding if this interview went well or not due to the number of hints that were given before I came to the correct solution. If this interviewer rated me badly, I wouldn’t have blamed him, my discrete math and memory of combinatorial algorithms was just too rusty to figure it out quickly. On the other hand, it all seemed mostly obvious after a few hints, and the interviewer was friendly and seemed happy that I did get to the correct answer at the end.

Interview 2

The second interview went great. This interviewer had two problems prepared, both of which I solved very quickly. After that he admitted that he didn’t really have anything else for me, and started asking me some generic questions about hash maps and sorting algorithms. Low stress and I can’t imagine anything I could have done better in this one.

Advice: Make sure you know all about sorting algorithms and hashmaps. Even if you don’t have to code them, an interviewer that runs out of prepared stuff might default to asking you a dozen questions on running times, hashing functions, choosing pivots in sorts, etc.

Interview 3

The third interview was the most stressful, but surprisingly had the least algorithmically complex problem. The problem itself was seemingly simple from an algorithm standpoint, but involved getting a lot of indexes correct and dealing with ranges of memory and whatnot without getting off by one errors or subtracting the wrong lengths from things. In addition, I made the mistake of using my own implementation of what was essentially a fixed size circular queue for part of the problem, when I should have used some more high level C++ data types (like std::queue), leaving me with even more indexes and pointers to potentially make mistakes on.

The interviewer had a thick accent and occasionally struggled to find the words to describe what he meant to me. This by itself wasn’t that big of problem, but combined with him being the type of person who gave hints and ideas often, and got frustrated by the language barrier, he would grab a sharpie and start writing bits of code on the board with mine rather than just telling me his hint.

A seemingly helpful gesture actually made the entire thing that much worse. When he would write something, it was often so different from what I was thinking that it would throw off my train of thought and I’d be left trying to figure out exactly where he was going with it. The problem given also left a certain amount of structuring and design work up to me, and when I started doing it one way, he soon suggested that I structure it differently and went so far as to take his sharpie and write out a method stub, telling me I should be able to solve the problem by implementing these methods (this was when I was already a good way through solving the problem differently). Already frustrated at my own code, I erased what I had and started from scratch doing it his way. The “assists” continued any time I would seem to be struggling or would pose rhetorical questions to myself to fill the silence and describe my thought process, and every time would leave me more stressed: derailing my train of thought and making me feel like I was doing badly.

Although Pair Programming with two keyboards is most certainly a bad idea, I’m not blaming everything on the interviewer. I wish I could go into detail about the problem, but all I can really say is that it wasn’t algorithmically complex, but I struggled far too much trying to get the little details of it worked out. dealing with all the indexes, ranges, pointers, timestamps, and some bad initial design decisions left me with a huge desire to erase everything and start from the beginning for a 3rd time, but time was not something I had left.

The interviewer snapping a picture of the whiteboard, and leaving with nothing said but “good luck”, conclude the encounter.

Lunch Break

Whew. I’m glad I got a lunch break when I did, because the last interview left me more than frazzled, and the time to decompress a bit and wander around the campus was timely.

Interview 4

This interview had a few questions, all some problem solving using manipulation of standard tree data structures. Not much to say on this one, I solved the problems with almost no hints and with time to spare.

Advice: Know your trees.

Interview 5

This interview was a another single problem interview. I spent a lot more time than I should have realizing it was a graph problem, and then got thrown off by the interviewer suggesting that DFS wasn’t going to be useful. In hindsight, part of the solution was topological sort, and there are two common ways to implement it. One of which is based off DFS, which is the only method I knew and the direction I was going with the problem, but the interview’s comment made me stop, since he was wanting the alternative way of implementing it (that involves actual modification of the graph by deleting edges and finding nodes with no incoming edges).

Advice: If there are multiple ways to implement an algorithm, make sure you at least know the high level differences of them.


2 interviews were spectacular.

2 interviews were okay and I got to the correct solution, but only with enough hints.

1 interview went terribly.

Result: generic rejection letter about Google not having a good position for me right now.

I wasn’t actively applying places or even attempting to get a job at Google; interviewing was just an opportunity that came up through a recruiter and was too tempting to pass on, if nothing else because of the experience and practice. Even with that said, I couldn’t help being disappointed at the result and being left with a lot of second guessing on how I could have done better. I wish I could have gotten some sort of idea on how close I was. If I had just been better at combinitorics, would that have been enough to push it over the edge? Or if I had recognized a graph problem quicker? Or would have all 5 interviews have to go really well to get hired? Guess I’ll never know.

Things I wish I did different

This isn’t a long list of ways to prepare, but rather a specific list of weaknesses and mistakes I made so that perhaps others can avoid them.

  • I should have reviewed more combinatorics. Not just the math, but the algorithms. Practice things like computing all the subsets of a set, all of the permutations and combinations of characters to form strings, etc.
  • I should have reviewed and studied more graph algorithms. DFS and BFS alone aren’t going to cut it, and it often isn’t clear that the problem given to you is actually a graph problem at all.
  • I should have tried to spend as little mental effort as possible worrying how the last interview went. I couldn’t help trying to tally how I was doing in my head as the day went on, but you should save the post-interview analysis for the flight home and not the breaks between interviews.
  • Finally, and perhaps most importantly, I need to be a lot more careful about letting interviewers throw off my current train of thought. A comment like, “so what does that get you?” with a seemingly negative tone is enough to make me assume that everything I’m doing is wrong and start second guessing myself instead of carrying on with my problem solving process.

What to expect interviewing for Google

Getting an interview, and interviewing for, a software developer position at Google has an odd contrast. From my experience at least, having a good resume, open source projects, and being able to talk about things you’ve done is a requirement to attract the attention of a recruiter and get an interview. However, once you get that in person interview, no one cares anymore. Not a single person out of five asked me a question about my resume, past projects, technical proficiencies, or anything about what I’ve done and think I can do. From the moment you switch from talking to the recruiter to the first in person interviewer, it’s nothing but whiteboard coding and algorithms. Don’t expect much small talk and be well practiced with Topcoder style problems.

Thoughts on Open Source Software Development

28 10 2013

The last year I did a lot of work with some small Open Source projects (Nova, Honeyd, Neighbor-Cache Fingerprinter, Node.js Github Issue Bot…). I’ve also used Linux for all of my development and have used a lot of Open Source projects in that time. In some ways I’ve come out being more of on Open Source advocate than ever, and in other ways I’ve come out a bit jaded. What does Open Sourcing a project get you?

Good thing: free feedback on features and project direction

Unless you’re Steve Jobs, you probably don’t know what customers want. If you’re an engineer like most people reading this blog, you really probably don’t know what customers want. Open Sourcing the project can provide free user feedback. If you’re writing a business application, people will tell you they want pretty graphs generated for data that you never thought would be important. If you’re writing something with dependencies, users will tell you they want you to support multiple versions of potentially incompatible libraries that you would never have bothered with on your own.

If you’ve got an IRC channel, you’ll occasionally find a person who’s more than willing to chat about his or her opinions on the project and what features they think would be useful, in addition to the occasional issue tickets and emails.

The Open Source community can be your customers when you don’t have any real customers yet.

Good thing: free testing

Everyone who downloads and uses the project becomes someone that can help with the testing effort. All software has bugs, and if they’re annoying enough, people will report them. I’ve tried to make small contributions to bigger Open Source projects by reporting issues I’ve found in things like Node.js, Express, Backtrack, Gimp, cvv8… As a result, code becomes better tested and more stable.

Good thing: free marketing

Open Sourcing the project, at least in theory, means people will use it. They’ll talk about it to their friends, they’ll write articles and reviews about it, and if the project is actually useful it’ll start gaining popularity.

Misconception: you’ll get herds of developers willing to work on your project for free

I’ve reported dozens of bugs in big Open Source projects. I’ve modified the source code of Nmap and Apache for various data collection reasons. I’ve never submitted a patch bigger than about 3 lines of code to someone else’s Open Source project. That’s depressing to admit, but it’s the norm. People will file bug tickets, sometimes offer suggestions on features, but don’t expect a herd of developers working for free and flocking to your project to make it better. Even the most hardcore Open Source advocates have their own pet projects they would rather work on than fixing bugs or writing features into yours. Not only that, the effort to fix a bug in a foreign code base is significantly higher than the effort required for the original developer of the code to fix it. Why spend 3 hours setting up the development environment and trying to fix a bug, when you can file a ticket and the guy that wrote the code can probably fix it in 3 minutes?

There are large Open Source projects (Linux, Open Office, Apache…) that have a bunch of dedicated developers. They’re the exceptions. From what I’ve seen, most Open Source projects are run by one person or a small core group.

Misconception: the community will take over maintaining projects even if the core developer leaves

We used a Node.js library called Nowjs quite a lot. It’s a wonderful package that takes away all the tedium of manual AJAX or socket.io work and makes Javascript RPC amazingly easy. It has over 2,000 followers on Github, and probably ten times that many people using it. One day the developer decided to abandon the project to work on other things; not that unusual for a pet project. Sadly, that was the death of the project. Github makes it trivial to clone the project, with a single press of a button someone could make a copy of the repository and take over maintaining and extending it. Dozens of people initially made forks of the project in order to do that, and dozens more made forks to fix bugs they found.

What’s left? A mess consisting of dozens of Github forks of the project, all with different bugs being fixed or features added, and the “official” project left abandoned in such a way no one can figure out which fork they should use. There’s no one left to merge in patches or to make project direction decisions. New users can’t figure out which fork to use and old users that actually write patches don’t know where to submit them anymore.

The developer of Nowjs moved on to develop Bridge-js. Then Bridge-js got abandoned too.

Bridge is still open source but the engineers behind Bridge have returned to school.

This pattern is almost an epidemic in Node.js. Someone creates a really amazing module, publishes it to Github and NPM, and then abandons it. Dozens of people try to take over the development, but in the end all fail (partly because Github’s lack of specifying which fork of the project is “official”, and the Open Source problem that there is no “official” fork). A dozen new people create the same basic module from scratch, most of which never become popular, and most of which also become abandoned… You see the picture.

If you sense a hint of frustration, you’d be right. On multiple occasions I had to dig through dozens of half abandoned projects trying to figure out which library I wanted to use to do something as common as SQL in Node.js.

The reason it’s an epidemic with Node is because no one is really sure what they want yet, and projects haven’t become popular enough to have momentum to continue after abandonment by their original authors. Hopefully at some point projects will acquire a big enough user base and enough developers that they can sustain themselves.

Fork is a four letter word

Even the biggest projects aren’t immune to the anarchy of forks. Libreoffice and Openoffice, GNU Emacs vs XEmacs, the list goes on. For the end user of these software suits, this is mainly annoying. I’ve switched between LibreOffice and OpenOffice more than once now, because I keep finding bugs in one but not the other.

Sometimes forks break out for ridiculous reasons. The popular IM client Pidgin was forked into the Carrier project. Why?

As of version 2.4 and later, the ability to manually resize the text input box of conversations has been altered—Pidgin now automatically resizes between a number of lines set in ‘Preferences’ and 50% of the window depending on how much is typed. Some users find this an annoyance rather than a feature and find this solution unacceptable. The inability to manually resize the input area eventually led to a fork, Carrier (originally Funpidgin). – https://en.wikipedia.org/wiki/Pidgin_(software)

You can view the 300+ post argument about the issue on the original Pidgin ticket here.

The fact there’s no single “official” version of a project and the sometimes trivial reasons that forks break out cause a lot of inefficiency as bug are fixed in some forks but not others, and eventually code bases diverge so much that they also develop in one fork or another.

Misconception: people outside of software development understand Open Source

I once heard someone ask in confusion how Open Source software can possibly be secure, because can’t anyone upload backdoors into it? They thought Open Source projects were like Wikipedia, where anyone could edit the code and their changes would be somehow instantly applied without review. After all, people keep telling them, “Open Source is great, anyone can modify the code!”.

A half dozen times, ranging from family members to customers and business people, I’ve had to try and explain how Open Source security products can work even though the code is available. If people can see the code, they can figure out how to break and bypass it, right? Ugh…

And don’t even get me started on the people that will start comparing Open Source to communism.

Concluding thoughts

I believe Open Source software has plenty of advantages, but I also think there’s a lot of hype surrounding it. The vast majority of Open Source projects are hacked together as hobby projects and abandoned shortly after. A study of Sourceforge projects showed that less than 17% of projects actually become successful; the other 83% are abandoned in the early stages. Most projects only have a few core developers and the project won’t outlive their interest in it. The community may submit occasional patches, but are unlikely to do serious feature development.

Why release Open Source software then? I think the answer often becomes, “why not?”. Plenty of developers write code in their spare time. They don’t plan to make money directly from it: selling software is hard. They do it to sharpen their saws. They do it for fun, self improvement, learning, future career opportunities, and to release something into the world that just might be useful and make people’s lives better. If you’re writing code with no intention of making money off it, there’s really no reason not to release it as Open Source.

What if you do want to make money off it? Well, why not dual licence your code and have a free Open Source trial version along with an Enterprise version? You’ll get the advantages of free marketing, testing, and user feedback. There is the risk that someone will come along and extend the Open Source trial version into a program that has all of the same features, or even more features, as your Enterprise version, and this is something that needs to be considered. However, as I mentioned before, it’s hard to find people that will take over the development and maintenance of Open Source projects. I think it’s more likely that someone will steal your idea and create their own implementation than bother with trying to extend your trial version code, but I don’t have any proof or evidence of that.

Jade Mixins (blocks, attributes, and more)

24 07 2013

Thought I’d note this down for anyone else having problems with Jade mixins. It’s fairly undocumented at the moment and if you follow the documentation on the Jade github it will actually break with obscure errors which took lot of trial and error to figure it out.

Note: I’m using Jade 0.32. You’ll probably need that version or newer.

What is a mixin?

A mixin is simple method to allow reuse of HTML snippets inside of Jade templates. Lets go ahead and explain with an example. Suppose you have a page of quotes. Each quote is in it’s own section, with the author’s name in bold, and a like button that keeps track of the most liked quotes with some Javascript.

The basic syntax to define a mixin that takes in a couple of arguments is as follows,

mixin section(quote, author)
      p #{quote} – said by
        b #{author}

    span.buttonSpan Like this quote

Then to actually use the mixin, we use the (somewhat undocumented) “+” symbol as follows,

+section(“Imagination is more important than knowledge”, “Albert Einstein”)
+section(“Writing, to me, is simply thinking through my fingers.”, “Isaac Asimov”)

This will generate the HTML,

<div class=”contentBox”>
  <div class=”quoteText”>
    <p>Imagination is more important than knowledge. – <b>Albert Einstein</b></p>

  <a onclick=”quoteLiked()” class=”likeButton”>
    <img src=”images/like.png” class=”buttonIconLeft”/>
    <span class=”buttonSpan”>Like this quote</span>

Mixin arguments can be objects too

You don’t have to just pass in strings to the mixins, but you can use any Javascript objects you passed into the render call or that you created earlier in the template. This can lead to some useful mixins like this one to convert a Javascript array to a select dropdown list.

mixin listData(selectId, options)

    each obj in options>
      option(value=”#{obj}”) #{obj}
– var countries= [‘UK’, ‘USA’, ‘CANADA’, ‘MEXICO’]
+listData(“countrySelect”, countries)

What are block mixins?

The need for block mixins came up when I had my page divided into sections, with each section having a title and a few containing divs, as well as a help icon.

mixin headerWithHelp(title, helpAnchor)
      span #{title}
      a(href=”help##{helpAnchor}”, target=”_blank”)
        img.helpIcon(src=”/images/help.png”, style=”float: right;”)
    div.cardOuter(style=”display: inline-block”)

The important thing to note here is the “block” keyword at the end of the mixin definition. This will make it so the indent block after the mixin will included in that location, so you can do things like,

+headerWithHelp(“Test Section”, “testHelp”)
   p All of my content can go here now

It’s important to note that as of right now, block mixins DO NOT WORK if you use the mixin keyword to use the mixin instead of the “+” symbol shorthand (which is all I showed you in this tutorial). I believe this is a bug, you can track the status of it on the ticket I made here. Trying to use the mixin keyword to use the mixin instead of + when using a block after it will result in “Error at new JS_Parse_Error” and a stack trace.

What are mixin attributes?

Mixins have the ability to let you modify the attributes of one of the tags inside of it when you modify the attributes of the mixin. For example, suppose that you have a mixin to define a section of the page with a header that you use a lot, but you change some of the style attributes like the width and display type a lot.

mixin header(title)
    h3.sectionHeader #{title}
    div.content(style=”display: block”)

Notice the attributes keyword? Now you can use the mixin like so,

+header(“Section Title”)(style=’text-align:center; display: block; width: 500px;’)

And the style attribute will now be applied to the container div.


A final comment is that you may want to have a mixins folder inside views for the sake of organization. Then in your other jade files, you can just include the mixins you need.

include mixins/headers.jade

include mixins/quotes.jade

Neighbor Cache Fingerprinter: Operating System Version Detection with ARP

30 12 2012

I’ve released the first prototype (written in C++) of an Open Source tool called the Neighbor Cache Fingerprinter on Github today. A few months ago, I was watching the output of a lightweight honeypot in a Wireshark capture and noticed that although it had the capability to fool nmap’s operating system scanner into thinking it was a certain operating system, there were subtle differences in the ARP behavior that could be detected. This gave me the idea to explore the possibility of doing OS version detection with nothing except ARP. The holidays provided a perfect time to destroy my sleep schedule and get some work done on this little side project (see commit punchcard, note best work done Sunday at 2:00am).


The tool is currently capable of uniquely identifying the following operating systems,

Windows 7
Windows XP (fingerprint from Service Pack 3)
Linux 3.x (fingerprint from Ubuntu 12.10)
Linux 2.6 (fingerprint from Century Link Q1000 DSL Router)
Linux 2.6 (newer than 2.6.24) (fingerprint from Ubuntu 8.04)
Linux 2.6 (older than 2.6.24) (fingerprint from Knoppix 5)
Linux 2.4 (fingerprint from Damn Small Linux 4.4.10)
Android 4.0.4
Android 3.2
Minix 3.2
ReactOS 0.3.13

More operating systems should follow as I get around to spinning up more installs on Virtual Machines and adding to the fingerprints file. Although it’s still a fairly early prototype, I believe it’s already a useful enough tool that it can be beneficial, so install it and let me know via the Github issues page if you find any bugs. There’s very little existing research on this; arp-fingerprint (a perl script that uses arp-scan) is the only thing remotely close, and it attempts to identify the OS only by looking at responses to ARP REQUEST packets. The Neighbor Cache Fingerprinter focuses on sending different types of ARP REPLY packets as well as analyzing several other behavioral quirks of ARP discussed in the notes below.

The main advantage of the Neighbor Cache Fingerprinter versus an Nmap OS scan is that the tool can do OS version detection on a machine that has all closed ports. The next big feature I’m working on is expanding the probe types to allow it to work on machines that respond to ICMP pings, OR have open TCP ports, OR have closed TCP ports, OR have closed UDP ports. The tool just needs the ability to elicit a reply from the target being scanned, and a pong, TCP/RST, TCP/ACK, or ICMP unreachable message will all provide that.

The following are my notes taken from the README file,


What is the Neighbor Cache? The Neighbor Cache is an operating system’s mapping of network addresses to link layer addresses maintained and updated via the protocol ARP (Address Resolution Protocol) in IPv4 or NDP (Neighbor Discovery Protocol) in IPv6. The neighbor cache can be as simple as a lookup table updated every time an ARP or NDP reply is seen, to something as complex as a cache that has multiple timeout values for each entry, which are updated based on positive feedback from higher level protocols and usage characteristics of that entry by the operating system’s applications, along with restrictions on malformed or unsolicited update packets.

This tool provides a mechanism for remote operating system detection by extrapolating characteristics of the target system’s underlying Neighbor Cache and general ARP behavior. Given the non-existence of any standard specification for how the Neighbor Cache should behave, there several differences in operating system network stack implementations that can be used for unique identification.

Traditional operating system fingerprinting tools such as Nmap and Xprobe2 rely on creating fingerprints from higher level protocols such as TCP, UDP, and ICMP. The downside of these tools is that they usually require open TCP ports and responses to ICMP probes. This tool works by sending a TCP SYN packet to a port which can be either open or closed. The target machine will either respond with a SYN/ACK packet or a SYN/RST packet, but either way it must first discover the MAC address to send the reply to via queries to the ARP Neighbor Cache. This allows for fingerprinting on target machines that have nothing but closed TCP ports and give no ICMP responses.

The main disadvantage of this tool versus traditional fingerprinting is that because it’s based on a Layer 2 protocol instead of a Layer 3 protocol, the target machine that is being tested must reside on the same Ethernet broadcast domain (usually the same physical network). It also has the disadvantage of being fairly slow compared to other OS scanners (a scan can take ~5 minutes).

Fingerprint Technique: Number of ARP Requests

When an operating system performs an ARP query it will often resend the request multiple times in case the request or the reply was lost. A simple count of the number of requests that are sent can provide a fingerprint feature. In addition, there can be differences in the number of responses to open and closed ports due to multiple retries on the higher level protocols, and attempting to send a probe multiple times can result in different numbers of ARP requests (Android will initially send 2 ARP requests, but the second time it will only send 1).

For example,

Windows XP: Sends 1 request

Windows 7: Sends 3 if probe to closed port (9 if probe to open port)

Linux: Sends 3 requests

Android 3: Sends 2 requests the first probe, then 1 request after
A minimum and maximum number of requests seen is recorded in the fingerprint.

Fingerprint Technique: Timing of ARP Request Retries

On hosts that retry ARP requests, the timing values can be used to deduce more information. Linux hosts generally have a constant retry time of 1 second, while Windows hosts generally back off on the timing, sending their first retry after between 500ms and 1s, and their second retry after 1 second.

The fingerprint contains the minimum time difference between requests seen, maximum time difference, and a boolean value indicating if the time differences are constant or changing.

Fingerprint Technique: Time before cache entry expires

After a proper request/reply ARP exchange, the Neighbor Cache gets an entry put in it for the IP address and for a certain amount of time communication will continue without additional ARP requests. At some point, the operating system will decide the entry in the cache is stale and make an attempt to update it by sending a new ARP request.

To test this a SYN packet is sent, an ARP exchange happens, and then SYN packets are sent once per second until another ARP request is seen.

Operating system response examples,

Windows XP : Timeout after 10 minutes (if referred to)

Windows 7/Vista/Server 2008 : Timeout between 15 seconds and 45 seconds

Freebsd : Timeout after 20 minutes

Linux : Timeout usually around 30 seconds
More research needs to be done on the best way to capture the values of delay_first_probe_time and differences between stale timing and actually falling out of the table and being gc’ed in Linux.

Waiting 20 minutes to finish the OS scan is unfeasible in most cases, so the fingerprinting mode only waits about 60 seconds. This may be changed later to make it easier to detect an oddity in older windows targets where cache entries expire faster if they aren’t used (TODO).

Fingerprint Technique: Response to Gratuitous ARP Replies

A gratuitous or unsolicited ARP reply is an ARP reply for which there was no request. The usual use case for this is notification of machines on the network of IP changes or systems coming online. The problem for implementers is that several of the fields in the ARP packet no longer make much sense.

Who is the Target Protocol Address for the ARP packet? The broadcast address? Zero? The specification surprisingly says neither: the target Protocol address should be the same IP address as the Sender Protocol Address.

When there’s no specific target for the ARP packet, the Target Hardware Address also becomes a confusing field. The specification says it’s value shouldn’t matter, but should be set to zero. However, most implementations will use the Ethernet broadcast address of FF:FF:FF:FF:FF instead, because internally they have some function to send an ARP reply that only takes one argument for the destination MAC address (and is put in both the Ethernet frame destination and the ARP packet’s Target Hardware Address). We can also experiment with setting the Target Hardware Address to the same thing as the Sender Hardware Address (the same method the spec says to use for the Target Protocol field).

Even the ARP opcode becomes confusing in the case of unsolicited ARP packets. Is it a “request” for other machines to update their cache? Or is it a “reply”, even though it isn’t a reply to anything? Most operating systems will update their cache no matter the opcode.

There are several variations of the gratuitous ARP packet that can be generated by changing the following fields,

Ethernet Frame Destination Address : Bcast or the MAC of our target

ARP Target Hardware Address : 0, bcast, or the MAC of our target

ARP Target Protocol Address : 0 or the IP address of our target

This results in 36 different gratuitous packet permutations.

Most operating systems have the interesting behavior that they will ignore gratuitous ARP packets if the sender is not in the Neighbor Cache already, but if the sender is in the Neighbor Cache, they will update the MAC address, and in some operating systems also update the timeouts.
The following sequence shows the testing technique for this feature,

Send ARP packet that is known to update most caches with srcmac = srcMacArg Send gratuitous ARP packet that is currently being tested with srcmac = srcMacArg + 1 Send probe packet with a source MAC address of srcMacArg in the Ethernet frame

The first packet attempts to get the cache entry into a known state: up to date and storing the source MAC address that is our default or the command line argument –srcmac. The following ARP packet is the actual probe permutation that’s being tested.

If the reply to the probe packet is to (srcMacArg + 1), then we know the gratuitous packet successfully updated the cache entry. If the reply to the probe is just (srcMacArg), then we know the cache was not updated and still contains the old value.

The reason the Ethernet frame source MAC address in the probe is set to the original srcMacArg is to ensure the target isn’t just replying to the MAC address it sees packets from and is really pulling the entry out of ARP.

Sometimes the Neighbor Cache entry will get into a state that makes it ignore gratuitous packets even though, given a normal state, it would accept them and update the entry. This can result in some timing related result changes. For now I haven’t made an attempt to fix this as it’s actually useful as a fingerprinting method in itself.

Fingerprint Technique: Can we get put into the cache with a gratuitous packet?

As mentioned in the last section, most operating systems won’t add a new entry to the cache given a gratuitous ARP packet, but they will update existing entries. One of the few differences between Windows XP and FreeBSD’s fingerprint is that we can place an entry in the cache by sending a certain gratuitous packet to a FreeBSD machine, and test if it was in the cache by seeing if a probe gets a response or not.

Fingerprint Technique: ARP Flood Prevention (Ignored rapid ARP replies)

RFC1122 (Requirements for Internet Hosts) states,

“A mechanism to prevent ARP flooding (repeatedly sending an ARP Request for the same IP address, at a high rate) MUST be included. The recommended maximum rate is 1 per second per destination.”

Linux will not only ignore duplicate REQUEST packets within a certain time, but also duplicate REPLY packets. We can test this by sending a set of unsolicited ARP replies within a short time range with difference MAC addresses being reported by each reply. Sending a probe will reveal in the probe response destination MAC address if the host responds to the first MAC address we ARPed or the last, indicating if it ignored the later rapid replies.

Fingerprint Technique: Correct Reply to RFC5227 ARP Probe

This test sends an “ARP Probe” as defined by RFC 5227 (IPv4 Address Conflict Detection) and checks the response to see if it confirms to the specification. The point of the ARP Probe is to check if an IP address is being used without the risk of accidentally causing someone’s ARP cache to update with your own MAC address when it sees your query. Given that you’re likely trying to tell if an IP address is being used because you want to claim it, you likely don’t have an IP address of your own yet, so the Sender Protocol Address field is set to 0 in the ARP REQUEST.

The RFC specifies the response as,

“(the probed host) MAY elect to attempt to defend its address by … broadcasting one single ARP Announcement, giving its own IP and hardware addresses as the sender addresses of the ARP, with the ‘target IP address’ set to its own IP address, and the ‘target hardware address’ set to all zeroes.”

But any Linux kernel older than 2.6.24 and some other operating systems will respond incorrectly, with a packet that has tpa == spa and tha == sha. Checking if tpa == 0 has proven sufficient for a boolean fingerprint feature.


Feedback from higher protocols extending timeout values

Linux has the ability to extend timeout values if there’s positive feedback from higher level protocols, such as a 3 way TCP handshake. Need to write tests for this and do some source diving in the kernel to see what else counts besides a 3 way handshake for positive feedback.


Infer Neighbor Cache size by flooding to cause entry dumping

Can we fill the ARP table with garbage entries in order for it to start dumping old ones? Can we reliably use this to infer the table size, even with Linux’s near random cache garbage collection rules? Can we do this on class A networks, or do we really need class B network subnets in order to make this a viable test?

All about network configuration in Ubuntu Server 12.04/12.10

1 11 2012

Network configuration in Linux can be confusing; this post traces hrough the layers from top to bottom in order to take away some of the confusion and provide some detailed insight into the network initialization and configuration process in Ubuntu Server 12.04 and 12.10.

Initial System Startup

In Ubuntu, upstart is gradually replacing traditional init scripts that start and stop based primarily on run levels. The upstart script for basic networking is located in /etc/init/networking.conf,

# networking – configure virtual network devices
# This task causes virtual network devices that do not have an associated
# kernel object to be started on boot.

description “configure virtual network devices”

emits static-network-up
emits net-device-up

start on (local-filesystems
and (stopped udevtrigger or container))


pre-start exec mkdir -p /run/network

exec ifup -a

The ‘local-filesystems’ event is triggered when all file systems have been finished being mounted, and the ‘stopped udevtrigger or container’ line is to ensure that the /run folder is ready to be used (which contains process IDs, locks, and other information programs want to temporarily store while they’re running). The task keyword tells upstart that this is a task that should end in a finite amount of time (rather than a service, which has daemon like behavior). The important thing to take away from this is that the command “ifup -a” is called when the system starts up.

Configuring ifup

The configuration file for ifup is located in /etc/network/interfaces and this is the file you’ll want to modify for a basic network configuration. The ifup tool allows configuring of your network interfaces and will attempt to serially go through and bring them up one at a time when ifconfig -a is called if the ‘auto’ keyword is specified. An example configuration follows,

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet dhcp

auto eth1
iface eth1 inet static


This configuration starts out with the loopback adapter, which should be there by default. The next entry for eth0 will attempt to use DHCP, and the entry for eth1 will use a static IP and configuration. Something to keep in mind is that this will ONLY be run when the system starts up. For a simple desktop machine that is always plugged into the network, this configuration will probably be all you need to do. But what happens if you’re unplugging and plugging things in a lot, such as on a laptop? You will run into two problems: the first is that if you have interfaces set to DHCP and they aren’t plugged in when you’re booting, you’ll likely have a  “waiting for network configuration” message followed by “waiting up to 60 more seconds for network configuration..” which can slow your boot time by several minutes. The second problem is that once the system is booted, plugging in an Ethernet cable won’t actually cause a DHCP request to be sent, since ifup -a is only called once when the system is booting. If you want to avoid these problems, you’ll have to use a tool like ifplugd.

Using ifplugd to handle interfaces that are unplugged a lot

From the man page,

ifplugd is a daemon which will automatically configure your ethernet device when a cable is plugged in and automatically unconfigure it if the cable is pulled. This is useful on laptops with on-board network adapters, since it will only configure the interface when a cable is really connected.


Installing and configuring ifplugd is easy. First, go into /etc/network/interfaces and change the ‘auto eth0’ settings to ‘allow-hotplug eth0’. Now ifup -a will not activate this interface, but will instead allow the ifplugd daemon to bring it up. The configuration information for the interface will still be used from /etc/network/interfaces. To install and configure ifplugd run the following,


sudo apt-get install ifplugd

sudo dpkg-reconfigure ifplugd

Enter the names of all the interfaces that you want ifplugd to configure when the link status changes and the reconfigure tool will update /etc/default/ifplugd. Instead of using upstart, ifplugd currently uses the old style init.d scripts, and is launched from /etc/init.d/ifplugd.

Now you should be able to unplug and plug in Ethernet cables and DHCP requests will be sent each time!

Note for Ubuntu Desktop Users

Ubuntu Desktop uses the Network Manager GNOME tool to configure the network, and most of the time you should be able to configure everything graphically using it. This is specifically for Ubuntu Server or an Ubuntu version without Network Manager.

NOVA: Network Antireconnaissance with Defensive Honeypots

7 06 2012

Knowledge is power, especially when regards to computer and information security. From the standpoint of a hacker, knowledge about the victim’s network is essential and the first step in any sort of attack is reconnaissance. Every little piece of seemingly innocent information can be gathered and combined to form a profile of the victim’s network, and each bit of information can help discover vulnerabilities that can be exploited to get in. What operating systems are being used? What services are running? What are the IP and MAC addresses of the machines on a network? How many machines are on the network? What firewalls and routers are in place? What’s the overall network architecture? What are the uptime statistics for the machines?

Since network reconnaissance is the first step in attacking, it follows that antireconnaissance should be the first line of defense against attacks. What can be done to prevent information gathering?

The first step in making the difficult to gather information is simply to not release it. This is the realm of authentication and firewalls, where data is restricted to subsets of authorized users and groups. This doesn’t stop the gathering of information that, by it’s nature, must be to some extent publicly available for things to function. Imagine the real life analogy of a license plate. The license plate number of the car you drive is a mostly harmless piece of information, but hiding it isn’t an option. It’s a unique identifier for your car who’s entire point is to be displayed to world. But how harmless is it really? Your license plate could be used for tracking your location: imagine a camera at a parking garage that keeps logs of all the cars that go in and out. What if someone makes a copy of your license plate for their car and uses it to get free parking at places you have authorized parking? What if someone copies the plate and uses it while speeding through red light cameras or committing other crimes? What if someone created a massive online database of every license plate they’ve ever seen, along with where they saw it and the car and driver’s information?

Although a piece of information may seem harmless by itself, it can be combined to get a more in depth picture of things and potentially be a source of exploitation.  Like a license plate, there any many things on a network that are required to be publicly accessible in order for the network to function. Since you can’t just block access to this information with a firewall, what’s the next step in preventing and slowing down reconnaissance? This is where NOVA comes in.

Since hiding information on a LAN isn’t an option, Datasoft’s NOVA (Network Obfuscation and Virtualized Anti-reconnaissance) instead tries to slow down and detect attackers by making them go threw huge amounts of fake information in the form of virtual honeypots (created with honeyd). Imagine an nmap scan on a typical corporate network. You might discover that there are 50 computers on the network, all running Windows XP and living on a single subnet. All of your attacks could then target Windows XP services and vulnerabilities. You might find a router and a printer on the network too, and spend a lot of time manually poking at them attempting to find a weakness. With NOVA and Honeyd running on the network, the same nmap scan could see hundreds of computers on the network with multiple operating systems, dozens of services running, and multiple routers. The attacker could spend hours or even days attempting to get into the decoy machines. Meanwhile, all of the traffic to these machines is being logged and analyzed by machine learning algorithms to determine if it appears hostile (matches hostile training data of past network scans, intrusion attempts, etc).

At the moment NOVA is still a bit rough around the edges, but it’s an open source C++ Linux project in a usable state that could really use some more users and contributors (shameless plug). There’s currently a QT GUI and a web interface (nodejs server with cvv8 to bind C++ to Javascript) that should provide rudimentary control of it. Given the lack of user input we’ve gotten, there are bound to be things that make perfect sense to us but are confusing to a new user, so if you download it feel free to jump on our IRC channel #nova on irc.oftc.net or post some issues on the github repository.