Four steps for describing cloud architectures on a whiteboard

2 04 2015

If you work at a company that owns a lot of cloud services, it’s inevitable that you’ll have some high-level meetings with other teams regarding the architecture of their systems. Usually people start drawing boxes while saying, “we own this service called Foo which talks to this service called Bar, which then talks to Pineapple and…”


Five minutes later, everyone is lost. The diagram and the descriptions made while drawing it transfer nothing except the knowledge that they own web services, which have names (probably unrelated to what they do), and at some undefined point in time communicate something with each other. Everyone spends the rest of the meeting checking their email in between squinting at the whiteboard and wondering when they can go back to writing code.

To try and make describing a distributed system easier, I ask four questions.

1. What are the entities?

Your system does stuff with entities (SKUs, Reports, Documents, Processing Requests, Marketplaces, Orders, Resources…). Telling me all about your service that processes SKU Merger Requests is going to be gibberish to me unless you start by telling me what the heck a SKU Merger Request is. Systems deal with entities that are usually abstractions around business concepts and client functionality. Start with a non-technical explanation of what the system does, why it exists in the first place, and most importantly what entities it deals with.

2. Where is the state/persistence?

There are stateless services and services with state. If the service has state, where is it persisted? Oracle databases? DynamoDB? Elasticsearch clusters? Encoded on top of the quantum superposition of hydrogen atoms? I don’t care much about the details of how things are encoded in the persistence layer (SQL, JSON, text files…). I just care about where on the diagram of distributed stuff the state of the system resides.

3. What triggers synchronous chains of events?

Drawing arrows between service boxes doesn’t represent when those arrows are exercised. In distributed systems, there are lots of triggers which start some chain of synchronous events. These triggers should be described. For example, a user on a webpage submits an order. This is a trigger to some chain of events. Perhaps that chain of events will end with a message being put on a queue. An asynchronous agent pops the message off that queue, which begins another chain of synchronous events, that perhaps ends with a database call. The database happens to have a post-commit trigger which starts another chain of events. You get the picture; describe the triggers and make sure you draw distinctions between different chains of events.
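The order example above can be sketched in code to show where one chain of events stops and the next begins. This is only a minimal sketch: the function names, the in-memory queue, and the log entries are all hypothetical stand-ins, and a real system would use a real message queue and database trigger.

```python
import queue


def submit_order(order_id, work_queue, log):
    """Trigger 1: a user submits an order on a webpage (chain 1)."""
    log.append(("chain 1", "validate " + order_id))
    # Chain 1 ends by putting a message on the queue.
    work_queue.put(order_id)
    log.append(("chain 1", "enqueue " + order_id))


def post_commit_hook(order_id, log):
    """Trigger 3: the database's post-commit trigger starts chain 3."""
    log.append(("chain 3", "post-commit hook for " + order_id))


def agent_poll(work_queue, log):
    """Trigger 2: an asynchronous agent pops a message off the queue (chain 2)."""
    order_id = work_queue.get_nowait()
    log.append(("chain 2", "dequeue " + order_id))
    # Chain 2 ends with a database call, which fires the post-commit trigger.
    log.append(("chain 2", "commit " + order_id + " to database"))
    post_commit_hook(order_id, log)


def demo():
    """Run the three chains once and return the ordered log of steps."""
    work_queue, log = queue.Queue(), []
    submit_order("order-1", work_queue, log)
    agent_poll(work_queue, log)
    return log
```

Each entry in the returned log is tagged with the chain it belongs to, which is the same distinction the whiteboard drawing should make between arrows.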

4. The functional bits: APIs, inputs, and outputs.

Finally, start breaking down the details of the interactions between system components. All software takes input and produces output. This can be hard to draw on a whiteboard, but can be described while drawing at least.


The symbols to draw on the whiteboard? It doesn’t matter. People are adaptable. Jeff can draw his chains of events using different colored markers, Bob can draw his chains of events using squiggly lines, and Jessica can draw hers with numeric labels. The whiteboard is just temporary storage to sketch out systems that would otherwise escape short term memory before they can be fully understood. It’s a scaffolding that must be filled in with descriptions and discussions during the drawing process.

Just remember to hit all of the high-level talking points. What entities does your system deal with? Where are they stored? What triggers their mutation? What are the inputs and outputs to your APIs? This is usually enough to sketch out the high-level workings of complex systems filled with workflows, asynchronous queues, distributed state, and all sorts of common cloud computing patterns. Really the only thing missing is conditionals, and my answer to that? Don’t draw a bunch of crazy conditional paths on the same diagram. Make one diagram follow a single path through code and draw different diagrams for the alternative flows when authorization is denied, processing is cancelled, or whatever other branching and exception cases you’re trying to communicate.
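If it helps to keep track of the four talking points during a meeting, they can be written down as a tiny checklist. The sketch below is purely illustrative: the `SystemSketch` class and its field names are made up for this example and are not part of any tool.

```python
from dataclasses import dataclass, field


@dataclass
class SystemSketch:
    """Hypothetical notes for one whiteboard session."""
    entities: list = field(default_factory=list)      # 1. what the system deals with
    state_stores: list = field(default_factory=list)  # 2. where the state is persisted
    triggers: list = field(default_factory=list)      # 3. what starts each chain of events
    apis: dict = field(default_factory=dict)          # 4. API name -> (input, output)

    def missing_talking_points(self):
        """Return the names of the talking points that are still empty."""
        points = {
            "entities": self.entities,
            "state_stores": self.state_stores,
            "triggers": self.triggers,
            "apis": self.apis,
        }
        return [name for name, value in points.items() if not value]
```

Anything still returned by `missing_talking_points()` at the end of the meeting is a question nobody asked yet.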

25 Things I’ve learned in Software Development

21 07 2014
  1. Developers don’t think they need marketing people. Until they try to market their own products.
  2. Developers don’t think they need business people. Until they spend a year writing software that no one wants and that is eventually abandoned.
  3. Developers tend to make user interfaces a reflection of the underlying data structures. The more intuitive user interfaces often involve duplicate sections, intertwined complexity, and the base assumption that the user has no idea what they’re doing (e.g., the anti-developer).
  4. The more users you have, the more risk you have when deploying new code.
  5. The more risk you have, the higher quality you tend to make code.
  6. High quality code (design documents, unit tests, integration tests, code reviews…) isn’t usually as fun as high speed hacked together prototypes.
  7. At software companies, you’ll be treated like rockstars. At other engineering companies, you may be stuffed in a dark corner with the IT people, second class citizens to the electrical engineers, finance people, or whoever else is a part of the main company’s mission.
  8. The popular misquote of Linus’s law, “with many eyeballs, all bugs are shallow,” is bullshit. Reading other people’s code is hard. Understanding other people’s code is even harder. Don’t expect code reviews to catch all the bugs.
  9. With many users, all bugs are shallow. If you own enterprise software with 5 clients, it’s probably going to be full of bugs. If you own software with public facing APIs and 2 million people calling them, you’ll find bugs rather quickly when you push to production.
  10. Aggressive, disagree-and-commit style arguments are stressful. Passive-aggressive, disagree-and-give-the-silent-treatment arguments are time wasting.
  11. Become a master of your chosen source control tools. Your everyday coding work will be easier and your coworkers will thank you when you can rattle off in 4 commands how to move their commits from one branch to another or fix their broken merges.
  12. Writing tests is the one time in your life when you’ll be happy you find a bug. Nothing’s worse than spending an hour writing tests only to find your code works just like you expected it to.
  13. Keep notes in a text file, wiki, zim, etc. It doesn’t have to be pretty, but 3 months after you run an obscure SQL query you thought you’d only need once, it’s really useful to be able to search through your notes file and find it again. Organization isn’t important; Ctrl + F will get you where you need to go.
  14. You will forget the details of basically everything you work on within a few months. I saw a code review where someone was having trouble getting Joda DateTimes to work with the Jackson JSON serializer. I immediately knew that I had encountered the exact same problem about a month ago, but had absolutely no idea where that code was. Luckily our code review system is easily searchable so I could find my old commit.
  15. You will realize that forgetting is irrelevant as long as you can quickly navigate your way back to what you once knew. Knowing what’s at the end of the road and how to get there is more important than being everywhere at once.
  16. Bugs really do become features. Telling people years later, “that’s a bug, not a feature,” doesn’t go over well when you tell them how you “fixed” it.
  17. Deprecating internally used software interfaces is hard.
  18. Deprecating publicly used software interfaces is next to impossible.
  19. Algorithms are important for the majority of job interviews.
  20. Algorithms aren’t very important for the majority of programming jobs.
  21. Time goes by faster when you have no idea what time it is. Disable your system clocks.
  22. Meetings are easily missed when you have no idea what time it is. Use calendar notifications.
  23. It’s too easy to get tunnel vision. Walk away from your computer with a sheet of paper and brainstorm every now and then.
  24. You can identify who wrote code from the constants they used in their unit tests. I’m personally a fan of 108. Coworkers favor 1984, 42, 69, and 8675309.
  25. Beware the 2 minutes of time it takes to build your project. It can easily turn into 20 minutes reading blogs.

Thoughts of a lone traveler (Seattle)

16 03 2014

B17, B17, B17… I repeat in my head as I traipse through the airport terminal trying to find my gate. B13, B14, B15, B10? I spin around with an uncensored gesture of confusion and annoyance, march back to the last sign I saw, and look perplexed at the map telling me gate B17 doesn’t exist. I rummage through my pockets trying to find my boarding pass. Boarding group B, position 17, gate B12. Damnit, I mixed up the B17 boarding position with the B12 gate. What kind of UI designer would design boarding positions to resemble gate numbers? Boarding positions could start with a number and be suffixed with a letter to avoid mixing them up, or they could use letters that are larger than the largest gate number, or they could use roman numerals, or be prepended with… I find gate B12 and stop muttering to myself about how airports are designed all wrong.

The bathroom sinks at this airport are these strange Y shaped contraptions. You put your hands under them, and water comes out. You hold your hands vertically in front of them, and it blows air to dry them. It’s the little things you notice when you travel.

Here I am, writing about travel again on my software engineering themed blog. I’m tempted to create a new blog, one for writing about the adventures of life outside of engineering, but why partition and segregate the areas of my life? We show too many different masks to too many different people; I’d rather transcend than encourage that habit.

This time I’m sitting in the Seattle Tacoma International airport, in a giant open room three stories tall with a beautiful wall of glass overlooking the runways and asphalt outside. It would be a wonderful view if not for the solid grey sky sprinkling down rain in a gloomy manner. I’m from Arizona, so in general I love it when it rains, but Seattle weather still doesn’t appeal to me much. In Arizona we have storms: dark clouds and strong winds with the occasional monsoon or dust storm (we call them haboobs). They’re powerful, beautiful, and rare. Here, the sky is solid grey, there’s no wind to be found, and cold rain boringly sprinkles down onto the raincoat covered city goers.

This trip was a bit long and tiring. It started out arriving Monday night with a coworker, followed by three days of death by PowerPoint at a work related conference. Some of the talks were actually interesting, but sitting in the same place for 7+ hours, 3 days straight, will make even the most interesting talks seem to drag on forever. We went to some local restaurants every night, but in general I didn’t get to do any sightseeing until Friday. I set up the trip so that I could stay Friday-Sunday on my own time and actually get a chance to see a bit of Seattle this time (second time I’ve been here for work, first time I’ve gotten to see anything).

Friday, I slept in, recovering from the PowerPoint induced mental coma of the days past. I got up in the morning with a vague notion of seeing the Space Needle and beginning my use of the Seattle City Pass I purchased (entry to the Space Needle, EMP museum, science center, aquarium, and a harbor boat tour). However, I decided to wander to the south side of Lake Union first in the morning. The lake wasn’t anything particularly noteworthy, but I did run across the Museum of History and Industry, and spent some time walking through there. I stopped at a Goodwill on the way back to the hotel and made the completely random purchase of an orange turtle lamp, beginning the collection of odd souvenirs I’m bringing back from the trip. Hah, turtle lamp! I smile a little every time I see it for no determinable reason, followed by fearing that airport security will complain that it could be used as a bludgeoning instrument, as the turtle is made of a surprisingly hefty chunk of metal with an orange frosted glass shell.

My energy was already running low when I got back to the hotel, and I decided to pull out my laptop and see if there were any local bands playing that Friday night. Seattle is a city obsessed with music. Posters for local indie bands are tacked onto every light pole and bus stop you come across downtown. Looking online I found not one, but dozens, of local bands playing in various venues. Of all my impressions of Seattle, the best thing I found was the local music. One by one, I plugged the addresses of venues into Google Maps, narrowing down the list to a half dozen places within reasonable walking distance from my hotel. I finally settled on a place called the Crocodile, which was featuring Jamie Nova, The Pink Slips, and Into the Cold.

Music! Just what I needed to get my energy levels back up. As an introverted engineer with almost no musical talent, I have the utmost respect and love for singers who go on stage and pour their hearts and stories out. I fell in love with the music by the first singer, Jamie Nova, and ended up adding one of her CDs and a flask to my collection of odd souvenirs. The Pink Slips featured the pop singing of a 16-year-old girl, and the night was finished off with the more surreal and dark sounds of the four-woman choir in Into the Cold.

Saturday morning I put on my tourist hat (figuratively, though later in the story there is a hat) and started in on the stereotypical Seattle tourist attractions. First thing in the morning, I took the elevator up to the top of the Space Needle and looked around. It was a bit disappointing, since in my opinion, Seattle is sort of an ugly city. Seeing more of it at once doesn’t make it any prettier. Too many of the buildings around here are dull colors, reflecting the colors of the often grey Seattle skies. Looking off in the distance I see rows of perfectly square buildings with the same bland colors and placements of perfectly square windows. Half of the city also seems to be under construction, with cranes sprouting out of every other building like mechanical yellow trees.

I stop by a random shop and pick up a chain/necklace with a pirate themed guitar pick hanging off it. Another item for the pile of souvenirs, one that fits in quite well with computer security conferences, and might become a permanent feature of my everyday attire. When I try to throw away the packaging I’m greeted with the usual trio of trash cans around Seattle: recycling, compost, and garbage. I still haven’t figured out which cups are supposed to go into compost and which ones go into recycling. Sometimes they call compost “food waste” too. Or maybe that’s different than compost. I dread throwing things away here.

Next, I wander over to the Experience Music Project, a music themed museum. My spontaneous timing works out well, and I enter the museum just in time to see a few songs performed by a local high school chorus. Once I get to the main part of the museum, I’m greeted by a breathtaking cone shaped tower of guitars that stretches two stories high. Over 500 instruments are attached to this tower of music. Many of them are wired to motors, which actually play them. Standing next to it you can barely hear anything, just the occasional flicks of motors and strums of an instrument. The microphones on the instruments pick up the sounds though, and putting on a pair of headphones lets you hear the amazing symphony of the tower. Not just a graveyard of instruments, but an electromechanical giant alive with music!

Sunday, I check out the Seattle aquarium and take a 1-hour boat ride around the Puget Sound. I’m greeted in the morning with solid grey skies and a constant trickle of rain. I’m utterly annoyed at walking in the rain with glasses after a mere ten minutes, and dart into the nearest store looking for a hat to help keep the rain off my glasses. Add a blue and bright green Seattle Seahawks baseball cap to my collection of souvenirs.  I shiver on the top deck of the boat, fighting the rain and cold to snap some pictures of the fog and cloud covered Seattle skyline. I picked the wrong day for a boat ride. Alas, it made for an authentic Seattle experience I suppose.

I finish the day by wandering around Pike Place and seeing the wall of gum. Yes, there’s a wall, covered with gum. Without a doubt the most chewed gum you will ever see in one spot.

Now I sit, typing away at the airport, waiting for my plane home. I’m not sure I like Seattle much. A nice place to visit, but I’m not sure I’d want to stay. Perhaps my mood has been dampened by the long week, grey skies, and delayed flights, but I’m looking forward to being back in Arizona.

Deep Thoughts from a Lone Traveler

31 12 2013

In front of me is the beach of Ventura, California. The sun is slowly setting to my right, somewhat by design, as I commanded my aching legs to briskly walk here again before the sky separated itself into beautiful colors one last time before I travel home. I was compelled to write, and sitting here watching a dog run up and down the sand in front of the sunset seemed the place to maximize inspiration.

The locals are perplexed about why I’m here. I’ve given up trying to explain it, because in the process I realized I don’t really know myself. “Nothing happens in Ventura. You should have kept driving North to Santa Barbara”, they tell me. “You should have went to Vegas”, a man named Matt tells me at a local dive bar. “Very random”, a girl at the beach tells me, as I play with her dog and she waits for someone she knows to finish surfing. I wanted a simple vacation. Biking, hiking, beaches and bars. A little town in California that I fell in love with when I spent a night here seemed the perfect place. I don’t know why I fell in love with it; love is like that, you can’t help who you fall in love with. The locals are perplexed nonetheless.

The first night I ran across a man playing the Star Wars imperial march on a guitar and a homeless beggar with a sign saying, “bad advice: $1”. I found a bar where they ask, “boot or barrel?”, and give you a glass shaped like a boot or a bigger one shaped like a barrel. I eventually ended up at a tiny dive bar with chandeliers made out of bras and only a single beer on tap. I like terrible bars. People bring their girlfriends to nice bars. They bring their families, they go with coworkers after work, they bring their friends. People don’t bring their friends to terrible bars. That’s what makes them the best. You can go there alone, and strike up conversations with the other patrons around you, also alone, not just going to a bar to talk to the same old people but with a different view.

I feel as if every day I’ve been here I’ve had a realization. Answered a question not by logical thinking, but more an emergence of an idea as my mind wandered without direction, the reins removed for a few days to give it the freedom to try and figure out life, instead of being commanded to focus its attention on some specific problem. My body hasn’t had as much of a vacation. I can’t help but feel it a separate entity, as it seems to have a will of its own, wanting to move slower, to stop sooner, to rest, but I compel it to carry on anyway. In the last two days I’ve told it to hike 8 miles across an island in the Pacific, bike untold distances across hilly terrain, carry my curious mind to the shops and museums of downtown, and drag my laptop to this beach where I sit.

And I feel compelled to write. The words starting to form in my mind before they had anything to spill into. I wonder if this is why great poets like Walt Whitman, men with soft hearts and hard bodies, speak so highly of the open road. Allons, the road is before us!

The first day’s realization was had walking down the beach. I was planning to take it easy that day, in preparation for a trek across an island the next, but before you know it I had walked for hours along the beach collecting pockets full of sea shells. I like walking to the end of places. The beach is seemingly never ending. Seeing others at the beach gave me the slight desire to stop and talk to someone. I eventually ran across a red headed girl in jeans and a sweater, sitting high up on the sand alone. It was on my way back and she had been there for a while. As I walked up, I hesitated a bit, trying to think of what to say, and then trying to muster the courage to say it. I envisioned myself boldly sitting beside her and saying, “Hello. You look like you could use some company. Or maybe I’m just saying that because I could.” I was nearly done hesitating, but the moment I decided to walk up, she got up and started making her way back toward the houses above the beach. I couldn’t help thinking, what would have happened if I had come back 10 seconds earlier? I would have met someone new. A local to the area, someone independent that enjoyed walking down to the beach and watching the sunset alone. I would have asked her about good places to go in the area, places to eat, maybe if I was feeling bold enough I’d ask her to have dinner with me. Maybe she would have said no, or maybe we’d just talk for a while and I’d never ask, but the conversation alone would surely have changed the events of my day. Maybe it could have changed the course of my life.

So I had the first realization. I’ve had it before, maybe it’s common sense, but in that moment I saw it with more clarity than ever before. The entire course of my life could have changed if I had been somewhere 10 seconds earlier or if I didn’t hesitate. All of our lives can be traced back to events that could have happened entirely differently if we had been somewhere at just a slightly different time. As I walked back up the beach, my mind searched for these moments and thought about how things would have been different, or at least realized they wouldn’t have been like they were.

I once walked into a bar called Mr. G’s and got a drink when I was only a bit over 21 years old. If I had sat two seats further down the bar, I never would have met a man that bought me a drink and told me about being an aircraft mechanic, a traveler, and a cyclist. If I had sat two seats away from where I did, I wouldn’t have had such a great conversation that I felt like going back to that bar. I wouldn’t have become a regular, learned how to play pool, met some of my neighbors, and felt such loss when that little bar closed down. I wouldn’t have started to develop the confidence to walk into a bar alone and be able to meet new people. I never would have made a lot of friends that I did. If I had sat two seats down at the bar that day, it’s possible I would have never taken a liking to bars at all, and traveling alone would have been a bit more dull. It’s quite likely I wouldn’t be sitting at this beach feeling the cool ocean breeze, listening to the sound of the water crashing against the sand, and feeling the night creep upon the beach with its ever so cold fingers.

I wouldn’t have the career I do now if it wasn’t for random chance that, as a ham radio operator, I turned on my radio late one night when I was supposed to be sleeping and heard some people joking about Morse code. I wouldn’t have found out that the people I heard actually had a youth ham radio club not too far from my house. I wouldn’t have met a bunch of intelligent electrical engineers, software engineers, and people that became role models. I wouldn’t have had anyone to ask how to make a webpage, and anyone to tell me to learn HTML to do it. I wouldn’t have had anyone to ask how to program, and I wouldn’t have been told by a random friend to learn the obscure language TCL. I later wouldn’t have gotten an internship at Emerson Network Power doing TCL scripting. If I hadn’t turned on a radio that one night, I probably wouldn’t have become a Software Engineer. And if my grandfather hadn’t shown me all of his equipment one day, I never would have become a ham radio operator. I need to thank him for that one day. I can really trace everything in my career back to a conversation I had with him when I was 12 years old, and I can’t possibly imagine how much life would be different if that hadn’t happened. It wasn’t a deep profound conversation either, it was just a random event, the everyday kind of thing you don’t realize will be the root of a branching tree that will change the course of your life forever.

The sun has set now, the light nearly entirely gone. The temperature is dropping, my body starts having a will of its own again, urging me to find shelter, find heat, to stop commanding my slowing fingers to keep typing on this cold beach. I need to buy a better jacket.

The second day I had a different realization, my mind satisfied for a while with answering the question of how I got here, done tracing events back to their sources. Off the coast of Ventura, over an hour by boat, there’s a chain of islands called the Channel Islands. The biggest is called Santa Cruz Island, and there’s a road that runs 7+ miles from a place called Scorpion Beach to a place called Smugglers Cove on the other side of the island. I hiked across it, against the will of my complaining legs.

The cold got to me. I’ve retreated to the hotel lobby. Interrupted by bodily concerns mid story, how rude. I could keep the action and thought to myself, but this is a story that’s still happening. It’s a story of past and present intertwined, it’s a story of thoughts without purpose. The lobby is actually somewhat crowded. Three girls sit across the way; I’m tempted to ask them what they’re doing for New Year’s, they look like they’re going out soon. It’s about 6 hours until 2013 ends in this little town. No idea what I’m doing tonight. They have glow stick crowns. Where did they get glowsticks? Ah, they’re leaving. Another moment of hesitation. A world that could have been. A world maybe not better, but different. An older couple is asking if there will be fireworks tonight. It appears the answer is no. Nothing much happens in Ventura. The lobby is busy though, maybe some overheard conversation will give me some tidbit of information about what to do tonight.

Where was I? Ah yes, the second realization. After hiking around the island, I was on the boat ride back. I brought a book, but I found it too difficult to pull myself away from living in the moment and retreat to a world of fiction, however unremarkable the moment was.

I moved to the very front of the boat and stood with some others at the edge, holding onto the railing, the cold wind blowing hard against us, watching the texture of the ocean ahead. The ocean has a texture to it; I never knew that. I’m not sure what causes it. Some parts were smooth, some parts were rough. Some sections were darker than others. Some glistened like a perfect undulating puddle of liquid glass, other sections were filled with floating plants and algae. The texture would change quickly. You could see it coming, see the boundary, and then see where it went back to the old texture.

There are chips and salsa? I think the lobby might have beer too. Explains why it’s so crowded here. Some guys just walked past to their rooms with a cooler of beer. I guess they gave up on the idea of going out, nothing interesting happens in Ventura after all. A girl also walked by with a painting. An older couple just gave a girl a glowstick halo. Her boyfriend just came back, looked confused, and the old man said “did we do something wrong?”. The man replied, “Nah! If anything you did something amazingly right!”. The older couple laughed and asked if he wanted a glowstick too. Now we know where the glowsticks came from.

So there I was, standing on the edge of the boat, watching the texture of the ocean. I saw another man, alone, doing the same. Noticing him made me notice how few solo travelers there were on this boat. The boat had dozens of couples, families, groups of friends. Only a handful of people going alone. Maybe only the two of us. Why do so many people insist on doing everything with others? It seems women are the worst. Most of them would never dream of going to a movie, going to a bar, or going on a road trip by themselves. I spent the rest of that day with the question in the back of my mind. I rejected ideas like people wanting someone nearby to talk to or genuine concerns for safety. You don’t talk in movies and they’re quite safe, yet people still fear going to them alone. I think it runs deeper than fear of being different by being there alone too.

Eventually that night the answer came to me. I’m not sure if it’s the correct answer, but it seems to have a sufficient amount of explanatory power. I think we’re conditioned to be dependent from birth. We spend the first 18 years of our lives only doing things with family and friends. Do kids ever go to the movies alone? Go to dinner alone? Go to a party where they don’t know people? No, parents would never allow that, and it really just doesn’t come up to begin with. From a combination of parental fear and social tradition, we spend our childhoods only doing things with family and family approved friends. Then we move out. We live on our own. We become independent. But most people still want to do things with others. To find friends to cling to, to find significant others to take with them. Maybe our entire culture of dating has actually come about from people not being independent enough to do anything on their own. Why should people be expected to drastically change the way they do things when they move out? It seems to explain the situation. It’s a hypothesis.

The third realization dawned on me today, when I picked a random direction and biked. No destination, just a direction, and eventually I had to give up on that direction and go somewhere else. There are mountains to the north I wanted to bike in, but I kept finding dead ends. One road north dead ended at a farm, another to a logging camp, and a third into a gated community that I nearly got stuck in. I tried a few codes at the gate, found out that 1234 worked, and went through. Then I realized it didn’t go anywhere interesting, and there was no keypad on the outside. The fence itself was the kind created from spear like sharpened pieces of metal. I found a way back eventually, but the ordeal turned me off on the direction of North. I decided South was a better idea.

At some point I stumbled across miles and miles of strawberries. Strawberries as far as the eye can see. Strawberry farms are actually rather bland scenery, especially after miles and miles of them. What world would have transpired if I had biked East? Too many variables to ever know, but I wouldn’t be writing about strawberries.

A girl with a black cowboy hat walks past on her way to her room. I can’t help but stop to smile. I think I like hotel lobbies. I never knew hotel lobbies were at all interesting. Imagine the world that would have been if I had brought a better jacket and was still at the beach.

On my way back, I saw an old farmer selling strawberries and avocados on the side of the road I was biking on. I decided to stop and buy a box of strawberries. Best decision of the day. They were the best strawberries I’ve ever had in my life. Maybe that experience removed the stigma in my mind regarding eating food from some guy selling it on the street. Oh imagine the world that may now be! Maybe one day as an old man I’ll die from eating a bad breakfast burrito from a man on the side of the road, and it’ll all trace back to choosing to bike South instead of East today.

When we were sailing out to Santa Cruz island, we ended up getting there nearly an hour late, because we kept stopping to observe things. First, we stopped to see some sea lions sunning themselves on a big orange buoy floating out in the ocean. Second, we saw a huge flock of birds, and after heading in that direction we found that they had gathered because a huge pack of dolphins was hunting in the area. We slowed down and let the dolphins follow along in the wake of the boat. Finally, when we had almost made it to the island, we came across four grey whales swimming past. We stopped and floated for quite a while, watching them come to the surface for a few moments. They would come up every ten seconds or so three or four times, taking a breath each time, and then dive deeper, staying under for around five minutes.

The next day, sitting on the side of a road in a completely unremarkable location, eating a box of strawberries, I had my third realization. I was comparing the experience on the boat with the experience of seeing dolphins and whales at Sea World. The experience on the boat was amazing, and the strength of that experience was partly because of the serendipitous nature of it. There hadn’t been a whale spotting for days, and the fact we ran across four of them like that was somewhat rare on this particular boat trip. We live in an on-demand culture. We want entertainment now. We want to see a whale now. We want to hear a story now. We want to micromanage and plan our lives. The best experiences are the ones that were unplanned though. The best experiences are the ones where you can’t guarantee you’ll see something, or that something will happen, but you place yourself in a situation where something could happen, and if that something does, it’s so much better than the canned on-demand version. Oh serendipity! How so many of the good things in our lives can be traced back to it.

It’s 5 hours until the end of 2013 now. I’ve apparently been typing for an hour. Nothing terribly interesting is happening in the lobby now. Someone took the chips and salsa away. The impulse to write has subsided, the beast well fed for now. Maybe these realizations can be combined to form something more coherent later. Something about how the tiniest events can change the courses of our lives, how the best things in life come when the courses of our lives are changed not by some on-demand desire being fulfilled, but by patience and putting ourselves into situations set up to make interesting things possible. And of how traveling alone, parental upbringings and social conditioning be damned, is the best kind of travel.

An older lady that works at the lobby restaurant asks if I’m okay, since I’ve been rather quiet. I smile and assure her I’m fine. As I get up to leave, she says, “no you don’t have to leave!”. I tell her that’s okay, I’m going to wander downtown to see if anything interesting is happening. Probably not, nothing interesting happens in Ventura. She asks me if I’ll be back at the lobby for breakfast tomorrow. I tell her yes, and she wishes me a safe night.

Thoughts on Open Source Software Development

28 10 2013

Over the last year I did a lot of work with some small Open Source projects (Nova, Honeyd, Neighbor-Cache Fingerprinter, Node.js Github Issue Bot…). I’ve also used Linux for all of my development and have used a lot of Open Source projects in that time. In some ways I’ve come out being more of an Open Source advocate than ever, and in other ways I’ve come out a bit jaded. What does Open Sourcing a project get you?

Good thing: free feedback on features and project direction

Unless you’re Steve Jobs, you probably don’t know what customers want. If you’re an engineer like most people reading this blog, you really probably don’t know what customers want. Open Sourcing the project can provide free user feedback. If you’re writing a business application, people will tell you they want pretty graphs generated for data that you never thought would be important. If you’re writing something with dependencies, users will tell you they want you to support multiple versions of potentially incompatible libraries that you would never have bothered with on your own.

If you’ve got an IRC channel, you’ll occasionally find a person who’s more than willing to chat about his or her opinions on the project and what features they think would be useful, in addition to the occasional issue tickets and emails.

The Open Source community can be your customers when you don’t have any real customers yet.

Good thing: free testing

Everyone who downloads and uses the project becomes someone that can help with the testing effort. All software has bugs, and if they’re annoying enough, people will report them. I’ve tried to make small contributions to bigger Open Source projects by reporting issues I’ve found in things like Node.js, Express, Backtrack, Gimp, cvv8… As a result, code becomes better tested and more stable.

Good thing: free marketing

Open Sourcing the project, at least in theory, means people will use it. They’ll talk about it to their friends, they’ll write articles and reviews about it, and if the project is actually useful it’ll start gaining popularity.

Misconception: you’ll get herds of developers willing to work on your project for free

I’ve reported dozens of bugs in big Open Source projects. I’ve modified the source code of Nmap and Apache for various data collection reasons. I’ve never submitted a patch bigger than about 3 lines of code to someone else’s Open Source project. That’s depressing to admit, but it’s the norm. People will file bug tickets, sometimes offer suggestions on features, but don’t expect a herd of developers working for free and flocking to your project to make it better. Even the most hardcore Open Source advocates have their own pet projects they would rather work on than fixing bugs or writing features into yours. Not only that, the effort to fix a bug in a foreign code base is significantly higher than the effort required for the original developer of the code to fix it. Why spend 3 hours setting up the development environment and trying to fix a bug, when you can file a ticket and the guy that wrote the code can probably fix it in 3 minutes?

There are large Open Source projects (Linux, Open Office, Apache…) that have a bunch of dedicated developers. They’re the exceptions. From what I’ve seen, most Open Source projects are run by one person or a small core group.

Misconception: the community will take over maintaining projects even if the core developer leaves

We used a Node.js library called Nowjs quite a lot. It’s a wonderful package that takes away all the tedium of manual AJAX work and makes Javascript RPC amazingly easy. It has over 2,000 followers on Github, and probably ten times that many people using it. One day the developer decided to abandon the project to work on other things; not that unusual for a pet project. Sadly, that was the death of the project. Github makes it trivial to clone the project: with a single press of a button someone could make a copy of the repository and take over maintaining and extending it. Dozens of people initially made forks of the project in order to do that, and dozens more made forks to fix bugs they found.

What’s left? A mess consisting of dozens of Github forks of the project, all with different bugs being fixed or features added, and the “official” project left abandoned in such a way no one can figure out which fork they should use. There’s no one left to merge in patches or to make project direction decisions. New users can’t figure out which fork to use and old users that actually write patches don’t know where to submit them anymore.

The developer of Nowjs moved on to develop Bridge-js. Then Bridge-js got abandoned too.

Bridge is still open source but the engineers behind Bridge have returned to school.

This pattern is almost an epidemic in Node.js. Someone creates a really amazing module, publishes it to Github and NPM, and then abandons it. Dozens of people try to take over the development, but in the end all fail (partly because Github has no way of specifying which fork of the project is “official”, and partly because of the general Open Source problem that there is no “official” fork). A dozen new people create the same basic module from scratch, most of which never become popular, and most of which also become abandoned… You get the picture.

If you sense a hint of frustration, you’d be right. On multiple occasions I had to dig through dozens of half abandoned projects trying to figure out which library I wanted to use to do something as common as SQL in Node.js.

The reason it’s an epidemic with Node is because no one is really sure what they want yet, and projects haven’t become popular enough to have momentum to continue after abandonment by their original authors. Hopefully at some point projects will acquire a big enough user base and enough developers that they can sustain themselves.

Fork is a four letter word

Even the biggest projects aren’t immune to the anarchy of forks. LibreOffice vs OpenOffice, GNU Emacs vs XEmacs, the list goes on. For the end user of these software suites, this is mainly annoying. I’ve switched between LibreOffice and OpenOffice more than once now, because I keep finding bugs in one but not the other.

Sometimes forks break out for ridiculous reasons. The popular IM client Pidgin was forked into the Carrier project. Why?

As of version 2.4 and later, the ability to manually resize the text input box of conversations has been altered—Pidgin now automatically resizes between a number of lines set in ‘Preferences’ and 50% of the window depending on how much is typed. Some users find this an annoyance rather than a feature and find this solution unacceptable. The inability to manually resize the input area eventually led to a fork, Carrier (originally Funpidgin). –

You can view the 300+ post argument about the issue on the original Pidgin ticket here.

The fact there’s no single “official” version of a project, and the sometimes trivial reasons that forks break out, cause a lot of inefficiency as bugs are fixed in some forks but not others. Eventually the code bases diverge so much that developers also have to pick one fork or another to work in.

Misconception: people outside of software development understand Open Source

I once heard someone ask in confusion how Open Source software can possibly be secure, because can’t anyone upload backdoors into it? They thought Open Source projects were like Wikipedia, where anyone could edit the code and their changes would be somehow instantly applied without review. After all, people keep telling them, “Open Source is great, anyone can modify the code!”.

A half dozen times, ranging from family members to customers and business people, I’ve had to try and explain how Open Source security products can work even though the code is available. If people can see the code, they can figure out how to break and bypass it, right? Ugh…

And don’t even get me started on the people that will start comparing Open Source to communism.

Concluding thoughts

I believe Open Source software has plenty of advantages, but I also think there’s a lot of hype surrounding it. The vast majority of Open Source projects are hacked together as hobby projects and abandoned shortly after. A study of Sourceforge projects showed that less than 17% of projects actually become successful; the other 83% are abandoned in the early stages. Most projects only have a few core developers and the project won’t outlive their interest in it. The community may submit occasional patches, but is unlikely to do serious feature development.

Why release Open Source software then? I think the answer often becomes, “why not?”. Plenty of developers write code in their spare time. They don’t plan to make money directly from it: selling software is hard. They do it to sharpen their saws. They do it for fun, self improvement, learning, future career opportunities, and to release something into the world that just might be useful and make people’s lives better. If you’re writing code with no intention of making money off it, there’s really no reason not to release it as Open Source.

What if you do want to make money off it? Well, why not dual licence your code and have a free Open Source trial version along with an Enterprise version? You’ll get the advantages of free marketing, testing, and user feedback. There is the risk that someone will come along and extend the Open Source trial version into a program that has all of the same features, or even more features, as your Enterprise version, and this is something that needs to be considered. However, as I mentioned before, it’s hard to find people that will take over the development and maintenance of Open Source projects. I think it’s more likely that someone will steal your idea and create their own implementation than bother with trying to extend your trial version code, but I don’t have any proof or evidence of that.

Jade Mixins (blocks, attributes, and more)

24 07 2013

Thought I’d note this down for anyone else having problems with Jade mixins. They’re fairly undocumented at the moment, and if you follow the documentation on the Jade github it will actually break with obscure errors, which took a lot of trial and error to figure out.

Note: I’m using Jade 0.32. You’ll probably need that version or newer.

What is a mixin?

A mixin is a simple way to reuse HTML snippets inside of Jade templates. Let’s go ahead and explain with an example. Suppose you have a page of quotes. Each quote is in its own section, with the author’s name in bold, and a like button that keeps track of the most liked quotes with some Javascript.

The basic syntax to define a mixin that takes in a couple of arguments is as follows,

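mixin section(quote, author)
  div.contentBox
    div.quoteText
      p #{quote} – said by
        b #{author}

    a.likeButton(onclick="quoteLiked()")
      img.buttonIconLeft(src="images/like.png")
      span.buttonSpan Like this quote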

Then to actually use the mixin, we use the (somewhat undocumented) “+” symbol as follows,

+section("Imagination is more important than knowledge", "Albert Einstein")
+section("Writing, to me, is simply thinking through my fingers.", "Isaac Asimov")

This will generate the HTML,

<div class="contentBox">
  <div class="quoteText">
    <p>Imagination is more important than knowledge – said by <b>Albert Einstein</b></p>
  </div>
  <a onclick="quoteLiked()" class="likeButton">
    <img src="images/like.png" class="buttonIconLeft"/>
    <span class="buttonSpan">Like this quote</span>
  </a>
</div>

Mixin arguments can be objects too

You don’t have to just pass in strings to the mixins, but you can use any Javascript objects you passed into the render call or that you created earlier in the template. This can lead to some useful mixins like this one to convert a Javascript array to a select dropdown list.

mixin listData(selectId, options)
  select(id=selectId)
    each obj in options
      option(value="#{obj}") #{obj}

- var countries = ['UK', 'USA', 'CANADA', 'MEXICO']
+listData("countrySelect", countries)

What are block mixins?

The need for block mixins came up when I had my page divided into sections, with each section having a title and a few containing divs, as well as a help icon.

mixin headerWithHelp(title, helpAnchor)
  span #{title}
  a(href="help##{helpAnchor}", target="_blank")
    img.helpIcon(src="/images/help.png", style="float: right;")
  div.cardOuter(style="display: inline-block")
    block

The important thing to note here is the “block” keyword at the end of the mixin definition. This makes the indented block after the mixin call get included at that location, so you can do things like,

+headerWithHelp("Test Section", "testHelp")
  p All of my content can go here now

It’s important to note that as of right now, block mixins DO NOT WORK if you use the mixin keyword to call the mixin instead of the “+” symbol shorthand (which is all I showed you in this tutorial). I believe this is a bug; you can track the status of it on the ticket I made here. Trying to call the mixin with the mixin keyword instead of + when using a block after it will result in “Error at new JS_Parse_Error” and a stack trace.

What are mixin attributes?

A mixin can let you modify the attributes of one of the tags inside it when you set attributes on the mixin call. For example, suppose that you have a mixin to define a section of the page with a header that you use a lot, but you often change some of the style attributes like the width and display type.

mixin header(title)
  div(attributes)
    h3.sectionHeader #{title}
    div.content(style="display: block")

Notice the attributes keyword? Now you can use the mixin like so,

+header("Section Title")(style='text-align:center; display: block; width: 500px;')

And the style attribute will now be applied to the container div.


A final comment is that you may want to have a mixins folder inside views for the sake of organization. Then in your other jade files, you can just include the mixins you need.

include mixins/headers.jade

include mixins/quotes.jade

Neighbor Cache Fingerprinter: Operating System Version Detection with ARP

30 12 2012

I’ve released the first prototype (written in C++) of an Open Source tool called the Neighbor Cache Fingerprinter on Github today. A few months ago, I was watching the output of a lightweight honeypot in a Wireshark capture and noticed that although it had the capability to fool nmap’s operating system scanner into thinking it was a certain operating system, there were subtle differences in the ARP behavior that could be detected. This gave me the idea to explore the possibility of doing OS version detection with nothing except ARP. The holidays provided a perfect time to destroy my sleep schedule and get some work done on this little side project (see commit punchcard, note best work done Sunday at 2:00am).


The tool is currently capable of uniquely identifying the following operating systems,

Windows 7
Windows XP (fingerprint from Service Pack 3)
Linux 3.x (fingerprint from Ubuntu 12.10)
Linux 2.6 (fingerprint from Century Link Q1000 DSL Router)
Linux 2.6 (newer than 2.6.24) (fingerprint from Ubuntu 8.04)
Linux 2.6 (older than 2.6.24) (fingerprint from Knoppix 5)
Linux 2.4 (fingerprint from Damn Small Linux 4.4.10)
Android 4.0.4
Android 3.2
Minix 3.2
ReactOS 0.3.13

More operating systems should follow as I get around to spinning up more installs on Virtual Machines and adding to the fingerprints file. Although it’s still a fairly early prototype, I believe it’s already a useful enough tool that it can be beneficial, so install it and let me know via the Github issues page if you find any bugs. There’s very little existing research on this; arp-fingerprint (a perl script that uses arp-scan) is the only thing remotely close, and it attempts to identify the OS only by looking at responses to ARP REQUEST packets. The Neighbor Cache Fingerprinter focuses on sending different types of ARP REPLY packets as well as analyzing several other behavioral quirks of ARP discussed in the notes below.

The main advantage of the Neighbor Cache Fingerprinter versus an Nmap OS scan is that the tool can do OS version detection on a machine that has all closed ports. The next big feature I’m working on is expanding the probe types to allow it to work on machines that respond to ICMP pings, OR have open TCP ports, OR have closed TCP ports, OR have closed UDP ports. The tool just needs the ability to elicit a reply from the target being scanned, and a pong, TCP/RST, TCP/ACK, or ICMP unreachable message will all provide that.

The following are my notes taken from the README file,


What is the Neighbor Cache? The Neighbor Cache is an operating system’s mapping of network addresses to link layer addresses maintained and updated via the protocol ARP (Address Resolution Protocol) in IPv4 or NDP (Neighbor Discovery Protocol) in IPv6. The neighbor cache can be as simple as a lookup table updated every time an ARP or NDP reply is seen, to something as complex as a cache that has multiple timeout values for each entry, which are updated based on positive feedback from higher level protocols and usage characteristics of that entry by the operating system’s applications, along with restrictions on malformed or unsolicited update packets.
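
To make the simple end of that spectrum concrete, here is a minimal Python sketch of a neighbor cache as a plain lookup table with a single timeout per entry. The class name, method names, and the 30 second default are assumptions chosen for illustration; they are not taken from any real network stack or from the tool itself.

```python
import time


class NeighborCache:
    """Toy neighbor cache: IP address -> (MAC address, time last updated)."""

    def __init__(self, timeout_seconds=30.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds  # assumed value, not from any OS
        self.clock = clock                      # injectable so time can be faked
        self.entries = {}

    def on_reply(self, ip, mac):
        """Simplest possible policy: every reply seen overwrites the entry."""
        self.entries[ip] = (mac, self.clock())

    def lookup(self, ip):
        """Return the cached MAC, or None if missing or stale.

        A None result is the point where a real stack would send a new
        ARP request.
        """
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, updated_at = entry
        if self.clock() - updated_at > self.timeout_seconds:
            del self.entries[ip]
            return None
        return mac
```

Real implementations differ in exactly the places this sketch glosses over (when an entry may be created, what refreshes it, when it goes stale), and those differences are what the fingerprints try to capture.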

This tool provides a mechanism for remote operating system detection by extrapolating characteristics of the target system’s underlying Neighbor Cache and general ARP behavior. Given the non-existence of any standard specification for how the Neighbor Cache should behave, there are several differences in operating system network stack implementations that can be used for unique identification.

Traditional operating system fingerprinting tools such as Nmap and Xprobe2 rely on creating fingerprints from higher level protocols such as TCP, UDP, and ICMP. The downside of these tools is that they usually require open TCP ports and responses to ICMP probes. This tool works by sending a TCP SYN packet to a port which can be either open or closed. The target machine will either respond with a SYN/ACK packet or an RST packet, but either way it must first discover the MAC address to send the reply to via queries to the ARP Neighbor Cache. This allows for fingerprinting on target machines that have nothing but closed TCP ports and give no ICMP responses.

The main disadvantage of this tool versus traditional fingerprinting is that because it’s based on a Layer 2 protocol instead of a Layer 3 protocol, the target machine that is being tested must reside on the same Ethernet broadcast domain (usually the same physical network). It also has the disadvantage of being fairly slow compared to other OS scanners (a scan can take ~5 minutes).

Fingerprint Technique: Number of ARP Requests

When an operating system performs an ARP query it will often resend the request multiple times in case the request or the reply was lost. A simple count of the number of requests that are sent can provide a fingerprint feature. In addition, there can be differences in the number of responses to open and closed ports due to multiple retries on the higher level protocols, and attempting to send a probe multiple times can result in different numbers of ARP requests (Android will initially send 2 ARP requests, but the second time it will only send 1).

For example,

Windows XP: Sends 1 request

Windows 7: Sends 3 if probe to closed port (9 if probe to open port)

Linux: Sends 3 requests

Android 3: Sends 2 requests the first probe, then 1 request after

A minimum and maximum number of requests seen is recorded in the fingerprint.
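
As a rough illustration of how that min/max feature could be derived, the Python sketch below counts the ARP requests seen after each probe and keeps the extremes. The function name and the input format (one list of captured request timestamps per probe) are hypothetical and not taken from the tool’s source.

```python
def request_count_feature(requests_per_probe):
    """Return (min, max) number of ARP requests seen across all probes.

    requests_per_probe: one list of ARP request timestamps per probe sent.
    """
    counts = [len(requests) for requests in requests_per_probe]
    if not counts:
        raise ValueError("need at least one probe")
    return min(counts), max(counts)


# Android 3 as described above: 2 requests for the first probe, then 1.
android_like = [[0.0, 1.0], [5.0]]
print(request_count_feature(android_like))  # (1, 2)
```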

Fingerprint Technique: Timing of ARP Request Retries

On hosts that retry ARP requests, the timing values can be used to deduce more information. Linux hosts generally have a constant retry time of 1 second, while Windows hosts generally back off on the timing, sending their first retry after between 500ms and 1s, and their second retry after 1 second.

The fingerprint contains the minimum time difference between requests seen, maximum time difference, and a boolean value indicating if the time differences are constant or changing.
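
A small Python sketch of how those three values might be computed from the timestamps of the captured requests follows. The function name and the jitter tolerance are assumptions for illustration, not values from the tool.

```python
def retry_timing_feature(timestamps, tolerance=0.05):
    """Return (min_gap, max_gap, is_constant) for ARP request times in seconds.

    tolerance is an assumed allowance for capture jitter when deciding
    whether the retry interval is constant.
    """
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if not gaps:
        return None  # a single request has no retry timing to measure
    min_gap, max_gap = min(gaps), max(gaps)
    return min_gap, max_gap, (max_gap - min_gap) <= tolerance


linux_like = retry_timing_feature([0.0, 1.0, 2.0])      # constant 1 second retries
windows_like = retry_timing_feature([0.0, 0.75, 1.75])  # backs off after first retry
```

With these inputs the Linux-like capture is reported as constant and the Windows-like one as changing, which is the distinction the boolean in the fingerprint is meant to record.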

Fingerprint Technique: Time before cache entry expires

After a proper request/reply ARP exchange, the Neighbor Cache gets an entry put in it for the IP address and for a certain amount of time communication will continue without additional ARP requests. At some point, the operating system will decide the entry in the cache is stale and make an attempt to update it by sending a new ARP request.

To test this a SYN packet is sent, an ARP exchange happens, and then SYN packets are sent once per second until another ARP request is seen.

Operating system response examples,

Windows XP : Timeout after 10 minutes (if referred to)

Windows 7/Vista/Server 2008 : Timeout between 15 seconds and 45 seconds

FreeBSD : Timeout after 20 minutes

Linux : Timeout usually around 30 seconds

More research needs to be done on the best way to capture the values of delay_first_probe_time and differences between stale timing and actually falling out of the table and being gc’ed in Linux.

Waiting 20 minutes to finish the OS scan is unfeasible in most cases, so the fingerprinting mode only waits about 60 seconds. This may be changed later to make it easier to detect an oddity in older Windows targets where cache entries expire faster if they aren’t used (TODO).
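
The measurement loop described in this section can be sketched in Python as below. The two callbacks stand in for the real packet sending and capture code, and all names are hypothetical; only the one probe per second rhythm and the roughly 60 second cap come from the notes above.

```python
import time


def measure_cache_timeout(send_probe, arp_request_seen, max_wait=60, sleep=time.sleep):
    """Probe once per second until the target sends a fresh ARP request.

    send_probe: callable that sends one SYN probe (hypothetical hook).
    arp_request_seen: callable returning True once a new ARP request from
        the target has been captured (hypothetical hook).
    Returns the number of seconds waited, or None if max_wait was reached.
    """
    for elapsed in range(1, max_wait + 1):
        sleep(1)
        send_probe()
        if arp_request_seen():
            return elapsed
    return None
```

A result of None simply means the entry outlived the scan window, which is what would be expected for the 10 and 20 minute timeouts listed above.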

Fingerprint Technique: Response to Gratuitous ARP Replies

A gratuitous or unsolicited ARP reply is an ARP reply for which there was no request. The usual use case for this is notification of machines on the network of IP changes or systems coming online. The problem for implementers is that several of the fields in the ARP packet no longer make much sense.

What is the Target Protocol Address for the ARP packet? The broadcast address? Zero? The specification surprisingly says neither: the Target Protocol Address should be the same IP address as the Sender Protocol Address.

When there’s no specific target for the ARP packet, the Target Hardware Address also becomes a confusing field. The specification says its value shouldn’t matter, but should be set to zero. However, most implementations will use the Ethernet broadcast address of FF:FF:FF:FF:FF:FF instead, because internally they have some function to send an ARP reply that only takes one argument for the destination MAC address (which is put in both the Ethernet frame destination and the ARP packet’s Target Hardware Address). We can also experiment with setting the Target Hardware Address to the same thing as the Sender Hardware Address (the same method the spec says to use for the Target Protocol field).

Even the ARP opcode becomes confusing in the case of unsolicited ARP packets. Is it a “request” for other machines to update their cache? Or is it a “reply”, even though it isn’t a reply to anything? Most operating systems will update their cache no matter the opcode.
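
To show where these fields sit in the packet, here is a minimal Python sketch that packs the 28 byte ARP payload for IPv4 over Ethernet. The helper names and the example addresses are made up for illustration, and the sketch only builds the bytes; actually sending them needs a raw socket or a packet library, which is left out.

```python
import socket
import struct

ARP_REQUEST, ARP_REPLY = 1, 2


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp(opcode, sender_mac, sender_ip, target_mac, target_ip):
    """Pack the ARP payload: fixed header, then SHA, SPA, THA, TPA."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6,        # hardware address length
        4,        # protocol address length
        opcode,
        mac_bytes(sender_mac),
        socket.inet_aton(sender_ip),
        mac_bytes(target_mac),
        socket.inet_aton(target_ip),
    )


# Gratuitous reply as the text describes the spec: TPA equals SPA, THA zeroed.
spec_style = build_arp(ARP_REPLY, "00:11:22:33:44:55", "192.168.0.42",
                       "00:00:00:00:00:00", "192.168.0.42")

# Common variant: broadcast MAC in the Target Hardware Address instead.
bcast_style = build_arp(ARP_REPLY, "00:11:22:33:44:55", "192.168.0.42",
                        "ff:ff:ff:ff:ff:ff", "192.168.0.42")
```

Swapping the opcode constant or the two target arguments is all it takes to produce the other variations discussed here.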

There are several variations of the gratuitous ARP packet that can be generated by changing the following fields,

Ethernet Frame Destination Address : Bcast or the MAC of our target

ARP Target Hardware Address : 0, bcast, or the MAC of our target

ARP Target Protocol Address : 0 or the IP address of our target

This results in 36 different gratuitous packet permutations.

Most operating systems have the interesting behavior that they will ignore gratuitous ARP packets if the sender is not in the Neighbor Cache already, but if the sender is in the Neighbor Cache, they will update the MAC address, and in some operating systems also update the timeouts.

The following sequence shows the testing technique for this feature,

Send ARP packet that is known to update most caches with srcmac = srcMacArg

Send gratuitous ARP packet that is currently being tested with srcmac = srcMacArg + 1

Send probe packet with a source MAC address of srcMacArg in the Ethernet frame

The first packet attempts to get the cache entry into a known state: up to date and storing the source MAC address that is our default or the command line argument --srcmac. The following ARP packet is the actual probe permutation that’s being tested.

If the reply to the probe packet is to (srcMacArg + 1), then we know the gratuitous packet successfully updated the cache entry. If the reply to the probe is just (srcMacArg), then we know the cache was not updated and still contains the old value.

The reason the Ethernet frame source MAC address in the probe is set to the original srcMacArg is to ensure the target isn’t just replying to the MAC address it sees packets from and is really pulling the entry out of ARP.
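
The decision at the end of that sequence can be sketched in Python as follows. Both function names are hypothetical, and treating “srcMacArg + 1” as a 48 bit integer increment is an assumption about how the tool derives the second address.

```python
def next_mac(mac):
    """Return mac + 1, treating the address as a 48-bit integer (assumed)."""
    value = (int(mac.replace(":", ""), 16) + 1) % (1 << 48)
    raw = f"{value:012x}"
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


def gratuitous_packet_accepted(src_mac, reply_dst_mac):
    """Interpret the destination MAC of the reply to the final probe.

    True  -> reply went to src_mac + 1, so the gratuitous packet updated the cache.
    False -> reply went to src_mac, so the cache still holds the old value.
    None  -> reply went somewhere unexpected.
    """
    reply = reply_dst_mac.lower()
    if reply == next_mac(src_mac).lower():
        return True
    if reply == src_mac.lower():
        return False
    return None
```

Running this check once per packet variation gives one boolean per variation, which is a natural shape for this part of a fingerprint.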

Sometimes the Neighbor Cache entry will get into a state that makes it ignore gratuitous packets even though, given a normal state, it would accept them and update the entry. This can result in some timing related result changes. For now I haven’t made an attempt to fix this as it’s actually useful as a fingerprinting method in itself.

Fingerprint Technique: Can we get put into the cache with a gratuitous packet?

As mentioned in the last section, most operating systems won’t add a new entry to the cache given a gratuitous ARP packet, but they will update existing entries. One of the few differences between Windows XP and FreeBSD’s fingerprint is that we can place an entry in the cache by sending a certain gratuitous packet to a FreeBSD machine, and test if it was in the cache by seeing if a probe gets a response or not.

Fingerprint Technique: ARP Flood Prevention (Ignored rapid ARP replies)

RFC1122 (Requirements for Internet Hosts) states,

“A mechanism to prevent ARP flooding (repeatedly sending an ARP Request for the same IP address, at a high rate) MUST be included. The recommended maximum rate is 1 per second per destination.”

Linux will not only ignore duplicate REQUEST packets within a certain time, but also duplicate REPLY packets. We can test this by sending a set of unsolicited ARP replies within a short time range, with a different MAC address reported by each reply. Sending a probe afterwards will reveal, through the destination MAC address of the probe response, whether the host replies to the first MAC address we ARPed or the last, indicating whether it ignored the later rapid replies.
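
As a sketch of how the result of that test might be classified, the hypothetical Python helper below compares the destination MAC of the probe response with the first and last MAC addresses that were advertised. The function name and the result labels are invented for illustration.

```python
def classify_flood_response(advertised_macs, reply_dst_mac):
    """Decide whether the target ignored the later rapid ARP replies.

    advertised_macs: MACs sent in the rapid unsolicited replies, in send order.
    reply_dst_mac: destination MAC of the target's response to the probe.
    """
    if len(advertised_macs) < 2:
        raise ValueError("need at least two rapid replies to tell them apart")
    reply = reply_dst_mac.lower()
    if reply == advertised_macs[0].lower():
        return "ignored_rapid_replies"   # the first MAC stuck, later ones were dropped
    if reply == advertised_macs[-1].lower():
        return "accepted_rapid_replies"  # the last MAC won, no flood protection seen
    return "inconclusive"
```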

Fingerprint Technique: Correct Reply to RFC5227 ARP Probe

This test sends an “ARP Probe” as defined by RFC 5227 (IPv4 Address Conflict Detection) and checks the response to see if it conforms to the specification. The point of the ARP Probe is to check if an IP address is being used without the risk of accidentally causing someone’s ARP cache to update with your own MAC address when it sees your query. Given that you’re likely trying to tell if an IP address is being used because you want to claim it, you likely don’t have an IP address of your own yet, so the Sender Protocol Address field is set to 0 in the ARP REQUEST.

The RFC specifies the response as,

“(the probed host) MAY elect to attempt to defend its address by … broadcasting one single ARP Announcement, giving its own IP and hardware addresses as the sender addresses of the ARP, with the ‘target IP address’ set to its own IP address, and the ‘target hardware address’ set to all zeroes.”

But any Linux kernel older than 2.6.24 and some other operating systems will respond incorrectly, with a packet that has tpa == spa and tha == sha. Checking if tpa == 0 has proven sufficient for a boolean fingerprint feature.


Feedback from higher protocols extending timeout values

Linux has the ability to extend timeout values if there’s positive feedback from higher level protocols, such as a 3 way TCP handshake. Need to write tests for this and do some source diving in the kernel to see what else counts besides a 3 way handshake for positive feedback.


Infer Neighbor Cache size by flooding to cause entry dumping

Can we fill the ARP table with garbage entries in order for it to start dumping old ones? Can we reliably use this to infer the table size, even with Linux’s near random cache garbage collection rules? Can we do this on class A networks, or do we really need class B network subnets in order to make this a viable test?