<p>Paul Smith · HealthCare.gov, DNC, EveryBlock · <a href="https://pauladamsmith.com/">pauladamsmith.com</a></p>
<h1><a href="https://pauladamsmith.com/blog/2023/10/the-10-year-anniversary-of-the-healthcare.gov-rescue.html">The 10 Year Anniversary of the HealthCare.gov Rescue</a></h1>
<p><em>October 18, 2023</em></p>
<p>Ten years ago today, on Friday, October 18, 2013, the effort to <a href="/blog/2014/03/fixing-healthcare.gov.html">fix
HealthCare.gov</a> began in earnest. At
7:15 A.M. Eastern time, a small group assembled next to the entrance to the West
Wing of the White House. The group included Todd Park, Brian Holcomb, Gabriel Burt,
Ryan Panchadsaram, Greg Gershman, and myself. Later in the day we were joined by
Mikey Dickerson via a long-running speakerphone call. Some of us were from
outside of government (Gabe, Brian, Mikey, and me), and the others had jobs in
government at the time (Todd, Ryan, and Greg). What we all had in common was
that we were experienced technologists, having been at startups or at large
established technology organizations.</p>
<p>The members of our group were selected by Todd, working with Greg and Ryan and
others behind the scenes to identify people who could help because they had that
kind of technology experience. HealthCare.gov, having launched days earlier on
October 1, 2013, wasn't working. From the vantage point of the top political
leadership in the country, it was clear outside help was needed. Todd, the CTO
of the United States at the time, which is a position in the White House, was
tapped to help fix it. His plan was to provide reinforcements to complement the
team of government employees and contractors that had built HealthCare.gov and
were in the midst of operating it. We were to be a small team, very discreet.
Todd was our leader. It was already a high-pressure, stressful situation, so
insertion into that context meant melding in, not blowing things up. It was to
be a low-key mission of information gathering and assessment, not the cavalry
storming in. Todd told us the next 30 days would be critical. The goals were to
enroll 7 million people by March 31, 2014, the end of the period known as open
enrollment. When the media was eventually informed of our existence by the White
House, we were referred to as "the tech surge".</p>
<p>When Todd called me two days prior on October 16 to ask me if I would join the
effort, he didn't have to explain the stakes. I understood what it meant for the
website to work. I immediately agreed, and put on hold what I was doing, which
happened to be raising money for a startup I had founded. I took Todd's call
while walking the grounds of the Palace of Fine Arts in San Francisco, having
met with VCs earlier in the day. I was living in Baltimore at the time with my
wife and toddler daughter. I flew back home right away, and before I knew it I
was taking the earliest morning train I could from Camden Yards to DC's Union
Station. I thought I had timed it right, but still wound up running across
Pennsylvania Avenue so as not to be late.</p>
<figure>
<img src="/images/hc.gov-10-years/whitehouse.jpg"
alt="Photo of the White House">
<figcaption>Photo by me as I hustled. Metadata says taken at 7:11 A.M., so must have just made it</figcaption>
</figure>
<p>We couldn't have started any sooner even if we had wanted to. The federal
government had shut down on October 1, the same day HealthCare.gov had launched.
The shutdown prevented anyone from outside the main team working on
HealthCare.gov from coming in to help. So while days passed with the news
dominated by the twin stories of the shutdown and the slow-moving catastrophe of
the launch, a vacuum of information formed, as well as a surplus of speculation
and worry. The White House couldn't figure out what was wrong with it, and the
implications of it failing were troubling. The Affordable Care Act, the
signature domestic policy achievement of President Obama's tenure, had gone into
effect, and the website was to be the main vehicle for delivering the benefits
of the law to millions of people. If they didn't know what was wrong with
HealthCare.gov, other than that it was manifestly not working, plain for
everyone to see, and therefore might not be able to fix it, what would that
mean for the fate of health care reform? Fortunately, the shutdown ended on
October 17, which meant we could get to work and finally understand what was
going on.</p>
<p>Some of us already knew each other, but everyone was new to someone else. We
introduced ourselves, headed inside, and after breakfast in the Navy Mess,
headed upstairs to the Chief of Staff's office. Denis McDonough shook our hands
as we somewhat awkwardly stood in a line. He asked us directly, "Can you fix
it?" In our nervous energy, I remember some of us blurting out, "yes". We had
confidence, but we also were eager to dive in, learn as much as we could, and
get going.</p>
<p>A van was procured, along with a driver. We piled in and headed across the Mall
to the Hubert H. Humphrey Building, headquarters of the US Department of Health
& Human Services (HHS). Entering the lobby, we passed Secretary Kathleen
Sebelius. She wasn't there for us, she was welcoming federal employees back to
work after the shutdown. Our meeting there was with Marilyn Tavenner and her
staff. Tavenner was the Administrator of the Centers for Medicare & Medicaid
Services (CMS), the largest organization in HHS, and the owner of
HealthCare.gov.</p>
<figure>
<img src="/images/hc.gov-10-years/visitor-badge.jpg"
alt="Photo of my faded visitor badge to HHS">
<figcaption>My thermal printed visitor badge photo is pretty faded after 10 years</figcaption>
</figure>
<p>Our meeting with Tavenner and her team yielded our first details of the system
that was HealthCare.gov. We learned how it was structured and what its main
functional components were. We heard first hand what CMS leadership knew about
what was wrong, or at least where in the system they could see that things were
not working. It was our first sense of the size and complexity of the site,
both in terms of functionality, but also in terms of the number of contractors,
sub-components, and business rules such as for eligibility. But remember, these
were not technical people - they were health policy experts and administrators.
Most of the things they were reporting to us were business metrics about the
site, descriptions of high-level performance. It was helpful to hear this
perspective, and indeed many of these metrics would drive our later work. But
this would not yet be the time to learn about the technical challenges the team
was facing.</p>
<p>Our morning continued back in the van, leaving DC for CMS headquarters in
Woodlawn, Maryland, just outside Baltimore, a 45-minute drive. Here we met with
the leadership of HealthCare.gov itself, a group of people including Michelle
Snyder, CMS's COO, Henry Chao, one of its main architects, and Dave Nelson, a
director with a telecom background, who was being elevated and would oversee
much of the rescue work from CMS's perspective. The sketchy picture of
HealthCare.gov we had was coming into greater relief. We learned about
deployment challenges and bottlenecks, more about where specifically users were
getting "stuck" using the site, and we started to hear bits and pieces about the
particular technologies being used, including something called MarkLogic, an XML
database, which was new to us. We even started to get some details about the
deployment architecture and the types of servers involved. Again, the theme of
complexity stood out. But we were also still talking at a fairly high level. To
really understand what was wrong with HealthCare.gov, we'd have to move on.</p>
<p>The afternoon was spent a few miles down the road back toward DC in Columbia,
Maryland, at something called the XOC, or Exchange Operations Center. This was
to be the mission-control-style hub of operations for HealthCare.gov. We found a
room staffed with a few contractors and some CMS employees. (The XOC eventually
would be the site of so much activity during the rescue that a scheme to keep
people out was needed.) Here was where, away from the leadership-filled rooms,
we finally heard from technologists who were directly working on the site. What
we heard was troubling. There was a lack of confidence in making changes to the
source code. A complex code generation process governed much of the site and
produced huge volumes of code, requiring careful coordination between teams.
Testing was largely a manual process. Releases and deployments required lengthy
overnight downtimes and often failed, requiring equally lengthy rollbacks. The
core data layer lacked a schema. Provisioning hosting resources was slow and not
automated. Critically, there was a pervasive lack of monitoring of the site. No
APM, no bog-standard metrics about CPU, memory, disk, network, etc., that could
be viewed by anyone on the team. By this point, the looks we exchanged amongst
ourselves betrayed our fears that the situation was much worse than we initially
expected. I carried with me that day a Moleskine notebook that I furiously
scribbled in so as not to forget what I was hearing. You can see my panic
start to rise, reflected in my writing, as the day wore on.</p>
<figure>
<img src="/images/hc.gov-10-years/no-metrics.jpg"
alt="Photo of notebook writing that says 'No metrics!!'">
<figcaption>Incredulous that they didn't have monitoring</figcaption>
</figure>
<p>With that dose of reality and the evening setting in, we collected ourselves
back in the van and set off for our final field trip of the day, to suburban
Virginia for the offices of CGI, the prime contractor of HealthCare.gov. In
Herndon, we were greeted by many people on their leadership team and the
technical teams who had built the site, even though it was getting on past
business hours at this point. This was as much an interview of us by them as it
was a chance for us to ask questions. We had a brief opportunity to ingratiate
ourselves and win their trust. We did that in part by showing our eagerness to
dig in on some of the things we had learned earlier in the day, proving our
engineering bona fides (specifically with high-traffic websites), and
emphasizing that we were not there for any other reason than to help. This was a
team on edge, under the gun and exhausted. We needed them to succeed, and they
weren't going to work with us if they thought we were there to cast blame.</p>
<p>We asked many questions, but they mostly boiled down to, what's wrong with the
site, and how do you know what's wrong? Show us where in the system this or that
component isn't performing the way you expected. And they and CMS mostly
couldn't do that. They had daily reporting and analytics that produced those
high level business metrics. But again there was that lack of monitoring of the
system itself, real-time under load. So we focused on that.</p>
<p>It was getting late. Could we throw a Hail Mary? Walk out of that building
leaving behind something of tangible value, something promising they could build
on? We said, there's lots of APM-style monitoring services, but we're familiar
with New Relic. Could we install it on a portion of the servers? Yes, we know
there's a complicated and fragile release process, but if we bypassed that and
just directly connected to some of the machines, we could install the agent on
them and be receiving metrics almost immediately. Glances were exchanged. A CMS
leader in the room made a call on their cell phone - we actually have some
New Relic licenses already, we can use those. There was also some hesitancy
about whether this kind of extraordinary, out-of-the-norm request would be approved -
clearly, even during this period of turmoil, all the stakeholders stuck to the
regular release script. The CMS leader nodded their approval. A small group
assembled to marshal the change through. Many folks had stuck around, even
though it was nearing midnight. Then on a flatscreen in the conference room, we
pulled up the New Relic admin, and within moments, the "red shard" (the subset of
servers we had chosen for this test) was reporting in. And there it was - we
could see clearly the spike in request latency, even at this late hour, that
indicated a struggling website, along with error rate, requests per minute, and
other critical details that had basically been invisible to the team to that
point. Imagine a hospital ICU ward without an EKG monitor. Now that they knew
exactly in what ways it was bad, they could start to correlate them with the
business metrics and other aspects of the site that needed improving. They could
then prioritize the fixes and actions that would yield the biggest improvements.</p>
<figure>
<img src="/images/hc.gov-10-years/rooseveltroom.jpg"
alt="Photo of the Roosevelt Room in the White House">
<figcaption>Photo by me. Taken at 1:50 A.M. on Saturday, October 19, 2013.</figcaption>
</figure>
<p>That would come later. For now, exhausted, well past midnight, we left Herndon
and rode the van back to DC. At roughly 2 A.M. we reconvened in the Roosevelt
Room in a completely silent White House for a quick debrief. As we reflected on
what we had just experienced, another thought was settling in over the group -
that this was obviously not over by any stretch, that none of us were going home
any time soon, that the challenge was much larger in scope than we had imagined,
that any notion we may have had at the outset of possibly just offering some
suggested fixes and moving on was in retrospect hilariously naive, that this was
all we were going to be doing for the foreseeable future until the site was
turned around. Indeed, we all managed to find a few hours of sleep in a nearby
hotel and then were right back at it in the morning, heading straight out to
Herndon.</p>
<p>This was just day one, a roughly 18-hour day, and it certainly wasn't the last
such marathon. Over the next two-and-a-half months until the end of December,
the tech surge expanded and took on new team members, experienced many remarkable
events and surprises, and ultimately, successfully helped turn HealthCare.gov
around. Millions enrolled in health care coverage that year, many for the
first time. I hope to tell more stories of how that happened over the next
few weeks and months.</p>
<iframe src="https://www.google.com/maps/d/u/0/embed?mid=1fMkJP6sSr7p8hCDUXwNbX6cnCsNQuqg&ehbc=2E312F&noprof=1" width="770" height="480"></iframe>
<h1><a href="https://pauladamsmith.com/blog/2023/07/oppenheimer.html">Oppenheimer</a></h1>
<p><em>July 24, 2023</em></p>
<p>I saw “<a href="https://letterboxd.com/film/oppenheimer-2023/">Oppenheimer</a>” this weekend at the <a href="https://musicboxtheatre.com">Music Box</a> in Chicago, in its theater capable of projecting 70mm film. The movie is a huge achievement, a complex piece of art that nonetheless tells an efficient story in spite of a 3-hour running time. Here are a few thoughts on the visual storytelling that was presented on screen.</p>
<p>The central question “Oppenheimer” asks is, to whom or what is J. Robert Oppenheimer bound? Director Christopher Nolan spares us a hoary answer like “the truth!”, but early on we are told it is to theory and mathematics, wherever they may lead. They first lead Oppenheimer out of the lab and into the arms of pre-war continental quantum physicists, who are forging a nascent field of inquiry that our hero immediately grasps and excels at. He becomes friends with physicist Isidor Rabi while in Germany, bonding over their shared New York Jewish backgrounds. We see Rabi nurture Oppenheimer several times, offering food from his pocket and quiet counsel, evincing an almost maternal protective quality. Oppenheimer, back in the States, forms allegiances with students and organizers of various labor movements, including the Communist Party, but weakly, always as or via a proxy (using the party as a channel to fund Republicans in the Spanish Civil War; as a means to start his relationship with Jean Tatlock). Several scenes hint at deeper connections, but cut away before he does anything incriminating (“but that would be treason …”). Of course his “true” allegiance, or lack thereof, to the Communist Party and to the Soviet Union hangs over the balance of the post-test movie, which seems content to leave it unanswered, or perhaps, answered sufficiently by his other deeds, which several characters give voice to.</p>
<p>What of Oppenheimer’s non-professional bonds? He’s prepared to boil off his child like so many neutrons in a fission reaction, driving the colicky baby to a friend’s house in hopes of being relieved of parental duties. In spite of his affairs and scarce evidence of marital happiness, his relationship with his wife Kitty is shown to endure (“we’re adults, we’ve been through fire together”, their own form of fusion), surviving at least to his public image rehabilitation late in life. Frank Oppenheimer, kept to the periphery early on, becomes essential to the triumph at Los Alamos, reuniting the brothers on the same mesa where they forged a connection to the land, a feel for the weather of the desert. He pursues Jean Tatlock with flowers and is repelled; she later makes her own pursuit, reminding him of his off-hand oath, “you said you’d always answer” — a promise he’s now incapable of keeping, acted on by multiple forces much stronger than she.</p>
<p>Oppenheimer becomes an attractive force himself to build Los Alamos, cajoling scientists and convincing the US military, overcoming each group’s respective reservations, the former about the endeavor itself, the latter about him. He’s the nucleus of the most important thing that’s ever happened, to quote Gen. Groves, but we see him often distracted, gazing into the middle distance, drifting off to Chicago or San Francisco, giving misleading testimony to the army’s quietly menacing interrogator. Still, it’s Oppenheimer keeping the energetic particles of scientists around him from flying off or creating ruinous inter-personal explosions. They eke out just enough collaboration and luck to blow up the gadget before Potsdam — immediately, the military dissolves their bond to the man who secured their supremacy (“we got it from here”). We see Oppenheimer unmoored, isolated, radiating out his misgivings and the horror of his revelations.</p>
<p>In a pivotal scene, Oppenheimer dismisses a report of a nuclear chain reaction, stating that theory proves it can’t be so. When his neighbor colleague reproduces the experiment, he immediately forms a new theoretical understanding from it — the bond to pure theory is broken, and a new one connecting theory and practice is made. It takes him no time again to accelerate to the logical end of the implications, and this time, theory must wait for practice to catch up. When the new experiment is finished, first at Trinity and then at Hiroshima and Nagasaki, it’s no longer about what the physics demonstrates, but what it means for the notion of humanity and civilization: his revulsion at the reception of the bomb among his peers and the public is shown itself as a terrible blinding fire, one that now lives within him.</p>
<h2>Stray observations</h2>
<ul>
<li>“Oppenheimer” centers language throughout, and positions Oppenheimer as a language savant in technical ways, but deficient in others. He learns enough Dutch to teach physics in Europe. He reads Marx in the original German. He quotes Sanskrit to his lover. He’s also a translator, bringing the foreign language of quantum mechanics to the United States, and bridging the gap between academics and the military. He fails to learn the language of Washington, and his character is assassinated as a result. Kitty is left bedeviled, unable to understand how a man can be so proficient in one domain and so passive when it comes to himself and his family.</li>
<li>There’s a rich symbolic history to the apple, and one figures prominently in early scenes. First as an impulsive attempt on his professor’s life. We see Oppenheimer stab the apple, which is a pre-quantum apple, Newton’s apple; the needle with which he injects the cyanide is a dagger into Newtonian physics, on which Einstein, whom we encounter multiple times including in despair in the final scene, inflicted the first mortal wound, and which quantum mechanics and the bomb finally killed. And then as the poison fruit of knowledge carried by Niels Bohr, who introduces him to quantum mechanics, which leads to our expulsion from Eden when the atomic weapon is used.</li>
<li>The circular badges that the scientists wore at Los Alamos, with labels like “K-16” and “C-43” that I’m sure served some organizational purpose the film doesn't explain (as far as I remember), made me think of them as personified isotopes from the periodic table of the elements.</li>
<li>The act III Strauss plot was less successful to me than the rest of the film. The revelation that he was humiliated and vindictive and then used the apparatus of official power to seek his revenge didn’t land for me in the way I think was intended, perhaps because, while certainly despicable, it’s not shocking nor particularly novel, even discounting our recent experiences with vengeful politicians. Setting that aside, from a storytelling point of view, as an extended denouement after the wallop of the Trinity event, it has to work extra hard to sustain a clear narrative focus, and I felt the film suffered overall for it.</li>
<li>Kudos to Jennifer Lame, who edited the film, for making scene after scene of extensive dialogue so compelling and propulsive. And to Hoyte van Hoytema, the director of photography — it’s tonally gorgeous and lightly under-saturated in a way that serves the somber mood. What more can be said about 70mm film? It just glows. It’s very much worth seeing in a theater.</li>
</ul>
<h1><a href="https://pauladamsmith.com/blog/2018/07/fixing-bufferbloat-on-your-home-network-with-openbsd-6.2-or-newer.html">Fixing bufferbloat on your home network with OpenBSD 6.2 or newer</a></h1>
<p><em>July 3, 2018</em></p>
<p>My home network (which is also my work network) is a standard-issue Comcast cable hookup. In spite of a tolerable 120 megabits down, my experience of daily Internet use is regularly frustrating. Video streams and video chats drop in quality inexplicably. SSH sessions become laggy. Web pages fail to load quickly, and then seem to appear all at once. Even though I should have plenty of bandwidth, the feeling is often one of slowness, waiting, data struggling to get through the pipes.</p>
<p>The reason for this is a phenomenon called "bufferbloat". I'm not going to explain it in detail; there are plenty of good resources to read about it, including the eponymous <a href="https://www.bufferbloat.net/projects/bloat/wiki/Introduction/">Bufferbloat.net</a>. Bufferbloat is the result of complex interactions between the software and hardware systems routing traffic around on the Internet. It causes higher latency in networks, even ones with plenty of bandwidth. In a nutshell, software queues in our routers are not letting certain packets through fast enough to ensure that things feel interactive and responsive. Pings, TCP ACKs, and SSH connections are all being held up behind a long line of packets that may not need to be delivered with the same urgency. There's enough bandwidth to process the queue; the trick is to do it more quickly and more fairly.</p>
<p>Fortunately, because bufferbloat is in part a function of how we configure our routers, it's within our ability to solve the problem. But first, we have to diagnose it, and establish a concrete baseline to improve from. The <a href="http://www.dslreports.com/speedtest/">speed test at dslreports.com</a> tests for bufferbloat in addition to download and upload speeds, so we'll use that tool to see how we're doing.</p>
<p>First, I run the speed test, and get the following results:</p>
<p><img src="/images/bufferbloat/before.png" alt="speed test results - before fixes" /></p>
<p>Here you can see the issue starkly: 120 Mbps down and 12 Mbps up yields an "A+" grade (debatable), but we get an "F" for bufferbloat.</p>
<p>We define bufferbloat here as the increased latency of a standard ping while downloading or uploading a large file, compared to ping times while otherwise quiescent.</p>
<p>In our case, idle latency averages 12ms, with download bloat of about 660ms and upload bloat of about 280ms.</p>
<p>The fix is to apply a queue management strategy to our router. Ordinarily, I'd be wary of this. In my experience, QoS administration tends to be fussy and full of unintended consequences. I always felt as if I had cast too broad a net, inadvertently degrading overall network performance to get slightly better results from one application. And I wasn't sure around what fixed point I was optimizing. In this case, bufferbloat gives us the measurable target. Administration is made much easier by the appearance of a new algorithm that's easy to apply to network interfaces. It doesn't require much tuning, and you don't need to futz with individual ports or percentages.</p>
<p>Details vary widely by router operating system and administrative UIs. In our case, the router is running <a href="http://openbsd.org/">OpenBSD</a>. (And if yours isn't, why not? Get a <a href="https://www.pcengines.ch/apu2.htm">PC Engines board</a>, throw obsd on it, and you have an inexpensive solution with world-class security, efficiency, and performance, that's simple to operate and well-documented.) The OpenBSD way of being a router is through its <a href="https://www.openbsd.org/faq/pf/"><code>pf</code></a> system, which is analogous to Linux's iptables, but much more capable and efficient. Since <a href="https://www.openbsd.org/62.html">6.2</a>, <code>pf</code> has implemented something called "FQ-CoDel", which is an algorithm for scheduling packets fairly and is designed to prevent bufferbloat. It is exposed via the <code>flows</code> option on a <code>queue</code> rule. In principle, all we need to do is add two rules, one to fix uplink bufferbloat and one to fix downlink. Let's see how this goes.</p>
<p>In our <code>/etc/pf.conf</code>, we first add a single line to handle the uplink. This will apply a FQ-CoDel queue to the network interface attached to our WAN link, or the cable modem in our case. The way to think about it is, FQ-CoDel is a strategy applied to outbound packets only, as they exit the interface. So even though the WAN interface is duplex up and down, in order to handle the downlink part we'll apply it to the network interface connected to our LAN, which we'll do next.</p>
<p>An important detail. In order for the queue algorithm to do its thing, it needs to know the bandwidth of the outbound link. According to Mike Belopuhov, the implementor of FQ-CoDel in OpenBSD, we need to <a href="https://www.reddit.com/r/openbsd/comments/75ps6h/fqcodel_and_pf/doca4uv/">specify 90-95% of the available bandwidth</a>. Fortunately, we've just measured it.</p>
<p>The line to add to <code>pf.conf</code> to fix bufferbloat on the uplink is (assuming <code>em0</code> for the WAN interface):</p>
<pre><code>queue outq on em0 flows 1024 bandwidth 11M max 11M qlimit 1024 default
</code></pre>
<p>A couple of notes. <code>outq</code> is a label we give, but it's an opaque string to <code>pf</code>. <code>11M</code> means 11 megabits per second (92% of the uplink bandwidth). <code>qlimit</code> is also specified explicitly, because its default value of 50 is too low for FQ-CoDel. The <code>default</code> keyword is required.</p>
<p>And that's it: we don't need to alter our filtering rules to assign packets to a queue: all outbound packets on this interface are assigned to our new queue.</p>
<p>Now let's reload <code>pf</code> with the config change, and re-run the speed test.</p>
<pre><code>$ doas pfctl -n -f /etc/pf.conf && doas pfctl -f /etc/pf.conf
</code></pre>
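<p>Before re-running the speed test, it's worth confirming the new queue is in place and that packets are actually being assigned to it. <code>pfctl</code> can list the loaded queue rules, and in verbose mode shows per-queue statistics (the exact output format varies by OpenBSD version):</p>
<pre><code>$ doas pfctl -s queue -v
</code></pre>
<p>You should see the <code>outq</code> queue on <code>em0</code> with its configured bandwidth and qlimit, along with packet and byte counters that climb as traffic exits the WAN interface.</p>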
<p><img src="/images/bufferbloat/after-uplink.png" alt="speed test results - after uplink fix" /></p>
<p>Uplink latency under load is now down to 17ms on average, from 280ms. That's a mere 5ms worse than idle.</p>
<p>(I discount the apparent decrease in uplink bandwidth from this test result. In my experience, dslreports.com could vary by 10-15% in reported bandwidth run-to-run, but over time it converged on 12 Mbps.)</p>
<p>The downlink fix is nearly the same; we just adjust for the name of the interface (the LAN NIC is called <code>em1</code>) and for its 90-95% bandwidth upper bound, which is 110 Mbps.</p>
<pre><code>queue inq on em1 flows 1024 bandwidth 110M max 110M qlimit 1024 default
</code></pre>
<p>Reload, re-run:</p>
<p><img src="/images/bufferbloat/after-downlink.png" alt="speed test results - after downlink fix" /></p>
<p>Always nice to get an A. Downlink latency under load is now 24ms, from 660ms.</p>
<p>I haven't elided much; I think that's a pretty decent result for two lines of config. If you want to go further, there's a <code>quantum</code> knob to turn (baseline is your NIC's MTU, but look at what OpenWRT does for guidance), but that's about it.</p>
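<p>If you do want to experiment with the quantum knob, it slots into the same queue rule syntax. As a purely illustrative sketch (the value here is a hypothetical starting point, not a recommendation), the LAN-facing rule would become:</p>
<pre><code>queue inq on em1 flows 1024 quantum 300 bandwidth 110M max 110M qlimit 1024 default
</code></pre>
<p>Roughly speaking, a smaller quantum gives small packets like ACKs and DNS responses relatively more scheduling weight per round than large bulk-transfer packets; left unspecified, it defaults to the interface MTU.</p>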
<p>Post-fix, my observation is that things feel much snappier. Aside from the ping time improvements, I don't have other measurements to cite. But so far, FQ-CoDel seems to have fixed bufferbloat on my network and made for a substantially better experience.</p>
<h1><a href="https://pauladamsmith.com/blog/2017/01/2016-my-year-in-review.html">2016, my year in review</a></h1>
<p><em>January 2017</em></p>
<h2>Sewing</h2>
<p>After years of looking at the sewing machine in the closet and saying I should
learn how to use it, I did something about it this year. Michelle got me a class
at a local crafts store as a gift, and I made a pillow cover. It turned out
pretty good, and I enjoyed doing it, so I kept at it. I didn't have a ton of
time this year to devote to it, but by the end of the year I had made some
blankets for friends and <a href="http://www.instructables.com/id/How-to-make-a-Quillow-blanketpillow/">quillows</a> for the girls, a pair of pajama
pants for Michelle, and repaired the lining of a friend's handbag. I'd like to
keep at it, maybe tackling Halloween costumes next year, and doing projects with
Maxine, like adding circuits to fabric à la <a href="https://www.adafruit.com/flora">FLORA</a>.</p>
<p><img src="/images/2016-yir/IMG_1899-COLLAGE.jpg" alt="collage of sewing" /></p>
<h2>Running</h2>
<p>I've run for exercise before, but never with any consistency. This year I set
out to run at least 2 times a week, at least 20 minutes per run. Turns out, I
liked it a lot. So much so that I was running almost every weekday by
April. Unfortunately, my knee didn't like it so much, and in the middle of a run
on the Bloomingdale Trail, it suddenly seized up and I had to take an Uber
home. Luckily, I hadn't done any damage, and an orthopedist said I was just
overdoing it. I took about a month off and ramped back up slowly. By October, I
peaked for the year, doing about 10 miles per week. In November, I ran in <a href="https://www.strava.com/activities/780925903">my
first 10k, the Lincolnwood Turkey Trot</a>. All told for 2016, I logged 156
miles. For 2017, I'd like to attempt a half-marathon, and double my yearly
mileage. <a href="https://www.strava.com/athletes/14254908">Here's my Strava profile</a>.</p>
<p><img src="/images/2016-yir/IMG_2195-COLLAGE.jpg" alt="collage of running" /></p>
<h2>Ad Hoc</h2>
<p><a href="https://adhocteam.us/">The company</a> I started with Greg in 2014 began the year with 7 people and ended
with 41. The growth was thanks to winning our first contracts: we had been
around long enough as a company and had enough "past performance", as they say
in the industry, to compete and be awarded tasks on our own, instead of only
working as subcontractors. The contracts were with <a href="https://www.cms.gov/">CMS</a>, to continue our
work on HealthCare.gov, and with the Department of Veterans Affairs, to help
build Vets.gov. We also earned spots on two highly sought-after contract
vehicles, the ADELE BPA with CMS, and FLASH with the Department of Homeland
Security, that will give us opportunities down the road to bid on work. It
was a great year for Ad Hoc, and I'm proud of what we've built: a great team,
and useful software that is delivering actual benefits and services to real
people. For example, this year, we took over the core shopping part of
HealthCare.gov, known internally as Plan Compare. During open enrollment so far,
which is still going on until the end of January, over 3.5 million households
have enrolled in plans using our software. HealthCare.gov also saw its <a href="http://www.nytimes.com/2016/12/21/us/health-exchange-enrollment-jumps-even-as-gop-pledges-repeal.html">biggest
single-day enrollment tally ever</a>, on December 15th. I'm also proud that we're
proving out the model of providing modern software engineering and design
services to the government that are efficient, work well, cost less to build and
operate, and are just better than the status quo. There's a lot of uncertainty
ahead for the programs that we're working on. There's not much we can do about
that, other than continue to do the work we have in front of us until it
changes, and look for additional opportunities in state, local, and maybe
outside of government.</p>
<p><img src="/images/2016-yir/IMG_2396-COLLAGE.jpg" alt="collage of Ad Hoc" /></p>
<h2>House</h2>
<p>Michelle and I bought <a href="https://evanston.house/">a house</a> in south Evanston in January, and
we've been renovating it since. It's an old Victorian-style home from the 1890s
with a good stone foundation and timber frame. We're doing extensive changes to
the interior, with a new layout and all new flooring, doors and windows, and
systems like electrical and HVAC. The exterior is mostly unchanged from a
framing perspective, but we're updating the siding, and we dormered out part of
the sloping roof so we could have a master bedroom on the third floor. We're
also converting the garage into a two-story garage/coach house combo: the plan
is to have an office on the second floor for Michelle and me. We had hoped, at
the beginning of the year, to be moved in by fall, but these things go the way
these things go. As of this writing, we're about a month away from being able to
pull up stakes here in Logan Square. We've been working with our friend and
architect David Burns, which has been great. We spent time together with him
talking about what we wanted, and he came up with a vision, drew up detailed
plans, and has been managing the overall construction process. We hired Conrad
Szajna of <a href="http://formedspace.com/">FormedSpace</a> to be our general contractor, and he's hired a great
team of subcontractors and tradespersons to do the work.</p>
<p><img src="/images/2016-yir/23575874262_535fed7a16_o-COLLAGE.jpg" alt="collage of house" /></p>
<h2>Family</h2>
<p>The best part of my year was spending time with my family. Maxine started
Kindergarten at Lincoln Elementary School in Evanston, and Veronica has grown
into a full-fledged toddler. We didn't do as much travel as we would have liked
this year, but we enjoyed biking together (we got a trailer this year for the
girls), exploring our fair city, and working on projects, like the homemade
arcade Maxine and I have been building together.</p>
<p><img src="/images/2016-yir/IMG_1572-COLLAGE.jpg" alt="collage of family" /></p>
<p>A few other things of note from the year:</p>
<ul>
<li>We participated in the Volkswagen settlement and chose to have them buy back
our Jetta TDi. Good riddance. We bought a new Mazda CX-9 as a replacement.</li>
<li>We volunteered for the Hillary Clinton campaign, including taking Maxine up to
Kenosha, WI, to knock on doors for GOTV on election day. Well.</li>
<li>I continue to feel grateful in so many different ways, for dear old friends
and new ones we made this year, for our families immediate and extended, for
our relative health and wealth, for our general dumb luck to be this
fortunate and safe, recognizing just how contingent, random, and unlikely that is.</li>
</ul>
Looking at your program’s structure in Go 1.7https://pauladamsmith.com/blog/2016/08/go-1.7-ssa.html2016-08-16T01:42:20Z<p>Go 1.7—<a href="https://golang.org/dl/">out today!</a>—features a new
<a href="https://docs.google.com/document/d/1szwabPJJc4J-igUZU4ZKprOrNRNJug2JPD8OYi3i1K0/edit">SSA-based compiler backend</a>. SSA is a method of describing
low-level operations like loads and stores that roughly map to machine
instructions, with the special difference that SSA acts as though it has
an infinite number of registers. This is not especially interesting on its own,
except that it enables a class of well-understood optimization passes that make
the resulting binary smaller in code size and faster. The new release of Go is
an indication the implementation is maturing and starting to take advantage of
techniques and practices adopted in the <a href="http://llvm.org/">wider world of compiler technology</a>.</p>
<p>In addition to the performance benefits of the new SSA-based backend, there is a
suite of new tools that allow a developer to interact with the SSA
machinery. One such tool outputs the intermediate SSA statements, optimization
passes, and resulting Go-flavored assembly. This is done by setting the
environment variable <code>GOSSAFUNC</code> to the name of a function to disassemble when using the
<code>go</code> tool, for example:</p>
<pre><code class="language-shell">$ GOSSAFUNC=main go build
</code></pre>
<p>This invocation will output to the terminal, but the more interesting artifact
is an HTML file, named <code>ssa.html</code>, written out to the current directory. Open
the file in your web browser and you’ll see something like:</p>
<p><img src="/images/gossa/ssa.html.png" alt="screenshot of SSA" /></p>
<p>What you are looking at is a table with many columns extending to the right,
each one except for the first and last representing an optimization pass over
the preceding SSA form. (I counted 37 separate passes.) The first column is the
compiler’s initial, unoptimized SSA output, and the last column is the
Go-flavored assembly that will be turned into machine code for the final
compiled binary executable or shared library.</p>
<p><img src="/images/gossa/ssahtmlscroll.gif" alt="anim gif of scrolling through SSA" /></p>
<p>While this can look intimidating to the uninitiated, SSA is relatively simple by
design -- each line represents either a value being assigned the result of an
instruction (i.e., one of the infinite number of registers), or a label of a
"basic block" (a set of statements, i.e., the things between curly braces in
source code), or the exit of a basic block which jumps execution to a different
basic block (e.g., control flow like an if-statement or returning from a function
call).</p>
<p>For example:</p>
<pre><code>v4 = Const64 <int> [42]
</code></pre>
<p>Means assign the 64-bit integer constant value 42 to the register labeled <code>v4</code>.</p>
<pre><code>b5: ← b4
v15 = Copy <mem> v14
v16 = StaticCall <mem> {runtime.printnl} v15
Call v16 → b6
</code></pre>
<p>Means <code>b5</code> is the label for a basic block with two statements. It concludes with
an exit <code>Call</code> instruction, taking program execution to another basic block,
<code>b6</code>, when returning from the function call that produces the <code>v16</code> value.</p>
<p>The tokens like <code>Const64</code>, <code>Copy</code>, and <code>StaticCall</code> are analogous to assembly
instructions like <code>MOV</code> and <code>LEA</code>.</p>
<p>One special operation is <code>Phi</code>, or a "Phi node". Notice that a Phi node takes
two arguments, which are two values. Also notice that a basic block with a Phi
node has two basic block labels next to its own label, unlike every other basic
block:</p>
<pre><code>b3: ← b1 b2
v20 = Phi <int> v4 v6
...
</code></pre>
<p>This is an interesting construct and it relates to program control flow. A basic
block is defined by having a single entry and a single exit point, and having a
set of statements that execute sequentially (i.e., no branching logic) in
between. And "SSA" stands for "<a href="https://en.wikipedia.org/wiki/Static_single_assignment_form">static single assignment</a>", which means
that each value is assigned one and only one time. But what do you do if you
have a reference to a variable that could have different values depending on
which branch of an <code>if</code> statement the program took? A Phi node is a way of
resolving this apparent contradiction. Since each branch of an <code>if</code> statement by
definition assigns to a unique value, a Phi node coalesces them into the final
value depending on which branch was actually taken. So you can think of it as
the run-time retrieval of a value based on some condition. This is why the block
has two dependencies at the top rather than just one.</p>
<p>Let’s write a silly program to motivate some examples:</p>
<pre><code class="language-go">package main

func main() {
    x := 5
    if 1 < 0 {
        x = -42
    }
    println(x)
}
</code></pre>
<p>Let’s start with the initial basic block, <code>b1</code>:</p>
<pre><code>b1:
v1 = InitMem <mem>
v2 = SP <uintptr>
v3 = SB <uintptr>
v4 = Const64 <int> [5]
v5 = ConstBool <bool> [false]
v6 = Const64 <int> [-42]
v11 = OffPtr <*int64> [0] v2
If v5 → b2 b3
</code></pre>
<p>After some program initialization, <code>v4</code> is the assignment to the local var <code>x</code>
in our code of the constant 5. Go knows at compile-time that <code>1 < 0</code> is always
false so it just assigns false to <code>v5</code>. <code>v6</code> is the assignment of -42 to <code>x</code>
that will happen during program execution.</p>
<p>At the end we have the basic block exit, <code>If v5 → b2 b3</code>. This tests the truth
value of <code>v5</code> to decide whether to jump program execution to either <code>b2</code> (if
true) or <code>b3</code> (if false). This is similar to the following chunk of assembly:</p>
<pre><code class="language-asm"> JNZ b2
b3:
...
b2:
...
</code></pre>
<p>One nice thing about the Go SSA HTML view is you can click on any token in the
SSA form and it will highlight the references to and from that element.</p>
<p>
<img alt="clicking on SSA elements" src="/images/gossa/ssabblocks.gif" class="no-100-pc-width">
</p>
<p>We can see from the different colors how the control flow will go. You can
visually connect the blocks of code that will execute and the assignments,
function calls, and additional branching that will result.</p>
<p>Clicking on the Phi node and its dependencies, you can see where the
possible values come from in the preceding control flow.</p>
<p><img src="/images/gossa/phinodehl.png" alt="highlighted Phi node" /></p>
<p>Moving on, the function call that prints out the integer value is in the following
basic block:</p>
<pre><code>b4: ← b3
v9 = Copy <int> v20
v10 = Copy <int64> v9
v12 = Copy <mem> v8
v13 = Store <mem> [8] v11 v10 v12
v14 = StaticCall <mem> {runtime.printint} [8] v13
Call v14 → b5
</code></pre>
<p>The <code>StaticCall</code> instruction invokes the function from the Go runtime that is
specialized to format integer values and print them to the terminal. One
interesting thing to note is that the preamble to the call sets some things up in
memory, the location of which is fed to the <code>printint</code> function. If you notice,
<code>v11</code> refers back to the value set in <code>b1</code>, which is a pointer offset from <code>v2</code>,
which was set from the stack pointer <code>SP</code> near the top of the program
initialization. This makes sense, because the generated assembly needs
concrete memory locations to address when invoking functions that take pointers.</p>
<p>There’s much more to investigate here, including the particular optimization
passes, and tracing how individual instructions make their way through to the
final assembly or are eliminated. But hopefully this has given you an
introduction into SSA and how it maps to constructs in your applications.</p>
Modifying a Go slice in-place during iterationhttps://pauladamsmith.com/blog/2016/07/go-modify-slice-iteration.html2016-07-25T23:26:51Z<p><strong>Update:</strong> See a better way of doing this below.</p>
<hr />
<p>I'll often have a slice that I want to filter down on, removing elements based on some test, and I would prefer to modify the slice in-place for whatever reason, either because I want to retain the reference to the original slice or I don't want to allocate a new slice as destination for the desired values.</p>
<p>You might think that modifying a slice in-place during iteration should not be done, because while you can modify <em>elements</em> of the slice during iteration if they are pointers or if you index into the slice, changing the <em>slice itself</em> by removing elements during iteration would be dangerous.</p>
<p>Here's a straightforward way to accomplish it. The idea is that, when you encounter an element you want to remove from the slice, you take the beginning portion of the slice (the values that have passed the test up to that point) and the remaining portion of the slice (everything after that element to the end), and copy them <em>over</em> the original slice. Then, assign a slice expression up to the number of values that passed the test back to the original variable.</p>
<p>Here's an example. Let's say I have a slice of integers, and I only want to retain the even ones.</p>
<pre><code class="language-go">var x = []int{90, 15, 81, 87, 47, 59, 81, 18, 25, 40, 56, 8}
i := 0
l := len(x)
for i < l {
    if x[i]%2 != 0 {
        x = append(x[:i], x[i+1:]...)
        l--
    } else {
        i++
    }
}
x = x[:i]
fmt.Println(x)
// [90 18 40 56 8]
</code></pre>
<p>The <code>i</code> variable is used to keep track of the number of even values found in the slice. When an element is odd, we create a temporary slice using <code>append</code> and two slice expressions on the original slice, skipping over the current element. The temporary smaller slice is copied over the existing, shifting down the remaining values. The <code>l</code> variable makes sure we make the right number of comparisons despite moving things around. It's important to note the memory location of the original slice is unchanged with the copy. No new heap allocations are performed, even with the temporary slice.</p>
<hr />
<p><strong>Update:</strong> A number of people, including here in comments and on <a href="https://www.reddit.com/r/golang/comments/4uoqr5/modifying_a_go_slice_inplace_while_iterating_over/">the golang reddit</a>, have pointed out that the method I outline here is pretty inefficient; it's doing a lot of extra work, due to the way I'm using <code>append</code>. A <em>much</em> better way to go about it is the following, which also happens to have already been pointed out in the <a href="https://github.com/golang/go/wiki/SliceTricks#filtering-without-allocating">official Go wiki</a>:</p>
<pre><code class="language-go">y := x[:0]
for _, n := range x {
    if n%2 == 0 {
        y = append(y, n)
    }
}
</code></pre>
<p>This also has the benefit of being simpler and shorter. Use it instead!</p>
A simple way to limit the number of simultaneous clients of a Go net/http serverhttps://pauladamsmith.com/blog/2016/04/max-clients-go-net-http.html2016-04-13T17:55:13Z<p>This is a simple and easily generalizable way to put an upper-bound on the
maximum number of simultaneous clients to a Go <code>net/http</code> server or handler.</p>
<p>The idea is to use a counting semaphore, modeled with a buffered channel, to
cause new clients that arrive after the <code>n</code>th current client to queue, where
<code>n</code> is the size of the buffer.</p>
<p>Ideally, we wouldn't want to limit the amount of concurrency to our application,
but practically, there are limits on underlying resources, and forcing clients
to queue after a certain limit gives us control over that resource utilization.</p>
<p>Let's say we have a simple HTTP handler that requests access to some expensive
resource, like a database or complex computation:</p>
<pre><code class="language-go">package main

import (
    "io"
    "log"
    "net/http"
)

func main() {
    http.Handle("/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        res := getExpensiveResource()
        io.WriteString(w, res.String())
    }))
    log.Fatal(http.ListenAndServe(":8080", nil))
}
</code></pre>
<p>The handler can be requested by an unbounded number of clients, potentially
exhausting our resources.</p>
<p>Let's add a counting semaphore that will gate entry into the handler:</p>
<pre><code class="language-go">func main() {
    const maxClients = 10
    sema := make(chan struct{}, maxClients)
    http.Handle("/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        sema <- struct{}{}
        defer func() { <-sema }()
        res := getExpensiveResource()
        io.WriteString(w, res.String())
    }))
    log.Fatal(http.ListenAndServe(":8080", nil))
}
</code></pre>
<p>We make a channel of type <code>struct{}</code>, because we are only interested in the
send/receive semantics of the channel, not its value. The first statement of the
handler is a send on the channel, which will succeed up to <code>maxClients</code> number
of simultaneous requests. Think of the buffered channel as having empty slots,
and being able to send on it means that you can fill a slot and proceed. If
there are no empty slots, in other words, the length of the channel is equal to
the buffer size, then the send will block, and will have to wait to proceed
until a slot frees up. The next statement defers until after the handler has
returned or panicked, and frees a slot by receiving from the channel.</p>
<p>If we have more than one handler to limit access to, we can move the semaphore
into a middleware and wrap the original handler, leaving the body of it
unchanged:</p>
<pre><code class="language-go">package main

import (
    "io"
    "log"
    "net/http"
)

func maxClients(h http.Handler, n int) http.Handler {
    sema := make(chan struct{}, n)
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        sema <- struct{}{}
        defer func() { <-sema }()
        h.ServeHTTP(w, r)
    })
}

func main() {
    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        res := getExpensiveResource()
        io.WriteString(w, res.String())
    })
    http.Handle("/", maxClients(handler, 10))
    log.Fatal(http.ListenAndServe(":8080", nil))
}
</code></pre>
<p>Note that this implementation will cause clients beyond the maximum number to
queue without bound, until they hit the system limit of the <code>listen(2)</code> backlog.</p>
<p>This pattern can be used to control the amount of concurrency to any resource,
not just <code>net/http</code> handlers.</p>
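<p>As a sketch of that generalization (the worker-pool shape here is my own illustration, not part of the handler example above), the same buffered-channel semaphore can cap concurrency among plain goroutines:</p>

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runBounded launches jobs goroutines but allows at most limit of them
// to run at once, using a buffered channel as a counting semaphore.
// It returns the peak number of goroutines observed running together.
func runBounded(jobs, limit int) int64 {
	sema := make(chan struct{}, limit)
	var active, peak int64
	var wg sync.WaitGroup
	for i := 0; i < jobs; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sema <- struct{}{}        // acquire a slot; blocks when all are taken
			defer func() { <-sema }() // release the slot on the way out

			n := atomic.AddInt64(&active, 1)
			if n > atomic.LoadInt64(&peak) {
				atomic.StoreInt64(&peak, n) // best-effort max; n never exceeds limit
			}
			// ... do the actual bounded work here ...
			atomic.AddInt64(&active, -1)
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	peak := runBounded(50, 4)
	fmt.Println("stayed within limit:", peak <= 4)
	// stayed within limit: true
}
```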
The Bloomingdale Trailhttps://pauladamsmith.com/blog/2015/06/bloomingdale_trail.html2015-06-05T16:00:00Z<p>There was a moment last Friday while I was on top of the soon-to-open
<a href="http://www.bloomingdaletrail.org/">Bloomingdale Trail</a> with a tour group when I had a strange feeling. We had
been walking for more than a mile to that point, 17 feet up above Chicago
streets, passing houses, factories, and alleyways in Logan Square. I paused to
consider the feeling, and I realized it was that I had been walking continuously
for half an hour through a Chicago neighborhood and not once had to contend with
an intersection or motor vehicle in all that time. Unless you live in the city
and walk or bike often, it's hard to convey how pleasantly odd that feeling was.
It's not something you can get from a typical park or trail. Parks are usually
compact open spaces, polygons boxed in by streets. Most other trails are at
grade level, so whatever flow or momentum you build up is periodically
interrupted by an intersection. The Bloomingdale Trail, however, is both apart
from and woven through the neighborhoods it is situated in. If you go the length
from the west trailhead to the east trailhead or vice versa, you'll have
travelled 2.7 miles -- a massive span across 4 Chicago neighborhoods -- in an
entirely human-mediated fashion. And yet you never feel as though you've taken
yourself out of the fabric of the city, as you might when going into a park.
Thanks to periodically spaced adjacent parks and access ramps, you can dip in
and out of the Trail as casually or deliberately as you choose. You gain both
a new vista on the city, and a deeper connection to the neighborhoods you've
always known. It's remarkable what a mere 17 feet of elevation can do to both
take you out of the city and give you greater access to it.</p>
<img class="img-responsive" alt="Photo of people walking on elevated Bloomingdale Trail and street below" src="/images/fbt_elevated.jpg">
<p>It is this embeddedness that I believe will ultimately make The Bloomingdale
Trail and the entire <a href="http://the606.org/">606</a> system of parks a success. It's not a jewel,
a thing to be admired, with its aesthetics upfront. It's a relentlessly practical
bit of new human-scale infrastructure in a vibrant residential area. It will
materially improve the lives of its neighbors each day by enabling them to be
active, to commute, to play, and to discover in a new and unique way. It's worth
remembering that the project was funded largely by federal transportation
dollars, earmarked for reducing traffic congestion and air pollution. People
will wonder what this thing is, and the answer will be in its daily use.</p>
<img class="img-responsive" alt="Photo of rainbow over The Bloomingdale Trail" src="/images/fbt_rainbow.jpg">
<p>I remember walking around the old Bloomingdale Line, a disused elevated railroad
embankment, in 2002 with a group of work colleagues. We would sometimes take our
lunch up top, ducking under a fence at Milwaukee and Leavitt to gain access. The
germ of the Friends of the Bloomingdale Trail was planted there; the non-profit
community organization officially formed a year later. The circumstances at the
time were fortunate: the development of the High Line in New York provided
a template and a healthy competitive jolt; the railroad company was looking to
rid themselves of their responsibilities to the line; the City wanted to tear
down the embankment and spanning viaducts, providing further impetus; and
crucially, the rights-of-way were all contiguous and owned by the City: there
need be no time-consuming negotiating with private owners to acquire the trail's
property, as there was in New York. From there we held <a href="https://www.flickr.com/gp/psmith/32DCKi">community meetings</a>, <a href="http://www.bloomingdaletrail.org/img/Trailcleanup01.jpg">trash
pick-up days</a>, <a href="https://www.flickr.com/gp/psmith/0qWMDh">festivals</a>, goofy but earnest <a href="http://www.bloomingdaletrail.org/img/valentines.jpg">Valentine's Day events</a>, <a href="http://www.bloomingdaletrail.org/archive/#fbt-walking-tour-notes">led tours</a>,
<a href="https://www.flickr.com/gp/psmith/92r243">pitched aldermen and city planners</a>, <a href="http://www.bloomingdaletrail.org/archive/#bloomingdale-trail-mural-project">documented the Trail as it existed</a>, <a href="https://www.flickr.com/gp/psmith/71GZ13">helped open a new neighborhood park next to the Trail</a>, printed
<a href="http://www.bloomingdaletrail.org/archive/#walk-bike-run-poster">posters</a> and <a href="http://www.bloomingdaletrail.org/archive/#fbt-brochure">brochures</a>, <a href="http://www.bloomingdaletrail.org/archive/#chicago-public-art-group-albany-whipple-workshop-flyer">hosted arts events</a>, let <a href="http://www.bloomingdaletrail.org/reframing-ruin/david-schalliol/">David Schalliol do his magic</a>, connected with <a href="https://www.cityofchicago.org/city/en/depts/dcd/supp_info/logan_square_openspaceplan.html">open space plans</a>, and
started a partnership with the <a href="http://www.tpl.org/">Trust for Public Land</a> and the City of
Chicago to design and build the Trail. In 2007 and 2008, we <a href="https://www.flickr.com/photos/psmith/sets/72157600029547338">convened neighbors
in a series of meetings and
surveys</a> to listen
to, capture, and synthesize the community's vision for the project. The product
of this effort, the <a href="http://www.bloomingdaletrail.org/archive/#community-visioning-update">Community Visioning Update</a>, was perhaps our most important
practical work as an organization: this document was incorporated into the
City's official request for proposals for design and construction. To the best
of our ability, we made sure the future Trail would be reflective of the
community it came from and would serve.</p>
<img class="img-responsive" alt="Photo of ramp down from The Bloomingdale Trail" src="/images/fbt_ramp.jpg">
<p>It's time now to celebrate the opening of the Trail and begin a new phase in the
life of FBT. The original goals of the organization were to:</p>
<ul>
<li>Preserve the elevated right of way</li>
<li>Beautify the public space</li>
<li>Create a new, mixed-use trail/linear park</li>
<li>Establish a broad coalition that supports the proposed park</li>
<li>Connect with neighborhood schools and institutions</li>
</ul>
<p>Our <a href="http://www.bloomingdaletrail.org/about/">new mission</a> is to be the community stewards of the Trail, and to
that end, we recently applied and have been approved to be a Chicago Park
District Advisory Council, or PAC. As befits our unusual new park, we're breaking
new ground as a PAC. We're unique in that our bylaws state there will be board
representation from each of the 4 neighborhoods, and from each of the constituent
park PACs (Julia de Burgos, Walsh, Churchill Field, and Kimball). Because no
other park covers as much ground, cuts through as many neighborhoods, and links
up as many adjacent smaller parks, governance and community organizing around
The Bloomingdale Trail will be a new experiment for all involved.</p>
<p>One last thought. There are very few good west-east routes in Chicago: most
transportation infrastructure radiates from and to the Loop. The Bloomingdale
Trail is a stroke across the spokes, and the physical, economic, and cultural
circulation it promotes will be fascinating to watch. But there are bigger
things at stake. Even before this new park was built, the Trail conspicuously
ended at the north branch of the Chicago River. (Now it ends at Ashland, that
street's bridge having been born-again over Western.) It's always been a dream
and a goal of FBT and the 606 partners to extend the Trail across the river in
a future phase. From there, on-street bicycle paths can be knit together,
ultimately arriving at the lakefront. However, there's an even bigger dream to
be dreamt. A few miles west of the western terminus of the Trail, the Illinois
Prairie Path has its eastern endpoint. The IPP carries you out due west 60 miles
past the outer suburbs. A network of rural trails beyond can be followed all the
way to Iowa. So while we celebrate the opening of Chicago's next great park
tomorrow, the notion of a bicycle trip that begins at the Mississippi River and
ends at Lake Michigan, on bike paths the entire span, should stay in the back of
our minds as a not-too-distant possibility.</p>
<img class="img-responsive" alt="Map of measurement from Mississippi River to Lake Michigan" src="/images/fbt_miss_river_lake_mich_map.png">
<p>Look up! It's The Bloomingdale Trail</p>
<img class="img-responsive" src="/images/fbt/320210120_f84ffca2ff_o.jpg">
<img class="img-responsive" src="/images/fbt/320221183_3866448949_o.jpg">
<img class="img-responsive" src="/images/fbt/3056808206_4ce94f3638_o.jpg">
<img class="img-responsive" src="/images/fbt/320213168_7aefe30df9_o.jpg">
Chicago wards & precincts shapefiles in 2015https://pauladamsmith.com/blog/2015/02/chicago-wards-precincts-shapefiles.html2015-02-27T19:53:00Z<p><strong>Update:</strong> On April 6, 2015, the City of Chicago updated its Data Portal with
the official <a href="https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Wards-2015-/sp34-6z76">wards</a> and <a href="https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Precincts-current-/uvpq-qeeq">precincts</a> shapefiles.</p>
<hr />
<p><strong>tl;dr:</strong> I tried to make a map of Chicago election results, I found only
out-of-date wards & precincts shapefiles, I had to FOIA the up-to-date versions,
I got them, I republished them so anyone can download them, and finally made
that map.</p>
<p>Read on for the full saga.</p>
<hr />
<p>After <a href="http://elections.chicagotribune.com/results/">this week’s municipal general elections in Chicago</a>, I was looking
for detailed results in the mayor’s race, which didn’t end Tuesday night but is
<a href="http://www.reuters.com/article/2015/02/25/us-usa-politics-chicago-idUSKBN0LS1B420150225">headed for a run-off</a> between Mayor <a href="http://www.chicagotogether.org/">Rahm Emanuel</a> and
challenger Cook County Commissioner <a href="http://www.chicagoforchuy.com/">Chuy Garcia</a> on April 7.
Specifically, I wanted to see where in the city the support for each candidate
was, and at as granular a level as possible.</p>
<p>The <a href="http://www.chicagoelections.com/en/home.html">Chicago Board of Elections</a> posts vote tallies by precinct (50 wards
in Chicago, with on average 40 precincts per ward). Precincts are the smallest
unit of political geography—in Chicago, they are roughly a few square city
blocks each. Given the neighborhoody nature of Chicago and the block-by-block
affinities that exist (which leads politicians to produce <a href="http://www.our2ndward.org/">carefully sculpted
gerrymanders like the 2nd Ward</a> in order to corral voters into favorable
pens), a map showing the relative intensity of voting percentages per candidate
by precinct would be a good tool for aiding detailed understanding of this
election or any election, and a building block for many possible similar
analyses in the future.</p>
<p>So I set out to make such a map. My plan was to gather the vote totals per
precinct, shapefiles of the city ward and precinct boundaries, and join them
together using tools like <a href="http://d3js.org/">d3</a> to draw a choropleth or thematic map in a web
browser. This is a straightforward plan and is well-trod ground. However,
I naïvely assumed the official source material I gathered would be accurate and
up-to-date.</p>
<p>After scraping the vote totals from the BOE site[<a href="#fn1-2015-02-27"
id="fnr1-2015-02-27" class="fn">1</a>], I downloaded the wards and precincts
shapefiles from the <a href="https://data.cityofchicago.org/">City of Chicago’s Data Portal site</a>, which is
a service that hosts many different types of data, from building permits to
restaurant inspections. I did this by typing “wards” and “precincts” into the
search box and downloading from the results pages the links titled “<a href="https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Wards/bhcv-wqkf">Boundaries - Wards</a>” and “<a href="https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Ward-Precincts/sgsc-bb4n">Ward Precincts</a>”. There was
nothing to indicate that these files were out of date, nor anything else to
indicate that they were not the current, authoritative source of these data
sets.</p>
<p>I put together a first draft of the map and shared it with <a href="https://twitter.com/joegermuska">some</a>
<a href="http://www.chicagocarto.com/">colleagues</a> who are experts in mapping and Chicago data. They quickly
pointed out that the map appeared to be using the old wards and precincts.[<a
href="#fn2-2015-02-27" id="fnr2-2015-02-27" class="fn">2</a>] In 2012, the <a href="http://www.wbez.org/no-sidebar/approved-ward-map-95662">city
council approved a new set of ward boundaries</a>, redrawing the city’s
political map. They were to go into effect in 2015, and this week’s election,
which included all 50 aldermanic races, were to be contested on this new
geography. The conspicuously missing new 2nd Ward was the tip-off my map was
wrong.</p>
<p>I searched for the updated boundaries, but came up with only unofficial sources,
and only for wards at that. There was the WBEZ map from their <a href="http://www.wbez.org/no-sidebar/approved-ward-map-95662">original 2012
story</a>, and the Tribune had created <a href="http://media.apps.chicagotribune.com/ward-redistricting-2012/index.html">a side-by-side comparison of the old
and new wards</a>. But I couldn’t trust these for my own use, because of
their uncertain provenance. And without matching updated precincts, I couldn’t
join vote totals for use in a map in any case.</p>
<p>Taking a page from the <a href="http://www.derivativeworks.com/2013/02/on-everyblock-and-the-open-data-movement.html">people person at my old job</a>, I made a phone call
to the Board of Elections: maybe I could just ask for the data and they would
give it to me? I stated my request very plainly and without explanation of
motive, and was told to “hold please” a couple of times while I bounced between
departments. A few moments later, I heard “Districts and Boundaries” on the
line. Success! Here was, literally, the person who could help me, right then. Or
so I thought. I repeated my request, and without a moment’s hesitation, the
Districts and Boundaries voice said that I would need to contact the BOE’s
<a href="http://www.foia.gov/">FOIA</a> officer, and here was their email address.[<a href="#fn3-2015-02-27" id="fnr3-2015-02-27" class="fn">3</a>]</p>
<p>It was hard to tell how much of this was bluffing, as in, let’s see you actually
bother to make a FOIA request, but I went ahead and stubbornly wrote an email to
the FOIA officer anyway. I was under no illusions that my request would be
fulfilled quickly enough to make my post-election map still relevant.</p>
<p><img src="/images/foia-email-request.png" alt="Email request to FOIA officer" /></p>
<p>I then <a href="https://twitter.com/paulsmith/status/571024506560647168">took to Twitter</a> to register my displeasure with this state of affairs—we
just had a citywide election for our top local offices, operating on the
assumption of the new city council-vouched districts, and yet, despite nearly
a decade of the open data movement, despite official portaldom, the key base
layers of the political strata were still available only to the learned
monks—and moved on.</p>
<p>Lo, but was my request not answered but a few scant hours later! I can’t tell
you how surprised I was to see this in my inbox:</p>
<p><img src="/images/foia-email-response.png" alt="Email response from FOIA officer" /></p>
<p>I thanked the officer and downloaded the payload, which was a set of 50 folders,
each corresponding to a ward and containing a shapefile of that ward’s precincts
therein. I eyeballed the boundaries with <a href="http://www2.qgis.org/en/site/">QGIS</a> and was satisfied that
they appeared to be legit. (Again, the shape of the notorious 2nd Ward was the
main clue.)</p>
<p>In the absence of official publication, I was determined to at least not have
the next person who goes looking for wards and precincts wind up in FOIA
land. As relatively pain-free as this episode was, the fact that I had to engage
with the FOIA plumbing in order to fulfill a minor data request is not good. And
there is every reason to think that a typical FOIA request will take orders of
magnitude longer to fulfill than my jackpot.</p>
<p>My approach was to self-publish the data, being clear about its source
and my methodology for any transformations. While I’d much prefer this
data appear on the Data Portal, I’d also rather our collective energies not
be wasted in pursuits such as these.</p>
<p>Regarding those transformations, I had a set of precincts, but I also wanted the
wards that derive from them (a ward is completely defined by its constituent
precincts). I imported the precincts into a PostgreSQL database with the
<a href="http://postgis.net/">PostGIS</a> extension. From there I created wards by grouping precincts
by their ward number, and unioning their geometries (i.e., merging a bunch of
small precinct polygons into one large ward polygon). Then I exported from the
database into various geospatial data formats—Shapefile, TopoJSON, GeoJSON, KML,
etc.</p>
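<p>For reference, the ward-building step amounts to a single PostGIS query along
these lines (the table and column names here are assumptions based on the
imported shapefiles, not the actual schema):</p>
<pre><code class="language-sql">-- Merge each ward's precinct polygons into one ward polygon.
CREATE TABLE wards AS
SELECT ward, ST_Union(geom) AS geom
FROM precincts
GROUP BY ward;
</code></pre>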
<p>I made these <a href="https://paulsmith.github.io/chicago_wards_and_precincts/"><strong>exports available for download by anyone</strong></a>, hosted on
<a href="https://github.com/paulsmith/chicago_wards_and_precincts">GitHub</a>.</p>
<p>I finally was able to make the map I wanted, at least the first-order map:
a basic voter preference density map. I hope to build on this data
infrastructure with different overlays, result sets, future elections, and so
on.</p>
<p>You can <a href="http://bl.ocks.org/paulsmith/1564a99cc7b5d3f8e90c"><strong>view the map here</strong></a>; choose between mayoral candidates in the
drop-down selector to update the map with their vote percentages.</p>
<p>With several candidates, it can be useful to see them arrayed as <a href="http://en.wikipedia.org/wiki/Small_multiple">small
multiples</a> for easier comparison[<a href="#fn4-2015-02-27"
id="fnr4-2015-02-27" class="fn">4</a>]:</p>
<p><img src="/images/chi-2015-mayoral-small-multiples.png" alt="Side-by-side maps of Chicago mayoral election results" /></p>
<p>I’d like to see the left hand of the operators of the Chicago Data Portal talk
with the right hand of the Chicago Board of Elections, and simply take down the
pre-2015 ward and precinct boundaries (or better yet, rename them to something
that won’t be mistaken for the most recent version and leave them up for
historical research) and get the current shapefiles uploaded as soon as
possible. In the meantime, I hope that interested parties will avail themselves
of <a href="https://paulsmith.github.io/chicago_wards_and_precincts/">my hosted shapefiles</a>.</p>
<p>More generally I’d like for stakeholders in the world of government data to
reflect on the state of the open data movement, and consider examples such as
these as the tiny abrasions that impede all sorts of productivity, beyond my
modest map-making efforts. On one hand, we’ve made enormous progress; on the
other, we’re still fighting the same 10-year-old battles.</p>
<p>And to the FOIA officer at the BOE who responded so promptly, many thanks!</p>
<hr />
<ol class="footnotes">
<li id="fn1-2015-02-27">
It is 2015 and the third-largest U.S. city is still
publishing official election results on a decade-old system that doesn’t lend
itself to machine-readability without substantial friction, which violates #5 of
the <a href="https://public.resource.org/8_principles.html">8 Principles of Open Government Data</a>.
I wrote <a href="https://gist.githubusercontent.com/paulsmith/1564a99cc7b5d3f8e90c/raw/scrape.py">a
Python program to extract the data</a> from the particular formatting of the BOE site.
<a href="#fnr1-2015-02-27">↩</a>
</li>
<li id="fn2-2015-02-27">
In my defense, while I’ve
lived in Chicago for more than 10 years, I only recently moved back after
a 5-year hiatus, so my map intuitions are a little stale.
<a href="#fnr2-2015-02-27">↩</a>
</li>
<li id="fn3-2015-02-27">
Thus arguably in violation of #1, #3, #4, and #6 of
the <a href="https://public.resource.org/8_principles.html">8 Principles of Open Government Data</a>.
<a href="#fnr3-2015-02-27">↩</a>
</li>
<li id="fn4-2015-02-27">
For this I just screenshotted and collaged them in an image editor.
<a href="#fnr4-2015-02-27">↩</a>
</li>
</ol>
How to get started with the LLVM C APIhttps://pauladamsmith.com/blog/2015/01/how-to-get-started-with-llvm-c-api.html2015-01-20T19:53:00Z<p>I enjoy making toy programming languages to better understand how compilers
(and, ultimately, the underlying machine) work and to experiment with techniques
that aren’t in my repertoire. <a href="http://llvm.org/">LLVM</a> is great because I can tinker, and
then wire it up as the backend to have it generate fast code that runs on most
platforms. If I just wanted to see my code execute, I could get away with
a simple hand-rolled interpreter, but having access to LLVM’s JIT, suite of
optimizations, and platform support is like having a superpower — your little
toy can perform impressively well. Plus, LLVM is the foundation of things like
<a href="https://github.com/kripken/emscripten/wiki">Emscripten</a> and <a href="http://www.rust-lang.org/">Rust</a>, so I like developing intuition about how new
technologies I’m interested in are implemented.</p>
<p>I’m going to show how to use the LLVM API to programmatically
construct a function that you can invoke like any other and have it execute
directly in the machine language of your platform.</p>
<p>In this example, I’m going to use <a href="http://llvm.org/docs/doxygen/html/group__LLVMC.html">the C API</a>, because it ships with
the LLVM distribution alongside the C++ API, and is the simplest
way to get started. There are bindings to the LLVM API in other languages
— Python, OCaml, Go, Rust — but the concepts behind using LLVM to generate code
are the same across the wrapper APIs.</p>
<p>This example sort of skips to the middle phase of compiler construction. Assume
the frontend (lexer, parser, type-checker) has built an <a href="http://en.wikipedia.org/wiki/Abstract_syntax_tree">AST</a> and we’re now
walking it to emit the intermediate representation of the code for the backend
to take and optimize and spit out machine code.</p>
<p>In this case, we’ll just type out the straight-line procedural code for a simple
function that would normally be cobbled together dynamically by an AST-walking
function, calling the LLVM API when it encounters certain nodes in the tree.</p>
<p>For the example, we’ll build a simple adder function, which takes two integers
as arguments and returns their sum, the equivalent of, in C:</p>
<pre><code class="language-c">int sum(int a, int b) {
return a + b;
}
</code></pre>
<p>To be clear about what we are doing here: we are using LLVM to dynamically build
an in-memory representation of this function, using its API to set up things
like function entry and exit, return and parameter types, and the actual integer
add instruction. Once this in-memory representation is complete, we can instruct
LLVM to jump to it and execute it with arguments we supply, just as if it were
an executable we had compiled from a language like C.</p>
<p><a href="https://github.com/paulsmith/getting-started-llvm-c-api/blob/master/sum.c"><strong>Click here to view the final code.</strong></a></p>
<h2>Modules</h2>
<p>The first step is to create a module. A module is a collection of the global
variables, functions, external references, and other data in LLVM. Modules aren’t
quite like, say, modules in Python, in that they don’t provide separate
namespaces. But they are the top-level container for all things built in LLVM,
so we start by creating one.</p>
<pre><code class="language-c">LLVMModuleRef mod = LLVMModuleCreateWithName("my_module");
</code></pre>
<p>The string <code>"my_module"</code> passed to the module factory function is an identifier
of your choosing.</p>
<p>Note that as you’re navigating the <a href="http://llvm.org/docs/doxygen/html/group__LLVMC.html">LLVM C API documentation</a>, different
aspects are grouped together under different header includes. Most of what I’m
detailing here, such as modules and functions, is contained in the <code>Core.h</code>
header, but I’ll include others as we move along.</p>
<h2>Types</h2>
<p>Next, I create the <code>sum</code> function and add it to the module. A function consists of:</p>
<ul>
<li>its return type,</li>
<li>a vector of its parameter types, and</li>
<li>a set of basic blocks.</li>
</ul>
<p>I’ll get to basic blocks in a moment. First, we’ll handle the type and parameter
types of the function — its prototype, in C terms — and add it to the module.</p>
<pre><code class="language-c">LLVMTypeRef param_types[] = { LLVMInt32Type(), LLVMInt32Type() };
LLVMTypeRef ret_type = LLVMFunctionType(LLVMInt32Type(), param_types, 2, 0);
LLVMValueRef sum = LLVMAddFunction(mod, "sum", ret_type);
</code></pre>
<p>LLVM types correspond to the types that are native to the platforms we’re
targeting, such as integers and floats of fixed bit width, pointers, structs,
and arrays. (There’s no platform-dependent <code>int</code> type like in C, where the actual
size of the integer, 32- or 64-bit, depends on the underlying machine
architecture.)</p>
<p>LLVM types have constructors, following the form "LLVM<em>TYPE</em>Type()". In our
example, both the parameters passed to the <code>sum</code> function and its return type
are 32-bit integers, so we use <code>LLVMInt32Type()</code> for each.</p>
<p>The arguments to <code>LLVMFunctionType()</code> are, in order:</p>
<ol>
<li>the function’s return type,</li>
<li>the function’s parameter type vector,</li>
<li>the function’s arity, or parameter count (which should match the number of
types in the vector), and</li>
<li>a boolean indicating whether the function is variadic, i.e., accepts a variable
number of arguments.</li>
</ol>
<p>Notice that the function type constructor returns a type reference. This
reinforces the notion that what we did here is the LLVM equivalent of declaring
a function prototype in C.</p>
<p>The third line adds a function with this type to the module and gives it the
name <code>sum</code>. We get a value reference in return, which can be thought of as
a concrete location in the code (ultimately, memory) upon which to build the
function’s body, which we do below.</p>
<h2>Basic blocks</h2>
<p>The next step is to add a basic block to the function. Basic blocks are stretches
of code with exactly one entry point and one exit point; in other words, execution
can proceed only by stepping through the list of instructions in order. No
if/else, no loops, no jumps of any kind. Basic blocks are the key to
modeling control flow and enabling optimizations later on, so LLVM has
first-class support for adding them to our in-progress module.</p>
<pre><code class="language-c">LLVMBasicBlockRef entry = LLVMAppendBasicBlock(sum, "entry");
</code></pre>
<p>Note the "append" in the name of the function: it’s helpful to think of what
we’re doing as growing a running tally of chunks of code, and so our basic block
is appended relative to the function we added to the module previously.</p>
<h2>Instruction builders</h2>
<p>This notion of a running tally fits with the instruction builder, which is how
we add instructions to our function’s one and only basic block.</p>
<pre><code class="language-c">LLVMBuilderRef builder = LLVMCreateBuilder();
LLVMPositionBuilderAtEnd(builder, entry);
</code></pre>
<p>Similar to appending the basic block to the function, we’re positioning the
builder to start writing instructions where we left off with the entry to the
basic block.</p>
<h3>LLVM IR</h3>
<p>Sidebar: LLVM’s main stock-in-trade is the LLVM intermediate representation, or
IR. I’ve seen it referred to as a midway point between assembly and C. The LLVM
IR is a very strictly defined language that is meant to facilitate the
optimizations and platform portability that LLVM is known for. If you look at
IR, you can see how individual instructions can be translated into the loads,
stores, and jumps of the ultimate assembly that will be generated. The IR has
three representations:</p>
<ul>
<li>as an in-memory set of objects, which is what we’re using in this example,</li>
<li>as a textual language like assembly,</li>
<li>as a string of bytes in a compact binary encoding, called bitcode.</li>
</ul>
<p>You may see clang or other tools emit LLVM IR as text or bitcode.</p>
<p>Back to our example. Now comes the crux of our function, the actual instructions
to add the two integers passed in as arguments and return them to the caller.</p>
<pre><code class="language-c">LLVMValueRef tmp = LLVMBuildAdd(builder, LLVMGetParam(sum, 0), LLVMGetParam(sum, 1), "tmp");
LLVMBuildRet(builder, tmp);
</code></pre>
<p><code>LLVMBuildAdd()</code> takes a reference to the builder, the two integers to add, and
a name to give the result. (The name is required because in LLVM IR every
instruction binds its result to a named value; redundant intermediates can be
simplified or optimized away by LLVM later, but while generating IR, we follow
its strictures.) Since the numbers we wish to add are the arguments that were
supplied to the function by the caller, we can retrieve them in the form of the
function’s parameters using <code>LLVMGetParam()</code>: the second argument is the
index of the parameter we seek from the function.</p>
<p>We call <code>LLVMBuildRet()</code> to generate the return statement and arrange for the
temporary result of the add instruction to be the value returned.</p>
<h2>Analysis &amp; execution</h2>
<p>That concludes the instruction-building phase of creating our function; the
module is now complete. The next phase of the example is setting it up for
execution.</p>
<p>First, let’s verify the module. This will ensure that our module was correctly
built and will abort if we missed or mixed up any steps.</p>
<pre><code class="language-c">char *error = NULL;
LLVMVerifyModule(mod, LLVMAbortProcessAction, &error);
LLVMDisposeMessage(error);
</code></pre>
<p>LLVM provides either a JIT or an interpreter to execute the IR we’ve built. It
will create a JIT if it can for the target platform, and fall back to an
interpreter otherwise. In any case, the thing that will run our code is called
the <em>execution engine</em>.</p>
<pre><code class="language-c">LLVMExecutionEngineRef engine;
error = NULL;
LLVMLinkInJIT();
LLVMInitializeNativeTarget();
if (LLVMCreateExecutionEngineForModule(&engine, mod, &error) != 0) {
fprintf(stderr, "failed to create execution engine\n");
abort();
}
if (error) {
fprintf(stderr, "error: %s\n", error);
LLVMDisposeMessage(error);
exit(EXIT_FAILURE);
}
</code></pre>
<p>We could hard-code some integers to be summed, but it’s easy enough to have our
program receive them from the command line.</p>
<pre><code class="language-c">if (argc < 3) {
fprintf(stderr, "usage: %s x y\n", argv[0]);
exit(EXIT_FAILURE);
}
long long x = strtoll(argv[1], NULL, 10);
long long y = strtoll(argv[2], NULL, 10);
</code></pre>
<p>Now that we have two integers in the representation of our host language, we
need to transform them into the analogous representation in LLVM. LLVM provides
factory functions that convert values into the types we need to pass to our
function:</p>
<pre><code class="language-c">LLVMGenericValueRef args[] = {
LLVMCreateGenericValueOfInt(LLVMInt32Type(), x, 0),
LLVMCreateGenericValueOfInt(LLVMInt32Type(), y, 0)
};
</code></pre>
<p>Now for the moment of truth: we can call our (JIT’d) function!</p>
<pre><code class="language-c">LLVMGenericValueRef res = LLVMRunFunction(engine, sum, 2, args);
</code></pre>
<p>We have a result, but it’s still in LLVM-land. We recover it to a C type, the
reverse operation from above, and print the sum:</p>
<pre><code class="language-c">printf("%d\n", (int)LLVMGenericValueToInt(res, 0));
</code></pre>
<p>And there we have it. We’ve programmatically constructed a function from the
ground up, and had it run directly in machine code native to our platform. There
is much more to LLVM, including control flow (e.g., implementing if/else) and
optimization passes, but we’ve covered the basics that would be in any
LLVM-IR-to-code program.</p>
<h2>Compiling</h2>
<p>In order to compile our program, we need to reference the LLVM includes and link
its libraries. Even though we’ve written a C program, the linking step requires
the C++ linker. (LLVM is a C++ project, and the C API is a wrapper thereof.)</p>
<pre><code class="language-console">$ cc `llvm-config --cflags` -c sum.c
$ c++ `llvm-config --cxxflags --ldflags --libs core executionengine jit interpreter analysis native bitwriter --system-libs` sum.o -o sum
$ ./sum 42 99
141
</code></pre>
<h2>Bitcode</h2>
<p>One final thing. I mentioned previously that LLVM IR has three representations,
including bitcode. Once you have a completed module, you can emit bitcode and
write it out to a file.</p>
<pre><code class="language-c">if (LLVMWriteBitcodeToFile(mod, "sum.bc") != 0) {
fprintf(stderr, "error writing bitcode to file, skipping\n");
}
</code></pre>
<p>From there, you can use tools to manipulate it, like <code>llvm-dis</code> to disassemble the
bitcode into the textual LLVM IR assembly language.</p>
<pre><code class="language-console">$ llvm-dis sum.bc
$ cat sum.ll
; ModuleID = 'sum.bc'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
define i32 @sum(i32, i32) {
entry:
%tmp = add i32 %0, %1
ret i32 %tmp
}
</code></pre>
<h2>Source code of example</h2>
<p>Here is the complete source of the program from above:</p>
<pre><code class="language-c">/**
* LLVM equivalent of:
*
* int sum(int a, int b) {
* return a + b;
* }
*/
#include <llvm-c/Core.h>
#include <llvm-c/ExecutionEngine.h>
#include <llvm-c/Target.h>
#include <llvm-c/Analysis.h>
#include <llvm-c/BitWriter.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char const *argv[]) {
LLVMModuleRef mod = LLVMModuleCreateWithName("my_module");
LLVMTypeRef param_types[] = { LLVMInt32Type(), LLVMInt32Type() };
LLVMTypeRef ret_type = LLVMFunctionType(LLVMInt32Type(), param_types, 2, 0);
LLVMValueRef sum = LLVMAddFunction(mod, "sum", ret_type);
LLVMBasicBlockRef entry = LLVMAppendBasicBlock(sum, "entry");
LLVMBuilderRef builder = LLVMCreateBuilder();
LLVMPositionBuilderAtEnd(builder, entry);
LLVMValueRef tmp = LLVMBuildAdd(builder, LLVMGetParam(sum, 0), LLVMGetParam(sum, 1), "tmp");
LLVMBuildRet(builder, tmp);
char *error = NULL;
LLVMVerifyModule(mod, LLVMAbortProcessAction, &error);
LLVMDisposeMessage(error);
LLVMExecutionEngineRef engine;
error = NULL;
LLVMLinkInJIT();
LLVMInitializeNativeTarget();
if (LLVMCreateExecutionEngineForModule(&engine, mod, &error) != 0) {
fprintf(stderr, "failed to create execution engine\n");
abort();
}
if (error) {
fprintf(stderr, "error: %s\n", error);
LLVMDisposeMessage(error);
exit(EXIT_FAILURE);
}
if (argc < 3) {
fprintf(stderr, "usage: %s x y\n", argv[0]);
exit(EXIT_FAILURE);
}
long long x = strtoll(argv[1], NULL, 10);
long long y = strtoll(argv[2], NULL, 10);
LLVMGenericValueRef args[] = {
LLVMCreateGenericValueOfInt(LLVMInt32Type(), x, 0),
LLVMCreateGenericValueOfInt(LLVMInt32Type(), y, 0)
};
LLVMGenericValueRef res = LLVMRunFunction(engine, sum, 2, args);
printf("%d\n", (int)LLVMGenericValueToInt(res, 0));
// Write out bitcode to file
if (LLVMWriteBitcodeToFile(mod, "sum.bc") != 0) {
fprintf(stderr, "error writing bitcode to file, skipping\n");
}
LLVMDisposeBuilder(builder);
LLVMDisposeExecutionEngine(engine);
}
</code></pre>
<p>See the <a href="https://github.com/paulsmith/getting-started-llvm-c-api">GitHub repo</a> for the Makefile and details on how to build the example
on your machine.</p>