<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><generator>Hugo -- gohugo.io</generator><link href="https://www.joeshaw.org/" rel="alternate" type="text/html"/><link href="https://www.joeshaw.org/atom.xml" rel="self" type="application/atom+xml"/><updated>2023-11-06T18:00:00+00:00</updated><id>https://www.joeshaw.org/</id><title type="html">joe shaw</title><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><icon>https://www.joeshaw.org/favicon.png</icon><entry><title type="html">Error handling in Go HTTP applications</title><link href="https://www.joeshaw.org/error-handling-in-go-http-applications/" rel="alternate" type="text/html" title="Error handling in Go HTTP applications"/><published>2021-05-16T21:00:00-04:00</published><updated>2021-05-16T21:00:00-04:00</updated><id>https://www.joeshaw.org/error-handling-in-go-http-applications/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="go"/><category term="golang"/><category term="errors"/><summary type="html">A scheme for consistent and safe HTTP API errors</summary><content type="html" xml:base="https://www.joeshaw.org/error-handling-in-go-http-applications/"><![CDATA[<p>Nate Finch had a <a href="https://npf.io/2021/04/errorflags/">nice blog post on error flags</a>
recently, and it caused me to think about error handling in my own
greenfield Go project at work.</p>
<p>Much of the Go software I write follows a common pattern: an HTTP JSON
API fronting some business logic, backed by a data store of some sort.
When an error occurs, I typically want to present a context-aware HTTP
status code and a JSON payload containing an error message.  I want
to avoid 400 Bad Request and 500 Internal Server Errors whenever
possible, and I also don&rsquo;t want to expose internal implementation
details or inadvertently leak information to API consumers.</p>
<p>I&rsquo;d like to share the pattern I&rsquo;ve settled on for this type of application.</p>
<h1 id="an-api-safe-error-interface">An API-safe error interface</h1>
<p>First, I define a new interface that will be used throughout the
application for exposing &ldquo;safe&rdquo; errors through the API:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">package</span> app
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">type</span> APIError <span style="color:#6ab825;font-weight:bold">interface</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#999;font-style:italic">// APIError returns an HTTP status code and an API-safe error message.</span>
</span></span><span style="display:flex;"><span>    <span style="color:#447fcf">APIError</span>() (<span style="color:#6ab825;font-weight:bold">int</span>, <span style="color:#6ab825;font-weight:bold">string</span>)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h1 id="common-sentinel-errors">Common sentinel errors</h1>
<p>In practice, most of the time there is a limited set of errors that I
want to return through the API.  Things like a 401 Unauthorized for a
missing or invalid API token, or a 404 Not Found when referring to a
resource that doesn&rsquo;t exist in the data store.  For these I create a
private <code>struct</code> that implements <code>APIError</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">type</span> sentinelAPIError <span style="color:#6ab825;font-weight:bold">struct</span> {
</span></span><span style="display:flex;"><span>    status <span style="color:#6ab825;font-weight:bold">int</span>
</span></span><span style="display:flex;"><span>    msg    <span style="color:#6ab825;font-weight:bold">string</span>
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (e sentinelAPIError) <span style="color:#447fcf">Error</span>() <span style="color:#6ab825;font-weight:bold">string</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> e.msg
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (e sentinelAPIError) <span style="color:#447fcf">APIError</span>() (<span style="color:#6ab825;font-weight:bold">int</span>, <span style="color:#6ab825;font-weight:bold">string</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> e.status, e.msg
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>And then I publicly define common sentinel errors:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">var</span> (
</span></span><span style="display:flex;"><span>    ErrAuth      = &amp;sentinelAPIError{status: http.StatusUnauthorized, msg: <span style="color:#ed9d13">&#34;invalid token&#34;</span>}
</span></span><span style="display:flex;"><span>    ErrNotFound  = &amp;sentinelAPIError{status: http.StatusNotFound, msg: <span style="color:#ed9d13">&#34;not found&#34;</span>}
</span></span><span style="display:flex;"><span>    ErrDuplicate = &amp;sentinelAPIError{status: http.StatusBadRequest, msg: <span style="color:#ed9d13">&#34;duplicate&#34;</span>}
</span></span><span style="display:flex;"><span>)
</span></span></code></pre></div><h1 id="wrapping-sentinels">Wrapping sentinels</h1>
<p>The sentinel errors provide a good foundation for reporting basic
information through the API, but how can I associate real errors with
them?  <code>ErrNoRows</code> from the <code>database/sql</code> package is never going to
implement my <code>APIError</code> interface, but I can leverage the
<a href="https://blog.golang.org/go1.13-errors">error wrapping functionality introduced in Go 1.13</a>.</p>
<p>One of the lesser-known features of error wrapping is the ability to
write a custom <code>Is</code> method on your own types.  This is perhaps because
the implementation is <a href="https://github.com/golang/go/blob/d137b745398e8313c0f086d4d044751295be6163/src/errors/wrap.go#L49-L51">tucked away</a>
within the <code>errors</code> package, and the <a href="https://golang.org/pkg/errors/">package documentation</a>
doesn&rsquo;t give much information about why you&rsquo;d want to use it.  But it&rsquo;s
a perfect fit for these sentinel errors.</p>
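<p>As a rough illustration of that mechanism, here is a minimal, self-contained sketch of <code>errors.Is</code> consulting a custom <code>Is</code> method.  The names <code>ErrTemporary</code>, <code>timeoutError</code>, and <code>isTemporary</code> are made up for this example and are not part of the application described in this post.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTemporary is a hypothetical sentinel used only for this illustration.
var ErrTemporary = errors.New("temporary failure")

// timeoutError is a hypothetical error type.  It is not ErrTemporary and
// does not wrap it, but it reports itself as equivalent via an Is method.
type timeoutError struct {
	op string
}

func (e timeoutError) Error() string { return e.op + ": timed out" }

// Is is consulted by errors.Is for each error in the chain.
func (e timeoutError) Is(target error) bool { return target == ErrTemporary }

// isTemporary reports whether any error in err's chain matches ErrTemporary.
func isTemporary(err error) bool { return errors.Is(err, ErrTemporary) }

func main() {
	err := fmt.Errorf("fetching user: %w", timeoutError{op: "dial"})
	fmt.Println(isTemporary(err))                 // true
	fmt.Println(isTemporary(errors.New("other"))) // false
}
```

<p>The <code>timeoutError</code> value neither equals nor unwraps to <code>ErrTemporary</code>; the match comes entirely from its <code>Is</code> method.</p>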
<p>First, I define a sentinel-wrapped error type:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">type</span> sentinelWrappedError <span style="color:#6ab825;font-weight:bold">struct</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">error</span>
</span></span><span style="display:flex;"><span>    sentinel *sentinelAPIError
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (e sentinelWrappedError) <span style="color:#447fcf">Is</span>(err <span style="color:#6ab825;font-weight:bold">error</span>) <span style="color:#6ab825;font-weight:bold">bool</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> e.sentinel == err
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (e sentinelWrappedError) <span style="color:#447fcf">APIError</span>() (<span style="color:#6ab825;font-weight:bold">int</span>, <span style="color:#6ab825;font-weight:bold">string</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> e.sentinel.<span style="color:#447fcf">APIError</span>()
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This associates an error from elsewhere in the application with one of
my predefined sentinel errors.  A key thing to note here is that
<code>sentinelWrappedError</code> embeds the original error, meaning its <code>Error</code>
method returns the original error&rsquo;s message, while implementing
<code>APIError</code> with the sentinel&rsquo;s API-safe message.  The <code>Is</code> method allows
for comparisons of these wrapping errors with the sentinel errors using
<code>errors.Is</code>.</p>
<p>Then I need a public function to do the wrapping:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">WrapError</span>(err <span style="color:#6ab825;font-weight:bold">error</span>, sentinel *sentinelAPIError) <span style="color:#6ab825;font-weight:bold">error</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> sentinelWrappedError{<span style="color:#6ab825;font-weight:bold">error</span>: err, sentinel: sentinel}
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>(If you wanted to include additional context in the <code>APIError</code>, such as a resource name, this would be a good place to add it.)</p>
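<p>One way that extra context might look is sketched below.  This is only an assumption about how it could be done: <code>resourceWrappedError</code>, <code>WrapResourceError</code>, <code>describe</code>, and <code>errLowLevel</code> are hypothetical names, and the resource name is assumed to be safe to show to API consumers.</p>

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

type sentinelAPIError struct {
	status int
	msg    string
}

func (e sentinelAPIError) Error() string           { return e.msg }
func (e sentinelAPIError) APIError() (int, string) { return e.status, e.msg }

var ErrNotFound = &sentinelAPIError{status: http.StatusNotFound, msg: "not found"}

// resourceWrappedError is a hypothetical variant of sentinelWrappedError
// that also records the name of the resource involved.
type resourceWrappedError struct {
	error
	sentinel *sentinelAPIError
	resource string
}

func (e resourceWrappedError) Is(err error) bool { return e.sentinel == err }

// APIError prefixes the sentinel's API-safe message with the resource name.
func (e resourceWrappedError) APIError() (int, string) {
	status, msg := e.sentinel.APIError()
	return status, e.resource + " " + msg
}

// WrapResourceError is a hypothetical companion to WrapError.
func WrapResourceError(err error, sentinel *sentinelAPIError, resource string) error {
	return resourceWrappedError{error: err, sentinel: sentinel, resource: resource}
}

// errLowLevel stands in for an error returned by a data store.
var errLowLevel = errors.New("sql: no rows in result set")

// describe extracts the API-safe status and message, if err carries one.
func describe(err error) (int, string) {
	var apiErr interface{ APIError() (int, string) }
	if errors.As(err, &apiErr) {
		return apiErr.APIError()
	}
	return http.StatusInternalServerError, "internal error"
}

func main() {
	err := WrapResourceError(errLowLevel, ErrNotFound, "user")
	status, msg := describe(err)
	fmt.Println(status, msg)                 // 404 user not found
	fmt.Println(errors.Is(err, ErrNotFound)) // true
}
```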
<p>When other parts of the application encounter an error, they wrap the
error with one of the sentinel errors.  For example, the database layer
might have its own <code>wrapError</code> function that looks something like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">package</span> db
</span></span><span style="display:flex;"><span>
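</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">import</span> (
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;database/sql&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;errors&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;example.com/app&#34;</span>
</span></span><span style="display:flex;"><span>)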
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">wrapError</span>(err <span style="color:#6ab825;font-weight:bold">error</span>) <span style="color:#6ab825;font-weight:bold">error</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">switch</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">case</span> errors.<span style="color:#447fcf">Is</span>(err, sql.ErrNoRows):
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> app.<span style="color:#447fcf">WrapError</span>(err, app.ErrNotFound)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">case</span> <span style="color:#447fcf">isMySQLError</span>(err, codeDuplicate):
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> app.<span style="color:#447fcf">WrapError</span>(err, app.ErrDuplicate)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">default</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Because the wrapper implements <code>Is</code> against the sentinel, you can
compare errors to sentinels regardless of what the original error is:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span>err := db.<span style="color:#447fcf">DoAThing</span>()
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">switch</span> {
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">case</span> errors.<span style="color:#447fcf">Is</span>(err, ErrNotFound):
</span></span><span style="display:flex;"><span>    <span style="color:#999;font-style:italic">// do something specific for Not Found errors</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">case</span> errors.<span style="color:#447fcf">Is</span>(err, ErrDuplicate):
</span></span><span style="display:flex;"><span>    <span style="color:#999;font-style:italic">// do something specific for Duplicate errors</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h1 id="handling-errors-in-the-api">Handling errors in the API</h1>
<p>The final task is to handle these errors and send them safely back
through the API.  In my <code>api</code> package, I define a helper function that
takes an error and serializes it to JSON:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">package</span> api
</span></span><span style="display:flex;"><span>
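</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">import</span> (
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;errors&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;net/http&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;example.com/app&#34;</span>
</span></span><span style="display:flex;"><span>)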
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">JSONHandleError</span>(w http.ResponseWriter, err <span style="color:#6ab825;font-weight:bold">error</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">var</span> apiErr app.APIError
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> errors.<span style="color:#447fcf">As</span>(err, &amp;apiErr) {
</span></span><span style="display:flex;"><span>        status, msg := apiErr.<span style="color:#447fcf">APIError</span>()
</span></span><span style="display:flex;"><span>        <span style="color:#447fcf">JSONError</span>(w, status, msg)
</span></span><span style="display:flex;"><span>    } <span style="color:#6ab825;font-weight:bold">else</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#447fcf">JSONError</span>(w, http.StatusInternalServerError, <span style="color:#ed9d13">&#34;internal error&#34;</span>)
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>(The elided <code>JSONError</code> function is the one responsible for setting the
HTTP status code and serializing the JSON.)</p>
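<p>For completeness, here is one possible shape for that helper.  It is a sketch rather than the actual code: the <code>{"error": ...}</code> payload is inferred from the example response later in this post, the exact JSON spacing produced by <code>encoding/json</code> differs slightly from it, and <code>record</code> is a hypothetical helper that exists only to exercise the function in memory.</p>

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// JSONError is a hypothetical version of the elided helper.  It sets the
// content type and status code, then writes {"error": msg} as the body.
func JSONError(w http.ResponseWriter, status int, msg string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	// The error from Encode is ignored here; real code may want to log it.
	_ = json.NewEncoder(w).Encode(map[string]string{"error": msg})
}

// record runs JSONError against an in-memory recorder and returns the
// status code, Content-Type header, and body that were written.
func record(status int, msg string) (int, string, string) {
	rec := httptest.NewRecorder()
	JSONError(rec, status, msg)
	return rec.Code, rec.Header().Get("Content-Type"), rec.Body.String()
}

func main() {
	code, ctype, body := record(http.StatusNotFound, "not found")
	fmt.Println(code, ctype) // 404 application/json
	fmt.Print(body)          // {"error":"not found"}
}
```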
<p>Note that this function can take <em>any</em> <code>error</code>.  If it&rsquo;s not an
<code>APIError</code>, it falls back to returning a 500 Internal Server Error.
This makes it safe to pass unwrapped and unexpected errors without
additional care.</p>
<p>Because <code>sentinelWrappedError</code> embeds the original error, you can also
log any error you encounter and get the original error message.  This
can aid debugging.</p>
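<p>The pieces so far can be exercised together in a single file.  The sketch below copies the types from this post into one <code>main</code> package; <code>apiResponse</code> and <code>errRowMissing</code> are hypothetical stand-ins for <code>JSONHandleError</code> and a data store error, used so the example runs without an HTTP server or a database.</p>

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

type APIError interface {
	APIError() (int, string)
}

type sentinelAPIError struct {
	status int
	msg    string
}

func (e sentinelAPIError) Error() string           { return e.msg }
func (e sentinelAPIError) APIError() (int, string) { return e.status, e.msg }

var ErrNotFound = &sentinelAPIError{status: http.StatusNotFound, msg: "not found"}

type sentinelWrappedError struct {
	error
	sentinel *sentinelAPIError
}

func (e sentinelWrappedError) Is(err error) bool       { return e.sentinel == err }
func (e sentinelWrappedError) APIError() (int, string) { return e.sentinel.APIError() }

func WrapError(err error, sentinel *sentinelAPIError) error {
	return sentinelWrappedError{error: err, sentinel: sentinel}
}

// apiResponse mirrors the decision made in JSONHandleError, but returns the
// status and message instead of writing an HTTP response.
func apiResponse(err error) (int, string) {
	var apiErr APIError
	if errors.As(err, &apiErr) {
		return apiErr.APIError()
	}
	return http.StatusInternalServerError, "internal error"
}

// errRowMissing stands in for a low-level error from a data store.
var errRowMissing = errors.New("sql: no rows in result set")

func main() {
	wrapped := WrapError(errRowMissing, ErrNotFound)

	status, msg := apiResponse(wrapped)
	fmt.Println(status, msg) // 404 not found
	fmt.Println(wrapped)     // sql: no rows in result set

	// An unwrapped error falls back to the generic response.
	status, msg = apiResponse(errRowMissing)
	fmt.Println(status, msg) // 500 internal error
}
```

<p>One thing to keep in mind with this sketch: embedding the original error promotes only its <code>Error</code> method, so <code>errors.Is(wrapped, errRowMissing)</code> reports false unless the wrapper also gains an <code>Unwrap</code> method.</p>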
<h1 id="an-example">An example</h1>
<p>Here&rsquo;s an example HTTP handler function that generates an error, logs
it, and returns it to a caller.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">package</span> api
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">exampleHandler</span>(w http.ResponseWriter, r *http.Request) {
</span></span><span style="display:flex;"><span>    <span style="color:#999;font-style:italic">// A contrived example that always returns an error.  Imagine this</span>
</span></span><span style="display:flex;"><span>    <span style="color:#999;font-style:italic">// is actually a function that calls into a data store.</span>
</span></span><span style="display:flex;"><span>    err := app.<span style="color:#447fcf">WrapError</span>(fmt.<span style="color:#447fcf">Errorf</span>(<span style="color:#ed9d13">&#34;user ID %q not found&#34;</span>, <span style="color:#ed9d13">&#34;archer&#34;</span>), app.ErrNotFound)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        log.<span style="color:#447fcf">Printf</span>(<span style="color:#ed9d13">&#34;exampleHandler: error fetching user: %v&#34;</span>, err)
</span></span><span style="display:flex;"><span>        <span style="color:#447fcf">JSONHandleError</span>(w, err)
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#999;font-style:italic">// Happy path elided...</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Hitting this endpoint will give you this HTTP response:</p>
<pre tabindex="0"><code>HTTP/1.1 404 Not Found
Content-Type: application/json

{&#34;error&#34;: &#34;not found&#34;}
</code></pre><p>And this is sent to your logs:</p>
<pre tabindex="0"><code>exampleHandler: error fetching user: user ID &#34;archer&#34; not found
</code></pre><p>If I had forgotten to call <code>app.WrapError</code>, the response instead would
have been:</p>
<pre tabindex="0"><code>HTTP/1.1 500 Internal Server Error
Content-Type: application/json

{&#34;error&#34;: &#34;internal error&#34;}
</code></pre><p>But the message to the logs would have been the same.</p>
<h1 id="impact">Impact</h1>
<p>Adopting this pattern for error handling has reduced the number of error
types and scaffolding in my code &ndash; the same problems that Nate
experienced before adopting his error flags scheme.  It&rsquo;s centralized
the errors I expose to the user, reduced the work to expose appropriate
and consistent error codes and messages to API consumers, and provided an
always-on safe fallback for unexpected errors or programming mistakes.
I hope you can take inspiration to improve the error handling in your
own code.</p>
]]></content></entry><entry><title type="html">Abusing go:linkname to customize TLS 1.3 cipher suites</title><link href="https://www.joeshaw.org/abusing-go-linkname-to-customize-tls13-cipher-suites/" rel="alternate" type="text/html" title="Abusing go:linkname to customize TLS 1.3 cipher suites"/><published>2020-06-11T09:00:00-04:00</published><updated>2021-02-07T15:30:00-05:00</updated><id>https://www.joeshaw.org/abusing-go-linkname-to-customize-tls13-cipher-suites/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="go"/><category term="golang"/><category term="tls"/><summary type="html">Don't do this.</summary><content type="html" xml:base="https://www.joeshaw.org/abusing-go-linkname-to-customize-tls13-cipher-suites/"><![CDATA[<div class="notices note" >This post has been translated into Chinese by a Gopher in Beijing.  <a href="https://studygolang.com/articles/32683">巧用go:linkname 定制 TLS 1.3 加密算法套件</a></div>
<p>When Go 1.12 was released, I was very excited to test out the new opt-in
support for <a href="https://davidwong.fr/tls13/">TLS 1.3</a>.  TLS 1.3 is a major
improvement to the main security protocol of the web.</p>
<p>I was eager to try it out in a tool I had written for work which allowed
me to scan what TLS parameters were supported by a server.  In TLS, the
client presents a set of cipher suites to the server that it supports,
and the server chooses the best one to use, where &ldquo;best&rdquo; is typically a
reasonable trade-off of security and performance.</p>
<p>In order to enumerate what cipher suites a server supports, a client
must make individual connections, each offering a single cipher suite at
a time.  If the server rejects the handshake, you know the cipher suite
is not supported.</p>
<p>For TLS 1.2 and below, this is pretty straightforward:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">supportedTLS12Ciphers</span>(hostname <span style="color:#6ab825;font-weight:bold">string</span>) []<span style="color:#6ab825;font-weight:bold">uint16</span> {
</span></span><span style="display:flex;"><span>	<span style="color:#999;font-style:italic">// Taken from https://golang.org/pkg/crypto/tls/#pkg-constants</span>
</span></span><span style="display:flex;"><span>	<span style="color:#6ab825;font-weight:bold">var</span> allCiphers = []<span style="color:#6ab825;font-weight:bold">uint16</span>{
</span></span><span style="display:flex;"><span>		tls.TLS_RSA_WITH_RC4_128_SHA,
</span></span><span style="display:flex;"><span>		tls.TLS_RSA_WITH_3DES_EDE_CBC_SHA,
</span></span><span style="display:flex;"><span>		tls.TLS_RSA_WITH_AES_128_CBC_SHA,
</span></span><span style="display:flex;"><span>		tls.TLS_RSA_WITH_AES_256_CBC_SHA,
</span></span><span style="display:flex;"><span>		tls.TLS_RSA_WITH_AES_128_CBC_SHA256,
</span></span><span style="display:flex;"><span>		tls.TLS_RSA_WITH_AES_128_GCM_SHA256,
</span></span><span style="display:flex;"><span>		tls.TLS_RSA_WITH_AES_256_GCM_SHA384,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_ECDSA_WITH_RC4_128_SHA,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_RSA_WITH_RC4_128_SHA,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,
</span></span><span style="display:flex;"><span>		tls.TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,
</span></span><span style="display:flex;"><span>	}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>	<span style="color:#6ab825;font-weight:bold">var</span> supportedCiphers []<span style="color:#6ab825;font-weight:bold">uint16</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>	<span style="color:#6ab825;font-weight:bold">for</span> _, c := <span style="color:#6ab825;font-weight:bold">range</span> allCiphers {
</span></span><span style="display:flex;"><span>		cfg := &amp;tls.Config{
</span></span><span style="display:flex;"><span>			ServerName:   hostname,
</span></span><span style="display:flex;"><span>			CipherSuites: []<span style="color:#6ab825;font-weight:bold">uint16</span>{c},
</span></span><span style="display:flex;"><span>			MinVersion:   tls.VersionTLS12,
</span></span><span style="display:flex;"><span>			MaxVersion:   tls.VersionTLS12,
</span></span><span style="display:flex;"><span>		}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>		conn, err := net.<span style="color:#447fcf">Dial</span>(<span style="color:#ed9d13">&#34;tcp&#34;</span>, hostname+<span style="color:#ed9d13">&#34;:443&#34;</span>)
</span></span><span style="display:flex;"><span>		<span style="color:#6ab825;font-weight:bold">if</span> err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>			<span style="color:#24909d">panic</span>(err)
</span></span><span style="display:flex;"><span>		}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>		client := tls.<span style="color:#447fcf">Client</span>(conn, cfg)
</span></span><span style="display:flex;"><span>		client.<span style="color:#447fcf">Handshake</span>()
</span></span><span style="display:flex;"><span>		client.<span style="color:#447fcf">Close</span>()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>		<span style="color:#6ab825;font-weight:bold">if</span> client.<span style="color:#447fcf">ConnectionState</span>().CipherSuite == c {
</span></span><span style="display:flex;"><span>			supportedCiphers = <span style="color:#24909d">append</span>(supportedCiphers, c)
</span></span><span style="display:flex;"><span>		}
</span></span><span style="display:flex;"><span>	}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>	<span style="color:#6ab825;font-weight:bold">return</span> supportedCiphers
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>After writing the barebones code to support TLS 1.3 in the tool, I
discovered something unfortunate: <a href="https://github.com/golang/go/issues/29349#issuecomment-448927334">Go does not allow you to select what
TLS 1.3 cipher suites are sent to the
server.</a>
The rationale makes sense: TLS 1.3 greatly simplified both what is
contained within a cipher suite and how many are supported.  Unless and
until there is a weakness in a TLS 1.3 cipher suite, there&rsquo;s nothing to
be gained in allowing them to be customized.</p>
<p>Still, this tool was one of the rare situations where it makes sense,
and I wanted to see if I could hack it in.  Enter <code>go:linkname</code>.  Buried
deep in Go&rsquo;s <a href="https://golang.org/cmd/compile/">compiler documentation</a>:</p>
<blockquote>
<p><code>//go:linkname localname importpath.name</code></p>
<p>The //go:linkname directive instructs the compiler to use
“importpath.name” as the object file symbol name for the variable or
function declared as “localname” in the source code. Because this
directive can subvert the type system and package modularity, it is only
enabled in files that have imported &ldquo;unsafe&rdquo;.</p></blockquote>
<p>Well hello!  This looks promising.  If there is a function or variable
in Go&rsquo;s standard library that specifies what the list of TLS 1.3 ciphers
are, we can override that in our tool by instructing the Go compiler to
use our local implementation instead of the one in the standard library.</p>
<p>Let&rsquo;s dig into the standard library&rsquo;s TLS 1.3 implementation.  In
<code>crypto/tls/handshake_client.go</code>
[<a href="https://github.com/golang/go/blob/06472b99cdf59f00049f3cd8c9e05ba283cb2c56/src/crypto/tls/handshake_client.go#L121-L122">link</a>],
we have:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">if</span> hello.supportedVersions[<span style="color:#3677a9">0</span>] == VersionTLS13 {
</span></span><span style="display:flex;"><span>	hello.cipherSuites = <span style="color:#24909d">append</span>(hello.cipherSuites, <span style="color:#447fcf">defaultCipherSuitesTLS13</span>()...)
</span></span><span style="display:flex;"><span>	<span style="color:#999;font-style:italic">// ...</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Great!  Let&rsquo;s just override this <code>defaultCipherSuitesTLS13()</code> function.
In <code>crypto/tls/common.go</code>
[<a href="https://github.com/golang/go/blob/a6a7b148f874b32a34e833893971b471cd9cdeb7/src/crypto/tls/common.go#L1120-L1123">link</a>]:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">defaultCipherSuitesTLS13</span>() []<span style="color:#6ab825;font-weight:bold">uint16</span> {
</span></span><span style="display:flex;"><span>	once.<span style="color:#447fcf">Do</span>(initDefaultCipherSuites)
</span></span><span style="display:flex;"><span>	<span style="color:#6ab825;font-weight:bold">return</span> varDefaultCipherSuitesTLS13
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This complicates things a bit.  This calls an initialization function
lazily on first use, and that function manipulates a bunch of internal
default lists beyond just the TLS 1.3 cipher suites list.  We don&rsquo;t want
to mess with any of that.  But in that <code>initDefaultCipherSuites</code>
function, we have this
[<a href="https://github.com/golang/go/blob/a6a7b148f874b32a34e833893971b471cd9cdeb7/src/crypto/tls/common.go#L1150-L1154">link</a>]:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span>varDefaultCipherSuitesTLS13 = []<span style="color:#6ab825;font-weight:bold">uint16</span>{
</span></span><span style="display:flex;"><span>    TLS_AES_128_GCM_SHA256,
</span></span><span style="display:flex;"><span>    TLS_CHACHA20_POLY1305_SHA256,
</span></span><span style="display:flex;"><span>    TLS_AES_256_GCM_SHA384,
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Ah ha!  A package global variable is assigned the cipher suite values.
And because this initialization function is only ever called once, we
can initialize the list and then take control of it in our code.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#999;font-style:italic">// Using go:linkname requires us to import unsafe</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">import</span> (
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;crypto/tls&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;net&#34;</span>
</span></span><span style="display:flex;"><span>    _ <span style="color:#ed9d13">&#34;unsafe&#34;</span>
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// We bring the real defaultCipherSuitesTLS13 function from the</span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// crypto/tls package into our own package.  This lets us perform</span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// that lazy initialization of the cipher list when we want.</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">//go:linkname defaultCipherSuitesTLS13 crypto/tls.defaultCipherSuitesTLS13</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">defaultCipherSuitesTLS13</span>() []<span style="color:#6ab825;font-weight:bold">uint16</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// Next we bring the `varDefaultCipherSuitesTLS13` slice into our</span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// package.  This is what we manipulate to get the cipher suites.</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">//go:linkname varDefaultCipherSuitesTLS13 crypto/tls.varDefaultCipherSuitesTLS13</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">var</span> varDefaultCipherSuitesTLS13 []<span style="color:#6ab825;font-weight:bold">uint16</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// Also keep a variable around for the real default set, so we</span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// can reset it once we&#39;re finished.</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">var</span> realDefaultCipherSuitesTLS13 []<span style="color:#6ab825;font-weight:bold">uint16</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">init</span>() {
</span></span><span style="display:flex;"><span>    <span style="color:#999;font-style:italic">// Initialize the TLS 1.3 ciphersuite set; this populates</span>
</span></span><span style="display:flex;"><span>    <span style="color:#999;font-style:italic">// varDefaultCipherSuitesTLS13 under the covers</span>
</span></span><span style="display:flex;"><span>    realDefaultCipherSuitesTLS13 = <span style="color:#447fcf">defaultCipherSuitesTLS13</span>()
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">supportedTLS13Ciphers</span>(hostname <span style="color:#6ab825;font-weight:bold">string</span>) []<span style="color:#6ab825;font-weight:bold">uint16</span> {
</span></span><span style="display:flex;"><span>	<span style="color:#6ab825;font-weight:bold">var</span> supportedCiphers []<span style="color:#6ab825;font-weight:bold">uint16</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>	<span style="color:#6ab825;font-weight:bold">for</span> _, c := <span style="color:#6ab825;font-weight:bold">range</span> realDefaultCipherSuitesTLS13 {
</span></span><span style="display:flex;"><span>		cfg := &amp;tls.Config{
</span></span><span style="display:flex;"><span>			ServerName: hostname,
</span></span><span style="display:flex;"><span>			MinVersion: tls.VersionTLS13,
</span></span><span style="display:flex;"><span>		}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>		<span style="color:#999;font-style:italic">// Override the internal slice!</span>
</span></span><span style="display:flex;"><span>		varDefaultCipherSuitesTLS13 = []<span style="color:#6ab825;font-weight:bold">uint16</span>{c}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>		conn, err := net.<span style="color:#447fcf">Dial</span>(<span style="color:#ed9d13">&#34;tcp&#34;</span>, hostname+<span style="color:#ed9d13">&#34;:443&#34;</span>)
</span></span><span style="display:flex;"><span>		<span style="color:#6ab825;font-weight:bold">if</span> err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>			<span style="color:#24909d">panic</span>(err)
</span></span><span style="display:flex;"><span>		}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>		client := tls.<span style="color:#447fcf">Client</span>(conn, cfg)
</span></span><span style="display:flex;"><span>		client.<span style="color:#447fcf">Handshake</span>()
</span></span><span style="display:flex;"><span>		client.<span style="color:#447fcf">Close</span>()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>		<span style="color:#6ab825;font-weight:bold">if</span> client.<span style="color:#447fcf">ConnectionState</span>().CipherSuite == c {
</span></span><span style="display:flex;"><span>			supportedCiphers = <span style="color:#24909d">append</span>(supportedCiphers, c)
</span></span><span style="display:flex;"><span>		}
</span></span><span style="display:flex;"><span>	}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>	<span style="color:#999;font-style:italic">// Reset the internal slice back to the full set</span>
</span></span><span style="display:flex;"><span>	varDefaultCipherSuitesTLS13 = realDefaultCipherSuitesTLS13
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>	<span style="color:#6ab825;font-weight:bold">return</span> supportedCiphers
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>As you can see, we used <code>go:linkname</code> to subvert package modularity for
both a function and a variable.  We use a package <code>init</code> function to
populate the default cipher suites list, and then we override it as we
iterate and attempt connections with only a single supported cipher
suite.  Finally, we make sure to clean things up and set the default
list back to the full set for any future uses.</p>
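<p>One fragile spot in the loop above is that a panic partway through would leave the overridden list in place.  A minimal sketch of the save, override, and restore step using <code>defer</code> follows.  It is an illustration only: <code>defaults</code> is a plain package variable standing in for the linknamed <code>varDefaultCipherSuitesTLS13</code>, and <code>withOnly</code> is a hypothetical helper, not part of the original program.</p>

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// defaults stands in for crypto/tls's varDefaultCipherSuitesTLS13.
// It is an ordinary package variable here so the sketch needs no go:linkname.
var defaults = []uint16{
	tls.TLS_AES_128_GCM_SHA256,
	tls.TLS_CHACHA20_POLY1305_SHA256,
	tls.TLS_AES_256_GCM_SHA384,
}

// withOnly (hypothetical) narrows defaults to a single suite while fn runs
// and restores the full list afterwards, even if fn panics.
func withOnly(suite uint16, fn func()) {
	saved := defaults
	defer func() { defaults = saved }()

	defaults = []uint16{suite}
	fn()
}

func main() {
	all := defaults
	for _, c := range all {
		withOnly(c, func() {
			fmt.Printf("probing with only %s (list length %d)\n",
				tls.CipherSuiteName(c), len(defaults))
		})
	}
	fmt.Println("restored list length:", len(defaults))
}
```

<p>In the real program the body passed to <code>withOnly</code> would be the dial and handshake for one cipher suite.</p>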
<p>Lastly, let&rsquo;s glue things together:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">main</span>() {
</span></span><span style="display:flex;"><span>	hostname := os.Args[<span style="color:#3677a9">1</span>]
</span></span><span style="display:flex;"><span>	fmt.<span style="color:#447fcf">Println</span>(<span style="color:#ed9d13">&#34;Supported TLS 1.2 ciphers&#34;</span>)
</span></span><span style="display:flex;"><span>	<span style="color:#6ab825;font-weight:bold">for</span> _, c := <span style="color:#6ab825;font-weight:bold">range</span> <span style="color:#447fcf">supportedTLS12Ciphers</span>(hostname) {
</span></span><span style="display:flex;"><span>		fmt.<span style="color:#447fcf">Printf</span>(<span style="color:#ed9d13">&#34;  %s\n&#34;</span>, tls.<span style="color:#447fcf">CipherSuiteName</span>(c))
</span></span><span style="display:flex;"><span>	}
</span></span><span style="display:flex;"><span>	fmt.<span style="color:#447fcf">Println</span>()
</span></span><span style="display:flex;"><span>	fmt.<span style="color:#447fcf">Println</span>(<span style="color:#ed9d13">&#34;Supported TLS 1.3 ciphers&#34;</span>)
</span></span><span style="display:flex;"><span>	<span style="color:#6ab825;font-weight:bold">for</span> _, c := <span style="color:#6ab825;font-weight:bold">range</span> <span style="color:#447fcf">supportedTLS13Ciphers</span>(hostname) {
</span></span><span style="display:flex;"><span>		fmt.<span style="color:#447fcf">Printf</span>(<span style="color:#ed9d13">&#34;  %s\n&#34;</span>, tls.<span style="color:#447fcf">CipherSuiteName</span>(c))
</span></span><span style="display:flex;"><span>	}
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><pre tabindex="0"><code>$ go run cipherlist.go joeshaw.org
Supported TLS 1.2 ciphers
  TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
  TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384

Supported TLS 1.3 ciphers
  TLS_AES_128_GCM_SHA256
  TLS_CHACHA20_POLY1305_SHA256
  TLS_AES_256_GCM_SHA384
</code></pre><p>There you have it.</p>
<p><code>go:linkname</code> should be used <em>very</em> sparingly.  Consider carefully
whether you must use it, or whether you can solve your problem another
way. For me, the alternative was to import all of <code>crypto/tls</code> to make
some minor edits.  It would also freeze me into a point in time of the
Go TLS stack and put the burden of upgrading onto me.  While I know that
there are no compatibility guarantees with Go&rsquo;s <code>crypto/tls</code> internals,
using <code>go:linkname</code> allows me to use the TLS stack provided by current
and future versions of Go as long as the particular pieces I am using
don&rsquo;t change.  I can live with that.</p>
<p>The full code for this test program lives in <a href="https://github.com/joeshaw/cipherlist">this GitHub
repository</a>.</p>
]]></content></entry><entry><title type="html">Understanding Go panic output</title><link href="https://www.joeshaw.org/understanding-go-panic-output/" rel="alternate" type="text/html" title="Understanding Go panic output"/><published>2017-11-09T23:00:00-05:00</published><updated>2023-11-06T18:00:00+00:00</updated><id>https://www.joeshaw.org/understanding-go-panic-output/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="go"/><category term="golang"/><category term="panic"/><summary type="html">Decoding all those hex values</summary><content type="html" xml:base="https://www.joeshaw.org/understanding-go-panic-output/"><![CDATA[<div class="notices note" ><em>Update November 2023</em>: Go 1.17 and 1.18 improved the panic output from my original post.  See the <a href="#update">update</a> below for details.</div>
<p>My code has a bug. 😭</p>
<pre tabindex="0"><code>panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x751ba4]
goroutine 58 [running]:
github.com/joeshaw/example.UpdateResponse(0xad3c60, 0xc420257300, 0xc4201f4200, 0x16, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /go/src/github.com/joeshaw/example/resp.go:108 +0x144
github.com/joeshaw/example.PrefetchLoop(0xacfd60, 0xc420395480, 0x13a52453c000, 0xad3c60, 0xc420257300)
        /go/src/github.com/joeshaw/example/resp.go:82 +0xc00
created by main.runServer
        /go/src/github.com/joeshaw/example/cmd/server/server.go:100 +0x7e0
</code></pre><p>This panic is caused by dereferencing a nil pointer, as indicated by
the first line of the output.  These types of errors are much less
common in Go than in other languages like C or Java thanks to Go&rsquo;s
idioms around error handling.</p>
<p>If a function <em>could</em> fail, by convention it returns an <code>error</code> as its
last return value.  The caller should immediately check for errors
from that function.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#999;font-style:italic">// val is a pointer, err is an error interface value</span>
</span></span><span style="display:flex;"><span>val, err := <span style="color:#447fcf">somethingThatCouldFail</span>()
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">if</span> err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#999;font-style:italic">// Deal with the error, probably pushing it up the call stack</span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// By convention, nearly all the time, val is guaranteed to not be</span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// nil here.</span>
</span></span></code></pre></div><p>However, there must be a bug somewhere that is violating this implicit
API contract.</p>
<p>Before I go any further, a caveat: this is architecture- and operating
system-dependent stuff, and I am only running this on amd64 Linux and
macOS systems.  Other systems can and will do things differently.</p>
<p>Line two of the panic output gives information about the UNIX signal
that triggered the panic:</p>
<pre><code>[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x751ba4]
</code></pre>
<p>A segmentation fault (<code>SIGSEGV</code>) occurred because of the nil pointer
dereference.  The <code>code</code> field maps to the UNIX <code>siginfo.si_code</code>
field, and a value of <code>0x1</code> is <code>SEGV_MAPERR</code> (&ldquo;address not mapped to
object&rdquo;) in Linux&rsquo;s <code>siginfo.h</code> file.</p>
<p><code>addr</code> maps to <code>siginfo.si_addr</code> and is <code>0x30</code>, which isn&rsquo;t a valid
memory address.</p>
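<p>A small fault address like <code>0x30</code> typically comes from reading a struct field through a nil pointer: the address is zero plus the field&rsquo;s offset.  Here is a minimal sketch of that effect.  The <code>response</code> struct is hypothetical and laid out so that <code>Status</code> sits at offset <code>0x30</code>; the layout of the real <code>Response</code> type is not shown in this post.</p>

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// response is a hypothetical struct: six int64 fields put Status at
// byte offset 48 (0x30).
type response struct {
	a, b, c, d, e, f int64
	Status           int64
}

// statusOffset reports the byte offset of the Status field.
func statusOffset() uintptr {
	return unsafe.Offsetof(response{}.Status)
}

// statusOf reads r.Status and converts a runtime panic (such as a nil
// pointer dereference) into an ordinary error.
func statusOf(r *response) (status int64, err error) {
	defer func() {
		if rec := recover(); rec != nil {
			re, ok := rec.(runtime.Error)
			if !ok {
				panic(rec) // not a runtime error; re-panic
			}
			err = re
		}
	}()
	return r.Status, nil
}

func main() {
	fmt.Printf("offset of Status: %#x\n", statusOffset())

	_, err := statusOf(nil)
	fmt.Println(err)
}
```

<p>Recovering here is only to keep the demonstration running; without the <code>recover</code>, the program would crash with a trace like the one above.</p>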
<p><code>pc</code> is the program counter, and we could use it to figure out where
the program crashed, but we conveniently don&rsquo;t need to because a
goroutine trace follows.</p>
<pre tabindex="0"><code>goroutine 58 [running]:
github.com/joeshaw/example.UpdateResponse(0xad3c60, 0xc420257300, 0xc4201f4200, 0x16, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /go/src/github.com/joeshaw/example/resp.go:108 +0x144
github.com/joeshaw/example.PrefetchLoop(0xacfd60, 0xc420395480, 0x13a52453c000, 0xad3c60, 0xc420257300)
        /go/src/github.com/joeshaw/example/resp.go:82 +0xc00
created by main.runServer
        /go/src/github.com/joeshaw/example/cmd/server/server.go:100 +0x7e0
</code></pre><p>The deepest stack frame, the one where the panic happened, is listed
first.  In this case, <code>resp.go</code> line 108.</p>
<p>The thing that catches my eye in this goroutine backtrace is the
argument lists of the <code>UpdateResponse</code> and <code>PrefetchLoop</code> functions,
because the number of values doesn&rsquo;t match the function signatures.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">UpdateResponse</span>(c Client, id <span style="color:#6ab825;font-weight:bold">string</span>, version <span style="color:#6ab825;font-weight:bold">int</span>, resp *Response, data []<span style="color:#6ab825;font-weight:bold">byte</span>) <span style="color:#6ab825;font-weight:bold">error</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">PrefetchLoop</span>(ctx context.Context, interval time.Duration, c Client)
</span></span></code></pre></div><p><code>UpdateResponse</code> takes 5 arguments, but the panic shows
more than 10 values.  <code>PrefetchLoop</code> takes 3, but the panic shows 5.  What&rsquo;s
going on?</p>
<p>To understand the argument values, we have to understand a little bit
about the data structures underlying Go types.  Russ Cox has two great
blog posts on this, one on <a href="https://research.swtch.com/godata">basic types, structs and pointers,
strings, and slices</a> and another on
<a href="https://research.swtch.com/interfaces">interfaces</a> which describe how
these are laid out in memory.  Both posts are essential reading for Go
programmers, but to summarize:</p>
<ul>
<li>Strings are two words (a pointer to string data and a length)</li>
<li>Slices are three words (a pointer to a backing array, a length, and a capacity)</li>
<li>Interfaces are two words (a pointer to the type and a pointer to the value)</li>
</ul>
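<p>These sizes are easy to check.  The short program below is a sketch that divides <code>unsafe.Sizeof</code> for a few representative types by the size of a machine word; the helper name <code>wordsOf</code> is made up for this example.</p>

```go
package main

import (
	"fmt"
	"unsafe"
)

// wordSize is the size of one machine word (8 bytes on amd64).
const wordSize = unsafe.Sizeof(uintptr(0))

// wordsOf reports how many machine words a few representative types occupy.
func wordsOf() map[string]uintptr {
	var (
		s string
		b []byte
		e error
		p *int
	)
	return map[string]uintptr{
		"string":  unsafe.Sizeof(s) / wordSize,
		"[]byte":  unsafe.Sizeof(b) / wordSize,
		"error":   unsafe.Sizeof(e) / wordSize,
		"pointer": unsafe.Sizeof(p) / wordSize,
	}
}

func main() {
	w := wordsOf()
	for _, name := range []string{"string", "[]byte", "error", "pointer"} {
		fmt.Printf("%-8s %d word(s)\n", name, w[name])
	}
}
```

<p>It reports 2 words for the string, 3 for the slice, 2 for the interface, and 1 for a plain pointer.</p>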
<p>When a panic happens, the arguments we see in the output include the
&ldquo;exploded&rdquo; values of strings, slices, and interfaces.  In addition,
the return values of a function are added onto the end of the argument
list.</p>
<p>To go back to our <code>UpdateResponse</code> function, the <code>Client</code> type is an
interface, which is 2 values.  <code>id</code> is a string, which is 2 values (4
total).  <code>version</code> is an int, 1 value (5).  <code>resp</code> is a pointer, 1
value (6).  <code>data</code> is a slice, 3 values (9).  The <code>error</code> return value
is an interface, so add 2 more for a total of 11.  The panic output
limits the number to 10, so the last value is truncated from the
output.</p>
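<p>The same arithmetic can be done mechanically with <code>reflect</code>.  This is a sketch under a few assumptions: <code>Client</code> and <code>Response</code> are empty stand-ins for the real types (only their kinds matter), <code>traceWords</code> is a hypothetical helper, and the counts hold on a 64-bit platform where <code>int</code> and <code>time.Duration</code> are each one word.</p>

```go
package main

import (
	"context"
	"fmt"
	"reflect"
	"time"
	"unsafe"
)

// Client and Response are hypothetical stand-ins for the real types.
type Client interface{ Do() }
type Response struct{}

func UpdateResponse(c Client, id string, version int, resp *Response, data []byte) error {
	return nil
}

func PrefetchLoop(ctx context.Context, interval time.Duration, c Client) {}

// traceWords counts the words a pre-Go 1.17 stack trace would show for fn:
// every argument, then every return value, flattened into machine words.
func traceWords(fn interface{}) int {
	const word = unsafe.Sizeof(uintptr(0))
	t := reflect.TypeOf(fn)
	n := 0
	for i := 0; i < t.NumIn(); i++ {
		n += int(t.In(i).Size() / word)
	}
	for i := 0; i < t.NumOut(); i++ {
		n += int(t.Out(i).Size() / word)
	}
	return n
}

func main() {
	fmt.Println("UpdateResponse:", traceWords(UpdateResponse))
	fmt.Println("PrefetchLoop:  ", traceWords(PrefetchLoop))
}
```

<p>On amd64 this gives 11 for <code>UpdateResponse</code> and 5 for <code>PrefetchLoop</code>, matching the counts worked out by hand.</p>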
<p>Here is an annotated <code>UpdateResponse</code> stack frame:</p>
<pre tabindex="0"><code>github.com/joeshaw/example.UpdateResponse(
    0xad3c60,      // c Client interface, type pointer
    0xc420257300,  // c Client interface, value pointer
    0xc4201f4200,  // id string, data pointer
    0x16,          // id string, length (0x16 = 22)
    0x1,           // version int (1)
    0x0,           // resp pointer (nil!)
    0x0,           // data slice, backing array pointer (nil)
    0x0,           // data slice, length (0)
    0x0,           // data slice, capacity (0)
    0x0,           // error interface (return value), type pointer
    ...            // truncated; would have been error interface value pointer
)
</code></pre><p>This helps confirm what the source suggested, which is that <code>resp</code> was
<code>nil</code> and being dereferenced.</p>
<p>Moving up one stack frame to <code>PrefetchLoop</code>: <code>ctx context.Context</code> is
an interface value, <code>interval</code> is a <code>time.Duration</code> (which is just an
<code>int64</code>), and <code>Client</code> again is an interface.</p>
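<p>Because a <code>time.Duration</code> is an <code>int64</code> count of nanoseconds, the third hex word in the <code>PrefetchLoop</code> frame can be decoded directly.  A small sketch, where <code>decodeDuration</code> is just a name chosen for this example:</p>

```go
package main

import (
	"fmt"
	"time"
)

// decodeDuration interprets a raw word from the stack trace as a
// time.Duration, which is an int64 count of nanoseconds.
func decodeDuration(word int64) time.Duration {
	return time.Duration(word)
}

func main() {
	// Third word of the PrefetchLoop frame.
	fmt.Println(decodeDuration(0x13a52453c000))

	// Fourth word of the UpdateResponse frame: the length of the id string.
	fmt.Println(0x16)
}
```

<p>The first line prints <code>6h0m0s</code> and the second prints <code>22</code>.</p>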
<p><code>PrefetchLoop</code> annotated:</p>
<pre tabindex="0"><code>github.com/joeshaw/example.PrefetchLoop(
    0xacfd60,       // ctx context.Context interface, type pointer
    0xc420395480,   // ctx context.Context interface, value pointer
    0x13a52453c000, // interval time.Duration (6h0m)
    0xad3c60,       // c Client interface, type pointer
    0xc420257300,   // c Client interface, value pointer
)
</code></pre><p>As I mentioned earlier, it should not have been possible for <code>resp</code> to
be <code>nil</code>, because that should only happen when the returned error is
not <code>nil</code>.  The culprit was in code which was erroneously using the
<code>github.com/pkg/errors</code> <code>Wrapf()</code> function instead of <code>Errorf()</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#999;font-style:italic">// Function returns (*Response, []byte, error)</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">if</span> resp.StatusCode != http.StatusOK {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> <span style="color:#6ab825;font-weight:bold">nil</span>, <span style="color:#6ab825;font-weight:bold">nil</span>, errors.<span style="color:#447fcf">Wrapf</span>(err, <span style="color:#ed9d13">&#34;got status code %d fetching response %s&#34;</span>, resp.StatusCode, url)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><code>Wrapf()</code> returns <code>nil</code> if the error passed into it is <code>nil</code>.  This
function erroneously returned <code>nil, nil, nil</code> when the HTTP status
code was not <code>http.StatusOK</code>, because a non-200 status code is not an
error and thus <code>err</code> was <code>nil</code>.  Replacing the <code>errors.Wrapf()</code> call
with <code>errors.Errorf()</code> fixed the bug.</p>
<p>Understanding and contextualizing panic output can make tracking down
errors much easier!  Hopefully this information will come in handy for
you in the future.</p>
<p>Thanks to Peter Teichman, Damian Gryski, and Travis Bischel who all
helped me decode the panic output argument lists.</p>
<h2 id="update">Update</h2>
<p>From the <a href="https://go.dev/doc/go1.17#compiler">Go 1.17 release notes</a>:</p>
<blockquote>
<p>The format of stack traces from the runtime (printed when an uncaught panic occurs, or when runtime.Stack is called) is improved. Previously, the function arguments were printed as hexadecimal words based on the memory layout. Now each argument in the source code is printed separately, separated by commas. Aggregate-typed (struct, array, string, slice, interface, and complex) arguments are delimited by curly braces. A caveat is that the value of an argument that only lives in a register and is not stored to memory may be inaccurate. Function return values (which were usually inaccurate) are no longer printed.</p></blockquote>
<p>And from the <a href="https://go.dev/doc/go1.18#runtime">1.18 release notes</a>:</p>
<blockquote>
<p>Go 1.17 generally improved the formatting of arguments in stack traces, but could print inaccurate values for arguments passed in registers. This is improved in Go 1.18 by printing a question mark (<code>?</code>) after each value that may be inaccurate.</p></blockquote>
<p>A colleague recently had a crash similar to our example above.  The relevant methods looked like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (s *Service) <span style="color:#447fcf">GetCount</span>(repo <span style="color:#6ab825;font-weight:bold">string</span>) (count <span style="color:#6ab825;font-weight:bold">int64</span>, errors []<span style="color:#6ab825;font-weight:bold">error</span>)
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (s *Service) <span style="color:#447fcf">request</span>(method <span style="color:#6ab825;font-weight:bold">string</span>, url <span style="color:#6ab825;font-weight:bold">string</span>, body []<span style="color:#6ab825;font-weight:bold">byte</span>) (status <span style="color:#6ab825;font-weight:bold">int</span>, response []<span style="color:#6ab825;font-weight:bold">byte</span>, errors []<span style="color:#6ab825;font-weight:bold">error</span>)
</span></span></code></pre></div><p>where <code>s.GetCount(...)</code> calls <code>s.request(...)</code>.</p>
<p>The stack trace looked like this:</p>
<pre tabindex="0"><code>github.com/example/service.(*Service).request(0x0, {0x118368d?, 0xc000cd9b20?}, {0xc000588180?, 0x1?}, {0x0, 0x0, 0x0})
	/go/src/github.com/example/service/service.go:38 +0xdc
github.com/example/service.(*Service).GetCount(0xc000896700?, {0xc00084bed0?, 0x1ba03c0?})
	/go/src/github.com/example/service/service.go:69 +0xdc
</code></pre><p>You can see right away that the new output is a big improvement.  The aggregated types (strings and slices in this example) are grouped together.  Return values are omitted entirely.</p>
<p>Here it is again with my annotations:</p>
<pre tabindex="0"><code>github.com/example/service.(*Service).request(
    0x0,                          // *Service receiver, nil pointer (!)
    {0x118368d?, 0xc000cd9b20?},  // method string: pointer and length
    {0xc000588180?, 0x1?},        // url string: pointer and length
    {0x0, 0x0, 0x0}               // body []byte: pointer, length, capacity
)
	/go/src/github.com/example/service/service.go:38 +0xdc

github.com/example/service.(*Service).GetCount(
    0xc000896700?,                // *Service receiver, pointer
    {0xc00084bed0?, 0x1ba03c0?}   // repo string: pointer and length
)
	/go/src/github.com/example/service/service.go:69 +0xdc
</code></pre><p>Pretty clearly here you can see that the <code>nil</code> <code>*Service</code> receiver in the call to <code>request</code> is the problem.  Something on line 38 is trying to dereference it and causing the crash.</p>
<p>But wait.  <code>GetCount</code> calls <code>request</code> and its receiver is not <code>nil</code> (<code>0x0</code>).  What&rsquo;s going on?</p>
<p>The release notes above say that the stack trace could include &ldquo;inaccurate values for arguments passed in registers&rdquo; and that the runtime signifies this by putting a question mark after such values.</p>
<p><code>GetCount</code> does nothing with the receiver value other than passing it along to the <code>request</code> method.  This means that when <code>GetCount</code> gets the receiver passed in a register, it does not need to store it to memory, and we get the potentially inaccurate value in our stack trace.</p>
<p><code>request</code> <em>does</em> do something with the value &ndash; dereferences it &ndash; requiring it to be stored to memory.  That&rsquo;s why the value is accurate in the <code>request</code> stack frame.</p>
]]></content></entry><entry><title type="html">Testing with os/exec and TestMain</title><link href="https://www.joeshaw.org/testing-with-os-exec-and-testmain/" rel="alternate" type="text/html" title="Testing with os/exec and TestMain"/><published>2017-07-25T08:00:00-04:00</published><updated>2017-07-25T13:00:00-04:00</updated><id>https://www.joeshaw.org/testing-with-os-exec-and-testmain/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="go"/><category term="golang"/><category term="testing"/><summary type="html">One weird trick to make cleaner tests with os/exec</summary><content type="html" xml:base="https://www.joeshaw.org/testing-with-os-exec-and-testmain/"><![CDATA[<p>If you look at the
<a href="https://github.com/golang/go/blob/master/src/os/exec/exec_test.go">tests</a> for
the Go standard library&rsquo;s <code>os/exec</code> package, you&rsquo;ll find a neat trick
for how it tests execution:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">helperCommandContext</span>(t *testing.T, ctx context.Context, s ...<span style="color:#6ab825;font-weight:bold">string</span>) (cmd *exec.Cmd) {
</span></span><span style="display:flex;"><span>    testenv.<span style="color:#447fcf">MustHaveExec</span>(t)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    cs := []<span style="color:#6ab825;font-weight:bold">string</span>{<span style="color:#ed9d13">&#34;-test.run=TestHelperProcess&#34;</span>, <span style="color:#ed9d13">&#34;--&#34;</span>}
</span></span><span style="display:flex;"><span>    cs = <span style="color:#24909d">append</span>(cs, s...)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> ctx != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        cmd = exec.<span style="color:#447fcf">CommandContext</span>(ctx, os.Args[<span style="color:#3677a9">0</span>], cs...)
</span></span><span style="display:flex;"><span>    } <span style="color:#6ab825;font-weight:bold">else</span> {
</span></span><span style="display:flex;"><span>        cmd = exec.<span style="color:#447fcf">Command</span>(os.Args[<span style="color:#3677a9">0</span>], cs...)
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    cmd.Env = []<span style="color:#6ab825;font-weight:bold">string</span>{<span style="color:#ed9d13">&#34;GO_WANT_HELPER_PROCESS=1&#34;</span>}
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> cmd
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// TestHelperProcess isn&#39;t a real test.</span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">//</span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// Some details elided for this blog post.</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">TestHelperProcess</span>(*testing.T) {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> os.<span style="color:#447fcf">Getenv</span>(<span style="color:#ed9d13">&#34;GO_WANT_HELPER_PROCESS&#34;</span>) != <span style="color:#ed9d13">&#34;1&#34;</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">defer</span> os.<span style="color:#447fcf">Exit</span>(<span style="color:#3677a9">0</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    args := os.Args
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">for</span> <span style="color:#24909d">len</span>(args) &gt; <span style="color:#3677a9">0</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">if</span> args[<span style="color:#3677a9">0</span>] == <span style="color:#ed9d13">&#34;--&#34;</span> {
</span></span><span style="display:flex;"><span>            args = args[<span style="color:#3677a9">1</span>:]
</span></span><span style="display:flex;"><span>            <span style="color:#6ab825;font-weight:bold">break</span>
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>        args = args[<span style="color:#3677a9">1</span>:]
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> <span style="color:#24909d">len</span>(args) == <span style="color:#3677a9">0</span> {
</span></span><span style="display:flex;"><span>        fmt.<span style="color:#447fcf">Fprintf</span>(os.Stderr, <span style="color:#ed9d13">&#34;No command\n&#34;</span>)
</span></span><span style="display:flex;"><span>        os.<span style="color:#447fcf">Exit</span>(<span style="color:#3677a9">2</span>)
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    cmd, args := args[<span style="color:#3677a9">0</span>], args[<span style="color:#3677a9">1</span>:]
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">switch</span> cmd {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">case</span> <span style="color:#ed9d13">&#34;echo&#34;</span>:
</span></span><span style="display:flex;"><span>        iargs := []<span style="color:#6ab825;font-weight:bold">interface</span>{}{}
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">for</span> _, s := <span style="color:#6ab825;font-weight:bold">range</span> args {
</span></span><span style="display:flex;"><span>            iargs = <span style="color:#24909d">append</span>(iargs, s)
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>        fmt.<span style="color:#447fcf">Println</span>(iargs...)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#999;font-style:italic">//// etc...</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>When you run <code>go test</code>, under the covers the toolchain compiles your
test code into a temporary binary and runs it.  (As an aside, passing
<code>-x</code> to the <code>go</code> tool is a great way to learn what the toolchain is
actually doing.)</p>
<p>This helper function in <code>exec_test.go</code> sets a <code>GO_WANT_HELPER_PROCESS</code>
environment variable and <em>calls itself</em> with a parameter directing it
to run a specific test, named <code>TestHelperProcess</code>.</p>
<p>Nate Finch <a href="https://npf.io/2015/06/testing-exec-command/">wrote an excellent blog post</a>
in 2015 on this pattern in greater detail, and Mitchell
Hashimoto&rsquo;s <a href="https://youtu.be/8hQG7QlcLBk">2017 GopherCon talk</a> also
makes mention of this trick.</p>
<p>I think this can be improved upon somewhat with the
<a href="https://golang.org/pkg/testing/#hdr-Main"><code>TestMain</code> mechanism</a>
that was added in Go 1.4, however.</p>
<p>Here it is in action:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">package</span> myexec
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">import</span> (
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;fmt&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;os&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;os/exec&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;strings&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#ed9d13">&#34;testing&#34;</span>
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">TestMain</span>(m *testing.M) {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">switch</span> os.<span style="color:#447fcf">Getenv</span>(<span style="color:#ed9d13">&#34;GO_TEST_MODE&#34;</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">case</span> <span style="color:#ed9d13">&#34;&#34;</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#999;font-style:italic">// Normal test mode</span>
</span></span><span style="display:flex;"><span>        os.<span style="color:#447fcf">Exit</span>(m.<span style="color:#447fcf">Run</span>())
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">case</span> <span style="color:#ed9d13">&#34;echo&#34;</span>:
</span></span><span style="display:flex;"><span>        iargs := []<span style="color:#6ab825;font-weight:bold">interface</span>{}{}
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">for</span> _, s := <span style="color:#6ab825;font-weight:bold">range</span> os.Args[<span style="color:#3677a9">1</span>:] {
</span></span><span style="display:flex;"><span>            iargs = <span style="color:#24909d">append</span>(iargs, s)
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>        fmt.<span style="color:#447fcf">Println</span>(iargs...)
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">TestEcho</span>(t *testing.T) {
</span></span><span style="display:flex;"><span>    cmd := exec.<span style="color:#447fcf">Command</span>(os.Args[<span style="color:#3677a9">0</span>], <span style="color:#ed9d13">&#34;hello&#34;</span>, <span style="color:#ed9d13">&#34;world&#34;</span>)
</span></span><span style="display:flex;"><span>    cmd.Env = []<span style="color:#6ab825;font-weight:bold">string</span>{<span style="color:#ed9d13">&#34;GO_TEST_MODE=echo&#34;</span>}
</span></span><span style="display:flex;"><span>    output, err := cmd.<span style="color:#447fcf">Output</span>()
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        t.<span style="color:#447fcf">Errorf</span>(<span style="color:#ed9d13">&#34;echo: %v&#34;</span>, err)
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> g, e := <span style="color:#24909d">string</span>(output), <span style="color:#ed9d13">&#34;hello world\n&#34;</span>; g != e {
</span></span><span style="display:flex;"><span>        t.<span style="color:#447fcf">Errorf</span>(<span style="color:#ed9d13">&#34;echo: want %q, got %q&#34;</span>, e, g)
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>We still set an environment variable and self-execute, but by moving
the dispatching to <code>TestMain</code> we avoid the somewhat-hacky special test
which only runs when a certain environment variable is set, and which
needs to do extra command-line argument handling.</p>
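<p>One detail worth keeping in mind when self-executing: assigning a fresh slice to <code>cmd.Env</code> replaces the child&rsquo;s entire environment.  The sketch below assumes you would rather keep the parent&rsquo;s variables and only add the mode switch.  The <code>selfCommand</code> helper is a hypothetical name used for illustration, not part of any package.</p>

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// selfCommand is a hypothetical helper. It builds a command that re-runs the
// current binary with GO_TEST_MODE set, while keeping the rest of the
// parent's environment intact.
func selfCommand(mode string, args ...string) *exec.Cmd {
	cmd := exec.Command(os.Args[0], args...)
	// os.Environ returns a copy, so appending does not disturb the parent.
	cmd.Env = append(os.Environ(), "GO_TEST_MODE="+mode)
	return cmd
}

func main() {
	cmd := selfCommand("echo", "hello", "world")
	fmt.Println(cmd.Args[1:])            // the arguments the child will see
	fmt.Println(cmd.Env[len(cmd.Env)-1]) // the variable TestMain dispatches on
}
```

<p>In a test you would call <code>cmd.Output()</code> on the result, just as <code>TestEcho</code> does.</p>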
<p><em>Update</em>: Chris Hines wrote about
<a href="http://cs-guy.com/blog/2015/01/test-main/">this and other useful things</a>
you can do with <code>TestMain</code> in a post from 2015 that I did not know
about!</p>
]]></content></entry><entry><title type="html">Don't defer Close() on writable files</title><link href="https://www.joeshaw.org/dont-defer-close-on-writable-files/" rel="alternate" type="text/html" title="Don't defer Close() on writable files"/><published>2017-06-12T08:00:00-04:00</published><updated>2017-06-13T11:20:00-04:00</updated><id>https://www.joeshaw.org/dont-defer-close-on-writable-files/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="go"/><category term="golang"/><category term="defer"/><summary type="html">It'll bite you some day</summary><content type="html" xml:base="https://www.joeshaw.org/dont-defer-close-on-writable-files/"><![CDATA[<div class="notices note" ><em>Update</em>: Another approach suggested by the inimitable
<a href="https://twitter.com/benbjohnson/status/874286396411961345">Ben Johnson</a>
has been added to the end of the post.</div>
<div class="notices note" ><em>Update 2</em>: Discussion about <code>fsync()</code> added to the end of the post.</div>
<p>It&rsquo;s an idiom that quickly becomes rote to Go programmers: whenever
you conjure up a value that implements the <code>io.Closer</code> interface,
after checking for errors you immediately <code>defer</code> its <code>Close()</code>
method.  You see this most often when making HTTP requests:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span>resp, err := http.<span style="color:#447fcf">Get</span>(<span style="color:#ed9d13">&#34;https://joeshaw.org&#34;</span>)
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">if</span> err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">defer</span> resp.Body.<span style="color:#447fcf">Close</span>()
</span></span></code></pre></div><p>or opening files:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span>f, err := os.<span style="color:#447fcf">Open</span>(<span style="color:#ed9d13">&#34;/home/joeshaw/notes.txt&#34;</span>)
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">if</span> err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">defer</span> f.<span style="color:#447fcf">Close</span>()
</span></span></code></pre></div><p>But this idiom is actually harmful for writable files because
deferring a function call ignores its return value, and the <code>Close()</code>
method can return errors.  <strong>For writable files, Go programmers should
avoid the <code>defer</code> idiom or very infrequent, maddening bugs will
occur.</strong></p>
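<p>To make the problem concrete, here is a minimal sketch that needs no real file.  <code>failingCloser</code> is a made-up type standing in for a file whose <code>Close()</code> fails, and the two functions are hypothetical names contrasting the deferred and explicit forms.</p>

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// failingCloser is a hypothetical stand-in for a file whose Close fails.
type failingCloser struct{}

func (failingCloser) Close() error {
	return errors.New("close: input/output error")
}

// deferredClose mirrors the defer idiom: the error from Close is discarded.
func deferredClose(c io.Closer) error {
	defer c.Close()
	return nil
}

// explicitClose returns whatever Close reports.
func explicitClose(c io.Closer) error {
	return c.Close()
}

func main() {
	fmt.Println("deferred:", deferredClose(failingCloser{})) // prints "deferred: <nil>"
	fmt.Println("explicit:", explicitClose(failingCloser{})) // prints "explicit: close: input/output error"
}
```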
<p>Why would you get an error from <code>Close()</code> but not an earlier <code>Write()</code>
call?  To answer that we need to take a brief, high-level detour into
the area of computer architecture.</p>
<p>Generally speaking, as you move outside and away from your CPU,
actions get orders of magnitude slower.  Writing to a CPU register is
very fast.  Accessing system RAM is quite slow in comparison.  Doing
I/O on disks or networks is an eternity.</p>
<p>If every <code>Write()</code> call committed the data to the disk synchronously,
the performance of our systems would be unusably slow.  While
synchronous writes are very important for certain types of software
(like databases), most of the time it&rsquo;s overkill.</p>
<p>The pathological case is writing to a file one byte at a time.  Hard
drives &ndash; brutish, mechanical devices &ndash; need to physically move a
magnetic head to the position on the platter and possibly wait for a
full platter revolution before the data can be persisted.  SSDs,
which store data in blocks and have a finite number of write cycles
for each block, would quickly burn out as blocks are repeatedly
written and overwritten.</p>
<p>Fortunately this doesn&rsquo;t happen because multiple layers within
hardware and software implement caching and write buffering.  When you
call <code>Write()</code>, your data is not immediately being written to media.
The operating system, storage controllers and the media itself are all
buffering the data in order to batch smaller writes together,
organizing the data optimally for storage on the medium, and deciding
when best to commit it.  This turns our writes from slow, blocking
synchronous operations to quick, asynchronous operations that don&rsquo;t
directly touch the much slower I/O device.  Writing a byte at a time
is never the most efficient thing to do, but at least we are not
wearing out our hardware if we do it.</p>
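<p>The buffering described here happens in the kernel and in hardware, where a Go program cannot easily observe it.  As a loose user-space analogy only, <code>bufio.Writer</code> behaves in a similar way: small writes collect in memory and reach the underlying writer later.  The <code>bufferedBytes</code> function below is a hypothetical helper written for this illustration.</p>

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// bufferedBytes writes data one byte at a time through a bufio.Writer and
// reports how many bytes had reached the underlying sink before and after
// the explicit Flush.
func bufferedBytes(data string) (int, int, error) {
	var sink bytes.Buffer
	w := bufio.NewWriter(&sink) // default buffer size is 4096 bytes

	for i := 0; i < len(data); i++ {
		if err := w.WriteByte(data[i]); err != nil {
			return 0, 0, err
		}
	}
	before := sink.Len()

	if err := w.Flush(); err != nil {
		return before, sink.Len(), err
	}
	return before, sink.Len(), nil
}

func main() {
	before, after, err := bufferedBytes("hello world")
	fmt.Println(before, after, err) // prints "0 11 <nil>"
}
```

<p>Every <code>WriteByte</code> call succeeds even though nothing has reached the sink yet, which is the same reason a <code>Write()</code> to a file can succeed before the data is actually stored.</p>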
<p>Of course, the bytes do have to be committed to disk at some point.
The operating system knows that when we close a file, we are finished
with it and no subsequent write operations are going to happen.  It
also knows that closing the file is its last chance to tell us
something went wrong.</p>
<p>On POSIX systems like Linux and macOS, closing a file is handled by
the <code>close</code> system call.  The BSD man page for <code>close(2)</code> talks about
the errors it can return:</p>
<pre tabindex="0"><code>ERRORS
     The close() system call will fail if:

     [EBADF]            fildes is not a valid, active file descriptor.

     [EINTR]            Its execution was interrupted by a signal.

     [EIO]              A previously-uncommitted write(2) encountered an input/output
                        error.
</code></pre><p><code>EIO</code> is exactly the error we are worried about.  It means that we&rsquo;ve
lost data trying to save it to disk, and our Go programs should
absolutely not return a <code>nil</code> error in that case.</p>
<p>The simplest solution is to not use <code>defer</code> when
writing files:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">helloNotes</span>() <span style="color:#6ab825;font-weight:bold">error</span> {
</span></span><span style="display:flex;"><span>    f, err := os.<span style="color:#447fcf">Create</span>(<span style="color:#ed9d13">&#34;/home/joeshaw/notes.txt&#34;</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> _, err = io.<span style="color:#447fcf">WriteString</span>(f, <span style="color:#ed9d13">&#34;hello world&#34;</span>); err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        f.<span style="color:#447fcf">Close</span>()
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> f.<span style="color:#447fcf">Close</span>()
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This does mean additional bookkeeping of the file in the case of
errors: we must explicitly close it in the case where
<code>io.WriteString()</code> fails (and ignore its error, because the write
error takes precedence).  But it&rsquo;s clear, straightforward, and
properly checks the error from the <code>f.Close()</code> call.</p>
<p>There <em>is</em> a way to handle this case with <code>defer</code> by using named
return values and a closure:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">helloNotes</span>() (err <span style="color:#6ab825;font-weight:bold">error</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">var</span> f *os.File
</span></span><span style="display:flex;"><span>    f, err = os.<span style="color:#447fcf">Create</span>(<span style="color:#ed9d13">&#34;/home/joeshaw/notes.txt&#34;</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">defer</span> <span style="color:#6ab825;font-weight:bold">func</span>() {
</span></span><span style="display:flex;"><span>        cerr := f.<span style="color:#447fcf">Close</span>()
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">if</span> err == <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>            err = cerr
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>    }()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    _, err = io.<span style="color:#447fcf">WriteString</span>(f, <span style="color:#ed9d13">&#34;hello world&#34;</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The main benefit of this pattern is that it&rsquo;s not possible to forget
to close the file because the deferred closure always executes.  In
longer functions with more <code>if err != nil</code> conditional branches, this
pattern can also result in fewer lines of code and less repetition.</p>
<p>Still, I find this pattern to be a little too magical.  I dislike
using named return values, and modifying the return value after the
core function finishes is not intuitively clear even to experienced Go
programmers.</p>
<p>I am willing to accept the need to obsessively review code to ensure
that the file is closed in all cases in exchange for more readable and
easily understandable code, and that&rsquo;s the approach I
recommend in code reviews I give to others.</p>
<h2 id="update">Update</h2>
<p>On Twitter, <a href="https://twitter.com/benbjohnson/status/874286396411961345">Ben Johnson
suggested</a>
that <code>Close()</code> may be safe to run multiple times on files, like so:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">doSomething</span>() <span style="color:#6ab825;font-weight:bold">error</span> {
</span></span><span style="display:flex;"><span>    f, err := os.<span style="color:#447fcf">Create</span>(<span style="color:#ed9d13">&#34;foo&#34;</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">defer</span> f.<span style="color:#447fcf">Close</span>()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> _, err := f.<span style="color:#447fcf">Write</span>([]<span style="color:#24909d">byte</span>(<span style="color:#ed9d13">&#34;bar&#34;</span>)); err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> err := f.<span style="color:#447fcf">Close</span>(); err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> <span style="color:#6ab825;font-weight:bold">nil</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><a href="https://gist.github.com/benbjohnson/9eebd201ec096ab6430e1f33411e6427">(gist)</a></p>
<p>The Go docs on <code>io.Closer</code> <a href="https://golang.org/pkg/io/#Closer">explicitly say</a> that
at the interface level, behavior after the first call is unspecified,
but specific implementations may document their own behavior.</p>
<p>The docs for <code>*os.File</code> <a href="https://golang.org/pkg/os/#File.Close">unfortunately aren&rsquo;t clear</a>
on its behavior, saying only, &ldquo;Close closes the File, rendering it
unusable for I/O.  It returns an error, if any.&rdquo;  The implementation as
of 1.8, however, shows:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (f *File) <span style="color:#447fcf">Close</span>() <span style="color:#6ab825;font-weight:bold">error</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> f == <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> ErrInvalid
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> f.file.<span style="color:#24909d">close</span>()
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (file *file) <span style="color:#24909d">close</span>() <span style="color:#6ab825;font-weight:bold">error</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> file == <span style="color:#6ab825;font-weight:bold">nil</span> || file.fd == badFd {
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> syscall.EINVAL
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">var</span> err <span style="color:#6ab825;font-weight:bold">error</span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> e := syscall.<span style="color:#447fcf">Close</span>(file.fd); e != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        err = &amp;PathError{<span style="color:#ed9d13">&#34;close&#34;</span>, file.name, e}
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    file.fd = -<span style="color:#3677a9">1</span> <span style="color:#999;font-style:italic">// so it can&#39;t be closed again</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#999;font-style:italic">// no need for a finalizer anymore</span>
</span></span><span style="display:flex;"><span>    runtime.<span style="color:#447fcf">SetFinalizer</span>(file, <span style="color:#6ab825;font-weight:bold">nil</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>For clarity, <code>badFd</code> is defined as -1, so subsequent attempts to close
an <code>*os.File</code> will do nothing and return <code>syscall.EINVAL</code>.  But since
we are ignoring the error from the <code>defer</code>, this doesn&rsquo;t matter.  It&rsquo;s
not idempotent, exactly, but as Ben put later in the Twitter thread,
it <a href="https://twitter.com/benbjohnson/status/874289044800368640">&ldquo;won&rsquo;t blow shit up if you call it
twice.&rdquo;</a></p>
<p>The implementation is a good, common-sense one and it seems unlikely
to change in the future and cause problems.  But the lack of
documentation about this outcome makes me a little nervous.  Maybe a
doc update to codify this behavior would be a good task for Go 1.10.</p>
<h2 id="update-2">Update 2</h2>
<p>Closing the file is the last chance the OS has to tell us about
problems, but the buffers are not necessarily going to be flushed when
you close the file.  It&rsquo;s entirely possible that flushing the write
buffer to disk will happen <em>after</em> you close the file, and a failure
there cannot be caught.  If this happens, it usually means you have
something seriously wrong, like a failing disk.</p>
<p>However, you can force the write to disk with the <code>Sync()</code> method on
<code>*os.File</code>, which calls the <code>fsync</code> system call.  You should check for
errors from that call, but then I think it&rsquo;s safe to ignore an error
from <code>Close()</code>.  Calling <code>fsync</code> has serious implications on
performance: it&rsquo;s flushing write buffers out to slow disks.  But if
you really, really want the data on disk, the best pattern to follow
is probably:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">helloNotes</span>() <span style="color:#6ab825;font-weight:bold">error</span> {
</span></span><span style="display:flex;"><span>    f, err := os.<span style="color:#447fcf">Create</span>(<span style="color:#ed9d13">&#34;/home/joeshaw/notes.txt&#34;</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">defer</span> f.<span style="color:#447fcf">Close</span>()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> _, err = io.<span style="color:#447fcf">WriteString</span>(f, <span style="color:#ed9d13">&#34;hello world&#34;</span>); err != <span style="color:#6ab825;font-weight:bold">nil</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">return</span> err
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> f.<span style="color:#447fcf">Sync</span>()
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div>]]></content></entry><entry><title type="html">Revisiting context and http.Handler for Go 1.7</title><link href="https://www.joeshaw.org/revisiting-context-and-http-handler-for-go-17/" rel="alternate" type="text/html" title="Revisiting context and http.Handler for Go 1.7"/><published>2016-08-30T16:10:00-04:00</published><updated>2016-08-30T16:10:00-04:00</updated><id>https://www.joeshaw.org/revisiting-context-and-http-handler-for-go-17/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="go"/><category term="golang"/><category term="context"/><summary type="html">I forget what I was told by myself</summary><content type="html" xml:base="https://www.joeshaw.org/revisiting-context-and-http-handler-for-go-17/"><![CDATA[<p>Go 1.7 <a href="https://blog.golang.org/go1.7">was released earlier this
month</a>, and the thing I&rsquo;m most excited
about is the incorporation of the <a href="https://golang.org/pkg/context/"><code>context</code>
package</a> into the Go standard
library.  Previously it lived in the <code>golang.org/x/net/context</code>
package.</p>
<p>With the move, other packages within the standard library can now use
it.  The <a href="https://golang.org/pkg/net/#Dialer.DialContext"><code>net</code> package&rsquo;s
Dialer</a> and <a href="https://golang.org/pkg/os/exec/#CommandContext"><code>os/exec</code>
package&rsquo;s Command</a> can
now utilize contexts for easy cancelation.  More on this can be found
in the <a href="https://golang.org/doc/go1.7#context">Go 1.7 release notes</a>.</p>
<p>Go 1.7 also brings contexts to the <code>net/http</code> package&rsquo;s <a href="https://golang.org/pkg/net/http/#Request.Context"><code>Request</code>
type</a> for both HTTP
clients and servers.  Last year <a href="/net-context-and-http-handler/">I wrote a
post</a> about using <code>context.Context</code>
with <code>http.Handler</code> when it lived outside the standard library, but Go
1.7 makes things much simpler and thankfully renders all of the
approaches from that post obsolete.</p>
<h2 id="a-quick-recap">A quick recap</h2>
<p>I suggest reading <a href="/net-context-and-http-handler/">my original post</a>
for more background, but one of the main uses of <code>context.Context</code> is
to pass around request-scoped data: things like request IDs,
authenticated user information, and other data useful for handlers and
middleware to examine in the scope of a single HTTP request.</p>
<p>In that post I examined three different approaches for incorporating
context into requests.  Since contexts are now attached to
<code>http.Request</code> values, this is no longer necessary.  As long as you&rsquo;re
willing to require at least Go 1.7, it&rsquo;s now possible to use the
standard <code>http.Handler</code> interface and common middleware patterns with
<code>context.Context</code>!</p>
<h2 id="the-new-approach">The new approach</h2>
<p>Recall that the <code>http.Handler</code> interface is defined as:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">type</span> Handler <span style="color:#6ab825;font-weight:bold">interface</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#447fcf">ServeHTTP</span>(ResponseWriter, *Request)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Go 1.7 adds new context-related methods on the <code>*http.Request</code> type.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (r *Request) <span style="color:#447fcf">Context</span>() context.Context
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (r *Request) <span style="color:#447fcf">WithContext</span>(ctx context.Context) *Request
</span></span></code></pre></div><p>The <a href="https://golang.org/pkg/net/http/#Request.Context"><code>Context</code>
method</a> returns the
current context associated with the request.  The <a href="https://golang.org/pkg/net/http/#Request.WithContext"><code>WithContext</code>
method</a> creates
a shallow copy of the <code>Request</code> with its context replaced by the provided one;
the original request is left unmodified.</p>
<p>Suppose we want each request to have an associated ID, pulling it from
the <code>X-Request-ID</code> HTTP header if present, and generating it if not.
We might implement the context functions like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">type</span> key <span style="color:#6ab825;font-weight:bold">int</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">const</span> requestIDKey key = <span style="color:#3677a9">0</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">newContextWithRequestID</span>(ctx context.Context, req *http.Request) context.Context {
</span></span><span style="display:flex;"><span>    reqID := req.Header.<span style="color:#447fcf">Get</span>(<span style="color:#ed9d13">&#34;X-Request-ID&#34;</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> reqID == <span style="color:#ed9d13">&#34;&#34;</span> {
</span></span><span style="display:flex;"><span>        reqID = <span style="color:#447fcf">generateRandomID</span>()
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> context.<span style="color:#447fcf">WithValue</span>(ctx, requestIDKey, reqID)
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">requestIDFromContext</span>(ctx context.Context) <span style="color:#6ab825;font-weight:bold">string</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> ctx.<span style="color:#447fcf">Value</span>(requestIDKey).(<span style="color:#6ab825;font-weight:bold">string</span>)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>We can implement middleware that derives a new context with a request
ID, creates a new <code>Request</code> value from it, and passes it on to the next
handler in the chain.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">middleware</span>(next http.Handler) http.Handler {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> http.<span style="color:#447fcf">HandlerFunc</span>(<span style="color:#6ab825;font-weight:bold">func</span>(rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>        ctx := <span style="color:#447fcf">newContextWithRequestID</span>(req.<span style="color:#447fcf">Context</span>(), req)
</span></span><span style="display:flex;"><span>        next.<span style="color:#447fcf">ServeHTTP</span>(rw, req.<span style="color:#447fcf">WithContext</span>(ctx))
</span></span><span style="display:flex;"><span>    })
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The final handler and any middleware lower in the chain have access to
all of the request-scoped data set by the middleware above them.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">handler</span>(rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>    reqID := <span style="color:#447fcf">requestIDFromContext</span>(req.<span style="color:#447fcf">Context</span>())
</span></span><span style="display:flex;"><span>    fmt.<span style="color:#447fcf">Fprintf</span>(rw, <span style="color:#ed9d13">&#34;Hello request ID %v\n&#34;</span>, reqID)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>And that&rsquo;s it!  It&rsquo;s no longer necessary to implement custom context
handlers, write adapters to standard <code>http.Handler</code> implementations, or
hackily wrap <code>http.ResponseWriter</code>.  Everything you need is in the
standard library, right there on the <code>*http.Request</code> type.</p>
]]></content></entry><entry><title type="html">Smaller Docker containers for Go apps</title><link href="https://www.joeshaw.org/smaller-docker-containers-for-go-apps/" rel="alternate" type="text/html" title="Smaller Docker containers for Go apps"/><published>2015-07-31T13:00:00-05:00</published><updated>2018-01-24T15:15:00-05:00</updated><id>https://www.joeshaw.org/smaller-docker-containers-for-go-apps/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="docker"/><category term="golang"/><category term="go"/><summary type="html">Reducing image sizes by more than 90%</summary><content type="html" xml:base="https://www.joeshaw.org/smaller-docker-containers-for-go-apps/"><![CDATA[<div class="notices note" ><em>Update January 2018</em>: <a href="https://docs.docker.com/engine/userguide/eng-image/multistage-build/">Multi-stage
builds</a>,
which were introduced in Docker 17.05, are an easier way to achieve
the same small Docker images.</div>
<p>At <a href="http://litl.com">litl</a> we use <a href="http://docker.com">Docker</a> images
to package and deploy our <a href="https://roomformore.com">Room for More</a>
services, using our <a href="https://github.com/litl/galaxy">Galaxy</a>
deployment platform.  This week I spent some time looking into how we
might reduce the size of our images and speed up container
deployments.</p>
<p>Most of our services are in Go, and thanks to the fact that compiled
Go binaries are mostly-statically linked by default, it&rsquo;s possible to
create containers with very few files within.  It&rsquo;s surely possible to
use these techniques to create tighter containers for other languages
that need more runtime support, but for this post I&rsquo;m only focusing on
Go apps.</p>
<h2 id="the-old-way">The old way</h2>
<p>We built images in a very traditional way, using a base image built on
top of Ubuntu with Go 1.4.2 installed.  For my examples I&rsquo;ll use
something similar.</p>
<p>Here&rsquo;s a <code>Dockerfile</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-docker" data-lang="docker"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">FROM</span><span style="color:#ed9d13"> golang:1.4.2</span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#6ab825;font-weight:bold">EXPOSE</span><span style="color:#ed9d13"> 1717</span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#6ab825;font-weight:bold">RUN</span> go get github.com/joeshaw/qotd<span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#999;font-style:italic"># Don&#39;t run network servers as root in Docker</span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#6ab825;font-weight:bold">USER</span><span style="color:#ed9d13"> nobody</span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#6ab825;font-weight:bold">CMD</span> qotd<span style="color:#a61717;background-color:#e3d2d2">
</span></span></span></code></pre></div><p>The <code>golang:1.4.2</code> base image is built on top of Debian Jessie.  Let&rsquo;s
build this bad boy and see how big it is.</p>
<pre tabindex="0"><code>$ docker build -t qotd .
...
Successfully built ae761b93e656

$ docker images qotd
REPOSITORY     TAG         IMAGE ID          CREATED           VIRTUAL SIZE
qotd           latest      ae761b93e656      3 minutes ago     520.3 MB
</code></pre><p>Yikes.  Half a gigabyte.  OK, what leads us to a container this size?</p>
<pre tabindex="0"><code>$ docker history qotd
IMAGE               CREATED BY                                      SIZE
ae761b93e656        /bin/sh -c #(nop) CMD [&#34;/bin/sh&#34; &#34;-c&#34; &#34;qotd&#34;]   0 B
b77d0ca3c501        /bin/sh -c #(nop) USER [nobody]                 0 B
a4b2a01d3e42        /bin/sh -c go get github.com/joeshaw/qotd       3.021 MB
c24802660bfa        /bin/sh -c #(nop) EXPOSE 1717/tcp               0 B
124e2127157f        /bin/sh -c #(nop) COPY file:56695ddefe9b0bd83   2.481 kB
69c177f0c117        /bin/sh -c #(nop) WORKDIR /go                   0 B
141b650c3281        /bin/sh -c #(nop) ENV PATH=/go/bin:/usr/src/g   0 B
8fb45e60e014        /bin/sh -c #(nop) ENV GOPATH=/go                0 B
63e9d2557cd7        /bin/sh -c mkdir -p /go/src /go/bin &amp;&amp; chmod    0 B
b279b4aae826        /bin/sh -c #(nop) ENV PATH=/usr/src/go/bin:/u   0 B
d86979befb72        /bin/sh -c cd /usr/src/go/src &amp;&amp; ./make.bash    97.4 MB
8ddc08289e1a        /bin/sh -c curl -sSL https://golang.org/dl/go   39.69 MB
8d38711ccc0d        /bin/sh -c #(nop) ENV GOLANG_VERSION=1.4.2      0 B
0f5121dd42a6        /bin/sh -c apt-get update &amp;&amp; apt-get install    88.32 MB
607e965985c1        /bin/sh -c apt-get update &amp;&amp; apt-get install    122.3 MB
1ff9f26f09fb        /bin/sh -c apt-get update &amp;&amp; apt-get install    44.36 MB
9a61b6b1315e        /bin/sh -c #(nop) CMD [&#34;/bin/bash&#34;]             0 B
902b87aaaec9        /bin/sh -c #(nop) ADD file:e1dd18493a216ecd0c   125.2 MB
</code></pre><p>This is not a very lean container, and it has a lot of intermediate layers.
To reduce the size of our containers, we took two additional steps:</p>
<p>(1) Every repo has a <code>clean.sh</code> script that is run inside the
container after it is initially built.  Here&rsquo;s part of a script for
one of our Ubuntu-based Go images:</p>
<pre tabindex="0"><code>apt-get purge -y software-properties-common byobu curl git htop man unzip vim \
python-dev python-pip python-virtualenv python-dev python-pip python-virtualenv \
python2.7 python2.7 libpython2.7-stdlib:amd64 libpython2.7-minimal:amd64 \
libgcc-4.8-dev:amd64 cpp-4.8 libruby1.9.1 perl-modules vim-runtime \
vim-common vim-tiny libpython3.4-stdlib:amd64 python3.4-minimal xkb-data \
xml-core libx11-data fonts-dejavu-core groff-base eject python3 locales \
python-software-properties supervisor git-core make wget cmake gcc bzr mercurial \
libglib2.0-0:amd64 libxml2:amd64

apt-get clean autoclean
apt-get autoremove -y

rm -rf /usr/local/go
rm -rf /usr/local/go1.*.linux-amd64.tar.gz
rm -rf /var/lib/{apt,dpkg,cache,log}/
rm -rf /var/{cache,log}
</code></pre><p>(2) We run <a href="http://jasonwilder.com/">Jason Wilder</a>&rsquo;s excellent
<a href="http://jasonwilder.com/blog/2014/08/19/squashing-docker-images/">docker-squash</a>
tool.  It is especially helpful when combined with the <code>clean.sh</code>
script above.</p>
<p>These steps are time intensive.  Cleaning and squashing take minutes
and dominate the overall build and deploy time.</p>
<p>In the end, we have built a mostly-statically linked Go binary sitting
alongside an entire Debian or Ubuntu operating system.  We can do
better.</p>
<h2 id="separating-containers-for-building-and-running">Separating containers for building and running</h2>
<p>There have been a handful of good blog posts about how to do this in
the past, including <a href="https://developer.atlassian.com/blog/2015/07/osx-static-golang-binaries-with-docker/">one by
Atlassian</a>
this week.  Here&rsquo;s <a href="http://blog.xebia.com/2014/07/04/create-the-smallest-possible-docker-container/">another one from
Xebia</a>,
and <a href="https://blog.codeship.com/building-minimal-docker-containers-for-go-applications/">another from
Codeship</a>.</p>
<p>However, all these posts focus on building a <em>completely</em> static Go
binary.  That means setting <code>CGO_ENABLED=0</code>, which gives up <code>cgo</code> and
the benefits that go along with it.  On OS X, you lose access to the
system&rsquo;s SSL root CA certificates.  On Linux, <code>user.Current()</code> from
the <code>os/user</code> package no longer works.  And in both cases you must use
the Go DNS resolver rather than the one provided by the operating
system.  If you are not testing your application with <code>CGO_ENABLED=0</code>
prior to building a Docker container with it then <em>you are not testing
the code you ship</em>.</p>
<p>We can use a few purpose-built base Docker images and the tricks from
Jamie McCrindle&rsquo;s
<a href="https://github.com/jamiemccrindle/dockerception">Dockerception</a> to
build two separate Docker containers: one larger container to build
our software and another smaller one to run it.</p>
<h3 id="the-builder">The builder</h3>
<p>We create a <code>Dockerfile.build</code>, which is responsible for initializing
the build environment and building the software:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-docker" data-lang="docker"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">FROM</span><span style="color:#ed9d13"> golang:1.4.2</span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#6ab825;font-weight:bold">RUN</span> go get github.com/joeshaw/qotd<span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#6ab825;font-weight:bold">COPY</span> Dockerfile.run /<span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#999;font-style:italic"># This command outputs a tarball which can be piped into</span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#999;font-style:italic"># `docker build -f Dockerfile.run -`</span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#6ab825;font-weight:bold">CMD</span> tar -cf - -C / Dockerfile.run -C <span style="color:#40ffff">$GOPATH</span>/bin qotd<span style="color:#a61717;background-color:#e3d2d2">
</span></span></span></code></pre></div><p>This container, when run, will write a tarball to standard output,
containing only our <code>qotd</code> binary and the <code>Dockerfile.run</code> used to build
the runner.</p>
<h3 id="dynamically-linked-binary">Dynamically linked binary</h3>
<p>Notice that we did not set <code>CGO_ENABLED=0</code> here, so our binary is
still dynamically linked against GNU <code>libc</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>$ ldd <span style="color:#40ffff">$GOPATH</span>/bin/qotd
</span></span><span style="display:flex;"><span>	linux-vdso.so.1 (0x00007ffea6b8a000)
</span></span><span style="display:flex;"><span>	libpthread.so.0 =&gt; /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f6e76e50000)
</span></span><span style="display:flex;"><span>	libc.so.6 =&gt; /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6e76aa7000)
</span></span><span style="display:flex;"><span>        /lib64/ld-linux-x86-64.so.2 (0x00007f6e7706d000)
</span></span></code></pre></div><p>We need to run this binary in an environment that has <code>glibc</code>
available to us.  That means we cannot use stock BusyBox (which uses
<code>uClibc</code>) or Alpine (which uses <code>musl</code>).  However, the BusyBox
distribution that ships with Ubuntu <em>is</em> linked against <code>glibc</code>, and
that&rsquo;ll be the foundation for our running container.</p>
<p>The <code>busybox:ubuntu-14.04</code> image only has a root user, but you should
never run network-facing servers as root, even in a container.  Use
<a href="https://registry.hub.docker.com/u/joeshaw/busybox-nonroot/">my <code>joeshaw/busybox-nonroot</code>
image</a>
— which adds a <code>nobody</code> user with UID 1 — instead.</p>
<h3 id="the-runner">The runner</h3>
<p>Now we create a <code>Dockerfile.run</code>, which is responsible for creating
the environment in which to run our app:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-docker" data-lang="docker"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">FROM</span><span style="color:#ed9d13"> joeshaw/busybox-nonroot</span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#6ab825;font-weight:bold">EXPOSE</span><span style="color:#ed9d13"> 1717</span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#6ab825;font-weight:bold">COPY</span> qotd /bin/qotd<span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#6ab825;font-weight:bold">USER</span><span style="color:#ed9d13"> nobody</span><span style="color:#a61717;background-color:#e3d2d2">
</span></span></span><span style="display:flex;"><span><span style="color:#a61717;background-color:#e3d2d2"></span><span style="color:#6ab825;font-weight:bold">CMD</span> qotd<span style="color:#a61717;background-color:#e3d2d2">
</span></span></span></code></pre></div><h3 id="putting-them-together">Putting them together</h3>
<p>First, create the builder image:</p>
<pre tabindex="0"><code>docker build -t qotd-builder -f Dockerfile.build .
</code></pre><p>Next, run the builder container, piping its output into the creation
of the runner container:</p>
<pre tabindex="0"><code>docker run --rm qotd-builder | docker build -t qotd -f Dockerfile.run -
</code></pre><p>Now we have a <code>qotd</code> container which has the basic BusyBox
environment, plus our <code>qotd</code> binary.  The size?</p>
<pre tabindex="0"><code>$ docker images qotd
REPOSITORY     TAG         IMAGE ID          CREATED           VIRTUAL SIZE
qotd           latest      92e7def8f105      3 minutes ago     8.611 MB
</code></pre><p>Under 9 MB.  Much improved.  Better still, it doesn&rsquo;t require
squashing, which saves us a lot of time.</p>
<h2 id="conclusion">Conclusion</h2>
<p>In this example, we were able to go from a 500 MB image built from
<code>golang:1.4.2</code> and containing a whole Debian installation down to a 9
MB image of just BusyBox and our binary.  That&rsquo;s a 98% reduction in
size.</p>
<p>For one of our real services at litl, we reduced the image size from
300 MB (squashed) to 25 MB and the time to build and deploy the
container from 8 minutes to 2.  That time is now dominated by building
the container and software, and not by cleaning and squashing the
resulting image.  We didn&rsquo;t have to give up on using <code>cgo</code> and
<code>glibc</code>, as some of their features are essential to us.  If you&rsquo;re using
Docker to deploy services written in Go, this approach can save you a
lot of time and disk space.  Good luck!</p>
]]></content></entry><entry><title type="html">Go's net/context and http.Handler</title><link href="https://www.joeshaw.org/net-context-and-http-handler/" rel="alternate" type="text/html" title="Go's net/context and http.Handler"/><published>2015-05-06T09:38:01-05:00</published><updated>2021-02-07T15:30:00-05:00</updated><id>https://www.joeshaw.org/net-context-and-http-handler/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="go"/><category term="golang"/><category term="context"/><summary type="html">Two great tastes that taste great together</summary><content type="html" xml:base="https://www.joeshaw.org/net-context-and-http-handler/"><![CDATA[<div class="notices note" >The approaches in this post are now obsolete thanks to Go 1.7, which
adds the <code>context</code> package to the standard library and uses it in the
<code>net/http</code> package&rsquo;s <code>*http.Request</code> type.  The background info here may still be
helpful, but <a href="/revisiting-context-and-http-handler-for-go-17/">I wrote a follow-up post that revisits things for Go 1.7
and beyond.</a></div>
<p>The <code>golang.org/x/net/context</code> package (hereafter referred to as
<code>net/context</code>, although it&rsquo;s not yet in the standard library) is a
wonderful tool for the Go programmer&rsquo;s toolkit.  The <a href="https://blog.golang.org/context">blog post that
introduced it</a> shows how useful it is
when dealing with external services and the need to cancel requests,
set deadlines, and send along request-scoped key/value data.</p>
<p>The request-scoped key/value data also makes it very appealing as a
means of passing data around through middleware and handlers in Go web
servers.  Most Go web frameworks have their own concept of context,
although none yet use <code>net/context</code> directly.</p>
<p>Questions about using <code>net/context</code> for this kind of server-side
context keep popping up on the <a href="http://www.reddit.com/r/golang/comments/30ai3b/is_there_an_open_source_complex_real_world_web/">/r/golang
subreddit</a>
and the <a href="http://blog.gopheracademy.com/gophers-slack-community/">Gopher&rsquo;s Slack
community</a>.
Having recently ported a fairly large API surface from Martini to
<code>http.ServeMux</code> and <code>net/context</code>, I hope this post can answer those
questions.</p>
<h2 id="about-httphandler">About <code>http.Handler</code></h2>
<p>The basic unit in Go&rsquo;s HTTP server is its <a href="http://golang.org/pkg/net/http/#Handler"><code>http.Handler</code>
interface</a>, which is defined as:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">type</span> Handler <span style="color:#6ab825;font-weight:bold">interface</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#447fcf">ServeHTTP</span>(ResponseWriter, *Request)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><code>http.ResponseWriter</code> is <a href="http://golang.org/pkg/net/http/#ResponseWriter">another simple
interface</a> and
<code>http.Request</code> is a <code>struct</code> that <a href="http://golang.org/pkg/net/http/#Request">contains data corresponding to the
HTTP request</a>, things like
URL, headers, body if any, etc.</p>
<p>Notably, there&rsquo;s no way to pass anything like a <code>context.Context</code>
here.</p>
<h2 id="about-contextcontext">About <code>context.Context</code></h2>
<p>Much more detail about contexts can be found in the <a href="https://blog.golang.org/context">introductory blog
post</a>, but the main aspect I want to
call attention to in this post is that contexts are derived from other
contexts.  Context values become arranged as a tree, and you only have
access to values set on your context or one of its ancestor nodes.</p>
<p>For example, let&rsquo;s take <code>context.Background()</code> as the root of the
tree, and derive a new context by attaching the content of the
<code>X-Request-ID</code> HTTP header.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">type</span> key <span style="color:#6ab825;font-weight:bold">int</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">const</span> requestIDKey key = <span style="color:#3677a9">0</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">newContextWithRequestID</span>(ctx context.Context, req *http.Request) context.Context {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> context.<span style="color:#447fcf">WithValue</span>(ctx, requestIDKey, req.Header.<span style="color:#447fcf">Get</span>(<span style="color:#ed9d13">&#34;X-Request-ID&#34;</span>))
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">requestIDFromContext</span>(ctx context.Context) <span style="color:#6ab825;font-weight:bold">string</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> ctx.<span style="color:#447fcf">Value</span>(requestIDKey).(<span style="color:#6ab825;font-weight:bold">string</span>)
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>ctx := context.<span style="color:#447fcf">Background</span>()
</span></span><span style="display:flex;"><span>ctx = <span style="color:#447fcf">newContextWithRequestID</span>(ctx, req)
</span></span></code></pre></div><p>This derived context is the one we would then pass to the next layer
of the system.  Perhaps that would create its own contexts with
values, deadlines, or timeouts, or it could extract values we
previously stored.</p>
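<p>The tree behavior can be seen in a small standalone sketch.  It assumes the
standard library <code>context</code> package (Go 1.7 and later) rather than
<code>golang.org/x/net/context</code>; the extra keys <code>userKey</code> and
<code>traceKey</code> and the helpers <code>lookup</code> and <code>buildTree</code> are
hypothetical names used only for illustration.</p>

```go
package main

import (
	"context"
	"fmt"
)

type key int

const (
	requestIDKey key = iota
	userKey  // hypothetical extra key
	traceKey // hypothetical extra key
)

// lookup returns the string stored under k, or "" when k is not set on
// ctx or on any of its ancestors.
func lookup(ctx context.Context, k key) string {
	s, _ := ctx.Value(k).(string)
	return s
}

// buildTree derives one parent context and two sibling children from it.
func buildTree() (parent, left, right context.Context) {
	parent = context.WithValue(context.Background(), requestIDKey, "req-1")
	left = context.WithValue(parent, userKey, "alice")
	right = context.WithValue(parent, traceKey, "trace-9")
	return parent, left, right
}

func main() {
	parent, left, right := buildTree()

	// A child sees its own value and everything set on its ancestors.
	fmt.Println(lookup(left, requestIDKey)) // req-1
	fmt.Println(lookup(left, userKey))      // alice

	// A sibling does not see values set on the other branch.
	fmt.Println(lookup(right, userKey) == "") // true

	// A parent does not see values set on its children.
	fmt.Println(lookup(parent, userKey) == "") // true
}
```

<p>Values flow down the tree only, which is why whichever layer calls the next
one has to hand it the most recently derived context.</p>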
<h2 id="approaches">Approaches</h2>
<div class="notices note" >These approaches are now obsolete as of Go 1.7.  <a href="/revisiting-context-and-http-handler-for-go-17/">Read my follow-up
post that revisits this topic for Go 1.7 and
beyond.</a></div>
<p>So, without direct support for <code>net/context</code> in the standard library,
we have to find another way to get a <code>context.Context</code> into our
handlers.</p>
<p>There are three basic approaches:</p>
<ol>
<li>Use a global request-to-context mapping</li>
<li>Create an <code>http.ResponseWriter</code> wrapper struct</li>
<li>Create your own handler types</li>
</ol>
<p>Let&rsquo;s examine each.</p>
<h2 id="global-request-to-context-mapping">Global request-to-context mapping</h2>
<p>In this approach we create a global map of requests to contexts, and
wrap our handlers in a middleware that handles the lifetime of the
context associated with a request.  This is the approach taken by
<a href="http://www.gorillatoolkit.org/pkg/context">Gorilla&rsquo;s context
package</a>, although with its
own context type rather than <code>net/context</code>.</p>
<p>Because every HTTP request is processed in its own goroutine and Go&rsquo;s
maps are not safe for concurrent access for performance reasons, it is
crucial that we protect all map accesses with a <code>sync.Mutex</code>.  This
also introduces lock contention among concurrently processed requests.
Depending on your application and workload, this could become a
bottleneck.</p>
<p>In general, though, this approach works well for Gorilla&rsquo;s context,
because its context value is simply a map of key/value pairs.  Our
context is arranged like a tree, and it&rsquo;s important that the map
always hold a reference to the leaf.  This places a burden on the
programmer to manually update the pointer&rsquo;s value as new contexts are
derived.</p>
<p>An example usage might look like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">var</span> cmap = <span style="color:#6ab825;font-weight:bold">map</span>[*http.Request]*context.Context{}
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">var</span> cmapLock sync.Mutex
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// Note that we are returning a pointer to the context, not the</span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// context itself.</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">contextFromRequest</span>(req *http.Request) *context.Context {
</span></span><span style="display:flex;"><span>    cmapLock.<span style="color:#447fcf">Lock</span>()
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">defer</span> cmapLock.<span style="color:#447fcf">Unlock</span>()
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> cmap[req]
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#999;font-style:italic">// Necessary wrapper around all handlers.  Must be the first middleware.</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">contextHandler</span>(ctx context.Context, h http.Handler) http.Handler {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> http.<span style="color:#447fcf">HandlerFunc</span>(<span style="color:#6ab825;font-weight:bold">func</span>(rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>        ctx2 := ctx <span style="color:#999;font-style:italic">// make a copy of the root context reference</span>
</span></span><span style="display:flex;"><span>        cmapLock.<span style="color:#447fcf">Lock</span>()
</span></span><span style="display:flex;"><span>        cmap[req] = &amp;ctx2
</span></span><span style="display:flex;"><span>        cmapLock.<span style="color:#447fcf">Unlock</span>()
</span></span><span style="display:flex;"><span>
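</span></span><span style="display:flex;"><span>        <span style="color:#999;font-style:italic">// Clean up in a defer so the entry is removed even if a handler panics.</span>
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">defer</span> <span style="color:#6ab825;font-weight:bold">func</span>() {
</span></span><span style="display:flex;"><span>            cmapLock.<span style="color:#447fcf">Lock</span>()
</span></span><span style="display:flex;"><span>            <span style="color:#24909d">delete</span>(cmap, req)
</span></span><span style="display:flex;"><span>            cmapLock.<span style="color:#447fcf">Unlock</span>()
</span></span><span style="display:flex;"><span>        }()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        h.<span style="color:#447fcf">ServeHTTP</span>(rw, req)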
</span></span><span style="display:flex;"><span>    })
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">middleware</span>(h http.Handler) http.Handler {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> http.<span style="color:#447fcf">HandlerFunc</span>(<span style="color:#6ab825;font-weight:bold">func</span>(rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>        ctxp := <span style="color:#447fcf">contextFromRequest</span>(req)
</span></span><span style="display:flex;"><span>        *ctxp = <span style="color:#447fcf">newContextWithRequestID</span>(*ctxp, req)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        h.<span style="color:#447fcf">ServeHTTP</span>(rw, req)
</span></span><span style="display:flex;"><span>    })
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">handler</span>(rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>    ctxp := <span style="color:#447fcf">contextFromRequest</span>(req)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    reqID := <span style="color:#447fcf">requestIDFromContext</span>(*ctxp)
</span></span><span style="display:flex;"><span>    fmt.<span style="color:#447fcf">Fprintf</span>(rw, <span style="color:#ed9d13">&#34;Hello request ID %s\n&#34;</span>, reqID)
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">main</span>() {
</span></span><span style="display:flex;"><span>    h := <span style="color:#447fcf">contextHandler</span>(context.<span style="color:#447fcf">Background</span>(), <span style="color:#447fcf">middleware</span>(http.<span style="color:#447fcf">HandlerFunc</span>(handler)))
</span></span><span style="display:flex;"><span>    http.<span style="color:#447fcf">ListenAndServe</span>(<span style="color:#ed9d13">&#34;:8080&#34;</span>, h)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Dereferencing the context pointer and updating it by hand is ugly,
tedious and error-prone, which is why I don&rsquo;t recommend this approach.</p>
<div class="notices note" ><strong>Update</strong>: Good question <a href="http://www.reddit.com/r/golang/comments/352rop/gos_netcontext_and_httphandler/cr0hcve">on
Reddit</a>
asking why I use a pointer to a <code>context.Context</code> here.  It&rsquo;s not
necessary, but if you don&rsquo;t use a pointer you must modify the
underlying map any time you derive a new context.  Doing so greatly
increases the lock contention problem, because you must now lock
around the map any time you update the context for a request.</div>
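<p>To make that trade-off concrete, here is a rough sketch of the
non-pointer variant.  It is only an illustration: <code>setContextForRequest</code>
and <code>keyType</code> are hypothetical names, and the standard library&rsquo;s
<code>context</code> package (Go 1.7 and later) stands in for <code>net/context</code>
so the snippet is self-contained.</p>
<pre><code class="language-go" data-lang="go">package main

import (
    &#34;context&#34;
    &#34;fmt&#34;
    &#34;net/http&#34;
    &#34;sync&#34;
)

// Without a pointer, the map stores context values directly.
var cmap = map[*http.Request]context.Context{}
var cmapLock sync.Mutex

func contextFromRequest(req *http.Request) context.Context {
    cmapLock.Lock()
    defer cmapLock.Unlock()
    return cmap[req]
}

// setContextForRequest (hypothetical) has to take the global lock every
// time a middleware derives a new context.
func setContextForRequest(req *http.Request, ctx context.Context) {
    cmapLock.Lock()
    defer cmapLock.Unlock()
    cmap[req] = ctx
}

// keyType is a hypothetical key type standing in for the request ID helpers.
type keyType string

func main() {
    // Deleting the map entry when the request finishes is omitted for brevity.
    req, _ := http.NewRequest(&#34;GET&#34;, &#34;http://example.com/&#34;, nil)
    setContextForRequest(req, context.Background())

    // Deriving a context now costs a locked read plus a locked write.
    ctx := context.WithValue(contextFromRequest(req), keyType(&#34;id&#34;), &#34;abc123&#34;)
    setContextForRequest(req, ctx)

    fmt.Println(contextFromRequest(req).Value(keyType(&#34;id&#34;)))
}
</code></pre>
<p>Every derivation goes through <code>cmapLock</code> twice, which is the extra
contention the pointer version avoids.</p>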
<h2 id="httpresponsewriter-wrapper-struct"><code>http.ResponseWriter</code> wrapper struct</h2>
<p>In this approach we create a new struct type that embeds an existing
<code>http.ResponseWriter</code> and attaches additional functionality to it.
This approach is often used by Go web frameworks to do things like
capturing the status code for the purpose of logging it later.  Like
the first approach, you&rsquo;ll need to wrap handlers in a middleware that
wraps the <code>http.ResponseWriter</code> and passes it into subsequent
middleware and your handler.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">type</span> contextResponseWriter <span style="color:#6ab825;font-weight:bold">struct</span> {
</span></span><span style="display:flex;"><span>    http.ResponseWriter
</span></span><span style="display:flex;"><span>    ctx context.Context
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">contextHandler</span>(ctx context.Context, h http.Handler) http.Handler {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> http.<span style="color:#447fcf">HandlerFunc</span>(<span style="color:#6ab825;font-weight:bold">func</span>(rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>        crw := &amp;contextResponseWriter{rw, ctx}
</span></span><span style="display:flex;"><span>        h.<span style="color:#447fcf">ServeHTTP</span>(crw, req)
</span></span><span style="display:flex;"><span>    })
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">middleware</span>(h http.Handler) http.Handler {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> http.<span style="color:#447fcf">HandlerFunc</span>(<span style="color:#6ab825;font-weight:bold">func</span>(rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>        crw := rw.(*contextResponseWriter)
</span></span><span style="display:flex;"><span>        crw.ctx = <span style="color:#447fcf">newContextWithRequestID</span>(crw.ctx, req)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        h.<span style="color:#447fcf">ServeHTTP</span>(rw, req)
</span></span><span style="display:flex;"><span>    })
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">handler</span>(rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>    crw := rw.(*contextResponseWriter)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    reqID := <span style="color:#447fcf">requestIDFromContext</span>(crw.ctx)
</span></span><span style="display:flex;"><span>    fmt.<span style="color:#447fcf">Fprintf</span>(rw, <span style="color:#ed9d13">&#34;Hello request ID %s\n&#34;</span>, reqID)
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">main</span>() {
</span></span><span style="display:flex;"><span>    h := <span style="color:#447fcf">contextHandler</span>(context.<span style="color:#447fcf">Background</span>(), <span style="color:#447fcf">middleware</span>(http.<span style="color:#447fcf">HandlerFunc</span>(handler)))
</span></span><span style="display:flex;"><span>    http.<span style="color:#447fcf">ListenAndServe</span>(<span style="color:#ed9d13">&#34;:8080&#34;</span>, h)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This approach just feels dirty to me.  The context is associated with
the request, not the response, so sticking it on <code>http.ResponseWriter</code>
feels out of place.  The <code>ResponseWriter</code>&rsquo;s purpose is simply to give
handlers a way to write data to the output socket.</p>
<p>Piggybacking on <code>http.ResponseWriter</code> requires a type assertion to
your wrapper struct type before you can access the context.  The
details of this can be hidden away in a safe helper function, but it
doesn&rsquo;t hide the fact that the runtime assertion is necessary.</p>
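<p>Such a helper might look something like the sketch below.
<code>contextFromResponseWriter</code> is a hypothetical name, and
<code>httptest.ResponseRecorder</code> plus the standard library&rsquo;s
<code>context</code> package are used only to keep the snippet self-contained.</p>
<pre><code class="language-go" data-lang="go">package main

import (
    &#34;context&#34;
    &#34;fmt&#34;
    &#34;net/http&#34;
    &#34;net/http/httptest&#34;
)

type contextResponseWriter struct {
    http.ResponseWriter
    ctx context.Context
}

// contextFromResponseWriter (hypothetical helper) uses the two-value
// form of the type assertion, so an unwrapped ResponseWriter yields
// ok == false instead of a panic.
func contextFromResponseWriter(rw http.ResponseWriter) (context.Context, bool) {
    crw, ok := rw.(*contextResponseWriter)
    if !ok {
        return nil, false
    }
    return crw.ctx, true
}

func main() {
    rec := httptest.NewRecorder()

    // Not wrapped: the helper reports failure rather than panicking.
    _, ok := contextFromResponseWriter(rec)
    fmt.Println(ok)

    // Wrapped: the context is available.
    crw := &amp;contextResponseWriter{rec, context.Background()}
    _, ok = contextFromResponseWriter(crw)
    fmt.Println(ok)
}
</code></pre>
<p>The caller still has to decide what to do when <code>ok</code> is false, so
the runtime check has moved rather than gone away.</p>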
<p>There is also another hidden downside.  There is a concrete value
(with a type internal to package <code>net/http</code>) underlying the
<code>http.ResponseWriter</code> that is passed into your handler.  That value
also implements additional interfaces from the <code>net/http</code> package.  If
you simply wrap <code>http.ResponseWriter</code>, your wrapper will not be
implementing these additional interfaces.</p>
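<p>A small experiment shows the effect.  This is only a sketch:
<code>wrapper</code> and <code>isFlusher</code> are hypothetical names, and
<code>httptest.ResponseRecorder</code>, which also implements
<code>http.Flusher</code>, stands in for the internal <code>net/http</code> type.</p>
<pre><code class="language-go" data-lang="go">package main

import (
    &#34;fmt&#34;
    &#34;net/http&#34;
    &#34;net/http/httptest&#34;
)

// wrapper embeds only the http.ResponseWriter interface, so it promotes
// just Header, Write, and WriteHeader.
type wrapper struct {
    http.ResponseWriter
}

// isFlusher reports whether rw also implements http.Flusher.
func isFlusher(rw http.ResponseWriter) bool {
    _, ok := rw.(http.Flusher)
    return ok
}

func main() {
    rec := httptest.NewRecorder()

    fmt.Println(isFlusher(rec))          // the underlying value supports Flush
    fmt.Println(isFlusher(wrapper{rec})) // the wrapper hides it
}
</code></pre>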
<p>You must implement these interfaces with wrapper functions if you hope
to match the base <code>http.ResponseWriter</code>&rsquo;s functionality.  In some
cases, like <code>http.Flusher</code>, this is easy with a simple conditional
type assertion:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (crw *contextResponseWriter) <span style="color:#447fcf">Flush</span>() {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">if</span> f, ok := crw.ResponseWriter.(http.Flusher); ok {
</span></span><span style="display:flex;"><span>        f.<span style="color:#447fcf">Flush</span>()
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>However, <code>http.CloseNotifier</code> is quite a bit harder.  <a href="http://golang.org/pkg/net/http/#CloseNotifier">Its
definition</a> contains a
method that returns a <code>&lt;-chan bool</code>.  That channel has certain
semantics that existing code likely depends
upon<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>.  We have a couple of different options
here, none of them good:</p>
<ul>
<li>
<p>Ignore the interface and don&rsquo;t implement it, making the
functionality unavailable even if the underlying
<code>http.ResponseWriter</code> supports it.</p>
</li>
<li>
<p>Implement the interface and delegate to the underlying implementation.
But what if the underlying <code>http.ResponseWriter</code> does not support
this interface?  We can&rsquo;t guarantee the proper semantics of the API.</p>
</li>
</ul>
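<p>The second option could be sketched roughly as follows.  The fallback
channel is my own assumption about what a wrapper might do, not an
established pattern, and note that <code>http.CloseNotifier</code> has since been
deprecated in favor of <code>Request.Context</code>.</p>
<pre><code class="language-go" data-lang="go">package main

import (
    &#34;fmt&#34;
    &#34;net/http&#34;
    &#34;net/http/httptest&#34;
)

type contextResponseWriter struct {
    http.ResponseWriter
}

// CloseNotify delegates when the wrapped value supports it.  Otherwise
// it returns a channel that never receives a value, which silently
// changes the semantics callers may rely on.
func (crw *contextResponseWriter) CloseNotify() &lt;-chan bool {
    if cn, ok := crw.ResponseWriter.(http.CloseNotifier); ok {
        return cn.CloseNotify()
    }
    return make(chan bool)
}

func main() {
    // httptest.ResponseRecorder does not implement http.CloseNotifier,
    // yet the wrapper now claims to.
    rec := httptest.NewRecorder()
    var rw http.ResponseWriter = &amp;contextResponseWriter{rec}

    _, ok := rw.(http.CloseNotifier)
    fmt.Println(ok)
}
</code></pre>
<p>Code that checks for the interface gets a positive answer, but
against this recorder it will wait on a notification that never arrives.</p>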
<p>These are just two interfaces that the standard library implements
today.  This approach is not future-proof, because additional
interfaces may be added to the standard library and implemented
internally within <code>net/http</code>.</p>
<p>I don&rsquo;t recommend this approach because of the interface issue, but if
you&rsquo;re ok with ignoring them, this is probably the simplest to
implement.</p>
<h2 id="custom-context-handler-types">Custom context handler types</h2>
<p>In this approach, we eschew <code>http.Handler</code> for a new type of our own
creation.  This has obvious downsides: you cannot use existing <em>de
facto</em> middleware or handlers without wrappers.  Ultimately, though, I
think this is the cleanest way to pass a <code>context.Context</code> around.</p>
<p>Let&rsquo;s create a new <code>ContextHandler</code> type, following the model of
<code>http.Handler</code>.  We&rsquo;ll also create an analog to <code>http.HandlerFunc</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">type</span> ContextHandler <span style="color:#6ab825;font-weight:bold">interface</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#447fcf">ServeHTTPContext</span>(context.Context, http.ResponseWriter, *http.Request)
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">type</span> ContextHandlerFunc <span style="color:#6ab825;font-weight:bold">func</span>(context.Context, http.ResponseWriter, *http.Request)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (h ContextHandlerFunc) <span style="color:#447fcf">ServeHTTPContext</span>(ctx context.Context, rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>    <span style="color:#447fcf">h</span>(ctx, rw, req)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Middleware can now derive new contexts from the one passed to the
handler, and pass them on to the next middleware or handler in the
chain.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">middleware</span>(h ContextHandler) ContextHandler {
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">return</span> <span style="color:#447fcf">ContextHandlerFunc</span>(<span style="color:#6ab825;font-weight:bold">func</span>(ctx context.Context, rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>        ctx = <span style="color:#447fcf">newContextWithRequestID</span>(ctx, req)
</span></span><span style="display:flex;"><span>        h.<span style="color:#447fcf">ServeHTTPContext</span>(ctx, rw, req)
</span></span><span style="display:flex;"><span>    })
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The final context handler has access to all of the request data set by
middleware above it.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">handler</span>(ctx context.Context, rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>    reqID := <span style="color:#447fcf">requestIDFromContext</span>(ctx)
</span></span><span style="display:flex;"><span>    fmt.<span style="color:#447fcf">Fprintf</span>(rw, <span style="color:#ed9d13">&#34;Hello request ID %s\n&#34;</span>, reqID)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The last trick is converting our <code>ContextHandler</code> into something that
is <code>http.Handler</code> compatible, so we can use it anywhere standard
handlers are used.</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">type</span> ContextAdapter <span style="color:#6ab825;font-weight:bold">struct</span>{
</span></span><span style="display:flex;"><span>    ctx context.Context
</span></span><span style="display:flex;"><span>    handler ContextHandler
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> (ca *ContextAdapter) <span style="color:#447fcf">ServeHTTP</span>(rw http.ResponseWriter, req *http.Request) {
</span></span><span style="display:flex;"><span>    ca.handler.<span style="color:#447fcf">ServeHTTPContext</span>(ca.ctx, rw, req)
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">func</span> <span style="color:#447fcf">main</span>() {
</span></span><span style="display:flex;"><span>    h := &amp;ContextAdapter{
</span></span><span style="display:flex;"><span>        ctx: context.<span style="color:#447fcf">Background</span>(),
</span></span><span style="display:flex;"><span>        handler: <span style="color:#447fcf">middleware</span>(<span style="color:#447fcf">ContextHandlerFunc</span>(handler)),
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    http.<span style="color:#447fcf">ListenAndServe</span>(<span style="color:#ed9d13">&#34;:8080&#34;</span>, h)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The <code>ContextAdapter</code> type also allows us to use existing
<code>http.Handler</code> middleware, as long as they run before it does.
Existing logging, panic recovery, and form validation middleware
should all continue to work great with our context handlers plus our
adapter.</p>
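<p>As a rough illustration, here is an ordinary <code>http.Handler</code>
middleware wrapping the adapter.  <code>serverHeader</code>, <code>hello</code>, and
the <code>X-Served-By</code> header are hypothetical stand-ins for whatever
existing middleware you already use, and the standard library&rsquo;s
<code>context</code> package replaces <code>net/context</code> to keep the snippet
self-contained.</p>
<pre><code class="language-go" data-lang="go">package main

import (
    &#34;context&#34;
    &#34;fmt&#34;
    &#34;net/http&#34;
    &#34;net/http/httptest&#34;
)

type ContextHandler interface {
    ServeHTTPContext(context.Context, http.ResponseWriter, *http.Request)
}

type ContextHandlerFunc func(context.Context, http.ResponseWriter, *http.Request)

func (h ContextHandlerFunc) ServeHTTPContext(ctx context.Context, rw http.ResponseWriter, req *http.Request) {
    h(ctx, rw, req)
}

type ContextAdapter struct {
    ctx     context.Context
    handler ContextHandler
}

func (ca *ContextAdapter) ServeHTTP(rw http.ResponseWriter, req *http.Request) {
    ca.handler.ServeHTTPContext(ca.ctx, rw, req)
}

// serverHeader is a hypothetical stand-in for any existing
// http.Handler middleware.
func serverHeader(h http.Handler) http.Handler {
    return http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
        rw.Header().Set(&#34;X-Served-By&#34;, &#34;example&#34;)
        h.ServeHTTP(rw, req)
    })
}

func hello(ctx context.Context, rw http.ResponseWriter, req *http.Request) {
    fmt.Fprintln(rw, &#34;Hello&#34;)
}

func main() {
    adapter := &amp;ContextAdapter{ctx: context.Background(), handler: ContextHandlerFunc(hello)}
    h := serverHeader(adapter) // standard middleware runs before the adapter

    // Exercise the chain in-process instead of starting a server.
    rec := httptest.NewRecorder()
    h.ServeHTTP(rec, httptest.NewRequest(&#34;GET&#34;, &#34;/&#34;, nil))
    fmt.Println(rec.Header().Get(&#34;X-Served-By&#34;), rec.Body.String())
}
</code></pre>
<p>Because the adapter is itself an <code>http.Handler</code>, nothing about the
outer middleware needs to know that contexts exist.</p>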
<p>This is my preferred method for integrating <code>net/context</code> with my
server.  I recently converted an approximately 30-route server from
Martini to this method, and things are working great.  The code is
much cleaner, easier to follow, and performs better.  This API service
does both HTTP basic and OAuth authentication, passing along client
and user information via contexts.  It extracts request IDs that are
passed across to other services via contexts.  Context-aware
middleware handles setting CORS headers, handling <code>OPTIONS</code> requests,
recovering from panics by returning JSON-encoded errors, logging
request and response info, and recording statsd metrics.</p>
<p>Give it a try and let me know <a href="https://twitter.com/joeshaw">on
Twitter</a> how it goes.  Maybe it&rsquo;ll become
the foundation for the next Go web framework. 😀</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>&ldquo;CloseNotify returns a channel that receives a single value when the client connection has gone away&rdquo;&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Contributing to GitHub projects</title><link href="https://www.joeshaw.org/contributing-to-github-projects/" rel="alternate" type="text/html" title="Contributing to GitHub projects"/><published>2015-04-20T11:20:28-04:00</published><updated>2015-04-20T11:20:28-04:00</updated><id>https://www.joeshaw.org/contributing-to-github-projects/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="github"/><summary type="html">A beginner's guide to pull requests</summary><content type="html" xml:base="https://www.joeshaw.org/contributing-to-github-projects/"><![CDATA[<p>I often see people asking how to contribute to an open source project
on GitHub.  Some are new programmers, some may be new to open source,
others aren&rsquo;t programmers but want to make improvements to
documentation or other parts of a project they use every day.</p>
<p>Using GitHub means you&rsquo;ll need to use Git, and that means using the
command line.  This post gives a gentle introduction using the <code>git</code>
command-line tool and a <a href="https://github.com/github/hub">companion tool for GitHub called
<code>hub</code></a>.</p>
<h2 id="workflow">Workflow</h2>
<p>The basic workflow for contributing to a project on GitHub is:</p>
<ol>
<li>Clone the project you want to work on</li>
<li>Fork the project you want to work on</li>
<li>Create a feature branch to do your own work in</li>
<li>Commit your changes to your feature branch</li>
<li>Push your feature branch to your fork on GitHub</li>
<li>Send a pull request for your branch on your fork</li>
</ol>
<h2 id="clone-the-project-you-want-to-work-on">Clone the project you want to work on</h2>
<pre><code>$ hub clone pydata/pandas
</code></pre>
<p>(Equivalent to <code>git clone https://github.com/pydata/pandas.git</code>)</p>
<p>This clones the project from the server onto your local machine.  When
working in git you make changes to your local copy of the repository.
Git has a concept of <em>remotes</em> which are, well, remote copies of the
repository.  When you clone a new project, a remote called <code>origin</code> is
automatically created that points to the repository you provide in the
command line above.  In this case, <code>pydata/pandas</code> on GitHub.</p>
<p>To upload your changes back to the main repository, you <em>push</em> to the
remote.  Between when you cloned and now, changes may have been made to
the upstream remote repository.  To get those changes, you <em>pull</em> from the
remote.</p>
<p>At this point you will have a <code>pandas</code> directory on your machine.  All
of the remaining steps take place inside it, so change into it now:</p>
<pre><code>$ cd pandas
</code></pre>
<h2 id="fork-the-project-you-want-to-work-on">Fork the project you want to work on</h2>
<p>The easiest way to do this is with <code>hub</code>.</p>
<pre><code>$ hub fork
</code></pre>
<p>This does a couple of things.  It creates a fork of pandas in your
GitHub account.  It also establishes a new remote in your local repository
named after your GitHub username.  In my case I now have two
remotes: <code>origin</code>, which points to the main upstream repository; and
<code>joeshaw</code>, which points to my forked repository.  We&rsquo;ll be pushing to
my fork.</p>
<h2 id="create-a-feature-branch-to-do-your-own-work-in">Create a feature branch to do your own work in</h2>
<p>This creates a place to do your work in that is separate from the main code.</p>
<pre><code>$ git checkout -b doc-work
</code></pre>
<p><code>doc-work</code> is what I&rsquo;m choosing to name this branch.  You can name it
whatever you like.  Hyphens are idiomatic.</p>
<p>Now make whatever changes you want for this project.</p>
<h2 id="commit-your-changes-to-your-feature-branch">Commit your changes to your feature branch</h2>
<p>If you are creating new files, you will need to explicitly add them to
the to-be-committed list (also called the <em>index</em>, or staging area):</p>
<pre><code>$ git add file1.md file2.md etc
</code></pre>
<p>If you are just editing existing files, you can add them all in one batch:</p>
<pre><code>$ git add -u
</code></pre>
<p>Next you need to commit the changes.</p>
<pre><code>$ git commit
</code></pre>
<p>This will bring up an editor where you type in your commit message.
The convention is usually to type a short summary in the first line
(50-60 characters max), then a blank line, then additional details if
necessary.</p>
<h2 id="push-your-feature-branch-to-your-fork-in-github">Push your feature branch to your fork in GitHub</h2>
<p>OK, remember that your fork is a remote named after your GitHub
username.  In my case, <code>joeshaw</code>.</p>
<pre><code>$ git push joeshaw doc-work
</code></pre>
<p>This pushes only the <code>doc-work</code> branch to the <code>joeshaw</code> remote.  Now
your work is publicly visible to anyone on your fork.</p>
<h2 id="send-a-pull-request-for-your-branch-on-your-fork">Send a pull request for your branch on your fork</h2>
<p>You can do this either on the web site or using the <code>hub</code> tool.</p>
<pre><code>$ hub pull-request
</code></pre>
<p>This will open your editor again.  If you only had one commit on your
branch, the message for the pull request will be the same as the
commit.  This might be good enough, but you might want to elaborate on
the purpose of the pull request.  Like commits, the first line is a
summary of the pull request and the other lines are the body of the
PR.</p>
<p>In general you will be requesting a pull from your current branch (in
this case <code>doc-work</code>) into the <code>master</code> branch of the <code>origin</code> remote.</p>
<p>If your pull request is accepted as-is, the maintainer will merge it
into the official upstream sources.  Congratulations!  You&rsquo;ve just
made your first open source contribution on GitHub.</p>
<p>(This was adapted from a <a href="https://mail.python.org/pipermail/centraloh/2015-March/002391.html">post I made to the Central Ohio Python User
Group mailing
list</a>.)</p>
]]></content></entry><entry><title type="html">Terrible Vagrant/Virtualbox performance on Mac OS X</title><link href="https://www.joeshaw.org/terrible-vagrant-virtualbox-performance-on-mac-os-x/" rel="alternate" type="text/html" title="Terrible Vagrant/Virtualbox performance on Mac OS X"/><published>2011-09-30T11:00:00-04:00</published><updated>2016-03-16T10:00:00-04:00</updated><id>https://www.joeshaw.org/terrible-vagrant-virtualbox-performance-on-mac-os-x/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="vagrant"/><category term="virtualbox"/><category term="mac"/><category term="osx"/><category term="macos"/><summary type="html">One weird trick to speed up your virtual machines</summary><content type="html" xml:base="https://www.joeshaw.org/terrible-vagrant-virtualbox-performance-on-mac-os-x/"><![CDATA[<div class="notices note" ><em>Update March 2016</em>: There&rsquo;s a much easier way to enable the host IO
cache from the command line, but it only works for existing VMs.  <a href="#update">See
the update below</a>.</div>
<p>I recently started using <a href="http://vagrantup.com">Vagrant</a> to test our
auto-provisioning of servers with <a href="http://puppetlabs.com">Puppet</a>.
Having a simple-yet-configurable system for starting up and accessing
headless virtual machines really makes this a much simpler solution
than VMware Fusion.  (Although I wish Vagrant had a way to take and
roll back VM snapshots.)</p>
<p>Unfortunately, as soon as I tried to really do anything in the VM my
Mac would completely bog down.  Eventually the entire UI would stop
updating.  In Activity Monitor, the dreaded kernel_task was taking
100% of one CPU, and VBoxHeadless was taking most of another.  Things
would eventually free up whenever the task in the VM (usually <code>apt-get install</code> or <code>puppet apply</code>) would crash with a segmentation fault.</p>
<p>Digging into this, I found an ominous message in the VirtualBox logs:</p>
<pre><code>AIOMgr: Host limits number of active IO requests to 16. Expect a performance impact.
</code></pre>
<p>Yeah, no kidding.  I tracked this message down to the &ldquo;Use host I/O
cache&rdquo; setting being off on the SATA Controller in the box.  (This is
a per-VM setting, and I am using the stock Vagrant &ldquo;lucid64&rdquo; box, so
the exact setting may be somewhere else for you.  It&rsquo;s probably a good
idea to turn this setting on for all storage controllers.)</p>
<p>When it comes to Vagrant VMs, this setting in the VirtualBox UI is not
very helpful, though, because Vagrant brings up new VMs automatically
and without any UI.  To get this to work with the Vagrant workflow,
you have to do the following hacky steps:</p>
<ol>
<li>Turn off any IO-heavy provisioning in your Vagrantfile</li>
<li><code>vagrant up</code> a new VM</li>
<li><code>vagrant halt</code> the VM</li>
<li>Open the VM in the VirtualBox UI and change the setting</li>
<li>Re-enable the provisioning in your Vagrantfile</li>
<li><code>vagrant up</code> again</li>
</ol>
<p>This is not going to work if you have to bring up new VMs often.</p>
<p>Fortunately this setting is easy to tweak in the base box.  Open up
<code>~/.vagrant.d/boxes/base/box.ovf</code> and find the <code>StorageController</code> node.
You&rsquo;ll see an attribute <code>HostIOCache=&quot;false&quot;</code>.  Change that value to
<code>true</code>.</p>
<p>Lastly, you&rsquo;ll have to update the SHA1 hash of the <code>.ovf</code> file in
<code>~/.vagrant.d/boxes/base/box.mf</code>.  Get the new hash by running
<code>openssl dgst -sha1 ~/.vagrant.d/boxes/base/box.ovf</code> and replace the
old value in <code>box.mf</code> with it.</p>
<p>That&rsquo;s it.  All subsequent VMs you create with <code>vagrant up</code> will now
have the right setting.</p>
<p><a name="update"></a></p>
<h5 id="update">Update</h5>
<p>Thanks to <a href="https://github.com/mitchellh/vagrant/issues/565#issuecomment-4436139">this comment on a Vagrant bug
report</a>
you can enable the host cache more simply from the command line for an
existing VM:</p>
<pre><code>VBoxManage storagectl &lt;vm&gt; --name &lt;controllername&gt; --hostiocache on
</code></pre>
<p>Where <code>&lt;vm&gt;</code> is your vagrant VM name, which you can get from:</p>
<pre><code>VBoxManage list vms
</code></pre>
<p>and <code>&lt;controllername&gt;</code> is probably <code>&quot;SATA Controller&quot;</code>.</p>
<p>The VM must be halted for this to work.</p>
<p>You can add a section to your <code>Vagrantfile</code> to do this when new VMs are created:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-ruby" data-lang="ruby"><span style="display:flex;"><span>config.vm.provider <span style="color:#ed9d13">&#34;virtualbox&#34;</span> <span style="color:#6ab825;font-weight:bold">do</span> |v|
</span></span><span style="display:flex;"><span>    v.customize [
</span></span><span style="display:flex;"><span>        <span style="color:#ed9d13">&#34;storagectl&#34;</span>, <span style="color:#ed9d13">:id</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#ed9d13">&#34;--name&#34;</span>, <span style="color:#ed9d13">&#34;SATA Controller&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#ed9d13">&#34;--hostiocache&#34;</span>, <span style="color:#ed9d13">&#34;on&#34;</span>
</span></span><span style="display:flex;"><span>    ]
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">end</span>
</span></span></code></pre></div><p>And for further reading, <a href="https://www.virtualbox.org/manual/ch05.html#iocaching">here is the relevant section in the
VirtualBox manual</a>
that goes into more detail about the pros and cons of host IO caching.</p>
]]></content></entry><entry><title type="html">Linux input ecosystem</title><link href="https://www.joeshaw.org/linux-input-ecosystem/" rel="alternate" type="text/html" title="Linux input ecosystem"/><published>2010-10-01T15:27:24-04:00</published><updated>2010-10-01T15:27:24-04:00</updated><id>https://www.joeshaw.org/linux-input-ecosystem/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="linux"/><category term="input"/><category term="evdev"/><category term="udev"/><category term="xorg"/><category term="xinput"/><category term="keyboard"/><category term="touchpad"/><summary type="html">A tour of the kernel, udev, and Xorg</summary><content type="html" xml:base="https://www.joeshaw.org/linux-input-ecosystem/"><![CDATA[<p>Over the past couple of days, I&rsquo;ve been trying to figure out how input
in Linux works on modern systems.  There are lots of small pieces at
various levels, and it&rsquo;s hard to understand how they all interact.
Matters are not helped by the fact that things have changed quite a bit
over the past couple of years as
<a href="http://freedesktop.org/wiki/Software/hal">HAL</a> &ndash; which I helped
write &ndash; has been giving way to
<a href="http://www.kernel.org/pub/linux/utils/kernel/hotplug/udev.html">udev</a>,
and existing literature is largely out of date.  This is my attempt at
understanding how things work today, in the Ubuntu Lucid release.</p>
<h3 id="kernel">Kernel</h3>
<p>In the Linux kernel&rsquo;s input system, there are two pieces: the <em>device
driver</em> and the <em>event driver</em>.  The device driver talks to the
hardware, obviously.  Today, for most USB devices this is handled by
the <em>usbhid</em> driver.  The event drivers handle how to expose the
events generated by the device driver to userspace.  Today this is
primarily done through <em>evdev</em>, which creates character devices
(typically named <code>/dev/input/eventN</code>) and communicates with userspace
through them using <em>struct input_event</em> messages.  See
<a href="http://lxr.free-electrons.com/source/include/linux/input.h"><code>include/linux/input.h</code></a>
for its definition.</p>
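<p>To make that concrete, here is a rough Python sketch that decodes raw <code>struct input_event</code> records.  It assumes the native layout on a typical 64-bit Linux system (a <code>struct timeval</code> made of two C longs, followed by a 16-bit type, a 16-bit code and a 32-bit signed value); the helper names are mine, and reading a real device node usually requires root.</p>

```python
import struct

# struct input_event from include/linux/input.h, in native layout:
#   struct timeval time;   (two C longs: seconds, microseconds)
#   __u16 type;  __u16 code;  __s32 value;
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

EV_KEY = 0x01  # event type used for key and button presses


def decode_event(data):
    """Unpack one raw input_event record into a dict."""
    sec, usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, data)
    return {"time": sec + usec / 1e6, "type": ev_type,
            "code": code, "value": value}


def read_events(path):
    """Yield decoded events from an evdev node such as /dev/input/event0."""
    # Unbuffered, so each read() maps onto one read of the character device.
    with open(path, "rb", buffering=0) as device:
        while True:
            data = device.read(EVENT_SIZE)
            if not data or len(data) != EVENT_SIZE:
                break
            yield decode_event(data)
```

<p>Looping over <code>read_events()</code> and printing each dict gives you a very poor man&rsquo;s version of what a proper tool shows.</p>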
<p>A great tool to use for getting information about evdev devices and
events is <a href="http://packages.ubuntu.com/search?keywords=evtest">evtest</a>.</p>
<p>A somewhat outdated but still relevant description of the kernel input
system can be found in the kernel&rsquo;s
<a href="http://lxr.free-electrons.com/source/Documentation/input/input.txt"><code>Documentation/input/input.txt</code></a>
file.</p>
<h3 id="udev">udev</h3>
<p>When a device is connected, the kernel creates an entry in sysfs for
it and generates a hotplug event.  That hotplug event is processed by
udev, which applies some policy, attaches additional properties to the
device, and ultimately creates a device node for you somewhere in
<code>/dev</code>.</p>
<p>For input devices, the rules in
<code>/lib/udev/rules.d/60-persistent-input.rules</code> are executed.  Among
other things, they run the <code>/lib/udev/input_id</code> tool, which queries the
capabilities of the device from its sysfs node and sets environment
variables like <code>ID_INPUT_KEYBOARD</code>, <code>ID_INPUT_TOUCHPAD</code>, etc. in the udev
database.</p>
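<p>You can look at these values yourself with <code>udevadm info --query=property</code>.  The following Python sketch pulls the <code>ID_INPUT_*</code> flags out of that output.  It assumes the output is one <code>KEY=VALUE</code> pair per line and that a flag is set when its value is <code>1</code>; the function names are made up for illustration.</p>

```python
import subprocess


def parse_properties(output):
    """Turn KEY=VALUE lines into a dict, ignoring anything else."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def input_classes(props):
    """Return the ID_INPUT_* flags that are set to "1", sorted by name."""
    return sorted(key for key, value in props.items()
                  if key.startswith("ID_INPUT_") and value == "1")


def query_device(devnode):
    """Ask udev about one device node, e.g. /dev/input/event3."""
    result = subprocess.run(
        ["udevadm", "info", "--query=property", "--name=" + devnode],
        capture_output=True, text=True, check=True)
    return input_classes(parse_properties(result.stdout))
```

<p>The parsing is kept separate from the <code>udevadm</code> call so that it can be tried on saved output from another machine.</p>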
<p>For more information on <code>input_id</code> see the <a href="http://www.spinics.net/lists/hotplug/msg03174.html">original announcement
email</a> to the
hotplug list.</p>
<h3 id="x">X</h3>
<p>X has a udev config backend which queries udev for the various input
devices.  It does this at startup and also watches for hotplugged
devices.  X looks at the different <code>ID_INPUT_*</code> properties to
determine whether it&rsquo;s a keyboard, a mouse, a touchpad, a joystick, or
some other device.  This information can be used in
<code>/usr/lib/X11/xorg.conf.d</code> files in the form of <code>MatchIsPointer</code>,
<code>MatchIsTouchpad</code>, <code>MatchIsJoystick</code>, etc. in <code>InputClass</code> sections to
see whether to apply configuration to a given device.</p>
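<p>As an illustration, an <code>InputClass</code> section along these lines would hand every device that udev flagged as a touchpad to the synaptics driver.  Treat it as a sketch: the <code>Identifier</code> string is arbitrary, and the snippet your distribution ships may well carry more options.</p>

```conf
# Applies to any device with ID_INPUT_TOUCHPAD set by udev
Section "InputClass"
    Identifier "touchpad catchall"
    MatchIsTouchpad "on"
    Driver "synaptics"
EndSection
```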
<p>Xorg has a handful of its own drivers to handle input devices,
including evdev, synaptics, and joystick.  And here is where things
start to get confusing.</p>
<p>Linux has this great generic event interface in evdev, which means
that very few drivers are needed to interact with hardware, since
they&rsquo;re not speaking device-specific protocols.  Of the few needed on
Linux, nearly all speak evdev, including the three I listed
above.</p>
<p>The evdev driver provides basic keyboard and mouse functionality,
speaking &ndash; obviously &ndash; evdev through the <code>/dev/input/eventN</code>
devices.  It also handles things like the lid and power switches.
This is the basic, generic input driver for Xorg on Linux.</p>
<p>The synaptics driver is the most confusing of all.  It <em>also speaks
evdev to the kernel</em>.  On Linux it does not talk to the hardware
directly, and is in no way Synaptics(tm) hardware-specific.  The
synaptics driver is simply a separate driver from evdev which adds a
lot of features expected of touchpad hardware, for example two-finger
scrolling.  It should probably be renamed the &ldquo;touchpad&rdquo; module,
except that on non-Linux OSes it can still speak the Synaptics
protocol.</p>
<p>The joystick driver similarly handles joysticky things, but speaks
evdev to the kernel rather than some device-specific protocol.</p>
<p>X only has concepts of keyboards and pointers, the latter of which
includes mice, touchpads, joysticks, wacom tablets, etc.  X also has
the concept of the <em>core</em> keyboard and pointer, which is how events
are most often delivered to applications.  By default all devices send
core events, but certain setups might want to make devices non-core.</p>
<p>If you want to receive events for non-core devices, you need to use
the XInput or XInput2 extensions for that.  XInput exposes core-like
events (like <code>DeviceMotionNotify</code> and <code>DeviceButtonPress</code>), so it is
not a major difficulty to use, although its setup is annoyingly
different from that of most other X extensions.  I have not used XInput2.</p>
<p><a href="http://who-t.blogspot.com/">Peter Hutterer&rsquo;s blog</a> is an excellent
resource for all things input related in X.</p>
]]></content></entry><entry><title type="html">AVCHD to MP4/H.264/AAC conversion</title><link href="https://www.joeshaw.org/avchd-to-mp4-h264-aac-conversion/" rel="alternate" type="text/html" title="AVCHD to MP4/H.264/AAC conversion"/><published>2010-04-10T10:28:03-04:00</published><updated>2010-04-10T10:28:03-04:00</updated><id>https://www.joeshaw.org/avchd-to-mp4-h264-aac-conversion/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="avchd"/><category term="mp4"/><category term="h264"/><category term="aac"/><category term="ffmpeg"/><summary type="html">It is surprisingly tricky</summary><content type="html" xml:base="https://www.joeshaw.org/avchd-to-mp4-h264-aac-conversion/"><![CDATA[<p>For posterity:</p>
<p>I have a <a href="http://amzn.com/B001OI2Z2I?tag=joeshaw-20">Canon HF200 HD video
camera</a>, which records to
AVCHD format. AVCHD is H.264 encoded video and AC-3 encoded audio in an
MPEG-2 Transport Stream (m2ts, mts) container.  This format is not
supported by Aperture 3, which I use to store my video.</p>
<p>With <a href="http://www.0xdeadbeef.com/">Blizzard</a>&rsquo;s help, I figured out an
ffmpeg command-line to convert to H.264 encoded video and AAC encoded
audio in an MPEG-4 (mp4) container.  This is supported by Aperture 3
and other Quicktime apps.</p>
<pre><code>$ ffmpeg -sameq -ab 256k -i input-file.m2ts -s hd1080 output-file.mp4 -acodec aac
</code></pre>
<p>Command-line order is important, which is infuriating.  If you move
the <code>-s</code> or <code>-ab</code> arguments, they may not work.  Add <code>-deinterlace</code> if
the source videos are interlaced, which mine were originally until I
turned it off.  The only downside to this is that it generates huge
output files, on the order of 4-5x greater than the input file.</p>
<p><strong>Update, 28 April 2010:</strong> Alexander Wauck emailed me to say that
re-encoding the video isn&rsquo;t necessary, and that the existing H.264
video could be moved from the m2ts container to the mp4 container
with a command-line like this:</p>
<pre><code>$ ffmpeg -i input-file.m2ts -ab 256k -vcodec copy -acodec aac output-file.mp4
</code></pre>
<p>And he&rsquo;s right&hellip; as long as you don&rsquo;t need to deinterlace the video.
With the whatever-random-ffmpeg-trunk checkout I have, adding
<code>-deinterlace</code> to the command-line segfaults.  I actually had tried
<code>-vcodec copy</code> early in my experiments but abandoned it after I found
that it didn&rsquo;t deinterlace.  I had forgotten to try it again after I
moved past my older interlaced videos.  Thanks Alex!</p>
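<p>If you have a whole card full of clips, the copy-the-video command is simple to run in a loop.  This is a small Python sketch of that idea.  The ffmpeg flags are taken verbatim from the command above (newer ffmpeg builds may spell some of them differently), while the function names and the directory layout are assumptions of mine.</p>

```python
import subprocess
from pathlib import Path


def remux_command(source, audio_bitrate="256k"):
    """Build the ffmpeg argv that copies the video and re-encodes audio to AAC."""
    source = Path(source)
    target = source.with_suffix(".mp4")  # same name, new container
    # Argument order mirrors the command above: input first, output last.
    return ["ffmpeg", "-i", str(source), "-ab", audio_bitrate,
            "-vcodec", "copy", "-acodec", "aac", str(target)]


def remux_directory(directory):
    """Convert every .m2ts file in a directory, stopping at the first failure."""
    for source in sorted(Path(directory).glob("*.m2ts")):
        subprocess.run(remux_command(source), check=True)
```

<p>Since the video stream is copied untouched, this only makes sense for clips that don&rsquo;t need deinterlacing.</p>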
]]></content></entry><entry><title type="html">Real-time MBTA bus location + Google Maps mashup</title><link href="https://www.joeshaw.org/real-time-mbta-bus-location-google-maps-mashup/" rel="alternate" type="text/html" title="Real-time MBTA bus location + Google Maps mashup"/><published>2009-11-15T23:31:48-05:00</published><updated>2009-11-15T23:31:48-05:00</updated><id>https://www.joeshaw.org/real-time-mbta-bus-location-google-maps-mashup/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="mbta"/><category term="bus"/><category term="maps"/><category term="mashup"/><content type="html" xml:base="https://www.joeshaw.org/real-time-mbta-bus-location-google-maps-mashup/"><![CDATA[<p>This weekend I read that the MBTA and Massachusetts Department of
Transportation had released a <a href="http://www.eot.state.ma.us/developers/realtime/">trial real-time data
feed</a> for the
positioning of vehicles on five of its bus routes.  This is very
important data to have, and while obviously everyone would like to see
more routes added, it&rsquo;s a start.</p>
<p>I decided to <a href="/mbta-bus">hack together a mashup</a> of this data with
Google Maps, to see how easy it would be.  In the end it took me a few
hours on Saturday to get the site up and running, and a couple more on
Sunday adding features like the drawing of routes on the map,
colorizing markers for inbound vs. outbound buses, and adding reverse
geocoding of the buses themselves.</p>
<figure>
  <a href="/mbta-bus">
    <img src="/images/mbta-bus-real-time.png" alt="MBTA Real-time bus info" title="MBTA Real-time bus info" />
  </a>
</figure>
<p>To do this I used three technologies (Google App Engine, jQuery,
Google Maps) and two data sources (the real-time XML feed and the MBTA
Google Transit Feed Specification files).</p>
<h3 id="google-app-engine">Google App Engine</h3>
<p><a href="http://appengine.google.com">App Engine</a> is so perfectly suited for
smaller, playtime hacks like this that it&rsquo;s hard to imagine how anyone
got anything done before it existed.  The tedious, up-front
bootstrapping that is required in so many programming projects has
been enough to completely turn me off to small, spare-time hacking
projects on occasion in the past.  The brilliance behind a hosted
software environment is obvious, but building a safe, hosted system
with a fairly comprehensive set of APIs seems to be such a mountain
of work that in many ways I find it surprising that
anyone &ndash; even, perhaps especially, Google &ndash; built it at all.</p>
<p>I chose the Python SDK and the programming was straightforward and
easy.  It takes some elements from Django, with which I am familiar
from work.</p>
<h3 id="jquery">jQuery</h3>
<p>A no-brainer.  Hands down the best JavaScript toolkit available.
Making the AJAX calls to get route and vehicle location information
was a breeze, and the transparent handling of the XML data of the
real-time feed prevents me from losing the will to live &ndash; a common
feeling when dealing with XML.</p>
<p>My only complaint is with the documentation.  While the API reference
is good for any given piece of the API, the examples are a little
light and there is absolutely zero cross-referencing to other parts,
especially ones not a part of jQuery itself.  It was not obvious, for
example, how to deal with the XML document returned by the AJAX call.
It sounds like the docs are <a href="http://twitter.com/jeresig/status/5750291977">getting some
work</a>, though, so this
will hopefully improve.</p>
<h3 id="google-maps">Google Maps</h3>
<p>This was my first endeavor with the <a href="http://code.google.com/apis/maps">Maps
API</a>, and it&rsquo;s good.  It&rsquo;s not the
best API in the world, but it&rsquo;s hardly the worst either.  Adding
markers of different colors is annoying, but not so onerous as to make
it tedious.  The breadth of functionality provided is impressive, but
then again it has been around for a few years at this point.  Markers
are easy to add, drawing the route map is absolutely trivial with a
KML file, and even the reverse geocoding &ndash; which gives you a street
address given a latitude/longitude pair &ndash; is straightforward.</p>
<p>The docs suck, though.  There&rsquo;s no indication that a size or anchor
position is required when creating an icon for a custom marker &ndash;
required for colors other than red &ndash; and due to the minified JS files
tracking down that error took longer than any other task in the
project. The reverse geocoding docs mention that a <tt>Placemark</tt> object
will be returned, but that class doesn&rsquo;t appear anywhere in the
reference documentation.</p>
<h3 id="real-time-data-feed">Real-time data feed</h3>
<p>Lots to like.  Straightforward, easy to parse.  It&rsquo;d be nice if I
didn&rsquo;t have to do the reverse geocoding to figure out what the street
address is, but it&rsquo;s not a dealbreaker.  Main downside is that it&rsquo;s
XML as opposed to JSON.  And of course, it&rsquo;s only 5 bus routes and
zero subway and commuter rail routes.</p>
<h3 id="mbta-google-transit-feed-specification-files">MBTA Google Transit Feed Specification files</h3>
<p>A <a href="http://www.eot.state.ma.us/default.asp?pgid=content/developer&amp;sid=about#para15">comprehensive set of data describing every transit
route</a>
and every stop in the MBTA system.  An impressive set of
data encoded in a format designed for Google Transit.  There is a <a
href="http://code.google.com/p/googletransitdatafeed/">set of example
tools</a> to view and manipulate this data, and one of those
translates this data into a KML file for use with Google Maps.  I
should have tweaked the tools to output only the KML for the routes I
cared about, but I did this by hand instead&hellip; not a big deal for only
5 bus lines.  These KML files are fed into the Google Maps API to
display the route as a blue line on the map when selected.</p>
<h3 id="poke-47196-201">POKE 47196, 201</h3>
<p>This is what a lot of programming is like now, for better and for
worse.</p>
<p>On the one hand it is the perfect example of high-level
component-oriented programming.  Data is formatted in easily parseable
interchange formats and plugged into well-defined interfaces.  These
interfaces plug into other interfaces.  The result is a zoomable,
pannable map with real-time bus location information that updates
every 15 seconds.  The lines-of-code count is around 100 including
both Python and JavaScript.  With a few hours&rsquo; work, I built something
modestly useful out of nothing.  I stand on the shoulders of giants.</p>
<p>On the other hand I didn&rsquo;t really <em>build</em> anything.  This is
just assembly line programming.  It was not a particularly creative
endeavor, and it wasn&rsquo;t challenging intellectually.  Anybody could
have done it.  It&rsquo;s cool, but there is little sense of accomplishment
in the end product. It feels a little hollow.</p>
<p>Which is not to say that I didn&rsquo;t enjoy it, or that it wasn&rsquo;t worth
the effort.  I learned new technology, I played with software and data
that I hadn&rsquo;t had the opportunity to before.  I broadened my horizons,
however slightly.  And it got me to write this blog post.</p>
]]></content></entry><entry><title type="html">Python daemon threads considered harmful</title><link href="https://www.joeshaw.org/python-daemon-threads-considered-harmful/" rel="alternate" type="text/html" title="Python daemon threads considered harmful"/><published>2009-02-24T17:28:36-05:00</published><updated>2020-06-11T08:55:00-04:00</updated><id>https://www.joeshaw.org/python-daemon-threads-considered-harmful/</id><author><name>Joe Shaw</name><email>joe@joeshaw.org</email></author><category term="python"/><category term="daemon"/><category term="threads"/><content type="html" xml:base="https://www.joeshaw.org/python-daemon-threads-considered-harmful/"><![CDATA[<div class="notices note" ><em>Update April 2015</em>: Reading it again years later, I regret the tone
of this post.  I was frustrated at the time and it comes across now as
just smarmy.  Still, I stand by the principal idea: that you should
avoid Python&rsquo;s daemon threads if you can.</div>
<div class="notices note" ><em>Update June 2015</em>: This is <a href="http://bugs.python.org/issue1856">Python bug
1856</a>.  It was fixed in Python 3.2.1
and 3.3, but the fix was never backported to 2.x.  (An attempt to
backport to the 2.7 branch <a href="http://bugs.python.org/issue21963">caused another
bug</a> and it was abandoned.)  Daemon
threads <em>may</em> be ok in Python &gt;= 3.2.1, but definitely aren&rsquo;t in
earlier versions.</div>
<p>The other day <a href="http://itasoftware.com">at work</a> we encountered an
unusual exception in our nightly pounder test run after landing some
new code to expose some internal state via a monitoring API.  The
problem occurred on shutdown.  The new monitoring code was trying to
log some information, but was encountering an exception.  Our logging
code was built on top of Python&rsquo;s
<a href="http://docs.python.org/library/logging.html"><code>logging</code></a> module, and
we thought perhaps that something was shutting down the logging system
without us knowing.  We ourselves never explicitly shut it down, since
we wanted it to live until the process exited.</p>
<p>The monitoring was done inside a daemon thread.  The <a
href="http://docs.python.org/library/threading.html#id1">Python docs
say</a> only:</p>
<blockquote>
<p>A thread can be flagged as a &ldquo;daemon thread&rdquo;. The
significance of this flag is that the entire Python program exits when
only daemon threads are left.</p></blockquote>
<p>Which sounds pretty good, right?  This thread is just occasionally
grabbing some data, and we don&rsquo;t need to do anything special when the
program shuts down.  Yeah, I remember when I used to believe in things
too.</p>
<p>Despite a global interpreter lock that prevents Python from being
truly concurrent anyway, there is a very real possibility that the
daemon threads can still execute after the Python runtime has started
its own tear-down process.  One step of this process appears to be to
set the values inside <a
href="http://docs.python.org/library/functions.html#globals"><code>globals()</code></a>
to <code>None</code>, meaning that looking anything up on a module-level name results in an
<code>AttributeError</code> as the code tries to dereference <code>NoneType</code>.
Other variations on this cause <code>TypeError</code> to be raised.</p>
<p>The code which triggered this looked something like this, although
with more abstraction layers which made hunting it down a little
harder:</p>
<div class="highlight"><pre tabindex="0" style="color:#d0d0d0;background-color:#202020;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">try</span>:
</span></span><span style="display:flex;"><span>    log.info(<span style="color:#ed9d13">&#34;Some thread started!&#34;</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">try</span>:
</span></span><span style="display:flex;"><span>        do_something_every_so_often_in_a_loop_and_sleep()
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">except</span> somemodule.SomeException:
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">pass</span>
</span></span><span style="display:flex;"><span>    <span style="color:#6ab825;font-weight:bold">else</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#6ab825;font-weight:bold">pass</span>
</span></span><span style="display:flex;"><span><span style="color:#6ab825;font-weight:bold">finally</span>:
</span></span><span style="display:flex;"><span>    log.info(<span style="color:#ed9d13">&#34;Some thread exiting!&#34;</span>)
</span></span></code></pre></div><p>The exception we were seeing was an <code>AttributeError</code> on the
last line, the <code>log.info()</code> call.  But that wasn&rsquo;t even the
original exception.  It was actually another <code>AttributeError</code>
caused by the <code>somemodule.SomeException</code> dereference.  Because
all the modules had been reset, <code>somemodule</code> was <code>None</code>
too.</p>
<p>Unfortunately the docs are completely devoid of this information, at
least in the threading sections which you would actually reference.
The best information I was able to find was <a href="https://mail.python.org/pipermail/python-list/2005-February/343697.html">this email to
python-list</a>
a few years back, and a
<a href="http://mail.python.org/pipermail/python-dev/2003-September/038151.html">few</a>
<a href="http://mail.python.org/pipermail/python-bugs-list/2004-July/023901.html">other</a>
<a href="http://mail.python.org/pipermail/python-bugs-list/2008-January/045448.html">emails</a>
which don&rsquo;t really put the issue front and center.</p>
<p>In the end the solution for us was simply to make them non-daemon
threads, notice when the app is being shut down and join them to the
main thread.  Another possibility for us was to catch
<code>AttributeError</code> in our thread wrapper class &ndash; which is what
the author of the aforementioned email does &ndash; but that seems like
papering over a real bug and a real error.  Because of this
misbehavior, daemon threads lose almost all of their appeal, but oddly
I can&rsquo;t find people really publicly saying &ldquo;don&rsquo;t use them&rdquo; except in
scattered emails.  It seems like it&rsquo;s underground information known
only to the Python cabal.  (<a href="http://en.wikipedia.org/wiki/There_Is_No_Cabal">There is no
cabal.</a>)</p>
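<p>For the curious, the non-daemon approach can be sketched roughly like this.  It is not our actual code: the <code>Monitor</code> class, its <code>interval</code> parameter and the <code>samples</code> counter are hypothetical stand-ins, and the only real machinery is <code>threading.Event</code> and <code>Thread.join()</code>.</p>

```python
import threading
import time


class Monitor(threading.Thread):
    """A non-daemon worker that exits when asked to (hypothetical example)."""

    def __init__(self, interval=0.05):
        # daemon=False: the interpreter waits for this thread before tearing down.
        super().__init__(daemon=False)
        self.interval = interval
        self.samples = 0
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait() doubles as the sleep; it returns True once stop() is called.
        while not self._stop_event.wait(self.interval):
            self.samples += 1  # stand-in for "grab some data and log it"

    def stop(self):
        """Ask the thread to finish, then join it to the caller."""
        self._stop_event.set()
        self.join()


if __name__ == "__main__":
    monitor = Monitor()
    monitor.start()
    try:
        time.sleep(0.2)  # stand-in for the real application's work
    finally:
        monitor.stop()  # the thread is gone before interpreter teardown starts
```

<p>The important part is that <code>stop()</code> runs while the interpreter is still fully alive, so the thread never gets a chance to touch a module that has already been torn down.</p>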
<p>So, I am going to say it.  When I went searching there weren&rsquo;t any
helpful hints in a Google search of &ldquo;python daemon threads considered
harmful&rdquo;.  So, I am staking claim to that phrase.  People of The
Future: You&rsquo;re welcome.</p>
]]></content></entry></feed>