The author makes a good point about the discipline imposed by not having excepti...

RyanZAG · on Dec 5, 2013

It depends on the problem you're trying to solve, I think.

Let's consider a command line application that fetches a URL, like wget. Without exceptions, you would check the return code of all the system functions you call (dns, sockets, etc), and if any of them fail you can't really recover. You just write out an error message and exit. With exceptions, you could wrap the whole thing and if any system function throws an exception, you print out the exception and exit. So it's much the same in this case.

Now put the url fetch code into a larger application. With exceptions this is pretty easy - you can catch any exception thrown by your url fetch code and then retry it a few times until it succeeds, or you inform the user of the app. Without exceptions this is tougher as you need to unroll everything manually in order to retry.

Finally we have something complicated like running a simulation. If something gives an error, we probably don't want to stop the whole simulation. We'd want to check the specific error result and then fix the simulation and continue. In this case exceptions would add a lot of boiler plate code and would need to be handled just as carefully as error results (eg, wrapping each call in its own try and handling each error).

So it's pretty clear that it depends is very applicable to when you'd rather use exceptions or error results. In general I think exceptions are better as long as they're only used for actual error cases as it gives the user of the functions more control over where and at what level to handle exceptions.

gngeal · on Dec 5, 2013

> Now put the url fetch code into a larger application. With exceptions this is pretty easy - you can catch any exception thrown by your url fetch code and then retry it a few times until it succeeds, or you inform the user of the app. Without exceptions this is tougher as you need to unroll everything manually in order to retry.

Unless you have something like goroutines - or Erlang-style processes - and perhaps employ a moderately disciplined coding style. Retrying those is just a matter of relaunching them, isn't it?

Also, Common Lisp got exceptions mostly right. Many other languages...not so much, I guess.

vidarh · on Dec 5, 2013

> Without exceptions this is tougher as you need to unroll everything manually in order to retry.

I don't get why you think this is a problem. You can trivially "unroll" it by explicitly writing your functions to be single-exit and "unrolling" at the end before returning whatever your result is.

"Exception handling" without "real" exceptions is then also trivial: just check the appropriate error code and take whatever action is suitable at the time.

tracker1 · on Dec 6, 2013

This is why I really like the callback pattern that node.js tends to use for all it's IO api calls... the callback always has the option for an error first.. generally if its' a nested callback, bubble up early. Internally, the C code can just check the state and call the callback method early if an error state arises.

yason · on Dec 6, 2013

C libraries don't just exit. A C executable could in some circumstances just fprintf() to stderr and exit(1) but usually even that requires sane deinitialization so before exiting you will effectively end up back in main() from the lower levels.

C libraries generally handle their errors on some level, possibly jumping directly out from the lowest levels and then doing deinit before passing back an error code to the caller of the library function, or exiting in a nested fashion by deinitializing stuff and reporting failure to the caller. Whatever works for whichever codebase but try to use any decently popular library, such as libcurl, and see if they just exit in the middle.

alexkus · on Dec 6, 2013

Exactly. Writing libraries in C means you have to be much more strict than simple throwaway single execution binaries.

What our company has ended up doing is wrappering most low level functions with the common code around it. For example, fopen() may fail and return EINTR if the process received a signal just as it was opening a file (we use signals to tell processes to reread their config so this does happen albeit rarely) so the wrapper calls fopen() in a do/while loop that repeats on EINTR (and a few others). It saves you from writing the same 50 odd lines each time you want to open a simply open a file.

In an exception led world you'd still have to do the same thing, but with different syntactic sugar. You can't just fail and bubble up the exception without dealing with the few exceptions you must deal with and repeat, and you can't just repeat on every exception as most will just fail again and again. You end up writing code to do the same thing just in different styles.

Checking the return values of every single function call, and dealing with it, can make the code verbose (one line of code and then 10+ lines dealing with errors) but it is worth it in the end for bullet proof programs. Also code where you have to undo all of the work already done in the function up until that point, which can get a little tedious, i.e.:-

    APP_RET makefoo( char *fname )
    {
      char *foo,*bar;
      thingy_t *qux;
      size_t len;
    
      ASSERT( fname );
    
      len = strlen( fname );
    
      foo = malloc( len+1 );
      if( foo == NULL ) {
         return( ENOMEM );
      }
      bar = malloc( (2*len)+1 );
      if( bar == NULL ) {
         free( foo );
         return( ENOMEM );
      }
      qux = malloc( sizeof( thingy_t ) );
      if( qux == NULL ) {
         free( foo );
         free( bar );
         return( ENOMEM );
      }
      ...
      return( APP_OK );
    }

It's the job of anything calling makefoo() to deal with the various errors it can return, but the idea is to return the minimum number of unique error codes as necessary to avoid proliferation of error codes to the higher and higher functions. Many calling functions will only really care about success or failure, and will just use the return code to log out the reason for the failure.

The wrappers help deal with many situations; did fwrite() write all of the bytes we wanted or just some of them? Well, our wrapper around fwrite will handle short writes and repeat the call depending on the result of ferror().

asveikau · on Dec 6, 2013

> Also code where you have to undo all of the work already done in the function up until that point, which can get a little tedious, i.e.:-

Sorry pal, you're doing it wrong. At least, this is not the way I've done it and seen it done in large C code bases.

You're supposed to have only one return statement, and one block that frees everything. For example if you initialize `foo`, `bar`, and `qux` to `NULL`. Then testing these pointers for `NULL` de facto tells you how far you got in the function, and which buffers need to be freed. Just before your one single return statement (can't emphasize this enough) you call `free` on all of them regardless of success or failure. It's much more composeable than what you described - allocations for `foo`, `bar`, `qux` can fail, the ones that will be not yet allocated at that point in time will be `NULL`, and `free(NULL)` is a harmless no-op.

None of this business of "I've got to return now, let's see, how many of these buffers do I need to release at this point in time?", with varying amounts of the same free statement appearing redundantly. Write the cleanup block once when the pointers are about to fall out of scope, have it able to run in both success and failure cases and be done with it. Think of it as a more manual RAII if you like.

As for what to replace those early `return`s with, the two common schools are `goto` into the cleanup block, or repeatedly checking some kind of failure status variable before performing new actions.

Nursie · on Dec 6, 2013

What you've described looks just as messy to my eyes, and I've seen all different ways to do this.

You can nest under if(thing!=NULL) but then you end up with indentation creep.

You can use the goto pattern if you like but some folks will tell you that goto's are never, ever to be used.

Or you can do what's done above. When it comes down to it the logic is basically identical and it's just down to code formatting.

asveikau · on Dec 6, 2013

> You can nest under if(thing!=NULL) but then you end up with indentation creep.

I am not suggesting any nesting of anything. Repeatedly checking at the same indentation level.

> You can use the goto pattern if you like but some folks will tell you that goto's are never, ever to be used.

What matters more to you, getting stuff right or repeating adages that other people have said out of context? `goto` is probably the cleanest way to do it in plain C.

> When it comes down to it the logic is basically identical and it's just down to code formatting.

No actually, it is quite a bit more than style, doing it the way alexkus has it is much much much less maintainable when it's done all over a large code base. He's got `foo`, `bar`, `qux`. What if a year later some future maintainer totally unfamiliar with the code needs to add another one? Then it's up to that person to find all of the exit paths, make sure that `foo`, `bar`, `qux` and the new thing are freed in all cases. Doing that is a lot harder if the free statements are all over the place and repeated several times instead of in a single block.

Nursie · on Dec 6, 2013

>> I am not suggesting any nesting of anything. Repeatedly checking at the same indentation level.

This could be considered wasteful. You end up checking if something is NULL, if it is you jump to the cleanup code and immediately check again. Nesting may be more elegant.

>> What matters more to you, getting stuff right or repeating adages that other people have said out of context? `goto` is probably the cleanest way to do it in plain C.

Writing code in a way consistent with the team I work with and the established codebase.

>> What if a year later some future maintainer totally unfamiliar with the code needs to add another one?

Then they need to read what the function is doing and understand it before they mess with the code, just like in any other situation.

>> Doing that is a lot harder if the free statements are all over the place and repeated several times instead of in a single block.

More verbose certainly, but if the code is written in small, discrete functional blocks then it shouldn't really impact much.

He repeats frees, you repeat tests. An indenter would repeat neither but has indent readability to consider.

Frankly, don't trust anyone that tells you that they have the one true way to do things.

asveikau · on Dec 6, 2013

> This could be considered wasteful.

It's true that there is an extra compare. I think it's a small cost for maintainable code.

> Then they need to read what the function is doing and understand it before they mess with the code, just like in any other situation.

Sounds great, however, the time they spent figuring out your haphazard, repetitive and confusing free() statements could be better spent somewhere else. When I said that my way is more "composeable" I meant that adding, removing, or re-ordering operations is a cheaper operation for the programmer. Follow this style and you'll spend less time trying to read and figure out code because it will fit the existing convention and will be bleedingly obvious where the buffers are released.

I may have been a bit hyperbolic with calling this "correct" or "supposed" however I didn't make up these conventions, I advocate them because I have seen them work really well and I have seen yours create mounds of inflexible spaghetti.

Nursie · on Dec 7, 2013

>> It's true that there is an extra compare. I think it's a small cost for maintainable code.

And I don't think it's the only way to achieve maintainable code. That's all I'm saying.

>> Sounds great, however, the time they spent figuring out your haphazard, repetitive and confusing free() statements

If it's the coding standard of the product that you have a small block of this at the top before you start actually writing the function then it doesn't take any more time for a coder to understand than any other way around, the important part is consistency.

>> I advocate them because I have seen them work really well and I have seen yours create mounds of inflexible spaghetti.

What's mine? I'm not advocating any of them, just sticking to one. I still don't think yours is any better than (for instance) -

    type function()
    {
        type2 *thing = (cast) malloc (size);
        type2 *thing2;

        if (thing)
        {
            thing2 = (cast2) malloc (size2);
            if(thing2)
            {
                // do some stuff here
                // and some more stuff
                ...
                free(thing2);
            }
            free(thing);
        }
        return code;
    }

A pattern which auto-unrolls as it exits without the need for more tests, and to me is every bit as maintainable as the goto cleanup; pattern.

Spaghetti code (to me) is more about encapsulation and modularisation failures than it is about the content of any individual function.

asveikau · on Dec 8, 2013

So you avoid a few null checks but you introduce the indentation problem you mentioned a few comments ago.

I think we can agree this is better than the alexkus example but there are tradeoffs involved.

(PS: I am bothered at how you cast the return value of malloc, this is C we're talking about right, not some other language with more plus signs? :-))

seabee · on Dec 6, 2013

The logic may be identical but the scope for programmer error is not. Consider the possible mistakes someone can write by doing it each different way.

Nursie · on Dec 6, 2013

I've seen errors in all the ways considered here.

nightski · on Dec 5, 2013

But what if the exception was in your code and not the system call? This is incredibly dangerous.

Monads are perfect for this. Really combines the best of both worlds without all the unspecified/undefined behavior.

memracom · on Dec 5, 2013

You could always write Monads in C... http://blog.sigfpe.com/2007/02/monads-in-c-pt-ii.html

Groxx · on Dec 5, 2013

Then it should be a different exception, and if it matters, you should be treating that class of exceptions differently. Sometimes it doesn't matter.

bjourne · on Dec 5, 2013

Your experience with languages with exceptions seem to come from people who misuse them. Randomly placing catch clauses around in the code is not good practice, even if perhaps a majority of all programmers in safe languages code that way. That causes latent bugs that are incredibly hard to debug.

The trick is to almost never ever catch exceptions. For example, in his post he describes a bug caused by accessing beyond allocated memory. That would in a safe language immediately have caused an ArrayIndexOutOfBoundsException (or equivalent) which the programmer would have fixed. In C, errors are often "silent" because you can't be bothered to check the return value of every call to printf. In a correctly coded program in a safe language errors are never ever silent.

notacoward · on Dec 5, 2013

"Your experience with languages with exceptions seem to come from people who misuse them."

Yes, it does, because people who misuse them seem to be a majority. I think that's an important point in language (or for that matter any kind of) design. You can't just look at how well things work for the experts - a mistake made by both Common Lisp and C++ in different ways). Nor can you just consign all non-experts to some straw-man-ish "blub" category. You have to look at what skill level is required to make the benefits outweigh the costs, which could be anywhere along the skill continuum, and compare that to the actual skill distribution of the programmers you have (or will have after hiring and/or training).

The sad fact seems to be that the tipping point for exceptions is at a point that leaves most programmers on the bad side. The same is almost certainly true for any kind of meta-programming. It might even be true for closures and continuations. "Primitive" languages lacking features like exceptions or GC surely do trip up the true beginners, but leave fewer traps further along the path.

andrewflnr · on Dec 6, 2013

Somewhat OT, but I find it curious that you would, in the same breath so to speak, put closures and continuations in the same category of "hard things". To me, continuations are still pretty mysterious, but closures are a pretty simple, usable idea.

betterunix · on Dec 6, 2013

Continuations are extremely counterintuitive and should only be used as low-level building blocks. Exceptions are actually a kind of continuation, and in Common Lisp the compiler uses continuations to implement exceptions (i.e. "conditions") and the restarts system. It is common in functional languages for the compiler to use continuation passing style as an intermediate representation, where instead of returning from a function you will invoke the "return continuation" that was passed to the function as an argument (i.e. it is the code that follows the function call -- where you return to).

sergiosgc · on Dec 6, 2013

> The trick is to almost never ever catch exceptions.

I strongly disagree. Not catching exceptions leaks abstraction layers. If I have a Prefs::save() method, I don't want it throwing a DiskFullException when the Prefs class is an abstraction of a preferences datastore. I don't care what is the final store, as long as it fits the abstraction. A well designed abstraction will catch and wrap the exception into something that makes sense at that level of abstraction, never leaking implementation details.

betterunix · on Dec 6, 2013

"A well designed abstraction will catch and wrap the exception into something that makes sense at that level of abstraction, never leaking implementation details."

This makes recovering from the error rather difficult. If the problem is that a disk is full, I need to do something about the disk being full (maybe ask the user to delete some files). If the problem is that the disk was disconnected, I need to do something else about it.

The real issue here is that you are thinking of exceptions as they exist in languages like C++ and Java, where you destroy your call stack in order to locate the exception handler. Such languages make the difficult problem of error recovery that much harder. Common Lisp does it better: the handler is allowed to invoke restarts (if they exist), which the function that signaled the error sets up. This encapsulates things very neatly. The disk was full? The error is signaled by write, which sets up a restart that tries to continuing writing to the disk. At the next level of abstraction, you might have a restart to remove the half-written record from your disk. In theory, you might only need one top-level exception handler, which interacts with the user as needed to recover from errors (or politely inform them that no recovery is possible).

sergiosgc · on Dec 6, 2013

I'm not familiar with the concept of restarts. I do concede, though, that wrapping extensions limits recovery. Either the library can recover on its own, or it can't fulfill its designed service.

On the other hand, the most revered architectures we have aren't leaky. You don't see network stack code trying to recover from Ethernet collisions at the IP level, or app logic trying to salvage an SQL transaction when a restriction has been tripped. The price for non leaky abstractions is not zero, but the gains are also definitely not zero.

bjourne · on Dec 6, 2013

I don't think you understand proper exception handling. Catching and wrapping DiskFullException is pretty pointless because what are you going to do about it? Nothing. It's nonsensical for a preferences class to deal with that situation. Instead let it bubble up and so that the caller has the option of handling it, for example by showing a dialog "Delete temporary files and try again?"

You'll never be able to catch all exception. In addition to your DiskFullException, you have PermissionDeniedException, NFSException, NullPointerException, InvalidFilenameException, PathToLongException ad infinitum. By trying to be "nice" by trying to wrap all those exception you are actually doing your api users a great disservice.

sergiosgc · on Dec 6, 2013

You state a lot of half truths ("you'll never be able to catch all exceptions"), don't justify the assumptions and didn't handle the core of my argument (abstraction leakage). In the hurry to insult me, did you actually read my argument?

bjourne · on Dec 6, 2013

Because the core of your argument was based on a stupid rule I don't agree with! You: "Throwing DiskFullException results in abstraction leakage" Me: "No it doesn't.."

sergiosgc · on Dec 6, 2013

My bad. I should have realized earlier I was discussing with a child.

Please accept my apologies.

bruceboughton · on Dec 6, 2013

To your second point: all of these errors should extend IOException ;)

laurent123456 · on Dec 6, 2013

Librairies that wrap exceptions into something else often do a disservice to their users. In this `Prefs:save()` example, what should the wrapping library throw? A "SaveFailedException"? That's more abstract, however now I would need to go check the source code of the library, find where the exception has been converted to something else, comment out the "try/catch" statement, and rerun the program. Then I can finally now what really happened and do something to fix it.

sergiosgc · on Dec 6, 2013

If done right, rethrowing exceptions does not affect the ability to debug the code. You don't throw a pristine new exception on the spot. You wrap the exception in a new one, effectively maintaining all the information. Coded recoverability suffers, but debugging ability does not. You have the same debugging information in the rethrown exception as you did in the bubbled up one.

joshguthrie · on Dec 5, 2013

"Discipline" is indeed the right word.

I was taught C at Epitech. A single segfault, no matter insidious, was a valid reason to render a whole project NULL. We often had evaluations ran with LD_LIBRATY_PATH=malloc_that_fails.so or just piping /dev/urandom to stdin...

Needless to say, calling exit() when a call to malloc() failed wasn't an acceptable recover routine.

Crito · on Dec 5, 2013

> Needless to say, calling exit() when a call to malloc() failed wasn't an acceptable recover routine.

What do you do when malloc fails? A bit of graceful shutdown and logging seems like it would be in order, but otherwise how do you keep rolling if mallocs start failing? It seems to me like that would indicate something has gone unusually wrong and full recovery is futile.

vidarh · on Dec 6, 2013

I grew up using the Amiga, when having memory allocation fail was routine (a standard Amiga 500 for example, came with 512KB RAM, and was rarely expanded to more than 1MB, so you would run out of memory).

What you do when malloc() fails depends entirely on your application: If a desktop application on the Amiga would shutdown just because a memory allocation failed, nobody would use it. The expection was you'd gracefully clean up, and fail whatever operation needed the memory, and if possible inform the user to let him/her free up memory before trying again.

This expectations in "modern" OS's that malloc never fails unless the world is falling really annoys me - it for example leads to systems where we use too much swap to the point where systems often slow down or become hopelessly unresponsive in cases where the proper response would have been to inform the user - the user experience is horrendous: Swap is/was a kludge to handle high memory prices; having the option is great, but most of the time when I have systems that dip into swap, it indicates a problem I'd want to be informed about.

But on modern systems, most software handles it so badly that turning swap off is often not even a viable choice.

Of course there are plenty of situations where the above isn't the proper response, e.g. where you can't ask the user. But even for many servers, the proper response would not be fall over and die if you can reasonably dial back your resource usage and fail in more graceful ways.

E.g. an app server does a better job if it at least provides the option to dynamically scale back the number of connections it handles rather than failing to provide service at all - degrading service or slowing down is often vastly better than having a service fail entirely.

MichaelGG · on Dec 6, 2013

Isn't fork the real offender, which requires Linux to overcommit by default? Disabling swap shouldn't affect that, right? Just makes your problem happen later, in a somewhat non-deterministic way.

Without fork, what reason do you not disable swap? I can only think of an anonymous mmap where you want to use the OS VM as a cache system. But that's solved easily enough by providing a backing file, isn't it?

pjmlp · on Dec 6, 2013

> Isn't fork the real offender, which requires Linux to overcommit by default?

fork() != Linux.

Each UNIX system does it on its own way.

acon · on Dec 6, 2013

Saying that fork forces overcommit is strange. Fork is just one of the things that allocates memory. If you don't want overcommit fork should simply fail with ENOMEM if there isn't enough memory to back a copy of all the writable memory in the process.

MichaelGG · on Dec 6, 2013

I meant the practical considerations of fork means overcommitment is needed in many cases where it otherwise wouldn't be needed. If you fork a 2GB process but the child only uses 1MB, you don't want to commit another 2GB for no reason.

daeken · on Dec 6, 2013

> Isn't fork the real offender, which requires Linux to overcommit by default?

Maybe I'm missing something, but how does fork require overcommitment? When you fork, you end up with COW pages, which share underlying memory. They don't guarantee that physical memory would be available if every page were touched and required a copy; they just share underlying physical memory. Strictly speaking, very little allocation has to happen for a process fork to occur.

MichaelGG · on Dec 6, 2013

If there's no overcommit, each of those COW pages needs some way of making sure it can actually be written to. Isn't that literally the point of overcommit? Giving processes more memory than they can actually use on the assumption they probably won't use it? And Windows takes the different approach of never handing out memory unless it can be serviced (via RAM or pagefile).

What am I missing? (I know you know way more about this than I do.)

daeken · on Dec 6, 2013

When you fork a process, your application's contract with the kernel is such: existing pages will be made accessible in both the parent and the child; these pages may or may not be shared between both sides -- if they are, then the first modification to a shared page will cause an attempt to allocate a copy for that process; execution flow will continue from the same point on both sides. That's pretty much the extent of it (ignoring parentage issues for the process tree). The key thing here is the 'attempt' part -- nothing is guaranteed. The kernel has never committed to giving you new pages, just the old ones.

I don't personally see this as an overcommit, since the contract for fork() on Linux doesn't ever guarantee memory in the way that you'd expect it to. But in all honesty, it's probably just a matter of terminology at the end of the day, since the behavior (write to memory -> process gets eaten) is effectively the same.

Edit: Though I should note, all of the overcommit-like behavior only happens if you are using COW pages. If you do an actual copy on fork, you can fail with ENOMEM and handle that just like a non-overcommitting alloc. So even in the non-pedantic case, fork() really doesn't require overcommit, it's just vastly less performant if you don't use COW.

MichaelGG · on Dec 6, 2013

Oh. I was under the impression that if overcommit was disabled then forking a large process won't work if there's not enough RAM/swap available, regardless of usage.

kazagistar · on Dec 6, 2013

So out of memory failure won't happen when you malloc, it will happen when you assign a variable in a COW page. This somewhat invalidates the idea of a failing malloc.

army · on Dec 6, 2013

The problem is once you touch those memory pages, total memory usage increases even if you don't call malloc.

eliteraspberrie · on Dec 5, 2013

The most common way to indicate that an error has occurred in a C function is to return non-zero. If this is done consistently, and return values are always checked, an error condition will eventually make its way up to main, where you can fail gracefully.

For example:

    int a(void)
    {
        ...
        if (oops) {
            return 1;
        }
        return 0;
    }

    int b(void)
    {
        if (a(...) != 0) {
            return 1;
        }
        return 0;
    }

    int main(void)
    {
        if (b(...) != 0) {
            exit(EXIT_FAILURE);
        }
        return EXIT_SUCCESS;
    }

(This means that void functions should be avoided.)

demallien · on Dec 6, 2013

It's not so simple in real life. I use this style of error-handling for only one type of project: a library with unknown users. In that case, as I can't make assumptions about the existence of better error handling systems, it gives the most flexible result. But at a price, I know have to document the error codes, and I had damned well better also provide APIs that allow my client to meaningfully recover from the error.

In most I have worked on, this type of error handling is completely inadequate. Think multithreaded applications. The code that needs to handle the error your code just generated isn't in the call stack. This happens very often in my experience, and I have found that the best solution is to post some kind of notification message rather than returning an error code. This creates a dependency on the notification system though, so it's not always the correct solution.

The thing that I dislike the most in your example was when you propagated the error from function a out of function b. my most robust code mostly uses void functions. Error codes are only used in cases where the user can actually do some meaningful action in response to the error, and feNkly this is rarely the case. Instead I try as much as possible to correctly handle errors without propagation. It frees up the user of my APIs from having to worry about errors, and in my opinion thus should be a design goal of any API.

clarry · on Dec 5, 2013

What's the point of propagating all errors way up to main if you're only going to exit anyway? I think we know how to indicate errors in C functions. What to do about specific errors, in this case allocation failures, is a more interesting question.

nitrogen · on Dec 5, 2013

If malloc() fails now, it might succeed again later. So you can just go on doing everything else you were doing, then try the memory-hungry operation again in the future.

For example, this could be important in systems where you might be controlling physical hardware at the same time as reading commands from a network. It's probably a good idea to maintain control of the hardware even if you don't have enough memory right now to read network commands.

vidarh · on Dec 6, 2013

This is a pet peeve of mine with modern applications: So many of them just throw their metaphorical hands in the air and give up.

Prior to swap and excessive abuse of virtual memory this was not an option: If you gave up on running out of memory, your users gave up on your application. On the Amiga, for example, being told an operation failed due to lack of memory and given the chance to correct it by closing down something else was always the expected behaviour.

But try having allocations fail today, and watch the mayhem as any number of applications just fall over. So we rely on swap, which leaves systems prone to death spirals when the swap slows things down instead.

If embedded systems programmers wrote code the same way modern desktop applications developers did, we'd all be dead.

majika · on Dec 6, 2013

Doesn't it duplicate effort to put the responsibility of checking for available memory on individual applications?

I think, in most computing environments, it should be the operating system's responsibility to inform the user that free memory is running out. Applications should be able to assume that they can always get memory to work with.

I think the swap is an extremely sensible solution, in that executing programs slowly is (in most environments, including desktop) better than halting programs completely. It provides an opportunity for the user to fix the situation, without anything bad happening. Note that the swap is optional anyway, so don't use it if you don't like it.

Comparing modern computing environments to the Amiga is laughable. It's not even comparable to modern embedded environments, because they serve(d) different purposes.

I'm a hobbyist C application/library developer who assumes memory allocation always works.

nitrogen · on Dec 6, 2013

Most computing environments don't have a user to speak of, and the correct response of an application to an out of memory error could range from doing nothing to sounding an alarm.

As a user, I find it incredibly frustrating when my old but indispensable music software runs out of address space (I have plenty of RAM) and, instead of canceling the current operation (e.g. processing some large segment of audio), just dies with a string of cryptic error dialogs. The best thing for the user in this case is to hobble along without allocating more memory, not to fail catastrophically by assuming that memory allocation always works.

Swap is not a good solution because when there's enough swap to handle significant memory overuse, the system becomes unresponsive to user input since the latency and throughput to swap are significantly slower than RAM.

majika · on Dec 6, 2013

I think most computing environments do have a user of, if you consider a "user" to be something that be notified and can act on such notifications (e.g. to close applications).

Your music software's problem seems to be a bad algorithm - not that it doesn't check the return values of the `*alloc` functions.Aas you say, it should be able to process the audio in constant space. While I assume that I can always acquire memory, I do still care about the space-efficiency of my software.

I must admit I've never seen my system depending on swap, so I don't know how bad it is. But, if you have 1GB of on-RAM memory already allocated, wouldn't it only be new processes that are slow?

Also, I'd again point out that if you don't like swap, you don't have to have one.

hvidgaard · on Dec 6, 2013

> if you have 1GB of on-RAM memory already allocated, wouldn't it only be new processes that are slow?

No - the memory sub system, will swap out pages based on usage and some other parameters. A new application would most likely result in already running applications least used pages being swapped out.

nitrogen · on Dec 6, 2013

I must admit I've never seen my system depending on swap, so I don't know how bad it is.

Just for fun, try deliberately creating a swap storm some time. Then try to recover from it :-). Do this on a system that doesn't have other users.

kansface · on Dec 6, 2013

Conversely, if app developers wrote the same code that embedded systems programmers do, we'd never have any apps to use. Its just not worth the trade off- moreover, telling a user to free memory is a losing battle.

derleth · on Dec 6, 2013

> If embedded systems programmers wrote code the same way modern desktop applications developers did, we'd all be dead.

If Boeing made passenger jets the way Boeing made fighters, we'd all be dead, too, but try telling a fighter pilot that they should do their job from a 777. It's two very different contexts.

Besides, some errors can't be recovered from. What do you do when your error logging code reports a failure? Keep trying to log the same error, or do you begin trying to log the error that you can't log errors anymore?

burstmode · on Dec 6, 2013

>What do you do when your error logging code reports a failure? Keep trying to log the same error, or do you begin trying to log the error that you can't log errors anymore?

First you try to fix the problem of the logging system by runnig a reorganisation routine (delete old data,...) or reinit the subsystem. If that does not work AND if logging is a manadatory function of you system you make sure to change into a safe state and inidcate a fatal error state (blinking light, beeper, whatever). If the logging is such an important part of the system surrounding your system it might take further actions on its own and reinit your system (maybe turn your system off and start another logging systen). There is no exit. You never give up.

jzwinck · on Dec 5, 2013

It's an even better idea to make the hardware fail safe, so you can let the program die and not worry too much about it. This does not apply in all cases (cars), but it does apply in many (trains, like a deadman switch for the computer). For a vivid example of why this is an important approach, read about the Therac-25.

nitrogen · on Dec 6, 2013

Absolutely, but I would still write my software as though the hardware could fail deadly unless doing so made the system less reliable.

joshguthrie · on Dec 5, 2013

To make sure you free() all your previous allocations on the way down. You can choose not too, it's "kinda" the same (can't remember the exact difference) but it's dirty and people with OCD just won't accept it.

(Disclosure: I got OCD too, this is not meant to make C development hostile to people with OCD.)

scott_s · on Dec 5, 2013

If your program is going to exit anyway, there's no need to call free() on allocated memory. The operating system will reclaim all of the memory it granted to your process when it exits. Remember that malloc lives in your process, not the kernel. When you ask for memory from malloc, it is actually doling out memory that you already own - the operating system granted it to you, when malloc requested it. And malloc requested it because you asked for more memory than it was currently managing.

If your intention is to continue running, then of course you want to call free() on your memory. And this certainly makes sense to do as you exit functions. But if you're, say, calling exit() in the middle of your program, for whatever reason, you don't need to worry about memory.

Other resources may be a problem, though. The the operating system will reclaim things it controlled, and granted to your process - memory, sockets, file descriptors and such. But you need to be careful about resources not controlled by the operating system in such a manner.

joshguthrie · on Dec 6, 2013

Multiple reasons.

Some kernels may not get memory back by themselves and expect each application to give it back. We're lucky that the kernels we use everyday do, but we may one day have to write for a target OS where it's not the case. Just hoping "the kernel will save us" is a practice as bad as relying on undefined behaviors.

If you're coding correctly, you have exactly as much malloc()'s as you have free()'s, so when rewinding the stack to exit, your application is gonna free() everything anyway.

Speaking of resources, what about leftover content in FIFOs or shm threads that you just locked?

And when you got OCD, you're only satisfied with this:

    $ valgrind cat /var/log/*
    ...
    ==17473== HEAP SUMMARY:
    ==17473==     in use at exit: 0 bytes in 0 blocks
    ==17473==   total heap usage: 123 allocs, 123 frees, 2,479,696 bytes allocated
    ==17473==
    ==17473== All heap blocks were freed -- no leaks are possible
    ==17473==
    ==17473== For counts of detected and suppressed errors, rerun with: -v
    ==17473== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

(cat was an easy choice and all I got on this box, but I've had bigger stuff already pass the "no leaks are possible" certification)

scott_s · on Dec 6, 2013

First, you sometimes do want to call exit deep within a program. That is the situation I am addressing, not normal operation. Of course you want to always free unused memory and prevent memory leaks. I am quite familiar with the importance of memory hygiene, and have even written documents admonishing students to use valgrind before coming to me for help: http://courses.cs.vt.edu/~cs3214/fall2010/projects/esh3-debu...

Second, please re-read my last sentence. I specifically addressed things that the kernel does not reclaim. This would also include System V shared memory segments and the like. You must manage these yourself, and it is always messy. Typically, rather than calling exit directly, you're going to invoke a routine that knows about all such resources, frees them, then calls exit. But you still don't need to unwind your stack back to main.

Third, the kernel always gets back all of its memory that was granted through conventional means. That's what operating systems do. I think you have a fundamental misunderstanding of what malloc is, and where it lives. Malloc is a user-level routine that lives inside of your process. When you call malloc, it is granting you memory that you already own. Malloc is just a convenience routine that sits between you and the operating system. When you say malloc(4), it does not go off and request 4 bytes from the operating system. It looks into large chunks of memory the operating system granted to it, and gives you some, updating its data structures along the way. If all of its chunks of memory are currently allocated, then it will go ask the operating system for memory - on a Unix machine, typically through a call to brk to mmap. But when it calls brk or mmap, it will request a large amount of memory, say a few megabytes. Then, from that large chunk, it will carve out 4 bytes for you to use.

(This is why off-by-one errors are so pernicious in C: the chances are very good that you actually do own that memory, so the kernel will happily allow you to access the value.)

Now, even if you are a good citizen and return all memory to malloc, the operating system still has to reclaim tons of memory from you. Which memory? Well, your stacks and such, but also all of that memory that malloc still controls. When you free memory back to malloc, malloc is very unlikely to then give it back to the operating system. So all programs, at exit, will have memory that they own that the kernel will need to reclaim.

notacoward · on Dec 7, 2013

They say memory is the second thing to go. ;) Unfortunately, the OS doesn't know how to remove files or modify database entries that also represent program state, or properly shut down connections to other machines. Proper unwinding is still necessary.

scott_s · on Dec 7, 2013

For the third time, I specifically addressed resources that are not controlled by the operating system.

deathanatos · on Dec 6, 2013

Sane cleanup. Close any open resources, especially interprocess visible resources. Resources entirely in your process, such as memory, will just get freed by the OS; a file might want to be flushed, a database properly closed. Likely, in the frame where you're out of memory, you won't have the context to know what should be done: that is most likely a decision of your caller, or their caller…

waps · on Dec 5, 2013

This is a brilliant example of why you'd want exceptions. Look at what you're doing for error handling, manually every time.

Exceptions do the exact same thing, except: 1) automatically 2) type-safe 3) allow you to give extra detail about the error 4) tools can actually follow the control flow and tell you what's happening and why 5) debuggers can break on all exceptions. Try breaking on "any error" in your code (I do know how C programmers "solve" that : single-stepping. Ugh)

In this case, they are better in every way.

This is almost as bad as the code in the linux kernel and GNOME where they "don't ever use C++ objects !", and proceed to manually encode virtual method tables. And then you have have 2 object types that "inherit" from eachother (copy the virtual method table) and then proceed to overwrite the wrong indices with the overridden methods (and God forbid you forget to lock down alignment, resulting in having different function pointers overwritten on different architectures). Debugging that will cost you the rest of the week.

When it comes to bashing exceptions, it would be better to give people the real reason C++ programmers hate them, it's because of the major unsolvable problem you'll suddenly run in to when using them. In C and C++ you can use exceptions XOR not using exceptions.

This sounds like it's not a big deal, until you consider libraries. You want to use old libraries ? No exceptions for you ! (unless you rewrite them) You want to use newer libraries : you don't get to not use exceptions anymore ! You want to combine the two ? That's actually possible but if any exception library interacts with a non-exception library in the call-stack boom.

Exceptions are a great idea, but they don't offer a graceful upgrade path. Starting to use exceptions in C++ code is a major rewrite of the code. I guess if you follow the logic of the article that "would be a good thing", but given ... euhm ... reality ... I disagree. Try explaining "I'm adding exceptions to our code, rewriting 3 external libraries in the process" to your boss.

EpicEng · on Dec 6, 2013

    This is a brilliant example of why you'd want exceptions. Look at what you're doing for error handling, manually every time.

You say that like safely handling exceptions is trivial. Exceptions are emphatically not "better in every way", they are a mixed bag. They offer many clear benefits (some that you have described here), but at the cost of making your code more difficult to reason about. You essentially end up with a lot of invisible goto's. Problems with exceptions tend to be much more subtle and hard to debug.

I'm not against them at all, and often I prefer them, but there are certainly downsides.

qznc · on Dec 6, 2013

Also, exceptions are faster.

There is a lot of comparisons and branching going on, when the program always checks return codes. Assuming zero-cost exceptions, there is only overhead in the failure case.

waps · on Dec 8, 2013

I also find it very disingenious of the pro-exceptions post to claim that these mazes of ifs are easy to navigate. In his example that is sort-of true. When you're using actual real data to make the comparison it's easy to introduce very hard to trace bugs in them.

Once I had two things to check, one being time, and as you know that means 8 cases. You have to pick one to check first, and I picked the non-time based check to check first. That means that I suddenly didn't check all cases anymore :

  if (currentTime() < topBound) {
    if (some other condition) {
      if (currentTime() > lowerBound) {
        // at this point you of course do NOT know for sure that currentTime < topBound. Whoops.

(these look like they can be trivially merged. That's true if you look at just these lines, it becomes false if you look at the full set of conditions).

ethomson · on Dec 5, 2013

I don't get the sense that there was any attempt to recover from errors - it sounds more like they were enforcing that error checking occurred, by replacing `malloc` with one that just returned `NULL` always. It sounds like the goal was to make sure that one didn't assume `malloc` would always succeed and just use the memory.

Indeed, recovery is basically futile in this case and your program is going to shut down pretty quickly either way. Maybe you'll get the chance to tell the user that you ran out of memory before you die, which seems polite.

cpeterso · on Dec 5, 2013

In systems that overcommit memory (like Linux), malloc() can return non-NULL and then crash when you read or write that address because the system doesn't have enough real memory to back that virtual address.

deathanatos · on Dec 6, 2013

Even in Linux which set to overcommit, malloc() can still return NULL if you exhaust the virtual address space of your process, though I expect it's much less likely now on 64-bit platforms.

alextingle · on Dec 5, 2013

What if you application is the fly-by-wire for an airliner? Can you imagine that there might be better options than just calling abort(3)?

clarry · on Dec 5, 2013

Yes, a better option is to make sure this error cannot happen, by making sure the program has enough memory to begin with. Fly-by-wire shouldn't need unbounded memory allocations at runtime.

There are some applications where you can try to recover by freeing something that isn't critical, or by waiting and trying again. Or you can gracefully fail whatever computation is going on right now, without aborting the entire program. But these are last resort things and will not always save you. If your fly-by-wire ever depends on such a last resort, it's broken by design :-)

Crito · on Dec 5, 2013

From what I understand, in those sort of absolutely critical applications the standard is to design software that fails hard, fast, and safe. You don't want your fly-by-wire computer operating in an abnormal state for any amount of time, you want that system to click off and you want the other backup systems to come online immediately.

The computer in the Space Shuttle was actually 5 computers, 4 of them running in lockstep and able to vote out malfunctioning systems. The fifth ran an independent implementation of much of the same functionality. If there was a software fault with the 4 main computers, they wanted everything to fail as fast as possible so that they could switch to the 5th system.

sitkack · on Dec 5, 2013

Embedded systems like this do not use dynamic memory allocation.

Crito · on Dec 5, 2013

Related: The JPL C guidelines forbid dynamic memory allocation: http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf (HN discussion: https://news.ycombinator.com/item?id=4339999)

They also demand that code be littered with sanity checking assertions.

sitkack · on Dec 6, 2013

Thanks for posting this.

Tangent, I was thinking about Toyota's software process failure and how they _invented_ industrial level mistake proofing yet did not apply it their engine throttle code.

C is obviously the wrong language, but from a software perspective they should have at least tested the engine controller from an adversarial standpoint (salt water on the board, stuck sensors). That is the crappy thing about Harvard architecture cpu (separate instruction and data memory), you can have while loops that NEVER crash controlling a machine that continues to wreck havoc, sometimes you want a hard reset and a fast recovery.

http://en.wikipedia.org/wiki/Crash-only_software

rmrfrmrf · on Dec 5, 2013

Ugh that's why I'll never buy Boeing; non-upgradable memory in 2013? Please!

alextingle · on Dec 6, 2013

I knew that people would nit-pick on this and not address the actual issue. Next time, I should try harder to come up with a better example.

My point is: sometimes it is worth trying to recover when malloc() fails.

sitkack · on Dec 6, 2013

I wasn't trying to nitpick. Correcting the example, and yes recovering from a malloc failure _could_ be a worthy goal, but on Linux by the time your app is getting signaled about malloc failures the OOM killer is already playing little bunny foo foo with ur processes.

If your app can operate under different allocation regimes then there should be side channels for dynamically adjusting the memory usage at runtime. On Linux, failed malloc is not that signal and since _so many_ libraries and language runtime semantics allocate memory, making sure you allocate no memory in your bad-malloc path is very difficult.

joshguthrie · on Dec 5, 2013

Like eliteraspberrie said, the proper way to recover from an error is to unroll your stack back to your main function and return 1 there.

Error checking was enforced for EVERY syscall, be it malloc() or open(). Checking for errors was indeed required but not enough: proper and graceful shut down was required too.

joshguthrie · on Dec 5, 2013

Go ahead. Call exit(). On your HTTP/IRC/anything server. Just because you couldn't allocate memory for one more client. Now your service is down and the CTO is looking for blood =)

Yes, it's far-fetched and like some said further down the comments, you "can't" run out of memory in Linux, but straight killing a service is never good.

astrange · on Dec 5, 2013

If your server's running Limux, it's going to kill your process with no questions asked if you run out of memory. You're better off practicing crash-only error recovery and having robust clients that can handle a reconnect.

HTTP is stateless already, so crash and restart all you want!

yew · on Dec 6, 2013

The OOM killer is more likely to kill some other process and trash your server.

Thankfully that sort of behavior has been vastly reduced since the thing was introduced, but disabling overcommit for high-reliability applications is still a reasonable course of action.

dingaling · on Dec 6, 2013

The OOM killer might eventually kill something, after it thrashes the system for a few hours.

I had a server last week in which swap hadn't been configured. A compilation job took all memory and the OOM started thrashing. Thankfully there's always one SSH session open but I couldn't kill anything, sync or shutdown; fork failed with insufficient memory.

Left if thrashing overnight and had to power-kill it next day.

Crito · on Dec 5, 2013

I'd say letting that server bounce and having the watchdog/load balancer work to keep you at capacity is the best option there. You are going to need that infrastructure anyway and if you can't malloc enough for one more client, who is to say that the existing stuff is going to be able to continue either?

You should count on any particular instance bouncing when that happens, and design your system to survive that. You should also invest some effort to figure out why your system can get into that state. Consider if any particular instance should be attempting to accept more jobs than it has the ability to handle. I shouldn't be able to trigger an out of memory situation on your IRC server just by connecting as many irc clients as I can.

housel · on Dec 5, 2013

I suspect you mean LD_PRELOAD=whatever.so

joshguthrie · on Dec 5, 2013

Thanks, haven't used it in two years and am a bit rusty :)

dllthomas · on Dec 5, 2013

I've been thinking this might be well ameliorated by using checked exceptions, but even languages that have them don't get sophisticated with them, so they wind up being painful enough they're not really used (in my experience).

For instance, I don't know of a language that lets you say "This function accepts a function (f) and returns a function which throws anything f throws except FooException."

mtdewcmu · on Dec 6, 2013

I think you may have hit on something subtle. To me, it's better to have two completely distinct modes: I'm either dealing with errors or not. Then if an error occurs, I can expect one of two things -- either it should be handled or it should crash. I've noticed in some exception-using code I've encountered that a lot of times, error conditions are tested and then not really handled -- they're either silently ignored, or dealt with by some half-measure. To me, silent failures are much more insidious than crashes. This halfway style of error handling is the worst of all worlds, IMO, and it might be encouraged by exceptions. Java forces you to acknowledge errors that you have no need to, and probably pushes you toward this gray area when you're not sure how to handle things yet. Then, later, it's totally non-obvious what's truly handled and what isn't. It also clutters up your code. Clarity is often more beneficial than reflexive error-checking.

tieTYT · on Dec 5, 2013

Well I think this is only a problem with runtime exceptions. You have no choice but to deal with compile time exceptions (of course you can deal with them poorly if you choose). But it seems the compile time exceptions are unpopular.

I think there's another way to deal with this and that's a better type system. AFAIK it's impossible to have a null pointer exception in Haskell. EG: If you do something like `hashmap.get` the type system forces you to write code to react to the situation when the value doesn't exist in the map. Your code won't run otherwise.

...I can't tell if I'm rambling or replying to you :)

gizmo686 · on Dec 6, 2013

Haskell does not force you to handle the value not existing. For example, you can do:

let Just b = HashMap.lookup k xs

If the element is not there, then lookup will return Nothing, which will fail to pattern match with 'Just b'. This can also be accomplished using the fromJust function. What Haskell does do is make it obvious where these exceptions can occur by documenting what functions can return Nothing, and by requiring you to explicitly cast between values that may be Nothing and values that cannot be Nothing.

astrange · on Dec 5, 2013

Haskell's type system isn't powerful enough to guarantee no exceptions will occur, though it certainly manages NULL better and other languages like Agda go further.

Try debugging an out of bounds array access in Data.Array, which IIRC a few years ago just made my program print "error" and die...

Periodic · on Dec 6, 2013

I feel the majority of libraries you will use in Haskell won't resort to errors except in two cases: One is very low-level libraries doing things like interacting with C-code or raw memory. These are dangerous in all languages unless you are careful. The second is in functions that explicitly let the caller know they may fail.

For example, you can write "head aList" which will cause an error if aList is empty, however if you instead write "case aList of { [] -> handleEmptyList; a:as -> handleNonEmptyList a}" then you can cleanly get around the exceptional cases. It does take discipline to use these functions and when writing in "quick and dirty" mode I do skip and use the other functions, but they should only be used carefully.

cwzwarich · on Dec 6, 2013

Checking before you call a partial function isn't really any different than a null check.

dllthomas · on Dec 5, 2013

I wonder if that's a general "water seeks its level" thing. We only care so much about robustness, and start to trade off in favor of it at some threshold regardless of language...

adamnemecek · on Dec 5, 2013

Yeah, I think that people are starting to realize this. E.g. Go and Rust don't have exceptions (or Java/C++ style exceptions).