This may make some people pull their hair out, but I’d love to hear some arguments. I’ve had the impression that people really don’t like bash, not from here, but just from people I’ve worked with.

There was a task at work where we wanted something that’ll run on a regular basis, and doesn’t do anything complex aside from reading from the database and sending the output to some web API. Pretty common these days.

I can’t think of a simpler scripting language to use than bash. Here are my reasons:

  • Reading from the environment is easy, and so is falling back to some value; just do ${VAR:-fallback}; no need to write another if-statement to check for nullity. Wanna check if a variable’s set to something expected? if [[ <test goes here> ]]; then <handle>; fi
  • Reading from arguments is also straightforward; instead of import sys; sys.argv[1] in Python, you just do $1.
  • Sending a file via HTTP as part of an application/x-www-form-urlencoded request is super easy with curl. In most programming languages, you’d have to manually open the file and read it into bytes before putting it into your request for the HTTP library that you need to import. curl already does all that.
  • Need to read from a curl response and it’s JSON? Reach for jq.
  • Instead of having to set up a connection object/instance to your database, give sqlite, psql, duckdb or whichever cli db client a connection string with your query and be on your way.
  • Shipping is… fairly easy? Especially if docker is common in your infrastructure. Pull Ubuntu or debian or alpine, install your dependencies through the package manager, and you’re good to go. If you stay within Linux and don’t have to deal with differences in bash and core utilities between different OSes (looking at you, macOS), and assuming you tried not to do anything too crazy and only bring in necessary dependencies by calling them as commands, it should be fairly portable.
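
To make the list above concrete, here is a minimal sketch of what such a script could look like. It is not the actual script from the task: REPORT_DB, REPORT_API_URL, the events table, its day column, and the .id field in the response are all hypothetical placeholders, and it assumes sqlite3, curl, and jq are installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical settings: environment variables with fallbacks.
db_path="${REPORT_DB:-/var/lib/app/report.db}"
api_url="${REPORT_API_URL:-http://localhost:8080/reports}"

# Check that a value looks like what we expect before using it.
is_http_url() {
  [[ "$1" == http://* || "$1" == https://* ]]
}

# Query the database, send the result as a form field,
# and print the id from the JSON reply.
send_report() {
  local since="${1:?usage: send_report <YYYY-MM-DD>}"  # fail loudly if $1 is missing
  local date_re='^[0-9]{4}-[0-9]{2}-[0-9]{2}$'
  if ! [[ "$since" =~ $date_re ]]; then
    echo "send_report: bad date: $since" >&2
    return 2
  fi

  local csv status=0
  csv="$(mktemp)"
  {
    sqlite3 -csv "$db_path" "SELECT * FROM events WHERE day >= '$since';" > "$csv" &&
      curl -fsS --data-urlencode "report@${csv}" "$api_url" | jq -r '.id'
  } || status=$?
  rm -f -- "$csv"   # clean up whether or not the request worked
  return "$status"
}
```

Adding a line such as `is_http_url "$api_url" && send_report 2025-01-01` at the bottom would run the whole thing. The date check is there because the value ends up inside the SQL string.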

Sure, there can be security vulnerability concerns, but you’d still have to deal with the same problems with your Pythons, your Rubies, etc.

For most bash gotchas, shellcheck does a great job at warning you about them, and telling you how to address those gotchas.

There are probably a bunch of other considerations that I can’t think of off the top of my head, but I’ve addressed a bunch before.

So what’s the dealeo? What am I missing that may not actually be addressable?

  • FizzyOrange@programming.dev
    4 days ago

    I’m afraid your colleagues are completely right and you are wrong, but it sounds like you genuinely are curious so I’ll try to answer.

    I think the fundamental thing you’re forgetting is robustness. Yes Bash is convenient for making something that works once, in the same way that duct tape is convenient for fixes that work for a bit. But for production use you want something reliable and robust that is going to work all the time.

    I suspect you just haven’t used Bash enough to hit some of the many many footguns. Or maybe when you did hit them you thought “oops I made a mistake”, rather than “this is dumb; I wouldn’t have had this issue in a proper programming language”.

    The main footguns are:

    1. Quoting. Trust me you’ve got this wrong even with shellcheck. I have too. That’s not a criticism. It’s basically impossible to get quoting completely right in any vaguely complex Bash script.
    2. Error handling. Sure you can set -e, but then that breaks pipelines and conditionals, and you end up with really monstrous pipelines full of pipefail noise. It’s also extremely easy to forget set -e.
    3. General robustness. Bash silently does the wrong thing a lot.
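
As a small illustration of the quoting footgun, the sketch below counts how many arguments a command actually receives. The count_args helper and the file names are made up for the example.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints how many arguments it received.
count_args() {
  printf '%s\n' "$#"
}

file="my report.txt"

# Unquoted: the value is split on whitespace, so the command sees two arguments.
# ShellCheck flags this as SC2086.
count_args $file            # prints 2

# Quoted: the value is passed through as a single argument.
count_args "$file"          # prints 1

# Arrays have the same trap: only "${files[@]}" keeps each element intact.
files=("my report.txt" "notes.txt")
count_args ${files[@]}      # prints 3
count_args "${files[@]}"    # prints 2
```

ShellCheck catches the unquoted cases here, so this only shows the mechanism, not the harder cases the comment is talking about.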

    instead of import sys; sys.argv[1] in Python, you just do $1

    No. If it’s missing $1 will silently become an empty string. sys.argv[1] will throw an error. Much more robust.

    Sure, there can be security vulnerability concerns, but you’d still have to deal with the same problems with your Pythons, your Rubies, etc.

    Absolutely not. Python is strongly typed, and even statically typed if you want. Light years ahead of Bash’s mess. Quoting is pretty easy to get right in Python.

    I actually started keeping a list of bugs at work that were caused directly by people using Bash. I’ll dig it out tomorrow and give you some real world examples.

    • JamonBear@sh.itjust.works
      3 days ago

      Agreed.

      Also gtfobins is a great resource in addition to shellcheck to try to make secure scripts.

      For instance I stumbled upon a script like this recently:

      #!/bin/bash
      # ... some stuff ...
      tar -caf archive.tar.bz2 "$@"
      

      Quotes are OK, shellcheck is happy, but, according to gtfobins, you can abuse tar, so running the script like this: ./test.sh /dev/null --checkpoint=1 --checkpoint-action=exec=/bin/sh ends up spawning an interactive shell…

      So you can pile the insanity of the binaries on top of bash’s mess.

      • lurklurk@lemmy.world
        2 days ago

        I imagine adding -- so it becomes tar -caf archive.tar.bz2 -- "$@" would fix that specific case

        But yeah, putting bash in a position where it has more rights than the user providing the input is a really bad idea

      • esa@discuss.tchncs.de
        2 days ago

        Quotes are OK, shellcheck is happy, but, according to gtfobins, you can abuse tar, so running the script like this: ./test.sh /dev/null --checkpoint=1 --checkpoint-action=exec=/bin/sh ends up spawning an interactive shell…

        This runs into a part of the unix philosophy about doing one thing and doing it well: Extending programs to have more (absolutely useful) functionality winds up becoming a security risk. The shell is generally geared towards being a collection of shortcuts rather than a normal, predictable but tedious API.

        For a script like that you’d generally want to validate that the input is actually what you expect if it needs to handle hostile users, though. It’ll likely help the sleepy users too.
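
A minimal sketch of both suggestions together, the -- separator and validating the input before it reaches tar. The function names are hypothetical, and rejecting every argument that starts with a dash is an assumption about what this particular script should accept, not a general rule.

```shell
#!/usr/bin/env bash

# Hypothetical check: refuse anything tar could parse as an option.
reject_option_like_args() {
  local arg
  for arg in "$@"; do
    if [[ "$arg" == -* ]]; then
      printf 'refusing option-like argument: %s\n' "$arg" >&2
      return 1
    fi
  done
}

# Hypothetical wrapper around the tar call from the script quoted above.
safe_archive() {
  if (( $# == 0 )); then
    echo "usage: safe_archive <file>..." >&2
    return 2
  fi
  reject_option_like_args "$@" || return 1
  # `--` ends option parsing, so everything after it is treated as a file name.
  tar -caf archive.tar.bz2 -- "$@"
}
```

With this, `safe_archive /dev/null --checkpoint=1 --checkpoint-action=exec=/bin/sh` stops at the check and tar is never called. A file whose name really starts with a dash can still be passed as ./-name.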

      • MonkderVierte@lemmy.ml
        3 days ago

        gtfobins

        Meh, most in that list are just “if it has the SUID bit set, it can be used to break out of your security context”.

    • lurklurk@lemmy.world
      2 days ago

      I don’t disagree with your point, but how does set -e break conditionals? I use it all the time without issues

      Pipefail I don’t use as much so perhaps that’s the issue?

      • FizzyOrange@programming.dev
        2 days ago

        It means that all commands that return a non-zero exit code will fail the script. The problem is that exit codes are a bit overloaded and sometimes non-zero values don’t indicate failure, they indicate some kind of status. For example in git diff --exit-code or grep.

        I think I was actually thinking of pipefail though. If you don’t set it then errors in pipelines are ignored, which is obviously bad. If you do then you can’t use grep in pipelines.
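
One possible workaround for the grep case, sketched under the assumption that "no match" should count as an empty result rather than a failure. grep_or_empty is a hypothetical helper name; it relies on grep exiting with 1 for no match and a higher status for real errors.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical wrapper: pass matches through, treat "no match" (exit 1)
# as success, and still report real grep errors (exit status above 1).
grep_or_empty() {
  local status=0
  grep "$@" || status=$?
  if (( status > 1 )); then
    return "$status"
  fi
}

# With pipefail, plain grep would fail this pipeline when nothing matches.
# The wrapper lets it finish and report a count of zero.
printf 'alpha\nbeta\n' | grep_or_empty 'gamma' | wc -l
```

It is extra noise around every grep, which is arguably the point being made above, but it does keep pipefail usable.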

        • lurklurk@lemmy.world
          2 days ago

          My sweet spot is set -ue because I like to be able to use things like if grep -q ...; then and I like things to stop if I misspelled a variable.

          It does hide failures in the middle of a pipeline, but it’s a tradeoff. I guess one could turn it on and off when needed
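
Turning it on only where needed could look like the sketch below, which scopes pipefail to a single function by running the function body in a subshell. strict_line_count is a hypothetical name chosen for the example.

```shell
#!/usr/bin/env bash
set -ue

# The parentheses make the function body a subshell, so pipefail is
# enabled here without leaking into the rest of the script.
strict_line_count() (
  set -o pipefail
  sort -- "$1" | wc -l
)

# Outside the function, pipefail is still off.
```

If sort fails (for example because the file does not exist), strict_line_count returns a non-zero status even though wc at the end of the pipeline succeeded.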

    • Badland9085@lemm.eeOP
      3 days ago

      I honestly don’t care about being right or wrong. Our trade focuses on what works and what doesn’t and what can make things work reliably as we maintain them, if we even need to maintain them. I’m not proposing for bash to replace our web servers. And I certainly am not proposing that we can abandon robustness. What I am suggesting we think about here is this: when you do not really need that robustness, for something that may live in your production system outside of user paths, and that you, your team, and the stakeholders of the particular project understand to be temporary in nature, why would Bash not be sufficient?

      I suspect you just haven’t used Bash enough to hit some of the many many footguns.

      Wrong assumption. I’ve been writing Bash for 5-6 years now.

      Maybe it’s the way I’ve been structuring my code, or the problems I’ve been solving with it, but in the last few years, after starting to use shellcheck and bash-language-server, I’ve not run into issues where I get fucked over by quotes.

      But I can assure you that I know when to dip and just use a “proper programming language” while thinking that Bash wouldn’t cut it. You seem to have an image of me just being a “bash glorifier”, and I’m not sure if it’ll convince you (and I would encourage you to read my other replies if you aren’t convinced), but I certainly don’t think bash should be used for everything.

      No. If it’s missing $1 will silently become an empty string. sys.argv[1] will throw an error. Much more robust.

      You’ll probably hate this, but you can use set -u to catch unassigned variables. You should also use fallbacks wherever sensible.
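
A minimal sketch of the options, with hypothetical function names: set -u turns a missing $1 into an error, and ${1:?message} does the same for a single parameter with a message of your choice.

```shell
#!/usr/bin/env bash

# Default behaviour: a missing $1 silently expands to an empty string.
lenient_greet() {
  printf 'hello %s\n' "$1"
}

# With set -u, expanding an unset parameter is an error. The body runs in a
# subshell (parentheses) so the option and the exit stay local to this call.
strict_greet() (
  set -u
  printf 'hello %s\n' "$1"
)

# ${1:?message} fails for one specific parameter, unset or empty.
# The expansion error exits a non-interactive shell, hence the subshell body.
checked_greet() (
  : "${1:?usage: checked_greet <name>}"
  printf 'hello %s\n' "$1"
)
```

Called without arguments, lenient_greet prints "hello " and succeeds, while the other two print an error to stderr and return a non-zero status. In a real script you would more likely put set -u once at the top than in each function.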

      Absolutely not. Python is strongly typed, and even statically typed if you want. Light years ahead of Bash’s mess. Quoting is pretty easy to get right in Python.

      Not a good argument imo. It eliminates a good class of problems sure. But you can’t eliminate their dependence on shared libraries that many commands also use, and that’s what my point was about.

      And I’m sure you can find a whole dictionary’s worth of cases where people shoot themselves in the foot with bash. I don’t deny that’s the case. Bash is not a language that guards the programmer from shooting themselves in the foot as much as possible. The guardrails are loose, and it’s the script writer’s job to guard themselves against it. Is that good for an enterprise scenario, where you may either blow something up, drop a database table, lead to the loss of lives or jobs, etc? Absolutely not. Just want to copy some files around and maybe send it to an internal chat for regular reporting? I don’t see why not.

      Bash is not your hammer to hit every possible nail out there. That’s not what I’m proposing at all.

      • FizzyOrange@programming.dev
        3 days ago

        And I certainly am not proposing that we can abandon robustness.

        If you’re proposing Bash, then yes you are.

        You’ll probably hate this, but you can use set -u to catch unassigned variables.

        I actually didn’t know that, thanks for the hint! I am forced to use Bash occasionally due to misguided coworkers so this will help at least.

        But you can’t eliminate their dependence on shared libraries that many commands also use, and that’s what my point was about.

        Not sure what you mean here?

        Just want to copy some files around and maybe send it to an internal chat for regular reporting? I don’t see why not.

        Well if it’s just for a temporary hack and it doesn’t matter if it breaks then it’s probably fine. Not really what is implied by “production” though.

        Also even in that situation I wouldn’t use it for two reasons:

        1. “Temporary small script” tends to smoothly morph into “10k line monstrosity that the entire system depends on” with no chance for rewrites. It’s best to start in a language that can cope with it.
        2. It isn’t really any nicer to use Bash over something like Deno. Like… I don’t know why you ever would, given the choice. When you take bug fixing into account Bash is going to be slower and more painful.
        • Badland9085@lemm.eeOP
          2 days ago

          I’m going to downvote your comment based on that first quote reply, because I think that’s an extreme take that’s unwarranted. You’ve essentially dissed people who use it for CI/CD and suggested that their pipeline is not robust because of their choice of using Bash at all.

          And judging by your second comment, I can see that you have very strong opinions against bash for reasons that I don’t find convincing, other than what seems to me like irrational hatred from being rather uninformed. It’s fine being uninformed, but I suggest you temper your opinions and expectations with that.

          About shared libraries, many popular languages, Python being a pretty good example, do rely on these to get performance that would be really hard to get from their own interpreters / compilers, or if re-implementing it in the language would be pretty pointless given the existence of a shared library, which would be much better scrutinized, is audited, and is battle-tested. libcrypto is one example. Pandas depends on NumPy, which depends on, I believe, libblas and liblapack, both written in C, and I think one if not both of these offer a cli to get answers as well. libssh is depended upon by many programming languages with an ssh library (though there are also people who choose to implement their own libssh in their language of choice). Any vulnerabilities found in these shared libraries would affect all libraries that depend on them, regardless of the programming language you use.

          If production only implies systems in a user’s path and not anything else about production data, then sure, my example is not production. That said though, I wouldn’t use bash for anything that’s in a user’s path. Those need to stay around, possibly change frequently, and not go down. Bash is not your language for that and that’s fine. You’re attacking a strawman that you’ve constructed here though.

          If your temporary small script morphs into a monster and you’re still using bash, bash isn’t at fault. You and your team are. You’ve all failed to anticipate that change and misunderstood the “temporary” nature of your script, and allowed your “temporary thing” to become permanent. That’s a management issue, not a language choice. You’ve moved that goalpost and failed to change your strategy to hit that goal.

          You could use Deno, but then my point stands. You have to write a function to handle the case where an env var isn’t provided, that’s boilerplate. You have to get a library for, say, accessing contents in Azure or AWS, set that up, figure out how that api works, etc, while you could already do that with the awscli and probably already did it to check if you could get what you want. What’s the syntax for mkdir? What’s it for mkdir -p? What about other options? If you already use the terminal frequently, some of these are your basic bread and butter and you know them probably by heart. Unless you start doing that with Deno, you won’t reach the level of familiarity you can get with the shell (whichever shell you use ofc).

          And many argue against bash with regards to error handling. You don’t always need something that a proper language has. You don’t always need to handle every possible error state differently, assuming you have multiple. Did it fail? Can you tolerate that failure? Yup? Good. No? Can you do something else to get what you want or make it tolerable? Yes? Good. No? Maybe you don’t want to use bash then.

          • FizzyOrange@programming.dev
            2 days ago

            You’ve essentially dissed people who use it for CI/CD and suggested that their pipeline is not robust because of their choice of using Bash at all.

            Yes, because that is precisely the case. It’s not a personal attack, it’s just a fact that Bash is not robust.

            You’re trying to argue that your cardboard bridge is perfectly robust and then getting offended that I don’t think you should let people drive over it.

            About shared libraries, many popular languages, Python being a pretty good example, do rely on these to get performance that would be really hard to get from their own interpreters / compilers, or if re-implementing it in the language would be pretty pointless given the existence of a shared library, which would be much better scrutinized, is audited, and is battle-tested. libcrypto is one example. Pandas depends on NumPy, which depends on, I believe, libblas and liblapack, both written in C, and I think one if not both of these offer a cli to get answers as well. libssh is depended upon by many programming languages with an ssh library (though there are also people who choose to implement their own libssh in their language of choice). Any vulnerabilities found in these shared libraries would affect all libraries that depend on them, regardless of the programming language you use.

            You mean “third party libraries” not “shared libraries”. But anyway, so what? I don’t see what that has to do with this conversation. Do your Bash scripts not use third party code? You can’t do a lot with pure Bash.

            If your temporary small script morphs into a monster and you’re still using bash, bash isn’t at fault. You and your team are.

            Well that’s why I don’t use Bash. I’m not blaming it for existing, I’m just saying it’s shit so I don’t use it.

            You could use Deno, but then my point stands. You have to write a function to handle the case where an env var isn’t provided, that’s boilerplate.

            Handling errors correctly is slightly more code (“boilerplate”) than letting everything break when something unexpected happens. I hope you aren’t trying to use that as a reason not to handle errors properly. In any case the extra boilerplate is… Deno.env.get("FOO"). Wow.

            What’s the syntax for mkdir? What’s it for mkdir -p? What about other options?

            await Deno.mkdir("foo");
            await Deno.mkdir("foo", { recursive: true });
            

            What’s the syntax for a dictionary in Bash? What about a list of lists of strings?
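
For reference, a sketch of what Bash offers here, assuming Bash 4.0 or newer for associative arrays. Bash has no nested arrays, so the "list of lists" part below is only a common workaround that packs each row into a delimited string; all names and data are made up.

```shell
#!/usr/bin/env bash

# A dictionary: an associative array (Bash 4.0+).
declare -A ports=([http]=80 [https]=443)
ports[ssh]=22

# Look up a key, failing instead of returning "" when it is absent.
lookup_port() {
  local name="$1"
  [[ -n "${ports[$name]+set}" ]] || return 1
  printf '%s\n' "${ports[$name]}"
}

# No nested arrays: each row is one string with ':' as an assumed separator.
rows=("alice:admin:ops" "bob:dev")

# Print one field of one row (both indexes start at 0).
row_field() {
  local -a fields
  IFS=':' read -r -a fields <<< "${rows[$1]}"
  printf '%s\n' "${fields[$2]}"
}
```

So `lookup_port https` prints 443 and `row_field 0 1` prints admin, but the second structure breaks as soon as a value contains the separator, which rather supports the point of the question.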