Disclaimer: This is a rant, but I mean it to be a useful rant. The point of this article is to be mad at things, offer solutions, and explore them. However, I will not fully explore the solutions offered, so they may appear to be half-baked. My intention is not to dwell too much on their details but to stimulate your mind and try to shake some of the stupid status quo.
Oh, and it’s also not as long as it seems. It features some images and an optional appendix. It’s going to be fun, promise!
Think about a program you’ve had to write. Because you’re a true Unixer you accept your input from stdin and write your output to stdout. There’s a beautiful simplicity to it, which is further enhanced with pipes. Pipes are truly beautiful. No joke. They’re an amazing concept, and they make sense conceptually. Not a lot of things have that.
So let’s look at a program we all know and love, curl:
% curl -vvv http://api.icndb.com/jokes/random | jq '.value.joke'
"Two wrongs don't make a right. Unless you're Chuck Norris. Then two wrongs make a roundhouse kick to the face."
And curl heartily replies with the IPs it’s connecting to, the request and response headers, and even progress bars! Finally the output is passed to jq where it can go its merry way and get us our joke.
Hold that thought. Something’s fishy here.
How did curl distinguish the data (the HTTP response) to be passed on through the pipe from the info meant for the user (progress bars et al.)? After all, jq isn’t interested in the request’s progress bar, but the user sure is.
You most likely know the answer: stderr. stdout is what’s piped to the next command, while stderr, which the pipe leaves alone, stays attached to your terminal. We can explicitly handle stderr like this:
% curl -vvv http://api.icndb.com/jokes/random 2>/dev/null | jq '.value.joke'
"Chuck Norris can win at solitaire with only 18 cards."
By convention, 2 is stderr’s file descriptor, so we can tell our shell to redirect it to some file. Now you won’t see it.
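To make the routing concrete, here’s a minimal sketch of how the shell treats the two streams; the `emit` helper is made up for illustration, standing in for curl:

```shell
# emit mimics curl: one line of data on stdout (fd 1) for the pipe,
# one line of chatter on stderr (fd 2) for the human.
emit() {
  echo "data"            # fd 1: programmatic output
  echo "progress" >&2    # fd 2: user-facing output
}

emit 2>/dev/null         # discard the chatter; the pipe sees only "data"
emit >/dev/null          # discard the data; "progress" still hits the terminal
emit 2>&1 | tr a-z A-Z   # fold stderr into stdout; the pipe sees both lines
```

Note that the pipe only ever grabs fd 1 — everything else is up to explicit redirections like these.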
You may not have noticed what’s “wrong” here, especially if you’re a shell user with some experience under your hat. I urge you to reread the previous couple of paragraphs.
Take your time.
How on Earth do we use stderr as a convention to signify output meant for the user? It’s literally named “standard error”, and we bastardised it into also meaning “completely legit output”.
I’ve tried to come up with an appropriate analogy but failed, since the only suitable one seems to be a restaurant with a dish named “punch my face” which is sometimes a delicious soufflé and sometimes earns you a punch in the face.
This isn’t just stderr. stdin is also faulty:
% echo 4 | read -E
4
Have a program which accepts input from stdin but also wants user input (“are you sure”s, multi-step input, etc)? Too bad if you’re piping!
Yes, you can do things like manipulate /dev/tty and friends. But that’s beside the point. You can also use curses, or write a GUI or a server, or you can just kick off and go to Bermuda or the Bahamas (come on pretty mama).
The point is that the holy trio of stdin, stdout and stderr is flawed. They conflate two distinct concepts: program interaction and user interaction. We want pipes for programmatic manipulation of data and the terminal for interactive manipulation of data.
We add two more streams: userin and userout. Pipes bind between stdin and stdout, userin and userout are bound to the terminal.
If there’s no pipe before you and you read from stdin, then you’re reading from the terminal, which is userin. If you’re not connected to a terminal, then things go as normal: reading will block forever, writing has no effect.
Here are some complications which arise:
It does however raise a valid programming decision: what do we accept from stdin and what from userin? There’s no clear-cut answer, but if you think about it you may already have it: structured input which suits programs (like init-file formats, HTML, or any structured data really) comes from stdin. Human input (like prompts, queries, etc.) comes from userin. It’ll have to be thought through.
A possible solution to some of these problems is to have a libc function give you the handles to userin and userout so they won’t be ordinary global variables. That means they won’t be pipe-able, at least not without special syntax. I’m happy with that.
I really like this idea because it adds to the beauty while not breaking everything: programs still interact through pipes. And it’s opt-in — programs written before the introduction of userin and userout won’t feel anything, while programs written after can finally stop writing user output to stderr and know that they can get user input from userin.
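You can approximate userout today with /dev/tty, falling back to stderr when no terminal is attached — a hypothetical stopgap sketch (the `userout` function is mine, not a proposal for the real API):

```shell
# userout, approximated: write to the controlling terminal if we can,
# otherwise fall back to stderr (the very convention this rant is about).
userout() {
  { cat > /dev/tty; } 2>/dev/null || cat >&2
}

echo "Fetching joke..." | userout   # reaches the human, never the pipe
echo '{"joke": "..."}'              # reaches stdout, i.e. the pipe
```

The difference under a real userout would be that the fallback branch disappears: the handle either points at the user or at nothing, never at a stream some program might be consuming.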
But enough about that. Let’s move on to even more controversial ground.
It’s currently 2016 and I still can’t view images in my terminal, see a syntax-highlighted file, interactively read a PDF, or plot a graph. The common thread: you can only view text and nothing else.
It doesn’t have to be this way. I’m getting cold feet just from writing these lines but what if…what if…the terminal wasn’t limited to text?
The future is here and its prompt is a Nyan cat. Terminology implements all that I asked for and more. In the picture above I typed tycat 3d-save.png and boom, an image appeared. So what am I complaining about, right?
The problem is that of standards. We need a standard way to tell our terminal that hey, this output here is an image or a video or PostScript or a code file, please format it as appropriate. One terminal implementing a separate command for this kind of output isn’t good enough. That guy or gal who just finished installing Ubuntu as their first Linux distro should be able to fire up gnome-terminal and cat a cat image.
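There are existing, mutually incompatible precedents. iTerm2, for instance, has a non-standard inline-image escape sequence (OSC 1337): base64 the file, wrap it in the escape, and any terminal that doesn’t speak the protocol prints garbage. A sketch (the function name is mine):

```shell
# Emit an image using iTerm2's (non-standard) inline-image escape:
# ESC ] 1337 ; File=inline=1 : <base64 data> BEL
imgcat_lite() {
  printf '\033]1337;File=inline=1:%s\a' "$(base64 < "$1" | tr -d '\n')"
}

# imgcat_lite kitten.png    # in iTerm2 the kitten appears inline
```

Which is precisely the problem: it works in one emulator and nowhere else, and that’s why a standard matters more than any single implementation.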
Remember that with the addition of userout we have a way to communicate that a piece of information is meant for the user and not for a stray program, so piping isn’t a huge concern of ours.
This isn’t a walk in the park. Once again we face a list of problems:

1. How do programs signal the format of what they’re outputting?
2. How do we interleave several kinds of output on one stream without collisions?
3. What about terminals that don’t support this?
4. What’s the minimal set of formats a terminal must support to be compliant?
5. What about multiple terminals?
The simplest problem to tackle is signalling the file format. We have MIME types for that. But it’s not a matter of simply writing a MIME header followed by a newline: what if we don’t want to send a MIME header? What if we don’t send a MIME header but our output looks exactly like one?
To tackle both 1 and 2 a relatively straightforward way is to be able to “fork” a stream. C pseudo-code:
FILE *userout = getuserout();
FILE *video_stream = forkstream(userout, "video/ogg");
FILE *gif_stream = forkstream(userout, "image/gif");
This’ll allow us to provide a MIME type and pass the file around without worry. Cooler than that would be ensuring that writing to gif_stream while at the same time writing to video_stream wouldn’t cause a collision. This will be difficult to implement, but I think it’s worth it.
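One way to dodge the “what if my data looks exactly like a MIME header” trap from above is length-prefixed framing: announce the type and byte count first, then ship raw bytes. A toy sketch (forkstream itself would hide this framing behind the stream handles):

```shell
# write_frame MIME  -- reads stdin, emits "<mime> <length>\n<payload>".
# Because the length is declared up front, the payload can contain
# anything, MIME-header lookalikes included.
write_frame() {
  payload=$(cat; printf x)    # the trailing x protects final newlines
  payload=${payload%x}
  printf '%s %s\n' "$1" "${#payload}"   # length in chars (bytes, for ASCII)
  printf '%s' "$payload"
}

printf 'Content-Type: looks/like-a-header\n' | write_frame text/plain
```

A reader consumes the one-line header, then reads exactly that many bytes — no escaping, no ambiguity.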
Annoyingly, we need to tackle 3 and 4 and 5: What about terminals that don’t support this? What’s the minimal amount of formats needed to be compliant? What about multiple terminals?
Let’s do a sidebar for a moment, a sidebar which I think is relevant not just for this point or this rant, but for the entire series of rants. Instead of listening to my babbling (which I’m sure you’re tired of by now), I recommend you listen to someone smarter than me say smart things. Tune in until he shows the slide “Do web sites need to look exactly the same in every browser”. Maybe more. You should watch that lecture anyway. Nicholas Zakas on Progressive Enhancement:
Decided to skip it or just want to see me talk? Oh you flatter-mouth. Here’s a recap of what he said:
TVs used to only display black and white. Then, people started making colour TVs, and then high-def TVs. Despite having vastly different capabilities, they’re all capable of showing the same content: You plug them in and they show you the 100th rerun of Friends.
A question I still haven’t acknowledged is whether this gives too much responsibility to the terminal. I think the answer is no. The terminal should support exactly as much as it wants to and not a format beyond. But it should have this basic capability: programs should be able to signal what they’re displaying to the user.
This is possible. We’ve been doing it elsewhere. We’ve just been neglecting the terminal.
fish has done amazing things to your shell. You have completion suggestions and colours and so many things which make so much sense that you wonder why anyone would still voluntarily use bash!? zsh, which is an absolute beast, is still not ubiquitous even though it’s largely a drop-in replacement for bash.
Why? Why doesn’t your distro ship with zsh or fish? To be fair some do, sort of. I can speak of Arch whose installation image comes with a configured zsh. But that’s a drop in the ocean. Why doesn’t Ubuntu use zsh as the default? Why doesn’t CentOS give you fish?
I don’t know why we do this to ourselves. Your screen is amazing. You watch HD movies on it. But when you work in the terminal, you probably don’t even have anti-aliased fonts.
While writing this I’ve googled around for my ideas. It’s impossible that I’m the only one who had the insane ideas of adding standard streams or making the shell less horrible. And indeed, two popular projects cropped up: FinalTerm and TermKit.
FinalTerm hits some of the points I raised in the terminal section. It’s also never been out of alpha, and it also suffers from the severe disadvantage of being dead.
TermKit amazed me, for better and for worse. It hit on what I aimed for and then some. It also suffers from the same symptom of death. I’m not going to cover everything here since a post-mortem was published on this reddit page.
The failure of these projects isn’t making me happy. FinalTerm died because of technological reasons alongside regular OSS reasons (with the lead developer demotivated, who picks up the reins?), and TermKit died both because of technological reasons and because it tried to do too many things in too short a time. This isn’t motivating.
What saddens me most is that I just don’t know what to do with these ideas. Even if I were to implement my own shell and terminal that wouldn’t be enough — these things require co-operation on the side of program makers. It’s possible that it’s too late and we’ll always be in the state that we are today. It’s not that bad. It’s just…ugh.
Ugh.
I don’t know what’s next.
I started working on a terminal emulator called plex. It won’t be a complete TE and will only provide a POC of the ideas presented here (hopefully).
Let’s see where it goes.
But wait a minute…where do you think you’re going? We’re not finished yet. This section details problems to which I have no solution, or where my thoughts on the matter are still too immature to matter. Feel free to skip this section or the entire article! Might be a bit too late to do the latter.
Let’s start out with something fresh: passing arguments to commands.
# It always starts out simple:
% ls something
% ls -l something
% ls --format=long
# wait, or was it
% ls --format long
# what about spaces
% ls '01. Serenity.mp4'
# and quotes
% ls "I'm a teapot"
% ls I\'m\ a\ teapot
# now you'll want to kill yourself
% grep -EIrn 'can \'you [\\doubt] \"w\\/hy'
I don’t even know if the last example is correct. I don’t have the heart to go through it.
You have a bunch of ways to pass arguments to a command, some of which have to be prefixed in a certain format. Some flags need a single dash, some flags need a double. And there’s a lot of inconsistent logic: usually a double dash prefixes a multi-letter word, but sometimes it doesn’t (example: mplayer -sub subtitle-path but also mplayer --help, and find is a repeat offender). Sometimes you can specify several single-letter arguments after one dash, a la grep, but sometimes things are just weird, as in head -20 path, which treats a numeric argument after a dash as shorthand for -n, and the list just goes on.
And don’t you get me started on find -exec and the kind of juggling you have to go through there, not to mention piping and passing things into sh -c!
Oh, and let’s not forget --help, which sometimes works but sometimes isn’t handled. Imagine how powerful it’d be if your shell could intelligently infer a program’s arguments by first running it with --help and parsing the results, providing you with intelligent hints and completions. But of course the usage output isn’t standardised at all, so you can forget that. It is somewhat alleviated by built-in argument parsers like Python’s argparse or Ruby’s optparse, but you’ll never see something like that as part of libc.
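To see how far you can get without a standard, here’s a deliberately naive sketch that scrapes flag-shaped tokens out of a program’s --help text — fragile by construction, since help output isn’t standardised (the function name is mine):

```shell
# flags_of PROG: run PROG --help and fish anything flag-shaped out of
# the text. Works on well-behaved GNU-style tools, falls apart elsewhere.
flags_of() {
  "$1" --help 2>&1 | grep -oE -- '--?[A-Za-z][A-Za-z0-9-]*' | LC_ALL=C sort -u
}

flags_of grep | head -n 5   # a rough flag inventory for a completer to chew on
```

This is roughly what completion frameworks end up doing by hand, one curated script per command, because no machine-readable usage description exists.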
There’s also how downright weird variable expansion is. It’s like dumb textual macros. We deserve so much better.
fish sort of improves upon this, and on the way breaks a lot of things (which is both good and bad). But I don’t think they went far enough. I want a proper programming language.
Now tell me: Have you ever run ipython?
It’s a mind-boggling experience. If I were a braver man I’d chsh to ipython. This is much more like what we deserve: a proper programming language, with sane syntax for piping and variables, and actual help.
We’re never going to get it.
Let’s say you have two programs and they want to send data to each other. To ease things a bit you run both through the shell so you pipe them up. One of the most famous examples is ps and grep:
% ps aux | grep dbus
Once again, there’s something wrong here, something I’ve subtly alluded to in my first example with curl and jq: programs don’t send structured, machine-friendly data to one another. Because of the conflation of program output and human output, programs have over time adopted their own output conventions, and you, the output consumer, are left on your own to make sense of what’s going on. To clarify: ps’ output isn’t easily machine-readable and parseable. Our program needs to know both that we’re accepting input from ps and how to parse that input.
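To make the fragility concrete, here’s the kind of string surgery every consumer of ps reinvents, run against a made-up two-line stand-in for ps output:

```shell
# Extracting the PID of a process by name means counting whitespace-
# separated columns and praying the layout never changes.
sample='USER PID %CPU COMMAND
root 1 0.0 /sbin/init
messagebus 842 0.1 dbus-daemon'

printf '%s\n' "$sample" | awk '$4 ~ /dbus/ { print $2 }'
```

Add a column, widen a field, or let a command name contain a space, and every downstream consumer silently breaks — which is exactly what structured records would prevent.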
This is both good and bad. This is good because having your program output free-form text is amazing. This is bad because having your program output free-form text is difficult to deal with.
This is not an easy problem to solve. Most people say “just start passing objects around instead of text”, but it’s not as easy as it seems. The semantics of communication between programs is a hard problem. Here are some difficulties:
Speaking of which, I haven’t said what format we’ll be using. JSON? MessagePack? BERT? Cap’n Proto? Any other of the millions out there? How will we pick one over another? If we opt for strongly typed solutions like Protobuf or Thrift, how do we communicate our schemas? If we opt for weakly typed solutions like S-expressions, we’re left with problems of validation and deserialisation, which points us to the next problem:
You may be thinking that PowerShell has already solved this problem. That’d be like saying Python solved it — PowerShell only communicates between programs written for a specific platform; it is not a general-purpose mediator between entirely different platforms. How does Java communicate date objects to Erlang? How does Io (which uses prototypical inheritance) send objects to C?
This is a hard problem. One which isn’t going to be solved today. For a smarter person than me saying smart things, I recommend you watch Joe Armstrong’s “The How and Why of Fitting Things Together”: