“Wait, what? The people who need it the most? That's pretty random.” - The people reading this article, probably.
Let's be honest. If you've been a developer for a while now, you've used XML and JSON. You're probably comfortable with both of these formats. You've likely heard of YAML, and maybe seen it if you use other tools like Docker or OpenStack.
But maybe your knowledge of it has been juuuuust enough to modify a string, or update some existing files. You don't really know what's going on in that file, and you're maybe not compelled to learn the ins and outs of YAML. If that's you, congratulations! You're the type of person who needs a quick start for YAML the most, so you can take advantage of its true power 💡💡
In this post I'll cover some reasons why you should learn YAML, its cool features and how to get started.
1. Why does YAML even exist?
This might seem like a weird place to start, but you should know why YAML exists and where it fits into the overall markup landscape. Otherwise, you run the risk of just learning the bare minimum to scrape by because you may view it as unnecessary. In my personal experience, this means that I wind up knowing just enough to be dangerous - able to create and modify scripts from largely googling tips, but unable to diagnose why something doesn’t work.
Everything you ever type in YAML could also be adequately expressed in XML or JSON, so why is YAML even a thing? YAML is less verbose than XML while still letting developers describe precisely what they want. It also allows for greater flexibility in how you store your data, which is something we’ll get into later.
A YAML file is also in a better position to explain to you what it is doing. The reason for this is that YAML is sold as a “human friendly” data serialisation language. JSON and XML are also data serialisation languages, but they weren't created with the same focus on readability. After all, JSON doesn’t natively support comments (you'd have to use JSONC for that, or another implementation of JSON that supports comments).
YAML files are usually used to configure something. They are used to configure Docker, with docker-compose, for instance. OpenStack also uses YAML configuration, and of course, so does Codemagic.
The point of the story is that YAML is definitely here to stay, and if you know it, it will make your life easier. Moreover, you can potentially write less and do more with it than you could with, say, XML or JSON.
Why would I use it instead of JSON or XML?
The decision to use YAML to configure build pipelines when you are already familiar with JSON or XML can be challenging. You have to learn a new way of expressing data… So why should you?
Well, there's the obvious reason: the only way to interact with services that use YAML configuration files is to actually use YAML. Countless services today are configured through YAML, like OpenStack or Docker (via the docker-compose.yaml file). Granted, it takes a few moments to get started with any new language, but the benefits can be massive.
Then there's the other reason: YAML has features not found in JSON or XML that can actually make you a more efficient developer. And that's good for everyone, right? 😎 Bigger players like Azure DevOps (formerly Visual Studio Online) are switching from a UI-driven experience in their pipelines to one that is driven by a YAML file.
YAML positions itself as a “human-readable data-serialisation” language. So the intent is clear - to make it easy to read (and write!) structured data.
Now that we have a broad idea of what YAML is, and what problem it sets out to solve, let's dig in.
2. How does the formatting work?
All data serialisation languages have to follow a certain pattern, in order to ensure that they are valid. With XML, these are the nodes that make up the XML Document. They look like this:
It's easy to see where the data for a key starts and ends, as well as to assert what type of data is in each key. JSON is a similar story, where the same data looks like this:
But this is a YAML article, and you're here for the YAML 😄. YAML looks like this:
YAML doesn't rely on nodes or structuring in the same way that JSON or XML does. Instead, it relies on indentation levels to structure data. Personally, I wouldn't say that these indentation levels are better or worse than using curly brackets to define the data structure, but you do have to pay more attention than you would with, say, JSON or XML to what indentation level you're at.
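To make the role of indentation concrete, here is a minimal sketch of nested data. Every key other than buildtype is hypothetical and chosen only for illustration; the comment at the end shows the JSON that the same structure corresponds to.

```yaml
# Hypothetical build settings; the key names are illustrative only.
build:
  buildtype: native      # two spaces in: belongs to "build"
  targets:               # a nested list under "build"
    - android
    - ios
  options:               # a nested mapping under "build"
    verbose: true

# The same structure expressed as JSON:
# {"build": {"buildtype": "native",
#            "targets": ["android", "ios"],
#            "options": {"verbose": true}}}
```

Moving a line left or right changes which parent it belongs to, which is why consistent indentation matters so much.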
3. The basic structure
The root structure is pretty much just a key-value store. In C#, we call these dictionaries, whereas in Dart we would call them maps. The key is a string, and the value can be, well, anything. Most of the time these will be scalar values.
What is a scalar value?
A scalar value just means a single value.
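As a minimal sketch, a mapping with a single key and a single scalar value might look like this, assuming the buildtype key that this section discusses:

```yaml
# One key, one scalar value.
buildtype: native
```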
In the above example, the key buildtype will have the value of native.
Maybe you're looking at that and feeling confused, or even worried. In this instance, native isn't surrounded by quotes. So what type is it? In this case, the YAML interpreter sees a string and simply interprets it as such. If it was 0 instead of native, the interpreter would interpret it as a number. Of course, if we really wanted a 0 as a string, we would write '0'.
This kind of type detection occurs throughout YAML. If you want a boolean value, like true or false, simply write it. If you write a 0 or a 1 to try to express true or false, these will just be interpreted as numbers and not booleans. This aligns quite well with YAML's overall goal of being easy for humans to read (after all, true or false states the intent more clearly than 0 or 1). You're also able to put spaces in your keys and values, and the YAML interpreter will still know what you are on about. Let's take a quick look at this now.
buildtype: native (the same as we had above)
build type: native desktop (spaces in your key and your value are still valid YAML)
buildtype: 'downstream:offline' (quotes aren't required for keys or values, but you can use them if you want. You may have a specific requirement to include a colon in your value or key, and that's when using quotes would definitely be appropriate.)
target: null (null values are simply represented by the word null, or by ~ as shorthand.)
version: 100 (will be interpreted as a number value)
version: '100' (will be interpreted as a string value)
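The cases above can be collected into one document as a rough sketch. Because a YAML mapping shouldn't repeat a key, the repeated buildtype and version keys are renamed here; channel, fallback, version_label, enabled and flag are hypothetical names used only for illustration.

```yaml
buildtype: native              # plain (unquoted) string
build type: native desktop     # spaces in the key and the value
channel: 'downstream:offline'  # quoted string, so the colon is unambiguous
target: null                   # null
fallback: ~                    # null, using the shorthand
version: 100                   # integer
version_label: '100'           # string, because of the quotes
enabled: true                  # boolean
flag: 1                        # integer, not a boolean
```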
Use spaces, not tabs, for indents
When you're using an editor like Visual Studio Code to write YAML, you might get into the habit of hitting Tab to indent your lines. When you do this, the editor is actually inserting a few spaces, as opposed to an actual tab character. It's important to keep this in mind, as YAML doesn't allow tab characters for indentation. Trying to use them will make the YAML interpreter cranky. 😡🗯
Representing multi-line data
Representing multi-line data in JSON can be difficult. Searching for ‘JSON multi-line’ will yield quite a few questions on how to accurately represent multi-line strings in JSON. You may choose to include release notes in your YAML or other text data, so being able to appropriately represent multi-line data is important.
Fortunately, in YAML, it's trivial to express multi-line data. Simply use the | operator for this, like the below.
    release notes: |
      This release includes many more pictures of cats.
      Contact the developer if you have any questions about the cat quantity.
      Ciao!
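As a small sketch of how this behaves, the | style keeps each line break exactly as written. YAML also has a related > style that folds single line breaks into spaces, which is handy for long paragraphs. The summary key is hypothetical.

```yaml
# Literal style (|): line breaks are preserved as written.
release notes: |
  This release includes many more pictures of cats.
  Contact the developer if you have any questions.
  Ciao!

# Folded style (>): single line breaks become spaces,
# so this loads as one line of text.
summary: >
  This release includes
  many more pictures of cats.
```

In both styles, the text has to be indented further than the key it belongs to.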
4. Using collections
You can likely recall off the top of your head how other languages like JSON represent collections and lists. It's simple enough, right? In YAML it's even simpler.
To demonstrate this, let's imagine that we've been sent on an errand from Codemagic to pick some things up from the shop. It's on official Codemagic stationery, of course. It might look a bit like this.
We can see the “Shopping List” title, and then we can see the included items on that list. This is achieved by using a single dash next to each item, to indicate a new item so it's not ambiguous. Nobody told you that's how a list should be written, it's just a natural way to write a list. Right? Collections in YAML follow this intuitive nature, like so:
    artifacts:
      - build/**/outputs/**/*.apk
      - build/**/outputs/**/*.aab
      - build/**/outputs/**/mapping.txt
      - flutter_drive.log
We're describing a list of artefacts, and then jumping right in to what the values of that list are. In this case, it's simply a list of strings. There's not a lot different in this compared to the shopping list example given above.
We can also see that, just like our Codemagic shopping list, child items are indented out a little. It's important to be consistent in how far these are indented out with spaces (and remember, TAB characters aren't supported in YAML, so you can't use those to indent anything in YAML!).
Also, each collection can contain disparate types. So you can write a list with strings, numbers, or even scientific notation, and it's still valid YAML. Of course, your consuming app must be okay with this.
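A minimal sketch of a list that mixes types follows. The mixed key and its values are hypothetical.

```yaml
mixed:
  - native      # string
  - 42          # integer
  - 3.14        # float
  - 6.02e+23    # float in scientific notation
  - true        # boolean
  - ~           # null
```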
5. Don't repeat yourself
When we run our builds through Codemagic, it's likely that separate workflows will have some things in common. That makes sense, they are all ultimately building the same application for you.
So, if there are things in common, it stands to reason that we are destined to repeat ourselves in configuring these separate workflows. They will all produce APKs. They will probably all run tests.
In YAML, we don't have to repeat ourselves. Thankfully, it allows for us to write templates for ourselves and then re-use them again later on. These are known as anchors.
To familiarise ourselves with this concept, we'll take a look at a YAML file that I use in my ‘continual’ app. The codemagic.yaml is here.
This is the content of the file:
    workflows:
      default-workflow: &baseline
        name: Default Workflow
        environment:
          flutter: stable
        cache:
          cache_paths:
            - $FCI_BUILD_DIR/build
        scripts:
          - |
            # set up debug keystore
            rm -f ~/.android/debug.keystore
            keytool -genkeypair \
              -alias androiddebugkey \
              -keypass android \
              -keystore ~/.android/debug.keystore \
              -storepass android \
              -dname 'CN=Android Debug,O=Android,C=US' \
              -keyalg 'RSA' \
              -keysize 2048 \
              -validity 10000
          - |
            # set up local properties
            echo "flutter.sdk=$HOME/programs/flutter" \
              > "$FCI_BUILD_DIR/android/local.properties"
          - flutter packages pub get
          - flutter build apk --release
        artifacts:
          - build/**/outputs/**/*.apk
          - build/**/outputs/**/*.aab
          - build/**/outputs/**/mapping.txt
          - flutter_drive.log
        publishing:
          email:
            recipients:
              - firstname.lastname@example.org
      stable-workflow:
        <<: *baseline
        environment:
          flutter: beta
We can clearly see the intent of this document at a glance. There's a workflow, and the pipe operator is used wherever a script spans multiple lines. We also see collections of strings, such as the one under artifacts.
But we also have another workflow here: stable-workflow. And it has some weird << operator. Put simply, that's telling the YAML interpreter to use all the values found in the baseline anchor ⚓, which is defined with &baseline on the default-workflow line. Through this, we can easily re-use templates in the same document later on, and override the keys and values as we want to.
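Stripped down to the essentials, the anchor-and-merge pattern can be sketched as follows. The key names here (defaults, release, xcode) are hypothetical, and << is only honoured by parsers that support merge keys, so check that your consuming app does.

```yaml
# &defaults marks this mapping as an anchor named "defaults".
defaults: &defaults
  flutter: stable
  xcode: latest

release:
  <<: *defaults     # pull in every key from the anchor...
  flutter: beta     # ...then override just this one

# With merge-key support, "release" loads as:
#   flutter: beta
#   xcode: latest
```

Keys written directly in the mapping take precedence over merged ones, which is what makes the override possible.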
In my experience, you'll have different workflows for different reasons. You might have a test build run, and when it completes, it should email the testers. Then you might have another workflow that builds a stable version, and pushes it to the relevant stores (or posts to Slack channels, and all the other fabulous things you can do via the Codemagic YAML configuration).
Without anchors, you'd normally be copy-pasting these workflows into huge, long documents that are finicky to maintain. Your original build process will change, and when it does, are you going to want to update it in every single place where you referenced that particular build step? Using anchors lets you avoid this, and affords you the capability to compose your YAML files instead of manually prescribing everything over and over again.
After we have committed that codemagic.yaml file, how does it look on the actual Codemagic site?
stable-workflow turns up in Codemagic, right where we'd expect it. Because the YAML interpreter does the heavy lifting for us, we can easily choose workflows that derive from a common base.
6. Use a text editor with some smarts
You can edit YAML with any text editor. You can use vi if you really want to, or Notepad. But you can also use an editor that provides some sort of validation on the fly. For example, when I edit my YAML in Visual Studio Code, it looks like this.
Visual Studio Code helps us out by showing pretty colours next to the lines to show what indent level each line is at. It also converts your tab presses to spaces, which avoids validation issues caused by tab characters.
If you try to do something wrong, it also provides a level of validation (also known as linting).
Other editors like Notepad++ also provide YAML support 📒.
7. If you get stuck, read the instructions
If you want to become an expert in all things YAML, you can also read the specification here. While reading the specification probably sounds distinctly unenjoyable, it definitively stipulates what you can and cannot do in YAML.
For a more code-orientated approach to working with YAML, Learn X in Y Minutes has this excellent page showing a much more detailed usage of YAML (some of which you may not use in a build configuration setting, but hey, you never know).
So, now you know a little YAML, and it's time to shine🌟🌟. So go and grab your own codemagic.yaml from your build and have a play around 😃.
That's all for now. Happy hacking!
Lewis Cianci is a software developer in Brisbane, Australia. His first computer had a tape drive. He’s been developing software for at least ten years, and has used quite a few mobile development frameworks (like Ionic and Xamarin Forms) in his time. After converting to Flutter, though, he’s never going back. You can reach him at his blog, read about other non-fluttery things at Medium, or maybe catch a glimpse of him at your nearest and most fanciest coffee shop with him and his dear wife.