CI/CD Pipelines
First of all, the pipelines will be built using GitHub Actions, and they will go all the way to deployment (Continuous Integration/Continuous Deployment). With that in mind, we will break down the philosophy of how they are set up and the key points to take into consideration.
By convention, GitHub Actions pipelines are set up as YAML workflow files under the .github/workflows/ directory of the repository.
Philosophy
A robust CI/CD pipeline for any modern application is crucial for ensuring smooth development, testing, and deployment processes. This specific project draws both on personal experience and on nuggets of wisdom from experts in the field; one source for the ideas applied is Hands-On Continuous Integration and Delivery.
In short, manual and disorganized processes should be mitigated as much as possible, and it's through communication with different areas of the business that we can find areas of improvement. Automation should be the key word when it comes to relegating grunt-work processes that only take time away from more important tasks for a developer. Besides that, we should have a clear flow of work so that the software we build is fast on delivery and health checking: if someone codes something that breaks the system, we should pick up on it as quickly as possible.
While we have already put a set of automation steps in place on each workstation through Husky, there are definitely ways to circumvent those, and we shouldn't have to rely entirely on each developer's willingness to adhere to protocol; we want a centralized and (in theory) incorruptible source of truth. Hence there are other mechanisms to keep tabs on code quality and correctness (according to an in-house set of rules):
- Pull Requests
- CI/CD Pipelines
Due to the nature of this project, pull requests definitely won't be leveraged as much, unless the need or opportunity for them arises. However, they should still be in place so that a whole team can look at code changes that are about to be integrated into the codebase, give feedback, and pick up on possible issues or opportunities for improvement.
Pipelines, on the other hand, will be leveraged in the form of GitHub Actions: there will be one pipeline for PRs and another pipeline for CI on the main branch. (It's here that you can see that PRs and pipelines go hand in hand, and there are arguments for trimming down checks for one use case while adding others for another.)
Even if a developer has already gone through the layers we have set up in the form of hooks, the pipelines will run some of the same steps plus others specific to them, so that the process stays fast when running on the remote repository platform (GitHub).
The idea behind Continuous Deployment is Agile in nature: always be building something, testing it, and publishing it. Whether you want to go all the way to that point depends highly on the team and the product, since there might be justified use cases in which you only want to go as far as Continuous Delivery, where the final deployment requires manual intervention and has to be done by someone assigned to that task.
In our case, we are going all the way to deployment, but be aware that this is extremely taxing in nature, since we are running on a build machine. So be sure to optimize when pipelines should run, and don't let them become a hassle and a source of frustration when they should be there to make things easier.
And in that endeavor of making the pipelines as fast as possible, we will leverage caching, specifically caching with pnpm.
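As a small preview of what that means (the full workflows are broken down in the sections below), the caching boils down to letting actions/setup-node key a cache of pnpm's package store on the lock file. A minimal sketch, assuming the lock file lives at kakeibro-web/pnpm-lock.yaml:

- uses: pnpm/action-setup@v4
- uses: actions/setup-node@v4
  with:
    node-version: 22
    cache: pnpm                                         # reuse pnpm's store between runs
    cache-dependency-path: kakeibro-web/pnpm-lock.yaml   # cache key is the lock file hash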
PR Pipeline
This consists of one job called "Lint, Build, Check". It will only run on PRs pointed at master.
CI
Steps:
- Run on a base Ubuntu image and, from the root of the repo, head down to the ./kakeibro-web folder.
- Establish a strategy: run two instances of the pipeline in parallel (a matrix), one with Node 20 and another with Node 22.
  - A good practice is to test the build process on different Node versions. Since we are following best practices, we test on the Active LTS and Maintenance LTS versions of Node.
- Check out the code with the pre-defined actions/checkout@v4 action.
  - Note: At the time of writing the pre-defined actions are at v4. This could get updated later, and we should be mindful of it in case other actions are still on older versions such as v3.
  - Note 2: We use these pre-defined sub-routines because GitHub encourages it and they abstract very general use cases such as cloning the repo first (the build machine always starts with an empty state). They come with the added bonus of extra syntax that builds on top of the base functionality without us having to code it from scratch (e.g. fetching specific commits, optimizing performance, attaching submodules).
- Install pnpm with pnpm/action-setup@v4. Just like the previous step, it works with a pre-defined script aimed at abstracting logic and keeping the file cleaner.
- Install Node with actions/setup-node@v4. In this specific instance we want to tie it to pnpm, and due to the matrix we set up at the beginning we now query the matrix.node-version variable so that each run gets its respective Node version. Besides that, we set up caching by simply pointing the action at a strategy that caches with pnpm, and we point it specifically at the pnpm-lock.yaml file so that the build machine can read it, check its own cache of packages and, if all the hashes match, use the cache instead of installing everything again from scratch.
  - IMPORTANT: It is for these types of use cases that committing the -lock file can be extremely helpful.
- Install all the dependencies with pnpm and the --frozen-lockfile flag so that it doesn't try to overwrite anything in the lockfile.
  - Note: pnpm i --frozen-lockfile is recommended for CI/CD pipelines to ensure consistency. The command installs dependencies without modifying the pnpm-lock.yaml file.
- Run ESLint by leveraging the same scripts that Husky leverages.
- Run a build to check that all the code is okay and nothing broke.
- Run a script called debug-check.js that lives under a scripts folder in the web app folder. It is used by both Husky and now the CI/CD pipeline to check for console.log or debugger statements in the code, failing if it finds any.
- Run a check for outdated packages, failing silently; the table of all outdated dependencies will still show in the pipeline summary.
- Run a script that checks for package versions with known vulnerabilities.
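Putting the steps above together, here is a minimal sketch of what the PR workflow could look like. It is assembled from the description above rather than copied from the project, so the file name, job id and the exact script names (lint, build, the debug-check.js invocation, the outdated/audit handling) are assumptions:

# Hypothetical .github/workflows/pr.yml (illustrative sketch, not the project's exact file)
name: PR Checks

on:
  pull_request:
    branches: [master]

jobs:
  lint-build-check:
    name: "Lint, Build, Check"
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ./kakeibro-web
    strategy:
      matrix:
        node-version: [20, 22]
    steps:
      - uses: actions/checkout@v4
      # pnpm version is resolved from package.json's packageManager field (or via the version input)
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: pnpm
          cache-dependency-path: kakeibro-web/pnpm-lock.yaml
      - run: pnpm install --frozen-lockfile
      # Same ESLint script Husky leverages (script name assumed)
      - run: pnpm run lint
      - run: pnpm run build
      # Fails the job if console.log or debugger statements are found
      - run: node scripts/debug-check.js
      # Outdated report shows in the log but does not fail the job
      - run: pnpm outdated || echo "Outdated packages reported, not failing the job"
      # Vulnerability check
      - run: pnpm audit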
Master Pipeline
This consists of two jobs: "Lint, Build, Check" and "Deploy to Firebase".
CI
The steps are identical to the PR pipeline's "Lint, Build, Check" job described above: the same Ubuntu base image and ./kakeibro-web working directory, the same Node 20/22 matrix, checkout, pnpm and Node setup with pnpm caching, a frozen-lockfile install, ESLint, the build, the debug-check.js scan, the silent outdated-packages report, and the vulnerability audit.
Deploy
Steps:
- This job depends on the previous CI job; if CI doesn't fail, the deploy runs normally.
- Run the checkout recipe.
- Run the pnpm recipe.
- Run the Node recipe with a specific Node version: we aim to stay up to date with the most modern version that is still Active LTS or Maintenance LTS, so we set up Node with version 22. The same caching setup is used, so modules are taken from the cache if they are there.
- Install dependencies with --frozen-lockfile.
- Retrieve the service account details from the repository's secrets vault and save them to a temporary .json file so that the Firebase CLI can pick it up and use it for authorization.
- Run the deploy script. Since the Firebase CLI is installed through the package.json we should have it present; however, because this is a build machine, we need the service account .json credentials in a file, with its path referenced through an env variable (GOOGLE_APPLICATION_CREDENTIALS).
  - NOTE: We should have configured the corresponding secret on GitHub. Otherwise the flow will fail, since Firebase won't authorize the machine to do the deploy.
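Following the steps above, a sketch of how the master workflow's deploy job could be wired up is shown below. The job ids and the GCLOUD_SERVICE_ACCOUNT secret name mirror the snippets elsewhere on this page, but the overall shape is an assumption, not the project's exact file:

# Hypothetical deploy job inside the master workflow (illustrative sketch)
deploy:
  name: Deploy to Firebase
  needs: lint-build-check           # only runs if the CI job succeeded
  runs-on: ubuntu-latest
  defaults:
    run:
      working-directory: ./kakeibro-web
  steps:
    - uses: actions/checkout@v4
    - uses: pnpm/action-setup@v4
    - uses: actions/setup-node@v4
      with:
        node-version: 22
        cache: pnpm
        cache-dependency-path: kakeibro-web/pnpm-lock.yaml
    - run: pnpm install --frozen-lockfile
    # Write the service account JSON from the repository secret to a temporary file
    - run: echo '${{ secrets.GCLOUD_SERVICE_ACCOUNT }}' > $HOME/gcloud-service-key.json
    # Expose its path so the Firebase CLI can authenticate
    - run: echo "GOOGLE_APPLICATION_CREDENTIALS=$HOME/gcloud-service-key.json" >> $GITHUB_ENV
    # The deploy script runs tsc -b && vite build && firebase deploy --only hosting
    - run: pnpm run deploy
    # Remove the temporary credentials file
    - run: rm -f $HOME/gcloud-service-key.json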
Cleanup Pipeline
As mentioned in the PR pipeline and the Master pipeline, we leverage caching in order to make builds faster, but that doesn't come "for free": we need to be well aware that the cache has to live somewhere, whether we use pnpm's caching or the general-purpose cache action. The moment the cache varies, be it due to new dependencies being detected or other external factors, new caches will be generated, and they will start adding up. If we have no use for old caches we should clean them up, and automation of this cleanup is key.
It's with this in mind that all repositories have a cache cleanup CRON Action; the one specific to the web app is this:
name: Cleanup Action Caches

on:
  schedule:
    - cron: "0 0 * * *" (1)
  workflow_dispatch: (2)

permissions:
  actions: write (3)

jobs:
  clean-cache:
    runs-on: ubuntu-latest
    steps:
      - name: Get caches list
        id: list-caches
        run: |
          response=$(curl -s -H "Authorization: Bearer ${{ secrets.GITHUB_TOKEN }}" \ (4)
            -H "Accept: application/vnd.github+json" \ (5)
            "https://api.github.com/repos/${{ github.repository }}/actions/caches") (6)
          echo "$response" | jq -r '.actions_caches | sort_by(.created_at) | reverse' > caches.json (7)
      - name: Delete old caches
        run: |
          latest_cache_id=$(jq -r '.[0].id' caches.json) (8)
          jq -c '.[1:] | .[]' caches.json | while read -r cache; do (9)
            cache_id=$(echo $cache | jq -r '.id') (10)
            curl -X DELETE -s -H "Authorization: Bearer ${{ secrets.GITHUB_TOKEN }}" \
              -H "Accept: application/vnd.github+json" \
              "https://api.github.com/repos/${{ github.repository }}/actions/caches/$cache_id" (11)
          done
        if: success() (12)
1 | This Action will be configured as a CRON that runs at 00:00 UTC every day. |
2 | In case we need to trigger the Action manually, the workflow_dispatch flag enables that option. |
3 | Every Action has its own token injected into its runtime, however by default it is read-only. To delete caches we need that token to also inherit write permissions. |
4 | As mentioned in <3>, the Action will have a token injected for it, saved under secrets.GITHUB_TOKEN. We leverage the GitHub API in order to retrieve our cache list and clean it up, and we have to send an Auth token so that the API grants us access and responds correctly. |
5 | application/vnd.github+json is the media type with which GitHub responds, so we configure the request to accept the response in that format. |
6 | Another built-in utility is that, under the github namespace, we can retrieve variables such as .repository. Its value is the owner/name slug that, in the realm of the GitHub API, uniquely identifies the repository the Action is running in. |
7 | Whenever we have something saved under a variable, a good practice is to always reference it with quotation marks to avoid issues with spaces or special characters. The result of printing the contents of the variable (which holds the response from the GitHub API) is piped into jq, a utility for working with JSON strings; another good practice is the -r flag, so that output is emitted raw instead of JSON-escaped (we don't want to alter the response). With jq we can pipe different instructions that help us navigate a JSON structure: .actions_caches | sort_by(.created_at) | reverse first points to the .actions_caches property of the JSON object (we know it's an array), then sorts what's under that key by the created_at field (if the structure were not an array this would fail), and lastly, after the items have been sorted, reverses the result, since we want the most recent entry at the beginning (the top). In the end, the final form of what we transformed is saved to a caches.json file. |
8 | We have no technical use for this line, but it makes the pipeline break if it doesn't find a valid entry and lets us see the value on GitHub: we retrieve the latest cache ID into a variable. |
9 | The source is a JSON array, so in order to iterate over it and do something with the data we apply jq -c '.[1:] | .[]' caches.json. This leverages jq again to read the structure: with -c each JSON object is emitted on its own line, with .[1:] we slice the array to exclude the first element (the latest cache entry), and .[] then outputs each remaining item as a separate JSON object. To the output that jq yields we apply Bash shell constructs: we use a while loop because read is a command that reads a single line from the input, so applied like this we read line by line, saving each line's contents into a cache variable; the moment there are no more lines to read, the loop ends. |
10 | We then use jq again to extract the specific id field of the single JSON object we have in cache, and we save that value in a cache_id variable. |
11 | Just like in the previous step, we hit the GitHub API endpoint in charge of deleting a cache entry, sending the Action's token plus the cache ID we extracted. |
12 | Part of GitHub Actions: use this if you want to enforce a condition on a step or job, so that it only runs if the previous step/job was successful. It is a good way to enforce short-circuits, though still brittle, since you have to add it to each step or job that comes after the pivot element. (Same idea as our conditional execution for deployment.) |
Firebase
Firebase has a service called Hosting. Following its documentation we can integrate Firebase into our web app repository.
In the Firebase Console it’s best to create a project. We have created one for the whole kakeibro app.
A good practice, so that we adhere to DevOps practices, is to have the Firebase CLI in the form of a Node package, meaning it should be declared in our package.json.
- Head into the web app folder.
- (With the Firebase CLI installed) run firebase init.
- Type Y.
- Select Hosting.
- Use an existing project.
- Select the project that you have in Firebase.
- Since Vite outputs a bundle to a ./dist folder, set that as the public directory.
- Accept Configure as SPA.
- Do not set up GitHub Actions (we will do that on our own).
- Do not overwrite the index.html (we should have run a build already).
- Right after this whole setup we should have .firebaserc and firebase.json files in our repo. We should commit those; they are key instruments when running firebase deploy.
- Add a script to package.json so that it automates things (you can name it deploy). It should build the whole project and then run the Firebase CLI tool.
Due to the nature of Firebase, and just to test the script ourselves, it would be best to run the deploy script locally first.
After we have wired up the Firebase domain through a CNAME record for our custom domain (don't forget to add two records, one for the www subdomain and another one without it), we should be able to hit both kakeibro.dsbalderrama.top and www.kakeibro.dsbalderrama.top and get our app running right there.
NOTE: Don't forget to add the .firebase/ folder to .gitignore.
DISCLAIMER: The next paragraphs' approach has been deprecated and might break in the future. The project uses a service account; however, for the purposes of having more complete docs, this information is kept. In order to generate a token so that our Action can run a deploy, we can run firebase login:ci.
The firebase login:ci command generates a token specifically meant to be used on a pipeline, and by design it can only be passed at the firebase deploy --only hosting --token $FIREBASE_TOKEN level. We can't log into Firebase with this token, and it's for this reason that the deploy script has to be modified so that we can pass all of the different parameters to the firebase command: tsc -b && vite build && firebase deploy --only hosting. By adding the last parameter we make the command always run the deployment only for hosting by default, and we can then attach extra arguments to it. So, when we run pnpm run deploy --token ${{ secrets.FIREBASE_TOKEN }}, everything after deploy gets attached to the firebase deploy --only hosting … section of the deploy script.
Small note: A good practice, to avoid issues, is the usage of --only hosting, so that the project only tries to deploy to the SPA section of the whole Firebase project. There might be times when running the command without the flag will attempt to deploy to other services and end up breaking something.
About www domain vs clean name
Firebase will need to have a second domain added to it so that it redirects the www subdomain to our non-www domain. (GitHub Pages sets that up automatically.)
A good recommendation, at the DNS level, is to set up the second CNAME record to point to the non-www CNAME.
The DNS configuration side isn't enough though: the server needs to take care of the redirect itself, the HTTP redirection if you will. Hence we have to make these two configurations ourselves, one on Firebase (the server) and another on our DNS provider.
Danger of deprecation of --token
When running the previous --token parameter approach we will get a warning stating:
Authenticating with `--token` is deprecated and will be removed in a future major version of `firebase-tools`. Instead, use a service account key with `GOOGLE_APPLICATION_CREDENTIALS`: https://cloud.google.com/docs/authentication/getting-started
This indicates that we should move away from using this CI token and instead use the service account keys approach. We should be managing different keys for different services, since using a master .json credentials key is a single point of failure: if it gets compromised, ALL OF YOUR SERVICES ARE COMPROMISED. We have to download the key as a .json file and then upload its contents as a secret to the repository (as text).
Firebase projects end up as separate GCloud projects, and the best way to manage their credentials is to head to the Project Settings and, under the Service Accounts tab, click on the All service accounts link. It will take us to a GCloud instance where we might see an already configured service account. We can then head down to Service Accounts in the GCloud panel and generate a key for the service account that has access to our Firebase project in charge of the website.
With that in mind, the command can get rid of the --token parameter, and we can leverage env variables and the secret text like this:
- name: Setup service account credentials
  run: echo '${{ secrets.GCLOUD_SERVICE_ACCOUNT }}' > $HOME/gcloud-service-key.json (1)
- name: Set GOOGLE_APPLICATION_CREDENTIALS
  run: echo "GOOGLE_APPLICATION_CREDENTIALS=$HOME/gcloud-service-key.json" >> $GITHUB_ENV (2)
- name: Build and Deploy to Firebase
  run: pnpm run deploy
- name: Cleanup service account file
  run: rm -f $HOME/gcloud-service-key.json (3)
1 | The main thing is that the env variable is expecting a .json file path, not the
actual json string. Hence we have to copy the contents for a brief second to a file. |
2 | It’s then that we save as an env variable (so that it persists for the next steps), the path to the file we saved with the service account json string. |
3 | Lastly, as a security measure, we will delete the temporary file with our credentials as a form of cleanup. |
References:
You can edit the levels of access for a service account by heading to the IAM section of the Google Cloud console.
Possible audit blockers
When running the pnpm audit step, if we do find a vulnerability in a package, the command immediately returns exit code 1 and the pipeline fails. While this is great because our controls are in check, it might end up blocking development if we don't do something about it, because sometimes the vulnerable package is a transitive dependency. Until that gets fixed in our directly referenced packages with a version bump, we would be stuck; this depends on others, and we can't block the development flow because of it.
And so, one option is to fail silently while staying on top of updates. This can be done as easily as pnpm audit || echo 'Failing audit silently, waiting for esbuild bump'.
We could even go further and automate this, but nevertheless, this is just part of the established workflow: some pains we have to accept while still getting value out of everything that was set up.
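As a workflow step, that silent-fail pattern could look like the following sketch (the step name is illustrative):

- name: Audit dependencies (non-blocking)
  run: pnpm audit || echo 'Failing audit silently, waiting for esbuild bump'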
As mentioned in CI, we have a matrix. One cool thing about GitHub is that it will try to optimize runs: if one of the matrix instances fails while another one is still running, the other one will be stopped and cancelled immediately.
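That cancellation is GitHub's default fail-fast behaviour for matrix jobs; the knob controlling it is strategy.fail-fast, as in this sketch (set it to false to let sibling matrix runs finish):

strategy:
  fail-fast: true           # the default: cancel in-progress matrix jobs once one of them fails
  matrix:
    node-version: [20, 22]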
A way to stay on top of updates is to keep checking forums (ideally GitHub) for how the maintainers will deal with the dependency.
E.g.: GitHub Issue.
Depending on how people are looking at it, you might even want to start adding the specific vulnerability to the exceptions:
"pnpm":{
"auditConfig": {
"ignoreGhsas": [
"GHSA-67mh-4wv8-2f99"
]
}
}
This, for example, takes care of an esbuild vulnerability that won't be fixed until the next major release of Vite, hence we have to make do with the vulnerability for now. (It also documents extra information for us as developers about that dependency.) The decision in this specific case was to update to v6.2 (for example) the moment it's available, since that will get rid of this vulnerability.
Outdated: Depending on the versions, we could get the pnpm outdated step failing instead of just adding "warnings". E.g.: globals went from version 15.15.0 to 16.0.0 and the step started failing. How would you fix this? Try to run pnpm update; if this doesn't bump the package to the latest version, manually edit it and then run pnpm update again.