Recently I was doing some work on a Gatsby plugin, which gave me the chance to look at Gatsby closely and understand how it works. Mainly: how is it so fast?
To answer that, I first needed to know more about static site generators (SSGs). Broadly, websites are rendered in one of two ways: statically or dynamically.
A static site is generated before the user requests it. A dynamic site, in comparison, is generated on the fly to provide a personalised user experience (e.g. based on the user’s location or profile); think of recommendation pop-ups, for example.
Building a static site consists of three phases:
- Content is written in a templating language (e.g. JSX, Pug).
- The static site generator processes that content and spits out HTML files (as well as CSS and image files) as individual web pages.
- These files are then uploaded to a web server or CDN, and your website goes live.
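The three phases above can be sketched in a few lines of TypeScript. This is only a toy model to make the idea concrete, not how Gatsby is implemented: the names `Page`, `renderPage` and `buildSite` are made up for this example, and the "files" are kept in memory instead of being written to disk.

```typescript
// A toy static site generator. All names here are hypothetical.
interface Page {
  slug: string;  // e.g. "about"
  title: string;
  body: string;  // body markup for the page
}

// Phase 1: the "template" is just a function from content to an HTML string.
function renderPage(page: Page): string {
  return [
    "<!DOCTYPE html>",
    "<html>",
    `<head><title>${page.title}</title></head>`,
    `<body><h1>${page.title}</h1>${page.body}</body>`,
    "</html>",
  ].join("\n");
}

// Phase 2: at build time, run every page through the template and
// collect the results as output path -> file contents.
function buildSite(pages: Page[]): Record<string, string> {
  const output: Record<string, string> = {};
  for (const page of pages) {
    output[`${page.slug}/index.html`] = renderPage(page);
  }
  return output;
}

// Phase 3 (not shown): upload everything in `site` to a web server or CDN.
const site = buildSite([
  { slug: "about", title: "About", body: "<p>Hello!</p>" },
  { slug: "blog", title: "Blog", body: "<p>Posts go here.</p>" },
]);
```

The important part is that `renderPage` runs once per page at build time, not once per visitor.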
In addition, there might be more advanced steps like:
- A watcher notices changes in the source files (normally in dev mode) and rebuilds your assets.
- The static site generator rebuilds only the files that changed instead of the whole site (think of how React’s virtual DOM updates only what changed).
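One simple way to rebuild "intelligently" is to remember what each source file looked like at the last build and skip anything that has not changed. The sketch below assumes that approach; `IncrementalBuilder` is a hypothetical name, and real generators use more elaborate dependency tracking than a plain content comparison.

```typescript
// Hypothetical incremental builder: rebuild only sources whose content changed.
class IncrementalBuilder {
  // path -> source text as of the last build
  private lastSeen: Record<string, string> = {};

  // `render` stands in for whatever the real generator does to one file.
  constructor(private readonly render: (source: string) => string) {}

  // Returns only the outputs that had to be rebuilt.
  // Note: deleted files are not handled in this sketch.
  build(sources: Record<string, string>): Record<string, string> {
    const rebuilt: Record<string, string> = {};
    for (const path of Object.keys(sources)) {
      const source = sources[path];
      if (this.lastSeen[path] === source) continue; // unchanged, skip it
      rebuilt[path] = this.render(source);
      this.lastSeen[path] = source;
    }
    return rebuilt;
  }
}

const builder = new IncrementalBuilder((source) => `<p>${source}</p>`);
// First build: everything is new, so both files are rendered.
const firstBuild = builder.build({ "a.md": "hello", "b.md": "world" });
// Second build: only b.md changed, so only b.md is rendered again.
const secondBuild = builder.build({ "a.md": "hello", "b.md": "world!" });
```

A watcher would simply call `build` again whenever it sees a file change.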
So how does this structure make a difference? We need to look further into how a web browser (the client) requests a static site vs. a dynamically generated site.
Note that in this setup a web server like nginx serves static files (HTML, CSS and JavaScript) as they are, while an application server like Node/Express deals with dynamic scripting and the database. The client only communicates with the web server through HTTP requests, and the web server communicates with the application server if needed.
So in SSG mode, the web browser sends an HTTP request to the web server, and the server immediately returns the requested HTML file. Nothing is computed on the server per request; anything dynamic is left to the browser. This is client-side processing.
For a dynamically generated site, the web browser sends an HTTP request to the web server, which forwards the request to an application server. The application server does things like pulling from the database and building the response based on the data and the request, and then sends the response to the web server, which returns it to the web browser. This is called server-side processing.
As you can see, a dynamically generated site pays an additional “build time” (the time to construct the response) on top of the “request time” (the time for the client’s request to be sent and answered) for every single request. With an SSG, all pages are built ahead of time and already stored on the web server (or CDN), so that build cost is paid once rather than per request.
That’s why it’s so fast.
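To make the difference concrete, here is a minimal sketch of the two request paths. Everything in it is a stand-in: `database` is an in-memory object, `render` is a trivial template, and `renderCount` exists only so we can see how often the construction cost is paid.

```typescript
// Hypothetical stand-ins for a database and a render step.
const database: Record<string, { title: string }> = {
  "/about": { title: "About" },
};

let renderCount = 0;
function render(title: string): string {
  renderCount += 1; // count how often we pay the "build time"
  return `<h1>${title}</h1>`;
}

// Dynamic site: the lookup and the render happen on every request.
function handleDynamic(path: string): string | undefined {
  const row = database[path];
  return row ? render(row.title) : undefined;
}

// Static site: render every page once, ahead of time...
const prebuilt: Record<string, string> = {};
for (const path of Object.keys(database)) {
  prebuilt[path] = render(database[path].title);
}

// ...so that serving a request is just a lookup of a ready-made file.
function handleStatic(path: string): string | undefined {
  return prebuilt[path];
}
```

Both handlers return the same HTML, but `handleStatic` never touches `render` at request time, while `handleDynamic` calls it on every request.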
There are additional benefits:
- Security: since an SSG produces pre-built HTML files that are served from a web server or CDN, there are very few vectors for a malicious attack. And since there’s no database and no complex per-request modification of the response, the hosting infrastructure is simpler and easier to protect.
- Scale: since the generated files can sit on a CDN, ready to be served without any additional build process, static sites scale easily and perform better, with in-memory caches and faster response times.
Static site generators vs. CMS
When I first started, I confused SSGs with CMSs (content management systems). Digging deeper, it seems that most CMSs have a web front end and a database to store content and manage permissions, while an SSG can be a command-line application focused on rendering content as HTML.
It’s common to integrate a CMS with an SSG so that each can do what it does best, especially with the help of a headless CMS.
For example, I was using Contentful as the CMS and Gatsby as the SSG. Gatsby pulls the content via APIs and generates a new static site whenever content changes in the CMS.
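For reference, wiring Contentful into Gatsby is mostly configuration. Below is a minimal sketch of what the Gatsby config might look like with the `gatsby-source-contentful` plugin. The space ID and access token are placeholders, and your own setup will likely have more plugins and options.

```typescript
// Sketch of a gatsby-config.ts for pulling content from Contentful.
const config = {
  plugins: [
    {
      resolve: "gatsby-source-contentful",
      options: {
        // Placeholders: in a real project, read these from environment
        // variables instead of hard-coding them.
        spaceId: "YOUR_SPACE_ID",
        accessToken: "YOUR_ACCESS_TOKEN",
      },
    },
  ],
};

// In a real gatsby-config.ts you would finish with: export default config;
```

The rebuild when content changes is not part of this file; it is typically triggered from outside, for example by the CMS notifying your build host.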
So finally, onwards to Gatsby!
As an SSG, Gatsby is fast by nature. It takes the source code and spits out the final static site with HTML, CSS, JavaScript, images and everything else we need. But the team behind it has done more remarkable work to make it even faster.
First, Gatsby uses GraphQL to get the data. Compared to a REST API, GraphQL lets each page ask for exactly the fields it needs in a single query, which has advantages in speed and efficiency.
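The core idea, asking only for the fields you need, can be illustrated without GraphQL at all. The sketch below is not GraphQL and not Gatsby’s data layer; `selectFields` is a made-up helper that just shows what "no over-fetching" means.

```typescript
// Hypothetical helper: keep only the requested fields of a record.
function selectFields<T extends object, K extends keyof T>(
  record: T,
  fields: K[],
): Pick<T, K> {
  const result = {} as Pick<T, K>;
  for (const field of fields) {
    result[field] = record[field];
  }
  return result;
}

const post = {
  title: "Why Gatsby is fast",
  author: "me",
  body: "A very long article body that a listing page does not need...",
};

// A listing page only needs the title, so that is all it asks for.
const summary = selectFields(post, ["title"]);
```

A REST endpoint would usually hand back the whole `post`; a GraphQL query names the fields it wants and gets only those.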
With Webpack at its core, Gatsby applies code splitting intelligently. On top of that, there’s a feature called link prefetching: once the browser has loaded the page, it looks for links with prefetch attributes and loads them in the background. When a user then hovers over or clicks on a link, the file is already there and the page loads instantly.
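In plain HTML, a prefetch hint is a `<link rel="prefetch">` tag. The sketch below shows how such hints could be generated for the internal links on a page. It is an illustration of the idea only: `prefetchHints` is a hypothetical function, and Gatsby’s own prefetching logic is more involved than this.

```typescript
// Build <link rel="prefetch"> hints for the internal links found on a page.
function prefetchHints(hrefs: string[]): string[] {
  const seen: Record<string, boolean> = {};
  const hints: string[] = [];
  for (const href of hrefs) {
    // Treat "/path" as internal; skip absolute and protocol-relative URLs.
    const isInternal = href.charAt(0) === "/" && href.charAt(1) !== "/";
    if (!isInternal || seen[href]) continue;
    seen[href] = true;
    hints.push(`<link rel="prefetch" href="${href}">`);
  }
  return hints;
}

const hints = prefetchHints(["/about", "https://example.com", "/blog", "/about"]);
// hints contains one tag for /about and one for /blog
```

The browser fetches these resources at low priority once the current page is done loading, so they are already cached when the user navigates.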
Also, Gatsby relies on React to load pages dynamically: “Gatsby generates your site’s HTML pages, but also creates a JavaScript runtime that takes over in the browser once the initial HTML has loaded”. So each link to another page becomes a route handled by Reach Router. When you go from page A to page B, Gatsby figures out which parts of page A need to be updated to become page B, updates them, and then jumps you to the top of the page. It looks like you have been redirected to page B, but it’s really just a SPA updating the content on one page.
Images are always a pain in the ass due to their size. Gatsby optimises images with the gatsby-image library, which uses GraphQL and image resizing to apply all sorts of optimisations. One of them is the “blur-up” technique: a low-quality version of the image is displayed initially, and it is then replaced by the high-resolution version requested through the GraphQL query.
On CSS, Gatsby identifies the critical CSS and inlines only that CSS into the page, so the first render doesn’t have to wait for an additional download.
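Here is a rough sketch of what inlining critical CSS means. How Gatsby decides which CSS is critical is not covered here, so the list of critical selectors is simply passed in as an assumption; `CssRule` and `inlineCriticalCss` are made-up names.

```typescript
interface CssRule {
  selector: string;
  css: string; // full rule text, e.g. "h1{color:red}"
}

// Put only the critical rules into a <style> tag inside <head>.
function inlineCriticalCss(
  html: string,
  rules: CssRule[],
  criticalSelectors: string[],
): string {
  const critical = rules
    .filter((rule) => criticalSelectors.indexOf(rule.selector) !== -1)
    .map((rule) => rule.css)
    .join("");
  // A function replacer keeps "$" in the CSS from being read as a pattern.
  return html.replace("</head>", () => `<style>${critical}</style></head>`);
}

const page = inlineCriticalCss(
  "<html><head></head><body><h1>Hi</h1></body></html>",
  [
    { selector: "h1", css: "h1{color:red}" },
    { selector: ".footer", css: ".footer{margin:0}" },
  ],
  ["h1"],
);
```

The `h1` rule ends up inside the HTML itself, while the `.footer` rule is left out and can be loaded later.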
Gatsby will also push critical resources via HTTP/2 where necessary (this can be configured in the configs through webpack).
That’s it for today. Hopefully you now have a better understanding of what a static site generator is and why Gatsby is so fast.
Happy Reading!