Converting HTML to PDF with Puppeteer and Azure Functions

Generating PDFs can be a challenging task, particularly when dealing with complex designs. In this post, I demonstrate how to create an Azure Function and use Puppeteer to convert HTML to PDF.
Taha Azzabi · 4 min read · February 8, 2023

This is my first attempt at writing an article, so please be easy on me 🙂

Most of us have faced the task of generating a PDF at some point, whether it's an invoice, an employment contract, or something else. You might think it's a simple task and tell yourself, "I'll find a package that can handle it." That is, until your project manager comes to you with a sophisticated design.

Oh no, you think to yourself. This is going to be a pain in the ass. In this article, I'll try to spare you the hassle and show you how to create an Azure Function that uses Puppeteer to convert an HTML page to PDF, the same way you would print a web page to PDF from a browser.

What is Puppeteer?

Puppeteer is a popular library for Node.js, created by the Google Chrome team. It gives you a simple way to control and automate a headless Chrome or Chromium browser using the DevTools protocol. With Puppeteer, you can launch a headless Chrome browser from your Node.js code and do things like take screenshots, generate PDFs, navigate pages, and more.
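As a minimal sketch of that flow, assuming Puppeteer is installed (the installation step comes later in this article), the helper below loads an HTML string into a headless page and returns the PDF as a buffer. The name htmlToPdf and the options chosen here are illustrative and not part of Puppeteer itself; the Puppeteer module is passed in as an argument so the helper does not depend on how it was loaded.

```javascript
// Hypothetical helper: renders an HTML string to a PDF buffer.
// `puppeteer` is passed in by the caller, e.g. require("puppeteer").
async function htmlToPdf(puppeteer, html) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so images and fonts are loaded.
    await page.setContent(html, { waitUntil: "networkidle0" });
    // printBackground keeps CSS background colors and images in the PDF.
    return await page.pdf({ format: "A4", printBackground: true });
  } finally {
    // Always release the browser, even if rendering fails.
    await browser.close();
  }
}

// Possible usage (inside an async function):
//   const pdf = await htmlToPdf(require("puppeteer"), "<h1>Invoice</h1>");
```

Closing the browser in a finally block is a design choice worth keeping in any variant of this code: a leaked Chrome process is expensive, especially in a serverless environment.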

It's worth mentioning that there's a similar library called Playwright, which offers comparable functionality.

My Development Environment

In my local environment, I have the following setup, and I use NPM to install dependencies:

  • Operating system: tested on Windows 10 and macOS (Apple M2)
  • Node version: 16.17.1
  • NPM version: 8.19.2

Install the Azure Functions CLI and Azure CLI

The first thing to do is to install the Azure Functions command-line interface (CLI) globally on your system. This will allow you to use it from any directory.

To install the Azure Functions CLI, you can use the following command in command prompt:

npm install -g azure-functions-core-tools

Once the Azure Functions CLI is installed, you can use it to create, develop, and deploy Azure Functions.

In addition to the Azure Functions CLI, we will also need the Azure CLI, which we will use later to fetch settings. You can install the Azure CLI by following the instructions provided on the Microsoft Learn website.

After installing the Azure CLI, run the following command to connect to your Azure account:

az login

Create the Azure function in Azure

There are several ways to create an Azure function, such as Visual Studio or the Azure CLI. For this article, I'll use the Azure portal.

  1. Go to the Azure portal: https://portal.azure.com/
  2. Search for "Azure Functions" and select it from the results
  3. Click on "Create" and give your function a name. Select Node.js as the runtime stack and 18 LTS as the version. Most importantly, select Linux as the operating system, since it supports remote builds, which we'll rely on when we deploy our function.

Creating an Azure Function in Azure with Node.js and Linux

Finally, when you finish creating the function, you will have something like this:

Created Azure Function named 'htmlUrlToPdf' in the Azure portal website

Create and run the Azure Functions locally

  1. Open a terminal or command prompt window.
  2. Create a new folder for your function app using the following command:

mkdir azurefunctionsapp

Navigate to the folder using the following command:

cd azurefunctionsapp

Use the following command to create a new function app:

func init

Choose Node for worker runtime and JavaScript as the language for your function:

Creating and running Azure Functions locally with Node.js and JavaScript

Use the following command to create a new function within the function app:

func new

Choose the HTTP trigger template for your function, and give your function a name.

Creating an HTTP trigger function named 'pdfGenerator' for Azure Functions locally

Use the following command to start the local development server:

func start

If everything is okay, you will see the URL for accessing the function on your local machine.

Azure Functions local development server is up and running

Generate a PDF from an HTML web page using Puppeteer

Finally, we will write some code, but before we do, we must install Puppeteer.

npm install puppeteer

Then, you need to create a configuration file named .puppeteerrc.cjs to specify where to download Chromium for Puppeteer. For more information, refer to this link: https://pptr.dev/troubleshooting#could-not-find-expected-browser-locally .

const { join } = require("path");

/**
 * @type {import("puppeteer").Configuration}
 */
module.exports = {
  cacheDirectory: join(__dirname, ".cache", "puppeteer"),
};

Then go to your function's folder and replace the contents of index.js with the code below. Note that I disabled Chrome's sandbox mode.

const puppeteer = require("puppeteer");

module.exports = async function (context, req) {
  // Check if the URL is passed as a query string
  if (!req.query.url) {
    context.res = {
      status: 400,
      body: "Please pass a URL on the query string",
    };
    return;
  }

  let browser;
  try {
    // Launch the browser instance with specified args
    /**
     * IMPORTANT: Without these settings, the function will crash on production.
     * Otherwise, you need to configure a Sandbox on Azure.
     * See: https://pptr.dev/troubleshooting#setting-up-chrome-linux-sandbox
     */
    browser = await puppeteer.launch({
      args: ["--no-sandbox", "--disable-setuid-sandbox"],
    });

    // Create a new page
    const page = await browser.newPage();

    // Go to the specified URL
    await page.goto(req.query.url);

    // Generate the PDF
    const pdf = await page.pdf({ format: "A4" });

    // Set the response headers and body
    context.res = {
      headers: {
        "content-type": "application/pdf",
        "content-disposition": "attachment; filename=mon-supper-pdf.pdf",
      },
      body: pdf,
    };
  } catch (error) {
    context.log.error(error);
    context.res = {
      status: 500,
      body: "An error occurred while converting the page to a PDF.",
    };
  } finally {
    // Close the browser instance, even when an error occurred
    if (browser) {
      await browser.close();
    }
  }
  // No context.done() call: an async function signals completion by returning.
};
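The function above hands req.query.url straight to the browser. One optional hardening step, sketched here as an assumption rather than something the original setup requires, is to accept only http and https URLs before navigating, since a headless browser can also open other schemes such as file://. The helper name parseHttpUrl is hypothetical.

```javascript
// Hypothetical helper: returns a normalized http(s) URL string, or null
// when the input is missing, malformed, or uses another scheme.
function parseHttpUrl(input) {
  if (typeof input !== "string" || input.trim() === "") {
    return null;
  }
  try {
    const url = new URL(input);
    const allowed = url.protocol === "http:" || url.protocol === "https:";
    return allowed ? url.href : null;
  } catch {
    // new URL() throws a TypeError on invalid input
    return null;
  }
}

// Possible usage inside the function:
//   const target = parseHttpUrl(req.query.url);
//   if (!target) { context.res = { status: 400, body: "Invalid URL" }; return; }
//   await page.goto(target);
```

This check only filters the scheme. It does not stop requests to internal addresses, so a publicly exposed function would likely need further restrictions.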

The last required step is to make sure Puppeteer's install script runs after the project's dependencies are installed. I know this is weird, but you can find more information in this issue: https://github.com/puppeteer/puppeteer/issues/9192 To do so, open package.json and add "postinstall": "node node_modules/puppeteer/

Your package.json should look like this:

{
  "name": "",
  "version": "1.0.0",
  "description": "",
  "scripts": {
    "start": "func start",
    "postinstall": "node node_modules/puppeteer/install.js"
  },
  "devDependencies": {
    "azure-functions-core-tools": "^4.x"
  },
  "dependencies": {
    "puppeteer": "^19.6.3"
  }
}

Deploy the function on production

Before deploying, make sure to synchronize the function app settings from Azure to your local machine; otherwise, the deployment may fail. For more information on the issue, see https://stackoverflow.com/a/71311555/1087623

func azure functionapp fetch-app-settings <functionappname>

Voilà! The final step is to build and deploy the function to production.

func azure functionapp publish htmlUrlToPdf --build remote
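To try the function from Node.js, one possible approach is sketched below. It builds the request URL and streams the response into a file. Several things here are assumptions: the local host http://localhost:7071 is the Core Tools default, the production host is assumed to follow the usual https://<functionappname>.azurewebsites.net pattern, the function is assumed to be named pdfGenerator, and the code query parameter is only needed if the function's authorization level requires a key. The helper names are hypothetical.

```javascript
const http = require("http");
const https = require("https");
const fs = require("fs");

// Builds the URL of the HTTP-triggered function.
// Defaults assume the local Core Tools host and the name "pdfGenerator".
function buildFunctionUrl(pageUrl, { base = "http://localhost:7071", name = "pdfGenerator", key } = {}) {
  const endpoint = new URL(`/api/${name}`, base);
  endpoint.searchParams.set("url", pageUrl); // percent-encodes the target URL
  if (key) {
    endpoint.searchParams.set("code", key); // function key, if one is required
  }
  return endpoint.toString();
}

// Hypothetical helper: requests the PDF and streams it to a local file.
function downloadPdf(functionUrl, outputPath) {
  const client = functionUrl.startsWith("https:") ? https : http;
  return new Promise((resolve, reject) => {
    client
      .get(functionUrl, (res) => {
        if (res.statusCode !== 200) {
          res.resume(); // drain the response so the socket is released
          reject(new Error(`Request failed with status ${res.statusCode}`));
          return;
        }
        const file = fs.createWriteStream(outputPath);
        res.pipe(file);
        file.on("finish", () => resolve(outputPath));
        file.on("error", reject);
      })
      .on("error", reject);
  });
}

// Possible usage (placeholders are not real values):
//   const url = buildFunctionUrl("https://example.com", {
//     base: "https://<functionappname>.azurewebsites.net",
//     key: "<function-key>",
//   });
//   downloadPdf(url, "page.pdf").then(console.log, console.error);
```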

Enjoy!
