How to manipulate .docx files and convert to .pdf

I currently have a Node.js backend that creates contracts from a given template, replacing the "variables" set in it with input values.
Now I'm porting my backend to Rust, but I've hit a roadblock.

How it currently works:
It uses the Google Drive API to manipulate the file, in a few steps:

  1. Get the body of the template docx file;
  2. Find the variables (words wrapped in double curly brackets, e.g. {{name}});
  3. Replace the variables with the inputs;
  4. Create a PDF from the body.
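
Steps 2 and 3 can be sketched in plain Rust once the body is available as text. This is only a minimal sketch using the standard library: `fill_template` is a hypothetical helper name, and it assumes the body is a plain string rather than docx XML. Placeholders without a matching input are left untouched so they are easy to spot.

```rust
use std::collections::HashMap;

/// Replaces every `{{key}}` in `template` with the matching value from `values`.
/// Placeholders with no matching key are copied through unchanged.
fn fill_template(template: &str, values: &HashMap<&str, &str>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        // Copy everything before the opening braces unchanged.
        out.push_str(&rest[..start]);
        let after_open = &rest[start + 2..];
        match after_open.find("}}") {
            Some(end) => {
                // Whitespace inside the braces is ignored: `{{ name }}` == `{{name}}`.
                let key = after_open[..end].trim();
                match values.get(key) {
                    Some(value) => out.push_str(value),
                    // Unknown variable: keep the original `{{...}}` text.
                    None => out.push_str(&rest[start..start + 2 + end + 2]),
                }
                rest = &after_open[end + 2..];
            }
            None => {
                // Unterminated placeholder: keep the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
                break;
            }
        }
    }
    out.push_str(rest);
    out
}
```

Calling it with a map containing `"name" => "Ada"` on the text `Dear {{name}}` would yield `Dear Ada`.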

What would be the best way to achieve something similar with Rust only, without using Google's API?

As for editing a Word document, I found a docx crate that can create one, but I couldn't find out how to manipulate an existing one with it.

As for converting a docx file into a PDF, I'm currently looking into pandoc, but I'm not sure it is the best way to go about it.

Has anyone done a similar project? If so what tools did you use?

Consuming a docx is horrible. (Classic post from 2008.) You never want to do that if you can possibly avoid it. So making Google Drive (or better, Word) do it is the only reasonable answer.

That said, a docx is just a zip file with XML inside it. If your edit is trivial enough, you might be able to naïvely search-and-replace in the XML file and have it work out.
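
As a rough illustration of the naïve approach, here is a sketch that operates on the XML text once it has been read out of the archive (the main body normally lives in `word/document.xml`; unzipping and re-zipping are left out here and would need a zip library). The function names are hypothetical. Two things tend to bite: input values must be XML-escaped, and Word may split a placeholder across several runs, in which case a plain string search finds nothing. The sketch returns the replacement count so that case can be detected.

```rust
/// Escapes the characters that cannot appear verbatim in XML text content.
fn escape_xml(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for ch in value.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(ch),
        }
    }
    out
}

/// Naively replaces `{{key}}` in the XML text with the escaped value.
/// Returns the new XML and how many replacements were made; a count of
/// zero may mean the placeholder was split across runs by the editor.
fn replace_placeholder(xml: &str, key: &str, value: &str) -> (String, usize) {
    // `{{{{` and `}}}}` are escaped braces, so this builds `{{key}}`.
    let placeholder = format!("{{{{{}}}}}", key);
    let count = xml.matches(placeholder.as_str()).count();
    (xml.replace(placeholder.as_str(), &escape_xml(value)), count)
}
```

If the count comes back as zero for a variable you know is in the template, retyping the placeholder in one go in the template often helps, since the split usually comes from editing history or spell-check markup.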

But also, consider whether you can just use native Word functionality to do it, such as mail merge: "Use mail merge for bulk email, letters, labels, and envelopes - Microsoft Support".


You might be able to programmatically control Word or LibreOffice to do the conversion.
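
For LibreOffice, one way to do this is to shell out to its headless converter. The sketch below assumes LibreOffice is installed and that its binary is reachable as `soffice` on the PATH (on some systems it is called `libreoffice` instead); the function names are hypothetical. The command is built separately from running it so it can be inspected or logged first.

```rust
use std::path::Path;
use std::process::Command;

/// Builds `soffice --headless --convert-to pdf --outdir <out_dir> <input>`.
/// Assumption: the LibreOffice binary is named `soffice` and is on the PATH.
fn libreoffice_pdf_command(input: &Path, out_dir: &Path) -> Command {
    let mut cmd = Command::new("soffice");
    cmd.arg("--headless")
        .arg("--convert-to")
        .arg("pdf")
        .arg("--outdir")
        .arg(out_dir)
        .arg(input);
    cmd
}

/// Runs the conversion and reports whether the process exited successfully.
/// An `Err` typically means the binary could not be started at all.
fn convert_to_pdf(input: &Path, out_dir: &Path) -> std::io::Result<bool> {
    let status = libreoffice_pdf_command(input, out_dir).status()?;
    Ok(status.success())
}
```

The PDF should end up in `out_dir` under the input's base name with a `.pdf` extension. Rendering fidelity depends on LibreOffice's own docx import, so it is worth checking the output against a few real templates.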

Thank you for your responses; they were really helpful.

I just found an awesome post/project by @RustyJoeM

It does exactly what I'm trying to accomplish. Now I only need to find a way to convert a docx to a PDF.

I encountered a similar issue at $WORK, where there was an additional constraint of generating documents with nice graphics, created and meticulously positioned by a UX designer. (It's a premium product and so we expect it to look nice in exchange for the clients' money.)

The problem with Word and .docx was that… it simply didn't work. The web-based Word and the native app didn't agree on some internal detail (i.e., at least one of them was buggy), so documents got mangled and the results looked horrible with images out of place, text being wrapped and overflowing at the wrong points, etc.

I knew Word was not really a professional publishing tool. If you want to create documents that look good and are reproducible, you'll basically have to use something that is not a proprietary black box. Furthermore, if you need templating (as I did too), you'll need to use something that is either parseable into a proper syntax tree, or directly editable by a templating engine.

You might have guessed it – I used LaTeX, and I hope you will too.


Being the author of the tiny app referenced, I have to say I did not test it on more complex documents and/or various Word versions, so there are no guarantees :)
I used it as a learning tool to create "some tiny app that does something in Rust", a proof of concept for one simple use case I had. I am happy if it helps someone in some way, of course, but you have to make sure it does what's needed, or update the code yourself as necessary :)
