Best way to describe code blocks for: accessible audio book

Hey, fellow rustaceans!

I was looking for a good audiobook version or audio + video format of the book and found nothing, except an old issue on GitHub from a few years ago that received a decent amount of community attention, but the audiobook was ultimately never created.

Well, I'm still neck deep in my Rust journey...and I used to produce audiobooks for the National Library Service for the Blind and Print Disabled. I thought, who better to produce an accessible audiobook format for such a diverse community?

On that note, I'm unsure of the best way to read / break down the code blocks so they are understandable for the visually disabled. This would be among the most complex audiobooks I've produced, and none of the others were related to programming. So I'd love as many thought-out opinions as possible on how to narrate this effectively...especially from those in the community who have overcome this type of adversity, or have experience in producing this type of content (or even people familiar with accessibility tools, etc.) so I can produce the highest quality learning material that is accessible to everyone.

Should I just read the code blocks character by character? Or would it make more sense to write a narrator's note / appendix breaking down the syntax, then while reading the code blocks in the chapter say something to the effect of "function foo takes variable bar of type i32 as an argument and returns a variable of type i32"?
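For reference, here is a minimal sketch of the kind of code that spoken description would stand for. The names foo and bar come from the sentence above; the body is invented purely for illustration, since the description only covers the signature.

```rust
// Hypothetical function matching the spoken description:
// "function foo takes variable bar of type i32 as an argument
// and returns a variable of type i32".
fn foo(bar: i32) -> i32 {
    // Assumed body, only here so the example compiles and does something.
    bar + 1
}
```

A character-by-character reading would instead have to voice every paren, colon and the arrow in that first line.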

This was a common method we used at the NLS when we needed to give context to graphs, etc., but I really would love some more opinions on the best way to go about this!!

For anyone wondering, this project will be done section by section to make updating the videos at a later date easier, as well as get the accessible version of the book out sooner. I'll likely release them to a youtube channel in a combined video + audio format, however I'd also be more than happy to release an audio only version for free download to wherever makes the most sense.

Thanks :slight_smile:
Ke7in

I don’t have any experience with this, but my instinct would be to find out how screen readers handle code in an editor and match it. That way, their experience with code in the audiobook corresponds to how they’ll be interacting with code when they start actually programming.

Thank you for your input!

My initial thought was the same, which from my personal experience with TTS and accessibility tools is: it just reads out character by character when there are no words. And maybe this is the best way to go? However, with previous books I've produced, when there is complex graphical material to digest, we would very often add additional notes for a higher order of context.

My main issue is that when I did a test run of TTS, it really felt like the meaning was entirely lost. Of course, this is coming from someone who is not visually disabled, and I only have my gut and previous experience to go on...so the more data points I get, the better informed I'll be to make an appropriate decision!

Another potential avenue could be to do both: add a note at the top of the code block describing the function signature, return types, etc., and then read the block out bit by bit. However, I feel like this might not be the best way for everyone to digest the information, as it would add at least an hour (likely quite a bit more) to the audiobook and possibly detract from the result for the non visually impaired.

Ideally I'd like to fit in the middle where both parties can equally enjoy and understand the content. If not, I'll likely lean on the side of accessibility...but for now I'd like to explore all the possibilities at making this project amazing for all :slight_smile:

I was curious as to what the experience of a screen reader for coding could look like. Came across this presentation, an interesting watch

This is not to say that "mirroring the screen-reader experience" is necessarily the ideal approach, but I suppose it can be interesting either way to learn more about what that experience would look like in the first place. (Also, this is but a single data point, and I have no clue how diverse and different screen-reader experiences / setups can become.)


Edit: Some interesting things I observed: Punctuation had concise names, e.g. just "semi" for semicolon. The screen reader offered different ways of reading the code; with roughly 3 levels of detail. As far as I remember: The most detailed was letter by letter, funnily enough distinguishing capital letters by pitch, and it was audible while the code was typed; somewhat "less detailed" was while navigating the code (or after finishing a word while writing it) where spaces were still called out, but words are read as full words, not letter-by-letter; and "least detailed" I could hear when he had it read out a whole line of code at a time, when notably spacing was no longer read out.

My takeaway from this observation: Having different levels of detail / levels of literal-ness can probably be a good idea. Humans can of course move much further away from the literal symbol-by-symbol code than the screen reader, which mostly just skipped some details (spacing, capitalization) and read whole words instead of going letter by letter. And on the other hand: Details like spacing and capitalization can be interesting. Especially when establishing what syntax looks like for the very first time.
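To make the "levels of detail" idea concrete, here is a rough sketch of the three reading modes as I understood them from the video. Everything in it is an assumption on my part: the names Detail, speak and speak_char are made up, the symbol wording is only a guess, and a real screen reader marks capitals by pitch rather than by saying "capital".

```rust
/// How literally a line of code is read aloud (hypothetical names).
#[derive(Clone, Copy)]
enum Detail {
    Letters, // every character on its own, capitals flagged
    Words,   // whole words, spaces still announced
    Line,    // whole words, spacing dropped
}

/// Spoken form of a single character; the wording is an assumption.
fn speak_char(c: char) -> String {
    match c {
        ' ' => "space".to_string(),
        ';' => "semi".to_string(),
        ':' => "colon".to_string(),
        '=' => "equals".to_string(),
        '*' => "star".to_string(),
        '_' => "underscore".to_string(),
        c if c.is_uppercase() => format!("capital {}", c.to_lowercase()),
        c => c.to_string(),
    }
}

/// Renders one line of code as text to be spoken at the given level of detail.
fn speak(line: &str, detail: Detail) -> String {
    let mut out: Vec<String> = Vec::new();
    match detail {
        Detail::Letters => {
            for c in line.chars() {
                out.push(speak_char(c));
            }
        }
        Detail::Words | Detail::Line => {
            let mut word = String::new();
            for c in line.chars() {
                // Letters, digits and underscores accumulate into one word.
                if c.is_alphanumeric() || c == '_' {
                    word.push(c);
                    continue;
                }
                if !word.is_empty() {
                    out.push(std::mem::take(&mut word));
                }
                if c == ' ' {
                    // Only the middle level still announces spacing.
                    if matches!(detail, Detail::Words) {
                        out.push("space".to_string());
                    }
                } else {
                    out.push(speak_char(c));
                }
            }
            if !word.is_empty() {
                out.push(word);
            }
        }
    }
    out.join(" ")
}
```

Calling speak on "let x = 1;" with Detail::Words keeps the "space" call-outs, while Detail::Line drops them and gives "let x equals 1 semi". A human narrator could of course go further than any of these.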

E.g. a section like this introducing the first usage of const

Here’s an example of a constant declaration:

const THREE_HOURS_IN_SECONDS: u32 = 60 * 60 * 3;

The constant’s name is THREE_HOURS_IN_SECONDS and its value is set to the result of multiplying 60 (the number of seconds in a minute) by 60 (the number of minutes in an hour) by 3 (the number of hours we want to count in this program). Rust’s naming convention for constants is to use all uppercase with underscores between words. The compiler is able to evaluate a limited set of operations at compile time, which lets us choose to write out this value in a way that’s easier to understand and verify, rather than setting this constant to the value 10,800.

probably would want the code explained very literally. This might even be the first place where a type annotation like : u32 is presented, so even the spacing here might be of interest to show conventional formatting choices :thinking:. Or maybe it isn't so interesting because it's optional and rustfmt is a thing :thinking: I don't know.

More notably, the constant's name is in all uppercase and contains underscores; this is even called out in the text already, but could also be interesting to note the moment the code is first read, too.

On the other hand, at a later point, when the "spacing" and casing conventions of identifiers are more established, one might not want to call it out all the time. But does that include the underscores? Those are kind of analogous to PascalCase in type names though; it might also be weird if you said the identifier in struct RustIsAwesome as "rust is awesome", but the one in let rust_is_awesome = 42; would be said "rust underscore is underscore awesome" even though both conventions serve the same ultimate purpose.
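As a sketch of treating both conventions the same way, one could split any identifier into plain words and mention the casing separately, only when it matters. The helpers spoken_words and casing below are hypothetical, and the casing labels are just my wording.

```rust
/// Splits an identifier into lowercase words. Underscores and
/// lowercase-to-uppercase transitions both count as word boundaries.
fn spoken_words(ident: &str) -> Vec<String> {
    let mut words: Vec<String> = Vec::new();
    let mut current = String::new();
    let mut prev_lower = false;
    for c in ident.chars() {
        if c == '_' {
            if !current.is_empty() {
                words.push(std::mem::take(&mut current));
            }
            prev_lower = false;
            continue;
        }
        // A capital right after a lowercase letter or digit starts a new word.
        if c.is_uppercase() && prev_lower && !current.is_empty() {
            words.push(std::mem::take(&mut current));
        }
        current.extend(c.to_lowercase());
        prev_lower = c.is_lowercase() || c.is_ascii_digit();
    }
    if !current.is_empty() {
        words.push(current);
    }
    words
}

/// Names the casing convention so it can be announced once, separately.
fn casing(ident: &str) -> &'static str {
    let has_upper = ident.chars().any(|c| c.is_uppercase());
    let has_lower = ident.chars().any(|c| c.is_lowercase());
    let starts_upper = ident.chars().next().map_or(false, |c| c.is_uppercase());
    match (has_upper, has_lower) {
        (true, false) => "all uppercase",
        (false, _) => "snake case",
        (true, true) if starts_upper && !ident.contains('_') => "Pascal case",
        (true, true) => "mixed case",
    }
}
```

With this, RustIsAwesome and rust_is_awesome both come out as the words "rust is awesome", and the narrator can decide whether to add "Pascal case" or "snake case" the first few times.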

Also, just looking at some code examples made me question how best to read method calls...

        let guess: u32 = match guess.trim().parse() {
            Ok(num) => num,
            Err(_) => continue,
        };

like for the .trim().parse() how best to say that? How to say it for methods with arguments? (And peeking further, how best to say a match expression? Heck... how best to say "=>" !? Is that "equals greater than" or a "right arrow"?[1] It's not literally an arrow, right? Also it's distinct from "->".....)


  1. Or do we explain the syntax tree fully semantically, so this could be e.g.

    "let guess of type u thirty-two be a new variable initialized with a match expression on the value of calling the trim method, then the parse method on the variable guess. Both methods take no arguments. The first arm of the match expression matches the enum variant Ok[ay] whose single field is bound to an identifier num and the arm evaluates to num. The second arm matches on the enum variant Err[or] (shortened to e r r) and ignores its single field with a catch-all pattern and the arm evaluates to the control-flow expression continue to skip the rest of the current iteration of the surrounding loop"

    Even then, one might even want both (in I-don't-know-which order); realistically a reader looks at a section of code multiple times, too, anyways; so in addition to a semantic explanation of the syntax, it could also give the plain syntax

    "let guess colon u-thirty-two equals match guess dot trim left paren right paren dot parse left paren right paren left brace newline capital o k left paren num right paren fat right arrow num comma newline capital e r r left paren underscore right paren fat right arrow continue comma newline right brace semicolon" ↩︎

Thank you! This video gave me a ton of context!

I started researching this more as well just after posting, out of curiosity.

The different methods and ideas I described above are basically what were required by the NLS project specifically for audiobooks for the blind, so I believe on some level they must be useful? However, having the opportunity to design my own best accessibility into this...it's really interesting to consider independently in the context of programming.

Another 2 ideas I had were:

  1. Make 2 versions ( more work, but not that much more, especially if planned properly...and 2 perfect versions for everyone )

  2. Do an appendix chapter defining all the syntax, but explain signatures, etc in chapter. However, in the video form link directly to the book's code blocks for them to use their TTS of choice on the text, then in the audio only format release it with the full TTS version. Again, a bit more work and pretty ambitious... but ultimately pretty cool imo.

Any and all feedback is welcome.

Edit: I need to read your edit xD

Afaik => is referred to as "fat arrow", while -> is referred to as "arrow".
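If it helps, a fixed table of spoken names could be written down once and applied consistently. Below is a minimal sketch, assuming the names "fat arrow" and "arrow" from this post and "left paren" style names from the literal reading earlier in the thread; SYMBOLS, symbol_at and read_literally are made-up names, and newlines and capital letters are not announced here.

```rust
/// Spoken names for punctuation. Multi-character operators come first so that
/// "=>" is matched as a unit and not as "equals" followed by something else.
const SYMBOLS: &[(&str, &str)] = &[
    ("=>", "fat arrow"),
    ("->", "arrow"),
    ("(", "left paren"),
    (")", "right paren"),
    ("{", "left brace"),
    ("}", "right brace"),
    (".", "dot"),
    (",", "comma"),
    (":", "colon"),
    (";", "semicolon"),
    ("=", "equals"),
    ("_", "underscore"),
];

/// Returns the first table entry whose symbol starts the remaining text.
fn symbol_at(rest: &str) -> Option<(&'static str, &'static str)> {
    SYMBOLS.iter().copied().find(|&(sym, _)| rest.starts_with(sym))
}

/// Reads code token by token, skipping whitespace.
fn read_literally(code: &str) -> String {
    let mut spoken: Vec<String> = Vec::new();
    let mut rest = code.trim_start();
    while !rest.is_empty() {
        if let Some((sym, name)) = symbol_at(rest) {
            spoken.push(name.to_string());
            rest = &rest[sym.len()..];
        } else {
            // A word is a run of letters, digits and underscores.
            let end = rest
                .find(|c: char| !(c.is_alphanumeric() || c == '_'))
                .unwrap_or(rest.len());
            // Punctuation missing from the table is passed through as-is.
            let end = if end == 0 {
                rest.chars().next().map_or(1, |c| c.len_utf8())
            } else {
                end
            };
            spoken.push(rest[..end].to_string());
            rest = &rest[end..];
        }
        rest = rest.trim_start();
    }
    spoken.join(" ")
}
```

For example, read_literally on "Err(_) => continue," gives "Err left paren underscore right paren fat arrow continue comma", and a function signature with -> gets "arrow" instead.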

I'm going to indulge myself and enable some of the VS Code TTS stuff at a normal speed so I can break down how it refers to these characters. We had a guideline for this in my prev experience, but it was VERY different than what I heard in VS.

Yes! This is exactly the type of annotation we would include in the audio books!

I also love your specific style of depiction.

Big thanks!

On another note, if anyone sees this who knows someone who learned VS using the accessibility tools in VS Code, I'd love to be able to have some additional notes on accessibility improvements as far as editor / console interfacing explanations go!

I'm going to research this on my own, but speaking to someone with experience themselves tends to be invaluable from my experience.

Thanks again to everyone here, feeling quite grateful atm.

So, at least from my previous audiobook experience we would immediately discuss this between our team of 4 or 5 and decide what the correct context for this would be (within the NLS spec). We would look at how often it's used and if it had a special context. If yes for both, we would do an annotation explaining the exact characters and the shorthand. IDK VS Code's way of saying all the things like =>, ! yet, but we would depict them into common terms like "is not" or not operator, etc. And if it was very extensive (like rust is) we would also consider dedicating a separate space for all of the special definitions, if not already included, usually at the end.

The caveat here is that each element would be individually navigable, built with a special perl / java program to give every single chapter, paragraph break, annotation, etc. their own nested level of importance...which is not going to be the case here, so I do need to consider the best alternatives in accessibility to what I'm used to.

I found a reply in the same reddit thread I got that YouTube URL from that seemed somewhat relevant:

hello, As a blind python programmer in advance beginer level, I would be happy to answer any questions on the subject to help people and rase awareness. I use NVDA screen reader and a brailbe display to program. with such technologies, I can easily read write and understand code. my biggest challenge so far has been finding accessible teaching material. most video-based tutorials dosn't have code in text. and understandably instructor doesn't read code fully. It is impossible to understand punctuations line capitilizations etc. usually I halve to watch the video to understand the concept than google the syntax. wich forces me to understan before memorize but it is challenge. That's why i prefer books which might get borring☺️ [...]

So it sounds like teaching materials that also describe syntax precisely can be very valuable. Also bundling up the (audio or video) material with relevant source-code sections you can copy-paste out to somewhere else as plain text seems like a great idea.


Also, reading the Rust Book specifically, it often already has code examples coming with explanations for those code examples, i.e. it doesn't just present a code snippet on its own, but provides all the necessary context and extra information for the learner to better understand it. It seems to me that a blind or visually impaired learner would just need to be addressed in a very similar manner; just there's different additional information that also needs to be explained - in addition to what's already there - that is, the specificities of syntax down to spacing, representation of multi-character operators, capitalization, spellings of specific common keywords/identifiers[1], etc. Perhaps in some cases where an example code is explained subsequently line-by-line already, one could best weave in these additional remarks with the existing explanations, effectively modifying and/or extending the book itself.


  1. like Err for the "error" variant of the Result type; assuming one doesn't want to pronounce this by spelling the letters "ee arr arr" every time; maybe also something like std if you want to pronounce it "standard" or "stuhd" instead of "ess tee dee"; certainly also mut, pronounced "mute", because that spelling/abbreviation isn't immediately obvious ↩︎
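One way to keep such pronunciations consistent is to spell an abbreviation out the first time it occurs and use the chosen pronunciation afterwards. A small sketch of that rule follows; the Narrator type is hypothetical and the three entries are just the examples from the footnote above, not an agreed-upon list.

```rust
use std::collections::HashSet;

/// Remembers which abbreviations have already been spelled out.
struct Narrator {
    introduced: HashSet<String>,
}

impl Narrator {
    fn new() -> Self {
        Narrator { introduced: HashSet::new() }
    }

    /// Returns what to say for a token: the pronunciation plus the spelling
    /// on first use, only the pronunciation afterwards.
    fn say(&mut self, token: &str) -> String {
        // Assumed pronunciation choices; anything else is read as written.
        let spoken = match token {
            "Err" => "error",
            "std" => "standard",
            "mut" => "mute",
            other => return other.to_string(),
        };
        // HashSet::insert returns true only when the value was not present yet.
        if self.introduced.insert(token.to_string()) {
            let letters: Vec<String> = token.chars().map(|c| c.to_string()).collect();
            format!("{}, spelled {}", spoken, letters.join(" "))
        } else {
            spoken.to_string()
        }
    }
}
```

The first say("mut") yields "mute, spelled m u t", and every later one just "mute".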

By the way, if you run into a lack of known (to you) terminology to give names to syntax elements, or need more information of how to properly understand the structure of Rust syntax anyways (e.g. how I called things "expressions" or was talking of "match arms" in my footnote in an earlier reply), a good source to learn more can be the Rust Reference: Introduction - The Rust Reference.

For example, this would be the page about match, and it could help with finding terminology such as "match arm" or "scrutinee". It also provides the information that the whole match itself is an expression, it matches on an expression, it matches it against a "pattern", an additional if condition next to a pattern would be called a "guard", the right-hand side can be an expression followed by a comma, or a block which doesn't need a comma, etc. (Of course, you'd need to judge how much or how little of all of this is necessary, or even useful, to be included in some form or another in the beginner material that is the book "The Rust Programming Language".)
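To tie those terms to actual code, here is a small made-up example (not taken from the book or the Reference) with each piece labelled in a comment:

```rust
// Hypothetical function, only here to label the parts of a match.
fn describe(n: i32) -> &'static str {
    // The whole `match` is an expression; here it is the function's return value.
    // `n` is the scrutinee: the expression being matched on.
    match n {
        // A match arm: the pattern `0`, then `=>`, then an expression and a comma.
        0 => "zero",
        // The `if x < 0` next to the pattern `x` is a match guard.
        x if x < 0 => "negative",
        // The right-hand side may also be a block, which needs no trailing comma.
        _ => {
            "positive"
        }
    }
}
```

A narration could then say "the second arm has a guard" instead of spelling out every symbol.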

OH, that is very interesting! I honestly hadn't even considered that last bit.

I have noticed from the 1/3 of the book I've actually read so far that they do an excellent job at breaking down such complex material...it's honestly one of the reasons I'm brave enough to try this solo. I'm used to a team: a narrator, a recording engineer, a separate editor / QC eval / pronunciation fiend (to catch the mistakes the narrator / engineer missed), and an audio mastering / finishing person...I mean, I've done all the roles at one time or another, but by myself and for a book that's going to potentially be 30 - 40+ hours long?

I'm both excited and petrified lmao

OMG, you really are the goat. This is exactly the reference a perfectionist learning rust needs to know to properly convey the message :slight_smile:

I'm gonna have to rejoin reddit to get a hold of this dude :man_facepalming: oh well, worth it!
