Partial functions

Hi there,

I’ve just started exploring Rust today, so please bear with me if I get anything wrong. :sweat_smile:

I know Rust is all about safety and performance, but I was wondering if it could benefit from introducing a special kind of function (let’s call it a partial function or pfn) where the input domains are restricted through regular expressions at compile time. The compiler could then generate the necessary test conditions for runtime checking.

In mathematics, we have total functions, where all values of the domain are allowed, and partial functions, where only a subset of the domain is allowed. In programming, it seems unwise to allow total functions at the language level because it's dangerous to leave all the restrictions to the coder. We can make mistakes due to our imperfect nature. Delegating this responsibility to the compiler makes sense, as automated code generation would always be consistent. Additionally, allowing a mechanism to describe the outputs of various 'pfn' functions as inputs to other 'pfn' functions could help avoid repetitive checks.

Some might argue that extensive checks could affect performance, but this is just an idea, and I'll leave it to the experts to decide the extent of checks needed.

Also, there’s a big difference between functions in mathematics and programming languages. In mathematics, types and operations are well-defined and there is no runtime, whereas in programming, a function might rely on other services and functions, and the algorithm might not always yield a value due to runtime exceptions. Perhaps Rust already addresses this by making errors and exceptions part of the function type; I’m not sure!

Partial functions could contribute to algorithm proofs and potentially lead to new ideas for secure systems and operating systems.

Regards,
kk

Regular expressions, I'm not so sure. But look into discussions about "pattern types".

3 Likes

Interesting, it is similar but instead of just continuous but discontinuous values too, can also include different types too lets say we describe [0..9]+[a..Z]+[?] which would include the universe of all values from 0 to 9 and lowercase a to uppercase Z and punctuation from to uppercase ?, left to right and top to bottom on the keyboard and an expression to describe token lengths etc..

IMHO, this is something that would be useful if done at the type system level, not just on functions. The "partial" functions will then be just normal functions but defined on "partial" types.

You can already do something today using wrapper types, for example:

pub struct StringThatStartsWithAnA(String);

impl StringThatStartsWithAnA {
    pub fn new(s: String) -> Self {
        if !s.starts_with('A') {
            panic!("String does not start with an 'A'.");
        }
        Self(s)
    }
}

But the check is done at runtime. Having the help of the compiler in defining restrictions of base types would be quite useful in some cases.

Absolutely!! It will have to be implemented at the type system level but I wasn't sure how Rust is designed. I was in Zig forum a couple of days ago and guy suggested string interpolation and the coder kind said 'it is not going to happen' because of some technical complexities :slight_smile:

You're likely looking for dependent types, which allow you to use types to represent propositions about other values and values of types as proofs that the corresponding proposition is true. Then you can use this to have your function require a proof that the value respects the condition of the domain. You can take a look at languages like Agda, Lean, Coq or Idris to see how the implement them. Unfortunately currently they don't work nicely for system languages like Rust. They also bring many complications (for example creating an instance of the type forall (a b : Nat) -> a + b = b + a might be less trivial than you might think!)

3 Likes

Thanks for this @SkiFire13, just had a look at the mentioned languages and not a great fan of their syntax, lol. If the syntax is C derived then it is easy to learn and scan but these have already strained my eyes :frowning:

@SkiFire13 the dependent types concept is not quite the same as what I have mentioned here though.

What I have suggested is just adding restrictions to a type. If S is a set and S1 is a subset of S then S2 is S-S1 and the compiler to add the restriction check for values in S2.

I cannot see any issues in implementing this concept. Proof construction is not part of the compiler but another tool.

primitive type domains can be specified easily using reg ex for compile time and the compiler to generate the corresponding check code.

Yeah unfortunately dependent types heavily depends on purity and are thus better suited for function languages (at least for now). However this is nearer the mathematical concepts that you were striving for, no?

How do you plan the restriction check to be performed? The compiler will need a proof that the restriction is respected when the function is called. If you move it to another tool you basically make that tool necessary for compiling Rust code, so it might as well be in the compiler. Or you move all the restrictions mechanism to the tool, in which case it is no longer in the compiler as you mentioned in the first post.

I'm not sure how you would express numeric constraints using regular expressions, since their domain are strings, not numbers. And even then not everything can be expressed as a regular expression (they can only express regular languages, not context-free or other more complex languages!).

You are simply looking for… types, I guess.

I don't get what you are getting at. A total function is one that is well-behaved on its entire domain (which in practical terms means that it returns a value and doesn't diverge by panicking or crashing). This is a desirable property. A partial function is one that diverges (crashes, panics) or otherwise misbehaves (eg. causes UB) for some inputs, without the compiler catching it.

If you have a function that takes a validated type as its input, then you have a total function. There's nothing wrong with it. I think you are actually confusing "total" and "partial".

The "pre-validated newtype" pattern demonstrated above is all you need in reality. You can also use a validation library if you wish, for easily defining and combining various sorts of conditions via function composition, type-level cleverness, or macro hacks.

2 Likes

technically speaking, any function or operation that panics for a given input is a partial function.

1 Like

In the Ada language one can do ting like:

subtype Small_Int is Integer range -10 .. 10;
type Hash_Index  is mod 97;

Is that the kind of thing you had in mind for specifying subsets of domains and other such constraints?

I guess Rust could adapt type aliases to do that:

type SmallInt = u64 range -10..10;

But can't we do all that by defining our types with structs and overloading operations for it?

I'm somewhat averse to loading up Rust with more syntax and semantics.

We can, and better yet, we can use proc-macros to even verify literals at compile time. This is what types and macros are for. And indeed, I agree that for this reason, this sort of validation logic shouldn't be a piece of built-in language semantics.

If we wanted everything we do to be pre-built into the language, then where does that end? What are the extensibility points (functions, types, traits, etc.) for, then?

1 Like

I am extremely interested in implementing such a macro for a small project of mine, but I don't have the faintest idea from where to start. Can you show an example of a macro that checks some constraints of a "pre-validated newtype" at compile time?

The basic idea is to expect literals as an input and emit a call to the buit-in compile_error! macro if you find a constraint violation.

In the Playground, I can't actually create a separate proc-macro since there's no way to split it into multiple crates, but the token parsing and emitting part is easily demonstrated using quote in this snippet.

1 Like

Got locked out after 18 posts last night :smiley:

@ZiCog and @fogzot understood my point; however, it seems that some in this discussion have made a straightforward concept unnecessarily complicated, possibly due to a lack of background in the concepts of partial and total functions.

The concept here is rooted in set theory, relations, and functions, which are fundamental aspects of high-school mathematics.

A set is a collection of well-defined objects. In programming, a type can be thought of as a set, and operations on the elements of this set map to a domain. Not all operations or functions can take all values from an infinite set but rather from a finite subset of an infinite set. These are called partial functions.

Examples of partial functions include the square root, which cannot take negative values, or division, which cannot have zero as its denominator. Most programming languages provide limited support for these partial functions.

Enumerated types are an example of such types. They can consist of sequential or random values, and the compiler checks for their validity. What I am suggesting is to generalize and expand this support with a new construct.

Someone in this thread asked how you can apply regex to numeric types. As I mentioned in my initial post, this is a compile-time feature. At compile time, the source code is in text form, and AI models operate on text input. In this day and age, such concerns seem outdated.

If security is one of Rust’s goals, why shouldn't this feature be part of the language? Have we considered how many cyberattacks exploit input-related vulnerabilities? What is the root cause of such vulnerabilities? About 60% of code is dedicated to error checking, and who writes this error-checking code? Are programmers perfect? Why should this burden fall on developers manually?

The partial function feature is not meant for literals alone. Limiting it to literals serves no significant purpose and doesn't solve the underlying problems. Is Rust an academic project or a project aimed at addressing real-world issues?

Cyberattacks and defenses cost trillions of dollars globally. Applications and operating systems are often at fault because programming languages lack the tools to prove the logical consistency of an algorithm or to restrict inputs to prescribed finite values. Constrained types alone are insufficient. Functions use system resources and can cause issues too. When we cannot deductively prove the correctness of an algorithm, it can only be 'proven' inductively through testing, which is not the best method for functions operating on infinite values.

Seems to me that this an infinitely large problem. Or at least big enough that tying build it all into the language syntax/semantic is asking too much.

For example: Subtraction is a function of two parameters sub(a, b). One could define sub to accept only natural numbers. But then what happens if the result goes negative?

Example some function f(x) might be defined for all reals except PI.

Surely there is an infinity of such constraints on partial functions. As there is an infinite number of ways to define what is in a set and what is not.

Does't this all get a bit out of hand? Meanwhile we can do pretty well checking such things with the language features we have.

The way to represent the square root as a partial function in Rust is the following:

fn square_root(x: f64) -> f64 {
    if x >= 0.0 {
        x.sqrt()
    } else {
        panic!()
    }
}

What you may want instead is to represent sqrt as a total function from non-negative inputs to non-negative outputs. The way to do that in Rust is:

fn square_root(x: NonnegativeF64) -> NonnegativeF64
4 Likes

You are only describing what you would like the result to be, but have provided no concrete idea for how this would even remotely work. No, regular expressions can't do this, they are way too limited.

Having informations in text form doesn't mean you can analyze them. Sure, you can say that the string "-100" represents a negative number using a regex, but this will only work for literals, which you said you don't want to be limited to. So how do you recognize that 100 - 200 is negative? You can't. An arithmetic expression is not even parsable by a regular expression because it requires a context free grammar, and this doesn't even touch the topic of automatically analyzing it for inferring whether it satisfies some requirements. That's commonly the job of SMT solvers which are way out of scope for a compiler.

AI are definitely, totally, completly not what you want here:

  • they allucinate things, meaning they can say that your program is fine when it isn't true. Later you talk about security, but you conveniently ignore the fact that this is terrible for security. You shouldn't trust the result of an AI, not for normal stuff and definitely not for something security critical;

  • they don't and can't follow a specification, meaning you will never be able to reason about what an AI will or won't accept;

  • for the same reason, you can't even expect backward compatibility, because any update may have it reject some valid programs that were previously accepted.

It would be really fantastic if you could be able to prove things about the input and output of functions! If you find a concrete way to do that's also correct and practical it would be the holy grail of programming. Unfortunately nobody yet knows of a way to do so: you can have practical languages, like C, Rust, Java etc etc, and languages where you can formally prove the correctness of your algorithms, like Agda, Lean etc etc, but not both. This is why this is not yet part of Rust, because it doesn't exist yet.

4 Likes