Rust-like scripting language

I have no idea of the details, but my understanding from various presentations is that modern JavaScript engines like V8 tackle this problem by not optimizing code for a given type at start-up. If they notice at run time that a particular function is repeatedly called with the same argument types, they JIT-compile the function for those types so that it runs faster the next time it is called. Similarly, they optimize access to structures as long as their shape does not change.

Then, if the program ever calls the function with different argument types or modifies a structure's shape, those JIT optimizations are thrown away and it reverts to regular interpreting.
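To make the specialize-then-bail-out idea concrete, here is a toy sketch in Rust. It is not how V8 works internally; `Value`, `Path`, `CallSite`, and `HOT_THRESHOLD` are all names made up for illustration, and the "compiled" fast path is just a flag standing in for real machine code.

```rust
/// A dynamically typed value, as an interpreter might represent it.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Str(String),
}

/// Which path a call took (returned only so the behaviour is observable).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Path {
    Generic,
    Specialized,
}

/// Hypothetical threshold: same-typed calls seen before we "compile".
const HOT_THRESHOLD: u32 = 3;

/// Per-call-site bookkeeping for a toy `add` operation.
#[derive(Default)]
pub struct CallSite {
    int_calls: u32,
    specialized_for_int: bool,
}

impl CallSite {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn add(&mut self, a: &Value, b: &Value) -> (Value, Path) {
        match (a, b) {
            (Value::Int(x), Value::Int(y)) => {
                if self.specialized_for_int {
                    // Fast path: stands in for JIT-compiled integer code.
                    return (Value::Int(x + y), Path::Specialized);
                }
                // Still interpreting, but count how often we see Int + Int.
                self.int_calls += 1;
                if self.int_calls >= HOT_THRESHOLD {
                    self.specialized_for_int = true;
                }
                (Value::Int(x + y), Path::Generic)
            }
            _ => {
                // The type assumption broke: throw the specialization away.
                self.specialized_for_int = false;
                self.int_calls = 0;
                (generic_add(a, b), Path::Generic)
            }
        }
    }
}

/// Slow path that handles every combination of types.
fn generic_add(a: &Value, b: &Value) -> Value {
    match (a, b) {
        (Value::Int(x), Value::Int(y)) => Value::Int(x + y),
        // JavaScript-style: anything else is concatenated as text.
        _ => Value::Str(format!("{}{}", show(a), show(b))),
    }
}

fn show(v: &Value) -> String {
    match v {
        Value::Int(i) => i.to_string(),
        Value::Str(s) => s.clone(),
    }
}
```

Calling `add` with two `Int`s a few times flips the site onto the specialized path; one call with a `Str` drops it back to the generic path and the counting starts over.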

My experiments with V8 bear out this suggestion. I have seen JS programs start out slow on the first few iterations and then get into their stride.

It all sounds very hairy scary to me!

Being able to modify objects anyhow at run time certainly allows for quick and dirty hacking. But it does mean that in a large program it's hard to know what you are dealing with at any point in the source code.

JIT compilation provides various optimizations while keeping start-up quick. The main issue with a JIT (if there even is one) is implementing it efficiently, viz. so that it compiles 'hot' code without interrupting execution. Although a JIT compiler can be easy to implement, writing a good one is difficult: naive implementations slow short programs down rather than speed them up.

IMO, elegant solutions arising from 'quick and dirty hacking' are the raison d'être for most scripting languages. When working on larger software projects written with a scripting language, I totally agree that static approaches should be taken.

In the context of this thread, I think that this is why @L0uisc suggested that Python complemented Rust. Although they differ in syntax and semantics, the underlying 'theme' of both languages is strikingly similar. They both straddle the line between imperative, functional, and object-oriented. They both allow for rapid development within an expressive paradigm. However, because the two aren't used in the same way for the same things, it's usually clear which is the better language for a particular problem. I think this similar-yet-complementary dynamic is why Rust is such a great systems language for Python developers, and why Python is such a great scripting language for Rust programmers.


Yep.

That is why my current preferred languages are Rust and JavaScript.

Mind you, I don't see how "elegant solutions" and "quick and dirty hacking" fit together.

From the point of view of that underlying theme, I feel that JavaScript is far more Rust-like than Python.

But hey, what does it matter? Getting some people who only know Python or JavaScript or whatever to explore something else is a good thing. Rust with its cargo and crates and other comforts might be just the thing.


For example, let's say that you want to trace what methods are called on an object. In Python, you'd just write something like:

class Logger:
    def __init__(self, inner):
        self.__inner = inner

    def __getattr__(self, attr):
        print(f"Calling {attr}")
        return getattr(self.__inner, attr)

    def __repr__(self):
        return f"Logger({self.__inner})"

x = Logger([4, 1, 2])
x.append(3)
x.sort()
print(x)

# Output:
# Calling append
# Calling sort
# Logger([1, 2, 3, 4])

Although this is quick and dirty, it's pretty elegant - it just logs and calls what's inside it. By default, it's generic across types, and doesn't require much work on the part of the programmer to implement. It's pretty easy to stumble upon short succinct solutions to complex problems in Python.


On another note, I wonder what a small scripting language analog to Rust would look like. Starting with Notes on a Smaller Rust, I think a language with similar constructs to Rust but with a looser type system and no lifetimes (garbage collected, or compile time gc'd like Micro-Mitten) would be a joy to use. It'd have structs, traits, enums, and everything else there's to love about Rust, but with more dynamic typing, more general built-in types (I'm looking at you &str) and so on.


You might want to check out some of the projects listed here.

It's feasible in Moss too:

class Logger = {
   function get(attr)
      print("Calling {}" % [attr])
      inner = record(self)["inner"]
      m = inner.(attr)
      if m: Function
         return |*argv| m(inner; *argv)
      else
         return m
      end
   end,
   function string()
      "Logger({})" % [record(self)["inner"]]
   end
}

function logger(inner)
   table Logger{inner = inner}
end

x = logger([4, 1, 2])
x.push(3)
x.sort()
print(x)

It's feasible in many scripting languages. I was just pointing out that Python (and scripting languages in general) has dynamic metaprogramming and introspection features that allow for a number of cool techniques. This is a Rust forum, so I wouldn't want to stray too far away from the topic at hand, but most scripting languages have tools available to get things done quickly. For example, to parse a 2D comma-separated array from a string, one might do:

def array(string):
    return [[int(x) for x in line.split(",")] for line in string.split("\n")]

In a scripting language, whereas in Business Grade Java™ you'd have to do something like:

import java.util.ArrayList;

ArrayList<ArrayList<Integer>> array2DFromString(String string) {
    // Generic collections need the boxed type Integer, not int.
    ArrayList<ArrayList<Integer>> array2D = new ArrayList<>();
    String[] rows = string.split("\n");

    // Use < rather than <= so the index stays inside the array.
    for (int i = 0; i < rows.length; i++) {
        ArrayList<Integer> row = new ArrayList<>();
        String[] items = rows[i].split(",");

        for (int j = 0; j < items.length; j++) {
            int number = Integer.parseInt(items[j]);
            row.add(number);
        }
        array2D.add(row);
    }
    return array2D;
}

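Note: Rust stacks up surprisingly well:

pub fn array(string: &str) -> Option<Vec<Vec<usize>>> {
    string
        .split('\n')
        // `?` can't be used inside these closures, so turn each parse result
        // into an Option and let `collect` stop at the first None.
        .map(|row| row.split(',').map(|item| item.parse::<usize>().ok()).collect())
        .collect()
}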
2 Likes
# Note: split is a monkey patching module, not a function.
use string.split

# A weak spot, currently there is no "auto trimming".
# Let's fix that for the moment.
_int_ = int
int = |s| _int_(s.trim() if s: String else s)

# Here we go:
array = |s| s.split("\n").map(|line| line.split(",").map(int))

# Alternatively:
array = |s| list(list(int(x)
   for x in line.split(","))
      for line in s.split("\n"))

You know, what the heck. Let's design one. Rust has about three key concepts:

  • Algebraic data types
  • Lifetimes
  • multiple immutable XOR single mutable

I feel that for a scripting language, lifetimes aren't the most appropriate as scripting languages don't need to provide strong memory guarantees, so we'll scratch that. Let's start with algebraic data types.

In terms of 'atomic' data types (using the lispy definition of 'atomic'), we've got the following:

  • Bytes (xFF, x00)
  • Integers (10, -420)
  • Floating point numbers (-20.20)
  • Strings ("Hello, world!")
  • Booleans (True, False)

Vecs and HashMaps are so ubiquitous at this point they deserve to be their own thing. In terms of combinatory data types, we've got:

Structs, a set of identifiers mapping to data types:

struct <Name> {
    <field>: <Type>,
    ...,
}

Enums, a set of variants with an optional data type:

enum <Name> {
    <Variant> <Type>,
    ...,
}

Tuples, a fixed-size list of different data types:

(<item of Type 1>, ...)

Vecs, a dynamically sized collection of the same type:

[<item of Type>, ...]

Maps, a dynamically sized collection mapping one type to another:

{
    <item of Type 1>: <item of Type 2>,
    ...,
}

I don't think that a Rust scripting language would forgo static type checking; rather, it would use a flexible Hindley-Milner type inference system. This would make it possible to omit type annotations in function definitions. Additionally, with the work recently done on compile-time garbage collection, it's totally feasible that a new Rust scripting language could do that too.
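For anyone wondering what Hindley-Milner inference boils down to, its core is unification: give every unannotated thing a type variable, collect equations between types, and solve them. Below is a minimal sketch of just that core in Rust. It leaves out let-polymorphism and everything else a real checker needs, and `Type`, `Subst`, `resolve`, and `unify` are names invented here for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    Int,
    Bool,
    Var(u32),                 // an unknown type, to be solved for
    Fn(Box<Type>, Box<Type>), // argument -> result
}

/// Solved type variables so far.
pub type Subst = HashMap<u32, Type>;

/// Replace every solved variable in `t` by its solution.
pub fn resolve(t: &Type, s: &Subst) -> Type {
    match t {
        Type::Var(v) => match s.get(v) {
            Some(bound) => resolve(bound, s),
            None => t.clone(),
        },
        Type::Fn(a, r) => Type::Fn(Box::new(resolve(a, s)), Box::new(resolve(r, s))),
        _ => t.clone(),
    }
}

/// Does variable `v` appear inside `t`? Used to reject infinite types.
fn occurs(v: u32, t: &Type) -> bool {
    match t {
        Type::Var(w) => *w == v,
        Type::Fn(a, r) => occurs(v, a) || occurs(v, r),
        _ => false,
    }
}

/// Make `a` and `b` equal by solving variables, or report why that is impossible.
pub fn unify(a: &Type, b: &Type, s: &mut Subst) -> Result<(), String> {
    let (a, b) = (resolve(a, s), resolve(b, s));
    match (&a, &b) {
        _ if a == b => Ok(()),
        (Type::Var(v), t) | (t, Type::Var(v)) => {
            if occurs(*v, t) {
                return Err(format!("infinite type: variable {} occurs in {:?}", v, t));
            }
            s.insert(*v, t.clone());
            Ok(())
        }
        (Type::Fn(a1, r1), Type::Fn(a2, r2)) => {
            unify(a1, a2, s)?;
            unify(r1, r2, s)
        }
        _ => Err(format!("type mismatch: {:?} vs {:?}", a, b)),
    }
}
```

As a usage example, an unannotated `fn inc(x) { x + 1 }` starts out as `Fn(Var(0), Var(1))`. If we treat `+ 1` as a function of type `Fn(Int, Int)`, unifying the two solves both variables to `Int`, so no annotation was needed.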

One final key feature of Rust is impl, which allows traits to be implemented for specific types. I think types should be able to have impl statements, though traits need not be defined; rather, something similar to Go's interfaces or Python's duck typing should be used. Finally, semicolons need not apply.

This is all a fairly pointless what-if exercise, so here's a sieve:

fn sieve(limit) {
    primes = [2]
    for n in 3..limit {
        for prime in primes {
            if n % prime == 0   { break                 }
            if prime >= sqrt(n) { primes.push(n); break }
        }
    }
    return primes
}
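For comparison, here is roughly the same trial-division idea in present-day Rust, as a sketch rather than a tuned implementation. It tests each candidate only against the primes whose square does not exceed it; unlike the version above, it returns an empty list when `limit` is 2 or less.

```rust
/// Primes below `limit`, found by trial division against the primes found so far.
pub fn sieve(limit: u64) -> Vec<u64> {
    let mut primes = Vec::new();
    if limit > 2 {
        primes.push(2);
    }
    for n in 3..limit {
        // Only primes up to sqrt(n) can be the smallest factor of n.
        let is_prime = primes
            .iter()
            .take_while(|&&p| p * p <= n)
            .all(|&p| n % p != 0);
        if is_prime {
            primes.push(n);
        }
    }
    primes
}
```

The type annotations on the signature and the `mut` are about all the scripting version gets to drop.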

Wow, little did I know (though I should have guessed) that my suggestion for determining data types at run time in a Rust-like scripting language (Rust-like scripting language - #5 by ZiCog) was invented in the late 1950's and has a name.

Mind you, had I ever read anything about Hindley-Milner that started like Wikipedia's "A Hindley–Milner (HM) type system is a classical type system for the lambda calculus with parametric polymorphism", I would never have guessed it had anything to do with determining data types in a scripting language.

What you have nicely outlined there is just what I had in mind. If only I had the chops to build such a thing.


Notes on a Smaller Rust points out that lifetimes are about more than just memory safety, and that a scripting Rust would have to include them. Otherwise, we simply have OCaml. It has ADTs, it's garbage collected, its module system is amazing and... the first Rust compiler was even implemented in it. I think that's what I would call Rust's soulmate. Not really a scripting language, but close. :smile:


I agree, it would have to have lifetimes.

The goal in my mind would be to create a scripting language that is as close to Rust, syntactically and semantically, as possible, but that could be hacked around with and run as quickly as Python or JavaScript.

Then, when one feels the need for speed, or for whatever other reason, one could easily recast one's program into actual Rust.

As far as I can tell, the Hindley-Milner idea spares us all that messy type specifying everywhere, as we would expect for a scripting language.

Hopefully it can also be used to do lifetime tracking at run time without having to put all those ugly tick marks into the source.

But what do I know, this may not even be possible.


I have been thinking about such a simpler Rust-like language for a while (I'll call it RustScript in this post).
In my opinion it would really stand out from the myriad of other scripting languages, if it were binary-compatible with Rust crates. Then you could first write your application in RustScript, but still use Rust libraries and data structures for code that needs to be very performant or you could port parts of your application to pure Rust as the design stabilizes. Of course there would need to be a suitable API.

I think this could also help Rust become more popular. Companies that are currently not willing to wait months for their programmers to become productive in Rust could let their new hires start writing applications in RustScript and their more experienced coders do the high performance parts in Rust.
Basically a similar model to Python and the Python C API.

Of course since Rust doesn't have a stable ABI, this would mean working on top of rustc, i.e. transpiling RustScript to Rust and letting rustc compile the final binary. Transpilation to Rust might not be easy, but should be doable if the language is designed for this purpose and it doesn't have to be zero cost.

The problem would be compile times. For a lighter language meant to be used (nearly) interactively Rust compile times are too long. Maybe we could get there by compiling dependencies only once as dylibs and just recompiling the current RustScript crate?


OCaml is a great programming language; I see it as a higher-level analogue to Rust. Because Rust was inspired by ML, a Rust-derived scripting language would certainly be similar to ML.

What I think makes Rust unique (compared to other ML languages) are impl traits (though OCaml does support classes), borrow-checking, and zero-cost abstractions.

Lifetimes aren't the same as borrowing, though they are very similar. AFAIK, borrowing helps Rust infer the lifetimes of data in the program, and provides certain guarantees to how that data can be accessed and modified.

The Rust borrow checker, from what I understand, uses scoping and borrowing rules to determine where in the program variables are live, meaning still accessible. The region where a variable is live is called the variable's lifetime. Rust currently uses the NLL borrow checker, which computes the lifetime of each reference, and the lifetimes of loans to that reference. A borrow checker error arises when a statement accesses data in a way that violates some live loan.

IIRC, the Polonius borrow checker intends to make this lifetime inference more flexible. Instead of directly computing the lifetime of everything, it starts by finding the origin of each reference. Polonius does away with directly computing liveness. Instead, it states that a loan is live if some live variable has that loan.

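// modified from nikomatsakis' presentation on Polonius
use std::collections::HashMap;

fn get_or_insert(map: &mut HashMap<u32, String>) -> &String {
    match map.get(&22) {
        Some(v) => v,
        // NLL treats the `get` borrow as still alive here, because `v` is
        // returned from the function in the other arm.
        None => { map.insert(22, "boop".to_string()); &map[&22] },
    }
}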

This would throw a borrow checker error with NLL, but not in Polonius, because v is not live in the None branch of the match.
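For what it's worth, the usual way to get the same effect under the current (NLL) checker is the standard library's entry API, which does the lookup and the insert through a single mutable borrow. A small sketch, where the function name `get_or_insert_with_entry` is just something I picked:

```rust
use std::collections::HashMap;

/// Return the value stored under key 22, inserting "boop" first if it is missing.
pub fn get_or_insert_with_entry(map: &mut HashMap<u32, String>) -> &String {
    // `entry` borrows the map mutably once; `or_insert_with` only runs the
    // closure when the key is absent, so there is no second borrow to conflict.
    map.entry(22).or_insert_with(|| "boop".to_string())
}
```

It compiles today, at the cost of needing a mutable borrow even when the key is already present.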

But note that lifetimes aren't explicitly needed for borrowing to work. A system with borrowing (i.e. single mutable xor aliasable immutable) could manage lifetimes of the objects with a garbage collector, or statically determine the lifetimes using ASAP (as static as possible) memory management techniques.
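Rust itself already contains a small existence proof of this: `Rc<RefCell<T>>`. `Rc` frees the value when the last handle goes away (reference counting rather than a tracing collector), and `RefCell` enforces "many readers xor one writer" at run time, so no lifetime annotations appear anywhere. A sketch, with a function name of my own choosing:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Returns (number of owners, whether a conflicting mutable borrow was refused,
/// final contents of the list).
pub fn borrow_rules_at_runtime() -> (usize, bool, Vec<i32>) {
    // Two handles to one list; it is freed when the last handle is dropped.
    let list = Rc::new(RefCell::new(vec![4, 1, 2]));
    let alias = Rc::clone(&list);

    let refused = {
        // Any number of shared borrows may coexist...
        let first = list.borrow();
        let second = alias.borrow();
        assert_eq!(first.len(), second.len());
        // ...but a mutable borrow is refused while they are alive.
        let refused = alias.try_borrow_mut().is_err();
        refused
    };

    // With the readers gone, one writer at a time is allowed.
    alias.borrow_mut().push(3);
    list.borrow_mut().sort();

    let contents = list.borrow().to_vec();
    (Rc::strong_count(&list), refused, contents)
}
```

The trade-off is that a violation shows up as a run-time error (or a panic, with plain `borrow_mut`) instead of a compile error, which is arguably what a scripting language would do anyway.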

I'm developing an experimental programming language that forgoes garbage collection and other traditional memory management techniques. It hasn't been released yet as it's still under heavy development. In short, it tries to infer borrowing and lifetimes dynamically, using a memory-management technique I call 'vaporization'. The rules of vaporization-based memory management are simple:

  • Values are immutable, variables are mutable references to values.
  • When a variable is reassigned or goes out of scope, the value it holds is released.
  • When a variable is used, a copy of the data it contains is used.

It also makes the following optimizations:

  • When a variable is passed to a function, a reference to its value is passed.
  • The last use of a variable before it is released does not make a copy.

What does this look like in practice? Values are immutable, variables are references to values:

x = 7
y = x
x = x + 2
-- (comment) y is still 7

When a variable is reassigned or goes out of scope, the value it holds is released.

x = 7
x = 9 
-- 7 is released

When a variable is used, a copy of the data it contains is used.

x = 7
y = x
-- y is not the same 7 as x
-- think of it as `let y = x.clone()`

Note, however, that although this system is memory safe, it's also memory intensive.

x = 17
y = x + x -- three copies of x exist

To combat this, a few optimizations are made; I'll cover the most impactful ones. First, when a variable is passed to a function, a reference to its value is passed.

-- function syntax is `<pattern> -> <expression>`
increment = x -> x + 7

x = 7
y = increment x

Here, a reference to x is passed into increment. However, this language is fork on mut (i.e. copy on write), so x wouldn't be copied until x + 7 inside increment. This prevents passing many copies of the same data around functions, say in a recursive function.

Finally, the last use of a variable before it is released does not make a copy.

x = 7
x = x + x

Let's annotate it. V<N> indicates that all occurrences of V<N> refer to the same value. Additionally, V<Nf> indicates the last use of V<N>.

x<1> = 7
x<2> = x<1> + x<1f>

Following the above rule, x<1f> is not a copy of the value of x, rather it is the value of x itself. When writing code that might mutate something in a linear manner, this significantly reduces the memory usage:

x = [1, 2, 3]
x = x + [4] -- no copies are made
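A loose analogy in existing Rust, for readers who want something to poke at: `Rc::make_mut` from the standard library copies the underlying data only when another handle still points at it, and mutates in place when the caller is the sole owner. That is close in spirit to "the last use does not make a copy", though it is only an analogy and not how the language above is implemented. `push_cow` is a name made up for this sketch.

```rust
use std::rc::Rc;

/// Append `item`, copying the underlying list only if someone else still holds it.
/// Returns true when a copy had to be made.
pub fn push_cow(list: &mut Rc<Vec<i32>>, item: i32) -> bool {
    let shared = Rc::strong_count(list) > 1;
    // `Rc::make_mut` clones the Vec if it is shared, otherwise it hands back
    // a mutable reference to the existing allocation.
    Rc::make_mut(list).push(item);
    shared
}
```

With a single handle, `push_cow` appends in place, like `x = x + [4]` above. After `let y = Rc::clone(&x)`, the next `push_cow(&mut x, ...)` copies first, so `y` keeps seeing the old contents.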

This language also supports a flexible hygienic macro system, which 'hides' the assignment in most cases (like mutating an object). In combination with passing references to functions, copies of the data are only made when the data needs to be used in two places at once. I haven't discovered any memory leaks or excessively high memory usage through testing yet. If you have any feedback, or notice that something's off, please leave a reply :slight_smile:

It really would. One of the hard things about writing a programming language is building an ecosystem. Being able to jumpstart off another language's ecosystem gives a large benefit.

Have there been any discussions about standardizing Rust's ABI? I couldn't find any, but I swear I read something once... who knows. Anyway, perhaps RustScript could target MIR or something. This would still have high compile times, but it would bypass Rust's compiler frontend. Compiling to Rust would also allow compiling to wasm and the like, and you get the free performance gains from the work on the Rust compiler.

If the language were built on top of Rust, an FFI that interops with Rust might be able to do the job.

I agree, compiling Rust dependencies statically would be a good idea if such a language were ever implemented.


Absolutely. I've heard that OCaml is actually a very good language, but the ecosystem struggles a bit and it is a nightmare on Windows, which really hurts adoption.

I've actually brought that up on the internals forum before:

Unfortunately it looks like it essentially isn't going to happen any time soon and maybe never.

I think that's part of the reason why transpile-to-JS languages (think TypeScript) are so popular these days. OCaml is fun to use and has great packages, but like any old language it suffers from old design paradigms and package specifics that are seldom in use today. For a language to remain viable, it has to undergo a major update at least once every five years, and OCaml's package system hasn't been overhauled in forever.

I forgot who said it (I think it might've been about using Rust in the Fuchsia microkernel) but I remember what I read about Rust's ABI. Their core argument went something like this:

For a language to do well at a systems level (i.e. OS), it needs to be able to interface with other applications. Because applications can be written in different languages, it's necessary for a protocol, such as an ABI, to allow interface between them. C's ABI is very stable which is why C is in use in most operating systems today. Rust's ABI is unstable and there are no plans to stabilize it, which makes it a poor candidate for OS level work.

I don't completely agree with this, but I do think that a static ABI (maybe bound to each Edition) would immensely benefit the Rust ecosystem.
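For context, this is roughly how Rust code interfaces with the outside world today: it opts in to the C ABI item by item. A minimal sketch, where `Point` and `point_add` are made-up names:

```rust
/// `#[repr(C)]` lays the fields out the way a C compiler would, instead of
/// leaving the layout up to rustc.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

/// `extern "C"` makes this function use the platform's C calling convention,
/// so code in another language can call it through a function pointer.
pub extern "C" fn point_add(a: Point, b: Point) -> Point {
    Point {
        x: a.x.wrapping_add(b.x),
        y: a.y.wrapping_add(b.y),
    }
}
```

To export the function by name from a library you would additionally mark it with the `no_mangle` attribute. Anything not expressible in C terms (generics, trait objects, `String`, and so on) has to be wrapped by hand, which is the gap a stable Rust ABI would close.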


10 posts were split to a new topic: Modular ABI for Rust

@mbrubeck @BurntSushi can posts 56 and 57 be moved into a new topic? I don't want the ABI draft proposal especially to get lost 50 pages down an unrelated topic.
