I also completed the logic of the text-sanitizer.
On Playground: text-sanitizer
I also worked on the implementation in the other language to get a comparable result.
The input/output issue is resolved, and it now behaves like the Rust application:
read(0, "--2020-05-16 12:06:23-- http://w"..., 8192) = 8192
read(0, "<a href=\"ausfluge/#ausfluge/atla"..., 16384) = 16384
read(0, "arote Str\344nde\">STRÄNDE</a><"..., 32768) = 32768
read(0, "</span></a></li> <li id=\"english"..., 65536) = 65536
mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd7ccd2000
read(0, " Sie das allw\366chentliche Markttr"..., 131072) = 131072
mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd7cc92000
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd7cb92000
read(0, "it etwas Gl\374ck finden Sie dort O"..., 262144) = 262144
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd7ca92000
read(0, "rados\"> <div class=\"imgcuadro\"> "..., 524288) = 524288
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd7c992000
mmap(NULL, 1114112, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd7c882000
munmap(0x7fcd7ca92000, 1048576) = 0
read(0, "-wochenmarkt\">MEHR INFO</a></div"..., 1048576) = 1048576
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd7c682000
munmap(0x7fcd7c992000, 1048576) = 0
mmap(NULL, 2162688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd7c472000
munmap(0x7fcd7c882000, 1114112) = 0
read(0, "en Fahrt mit der F\344hre geht es z"..., 2097152) = 607540
mmap(NULL, 2752512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd7c8f2000
munmap(0x7fcd7c682000, 2097152) = 0
read(0, "", 2097152) = 0
mmap(NULL, 2752512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd7c1d2000
write(1, "--2020-05-16 12:06:23-- http://w"..., 2736500) = 2736500
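The trace above has a simple shape: standard input is read until end of file, and the whole result is written with one write call. A minimal Rust sketch of that pattern could look like the following. This is only an illustration, not the actual text-sanitizer code: the `sanitize` function is a hypothetical placeholder that replaces non-ASCII bytes with `?`, and `run` is a helper name made up for this example.

```rust
use std::io::{self, Read, Write};

// Hypothetical stand-in for the real sanitizing logic (an assumption for
// this sketch): every non-ASCII byte is replaced with '?'.
fn sanitize(input: &[u8]) -> Vec<u8> {
    input
        .iter()
        .map(|&b| if b.is_ascii() { b } else { b'?' })
        .collect()
}

// Reads `reader` to the end, sanitizes the bytes, and hands the whole
// result to the writer in one `write_all` call.
// Returns the number of bytes written.
fn run<R: Read, W: Write>(mut reader: R, mut writer: W) -> io::Result<usize> {
    let mut input = Vec::new();
    reader.read_to_end(&mut input)?;
    let output = sanitize(&input);
    writer.write_all(&output)?;
    writer.flush()?;
    Ok(output.len())
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let stdout = io::stdout();
    // Lock both handles once instead of on every call.
    run(stdin.lock(), stdout.lock())?;
    Ok(())
}
```

Keeping `run` generic over `Read` and `Write` makes it possible to feed it byte slices in tests instead of real pipes.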
So I repeated the hyperfine benchmarks.
For the Rust application:
$ hyperfine --warmup 3 -r 100 'target/release/text-sanitizer -i es de < lanzarote-com_de-ausfluge.html.x100 > /dev/null'
Benchmark #1: target/release/text-sanitizer -i es de < lanzarote-com_de-ausfluge.html.x100 > /dev/null
Time (mean ± σ): 30.0 ms ± 5.3 ms [User: 26.0 ms, System: 4.1 ms]
Range (min … max): 26.4 ms … 46.4 ms 100 runs
And for the implementation in the other programming language:
$ hyperfine --warmup 3 -r 100 './text-sanitizer.run -i es de < lanzarote-com_de-ausfluge.html.x100 > /dev/null'
Benchmark #1: ./text-sanitizer.run -i es de < lanzarote-com_de-ausfluge.html.x100 > /dev/null
Time (mean ± σ): 56.1 ms ± 7.1 ms [User: 48.9 ms, System: 7.2 ms]
Range (min … max): 51.6 ms … 81.8 ms 100 runs
The Rust application does the same job in only about 53.5% of the time (30.0 ms versus 56.1 ms mean).
That makes it almost twice as fast.
This is the point that shows a clear and measurable advantage.
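The comparison follows directly from the two mean times reported by hyperfine. As a small sketch of the arithmetic (the constant and function names are made up for this example):

```rust
// Mean run times in milliseconds, taken from the hyperfine output above.
const RUST_MEAN_MS: f64 = 30.0;
const OTHER_MEAN_MS: f64 = 56.1;

// Fraction of the slower run time that the faster run needs.
fn time_fraction(fast_ms: f64, slow_ms: f64) -> f64 {
    fast_ms / slow_ms
}

// How many times faster the fast run is than the slow one.
fn speedup(fast_ms: f64, slow_ms: f64) -> f64 {
    slow_ms / fast_ms
}

fn main() {
    let fraction = time_fraction(RUST_MEAN_MS, OTHER_MEAN_MS);
    let factor = speedup(RUST_MEAN_MS, OTHER_MEAN_MS);
    println!("time fraction: {:.2}%", 100.0 * fraction);
    println!("speedup: {:.2}x", factor);
}
```

The fraction comes out at roughly 53.5% and the speedup at roughly 1.87x. The standard deviations of 5.3 ms and 7.1 ms are not taken into account here, so both figures are estimates based on the means only.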
This showed up mostly once the input grew significantly.
I can only guess, but I think it comes from better memory management, which matters more with larger data volumes.
Still, the complete version of the application runs 5 times slower than any other proposed solution.