Lua 5.2 and LuaJIT 2.x base64 encoding and decoding
This module is written for Lua 5.2 (and was ported to LuaJIT 2.x) and likely could be used in 5.1 using the LuaRocks bit32 backport. I have not tested on 5.1 and likely never will.
This module "exports" 3 methods with various "overloads" that allow interaction with the encoding / decoding routines. Default is to encode and decode as RFC 2045. This implementation is not strict 2045. Line breaking is the responsibility of the user.
The "base64 RFC 2045" encoding is tested against base64 (GNU coreutils) 8.22 & 8.13 and can produce identical output (including line breaks, if you write the data that way).
Decoding creates a duplicate copy of the input string to sanitize the input. Decoding is about 20% slower than encoding for standard alphabets. Custom alphabet decoding is considerably slower (almost 2x) because of the lack of predictable coherent patterns.
Use as you will. No warranty. (What do you expect for "free stuff" you find on the web?) I'd like to know if others find this useful, but other than that, meh.
####Why? (yet another Lua base64 converter) I was looking for some "natural Lua" base64 encoding routines and didn't find any "fast enough." The project this was written for runs on three different platforms (OSX, Linux, ARM/x64) and would require 3 different binaries, so I wanted to KISS it. With Lua 5.2 and the bit32 library I suspected that a better job could be done. The best routine I found (prior to starting this version) used a considerable amount of memory. For an 818K file, over 6.5M was needed for an intermediate value. This was way too much. I considered popen("base64"), but this isn't easily portable. Speed was an issue when 818K files took over 2 full seconds. So...
Simplest is "string in / string out".
base64=require("ee5_base64")
print( base64.encode("This is a string") )
print( base64.decode("RHVkZSEgV2hlcmUgaXMgbXkgY2FyPz8/Cg==") )
--[[ Output
VGhpcyBpcyBhIHN0cmluZw==
Dude! Where is my car???
]]--
I don't encourage using these routines (at all) for huge strings. A Lua c-module will handle this considerably faster. Just "for fun" I ran a 600M file through base64 and this utility. 0m3.306s vs 8m13.139s! It is plausible that base64 spun multiple threads, but I suspect that the real reason is that all of the "overhead" of a function call vs an extremely tight and optimized C routine is the real factor. (this is a perverse example, it annoys me no end that base64 even exists)
Update 2016/21/02: Also 'just for fun' I timed test_encode.lua against base64 ( Ubuntu 15.10 Linux/4.2.0-23-generic/x86_64 ) today with a 1G file. ALL of the time is in the encoding stage with a single core pegged at 100%. The decrease in time is mostly due to the test machine having a higher clock than the laptop I ran the tests on previously. (There are MINOR improvements that impact the times, but not to the extent shown.)
$/usr/bin/time --format %E lua test_encode.lua < sample4kp.mp4 > woo.b64
6:43.89
$/usr/bin/time --format %E base64 -w 0 < sample4kp.mp4 > wow.b64
0:01.39
$diff woo.b64 wow.b64
$
###More examples:
####stdin to stdout
base64=require("ee5_base64")
base64.encode(io.stdin,io.stdout)
--[[ Output
$lua test.lua < ee5_base64.lua
LS1bWyoqKioqKioqKio ... dGVyYXRvcgp9Cg==
]]--
o={}
base64.encode("Encode this please",function(s) o[#o+1]=s end)
for i,v in ipairs(o) do
print(i,v)
end
--[[ Output
1 RW5j
2 b2Rl
3 IHRo
4 aXMg
5 cGxl
6 YXNl
]]--
function linespliter()
local c = 0
return function(s)
io.write(s)
c=c+1
if c > 5 then
io.write("\n")
c=0
end
end
end
f=io.open("ee5_base64.lua")
s=f:read("*a") -- read entire file into string
f:close()
base64.encode(s,linespliter())
--[[ Output
LS1bWyoqKioqKioqKioqKioq
KioqKioqKioqKioqKioqKioq
. . .
ZGVjb2RlX2lpICAgPSBkZWNv
ZGU2NF9pb19pdGVyYXRvcgp9
Cg==
]]--
-- Mess with the input "V2hhdCBpcyB0aGlzPwo="
s="V 2 h h d C Bp c y(((@!!!!\n\n\r\t\tB0aGlzPwo= :-) ?"
print(base64.decode(s))
--[[ Output
What is this?
]]--
####RFC 4648 'base64url'
base64.alpha("base64url")
i=io.open("foo")
o=io.open("bar","w")
-- String in / out (probably shouldn't be a url!)
o:write(base64.encode(i:read("*a")))
i:close()
o:close()
--[[ No output ]]--
####Custom alphabet
base64.alpha("~`!1@2#3$4%\t6^7&8*9(0)_-+={[}]|\\:;'<D,./?qwertyuioplkjhgfdsazxcv","")
s=base64.encode("User base64 encoding, no term chars")
print(s)
print(base64.decode(s))
--[[ Output
)-^,}'`'+-^,^<8:=_d<[h*q[.}r$#du$3*,}.k:+h;;}/6
User base64 encoding, no term chars
]]--
#Timing These numbers reflect the encoding of an 818K file read into a buffer and written to stdout. The tests are run under OS X 10.9.1 with a 2.6 GHz Intel Core i7. 16 GB of memory is available with limited background processing. Each test is run five times. Decoding (not shown) is slower because of input sanitation.
$ lua -v
Lua 5.2.3 Copyright (C) 1994-2013 Lua.org, PUC-Rio
$ ls
-rw-r--r-- 1 Ernie Ewert staff 818K 2014-01-16 21:07 818KDataFile
-rwxr-xr-x 1 Ernie Ewert staff 9.6K 2014-01-16 21:02 test_encode.lua
-rwxr-xr-x 1 Ernie Ewert staff 107 2014-01-16 21:03 test_encode_stdio.lua
-rw-r--r-- 1 Ernie Ewert staff 1.1M 2014-01-16 21:10 dataout.b64
ErnieE5/lua_base64 (this library)
--[[ test_encode.lua ]]
require("ee5_base64").encode(io.stdin:read("*a"),io.stdout)
-- |
-- Reads entire file to string
$ time lua test_encode.lua < 818KDataFile > dataout.b64 # repeat 5 times
real 0m0.202s
real 0m0.203s
real 0m0.204s
real 0m0.203s
real 0m0.205s
paulmoore/base64.lua (Paul Moore) Modified to use the bit32 library in 5.2 The performance of encoding in this library is GREATLY enhanced with a single line modification. (Nearly twice as fast.) These numbers are "as is." Removing the assert in toChar reduces the run time significantly (778ms total).
--[[ test_encode.lua ]]
io.stdout:write( require("pm_base64").encode( io.stdin:read("*a") ) )
-- |
-- Reads entire file to string
$ time lua test_encode.lua < 818KDataFile > dataout.b64 # repeat 5 times
real 0m1.351s
real 0m1.360s
real 0m1.353s
real 0m1.368s
real 0m1.372s
--[[ test_encode.lua ]]
io.stdout:write( enc( io.stdin:read("*a") ) )
-- |
-- Reads entire file to string
$ time lua test_encode.lua < 818KDataFile > dataout.b64 # repeat 5 times
real 0m2.443s
real 0m2.391s
real 0m2.413s
real 0m2.401s
real 0m2.406s
#Memory Usage The following results are based on the the tests above in the timing section. (string to string) Building the result string is the largest consumer of memory because all of the data must be accumulated and then concatenated.
ErnieE5/ee5_base64 ~16MB 15,888,384 maximum resident set size
paulmoore/base64.lua ~68MB 68,558,848 maximum resident set size
Lua wiki ~38MB 38,924,288 maximum resident set size
I am not surprised at the Lua Wiki versions memory picture. I was stunned at the amount of memory consumed by Paul's version. In fact, looking into why his code churns so much memory gave MY code a boost in encode performance. (Almost 20% from the prior check-in, thanks Paul!) I wasn't clear on the string:byte() method usage and a simple change was helpful. (The 'bytes' table in Paul's code is the input string as a table of bytes.)
--[[ test_encode.lua
Test: string in / file out ]]
i=io.open("818KDataFile")
o=io:open("dataout.b64","w")
require("ee5_base64").encode(i:read("*a"),o)
o:close()
i:close()
$ time lua test_encode.lua
real 0m0.200s
$ /usr/bin/time -l lua test_encode.lua 2>&1 | grep "maximum"
3,923,968 maximum resident set size
--[[ test_encode.lua
Test: input iterator in / predicate out ]]
i=io.open("818KDataFile")
o=io:open("dataout.b64","w")
require("ee5_base64").encode(i,o)
o:close()
i:close()
$ time lua test_encode.lua
real 0m0.350s
$ /usr/bin/time -l lua test_encode.lua 2>&1 | grep "maximum"
1,519,616 maximum resident set size