SimonEnsemble/PorousMaterials.jl

Research on symbols versus strings


DataFrames.jl uses symbols to query a column. There may be speed advantages to using symbols instead of strings. See here.

Since we're calling ljforcefield.epsilon[atom_i][atom_j] millions of times, I'm wondering whether it would speed up computations to read in framework.atoms as an array of symbols and use symbols as the keys for looking up force field parameters.
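To make the idea concrete, here is a minimal sketch of the same nested lookup with String keys and with Symbol keys. The epsilon values, the atom list, and the helper total_epsilon are made up for illustration and are not taken from PorousMaterials.jl.

```julia
# Hypothetical epsilon table keyed by String (values are placeholders,
# not real force field parameters).
epsilon_str = Dict("C" => Dict("C" => 1.0, "O" => 2.0),
                   "O" => Dict("C" => 2.0, "O" => 3.0))

# The same table keyed by Symbol, built once from the String version.
epsilon_sym = Dict(Symbol(a) => Dict(Symbol(b) => v for (b, v) in inner)
                   for (a, inner) in epsilon_str)

# Convert the atom list to symbols once, up front, so the hot loop
# never touches a String.
atoms_str = ["C", "O", "C"]
atoms_sym = Symbol.(atoms_str)

# Hypothetical helper: sum epsilon over all ordered atom pairs.
# It works for either key type because it only indexes the table.
function total_epsilon(epsilon, atoms)
    total = 0.0
    for a in atoms, b in atoms
        total += epsilon[a][b]
    end
    return total
end

total_epsilon(epsilon_str, atoms_str)
total_epsilon(epsilon_sym, atoms_sym)
```

Both calls return the same number; only the key type differs. Symbols are interned, so hashing and comparing them does not depend on the length of the name, which is where any speedup would come from.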

@time will be helpful for this.
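A rough sketch of how such a comparison could be set up with @time is below. The tables and the function lookup_many are hypothetical stand-ins, not code from the package. Each function is called once beforehand so that the timed call does not include compilation.

```julia
# Hypothetical stand-in for a hot loop that does many dictionary lookups.
function lookup_many(table, keys, n)
    s = 0.0
    for _ in 1:n
        for k in keys
            s += table[k]
        end
    end
    return s
end

# Placeholder tables; the values carry no physical meaning.
str_table = Dict("C" => 1.0, "O" => 2.0)
sym_table = Dict(:C => 1.0, :O => 2.0)

# Warm-up calls so that @time below measures lookups, not compilation.
lookup_many(str_table, ["C", "O"], 1)
lookup_many(sym_table, [:C, :O], 1)

# Time one million passes over each key list.
@time lookup_many(str_table, ["C", "O"], 10^6)
@time lookup_many(sym_table, [:C, :O], 10^6)
```

Timings from a single run are noisy, so it is worth averaging over several runs before drawing conclusions.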

I just tried converting the elements from String to Symbol for the Forcefield construction and the energy calculation. Framework now has atoms::Array{Symbol,1} rather than atoms::Array{AbstractString,1}, the LennardJonesForcefield has symbols as dict keys, and Molecule stores its elements as symbols, just like the Framework.
I timed both the construction of the forcefield and the formation of a 30x30x30 array from my Snapshot.jl:

Symbol: [0.012 seconds (5.76 k allocations: 1.592 MiB)] (I averaged this over 10 runs)
String: [0.0468 seconds (49.26 k allocations: 2.151 MiB)] (also an average over 10 runs)

So this speeds up the construction by roughly 4x!

Symbol: [26.650878 seconds (778.02 M allocations: 63.750 GiB, 7.82% gc time)] (just 1 run)
String: [29.784629 seconds (778.66 M allocations: 63.783 GiB, 7.31% gc time)] (just 1 run)

For the 30x30x30 array, moving from String to Symbol gives a roughly 10% speedup (a single run each).
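As a quick check on the figures above, the ratios can be computed directly from the reported timings. The variable names are only for illustration.

```julia
# Forcefield construction: average times over 10 runs, in seconds.
construction_symbol = 0.012
construction_string = 0.0468
construction_speedup = construction_string / construction_symbol

# 30x30x30 array: single-run times, in seconds.
grid_symbol = 26.650878
grid_string = 29.784629
grid_time_saved = (grid_string - grid_symbol) / grid_string

println("construction speedup: ", round(construction_speedup, digits=1), "x")
println("array time saved: ", round(100 * grid_time_saved, digits=1), "%")
```

The construction comes out a little under 4x faster, and the array calculation takes about 10.5% less time with symbols.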

Nice work! 🎆