New library
pg9182 opened this issue · 10 comments
I've written a new library for querying IP2Location and IP2Proxy databases, github.com/pg9182/ip2x. I'm currently refactoring the code generation and refining the library interface, but it's more or less feature-complete.
- It supports Go 1.18+.
- It supports querying using Go 1.18's new `net/netip.Addr` type, which is much more efficient than parsing the IP from a string every time.
- It uses native integer types instead of `big.Int`, which is also much more efficient.
- It's about 11x faster than this library when querying a single field, and 2x faster for all fields, while making a fraction of the number of allocations (2 for init, 1 for each lookup, plus 1 for each typed field get, or 2 for an untyped one).
- It has comprehensive built-in documentation, including automatically-generated information about which fields are available in different product types.
- It supports querying information about the database itself, for example, whether it supports IPv6, and which fields are available.
- It has a more fluent and flexible API (e.g., `record.Get(ip2x.Latitude)`, `record.GetString(ip2x.Latitude)`, `record.GetFloat(ip2x.Latitude)`).
- It has built-in support for pretty-printing records as strings or JSON.
- It supports both IP2Location and IP2Proxy databases in a single package with a unified API.
- It uses code generation to simplify adding new products/types/fields/documentation while reducing the likelihood of bugs (input, docs).
- It's written in idiomatic Go: correct error handling (rather than stuffing error strings into the record struct), useful zero values (an empty record will work properly), proper type names, etc.
- There are tests to ensure the output is consistent with this library, that a range of IPv4 addresses (and their possible IPv6 mappings) work correctly, and other things. There are also fuzz tests to ensure IPs can't crash the library and are IPv4/v6-mapped correctly.
This library is already being used in production at Northstar for game server geolocation and log analysis.
$ cd test && go test -bench=. -benchmem .
db: IP2Location DB11 2022-10-29 [city,country_code,country_name,latitude,longitude,region,time_zone,zip_code] (IPv4+IPv6)
goos: linux
goarch: amd64
pkg: github.com/pg9182/ip2x/test
cpu: AMD Ryzen 5 5600G with Radeon Graphics
BenchmarkIP2x_Init-12 17850333 67.91 ns/op 128 B/op 2 allocs/op
BenchmarkIP2x_LookupOnly-12 18722506 61.36 ns/op 48 B/op 1 allocs/op
BenchmarkIP2x_GetAll-12 1522696 812.2 ns/op 1688 B/op 14 allocs/op
BenchmarkIP2x_GetOneString-12 7839385 144.1 ns/op 304 B/op 2 allocs/op
BenchmarkIP2x_GetOneFloat-12 14312419 84.16 ns/op 48 B/op 1 allocs/op
BenchmarkIP2x_GetTwoString-12 4243560 244.9 ns/op 560 B/op 3 allocs/op
BenchmarkIP2x_GetTwoFloat-12 12198259 101.1 ns/op 48 B/op 1 allocs/op
BenchmarkIP2x_GetNonexistent-12 14834245 79.85 ns/op 48 B/op 1 allocs/op
BenchmarkIP2LocationV9_Init-12 602967 2191 ns/op 400 B/op 7 allocs/op
BenchmarkIP2LocationV9_LookupOnly-12 1473849 782.6 ns/op 672 B/op 24 allocs/op
BenchmarkIP2LocationV9_GetAll-12 819900 1324 ns/op 2268 B/op 36 allocs/op
BenchmarkIP2LocationV9_GetOneString-12 1346534 889.2 ns/op 936 B/op 26 allocs/op
BenchmarkIP2LocationV9_GetOneFloat-12 1441219 795.0 ns/op 672 B/op 24 allocs/op
BenchmarkIP2LocationV9_GetTwoString-12 546868 1866 ns/op 1883 B/op 53 allocs/op
BenchmarkIP2LocationV9_GetTwoFloat-12 693019 1561 ns/op 1345 B/op 49 allocs/op
BenchmarkIP2LocationV9_GetNonexistent-12 1399872 795.5 ns/op 672 B/op 24 allocs/op
Here's a summary of the benchmarks in a more readable form:
op | ns (ip2x) | ns (ip2loc/v9) | allocs (ip2x) | allocs (ip2loc/v9) | bytes (ip2x) | bytes (ip2loc/v9) |
---|---|---|---|---|---|---|
Init | 68 (-97%) | 2191 (32.3x) | 2 (-5) | 7 (3.5x) | 128 (-68%) | 400 (3.1x) |
LookupOnly | 61 (-92%) | 783 (12.8x) | 1 (-23) | 24 (24.0x) | 48 (-93%) | 672 (14.0x) |
GetAll | 812 (-39%) | 1324 (1.6x) | 14 (-22) | 36 (2.6x) | 1688 (-26%) | 2268 (1.3x) |
GetOneString | 144 (-84%) | 889 (6.2x) | 2 (-24) | 26 (13.0x) | 304 (-68%) | 936 (3.1x) |
GetOneFloat | 84 (-89%) | 795 (9.4x) | 1 (-23) | 24 (24.0x) | 48 (-93%) | 672 (14.0x) |
GetTwoString | 245 (-87%) | 1866 (7.6x) | 3 (-50) | 53 (17.7x) | 560 (-70%) | 1883 (3.4x) |
GetTwoFloat | 101 (-94%) | 1561 (15.4x) | 1 (-48) | 49 (49.0x) | 48 (-96%) | 1345 (28.0x) |
GetNonexistent | 80 (-90%) | 796 (10.0x) | 1 (-23) | 24 (24.0x) | 48 (-93%) | 672 (14.0x) |
Code
package main

import (
	"bufio"
	"fmt"
	"math"
	"os"
	"regexp"
	"strconv"
)

// re matches one line of `go test -bench -benchmem` output and captures the
// library name, benchmark name, iteration count, ns/op, B/op, and allocs/op.
var re = regexp.MustCompile(`(?m)^Benchmark([^_]+)_([^-]+)[^\s]+\s+([0-9.]+)\s+([0-9.]+) ns/op\s+([0-9.]+) B/op\s+([0-9.]+) allocs/op`)

func main() {
	type result struct {
		Count       int64
		Nanoseconds float64
		Bytes       float64
		Allocs      float64
	}

	var (
		bylib       = map[string]map[string]result{} // library -> benchmark -> result
		benchnames  = []string{}                     // benchmark names in first-seen order
		benchnamesm = map[string]struct{}{}          // set of names already seen
	)

	// Read benchmark output from stdin, skipping lines that don't match.
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		row := re.FindStringSubmatch(sc.Text())
		if row == nil {
			continue
		}
		if _, seen := benchnamesm[row[2]]; !seen {
			benchnames = append(benchnames, row[2])
			benchnamesm[row[2]] = struct{}{}
		}
		if _, ok := bylib[row[1]]; !ok {
			bylib[row[1]] = map[string]result{}
		}
		// The regexp only captures digits and dots, so parse errors are ignored.
		var res result
		res.Count, _ = strconv.ParseInt(row[3], 10, 64)
		res.Nanoseconds, _ = strconv.ParseFloat(row[4], 64)
		res.Bytes, _ = strconv.ParseFloat(row[5], 64)
		res.Allocs, _ = strconv.ParseFloat(row[6], 64)
		bylib[row[1]][row[2]] = res
	}
	if err := sc.Err(); err != nil {
		panic(err)
	}

	const (
		lib1 = "IP2x"
		lib2 = "IP2LocationV9"
	)

	fmt.Println(`<table>`)
	fmt.Println(`<thead>`)
	fmt.Println(`<tr>`)
	fmt.Println(`<th rowspan="2">op</th>`)
	fmt.Println(`<th colspan="2">ns</th>`)
	fmt.Println(`<th colspan="2">allocs</th>`)
	fmt.Println(`<th colspan="2">bytes</th>`)
	fmt.Println(`</tr>`)
	fmt.Println(`<tr>`)
	fmt.Println(`<th>ip2x</th><th>ip2loc/v9</th>`)
	fmt.Println(`<th>ip2x</th><th>ip2loc/v9</th>`)
	fmt.Println(`<th>ip2x</th><th>ip2loc/v9</th>`)
	fmt.Println(`</tr>`)
	fmt.Println(`</thead>`)
	fmt.Println(`<tbody>`)
	for _, benchname := range benchnames {
		r1 := bylib[lib2][benchname] // official library (baseline)
		r2 := bylib[lib1][benchname] // ip2x
		fmt.Printf("<tr>\n<td><b>%s</b></td>\n%s\n%s\n%s\n</tr>\n", benchname,
			trow(true, r1.Nanoseconds, r2.Nanoseconds),
			trow(false, r1.Allocs, r2.Allocs),
			trow(true, r1.Bytes, r2.Bytes),
		)
	}
	fmt.Println(`</tbody>`)
	fmt.Println(`</table>`)
}

// trow renders two table cells: v2 with its change relative to v1 (a
// percentage if pct is set, otherwise an absolute difference), followed by
// v1 with the ratio v1/v2.
func trow(pct bool, v1, v2 float64) string {
	c, cc := v2-v1, ""
	if pct {
		c /= math.Abs(v1)
		c *= 100
		cc = "%"
	}
	d := v1 / v2
	return fmt.Sprintf(`<td align="right"><b>%.0f</b><br/><small><i>%+.0f%s</i></small></td><td align="right"><b>%.0f</b><br/><small><i>%.1fx</i></small></td>`, v2, c, cc, v1, d)
}
I've done a few more optimizations and some refactoring. I've also added automatic verification of the output of this library for every row in a few IP2Location databases (you can run it against any of them locally).
With this, I think I'm more or less finished with ip2x.
It's been stable for a few months now, so I've released v1.
cool!
I've added support for DB26, but I can't test it since the sample database seems to return corrupt data, even when using the official library.
Result for `71.68.178.128` (random one chosen from the CSV version) in sample DB26 database (SHA1: 43ab840159c7c421f3b6620ea9670f8485dfe53d).
Field | 520cede | pg9182/ip2x@296a65e |
---|---|---|
address_type | "Pv6 ranges.\x01U\x01-\x04IAB1\x06IAB1-1\x06IAB1-2\x06IAB1-3\x06IAB1-4\x06IAB1-5\x06IAB1-6\x06IAB1-7\x05IAB" | "Pv6 ranges.\x01U\x01-\x04IAB1\x06IAB1-1\x06IAB1-2\x06IAB1-3\x06IAB1-4\x06IAB1-5\x06IAB1-6\x06IAB1-7\x05IAB" |
area_code | "915" | "915" |
as | <nil> | "munications\x1aCharter Communications Inc-Chartres Metropole Innovations Numeriques SEM\x1dChartway Federal Credit " |
asn | <nil> | "11414" |
category | "16\bIAB19-17\bIAB19-18\bIAB19-19\aIAB19-2\bIAB19-2" | "16\bIAB19-17\bIAB19-18\bIAB19-19\aIAB19-2\bIAB19-2" |
city | "usen\x06Wendel\aWendell\vWendelsheim\vWendelstein\x06Wenden\fWendens Ambo\x12Wendisch Borschutz\nWendishain\bWen" | "usen\x06Wendel\aWendell\vWendelsheim\vWendelstein\x06Wenden\fWendens Ambo\x12Wendisch Borschutz\nWendishain\bWen" |
country_code | "ing Islands\x02US\x18United States of America\x02UY\aUruguay\x02UZ\nUzbekistan\x02VA\bHoly See\x02VC Saint Vincent and The Grenadines\x02VE\"Venez" | "ing Islands\x02US\x18United States of America\x02UY\aUruguay\x02UZ\nUzbekistan\x02VA\bHoly See\x02VC Saint Vincent and The Grenadines\x02VE\"Venez" |
country_name | " Islands\x02US\x18United States of America\x02UY\aUruguay\x02UZ\nUzbekistan\x02VA\bHoly See\x02VC Saint Vincent and The Gren" | " Islands\x02US\x18United States of America\x02UY\aUruguay\x02UZ\nUzbekistan\x02VA\bHoly See\x02VC Saint Vincent and The Gren" |
domain | "systems.com\fspectrum.com\x0fspectrum.com.au\fspec" | "systems.com\fspectrum.com\x0fspectrum.com.au\fspec" |
district | <nil> | "akayama Shi\vWake County\bWake-gun\tWakefield\fWakkanai Shi\bWako-shi\x0eWakulla County\x06Walcha\f" |
elevation | 947 | "947" |
idd_code | "6 ranges.\x01-\x011\x041242\x041246\x041264\x041268\x041284\x041340\x041345\x041441\x041473\x041649\x041664\x041670\x041671\x041684\x041721\x041758\x041767\x041784\x041829\x041868\x041869" | "6 ranges.\x01-\x011\x041242\x041246\x041264\x041268\x041284\x041340\x041345\x041441\x041473\x041649\x041664\x041670\x041671\x041684\x041721\x041758\x041767\x041784\x041829\x041868\x041869" |
isp | "Holding Com\x1aCharter Communicatio" | "Holding Com\x1aCharter Communicatio" |
last_seen | <nil> | <nil> |
latitude | 35.78099 | 35.78099 |
longitude | -78.36972 | -78.36972 |
mcc | "ckau\x06Zwolle\x01-\x03202\x03204\x03206\x03208\x03213\x03214\x03216\x03218\x03219\x03220\x03221\x03222\x03226\x03228\x03230\x03231\x03232\x03234\x03238\x03240\x03242\x03244\x03246" | "ckau\x06Zwolle\x01-\x03202\x03204\x03206\x03208\x03213\x03214\x03216\x03218\x03219\x03220\x03221\x03222\x03226\x03228\x03230\x03231\x03232\x03234\x03238\x03240\x03242\x03244\x03246" |
mnc | "Pv6 ranges.\x01-\x0200\x0500/02\x0500/76\a000/120\x03001\x0f004/005/006/012\x0201\x0501/02\b01/02/0" | "Pv6 ranges.\x01-\x0200\x0500/02\x0500/76\a000/120\x03001\x0f004/005/006/012\x0201\x0501/02\b01/02/0" |
mobile_brand | ".\t+7Telecom\x01-$1O1O / One2Free / New World Mobility\b2degrees\x013\x063 (2G)\x043Mob\x034ka\a9mobile\x02A1\x06A1.net\x03AIS\x04APTG\x10ASTELNET, " | ".\t+7Telecom\x01-$1O1O / One2Free / New World Mobility\b2degrees\x013\x063 (2G)\x043Mob\x034ka\a9mobile\x02A1\x06A1.net\x03AIS\x04APTG\x10ASTELNET, " |
net_speed | "-" | "-" |
provider | <nil> | <nil> |
proxy_type | <nil> | <nil> |
region | "\nNorth Bank\x0eNorth Carolina\x16North Central Province\fNorth Dakota\fNorth Darfur\nNorth East\x0fNorth Eleuthera\x0eNorth Kordof" | "\nNorth Bank\x0eNorth Carolina\x16North Central Province\fNorth Dakota\fNorth Darfur\nNorth East\x0fNorth Eleuthera\x0eNorth Kordof" |
threat | <nil> | <nil> |
time_zone | "2:30\x06-03:00\x06-04:00\x06-05:00\x06-06:00\x06-07:00\x06-08:00\x06-" | "2:30\x06-03:00\x06-04:00\x06-05:00\x06-06:00\x06-07:00\x06-08:00\x06-" |
usage_type | "DCH" | "DCH" |
weather_station_code | "42\bUSNC0743\bUSNC0744\bUSNC0745\bUSNC0746\bUSNC0747\bUSNC074" | "42\bUSNC0743\bUSNC0744\bUSNC0745\bUSNC0746\bUSNC0747\bUSNC074" |
weather_station_name | "chee\x06Wendel\aWendell\x06Wenden\bWendover\x06Wenham\x06Wenona\aWenonah\tWentworth\nWentzville\aWenzhou\x05Weott\fWernersville\vWernigerod" | "chee\x06Wendel\aWendell\x06Wenden\bWendover\x06Wenham\x06Wenona\aWenonah\tWentworth\nWentzville\aWenzhou\x05Weott\fWernersville\vWernigerod" |
zip_code | "\x042759\x0527590\x0527591\x0527592\x0527593\x0527594\x0527595\x0527596\x0527597\x05275" | "\x042759\x0527590\x0527591\x0527592\x0527593\x0527594\x0527595\x0527596\x0527597\x05275" |
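
The garbled values above look like runs of length-prefixed strings: in `\x06Wendel\aWendell\vWendelsheim`, the byte before each name equals its length (6, 7, 11). That is what one might expect to see if a string pointer in the file landed at the wrong offset, although this is only an inference from the output and not something confirmed in this thread. The sketch below reproduces the effect on a tiny hand-made buffer; `readPrefixed` is a hypothetical helper, and the one-byte length prefix is an assumption.

```go
package main

import "fmt"

// readPrefixed reads a string stored as one length byte followed by that
// many bytes of data, starting at off. Reads past the end of buf are
// clamped. Hypothetical helper; the one-byte prefix is an assumption.
func readPrefixed(buf []byte, off int) string {
	if off < 0 || off >= len(buf) {
		return ""
	}
	n := int(buf[off])
	start := off + 1
	end := start + n
	if end > len(buf) {
		end = len(buf)
	}
	return string(buf[start:end])
}

func main() {
	// Three consecutive length-prefixed strings.
	table := []byte("\x06Wendel\x07Wendell\x0bWendelsheim")

	// Correct offset: the length byte is 6, so exactly one name is read.
	fmt.Printf("%q\n", readPrefixed(table, 0)) // "Wendel"

	// Misaligned offset: the letter 'n' (110) is taken as the length, so
	// the read runs across the following entries and their length bytes.
	fmt.Printf("%q\n", readPrefixed(table, 3)) // "del\aWendell\vWendelsheim"
}
```

Numeric fields such as `latitude` and `longitude` are unaffected in the table above, which fits this picture: only values reached through a string offset come out wrong.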
New benchmarks as of 520cede (including #21):
- Official library now makes far fewer allocations, but still many times more than ip2x.
- Official library is now about the same speed for getting all records, but still much slower for a subset.
- Official library is about twice as fast as before, but still many times slower than ip2x.
op | ns (ip2x) | ns (ip2loc/v9) | allocs (ip2x) | allocs (ip2loc/v9) | bytes (ip2x) | bytes (ip2loc/v9) |
---|---|---|---|---|---|---|
Init | 61 (-96%) | 1662 (27.2x) | 2 (-15) | 17 (8.5x) | 128 (-82%) | 696 (5.4x) |
LookupOnly | 76 (-76%) | 312 (4.1x) | 1 (-5) | 6 (6.0x) | 48 (-79%) | 229 (4.8x) |
GetAll | 640 (-1%) | 646 (1.0x) | 14 (+2) | 12 (0.9x) | 1688 (-4%) | 1765 (1.0x) |
GetOneString | 137 (-63%) | 374 (2.7x) | 2 (-5) | 7 (3.5x) | 304 (-37%) | 485 (1.6x) |
GetOneFloat | 91 (-72%) | 322 (3.6x) | 1 (-5) | 6 (6.0x) | 48 (-79%) | 229 (4.8x) |
GetTwoString | 213 (-72%) | 767 (3.6x) | 3 (-11) | 14 (4.7x) | 560 (-42%) | 970 (1.7x) |
GetTwoFloat | 106 (-84%) | 666 (6.3x) | 1 (-11) | 12 (12.0x) | 48 (-90%) | 458 (9.5x) |
GetNonexistent | 90 (-72%) | 318 (3.5x) | 1 (-5) | 6 (6.0x) | 48 (-79%) | 229 (4.8x) |
Actually, turns out that the IPv6 DB26 sample BIN has issues. The above was tested using the IPv4 sample BIN which is ok. Will update again once the IPv6 DB26 sample BIN has been fixed.
IPv6 DB26 sample BIN has been fixed.
Thanks @ip2location; it works fine now: