@hertzcodes
Contributor

Hello, regarding the issue: #65
I have studied FactStore and implemented a semi-columnar store for faster reads on big queries. Writes still need optimization, and I'm thinking about that now. You can run the benchmarks in benchmark_test.go in the factstore package. @burakemir I'm not sure whether you meant that I shouldn't have used maps, but these are the benchmark results so far. I'd appreciate it if you could guide me through this.

goos: linux
goarch: amd64
pkg: github.com/google/mangle/factstore
cpu: Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz
BenchmarkAdd/factstore.SimpleInMemoryStore-8             1798400               655.9 ns/op           136 B/op          6 allocs/op
BenchmarkAdd/factstore.IndexedInMemoryStore-8            1676612               655.9 ns/op           136 B/op          6 allocs/op
BenchmarkAdd/factstore.MultiIndexedInMemoryStore-8       1230781               935.8 ns/op           184 B/op          7 allocs/op
BenchmarkAdd/*factstore.MultiIndexedArrayInMemoryStore-8                 1000000              1575 ns/op             202 B/op          7 allocs/op
BenchmarkAdd/factstore.ConcurrentFactStore-8                             1847586               646.7 ns/op           136 B/op          6 allocs/op
BenchmarkAdd/*factstore.ColumnarStore-8                                  1000000              1191 ns/op             173 B/op          6 allocs/op
BenchmarkGetFacts/factstore.SimpleInMemoryStore-8                        1991113               604.7 ns/op            86 B/op          4 allocs/op
BenchmarkGetFacts/factstore.IndexedInMemoryStore-8                       3648225               330.2 ns/op            86 B/op          4 allocs/op
BenchmarkGetFacts/factstore.MultiIndexedInMemoryStore-8                  3327081               399.7 ns/op            86 B/op          4 allocs/op
BenchmarkGetFacts/*factstore.MultiIndexedArrayInMemoryStore-8            3412228               335.9 ns/op            86 B/op          4 allocs/op
BenchmarkGetFacts/factstore.ConcurrentFactStore-8                        2024418               624.5 ns/op            86 B/op          4 allocs/op
BenchmarkGetFacts/*factstore.ColumnarStore-8                             3007182               470.8 ns/op            86 B/op          4 allocs/op
BenchmarkGetFacts_BigQuery/factstore.SimpleInMemoryStore-8                     8         132573589 ns/op              96 B/op          2 allocs/op
BenchmarkGetFacts_BigQuery/factstore.IndexedInMemoryStore-8                   15          76971862 ns/op              96 B/op          2 allocs/op
BenchmarkGetFacts_BigQuery/factstore.MultiIndexedInMemoryStore-8              10         108997991 ns/op              96 B/op          2 allocs/op
BenchmarkGetFacts_BigQuery/*factstore.MultiIndexedArrayInMemoryStore-8                 8         146630297 ns/op              96 B/op          2 allocs/op
BenchmarkGetFacts_BigQuery/factstore.ConcurrentFactStore-8                             8         137718846 ns/op              96 B/op          2 allocs/op
BenchmarkGetFacts_BigQuery/*factstore.ColumnarStore-8                                 52          20060120 ns/op              96 B/op          2 allocs/op
BenchmarkMerge/SimpleInMemoryStore-8                                               22758             51623 ns/op           36278 B/op        227 allocs/op
BenchmarkMerge/ColumnarFactStore-8                                                  2590            484880 ns/op         1054459 B/op       1109 allocs/op
PASS
ok      github.com/google/mangle/factstore      70.689s

@burakemir
Collaborator

Nice job on writing a new factstore implementation! I probably wouldn't call it columnar; something like a column-indexed list store fits better. Still, the benchmarks show that it can be faster in some scenarios, and that seems useful.

@hertzcodes
Contributor Author

Yeah, I didn't know what to call it either, to be honest. What do you think? Does this have the potential to get better? I have actually looked into in-memory columnar databases. I tried Apache Arrow itself, but it wasn't good at all (2500 ns/op). I also found another Go library, https://github.com/kelindar/column, which uses faster maps, but I have not written an implementation with it. Do you think it's worth a try?
