Skip to content

akshayajddn/Upbit

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

1. To generate data, open random_gen.py(Upbit/Debug/Others/) first choose the value for nl(size of random sample) and nr(same value repeated). Loop is there to generate large amount of data. Output filename and entries that should be in the column may be set as indicated in the comments in file.
	Usage : python random_gen.py

2. To generate bitvectors (Uncompressed), open bitv.py(Upbit/Debug/Others/) and change the name variable to file_name of your data file, then there will be a folder created with the column bitvectors files. The variable name is name of the data file (this name should contain suffix "_un" preferably). This folder contains a file for each column. Each file contains all the bitvectors of corresponding column.
	Usage : python bitv.py
	Dependencies : numpy { install numpy using pip by running "sudo pip install numpy", if pip is not installed then run "sudo apt-get install python-pip"}

3. Set execute permission for merge_update and initializeub_update, located in "Upbit/Debug/".
	Example : sudo chmod +x merge_update

4. To generate queries,  we need to run query_rand.py present in Debug folder
	op1 is array of operations need to be queried; op3 is array of distinct values in column; op4 is array of column names
	reinitialize size to total number of rows in column
	number of queries and number of rows in columns, upto which to be queried for can be given at the places indicated in comments in the file

	Usage : python query_rand.py > queries.txt
		Queries generated are stored in 'queries.txt' file

5. Copy the folder of uncompressed bitvectors and queries.txt to the Debug folder

6. Load the project in an IDE. Build and run. You are prompted to enter names of uncompressed folder and query-set file (Both must be in Debug folder).
	Threshold and multiplier can be changed in master.h (in #define preprossesing directives)
	master.cpp contains main function. You can set the method to UpBit by resetting the multiprocessing flag to 0. You can set the method to parallelized UpBit by setting the same to 1.
	You may uncomment any line(s) in the main function to observe that particular function.

7. First WAH compression is done and folder with "_wah" suffix is generated
	It contains a subfolder for each column
	Each column-folder contains three subfolders: fp,vbv,ubv which respectively stand for fence pointers, value bit vectors and update bit vectors
	These subfolder contain a file for each bitvector

8. After queries are evaluated, overall latency is printed on the console. If multiprocessing flag is set to 0 (upbit method is selected), "latencies_upbit.txt" is generated(in Debug folder), which contains latencies of queries. In the other case, "latencies_multi.txt" is created.

9. 'Others' folder in Debug folder contains programs used to generate plots: plot.py, plot_bar.py. Results.txt contains results observed during evaluation. They are used for plots.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published