CSVInspector - A graphical interactive tool to inspect and process CSV files.
Copyright (C) 2020 J. Férard https://github.com/jferard
License: GPLv3
CSVInspector needs JRE8 (with embedded JavaFX) or JRE11 and Python 3.8:
$ mvn clean install
$ PYTHONPATH=lang/python:$PYTHONPATH /path/to/jre8/java -jar target/csv_inspector-0.0.1-SNAPSHOT.jar
Code samples are in the menu Help > Snippets.
CSVInspector main goal is to help user performing repetitive one shot tasks on small data sets. Those tasks may be data aggregation, table join, column selection or creation, ...
CSVInspector provides the show method that displays the current stat of a data set.
(If you work on stable data or if the data sets are big, you should consider using SQL.)
Typical use case is:
- load one or two light csv files (< 10 k lines);
- aggregate some columns in both tables;
- add some columns;
- join files;
- save the result.
CSVInspector is a very basic client/server application:
- The server is a Python module that wraps some features of Pandas to handle CSV data.
- The client is a Kotlin/JavaFX GUI that sends Python scripts to the server and displays the results.
Python (won't work for now in a virtual env):
$ pushd lang/python
$ pip install --user -r requirements.txt
$ popd
Kotlin:
$ mvn clean install
$ /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -jar target/csv_inspector-1.0-SNAPSHOT.jar
#!/usr/bin/env python3.8
from csv_inspector import *
data = read_csv("fixtures/datasets-2020-02-22-12-33.csv") # this will open a MetaCSV panel to create/save the .mcsv file
data.show() # show the data
# do whatever you want here
data.show()
data.save_as("fixtures/datasets-2020-02-22-12-33-new.csv")
The wrapper provides the following instructions:
path.csvis the path to a csv file.
If the MetaCSV file
path.mcsvexists, return aDataobject. Else, detects the encoding, csv format and column types ofpath.csvand generate a sample MetaCSV file that may be edited and saved. (Will return aDataobject on next call.)
Shows the
Dataobject in a window.
Shows the stats of the
Dataobject in a window.
Returns a copy of the
Dataobject in a window.
path.csvis the path to a csv file.
Saves the
Dataobject to a file.
Note the square brackets.
Create a new col
xis an index, slice or tuple of slices/indices of column_indexfuncis the function to apply toxvaluescol_nameis the name of the new columncol_typeis the type of the new columnindexis the index of the new column
Drop the indices of the handle and select the other indices.
xis an index, slice or tuple of slices/indices
Filter data on a function.
xis an index, slice or tuple of slices/indices.funcis a function that takes thexvalues and returns a boolean
Create a grouper on some rows.
g = data[x].grouper() g[y].agg(func) g.group()
xis the index, slice or tuple of slices/indices of the rowsyis the index, slice or tuple of slices/indices of the aggregate columnsfuncis aggregate function
Make an inner join between two data sets.
xis the index, slice or tuple of slices/indices of the keydata2is anotherDataobjectyis the index, slice or tuple of slices/indices of the other keyfuncis the function to compare thexandyvalues
Make an inner join between two data sets.
xis the index, slice or tuple of slices/indices of the keydata2is anotherDataobjectyis the index, slice or tuple of slices/indices of the other keyfuncis the function to compare thexandyvalues
Create a new col by merging some columns. Those columns are consumed during the process.
xis an index, slice or tuple of slices/indices of column_indexfuncis the function to apply toxvaluescol_nameis the name of the new columncol_typeis the type of the new column
Move some column_group after a given index.
xis an index, slice or tuple of slices/indices of column_indexidxis the destination index
Move some column_group before a given index.
xis an index, slice or tuple of slices/indices of column_indexidxis the destination index
Make an outer join between two data sets.
xis the index, slice or tuple of slices/indices of the keydata2is anotherDataobjectyis the index, slice or tuple of slices/indices of the other keyfuncis the function to compare thexandyvalues
Rename one or more columns
xis the index, slice or tuple of slices/indices of the keynamesis a list of new names
Make an right join between two data sets.
xis the index, slice or tuple of slices/indices of the keydata2is anotherDataobjectyis the index, slice or tuple of slices/indices of the other keyfuncis the function to compare thexandyvalues
Sort the rows in reverse order.
xis the index, slice or tuple of slices/indices of the keyfuncis the key function
Select the indices of the handle and drop the other indices.
xis an index, slice or tuple of slices/indices
Show the first rows of this DataHandle. Expected format: CSV with comma
sort(self, func=None, reverse=False) Sort the rows.
xis the index, slice or tuple of slices/indices of the keyfuncis the key function
Show stats on the data
swap(self, other_handle: 'DataHandle') Swap two handles. Those handles may be backed by the same data or not.
xandyare indices, slices or tuples of slices/indices
Update some column using a function.
xis an indexfuncis a function ofdata[x](use numeric indices)
