Data Frames are a widely used way to represent tabular data, and they are useful for Statistical Learning. Basically, a Data Frame = Tabular data + Named columns, and there are different implementations of this data structure, notably in R, Python and Apache Spark. The querier exposes a query language to retrieve data from Python pandas Data Frames, inspired by SQL's relational database querying. Currently, the querier can be installed from GitHub as:
pip install git+https://github.com/Techtonique/querier.git
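To make the "Tabular data + Named columns" idea concrete, here is a minimal sketch using plain pandas (not the querier itself). The column names and values are invented for illustration only.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cid"],
        "age": [31, 45, 27],
        "city": ["Paris", "Lyon", "Paris"],
    }
)

# The two ingredients of a Data Frame: named columns and tabular shape
print(df.columns.tolist())  # column names
print(df.shape)             # (number of rows, number of columns)
```

Each column can be referred to by its name, which is what makes SQL-like verbs on Data Frames possible.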
There are 9 types of operations available in the querier, with no plan to extend that list much further (to maintain a relatively simple mental model). These verbs will look familiar to dplyr users, but the implementation (numpy, pandas and SQLite3 are used) and the functions' signatures are different:
concat: concatenates 2 Data Frames, either horizontally or vertically
delete: deletes rows from a Data Frame based on given criteria
drop: drops columns from a Data Frame
filtr: filters rows of the Data Frame based on given criteria
join: joins 2 Data Frames based on given criteria (available for completeness of the interface, this operation is already straightforward in pandas)
select: selects columns from the Data Frame
summarize: obtains summaries of data based on grouping columns
update: updates a column/creates a new column, using an operation given by the user
request: for operations more complex than the previous 8, makes it possible to directly use a SQL query on the Data Frame
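To give a feel for what these nine verbs do, the sketch below reproduces each operation with plain pandas and the standard-library sqlite3 module. It does not use the querier's functions, whose signatures are documented in the notebooks; the toy data and all variable names are assumptions made for illustration.

```python
import sqlite3

import pandas as pd

# Hypothetical toy data, invented for illustration only
tips = pd.DataFrame(
    {
        "day": ["Sun", "Sun", "Sat", "Sat"],
        "smoker": ["No", "Yes", "No", "Yes"],
        "total_bill": [20.0, 30.0, 10.0, 40.0],
        "tip": [3.0, 5.0, 1.0, 6.0],
    }
)

# select: keep a subset of columns
selected = tips[["day", "tip"]]

# drop: remove columns
dropped = tips.drop(columns=["smoker"])

# filtr: keep the rows matching a criterion
non_smokers = tips[tips["smoker"] == "No"]

# delete: remove the rows matching a criterion
without_sat = tips[tips["day"] != "Sat"]

# update: create a new column from an operation on existing ones
updated = tips.assign(tip_pct=tips["tip"] / tips["total_bill"])

# summarize: aggregate by grouping columns
summary = tips.groupby("day", as_index=False)["tip"].mean()

# concat: stack 2 Data Frames vertically
stacked = pd.concat([tips, tips], ignore_index=True)

# join: merge 2 Data Frames on a common key
days = pd.DataFrame({"day": ["Sat", "Sun"], "is_weekend": [True, True]})
joined = tips.merge(days, on="day", how="left")

# request: run an arbitrary SQL query through an in-memory SQLite database
conn = sqlite3.connect(":memory:")
tips.to_sql("tips", conn, index=False)
sql_result = pd.read_sql_query(
    "SELECT day, SUM(total_bill) AS total FROM tips GROUP BY day ORDER BY day",
    conn,
)
conn.close()

print(summary)
print(sql_result)
```

The last step shows why a SQL fallback is convenient: any query SQLite accepts can be run on the table, without adding a new verb for every special case.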
The following notebooks present multiple examples of use of the querier:
concat example
delete example
drop example
filtr example
join example
select example
summarize example
update example
request example
Contributions and remarks are welcome as usual; you can submit a pull request on GitHub.
Note: I am currently looking for a gig. You can hire me on Malt or send me an email: thierry dot moudiki at pm dot me. I can do descriptive statistics, data preparation, feature engineering, model calibration, training and validation, and model outputs’ interpretation. I am fluent in Python, R, SQL, Microsoft Excel, Visual Basic (among others) and French. My résumé? Here!
Under License Creative Commons Attribution 4.0 International.