vaex 4.17.0 documentation
Guides
Advanced plotting examples
A single plot
Multiple plots of the same type
Multiple plots, same axes, different statistics
Multiple plots, different axes, different statistics
Slices in a 3rd dimension
Many plots with wrapping
Plotting selections
Overplotting a vector field on a heatmap
Plotting a healpix map
Arrow
Opens instantly
Quick viz of 146 million rows
Data cleansing: outliers
Shallow copies
Virtual column
Result
Interoperability
Tutorial
Async programming with Vaex
Using delay=True
Using the @delayed decorator
Async await
Async auto execute
Caching
Dask
Dask.array
Data Types
Supported Data Types in Vaex
General advice on data types in Vaex
Higher dimensional arrays
String support in Vaex
GraphQL
Pandas support
Server
GraphiQL
I/O Kung-Fu: get your data in and out of Vaex
Data input
Data export
Handling missing or invalid data
“nan” vs “missing” vs “na”
Examples
Machine Learning: the Iris dataset
PCA
Gradient boosting trees
Automatic pipelines
Production
Performance
Machine Learning: the Titanic dataset
Adjusting matplotlib parameters
Get the data
Feature engineering
Modeling (part 1): gradient boosted trees
Modeling (part 2): Linear models & Ensembles
Performance notes
Virtual columns
Materializing the columns
Consideration in backends with multiple workers
Progress Bars
Basic progress bars
Rich based progress bars
Vaex server
Why
Starting the dataframe server
Python API
REST API
Example using plotly.js
Jupyter integration: interactivity