Datasets to download
Here we list a few datasets that might be interesting to explore with vaex.
New York taxi dataset
See, for instance, "Analyzing 1.1 Billion NYC Taxi and Uber Trips, with a Vengeance" for some ideas.
[2]:
import vaex
[12]:
df = vaex.open("/Users/users/breddels/.vaex/data/nyc_taxi/nyc_taxi2015.hdf5")
df.plot(df.col.pickup_longitude, df.col.pickup_latitude, f="log1p", show=True, limits="96%");
SDSS - dereddened
Only the columns ra, dec, g, r, and g_r (dereddened using Schlegel maps).
The original query at the SDSS archive was (although split into smaller parts):
SELECT ra, dec, g, r from PhotoObjAll WHERE type = 6 and clean = 1 and r>=10.0 and r<23.5;
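The source does not say how the query was split into smaller parts. As a minimal sketch only, one plausible approach is to slice the sky into equal right-ascension ranges and issue one query per slice. The helper name `split_query_by_ra`, the choice of `ra` as the splitting column, and the number of parts are all assumptions made for illustration, not the method actually used.

```python
# Hypothetical sketch: split the SDSS query into equal right-ascension slices.
# The splitting column (ra) and the number of parts are assumptions.
BASE_QUERY = (
    "SELECT ra, dec, g, r from PhotoObjAll "
    "WHERE type = 6 and clean = 1 and r>=10.0 and r<23.5"
)


def split_query_by_ra(n_parts):
    """Return n_parts SQL strings, each covering an equal slice of ra in [0, 360)."""
    width = 360.0 / n_parts
    queries = []
    for i in range(n_parts):
        lo = i * width
        hi = (i + 1) * width
        # Half-open ranges so that no row is selected twice.
        queries.append(f"{BASE_QUERY} and ra>={lo:.4f} and ra<{hi:.4f};")
    return queries


queries = split_query_by_ra(4)
for q in queries:
    print(q)
```

Each string can then be submitted separately and the results concatenated; half-open ranges keep the slices from overlapping.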
[22]:
sdss = vaex.open("/Users/maartenbreddels/vaex/data/sdss/sdss_dereddened.hdf5")
sdss.healpix_plot(sdss.col.healpix, show=True, f="log", healpix_max_level=9, healpix_level=9,
                  healpix_input='galactic', healpix_output='galactic', rotation=(0,45)
                  )
Gaia
See the Gaia Science Homepage for details; you may also want to try the Gaia Archive for ADQL (SQL-like) queries.
- Gaia data release 2 (DR2)
- Gaia data release 1 (DR1)
[3]:
gaia = vaex.open("/data/users/gaia/gaia-dr2/gaia-dr2-sort-by-source_id.hdf5")
gaia.plot("ra", "dec", f="log", limits=[[360, 0], [-90, 90]], show=True);
Helmi & de Zeeuw 2000
Result of an N-body simulation of the accretion of 33 satellite galaxies into a Milky Way dark matter halo.
- 3 million rows - 252MB
[26]:
hdz = vaex.datasets.helmi_de_zeeuw.fetch() # this will download it on the fly
hdz.plot([["x", "y"], ["Lz", "E"]], f="log", figsize=(12,5), show=True);