Datasets¶
city_bike_nyc¶
Dataset description¶
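The dataset contains Citi Bike station data for NYC: per-station availability status (bikes, e-bikes, docks) combined with station information (name, location, capacity).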
NAME | REQUIRED | TYPE |
---|---|---|
station_id | false | string |
num_bikes_available | false | int |
num_ebikes_available | false | string |
num_bikes_disabled | false | string |
num_docks_available | false | int |
num_docks_disabled | false | string |
is_installed | false | string |
is_renting | false | string |
is_returning | false | string |
station_status_last_reported | false | int |
station_name | false | string |
lat | false | string |
lon | false | string |
region_id | false | string |
capacity | false | string |
has_kiosk | false | string |
station_information_last_updated | false | string |
missing_station_information | false | boolean |
NUMBER OF ROWS: 366,676,951
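As an illustration of how the two `int` columns in the schema above might be used, the following minimal sketch computes the share of usable docking points at a station that currently hold a bike. It assumes that non-required columns can be missing (`None`); the function name `bike_share` and the sample row are hypothetical and not taken from the dataset.

```python
from typing import Optional


def bike_share(
    num_bikes_available: Optional[int],
    num_docks_available: Optional[int],
) -> Optional[float]:
    """Return bikes / (bikes + free docks), or None if it cannot be computed."""
    # Assumption: non-required columns may be absent for some rows.
    if num_bikes_available is None or num_docks_available is None:
        return None
    total = num_bikes_available + num_docks_available
    if total == 0:
        # No usable docking points reported; avoid division by zero.
        return None
    return num_bikes_available / total


# Hypothetical row, made up for illustration only.
sample_row = {
    "station_id": "example-station",
    "num_bikes_available": 6,
    "num_docks_available": 18,
}

ratio = bike_share(
    sample_row["num_bikes_available"],
    sample_row["num_docks_available"],
)
print(ratio)
```

Most other columns are typed as `string`, so they would need explicit parsing before any arithmetic; their string encoding is not documented here.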
taxi_fhvhv¶
NYC TLC Trip Record Data, available under the nyc.gov terms of use.
Dataset description¶
Dataset segment:
Trips from 2019/02 to 2023/07;
High Volume For-Hire Vehicle Trip Records only (fhvhv).
Please note that each row corresponds to one taxi trip.
Column name | Required | Type |
---|---|---|
access_a_ride_flag | FALSE | string |
airport_fee | FALSE | int |
base_passenger_fare | FALSE | double |
bcf | FALSE | double |
congestion_surcharge | FALSE | double |
dispatching_base_num | FALSE | string |
DOLocationID | FALSE | long |
driver_pay | FALSE | double |
dropoff_datetime | FALSE | timestamptz |
hvfhs_license_num | FALSE | string |
on_scene_datetime | FALSE | timestamptz |
originating_base_num | FALSE | string |
pickup_datetime | FALSE | timestamptz |
PULocationID | FALSE | long |
request_datetime | FALSE | timestamptz |
sales_tax | FALSE | double |
shared_match_flag | FALSE | string |
shared_request_flag | FALSE | string |
tips | FALSE | double |
tolls | FALSE | double |
trip_miles | FALSE | double |
trip_time | FALSE | long |
wav_match_flag | FALSE | string |
wav_request_flag | FALSE | string |
NUMBER OF ROWS: 899,297,740
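Because the timestamp columns are typed as `timestamptz`, they can be subtracted directly once loaded as timezone-aware values. The sketch below derives the rider's waiting time from `request_datetime` and `pickup_datetime`. It assumes that rows arrive as dictionaries of Python `datetime` objects and that non-required columns can be `None`; the helper `wait_seconds` and the sample trip are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def wait_seconds(trip: dict) -> Optional[float]:
    """Seconds between the ride request and the pickup, or None if unknown."""
    requested = trip.get("request_datetime")
    picked_up = trip.get("pickup_datetime")
    # Assumption: non-required columns may be missing for some trips.
    if requested is None or picked_up is None:
        return None
    return (picked_up - requested).total_seconds()


# Hypothetical trip, made up for illustration only.
sample_trip = {
    "request_datetime": datetime(2023, 7, 1, 12, 0, 0, tzinfo=timezone.utc),
    "pickup_datetime": datetime(2023, 7, 1, 12, 4, 30, tzinfo=timezone.utc),
}

wait = wait_seconds(sample_trip)
print(wait)
```

Keeping the values timezone-aware avoids mixing naive and aware datetimes, which Python refuses to subtract.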
taxi_zones¶
NYC Taxi Zones, available under the nyc.gov terms of use.
Dataset description¶
NYC Taxi Zones, which correspond to the pickup and drop-off zones, or LocationIDs, included in the Yellow, Green, and FHV Trip Records published to Open Data.
NAME | REQUIRED | TYPE |
---|---|---|
LocationID | false | long |
Borough | false | string |
Zone | false | string |
service_zone | false | string |
NUMBER OF ROWS: 265
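Since `LocationID` corresponds to the pickup and drop-off IDs in the trip records, the zone table can serve as a lookup for `PULocationID` and `DOLocationID` in `taxi_fhvhv`. The following is a minimal in-memory sketch of that lookup. The zone and trip rows are invented placeholders rather than real dataset values, and the helper `with_zone_names` is hypothetical.

```python
# Placeholder rows, made up for illustration only (not real zone data).
zones = [
    {"LocationID": 1, "Borough": "Example Borough", "Zone": "Example Zone A",
     "service_zone": "Example Service"},
    {"LocationID": 2, "Borough": "Example Borough", "Zone": "Example Zone B",
     "service_zone": "Example Service"},
]

# Index the small zone table by its ID for constant-time lookups.
zone_by_id = {zone["LocationID"]: zone for zone in zones}


def with_zone_names(trip: dict, zone_by_id: dict) -> dict:
    """Return a copy of the trip with pickup and drop-off zone names added."""
    pickup = zone_by_id.get(trip.get("PULocationID"))
    dropoff = zone_by_id.get(trip.get("DOLocationID"))
    return {
        **trip,
        # Unknown or missing IDs yield None instead of raising.
        "pickup_zone": pickup["Zone"] if pickup else None,
        "dropoff_zone": dropoff["Zone"] if dropoff else None,
    }


sample_trip = {"PULocationID": 1, "DOLocationID": 99}
enriched = with_zone_names(sample_trip, zone_by_id)
print(enriched)
```

With only 265 rows, the zone table fits comfortably in memory, so a dictionary lookup of this kind is one reasonable alternative to a full join.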
titanic¶
Titanic - Machine Learning from Disaster
Dataset description¶
The data includes only the train.csv part of the original dataset.
The dataset contains the ground truth for each passenger of the Titanic.
Column name | Required | Type |
---|---|---|
PassengerId | false | long |
Survived | false | long |
Pclass | false | long |
Name | false | string |
Sex | false | string |
Age | false | double |
SibSp | false | long |
Parch | false | long |
Ticket | false | string |
Fare | false | double |
Cabin | false | string |
Embarked | false | string |
NUMBER OF ROWS: 891
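As a small illustration of working with this schema, the sketch below computes a survival rate per group. It assumes that `Survived` is encoded as 0 or 1 (the `long` type alone does not confirm this) and that non-required columns can be `None`. The helper `survival_rate_by` and the sample rows are hypothetical and do not reproduce real passengers.

```python
from collections import defaultdict


def survival_rate_by(rows: list, key: str) -> dict:
    """Return {group value: share of rows with Survived == 1} for one column."""
    totals = defaultdict(int)
    survived = defaultdict(int)
    for row in rows:
        group = row.get(key)
        outcome = row.get("Survived")
        # Skip rows where either value is missing.
        if group is None or outcome is None:
            continue
        totals[group] += 1
        survived[group] += outcome  # assumption: 0 = did not survive, 1 = survived
    return {group: survived[group] / totals[group] for group in totals}


# Made-up rows for illustration only.
sample_rows = [
    {"Pclass": 1, "Survived": 1},
    {"Pclass": 1, "Survived": 0},
    {"Pclass": 3, "Survived": 0},
    {"Pclass": 3, "Survived": 0},
    {"Pclass": 3, "Survived": 1},
    {"Pclass": 3, "Survived": 0},
    {"Pclass": None, "Survived": 1},
]

rates = survival_rate_by(sample_rows, "Pclass")
print(rates)
```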
wind_energy_sensor_data¶
Dataset description¶
The dataset contains information from four German energy companies (50 Hertz, Amprion, TenneT TSO and TransnetBW). It contains non-normalized power generation data at 15-minute intervals, totaling 96 points a day. Generation is in TWh, with data collected between 23/08/2019 and 22/09/2020.
COLUMN NAME | REQUIRED | TYPE |
---|---|---|
hour_000000 | false | double |
hour_001500 | false | double |
hour_003000 | false | double |
… | … | … |
observation_date | false | date |
company | false | string |
NUMBER OF ROWS: 1,588
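The table stores one row per company and day, with one column per 15-minute slot. The sketch below shows one way such a wide row could be unpivoted into timestamped readings. It assumes that the elided columns continue the `hour_HHMMSS` pattern of the three listed ones, which the table does not confirm. The helpers `slot_columns` and `to_long` and the sample values are hypothetical.

```python
from datetime import date, datetime


def slot_columns(step_minutes: int = 15) -> list:
    """Build slot column names, assuming the hour_HHMMSS naming pattern."""
    columns = []
    for minutes in range(0, 24 * 60, step_minutes):
        hour, minute = divmod(minutes, 60)
        columns.append(f"hour_{hour:02d}{minute:02d}00")
    return columns


SLOT_COLUMNS = slot_columns()  # 24 h * 4 slots per hour = 96 names


def to_long(row: dict) -> list:
    """Turn one wide row into (timestamp, company, value) tuples."""
    readings = []
    for column in SLOT_COLUMNS:
        value = row.get(column)
        if value is None:
            # Non-required slot without a reading; skip it.
            continue
        slot_time = datetime.strptime(column[len("hour_"):], "%H%M%S").time()
        timestamp = datetime.combine(row["observation_date"], slot_time)
        readings.append((timestamp, row["company"], value))
    return readings


# Made-up row for illustration only; the values are not from the dataset.
sample_row = {
    "observation_date": date(2020, 1, 1),
    "company": "example_company",
    "hour_000000": 1.5,
    "hour_001500": 2.0,
}

long_rows = to_long(sample_row)
print(len(SLOT_COLUMNS), long_rows)
```

A long layout of this kind tends to be easier to resample or plot than 96 separate columns, at the cost of more rows.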