KDnuggets, courses/blogs in Data Science, Machine Learning, AI & Analytics (link)
Analytics Vidhya, courses/blogs/books in Data Science, Machine Learning (link)
Courses:
Andrew Ng, Machine Learning Specialization, 2012 (link)
Hal Daumé III, A Course in Machine Learning, 2017 (link)
Andrew Ng, The Deep Learning Specialization, 2018 (link)
Roman Garnett, Bayesian Methods in Machine Learning, Washington University in St. Louis (link)
Data repositories used in ML literature:
UCI Machine Learning Repository. The Machine Learning Repository at UCI provides an up-to-date resource for open-source datasets (link).
Google Dataset Search. Similar to how Google Scholar works, Dataset Search lets you find datasets wherever they are hosted, whether it’s a publisher’s site, a digital library, or an author’s web page. It contains over 25 million datasets (link).
Kaggle. Kaggle hosts a large collection of datasets, including over 50,000 public datasets and 400,000 public notebooks for exploratory data analysis (EDA) (link).
VisualData. It lists computer vision datasets by category and supports search queries (link).
CMU Libraries. This collection of high-quality datasets is curated by Huajin Wang at CMU (link).
The Big Bad NLP Database. Created and curated by Quantum Stat, this database collects datasets for various natural language processing tasks (link).
Hugging Face. This popular hub and framework contains 46,121 datasets used in state-of-the-art ML/DL research. It covers a wide range of areas and modalities, including Natural Language Processing (NLP), Computer Vision (CV), Speech, Tabular, and Multimodal data (link).
Data.world. Data.world calls itself a “collaborative data community” and holds over 100,000 datasets on topics ranging from crime to social media (link).
Data.gov. This is the US Government’s open-data portal, created as part of an effort to be more transparent. It hosts over 300,000 datasets from fields such as environment, ocean, and agriculture (link).
Earthdata. Earthdata was created by NASA as part of its Earth Science Data Systems Program and provides access to data from the Earth Observing System Data and Information System (EOSDIS) (link).
Europeana Data. Open metadata on 20 million texts, images, videos, and sounds gathered by Europeana (link).
Data Planet. Billed as the largest repository of standardized and structured statistical data (link).