Common Crawl

Making web crawl data accessible and analyzable for everyone.

Common Crawl is a non-profit initiative that builds and maintains a free, open repository of web crawl data. The data is accessible to anyone and is a valuable resource for researchers. The repository holds over 240 billion pages spanning 15 years. It is also a primary training corpus for many LLMs and has been cited in over 8,000 research papers.
