WebsiteHunt is brought to you by Fotoia
Common Crawl

Making web crawl data accessible and analyzable for everyone.

Common Crawl is a non-profit initiative that builds and maintains a free, open repository of web crawl data. The data is available to anyone and is a valuable resource for researchers. It covers over 240 billion pages spanning 15 years, serves as a primary training corpus for many LLMs, and has been cited in over 8,000 research papers.

You might also like

One Million Checkboxes
Checking a box checks it for everyone!

YunoHost
Self-hosting for everyone

Open Food Facts
A comprehensive food products database made by everyone, for ever

NotebookLM
An AI notebook for everyone

Phosphor Icons
A flexible icon family for everyone — 588 icons in 6 weights

PromptQL
Agentic data access for your AI

Web-Check
All-in-one OSINT tool for analyzing any website

Get out of my <head>
Make faster, more accessible, more environmentally friendly websi

shelf
Asset management infrastructure for everyone

FlagWhiz.com
Flag quiz for everyone

Quadratic
Analyze your data with AI, Python, SQL, and formulas

Pomofocus
A simple and customizable pomodoro timer for the web

SiteGPT
ChatGPT for every website

SankeyMATIC
An online Sankey diagram builder for everyone

DevTerm
An Open Source Portable Terminal for Every Dev.

Counter
Simple and free web analytics

Randoma11y
Accessible color combinations

Total.js Flow
Visual Programming Interface for everyone

Xata
The data platform for modern web applications

How Much Rent
Rental Transparency for Everyone