I released https://versiondb.io a few hours ago. It lets you get a slice of what's running on the web without breaking the bank. The full version contains over 4M domains and over 3K detected technologies.
Have fun and I hope you guys find it useful.
Is there data on which site search engine each website uses? This is hard to get because it sits deep in the backend, but it would be super useful information. BuiltWith doesn't have this, but they do have a list mapping search engines (Bloomreach, Coveo, Algolia) to websites, probably based on private data dumps. Being able to look this up for a website would be very useful.
This dataset does include Bloomreach Discovery, Coveo and Algolia. These were detected by scanning the HTTP responses of publicly available web pages. For example, Coveo was detected by searching a script tag's src attribute for "static.cloud.coveo.com".
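To give a rough idea of what that kind of check looks like, here's a minimal sketch in Python. It isn't the actual versiondb.io code: the `SIGNATURES` table, the `ScriptSrcCollector` class and the `detect` function are names made up for illustration, and only the Coveo substring comes from the description above.

```python
from html.parser import HTMLParser

# Hypothetical signature table: technology name -> substring expected in a
# <script src="...">. Only the Coveo entry is taken from the comment above.
SIGNATURES = {"Coveo": "static.cloud.coveo.com"}


class ScriptSrcCollector(HTMLParser):
    """Collects the src attribute of every <script> tag in a page."""

    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "script":
            for name, value in attrs:
                if name == "src" and value:
                    self.srcs.append(value)


def detect(html):
    """Return the sorted names of technologies whose signature appears in a script src."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    return sorted(
        name
        for name, needle in SIGNATURES.items()
        if any(needle in src for src in collector.srcs)
    )


if __name__ == "__main__":
    page = '<html><head><script src="https://static.cloud.coveo.com/searchui/v2/js/CoveoJsSearch.min.js"></script></head></html>'
    print(detect(page))
```

The sketch only looks at script src values, so a page that merely mentions the hostname in its text is not counted as a match.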
You can check out everything that was detected in the full version here: https://versiondb.io/detection_list.json
If you'd like to know how the others were detected, I can go through that as well. See if it's what you're after.