19 points | by datafreak_ 2 hours ago
2 comments
Nice one.
I've been doing the same thing with a bit wider scope: the whole CrUX list, pruned to apex domains, looking for CMS signals - how's your throughput?
I'm not doing any headless browser stuff or making many requests, so it's hyper-optimised for speed.
I do grab robots.txt - I didn't really see much llms.txt or humans.txt in the wild, does yours see any?
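For what it's worth, here is a rough sketch of how a robots.txt check can be done once the file is fetched, using only Python's standard library. The `allowed` helper, the `example-bot` user agent, and the sample rules are all made up for illustration, not taken from either crawler.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper: takes the already-downloaded robots.txt body,
    so no extra network request is made here.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up sample rules, purely for illustration.
SAMPLE = """\
User-agent: *
Disallow: /admin/
"""

print(allowed(SAMPLE, "example-bot", "https://example.com/"))
print(allowed(SAMPLE, "example-bot", "https://example.com/admin/login"))
```

Parsing the body you already have keeps it to one request per domain, which fits the "few requests" approach.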
Ohh, Cloudflare verified bot status, interesting. I'll check that out.
I'm seeing about 6.6% block rate, but that does climb over time.
One extension idea, beyond the stack: market category/domain/application - or any combo that tells me what the product does.
Fab project otherwise!