Link Checker
by Stephen Johns - 3 years ago (2017-08-28)
I need a way to find broken links in a Web site.
1. by Alekos Psimikakis - 3 years ago (2017-09-25) Reply
The description of your need is ambiguous. I believe you mean 'dead links'. OK, but how do you want to use this 'finder'?
If you just need to check whether a link exists, search Google for: php check if a link exists. There are plenty of examples. This one is good: "How can I check if a URL exists via PHP? - Stack Overflow", stackoverflow.com/questions/2280394/how-can-i-check-if-a-url-exists-via-php
This class takes a given URL and returns the HTTP status code of the page, for example 404 for page not found, 200 for found, or 301 for redirect. It's not clear whether you want to test a database or list of specific URLs, or to crawl a page or site looking for bad links. If you want to crawl, it would also help to know whether you're looking for bad internal links or bad external links.
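To make the status codes mentioned above concrete, here is a minimal sketch that requests only the headers of a URL and sorts the result into found, redirect, or broken. The function names (`parseStatusLine`, `describeStatus`, `headStatus`) are hypothetical and are not taken from the class described in this comment; it assumes PHP 7.1 or later with the standard HTTP stream wrapper enabled.

```php
<?php
// Hypothetical helper names for illustration only.

/** Extract the status code from a line such as "HTTP/1.1 404 Not Found". */
function parseStatusLine(string $line): ?int
{
    return preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#i', $line, $m) ? (int) $m[1] : null;
}

/** Map a status code to the coarse categories used in this thread. */
function describeStatus(?int $code): string
{
    if ($code === null) {
        return 'unreachable';   // no HTTP response at all
    }
    if ($code >= 200 && $code < 300) {
        return 'found';
    }
    if ($code >= 300 && $code < 400) {
        return 'redirect';
    }
    if ($code >= 400) {
        return 'broken';
    }
    return 'other';             // 1xx informational responses
}

/** Send a HEAD request and return the status code of the last response. */
function headStatus(string $url): ?int
{
    $context = stream_context_create(['http' => ['method' => 'HEAD', 'timeout' => 10]]);
    $headers = @get_headers($url, false, $context);
    if ($headers === false) {
        return null;
    }
    $code = null;
    foreach ($headers as $line) {
        // The stream wrapper follows redirects by default, so later
        // status lines replace earlier ones.
        $code = parseStatusLine($line) ?? $code;
    }
    return $code;
}
```

A call such as `describeStatus(headStatus('https://example.com/some-page'))` would then give one of the labels above. Some servers answer HEAD requests differently from GET requests, so treat a 'broken' result from this sketch as a hint to re-check with a normal request.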
You can try this. It has a static method to check whether a given URL is a broken link, and three other methods to get all broken internal links, all broken external links, or all broken links (internal and external) of a given web page or local file. The package has many other methods to get more details about a given page.
You can try this package. It checks every anchor link on a given site for existence (HTTP status 200).
I do not think there is a package to handle that. Basically, you send a request to each link and analyze the response. Use cURL to accomplish that.
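The "send a request and analyze the response" approach could look roughly like the sketch below. It assumes the cURL extension is installed; `curlCheck` and `isBrokenLink` are hypothetical names, and the timeout and redirect limits are arbitrary choices rather than recommended values.

```php
<?php
// Hypothetical helpers; requires the cURL extension.

/** Request a URL and return its final HTTP status plus any transport error. */
function curlCheck(string $url, int $timeoutSeconds = 10): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,             // do not echo the body
        CURLOPT_FOLLOWLOCATION => true,             // follow 301/302 to the final page
        CURLOPT_MAXREDIRS      => 5,                // stop on redirect loops
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);
    curl_exec($ch);

    return [
        'status' => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE), // 0 if no response
        'error'  => curl_errno($ch) === 0 ? null : curl_error($ch),
    ];
}

/** Treat transport errors and any status outside 200-399 as broken. */
function isBrokenLink(array $result): bool
{
    return $result['error'] !== null
        || $result['status'] < 200
        || $result['status'] >= 400;
}
```

Keeping the transport error separate from the status code makes it possible to tell "the server said 404" apart from "the host could not be reached", which usually matters when reporting broken links.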
It is a multi-part process. First you need to scrape the website and retrieve the links, which is fairly easy. Then you can use this class to send HTTP requests to the linked sites and capture the responses to check whether they return a successful status.
1. by Till Wehowski - 3 years ago (2017-08-30) Reply
I agree with Dave Smith to recommend https://www.phpclasses.org/package/3-PHP-HTTP-client-to-access-Web-site-pages.html for testing the HTTP response code; you can fetch only the headers and check the response code. For the first task, fetching the links, I would recommend:
or just a simple regex:
$regexp = "<a\s[^>]href=(\"??)([^\" >]?)\\1[^>]>(.)<\/a>";
preg_match_all("/$regexp/siU", $this->content, $matches);
2. by Till Wehowski - 3 years ago (2017-08-30) in reply to comment 1 by Till Wehowski Reply
Somehow the regex in my answer was broken by the site. Here it is as a gist: gist.github.com/wehowski/afc811cb4eb727e97e2a75b1b9d3e3c6
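Since the regex as displayed in comment 1 appears to have lost its quantifiers, here is a small self-contained sketch of regex-based link extraction. It uses a different, simpler pattern and is not a reconstruction of the gist; `extractHrefsWithRegex` is a hypothetical name, and the pattern only handles quoted href values.

```php
<?php
// Hypothetical helper; a simplified pattern, not the one from the gist.

/** Return the unique href values of all <a> tags found in an HTML string. */
function extractHrefsWithRegex(string $html): array
{
    // <a, whitespace, any other attributes, then href="..." or href='...'.
    // Flags: s = dot matches newlines, i = case-insensitive.
    $pattern = '/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1/si';
    preg_match_all($pattern, $html, $matches);

    // $matches[2] holds the captured URLs; drop duplicates and reindex.
    return array_values(array_unique($matches[2]));
}
```

A regex like this is fine for quick checks, but it misses unquoted attributes and can be fooled by commented-out markup, so a DOM parser is usually the safer choice for a real crawler.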
3. by Axel Hahn - 3 years ago (2017-10-06) Reply
I agree with this too :-)
For a single webpage you can fetch it (with curl), then parse it (with DOM or regex) to get all links (can be in tags a, iframe, img, link, style, source, ...) and then check these.
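The parsing step could be sketched with PHP's DOM extension as follows. The function name `extractLinkTargets` and the tag-to-attribute map are assumptions made for illustration; URLs inside style blocks are left out because they would need a CSS parser.

```php
<?php
// Hypothetical helper; requires the DOM extension (ext-dom).

/** Collect the unique URL attributes of a, link, img, iframe and source tags. */
function extractLinkTargets(string $html): array
{
    if (trim($html) === '') {
        return [];  // loadHTML() rejects empty input
    }

    // Which attribute carries the URL for each tag of interest.
    $attributeByTag = [
        'a'      => 'href',
        'link'   => 'href',
        'img'    => 'src',
        'iframe' => 'src',
        'source' => 'src',
    ];

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);  // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $targets = [];
    foreach ($attributeByTag as $tag => $attribute) {
        foreach ($doc->getElementsByTagName($tag) as $element) {
            $value = trim($element->getAttribute($attribute));
            if ($value !== '') {
                $targets[] = $value;
            }
        }
    }

    return array_values(array_unique($targets));
}
```

The returned values are exactly what the page contains, so relative targets such as `logo.png` still have to be resolved against the page URL before they can be requested.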
To check a complete website you need a bit more: because you want to check each link only once, keep all results in a database. A single class cannot (and should not) do all of this.
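The "check each link only once" idea can be sketched with a simple result store. Here a plain PHP array stands in for the database table, and the actual checker is passed in as a callable; `normalizeForCache` and `checkEachLinkOnce` are hypothetical names, and stripping the fragment is just one possible normalization.

```php
<?php
// Hypothetical helpers; an in-memory array stands in for a database table.

/** Drop the "#fragment" part, which does not change the requested resource. */
function normalizeForCache(string $url): string
{
    $hashPosition = strpos($url, '#');
    return $hashPosition === false ? $url : substr($url, 0, $hashPosition);
}

/**
 * Check every URL at most once. $checkStatus receives a URL and returns its
 * result (for example an HTTP status code). $results may be passed in to
 * carry already known results from one page to the next.
 */
function checkEachLinkOnce(array $urls, callable $checkStatus, array &$results = []): array
{
    foreach ($urls as $url) {
        $key = normalizeForCache($url);
        if ($key === '' || array_key_exists($key, $results)) {
            continue;  // fragment-only link, or already checked
        }
        $results[$key] = $checkStatus($key);
    }
    return $results;
}
```

For a full site the same pattern applies, with a lookup and an insert against a database table replacing the array, so that results survive between runs.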
I am currently writing my own crawler and resource checker with a web browser interface, but it is still in beta (and not linked in my phpclasses projects yet).