PHP uses DOM
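HTML parsing in PHP is usually done with the built-in DOM extension. The following snippet loads an HTML string, prefixes the src of every img tag with a base URL, and serializes the document back to HTML: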
$dom = new DOMDocument;
$dom->loadHTML($html);
$images = $dom->getElementsByTagName('img');
foreach ($images as $image) {
    $image->setAttribute('src', 'http://example.com/' . $image->getAttribute('src'));
}
$html = $dom->saveHTML();
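The snippet above prefixes every src value, including ones that are already absolute. A minimal sketch of a more cautious variant follows, assuming you only want to rewrite relative paths; the sample markup, the $base value and the $sources array are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical input: one relative and one absolute image URL.
$html = '<p><img src="logo.png"> <img src="http://cdn.example.org/a.png"></p>';
$base = 'http://example.com/';

$dom = new DOMDocument;
libxml_use_internal_errors(true); // buffer parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

foreach ($dom->getElementsByTagName('img') as $image) {
    $src = $image->getAttribute('src');
    // Leave empty values, values with a scheme (http:, data:, ...) and
    // protocol-relative values (//host/path) untouched.
    if ($src === '' || preg_match('#^([a-z][a-z0-9+.-]*:|//)#i', $src)) {
        continue;
    }
    $image->setAttribute('src', $base . ltrim($src, '/'));
}

// Collect the resulting src values so they are easy to inspect.
$sources = [];
foreach ($dom->getElementsByTagName('img') as $image) {
    $sources[] = $image->getAttribute('src');
}

echo $dom->saveHTML();
```

Only the first image is rewritten here; the second keeps its original host.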
Here's an example for pulling out any <a> tags whose rel attribute is set to nofollow:
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html); // loads your HTML
$xpath = new DOMXPath($doc);
// returns a list of all links with rel=nofollow
$nlist = $xpath->query("//a[@rel='nofollow']");
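The query above returns a node list but does not show how to read it, and it only matches when rel is exactly "nofollow". As a rough sketch, assuming you also want links such as rel="nofollow noopener", the token can be matched inside the space-separated list and the results iterated like this; the sample markup and the $hrefs array are hypothetical.

```php
<?php
// Hypothetical sample markup with three links.
$html = '<p><a href="/a" rel="nofollow">A</a> '
      . '<a href="/b">B</a> '
      . '<a href="/c" rel="nofollow noopener">C</a></p>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Pad the rel value with spaces so "nofollow" is matched as a whole token.
$query = "//a[contains(concat(' ', normalize-space(@rel), ' '), ' nofollow ')]";

$hrefs = [];
foreach ($xpath->query($query) as $a) {
    $hrefs[] = $a->getAttribute('href');
}

print_r($hrefs);
```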
A simple DOM program to extract Google result links
<?php
# Use the cURL extension to query Google and get back a page of results
$url = "http://www.google.com";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$html = curl_exec($ch);
curl_close($ch);
# Create a DOM parser object
$dom = new DOMDocument();
# Parse the HTML from Google.
# The @ before the method call suppresses any warnings that
# loadHTML might throw because of invalid HTML in the page.
@$dom->loadHTML($html);
# Iterate over all the <a> tags
foreach ($dom->getElementsByTagName('a') as $link) {
    # Show the <a href>
    echo $link->getAttribute('href');
    echo "<br />";
}
?>
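The @ operator in the program above silences every warning from loadHTML. One possible alternative, sketched here with a hypothetical malformed string standing in for the fetched page, is to let libxml buffer its messages so they can be inspected afterwards. The $links and $errors names are illustrative only.

```php
<?php
// Hypothetical, deliberately malformed markup standing in for a fetched page.
$html = '<div><a href="/one">One</a><bogus><a href="/two">Two</div>';

// Buffer parser errors instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

$errors = libxml_get_errors();          // array of LibXMLError objects
libxml_clear_errors();                  // empty the buffer
libxml_use_internal_errors($previous);  // restore the earlier setting

$links = [];
foreach ($dom->getElementsByTagName('a') as $link) {
    $links[] = $link->getAttribute('href');
}

printf("%d parser message(s)\n", count($errors));
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}
print_r($links);
```

Which messages are reported depends on the libxml version PHP was built against, so the sketch only prints them rather than relying on a specific count.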
simple_html_dom
The simple_html_dom library is an alternative to the built-in DOM extension. Since it is a third-party library, you'll have to install it yourself.
Modifying links with simple_html_dom
Say you have some links in your HTML file whose href values are relative paths, and you want to prefix them with http://www.example.com, but only the ones with a class of "someclass". Here's a program to do that:
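// Assumes the library file simple_html_dom.php is available on the include path.
require_once 'simple_html_dom.php';

$html = new simple_html_dom();
$html->load($input); // $input holds the HTML string to modify
foreach ($html->find('a[class=someclass]') as $link) {
    $link->href = 'http://www.example.com' . $link->href;
}
$result = $html->save();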
find lets you easily query the DOM. Its parameter has the form tagtype[attributeName=attributeValue], where the part in square brackets is an optional filter. You then iterate over every link it returns and prepend your domain to the href attribute. href is exposed as a property that can be both read and assigned.
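For comparison, the same class-filtered rewrite can be sketched with the built-in DOM extension and XPath, which avoids the third-party dependency. This is only an illustration: the sample markup and the $hrefs array are hypothetical, and the XPath predicate matches only when the class attribute is exactly "someclass".

```php
<?php
// Hypothetical input: one link with the target class, one without.
$input = '<p><a class="someclass" href="/foo">Foo</a> <a href="/bar">Bar</a></p>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($input);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select <a> elements whose class attribute equals "someclass".
foreach ($xpath->query("//a[@class='someclass']") as $link) {
    $link->setAttribute('href', 'http://www.example.com' . $link->getAttribute('href'));
}

// Collect all href values to show which links changed.
$hrefs = [];
foreach ($dom->getElementsByTagName('a') as $link) {
    $hrefs[] = $link->getAttribute('href');
}

print_r($hrefs);
```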
Extracting text with simple_html_dom
A common task is to remove all tag markup from a page of HTML, leaving only the text. This is simple:
echo file_get_html('http://www.google.com/')->plaintext;
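A similar result can be approximated with the built-in DOM extension by reading the textContent property of the root element. The sketch below assumes a small hypothetical HTML string rather than a live page, and the $text variable is illustrative.

```php
<?php
// Hypothetical markup; a real page would be fetched first.
$html = '<p>Hello <b>bold</b> <a href="/x">link</a> text.</p>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

// textContent concatenates the text of all descendant nodes.
$text = $dom->documentElement->textContent;

// Collapse runs of whitespace so the output is easier to read.
$text = trim(preg_replace('/\s+/', ' ', $text));

echo $text;
```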
More alternative parsers for PHP
A thread on Stack Overflow discusses a number of different parsing tools available for PHP.
Copyright © 2011 - All Rights Reserved - Softron.in