Holger Schadeck

Holger's collected (programming) experiences

sourceforge.net: determining current file download URLs, API-style

Here is just a small example of how to use PHP to determine, in an API-like way, the download URL of a file release on a sourceforge.net mirror server. It uses my sourceforge.net class SFnet, which extends the Snoopy class, which in turn simulates a web browser.

Here is the class:

<?php
/**
 * @see http://snoopy.sourceforge.net/
 */
require_once (dirname(__FILE__) . DIRECTORY_SEPARATOR . 'Snoopy.class.php');
/**
 * small and easy API-like file access class for sourceforge.net
 *
 * @author Holger Schadeck <holger@schadeck.eu>
 * @access public
 */
class SFnet extends Snoopy {
	var $sProjectsBaseUrl = 'http://sourceforge.net/projects/';
	var $sProjectBaseUrl = 'http://sourceforge.net/project/';
	var $sDownloadBaseUrl = 'http://downloads.sourceforge.net/';
	var $host = 'sourceforge.net';
	var $agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4';
	var $aProjectsFilesData = array ();
	/**
	 * gets links and data
	 * 
	 * @access private
	 * @param string $sUrl URL
	 * @return mixed array of links data or boolean false
	 */
	function _getLinks($sUrl) {
		$expandlinks = $this->expandlinks;
		$this->expandlinks = false;
		if ($this->fetch($sUrl)) {
			$aLinks = $this->_striplinks($this->results);
			$aLinks = array_combine($aLinks, $this->_expandlinks($aLinks, $sUrl));
			$aReturnLinks = array ();
			foreach ($aLinks as $sLocal => $sRemote) {
				if (preg_match('/<a .*?href=[\'\"]' . preg_quote($sLocal, '/') . '[\'\"].*?>(.*?)<\/a>/i', $this->results, $aFound)) {
					$sRemote = html_entity_decode(utf8_decode($sRemote));
					$sLocal = html_entity_decode(utf8_decode($sLocal));
					$aReturnLinks[$sRemote] = array (
						'text' => trim(html_entity_decode(utf8_decode($aFound[1]
					))), 'url_download' => $sRemote, 'data' => parse_url($sRemote), 'args' => array ());
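					// only parse the query string if the URL actually has one
					if (isset($aReturnLinks[$sRemote]['data']['query'])) {
						parse_str($aReturnLinks[$sRemote]['data']['query'], $aReturnLinks[$sRemote]['args']);
					}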
				}
			}
			$mReturn = $aReturnLinks;
		} else {
			$mReturn = (false);
		}
		$this->expandlinks = $expandlinks;
		return $mReturn;
	}
	/**
	 * Gets and caches the file links data of all packages for a project
	 * 
	 * @access public
	 * @param string $sProject name of a project (example: http://sourceforge.net/projects/snoopy/ => snoopy)
	 * @return array with project file links data arrays, each with the keys: "text", "url_download", "data" and "args"
	 */
	function getFilesOfProject($sProject) {
		$sProjectUrl = $this->sProjectsBaseUrl . $sProject . '/';
		$sDownloadUrl = $this->sDownloadBaseUrl . $sProject;
		if (!array_key_exists($sProject, $this->aProjectsFilesData) && $aLinks = $this->_getLinks($sProjectUrl)) {
			$aReturnLinks = array ();
			foreach (preg_grep('/^' . preg_quote($this->sProjectBaseUrl . 'showfiles.php', '/') . '.*group_id=([0-9]+?).*?#downloads$/', array_keys($aLinks)) as $iKey => $sUrl) {
				if ($aLinks = $this->_getLinks($sUrl)) {
					$this->aProjectsFilesData[$sProject] = array ();
					foreach (preg_grep('/^' . preg_quote($sDownloadUrl, '/') . '/', array_keys($aLinks)) as $iKey => $sFileUrl) {
						$this->aProjectsFilesData[$sProject][strtolower($aLinks[$sFileUrl]['text'])] = & $aLinks[$sFileUrl];
					}
					return $this->aProjectsFilesData[$sProject];
				} else {
					return array ();
				}
				break;
			}
			return array ();
		} else
			if (array_key_exists($sProject, $this->aProjectsFilesData)) {
				return $this->aProjectsFilesData[$sProject];
			} else {
				return array ();
			}
	}
	/**
	 * gets the url for a project file
	 * 
	 * @access public
	 * @uses SFnet::getFilesOfProject to get the cached project files data, or to load it if it is not cached yet
	 * @param string $sProject name of the project
	 * @param string $sFile name of the release file 
	 * @return mixed string URL of project file download or false;
	 */
	function getDownloadUrlForProjectFile($sProject, $sFile) {
		// plain assignment: getFilesOfProject() does not return by reference
		$aProjectFilesData = $this->getFilesOfProject($sProject);
		return ((array_key_exists($sFile, $aProjectFilesData)) ? $aProjectFilesData[$sFile]['url_download'] : (false));
	}
	/**
	 * gets the modification timestamp for a project file
	 * 
	 * @access public
	 * @uses SFnet::getFilesOfProject to get the cached project files data, or to load it if it is not cached yet
	 * @param string $sProject name of the project
	 * @param string $sFile name of the release file 
	 * @return mixed string timestamp of project file or false
	 */
	function getTimeForProjectFile($sProject, $sFile) {
		// plain assignment: getFilesOfProject() does not return by reference
		$aProjectFilesData = $this->getFilesOfProject($sProject);
		return ((array_key_exists($sFile, $aProjectFilesData) && array_key_exists('modtime', $aProjectFilesData[$sFile]['args'])) ? $aProjectFilesData[$sFile]['args']['modtime'] : (false));
	}
}
?>
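
To make the layout of the link data arrays a bit more tangible (the keys "text", "url_download", "data" and "args" documented above), here is a minimal sketch that works without Snoopy and without network access. The function name sketchLinkData and the sample markup are made up for illustration and are not part of the SFnet class; the real sourceforge.net pages may look different.

```php
<?php
/**
 * Hypothetical helper, not part of SFnet: collects the links of an HTML
 * string into the array layout that SFnet::_getLinks documents.
 */
function sketchLinkData($sHtml) {
	$aLinkData = array ();
	// capture the href value and the link text of every anchor element
	if (preg_match_all('/<a\s[^>]*?href=[\'"]([^\'"]+)[\'"][^>]*>(.*?)<\/a>/is', $sHtml, $aFound, PREG_SET_ORDER)) {
		foreach ($aFound as $aMatch) {
			$sUrl = html_entity_decode($aMatch[1]);
			$aEntry = array (
				'text' => trim(html_entity_decode(strip_tags($aMatch[2]))),
				'url_download' => $sUrl,
				'data' => parse_url($sUrl),
				'args' => array ()
			);
			// only parse the query string if the URL actually has one
			if (isset($aEntry['data']['query'])) {
				parse_str($aEntry['data']['query'], $aEntry['args']);
			}
			$aLinkData[$sUrl] = $aEntry;
		}
	}
	return $aLinkData;
}

// made-up sample markup, only an assumption about what a file link could look like
$sSampleHtml = '<a href="http://downloads.sourceforge.net/example/example-1.0.zip?modtime=1180000000&amp;big_mirror=0">example-1.0.zip</a>';
$aSketchLinks = sketchLinkData($sSampleHtml);
print_r($aSketchLinks);
```

The query arguments of the URL end up in "args", which is where a value such as "modtime" can be read from later.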


Here is the example:
<?php
/**
 * Just an example of sourceforge.net.php
 * 
 * @access public
 * @author Holger Schadeck
 */
/**
 * including the SFnet class
 */
require_once ('lib/sourceforge.net.php');
$SFprojects = new SFnet();
/**
 * declare the target project
 */
$sProject = 'mingw';
/**
 * loop over the project files
 */
foreach ($SFprojects->getFilesOfProject($sProject) as $sFile => $aFileData) {
	/**
	 * show each file with download URL and modification timestamp
	 * using the SFnet::getDownloadUrlForProjectFile and SFnet::getTimeForProjectFile methods normally makes no sense
	 * if the project name is the same as the one the files loop runs over,
	 * because the same data is already in the file data array, here $aFileData['url_download'] and $aFileData['args']['modtime'];
	 * 
	 * @see SFnet::getFilesOfProject
	 */
	echo $sFile . ' => ' . $SFprojects->getDownloadUrlForProjectFile($sProject, $sFile) . ' (Modified: ' . $SFprojects->getTimeForProjectFile($sProject, $sFile) . ')<hr />';
}
?>


As you can perhaps see, you can even determine the timestamp of the modification date.
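
If you want to display that timestamp in a readable form, something like the following sketch could be used. It assumes that the "modtime" value is a Unix timestamp in seconds, which the class itself does not guarantee; the function name sketchFormatModtime is made up for this illustration.

```php
<?php
/**
 * Hypothetical helper, not part of SFnet: formats a "modtime" value as a UTC date.
 * Assumption: the value is a Unix timestamp in seconds.
 */
function sketchFormatModtime($mModtime) {
	// SFnet::getTimeForProjectFile returns false if no modtime is known
	if ($mModtime === false || !preg_match('/^[0-9]+$/', (string) $mModtime)) {
		return 'unknown';
	}
	return gmdate('Y-m-d H:i:s', (int) $mModtime) . ' UTC';
}

echo sketchFormatModtime('86400') . "\n"; // 1970-01-02 00:00:00 UTC
echo sketchFormatModtime(false) . "\n"; // unknown
```

In the example loop above, the return value of getTimeForProjectFile could be passed straight into such a helper.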
And here is the whole thing for download:

