Generate static html files from any website with httrack

I came across a need to take a cakePHP dynamically generated site and turn it into a collection of HTML files. After some trial and error I came across httrack which fit the need beautifully (thanks to this site for pointing me there.)

To install httrack run the following (for ubuntu-based distros)

sudo add-apt-repository ppa:upubuntu-com/web
sudo apt-get update
sudo apt-get install webhttrack httrack

Oncne httrack is installed simply launch it from the command line:

httrack

Follow the prompts and in no time you’ll have a folder with static HTML files for your entire website. Easy!

Fix vmware-view RDP issues in Linux

I’ve been using vmware view as a means to remote into my computers at work for some time now. An update to the linux client appears to have broken my ability to remote into work machines over the RDP protocol. This issue affects multiple distros.

The symptom is the fact that after you log into vmware-view and double click on a computer you wish to connect to over the RDP protocol, the screen flashes for a second and then takes you right back to where you started – no error message. Frustrating.

If you launch vmware-view in a console you get a little more insight into what’s going on:

RDP Client(10222): WARNING: Unknown -r argument
2017-02-26 14:06:53.817-07:00: vmware-view 7858| RDP Client(10222): 
2017-02-26 14:06:53.817-07:00: vmware-view 7858| RDP Client(10222): Possible arguments are: comport, disk, lptport, printer, sound, clipboard, scard

After much frustration I was able to combine documentation from vmware and freerdp in order to finally get the right combination of arguments to get things working again. I read that freerdp works better than rdesktop with this version, so I tried launching vmware-view with this option:

vmware-view --rdpclient="xfreerdp"

Progress – at least now the error message was different.

RDP Client(20799): [14:04:04:097] [20799:20803] [ERROR][com.freerdp.crypto] - certificate not trusted, aborting.

After more investigation the culprit turned out to be crypto negotiation. Since I’m already connected to the truste work VMware server, I don’t really care about certificate validation. This is what finally got me up and running. The key components are the rdpclient and the /cert-ignore options.

vmware-view --rdpclient="xfreerdp" --xfreerdpOptions="-wallpaper /sound:sys:alsa /cert-ignore"

You can include these options in your ~/.vmware/view-preferences config file so you don’t have to manually add all those switches:

echo 'view.rdpClient = "xfreerdp"
view.xfreerdpOptions = "-wallpaper /sound:sys:alsa /cert-ignore"' >> ~/.vmware/view-preferences

Finally RDP via vmware-view is working in Linux again.


Update 11/18/2018: I was having a very annoying issue with sound on my Debian Jessie box. If the system I was remoted into made any sound, it would corrupt the sound on my machine. It would make every sound sound tinny and fuzzy. I would have to run pulseaudio -k to fix it.

It turns out this was caused by /sound:sys:alsa above. I changed that to be /sound:sys:pulse and all was well.

Purge Varnish cache by visiting URL

I came across a need to allow web developers to purge Varnish cache. The problem is the developers aren’t allowed access to the production machine and our web application firewall blocks purge requests. I needed for there to be a way for them to simply access a page hosted on the webserver and cause it to purge its own varnish cache.

I was able to accomplish this by placing a PHP file in the webserver’s directory and controlling access to it via .htaccess. Thanks to this site for the php script and this one for the .htaccess guidance.

Place this PHP file where the web devs can access it, making sure to modify the $hostname variable to suit your needs and to rename the file to have a .php extension.

<?php

#Simple script to purge varnish cache
#Adapted from the script from http://felipeferreira.net/index.php/2013/08/script-purge-varnish-cache/
#This script runs locally on the webserver and will purge all cached pages for the site specified in $hostname
#Modify the $hostname parameter below to match the hostname of your site

$hostname = $_SERVER['SERVER_NAME'];

header( 'Cache-Control: max-age=0' );

$port     = 80;
$debug    = false;
$URL      =  "/*";

purgeURL( $hostname, $port, $URL, $debug );

function purgeURL( $hostname, $port, $purgeURL, $debug )
{
    $finalURL = sprintf(
        "http://%s:%d%s", $hostname, $port, $purgeURL
    );

    print( "<br> Purging ${finalURL} <br><br>" );

    $curlOptionList = array(
        CURLOPT_RETURNTRANSFER    => true,
        CURLOPT_CUSTOMREQUEST     => 'PURGE',
        CURLOPT_HEADER            => true ,
        CURLOPT_NOBODY            => true,
        CURLOPT_URL               => $finalURL,
        CURLOPT_CONNECTTIMEOUT_MS => 2000
    );

    $fd = true;
    if( $debug == true ) {
        print "<br>---- Curl debug -----<br>";
        $fd = fopen("php://output", 'w+');
        $curlOptionList[CURLOPT_VERBOSE] = true;
        $curlOptionList[CURLOPT_STDERR]  = $fd;
    }

    $curlHandler = curl_init();
    curl_setopt_array( $curlHandler, $curlOptionList );
    $return = curl_exec($curlHandler);

    if(curl_error($curlHandler)) {
    print "<br><hr><br>CRITICAL - Error to connect to $hostname port $port - Error:  curl_error($curl) <br>";
    exit(2);
 }

    curl_close( $curlHandler );
    if( $fd !== false ) {
        fclose( $fd );
    }
    if( $debug == true ) {
        print "<br> Output: <br><br> $return <br><br><hr>";
    }
}
?>

<title>Purge cache</title>
Press submit to purge the cache
<form method="post" action="<?php echo $PHP_SELF;?>">
<input type="submit" value="submit" name="submit">
</form>

Place (or append) the following .htaccess code in the same directory you placed the php file:

#Only allow internal IPs to access cache purge page
<Files "purge.php">
    Order deny,allow
    Deny from all
    SetEnvIf X-Forwarded-For ^10\. env_allow_1
    Allow from env=env_allow_1
    Satisfy Any
</Files>

The above code only allows access to the purge.php page from IP addresses beginning with “10.” (internal IPs)

This PHP / .htaccess combo allowed the web devs to purge cache without any system access or firewall rule changes. Hooray!