Blog of Daniel Ruf

#php | #wordpress

Millions of WordPress websites publicly exposed

16.07.2021

Every day attackers are scanning the internet for vulnerable WordPress websites and we can often see corresponding probing requests on most websites, even if they do not use WordPress.

But there is a much more efficient way to build a large list of possible targets.

Automattic (the company behind WordPress) has created a plugin called Jetpack, which provides a growing set of features. One of these features is support for shortlinks. The shortlinks have the format https://wp.me/[id]; for example, https://wp.me/4 redirects to http://matt.blog/, the blog of Matt Mullenweg. Every website / domain gets its own permanent shortlink.

In the HTML source code of the website the shortlink is output like this:

<link rel='shortlink' href='https://wp.me/4'>
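To check whether a page exposes such a shortlink, the tag can be read from the HTML. The following is a minimal sketch, assuming the DOM extension is available; find_shortlink is a hypothetical helper name and not part of Jetpack or WordPress.

```php
<?php
// Hypothetical helper (not part of Jetpack): returns the href of the first
// <link rel="shortlink"> element in an HTML document, or null if there is none.
function find_shortlink(string $html): ?string
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('link') as $link) {
        if (strtolower($link->getAttribute('rel')) === 'shortlink') {
            return $link->getAttribute('href');
        }
    }
    return null;
}

$html = "<html><head><link rel='shortlink' href='https://wp.me/4'></head><body></body></html>";
echo find_shortlink($html), "\n"; // https://wp.me/4
```

A DOM parser is used instead of a regular expression because attribute order and quoting style can differ between sites.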

Jetpack is currently installed and enabled on:

  • all WordPress instances hosted on wordpress.com
  • over 5 million sites with an active plugin installation, according to the plugin statistics and the website

In total, there are probably even more than 5 million domains that use or have used Jetpack from the plugin directory.

According to the live activity map, every WordPress instance that has Jetpack installed also sends some statistics to Automattic. The description of the map reads:

“This is a collection of stats from around WordPress.com that we’ve decided to share with the world. Interested in your own stats? Every WordPress.com blog includes an integrated stats system, also available for self-hosted WordPress sites with Jetpack. The following stats are for blogs we host here on WordPress.com, both on subdomains and their own domains, or externally-hosted blogs that use our Jetpack plugin and are part of our network.”

Let’s take a quick look at the code and logic behind the shortlink generation:

function wpme_dec2sixtwo( $num ) {
    $index = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $out = "";
    if ( $num < 0 ) {
        $out = '-';
        $num = abs( $num );
    }
    for ( $t = floor( log10( $num ) / log10( 62 ) ); $t >= 0; $t-- ) {
        $a = floor( $num / pow( 62, $t ) );
        $out = $out . substr( $index, $a, 1 );
        $num = $num - ( $a * pow( 62, $t ) );
    }
    return $out;
}
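To make the conversion concrete, here is a minimal sketch of the same base-62 idea using only integer arithmetic, so it does not depend on floating-point log10() or pow(). The name base62_encode and the constant BASE62_ALPHABET are made up for illustration; this is not Jetpack's code.

```php
<?php
// Illustrative re-implementation (not Jetpack's code), integer arithmetic only.
const BASE62_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

function base62_encode(int $num): string
{
    if ($num < 0) {
        throw new InvalidArgumentException('Only non-negative integers are supported.');
    }
    $out = '';
    do {
        // The remainder is the next digit, starting with the least significant one.
        $out = BASE62_ALPHABET[$num % 62] . $out;
        $num = intdiv($num, 62);
    } while ($num > 0);
    return $out;
}

echo base62_encode(80), "\n";       // 1i  (80 = 1 * 62 + 18, and index 18 is "i")
echo base62_encode(36546426), "\n"; // 2tlou
```

One deliberate difference in this sketch: it returns "0" for the input 0, whereas wpme_dec2sixtwo returns an empty string for 0.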

You might notice that wpme_dec2sixtwo takes a number and converts it to the ID part of the shortlink by writing it in base 62, using the digits 0-9, a-z and A-Z. Let’s pass in some numbers and request the generated shortlinks with the following PHP code:

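// Requires wpme_dec2sixtwo() from above.
// Read the current counter value as an integer.
$counter = (int) file_get_contents('.counter');
$url = "https://wp.me/" . wpme_dec2sixtwo($counter);

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, true);         // include response headers in the output
curl_setopt($ch, CURLOPT_NOBODY, true);         // send a HEAD request and skip the body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the output instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow the redirect to the target site
$headers = curl_exec($ch);
$code = curl_getinfo($ch, CURLINFO_HTTP_CODE);  // status code of the final response
curl_close($ch);

// curl_exec() returns false on failure, so check that before parsing.
if ($headers !== false && $code === 200) {
    // The first Location header is the redirect issued by wp.me.
    if (preg_match('/^Location: (.+)$/im', $headers, $matches)) {
        $domain = trim($matches[1]);
        $fh = fopen('data.csv', 'a');
        fwrite($fh, "{$counter},\"{$url}\",\"{$domain}\"\n");
        fclose($fh);
    }
}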

When we run this in a loop and increase the counter variable after each iteration, we get the following results:

0,"https://wp.me/","https://wordpress.com/"
1,"https://wp.me/1","http://wordpress.com/"
2,"https://wp.me/2","https://donncha.wordpress.com/"
4,"https://wp.me/4","http://matt.blog/"
6,"https://wp.me/6","https://anthony.wordpress.com/"
7,"https://wp.me/7","http://daryl.blog/"
8,"https://wp.me/8","https://ian.wordpress.com/"
14,"https://wp.me/e","https://geoffrey.wordpress.com/"
15,"https://wp.me/f","https://bart.wordpress.com/"
22,"https://wp.me/m","https://bob.wordpress.com/"
25,"https://wp.me/p","https://dmackenzie.wordpress.com/"
31,"https://wp.me/v","https://mnk.wordpress.com/"
32,"https://wp.me/w","https://gkainonapster.wordpress.com/"
63,"https://wp.me/11","https://wepence.wordpress.com/"
64,"https://wp.me/12","http://factoryjoe.com/"
65,"https://wp.me/13","https://unweak.wordpress.com/"
66,"https://wp.me/14","https://glenda.wordpress.com/"
67,"https://wp.me/15","https://mike.wordpress.com/"
68,"https://wp.me/16","https://fuegodesigns.wordpress.com/"
69,"https://wp.me/17","https://mary.wordpress.com/"
70,"https://wp.me/18","http://omgwtflolbbq.com/"
71,"https://wp.me/19","https://kae.wordpress.com/"
72,"https://wp.me/1a","https://lorelle.wordpress.com/"
74,"https://wp.me/1c","https://ozh.wordpress.com/"
75,"https://wp.me/1d","https://dangerousmeta2.wordpress.com/"
76,"https://wp.me/1e","https://vanfossen.wordpress.com/"
77,"https://wp.me/1f","https://jonasgoldstein.wordpress.com/"
78,"https://wp.me/1g","http://boren.blog/"
79,"https://wp.me/1h","https://james.wordpress.com/"
80,"https://wp.me/1i","https://sillygwailo.wordpress.com/"
...

We can clearly see a pattern here, and that is a problem: the IDs are assigned sequentially, so the generated characters simply count up with the input number and anyone can enumerate them. There is probably an entry for every website that uses or has used Jetpack. Exposing incremental public identifiers is a common API design error that many big companies also make. A better solution is to use randomly generated identifiers, for example ones based on UUID v4.
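As a rough illustration of what a non-guessable identifier could look like, the following sketch builds a version 4 UUID from random_bytes(). The function name uuid_v4 is an assumption for this example; it is not something Jetpack provides.

```php
<?php
// Hypothetical helper: builds a version 4 (random) UUID from 16 random bytes.
function uuid_v4(): string
{
    $bytes = random_bytes(16);
    // Set the version field (high nibble of byte 6) to 4.
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    // Set the variant field (two high bits of byte 8) to binary 10.
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);
    // Format the 32 hex characters as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

echo uuid_v4(), "\n"; // a different value on every call
```

A UUID is much longer than a typical shortlink ID. A shorter random token would also avoid the counting pattern, but it would need a collision check when it is assigned.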

To verify this and to see how many entries can be retrieved within a short amount of time, I wrote an optimized version of the code with the help of Fabian Kupferschläger, a good friend who has experience with Python and with bigger data sets.

After a few days we had more than 36 million shortlinks, including entries for websites that no longer use WordPress. So once you have installed and activated Jetpack, you are automatically part of this list. As there does not seem to be any opt-out option, this may be permanent.

We have done some further testing and it seems there are currently almost 200 million registered Jetpack instances.

This is an excerpt of the results from running the code for a short time:

...
36546401,"https://wp.me/2tlo5","https://socialgoodjobs.wordpress.com/"
36546403,"https://wp.me/2tlo7","https://ennylovesrio.wordpress.com/"
36546404,"https://wp.me/2tlo8","https://zoeabroad.wordpress.com/"
36546405,"https://wp.me/2tlo9","https://rolexreplica098.wordpress.com/"
36546406,"https://wp.me/2tloa","https://rsreliablesolution.wordpress.com/"
36546407,"https://wp.me/2tlob","https://alazdyb0he.wordpress.com/"
36546410,"https://wp.me/2tloe","https://rolexreplica908.wordpress.com/"
36546412,"https://wp.me/2tlog","https://cjkmpa.wordpress.com/"
36546413,"https://wp.me/2tloh","https://bosco821.wordpress.com/"
36546414,"https://wp.me/2tloi","https://daavdotorg.wordpress.com/"
36546415,"https://wp.me/2tloj","https://motivationonlive.wordpress.com/"
36546416,"https://wp.me/2tlok","https://rolexreplica803.wordpress.com/"
36546417,"https://wp.me/2tlol","https://kotakkecildsignlab.wordpress.com/"
36546419,"https://wp.me/2tlon","http://ipresswebsites.com/"
36546420,"https://wp.me/2tloo","https://cigarettesonlinestore245.wordpress.com/"
36546421,"https://wp.me/2tlop","https://empoweredtorelease.wordpress.com/"
36546422,"https://wp.me/2tloq","https://mp3elite39.wordpress.com/"
36546423,"https://wp.me/2tlor","https://companyroderickrg.wordpress.com/"
36546424,"https://wp.me/2tlos","https://dirrrrrrectioner1d.wordpress.com/"
36546425,"https://wp.me/2tlot","https://ambrosehandb1024.wordpress.com/"
36546426,"https://wp.me/2tlou","https://aaqzcblog.wordpress.com/"
...
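As a rough illustration only, the counters in the excerpt above show how densely this small range is populated. The sketch below counts how many counters between the first and last listed value produced a result; summarize_ids is a hypothetical helper, and 26 consecutive IDs are far too few to generalize from.

```php
<?php
// Hypothetical helper: summarizes which counters in a contiguous range resolved.
function summarize_ids(array $found): array
{
    $all = range(min($found), max($found));
    return [
        'found'   => count($found),
        'span'    => count($all),
        'missing' => array_values(array_diff($all, $found)),
    ];
}

// Counter values copied from the excerpt above.
$found = [
    36546401, 36546403, 36546404, 36546405, 36546406, 36546407, 36546410,
    36546412, 36546413, 36546414, 36546415, 36546416, 36546417, 36546419,
    36546420, 36546421, 36546422, 36546423, 36546424, 36546425, 36546426,
];

$s = summarize_ids($found);
printf("%d of %d IDs resolved (%.1f%%)\n", $s['found'], $s['span'], 100 * $s['found'] / $s['span']);
// 21 of 26 IDs resolved (80.8%)
echo 'No result for: ', implode(', ', $s['missing']), "\n";
// No result for: 36546402, 36546408, 36546409, 36546411, 36546418
```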

Reversing the values is quite easy. You might have noticed that the function is called wpme_dec2sixtwo: it converts the provided decimal number to a base-62 string. This is not encryption but a simple base62 encoding, which can be reversed like any similar function (base64_encode and base64_decode, for example, use base 64).

This can be tested with a base62 decode function like the one from Tiny:

function wpme_sixtwo2dec($str) {
    $set = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

    $radix = strlen($set);
    $strlen = strlen($str);
    $n = 0;
    for ($i = 0; $i < $strlen; $i++) {
        $n += strpos($set, $str[$i]) * pow($radix, ($strlen - $i - 1));
    }
    return $n;
}

Let’s verify this with 2tlou and 36546426:

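// Pass true as the second argument so var_export() returns the string instead of printing it.
echo var_export(wpme_dec2sixtwo(36546426) === '2tlou', true) . "\n";
// true
echo var_export(wpme_sixtwo2dec('2tlou') === 36546426, true) . "\n";
// true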

Until May 2019, the shortlink was generated automatically by default as soon as Jetpack was installed and enabled.

In April 2019 the developers changed the default setting. The change is documented in a pull request, but it is not clearly communicated in the changelog for version 7.3 of Jetpack, which was released on the 7th of May 2019. The blog post for this release does not contain many more details either.

But we can see the new default option in the support document for the Jetpack shortlinks:

May 06, 2019

This feature is activated by default.

May 08, 2019

This feature is deactivated by default.

Additionally, some services from other Automattic companies (like Pressable) can be used to find managed WordPress instances. You can test this by changing the number in the following URL:

https://149378260.v2.pressablecdn.com


This blogpost is the result of a manual code review in 2019 after seeing the almost identical shortlinks of a few websites which were created on the same day.