PBWorks
This article is a stub. You can help the IndieWeb wiki by expanding it.
PBWorks is a wiki content-hosting silo that was originally called "PBWiki", created and launched at an SHDH hack event, and adopted by many independent individuals, movements, and events in the mid 2000s.
How to
How to leave
There are some steps you can take to start leaving your PBWorks wiki.
The most important step is to start owning your wiki page URLs.
You can do this by:
- Figure out where you want to put wiki pages on your site, e.g. in a /wiki/ directory, or even "just" /w/, or perhaps /page/, to indicate a named wiki page in contrast to date-time-stamped posts that may start with a YYYY year (like /2022/) in their URL.
- Set up a redirect from your site at that path to your PBWiki (a minimal sketch follows this list).
- E.g. Tantek Çelik has set up /w/ as a redirect for his wiki pages, like
- https://tantek.com/w/WatchFilms -> http://tantek.pbworks.com/WatchFilms which is then redirected by PBWiki to a longer URL with a page-unique-numerical-id etc.
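A minimal sketch of such a redirect, assuming your site is served by nginx and uses /w/ as the wiki path (example.pbworks.com is a placeholder workspace name; substitute your own):

location /w/ {
    # Redirect /w/PageName on your own domain to the same page name on PBworks.
    # example.pbworks.com is a placeholder; use your actual workspace name.
    rewrite ^/w/(.*)$ http://example.pbworks.com/$1 permanent;
}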
This way you can immediately start sharing (and linking to) wiki page links on your own domain instead of PBWiki, so that when you eventually switch to serving your wiki pages from your own domain, you don't have to update those links.
This also has the advantage of sharing https: URLs (using your own domain), instead of having to share http: (insecure) URLs to your PBWiki.
How to stop creating
If you're able to set up a redirect like the above, the next thing you can do is stop making your future task of archiving/exporting/importing harder by not creating new pages.
Instead of creating new pages on your PBWiki, consider uploading a "simple" static HTML page to your own website, at a static path, perhaps even at the path you decided on above. Then modify your redirect to act only as a 404 handler for that directory; that is, only when a page is not found on your site should it redirect to your old PBWiki (a sketch follows).
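If your site runs nginx, a minimal sketch of that fallback behavior, again assuming /w/ as the wiki path and example.pbworks.com as a placeholder workspace name: pages that exist locally are served as-is, and only missing pages are redirected to the old PBWiki.

location /w/ {
    # Serve a local static page if one exists; otherwise fall through
    # to the named location that redirects to the old PBworks wiki.
    try_files $uri $uri/ $uri.html @pbworks;
}

location @pbworks {
    # example.pbworks.com is a placeholder; use your actual workspace name.
    rewrite ^/w/(.*)$ http://example.pbworks.com/$1 redirect;
}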
How to archive
PBworks does not provide any tools for exporting a usable archive of a wiki. It is possible to archive an entire PBworks wiki, including the history of each page, as a set of static HTML files using wget. Below are the steps involved to make it work, as well as nginx config for serving the archived site.
This method will result in an archive of the PBworks site with all original URLs intact, including page revision URLs.
Create links to all pages
Since PBworks does not have a page that lists all pages, you'll need to ensure all wiki pages are linked from the home page. One way to do this is to create a page called "AllPages" and link to all of your wiki pages from there. Note: This is not necessary if you are sure that every page on your wiki is linked to from some other page.
Create a mirror with wget
Create a full mirror of all the HTML pages using wget. The following wget command will download all pages linked from the home page, including linked CSS and JS files.
PBworks hosts a number of static assets on a different domain, vs1.pbworks.com. By default, wget doesn't cross to other domains to download assets there, so it has to be explicitly enabled.
wget --mirror --page-requisites --convert-links -e robots=off -P . --domains=wiki.oauth.net,vs1.pbworks.com --span-hosts http://wiki.oauth.net/
This will create two folders in the current directory, "wiki.oauth.net" and "vs1.pbworks.com". If you have hot-linked any images from other domains, such as Flickr, you'll want to include those domains in the list when you run the command.
After downloading the initial archive, you can test for which Flickr hosts are referenced by running this command on the resulting files:
cd wiki.oauth.net
grep -hRo 'farm[[:digit:]]\.static\.flickr\.com' . | sort | uniq
The result will be a list such as:
farm1.static.flickr.com
farm2.static.flickr.com
farm3.static.flickr.com
farm4.static.flickr.com
Then you can re-run the wget command and include those subdomains in the list:
wget --mirror --page-requisites --convert-links -e robots=off -P . --domains=wiki.oauth.net,vs1.pbworks.com,farm1.static.flickr.com,farm2.static.flickr.com,farm3.static.flickr.com,farm4.static.flickr.com --span-hosts http://wiki.oauth.net/
After wget finishes downloading all the files, it rewrites the HTML in each file so that images and links point to the relative locations of the other domains on disk. This way, viewing a page that previously hot-linked a Flickr image will load the local copy instead, since the src attribute now references the local file.
To prepare to serve the archive from nginx, the last step is to rearrange your folder structure slightly. You'll want to move the other domains into your primary domain's folder. In this example, you would execute the following commands:
mv vs1.pbworks.com wiki.oauth.net/
mv farm1.static.flickr.com wiki.oauth.net/
mv farm2.static.flickr.com wiki.oauth.net/
mv farm3.static.flickr.com wiki.oauth.net/
mv farm4.static.flickr.com wiki.oauth.net/
This will work because the src attribute of an image that previously pointed to http://farm1.static.flickr.com/img.png will be rewritten as ../farm1.static.flickr.com/img.png. When viewed in a browser from a page such as /index.html, the browser will attempt to go up one folder above the root (which has no effect) and will resolve the reference to /farm1.static.flickr.com/img.png.
Configure nginx to serve the archive
You'll want to make a new server directive in your nginx config to serve this archive folder. There are a few tricks required in order for this to work properly. Below is the full config I used in this case.
server {
    listen 80;
    server_name wiki.oauth.net;
    root /web/sites/wiki.oauth.net;
    index index.html;

    rewrite ^/$ /w/page/FrontPage permanent;

    try_files $uri$is_args$args =404;
    default_type text/html;

    location ^~ /theme_image {
        try_files $uri$is_args$args =404;
        default_type image/png;
    }
}
The tricks required for this are explained below.
rewrite ^/$ /w/page/FrontPage permanent;
PBworks does not actually serve a page from the root, so you need to redirect your root domain to the "FrontPage" page.
try_files $uri$is_args$args =404;
Because the history pages are saved on disk with "?" and "&" characters in the filename, you need to force nginx to actually look for a file on disk that matches the full request URI, including the "?", instead of parsing the query string.
default_type text/html;
Tells nginx to default to the HTML content-type for unknown files, since most of your files will have names like "FrontPage" with no extension on disk.
location ^~ /theme_image
This config block tells nginx to send the image/png content-type for paths that match /theme_image. This is for the images that make up the borders of the page, since PBworks serves all of them from a PHP script named /theme_image.php with query string parameters to select the proper image.
IndieWeb Examples
Tantek
Tantek has a PBwiki at http://tantek.pbworks.com/ and has a long-term project/goal of transferring the content and roughly equivalent functionality to his own site.
For now he has set up a URL redirect from his own site, so he can share wiki page links that he controls, in the hope that the pages will eventually live there, or at some other place on his site.
- https://tantek.com/w/ redirects to his PBWiki home page
- e.g. https://tantek.com/w/Markdown redirects to his notes on improving/replacing Markdown
Brainstorming
How to export
We need a much simpler set of steps, or maybe a tool or a service that can produce a simple high-fidelity export that contains all the information from your PBWiki (pages, content, links) intact.
How to setup on your own site
Separately, we need options for how to set up a replacement for PBWiki on your own site.
- The wiki-project page has some ideas/brainstorms for how to add wiki-like pages/features to your site
Criticism
Cannot edit without JS
If you have JS turned off, or blocked with an extension like NoScript, the "Edit" button doesn't work (it does nothing).
No page that lists all pages
There is no list of all wiki pages by default, except in the sidebar, which is loaded via JavaScript. This means making an archive of the site using a tool like wget will not find all the pages. To compensate, you can create a wiki page called AllPages or similar and add a link to every page from that page. Then wget will follow those links, since they appear in the HTML, and will actually download every page.
Auto-reclaim site-deaths
After 11 months of inactivity (lack of edits?) PBWorks will send an email (presumably to admins?) with a warning, and then in 30 days, delete the entire wiki (name.pbworks.com) and all content therein. E.g. for workspace "ifbt": (actual text of email received)
Hello,
We noticed that you haven't used your workspace named: ifbt for over 11 months.
As you may have heard, we reclaim workspaces that have fallen into disuse (PBworks Spring Cleaning).
Reclaiming these idle workspaces frees up thousands of potentially useful URLs for people who will actually put them to use. We're planning to reclaim your workspace in 30 days.
If you want to keep your workspace, click here. If you're not currently logged into your PBworks account, you'll be asked to log in. You'll know that your workspace has been removed from the deletion list once the warning message disappears.
If you're truly no longer using your workspace, simply do nothing, and in 30 days, we'll delete the unused workspace and reclaim its URL.
Thanks,
The PBworks Team
Who knows how many site-deaths have occurred due to this auto-reclamation.
No https on individual wikis
Individual wikis (subdomains on PBWiki) do not have https support and can only be accessed via http:.
The login UI at least supports https: https://my.pbworks.com/
See Also
- wiki
- silos
- 2021-12-30 Alan Levine Claiming and Reclaiming a 15 Year Old PB Wiki, some nostalgia and discussion related to why own your own site along with some details about how he archived his wiki to his own site