Blogging Daily Is a Lot of Work
If you write several blog posts a day, visiting each of the N publishing services out there and filling out all of their forms is a big waste of time. A repetitive task like that should be automated.
The problem is, most of these services don't want robots spamming their systems with junk posts, so they don't make automation easy.
Enter Ruby's Mechanize.
Automate Filling Out Mixx's Forms With Mechanize
Mixx is a high-traffic service that lets you submit a URL. Readers can then browse the submitted posts by category and tag as they are added.
Here is the manual workflow for doing that:
- Open http://www.mixx.com/
- Scroll to the login box and fill it out.
- Click "Submit a Link" on the next page.
- Enter the URL on that page
- Click submit
- Enter the title, description, tags, and categories on the next page
- Fill out the captcha
- Submit
Do that multiple times a day for multiple services and you have a full-time job ;).
Here's what you can do to automate that with Mechanize:
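A minimal sketch of that workflow, assuming placeholder field names (`username`, `password`, `url`, `title`, `description`, `tags`, `categories`) and assuming the form for each step is the first one on its page. Mixx's real markup will differ, so inspect the pages before relying on any of it. The helper names `fill` and `submit_link` are illustrative too.

```ruby
# Sketch only: every form and field name below is a placeholder,
# not Mixx's real markup.

# Copy field-name => value pairs into a Mechanize form (or any object
# that supports []=), skipping nil values.
def fill(form, values)
  values.each { |name, value| form[name] = value unless value.nil? }
  form
end

def submit_link(post, username:, password:)
  require "mechanize" # gem install mechanize; loaded lazily on first use

  agent = Mechanize.new
  page  = agent.get("http://www.mixx.com/")

  # Log in (assumes the login form is the first form on the page).
  page = fill(page.forms.first,
              "username" => username, "password" => password).submit

  # Follow the "Submit a Link" link, then enter the URL.
  page = page.link_with(text: "Submit a Link").click
  page = fill(page.forms.first, "url" => post[:url]).submit

  # Fill in the details for the post.
  details = fill(page.forms.first,
                 "title"       => post[:title],
                 "description" => post[:description],
                 "tags"        => post[:tags].join(", "),
                 "categories"  => post[:categories].join(", "))

  # The captcha has to be answered before the final submit, so the
  # caller gets a chance to handle it here.
  yield agent, page, details if block_given?
  details.submit
end
```

Calling `submit_link(post, username: "...", password: "...") { |agent, page, form| ... }` leaves the captcha step to the block, which keeps the human-in-the-loop part separate from the plain form filling.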
The Cool Part: Programmatically Submitting a Captcha
In the middle of the above snippet, I programmatically submit a captcha with Ruby. Here's what happens...
First, look for the captcha in the current page. Sometimes it's just an image with an input, sometimes it's in an iframe, sometimes it's created with JavaScript... Whatever it is, you just need to be able to:
- Get the URL of the captcha image
- Get the name of the input field you're supposed to fill out for the captcha
Mixx creates its captcha with JavaScript, but it has a noscript version in an iframe. So I do this:
- Use Nokogiri and XPath to grab the URL for the iframe (Mechanize is built around Nokogiri)
- Go to the iframe
- Scrape out the dynamically generated image URL from the captcha iframe
- Programmatically open the browser window to show you the image:
  - On Macs: system("open", "http://viatropos.com")
  - On Windows: system("start", "http://viatropos.com")
- The console asks you to enter the captcha text, so you look at the image in the browser, type the text into the terminal, and press Enter.
- Mechanize submits the form and scrapes out the captcha secret.
- I add that secret to the form on the original page
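
The steps above might look roughly like this in Ruby. The XPath expressions, the `captcha_answer` and `captcha_secret` field names, and the place where the secret shows up on the result page are all assumptions, as are the helper names. The `xdg-open` fallback for Linux is an addition that is not in the list above.

```ruby
require "rbconfig"

# Pick the command that opens a URL in the default browser.
# "darwin" is checked first because the word itself contains "win".
def browser_command(url, host_os = RbConfig::CONFIG["host_os"])
  case host_os
  when /darwin/             then ["open", url]
  when /mswin|mingw|cygwin/ then ["start", url]
  else                           ["xdg-open", url] # assumption: Linux desktop
  end
end

# Show the captcha image in the browser, then read what the human types.
def ask_human(image_url, input: $stdin, output: $stdout)
  system(*browser_command(image_url))
  output.print "Captcha text: "
  input.gets.to_s.strip
end

# agent is a Mechanize instance, page is the Mechanize::Page that holds
# the submission form, and form is that Mechanize::Form.
def solve_captcha(agent, page, form)
  # Grab the iframe URL with XPath (Mechanize pages delegate to Nokogiri).
  iframe_url = page.at("//iframe")["src"]
  captcha    = agent.get(iframe_url)

  # Scrape the generated image URL and make it absolute for the browser.
  image_url = captcha.uri.merge(captcha.at("//img")["src"]).to_s

  # Submit the human's answer inside the iframe page.
  answer_form = captcha.forms.first
  answer_form["captcha_answer"] = ask_human(image_url) # hypothetical name
  result = answer_form.submit

  # Scrape the secret and add it to the form on the original page.
  secret = result.at("//textarea").text.strip # hypothetical location
  form["captcha_secret"] = secret             # hypothetical name
  form
end
```

Keeping `browser_command` separate from `ask_human` makes the platform check easy to exercise without actually opening a browser.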
Pretty cool: it cuts out every step in publishing to Mixx except reading the captcha. Hopefully that saves you time.
Automating Blog Post Submission
This is currently a static blog hosted on GitHub Pages. That lets me write everything in Markdown in TextMate, write metadata in YAML like Jekyll does, and use that metadata to automatically fill out the form on Mixx.
This document looks like this in Markdown:
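A hypothetical post in that style, along with a sketch of pulling the metadata out of it using Ruby's standard YAML library. The keys (`title`, `description`, `tags`, `categories`) mirror the Mixx form fields listed earlier, but their exact names and values here are assumptions, and `parse_post` is an illustrative helper.

```ruby
require "yaml"

# Hypothetical post: YAML front matter between "---" lines, then Markdown.
POST = <<~MARKDOWN
  ---
  title: Automating Mixx Submissions With Mechanize
  description: How to script form submission and captcha entry in Ruby.
  tags: [ruby, mechanize, automation]
  categories: [programming]
  ---

  Blogging daily is a lot of work...
MARKDOWN

# Split a Jekyll-style document into its YAML metadata and Markdown body.
# Returns an empty hash and the untouched text when there is no front matter.
def parse_post(text)
  match = text.match(/\A---[ \t]*\n(.*?)\n---[ \t]*\n(.*)\z/m)
  return [{}, text] unless match

  [YAML.safe_load(match[1]) || {}, match[2].lstrip]
end

metadata, body = parse_post(POST)
puts metadata["title"]
puts metadata["tags"].join(", ")
```

The resulting hash can be handed straight to whatever code fills out the submission form, so the metadata only has to be written once.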
I wrote that information down when it was freshest in my mind, and I don't want to think about it again. Thanks, Mechanize, you rock.